Compare commits

..

1273 Commits

Author SHA1 Message Date
Christophe Fergeau
569a93cbbe spice: Disallow use of gl + TCP port
Currently, virgl support has to go through a local unix socket, trying
to connect to a VM using -spice gl through spice://localhost:5900 will
only result in a black screen.
This commit errors out when the user tries to start a VM with both GL
support and a port/tls-port set.
This would fit better in spice-server, but currently QEMU does not call
into spice-server when parsing 'gl' on its command line, so we have to
do this check in QEMU instead.

Signed-off-by: Christophe Fergeau <cfergeau@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1457955672-28758-1-git-send-email-cfergeau@redhat.com

[ applied codestyle fix: break long line ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-03-24 08:04:01 +01:00
Gerd Hoffmann
81b00c968a input-linux: fix Coverity warning
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1458129049-12484-1-git-send-email-kraxel@redhat.com
2016-03-24 07:58:20 +01:00
Gerd Hoffmann
0e066b2cc5 input-linux: switch over to -object
This patches makes input-linux use -object instead of a new command line
switch.  So, instead of the switch ...

    -input-linux /dev/input/event$nr

... you must create an object this way:

    -object input-linux,id=$name,evdev=/dev/input/event$nr

Bonus is that you can hot-add and hot-remove them via monitor now.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1457681901-30916-1-git-send-email-kraxel@redhat.com
2016-03-24 07:58:20 +01:00
Peter Maydell
2538039f2c Merge remote-tracking branch 'remotes/armbru/tags/pull-ivshmem-2016-03-18' into staging
ivshmem: Fixes, cleanups, device model split

# gpg: Signature made Mon 21 Mar 2016 20:33:54 GMT using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"

* remotes/armbru/tags/pull-ivshmem-2016-03-18: (40 commits)
  contrib/ivshmem-server: Print "not for production" warning
  ivshmem: Require master to have ID zero
  ivshmem: Drop ivshmem property x-memdev
  ivshmem: Clean up after the previous commit
  ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem
  ivshmem: Replace int role_val by OnOffAuto master
  qdev: New DEFINE_PROP_ON_OFF_AUTO
  ivshmem: Inline check_shm_size() into its only caller
  ivshmem: Simplify memory regions for BAR 2 (shared memory)
  ivshmem: Implement shm=... with a memory backend
  ivshmem: Tighten check of property "size"
  ivshmem: Simplify how we cope with short reads from server
  ivshmem: Drop the hackish test for UNIX domain chardev
  ivshmem: Rely on server sending the ID right after the version
  ivshmem: Propagate errors through ivshmem_recv_setup()
  ivshmem: Receive shared memory synchronously in realize()
  ivshmem: Plug leaks on unplug, fix peer disconnect
  ivshmem: Disentangle ivshmem_read()
  ivshmem: Simplify rejection of invalid peer ID from server
  ivshmem: Assert interrupts are set up once
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-23 12:57:44 +00:00
Peter Maydell
ffa6564c9b Merge remote-tracking branch 'remotes/weil/tags/pull-wxx-20160322' into staging
wxx patch queue

# gpg: Signature made Tue 22 Mar 2016 18:18:36 GMT using RSA key ID 677450AD
# gpg: Good signature from "Stefan Weil <sw@weilnetz.de>"
# gpg:                 aka "Stefan Weil <stefan.weil@weilnetz.de>"
# gpg:                 aka "Stefan Weil <stefan.weil@bib.uni-mannheim.de>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 4923 6FEA 75C9 5D69 8EC2  B78A E08C 21D5 6774 50AD

* remotes/weil/tags/pull-wxx-20160322:
  wxx: Add support for ncurses
  Remove unneeded include statements for setjmp.h
  Include setjmp.h in qemu/osdep.h (bug fix for w64)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-22 20:27:55 +00:00
Stefan Weil
ae6296342a wxx: Add support for ncurses
We used to support only pdcurses for Windows, but recently Cygwin added
mingw64-i686-ncurses and mingw64-x86_64-ncurses packages which are
supported now, too.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-03-22 19:17:38 +01:00
Stefan Weil
8ff98f1ed2 Remove unneeded include statements for setjmp.h
As soon as setjmp.h is included from qemu/osdep.h, those old include
statements are no longer needed.

Add also setjmp.h to the list in scripts/clean-includes.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-03-22 19:11:15 +01:00
Stefan Weil
e89fdafb58 Include setjmp.h in qemu/osdep.h (bug fix for w64)
setjmp must be declared before sysemu/os-win32.h
because it is redefined there for 64 bit Windows.

Reviewed-by: Richard Henderson  <rth@twiddle.net>
Tested-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
2016-03-22 19:11:15 +01:00
Peter Maydell
459621ac1a Merge remote-tracking branch 'remotes/mdroth/tags/qga-pull-2016-03-21-tag' into staging
qemu-ga patch queue for 2.6

* remove unused variable

# gpg: Signature made Mon 21 Mar 2016 17:32:42 GMT using RSA key ID F108B584
# gpg: Good signature from "Michael Roth <flukshun@gmail.com>"
# gpg:                 aka "Michael Roth <mdroth@utexas.edu>"
# gpg:                 aka "Michael Roth <mdroth@linux.vnet.ibm.com>"

* remotes/mdroth/tags/qga-pull-2016-03-21-tag:
  qemu-ga: drop unused local err variable

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-22 17:39:48 +00:00
Peter Maydell
ac0d25e843 Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20160321-1' into staging
usb: bugfix collection.

# gpg: Signature made Mon 21 Mar 2016 11:07:39 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-usb-20160321-1:
  usb: ehci: add capability mmio write function
  hw/usb/dev-mtp: Guard inotify usage with CONFIG_INOTIFY1
  usb: fix unbound stack warning for inotify_watchfn
  usb: fix unbound stack usage for usb_mtp_add_str
  usb: fix unbounded stack warning for xhci_dma_write_u32s
  usb: Fix compilation for Windows

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-22 16:42:06 +00:00
Markus Armbruster
a335c6f204 contrib/ivshmem-server: Print "not for production" warning
The code is okay for illustrating how things work and for testing, but
its error handling make it unfit for production use.  Print a warning
to protect the innocent.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-41-git-send-email-armbru@redhat.com>
2016-03-21 21:29:03 +01:00
Markus Armbruster
62a830b688 ivshmem: Require master to have ID zero
Migration with ivshmem needs to be carefully orchestrated to work.
Exactly one peer (the "master") migrates to the destination, all other
peers need to unplug (and disconnect), migrate, plug back (and
reconnect).  This is sort of documented in qemu-doc.

If peers connect on the destination before migration completes, the
shared memory can get messed up.  This isn't documented anywhere.  Fix
that in qemu-doc.

To avoid messing up register IVPosition on migration, the server must
assign the same ID on source and destination.  ivshmem-spec.txt leaves
ID assignment unspecified, however.

Amend ivshmem-spec.txt to require the first client to receive ID zero.
The example ivshmem-server complies: it always assigns the first
unused ID.

For a bit of additional safety, enforce ID zero for the master.  This
does nothing when we're not using a server, because the ID is zero for
all peers then.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-40-git-send-email-armbru@redhat.com>
2016-03-21 21:29:03 +01:00
Markus Armbruster
13fd2cb689 ivshmem: Drop ivshmem property x-memdev
Use ivshmem-plain instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-39-git-send-email-armbru@redhat.com>
2016-03-21 21:29:03 +01:00
Markus Armbruster
ddc8528443 ivshmem: Clean up after the previous commit
Move code to more sensible places.  Use the opportunity to reorder and
document IVShmemState members.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-38-git-send-email-armbru@redhat.com>
2016-03-21 21:29:03 +01:00
Markus Armbruster
5400c02b90 ivshmem: Split ivshmem-plain, ivshmem-doorbell off ivshmem
ivshmem can be configured with and without interrupt capability
(a.k.a. "doorbell").  The two configurations have largely disjoint
options, which makes for a confusing (and badly checked) user
interface.  Moreover, the device can't tell the guest whether its
doorbell is enabled.

Create two new device models ivshmem-plain and ivshmem-doorbell, and
deprecate the old one.

Changes from ivshmem:

* PCI revision is 1 instead of 0.  The new revision is fully backwards
  compatible for guests.  Guests may elect to require at least
  revision 1 to make sure they're not exposed to the funny "no shared
  memory, yet" state.

* Property "role" replaced by "master".  role=master becomes
  master=on, role=peer becomes master=off.  Default is off instead of
  auto.

* Property "use64" is gone.  The new devices always have 64 bit BARs.

Changes from ivshmem to ivshmem-plain:

* The Interrupt Pin register in PCI config space is zero (does not use
  an interrupt pin) instead of one (uses INTA).

* Property "x-memdev" is renamed to "memdev".

* Properties "shm" and "size" are gone.  Use property "memdev"
  instead.

* Property "msi" is gone.  The new device can't have MSI-X capability.
  It can't interrupt anyway.

* Properties "ioeventfd" and "vectors" are gone.  They're meaningless
  without interrupts anyway.

Changes from ivshmem to ivshmem-doorbell:

* Property "msi" is gone.  The new device always has MSI-X capability.

* Property "ioeventfd" defaults to on instead of off.

* Property "size" is gone.  The new device can only map all the shared
  memory received from the server.

Guests can easily find out whether the device is configured for
interrupts by checking for MSI-X capability.

Note: some code added in sub-optimal places to make the diff easier to
review.  The next commit will move it to more sensible places.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-37-git-send-email-armbru@redhat.com>
2016-03-21 21:29:03 +01:00
Markus Armbruster
2a845da736 ivshmem: Replace int role_val by OnOffAuto master
In preparation of making it a qdev property.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-36-git-send-email-armbru@redhat.com>
2016-03-21 21:29:02 +01:00
Markus Armbruster
55e8a15435 qdev: New DEFINE_PROP_ON_OFF_AUTO
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-35-git-send-email-armbru@redhat.com>
2016-03-21 21:29:02 +01:00
Markus Armbruster
8baeb22bfc ivshmem: Inline check_shm_size() into its only caller
Improve the error messages while there.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1458066895-20632-34-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-03-21 21:29:02 +01:00
Markus Armbruster
c2d8019cd7 ivshmem: Simplify memory regions for BAR 2 (shared memory)
ivshmem_realize() puts the shared memory region in a container region.
Used to be necessary to permit delayed mapping of the shared memory.
However, we recently moved to synchronous mapping, in "ivshmem:
Receive shared memory synchronously in realize()" and the commit
following it.  The container is redundant since then.  Drop it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1458066895-20632-33-git-send-email-armbru@redhat.com>
2016-03-21 21:29:02 +01:00
Markus Armbruster
5503e28504 ivshmem: Implement shm=... with a memory backend
ivshmem has its very own code to create and map shared memory.
Replace that with an implicitly created memory backend.  Reduces the
number of ways we create BAR 2 from three to two.

The memory-backend-file is currently available only with CONFIG_LINUX,
so this adds a second Linuxism to ivshmem (the other one is eventfd).
Should we ever need to make it portable to systems where
memory-backend-file can't be made to serve, we could create a
memory-backend-shmem that allocates memory with shm_open().

Bonus fix: shared memory files are now created with permissions 0655
instead of 0777.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1458066895-20632-32-git-send-email-armbru@redhat.com>
2016-03-21 21:29:02 +01:00
Markus Armbruster
08183c20b8 ivshmem: Tighten check of property "size"
If size_t is narrower than 64 bits, passing uint64_t ivshmem_size to
mmap() truncates.  Reject such sizes.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-31-git-send-email-armbru@redhat.com>
2016-03-21 21:29:02 +01:00
Markus Armbruster
ee276391a3 ivshmem: Simplify how we cope with short reads from server
Short reads from a UNIX domain sockets are exceedingly unlikely when
the other side always sends eight bytes and we always read eight
bytes.  We cope with them anyway.  However, the code doing that is
rather convoluted.  Dumb it down radically.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-30-git-send-email-armbru@redhat.com>
2016-03-21 21:29:01 +01:00
Markus Armbruster
ba5970a178 ivshmem: Drop the hackish test for UNIX domain chardev
The chardev must be capable of transmitting SCM_RIGHTS ancillary
messages.  We check it by comparing CharDriverState member filename to
"unix:".  That's almost as brittle as it is disgusting.

When the actual transmission all happened asynchronously, this check
was all we could do in realize(), and thus better than nothing.  But
now we receive at least one SCM_RIGHTS synchronously in realize(),
it's not worth its keep anymore.  Drop it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-29-git-send-email-armbru@redhat.com>
2016-03-21 21:29:01 +01:00
Markus Armbruster
a3feb08639 ivshmem: Rely on server sending the ID right after the version
The protocol specification (ivshmem-spec.txt, formerly
ivshmem_device_spec.txt) has always required the ID message to be sent
right at the beginning, and ivshmem-server has always complied.  The
device, however, accepts it out of order.  If an interrupt setup
arrived before it, though, it would be misinterpreted as connect
notification.  Fix the latent bug by relying on the spec and
ivshmem-server's actual behavior.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-28-git-send-email-armbru@redhat.com>
2016-03-21 21:29:01 +01:00
Markus Armbruster
1309cf448a ivshmem: Propagate errors through ivshmem_recv_setup()
This kills off the funny state described in the previous commit.

Simplify ivshmem_io_read() accordingly, and update documentation.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1458066895-20632-27-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-03-21 21:29:01 +01:00
Markus Armbruster
3a55fc0f24 ivshmem: Receive shared memory synchronously in realize()
When configured for interrupts (property "chardev" given), we receive
the shared memory from an ivshmem server.  We do so asynchronously
after realize() completes, by setting up callbacks with
qemu_chr_add_handlers().

Keeping server I/O out of realize() that way avoids delays due to a
slow server.  This is probably relevant only for hot plug.

However, this funny "no shared memory, yet" state of the device also
causes a raft of issues that are hard or impossible to work around:

* The guest is exposed to this state: when we enter and leave it its
  shared memory contents is apruptly replaced, and device register
  IVPosition changes.

  This is a known issue.  We document that guests should not access
  the shared memory after device initialization until the IVPosition
  register becomes non-negative.

  For cold plug, the funny state is unlikely to be visible in
  practice, because we normally receive the shared memory long before
  the guest gets around to mess with the device.

  For hot plug, the timing is tighter, but the relative slowness of
  PCI device configuration has a good chance to hide the funny state.

  In either case, guests complying with the documented procedure are
  safe.

* Migration becomes racy.

  If migration completes before the shared memory setup completes on
  the source, shared memory contents is silently lost.  Fortunately,
  migration is rather unlikely to win this race.

  If the shared memory's ramblock arrives at the destination before
  shared memory setup completes, migration fails.

  There is no known way for a management application to wait for
  shared memory setup to complete.

  All you can do is retry failed migration.  You can improve your
  chances by leaving more time between running the destination QEMU
  and the migrate command.

  To mitigate silent memory loss, you need to ensure the server
  initializes shared memory exactly the same on source and
  destination.

  These issues are entirely undocumented so far.

I'd expect the server to be almost always fast enough to hide these
issues.  But then rare catastrophic races are in a way the worst kind.

This is way more trouble than I'm willing to take from any device.
Kill the funny state by receiving shared memory synchronously in
realize().  If your hot plug hangs, go kill your ivshmem server.

For easier review, this commit only makes the receive synchronous, it
doesn't add the necessary error propagation.  Without that, the funny
state persists.  The next commit will do that, and kill it off for
real.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-26-git-send-email-armbru@redhat.com>
2016-03-21 21:29:01 +01:00
Markus Armbruster
9db51b4d64 ivshmem: Plug leaks on unplug, fix peer disconnect
close_peer_eventfds() cleans up three things: ioeventfd triggers if
they exist, eventfds, and the array to store them.

Commit 98609cd (v1.2.0) fixed it not to clean up ioeventfd triggers
when they don't exist (property ioeventfd=off, which is the default).
Unfortunately, the fix also made it skip cleanup of the eventfds and
the array then.  This is a memory and file descriptor leak on unplug.

Additionally, the reset of nb_eventfds is skipped.  Doesn't matter on
unplug.  On peer disconnect, however, this permanently wedges the
interrupt vectors used for that peer's ID.  The eventfds stay behind,
but aren't connected to a peer anymore.  When the ID gets recycled for
a new peer, the new peer's eventfds get assigned to vectors after the
old ones.  Commonly, the device's number of vectors matches the
server's, so the new ones get dropped with a "Too many eventfd
received" message.  Interrupts either don't work (common case) or go
to the wrong vector.

Fix by narrowing the conditional to just the ioeventfd trigger
cleanup.

While there, move the "invalid" peer check to the only caller where it
can actually happen, and tighten it to reject own ID.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-25-git-send-email-armbru@redhat.com>
2016-03-21 21:29:01 +01:00
Markus Armbruster
ca0b7566cc ivshmem: Disentangle ivshmem_read()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-24-git-send-email-armbru@redhat.com>
2016-03-21 21:29:01 +01:00
Markus Armbruster
cd9953f720 ivshmem: Simplify rejection of invalid peer ID from server
ivshmem_read() processes server messages.  These are 64 bit signed
integers.  -1 is shared memory setup, 16 bit unsigned is a peer ID,
anything else is invalid.

ivshmem_read() rejects invalid negative messages right away, silently.

Invalid positive messages get rejected only in resize_peers(), and
ivshmem_read() then prints the rather cryptic message "failed to
resize peers array".

Extend the first check to cover all invalid messages, make it report
"server sent invalid message", and drop the second check.

Now resize_peers() can't fail anymore; simplify.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-23-git-send-email-armbru@redhat.com>
2016-03-21 21:29:01 +01:00
Markus Armbruster
3c27969b3e ivshmem: Assert interrupts are set up once
An interrupt is set up when the interrupt's file descriptor is
received.  Each message applies to the next interrupt vector.
Therefore, each vector cannot be set up more than once.

ivshmem_add_kvm_msi_virq() half-heartedly tries not to rely on this by
doing nothing then, but that's not going to recover from this error
should it become possible in the future.  watch_vector_notifier()
doesn't even try.

Simply assert what is the case, so we get alerted if we ever screw it
up.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-22-git-send-email-armbru@redhat.com>
2016-03-21 21:29:01 +01:00
Markus Armbruster
2d1d422d11 ivshmem: Leave INTx alone when using MSI-X
The ivshmem device can either use MSI-X or legacy INTx for interrupts.

With MSI-X enabled, peer interrupt events trigger an MSI as they
should.  But software can still raise INTx via interrupt status and
mask register in BAR 0.  This is explicitly prohibited by PCI Local
Bus Specification Revision 3.0, section 6.8.3.3:

    While enabled for MSI or MSI-X operation, a function is prohibited
    from using its INTx# pin (if implemented) to request service (MSI,
    MSI-X, and INTx# are mutually exclusive).

Fix the device model to leave INTx alone when using MSI-X.

Document that we claim to use INTx in config space even when we don't.
Unlike other devices, ivshmem does *not* use INTx when configured for
MSI-X and MSI-X isn't enabled by software.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1458066895-20632-21-git-send-email-armbru@redhat.com>
2016-03-21 21:29:01 +01:00
Markus Armbruster
082751e82b ivshmem: Clean up MSI-X conditions
There are three predicates related to MSI-X:

* ivshmem_has_feature(s, IVSHMEM_MSI) is true unless the non-MSI-X
  variant of the device is selected with msi=off.

* msix_present() is true when the device has the PCI capability MSI-X.
  It's initially false, and becomes true during successful realize of
  the MSI-X variant of the device.  Thus, it's the same as
  ivshmem_has_feature(s, IVSHMEM_MSI) for realized devices.

* msix_enabled() is true when msix_present() is true and guest software
  has enabled MSI-X.

Code that differs between the non-MSI-X and the MSI-X variant of the
device needs to be guarded by ivshmem_has_feature(s, IVSHMEM_MSI) or
by msix_present(), except the latter works only for realized devices.

Code that depends on whether MSI-X is in use needs to be guarded with
msix_enabled().

Code review led me to two minor messes:

* ivshmem_vector_notify() calls msix_notify() even when
  !msix_enabled(), unlike most other MSI-X-capable devices.  As far as
  I can tell, msix_notify() does nothing when !msix_enabled().  Add
  the guard anyway.

* Most callers of ivshmem_use_msix() guard it with
  ivshmem_has_feature(s, IVSHMEM_MSI).  Not necessary, because
  ivshmem_use_msix() does nothing when !msix_present().  That's
  ivshmem's only use of msix_present(), though.  Guard it
  consistently, and drop the now redundant msix_present() check.
  While there, rename ivshmem_use_msix() to ivshmem_msix_vector_use().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1458066895-20632-20-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-03-21 21:29:00 +01:00
Markus Armbruster
434ad76db5 ivshmem: Clean up register callbacks
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-19-git-send-email-armbru@redhat.com>
2016-03-21 21:29:00 +01:00
Markus Armbruster
d855e27565 ivshmem: Failed realize() can leave migration blocker behind
If pci_ivshmem_realize() fails after it created its migration blocker,
the blocker is left in place.  Fix that by creating it last.

Likewise, if it fails after it called fifo8_create(), it leaks fifo
memory.  Fix that the same way.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-18-git-send-email-armbru@redhat.com>
2016-03-21 21:29:00 +01:00
Markus Armbruster
9cf70c5225 ivshmem: Fix harmless misuse of Error
We reuse errp after passing it host_memory_backend_get_memory().  If
both host_memory_backend_get_memory() and the reuse set an error, the
reuse will fail the assertion in error_setv().  Fortunately,
host_memory_backend_get_memory() can't fail.

Pass it &error_abort to make our assumption explicit, and to get the
assertion failure in the right place should it become invalid.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-17-git-send-email-armbru@redhat.com>
2016-03-21 21:29:00 +01:00
Markus Armbruster
71c265816d ivshmem: Don't destroy the chardev on version mismatch
Yes, the chardev is commonly useless after we read a bad version from
it, but destroying it is inappropriate anyway: the user created it, so
the user should be able to hold on to it as long as he likes.  We
don't destroy it on other errors.  Screwed up in commit 5105b1d.

Stop reading instead.

Also note QEMU's behavior in ivshmem-spec.txt.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-16-git-send-email-armbru@redhat.com>
2016-03-21 21:29:00 +01:00
Markus Armbruster
c20fc0c3ee ivshmem: Drop ivshmem_event() stub
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-15-git-send-email-armbru@redhat.com>
2016-03-21 21:29:00 +01:00
Markus Armbruster
e64befe929 ivshmem: Clean up after commit 9940c32
IVShmemState member eventfd_chr is useless since commit 9940c32.  Drop
it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-14-git-send-email-armbru@redhat.com>
2016-03-21 21:29:00 +01:00
Markus Armbruster
a4fa93bf20 ivshmem: Compile debug prints unconditionally to prevent bit-rot
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-13-git-send-email-armbru@redhat.com>
2016-03-21 21:29:00 +01:00
Markus Armbruster
97553976dd ivshmem: Add missing newlines to debug printfs
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-12-git-send-email-armbru@redhat.com>
2016-03-21 21:29:00 +01:00
Markus Armbruster
fdee2025dd ivshmem: Rewrite specification document
This started as an attempt to update ivshmem_device_spec.txt for
clarity, accuracy and completeness while working on its code, and
quickly became a full rewrite.  Since the diff would be useless
anyway, I'm using the opportunity to rename the file to
ivshmem-spec.txt.

I tried hard to ensure the new text contradicts neither the old text
nor the code.  If the new text contradicts the old text but not the
code, it's probably a bug in the old text.  If the new text
contradicts both, its probably a bug in the new text.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-11-git-send-email-armbru@redhat.com>
2016-03-21 21:28:59 +01:00
Markus Armbruster
41b65e5eda ivshmem-test: Improve test cases /ivshmem/server-*
Document missing test: behavior with MSI-X present but not enabled.

For MSI-X, we test and clear the interrupt pending bit before testing
the interrupt.  For INTx, we only clear.  Change to test and clear for
consistency.

Test MSI-X vector 1 in addition to vector 0.

Improve comments.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-10-git-send-email-armbru@redhat.com>
2016-03-21 21:28:59 +01:00
Markus Armbruster
14c5d49ab3 ivshmem-test: Clean up wait for devices to become operational
test_ivshmem_server() waits until the first byte in BAR 2 contains the
0x42 we put into shared memory.  Works because the byte reads zero
until the device maps the shared memory gotten from the server.

Check the IVPosition register instead: it's initially -1, and becomes
non-negative right when the device maps the share memory, so no
change, just cleaner, because it's what guest software is supposed to
do.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-9-git-send-email-armbru@redhat.com>
2016-03-21 21:28:59 +01:00
Markus Armbruster
4958fe5d3c ivshmem-test: Improve test case /ivshmem/single
Test state of registers after reset.

Test reading Interrupt Status clears it.

Test (invalid) read of Doorbell.

Add more comments.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-8-git-send-email-armbru@redhat.com>
2016-03-21 21:28:59 +01:00
Markus Armbruster
998261726a tests/libqos/pci-pc: Fix qpci_pc_iomap() to map BARs aligned
qpci_pc_iomap() maps BARs one after the other, without padding.  This
is wrong.  PCI Local Bus Specification Revision 3.0, 6.2.5.1. Address
Maps: "all address spaces used are a power of two in size and are
naturally aligned".  That's because the size of a BAR is given by the
number of address bits the device decodes, and the BAR needs to be
mapped at a multiple of that size to ensure the address decoding
works.

Fix qpci_pc_iomap() accordingly.  This takes care of a FIXME in
ivshmem-test.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-7-git-send-email-armbru@redhat.com>
2016-03-21 21:28:59 +01:00
Markus Armbruster
330b58368c event_notifier: Make event_notifier_init_fd() #ifdef CONFIG_EVENTFD
Event notifiers are designed for eventfd(2).  They can fall back to
pipes, but according to Paolo, event_notifier_init_fd() really
requires the real thing, and should therefore be under #ifdef
CONFIG_EVENTFD.  Do that.

Its only user is ivshmem, which is currently CONFIG_POSIX.  Narrow it
to CONFIG_EVENTFD.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1458066895-20632-6-git-send-email-armbru@redhat.com>
2016-03-21 21:28:59 +01:00
Peter Maydell
9fa570d57e Merge remote-tracking branch 'remotes/berrange/tags/pull-crypto-2016-03-21-1' into staging
Merge crypto 2016/03/21 v1

# gpg: Signature made Mon 21 Mar 2016 10:05:51 GMT using RSA key ID 15104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"

* remotes/berrange/tags/pull-crypto-2016-03-21-1:
  crypto: fix cipher function signature mismatch with nettle & xts
  crypto: add compat cast5_set_key with  nettle < 3.0.0

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-21 10:19:12 +00:00
Daniel P. Berrange
f7ac78cfe1 crypto: fix cipher function signature mismatch with nettle & xts
For versions of nettle < 3.0.0, the cipher functions took a
'void *ctx' and 'unsigned len' instad of 'const void *ctx'
and 'size_t len'. The xts functions though are builtin to
QEMU and always expect the latter signatures. Define a
second set of wrappers to use with the correct signatures
needed by XTS mode.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-21 10:03:45 +00:00
Daniel P. Berrange
621e6ae657 crypto: add compat cast5_set_key with nettle < 3.0.0
Prior to the nettle 3.0.0 release, the cast5_set_key function
was actually named cast128_set_key, so we must add a compatibility
definition.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-21 10:02:22 +00:00
Stefan Hajnoczi
a284974dee qemu-ga: drop unused local err variable
Commit 125b310e1d ("qemu-ga: move
channel/transport functionality into wrapper class") stopped using the
local err variable in channel_event_cb().

This patch deletes the unused variable.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-03-20 19:51:18 -05:00
Peter Maydell
4829e0378d Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2016-03-18' into staging
QAPI patches for 2016-03-18

# gpg: Signature made Fri 18 Mar 2016 09:54:57 GMT using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"

* remotes/armbru/tags/pull-qapi-2016-03-18:
  qapi: Use anonymous bases in QMP flat unions
  qapi: Allow anonymous base for flat union
  qapi: Make BlockdevOptions doc example closer to reality
  qapi: Don't special-case simple union wrappers
  qapi: Drop unused c_null()
  qapi: Inline gen_visit_members() into lone caller
  qapi-commands: Inline single-use helpers of gen_marshal()
  qapi-commands: Utilize implicit struct visits
  qapi-event: Utilize implicit struct visits
  qapi-event: Drop qmp_output_get_qobject() null check
  qapi: Emit implicit structs in generated C
  qapi: Adjust names of implicit types
  qapi: Make c_type() more OO-like
  qapi: Fix command with named empty argument type
  qapi: Assert in places where variants are not handled

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-18 17:18:41 +00:00
Markus Armbruster
ad4929384b qemu-doc: Fix ivshmem huge page example
Option parameter "share" is missing.  Without it, you get a *private*
mmap(), which defeats ivshmem's purpose pretty thoroughly ;)

While there, switch to the conventional mountpoint of hugetlbfs
/dev/hugepages.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1458066895-20632-5-git-send-email-armbru@redhat.com>
2016-03-18 17:34:55 +01:00
Markus Armbruster
3625c739ea ivshmem-server: Don't overload POSIX shmem and file name
Option -m NAME is interpreted as directory name if we can statfs() it
and its on hugetlbfs.  Else it's interpreted as POSIX shared memory
object name.  This is nuts.

Always interpret -m as directory.  Create new -M for POSIX shared
memory.  Last of -m or -M wins.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1458066895-20632-4-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-03-18 17:34:40 +01:00
Markus Armbruster
e3ad72965a ivshmem-server: Fix and clean up command line help
Burying error messages in ~20 lines of usage help is bad form.  Print
a single line pointing to -h instead.

Print -h help to stdout rather than stderr.  Fix default of -p.  Clean
up the help text a bit.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1458066895-20632-3-git-send-email-armbru@redhat.com>
2016-03-18 17:34:40 +01:00
Markus Armbruster
3be5cc2324 target-ppc: Document TOCTTOU in hugepage support
The code to find the minimum page size is is vulnerable to TOCTTOU.
Added in commit 2d103aa "target-ppc: fix hugepage support when using
memory-backend-file" (v2.4.0).  Since I can't fix it myself right now,
add a FIXME comment.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1458066895-20632-2-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-03-18 17:34:21 +01:00
Prasad J Pandit
dff0367cf6 usb: ehci: add capability mmio write function
USB Ehci emulation supports host controller capability registers.
But its mmio '.write' function was missing, which lead to a null
pointer dereference issue. Add a do nothing 'ehci_caps_write'
definition to avoid it; Do nothing because capability registers
are Read Only(RO).

Reported-by: Zuozhi Fzz <zuozhi.fzz@alibaba-inc.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1454072434-16045-1-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-03-18 14:20:39 +01:00
Matthew Fortune
983bff3530 hw/usb/dev-mtp: Guard inotify usage with CONFIG_INOTIFY1
inotify_init1 usage was guarded by a check for linux but does not
exist on older distributions like CentOS 5 resulting in build
failures.

Signed-off-by: Matthew Fortune <matthew.fortune@imgtec.com>
Message-id: 6D39441BF12EF246A7ABCE6654B023536BB85D4A@hhmail02.hh.imgtec.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-03-18 13:58:15 +01:00
Peter Xu
f34d57d359 usb: fix unbound stack warning for inotify_watchfn
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1457503640-31473-1-git-send-email-peterx@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-03-18 13:56:24 +01:00
Peter Xu
e3d60bc7c6 usb: fix unbound stack usage for usb_mtp_add_str
Use heap instead of stack.

Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-03-18 13:55:16 +01:00
Peter Xu
182b391e79 usb: fix unbounded stack warning for xhci_dma_write_u32s
All the callers for xhci_dma_write_u32s() are using mostly 5 * uint32_t
in len. To avoid unbound stack warning for the function, make it
statically allocated, and assert when it's not big enough in the
future.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-id: 1457661106-9569-1-git-send-email-peterx@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-03-18 13:42:14 +01:00
Stefan Weil
0ab6d12ffd usb: Fix compilation for Windows
Mingw-w64 does not provide sys/ioctl.h and Linux builds don't need it,
so remove that include statement.

ERROR is defined by wingdi.h (included via windows.h). Undefine it before
it is redefined to avoid a compiler warning / error.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-id: 1458159439-32322-1-git-send-email-sw@weilnetz.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-03-18 13:13:30 +01:00
Eric Blake
3666a97f78 qapi: Use anonymous bases in QMP flat unions
Now that the generator supports it, we might as well use an
anonymous base rather than breaking out a single-use Base
structure, for all three of our current QMP flat unions.

Oddly enough, this change does not affect the resulting
introspection output (because we already inline the members of
a base type into an object, and had no independent use of the
base type reachable from a command).

The case_whitelist now has to list the name of an implicit
type; which is not too bad (consider it a feature if it makes
it harder for developers to make the whitelist grow :)

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1458254921-17042-16-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-18 10:29:26 +01:00
Eric Blake
ac4338f8eb qapi: Allow anonymous base for flat union
Rather than requiring all flat unions to explicitly create
a separate base struct, we can allow the qapi schema to specify
the common members via an inline dictionary. This is similar to
how commands can specify an inline anonymous type for its 'data'.
We already have several struct types that only exist to serve as
a single flat union's base; the next commit will clean them up.
In particular, this patch's change to the BlockdevOptions example
in qapi-code-gen.txt will actually be done in the real QAPI schema.

Now that anonymous bases are legal, we need to rework the
flat-union-bad-base negative test (as previously written, it
forms what is now valid QAPI; tweak it to now provide coverage
of a new error message path), and add a positive test in
qapi-schema-test to use an anonymous base (making the integer
argument optional, for even more coverage).

Note that this patch only allows anonymous bases for flat unions;
simple unions are already enough syntactic sugar that we do not
want to burden them further.  Meanwhile, while it would be easy
to also allow an anonymous base for structs, that would be quite
redundant, as the members can be put right into the struct
instead.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1458254921-17042-15-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-18 10:29:26 +01:00
Eric Blake
bd59adce69 qapi: Make BlockdevOptions doc example closer to reality
Although we don't want to repeat the entire BlockdevOptions
QMP command in the example, it helps if we aren't needlessly
diverging (the initial example was written before we had
committed the actual QMP interface).  Use names that match what
is found in qapi/block-core.json, such as '*read-only' rather
than 'readonly', or 'BlockdevRef' rather than 'BlockRef'.

For the simple union example, invent BlockdevOptionsSimple so
that later text is unambiguous which of the two union forms is
meant (telling the user to refer back to two 'BlockdevOptions'
wasn't nice, and QMP has only the flat union form).

Also, mention that the discriminator of a flat union is
non-optional.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1458254921-17042-14-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-18 10:29:26 +01:00
Eric Blake
32bafa8fdd qapi: Don't special-case simple union wrappers
Simple unions were carrying a special case that hid their 'data'
QMP member from the resulting C struct, via the hack method
QAPISchemaObjectTypeVariant.simple_union_type().  But by using
the work we started by unboxing flat union and alternate
branches, coupled with the ability to visit the members of an
implicit type, we can now expose the simple union's implicit
type in qapi-types.h:

| struct q_obj_ImageInfoSpecificQCow2_wrapper {
|     ImageInfoSpecificQCow2 *data;
| };
|
| struct q_obj_ImageInfoSpecificVmdk_wrapper {
|     ImageInfoSpecificVmdk *data;
| };
...
| struct ImageInfoSpecific {
|     ImageInfoSpecificKind type;
|     union { /* union tag is @type */
|         void *data;
|-        ImageInfoSpecificQCow2 *qcow2;
|-        ImageInfoSpecificVmdk *vmdk;
|+        q_obj_ImageInfoSpecificQCow2_wrapper qcow2;
|+        q_obj_ImageInfoSpecificVmdk_wrapper vmdk;
|     } u;
| };

Doing this removes asymmetry between QAPI's QMP side and its
C side (both sides now expose 'data'), and means that the
treatment of a simple union as sugar for a flat union is now
equivalent in both languages (previously the two approaches used
a different layer of dereferencing, where the simple union could
be converted to a flat union with equivalent C layout but
different {} on the wire, or to an equivalent QMP wire form
but with different C representation).  Using the implicit type
also lets us get rid of the simple_union_type() hack.

Of course, now all clients of simple unions have to adjust from
using su->u.member to using su->u.member.data; while this touches
a number of files in the tree, some earlier cleanup patches
helped minimize the change to the initialization of a temporary
variable rather than every single member access.  The generated
qapi-visit.c code is also affected by the layout change:

|@@ -7393,10 +7393,10 @@ void visit_type_ImageInfoSpecific_member
|     }
|     switch (obj->type) {
|     case IMAGE_INFO_SPECIFIC_KIND_QCOW2:
|-        visit_type_ImageInfoSpecificQCow2(v, "data", &obj->u.qcow2, &err);
|+        visit_type_q_obj_ImageInfoSpecificQCow2_wrapper_members(v, &obj->u.qcow2, &err);
|         break;
|     case IMAGE_INFO_SPECIFIC_KIND_VMDK:
|-        visit_type_ImageInfoSpecificVmdk(v, "data", &obj->u.vmdk, &err);
|+        visit_type_q_obj_ImageInfoSpecificVmdk_wrapper_members(v, &obj->u.vmdk, &err);
|         break;
|     default:
|         abort();

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1458254921-17042-13-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-18 10:29:26 +01:00
Eric Blake
861877a0dd qapi: Drop unused c_null()
Now that we are always bulk-initializing a QAPI C struct to 0
(whether by g_malloc0() or by 'Type arg = {0};'), we no longer
have any clients of c_null() in the generator for per-element
initialization.  This patch is easy enough to revert if we find
a use in the future, but in the present, get rid of the dead code.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1458254921-17042-12-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-18 10:29:26 +01:00
Eric Blake
12f254fd5f qapi: Inline gen_visit_members() into lone caller
Commit 82ca8e46 noticed that we had multiple implementations of
visiting every member of a struct, and consolidated it into
gen_visit_fields() (now gen_visit_members()) with enough
parameters to cater to slight differences between the clients.
But recent exposure of implicit types has meant that we are now
down to a single use of that method, so we can clean up the
unused conditionals and just inline it into the remaining
caller: gen_visit_object_members().

Likewise, gen_err_check() no longer needs optional parameters,
as the lone use of non-defaults was via gen_visit_members().

No change to generated code.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1458254921-17042-11-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-18 10:29:26 +01:00
Eric Blake
c1ff0e6c85 qapi-commands: Inline single-use helpers of gen_marshal()
Originally, gen_marshal_input_visit() (or gen_visitor_input_block()
before commit f1538019) was factored out to make it easy to do two
passes of a visit to each member of a (possibly-implicit) object,
without duplicating lots of code.  But after recent changes, those
visits now occupy a single line of emitted code, and the helper
method has become a series of conditionals both before and after
the one important line, making it rather awkward to see at a glance
what gets emitted on the first (parsing) or second (deallocation)
pass.  It's a lot easier to read the generator code if we just
inline both uses directly into gen_marshal(), without all the
conditionals.

Once we've done that, it's easy to notice that gen_marshal_vars()
is used only once, and inlining it too lets us consolidate some
mcgen() calls that used to be split across helpers.

gen_call() remains a single-use helper function, but it has
enough indentation and complexity that inlining it would hamper
legibility.

No change to generated output.  The fact that the diffstat shows
a net reduction in lines is an argument in favor of this cleanup.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1458254921-17042-10-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-18 10:29:26 +01:00
Eric Blake
386230a249 qapi-commands: Utilize implicit struct visits
Rather than generate inline per-member visits, take advantage
of the 'visit_type_FOO_members()' function for command
marshalling.  This is possible now that implicit structs can be
visited like any other.  Generate call arguments from a stack-
allocated struct, rather than a list of local variables:

|@@ -57,26 +57,15 @@ void qmp_marshal_add_fd(QDict *args, QOb
|     QmpInputVisitor *qiv = qmp_input_visitor_new_strict(QOBJECT(args));
|     QapiDeallocVisitor *qdv;
|     Visitor *v;
|-    bool has_fdset_id = false;
|-    int64_t fdset_id = 0;
|-    bool has_opaque = false;
|-    char *opaque = NULL;
|+    q_obj_add_fd_arg arg = {0};
|
|     v = qmp_input_get_visitor(qiv);
|-    if (visit_optional(v, "fdset-id", &has_fdset_id)) {
|-        visit_type_int(v, "fdset-id", &fdset_id, &err);
|-        if (err) {
|-            goto out;
|-        }
|-    }
|-    if (visit_optional(v, "opaque", &has_opaque)) {
|-        visit_type_str(v, "opaque", &opaque, &err);
|-        if (err) {
|-            goto out;
|-        }
|+    visit_type_q_obj_add_fd_arg_members(v, &arg, &err);
|+    if (err) {
|+        goto out;
|     }
|
|-    retval = qmp_add_fd(has_fdset_id, fdset_id, has_opaque, opaque, &err);
|+    retval = qmp_add_fd(arg.has_fdset_id, arg.fdset_id, arg.has_opaque, arg.opaque, &err);
|     if (err) {
|         goto out;
|     }
|@@ -88,12 +77,7 @@ out:
|     qmp_input_visitor_cleanup(qiv);
|     qdv = qapi_dealloc_visitor_new();
|     v = qapi_dealloc_get_visitor(qdv);
|-    if (visit_optional(v, "fdset-id", &has_fdset_id)) {
|-        visit_type_int(v, "fdset-id", &fdset_id, NULL);
|-    }
|-    if (visit_optional(v, "opaque", &has_opaque)) {
|-        visit_type_str(v, "opaque", &opaque, NULL);
|-    }
|+    visit_type_q_obj_add_fd_arg_members(v, &arg, NULL);
|     qapi_dealloc_visitor_cleanup(qdv);
| }

This also has the nice side effect of eliminating a chance of
collision between argument QMP names and local variables.

This patch also paves the way for some followup simplifications
in the generator, in subsequent patches.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1458254921-17042-9-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-18 10:29:25 +01:00
Eric Blake
0949e95b48 qapi-event: Utilize implicit struct visits
Rather than generate inline per-member visits, take advantage
of the 'visit_type_FOO_members()' function for emitting events.
This is possible now that implicit structs can be visited like
any other.  Generated code shrinks accordingly; by initializing
a struct based on parameters, through a new gen_param_var()
helper, like:

|@@ -338,6 +250,9 @@ void qapi_event_send_block_job_error(con
|     QMPEventFuncEmit emit = qmp_event_get_func_emit();
|     QmpOutputVisitor *qov;
|     Visitor *v;
|+    q_obj_BLOCK_JOB_ERROR_arg param = {
|+        (char *)device, operation, action
|+    };
|
|     if (!emit) {
|         return;
@@ -351,19 +266,7 @@ void qapi_event_send_block_job_error(con
|     if (err) {
|         goto out;
|     }
|-    visit_type_str(v, "device", (char **)&device, &err);
|-    if (err) {
|-        goto out_obj;
|-    }
|-    visit_type_IoOperationType(v, "operation", &operation, &err);
|-    if (err) {
|-        goto out_obj;
|-    }
|-    visit_type_BlockErrorAction(v, "action", &action, &err);
|-    if (err) {
|-        goto out_obj;
|-    }
|-out_obj:
|+    visit_type_q_obj_BLOCK_JOB_ERROR_arg_members(v, &param, &err);
|     visit_end_struct(v, err ? NULL : &err);

Notice that the initialization of 'param' has to cast away const
(just as the old gen_visit_members() had to do): we can't change
the signature of the user function (which uses 'const char *'), but
have to assign it to a non-const QAPI object (which requires
'char *').

While touching this, document with a FIXME comment that there is
still a potential collision between QMP members and our choice of
local variable names within qapi_event_send_FOO().

This patch also paves the way for some followup simplifications
in the generator, in subsequent patches.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1458254921-17042-8-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-18 10:29:25 +01:00
Eric Blake
8df59565d2 qapi-event: Drop qmp_output_get_qobject() null check
qmp_output_get_qobject() was changed never to return null some time
ago (in commit 6c2f9a15), but the qapi_event_send_FOO() functions
still check.  Clean that up:

|@@ -28,7 +28,6 @@ void qapi_event_send_acpi_device_ost(ACP
|     QMPEventFuncEmit emit;
|     QmpOutputVisitor *qov;
|     Visitor *v;
|-    QObject *obj;
|
|     emit = qmp_event_get_func_emit();
|     if (!emit) {
|@@ -54,10 +53,7 @@ out_obj:
|         goto out;
|     }
|
|-    obj = qmp_output_get_qobject(qov);
|-    g_assert(obj);
|-
|-    qdict_put_obj(qmp, "data", obj);
|+    qdict_put_obj(qmp, "data", qmp_output_get_qobject(qov));
|     emit(QAPI_EVENT_ACPI_DEVICE_OST, qmp, &err);
|
| out:

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1458254921-17042-7-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-18 10:29:25 +01:00
Eric Blake
7ce106a96f qapi: Emit implicit structs in generated C
We already have several places that want to visit all the members
of an implicit object within a larger context (simple union variant,
event with anonymous data, command with anonymous arguments struct);
and will be adding another one soon (the ability to declare an
anonymous base for a flat union).  Having a C struct declared for
these implicit types, along with a visit_type_FOO_members() helper
function, will make for fewer special cases in our generator.

We do not, however, need qapi_free_FOO() or visit_type_FOO()
functions for implicit types, because they should not be used
directly outside of the generated code.  This is done by adding a
conditional in visit_object_type() for both qapi-types.py and
qapi-visit.py based on the object name.  The comparison of
"name.startswith('q_')" is a bit hacky (it's basically duplicating
what .is_implicit() already uses), but beats changing the signature
of the visit_object_type() callback to pass a new 'implicit' flag.
The hack should be temporary: we are considering adding a future
patch that consolidates the narrow visit_object_type(..., base,
local_members, variants) and visit_object_type_flat(...,
all_members, variants) [where different sets of information are
already broken out, and the QAPISchemaObjectType is no longer
available] into a broader visit_object_type(obj_type) [where the
visitor can query the needed fields from obj_type directly].

Also, now that we WANT to output C code for implicits, we no longer
need the visit_needed() filter, leaving 'q_empty' as the only object
still needing a special case.  Remember, 'q_empty' is the only
built-in generated object, which means that without a special case
it would be emitted in multiple files (the main qapi-types.h and in
qga-qapi-types.h) causing compilation failure due to redefinition.
But since it has no members, it's easier to just avoid an attempt to
visit that particular type; since gen_object() is called recursively,
we also prime the objects_seen set to cover any recursion into the
empty type.

The patch relies on the changed naming of implicit types in the
previous patch.  It is a bit unfortunate that the generated struct
names and visit_type_FOO_members() don't match normal naming
conventions, but it's not too bad, since they will only be used in
generated code.

The generated code grows substantially in size: the implicit
'-wrapper' types must be emitted in qapi-types.h before any union
can include an unboxed member of that type.  Arguably, the '-args'
types could be emitted in a private header for just qapi-visit.c
and qmp-marshal.c, rather than polluting qapi-types.h; but adding
complexity to the generator to split the output location according
to role doesn't seem worth the maintenance costs.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1458254921-17042-6-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-18 10:29:25 +01:00
Eric Blake
7599697c66 qapi: Adjust names of implicit types
The original choice of ':obj-' as the prefix for implicit types
made it obvious that we weren't going to clash with any user-defined
names, which cannot contain ':'.  But now we want to create structs
for implicit types, to get rid of special cases in the generators,
and our use of ':' in implicit names needs a tweak to produce valid
C code.

We could transliterate ':' to '_', except that C99 mandates that
"identifiers that begin with an underscore are always reserved for
use as identifiers with file scope in both the ordinary and tag name
spaces".  So it's time to change our naming convention: we can
instead use the 'q_' prefix that we reserved for ourselves back in
commit 9fb081e0.  Technically, since we aren't planning on exposing
the empty type in generated code, we could keep the name ':empty',
but renaming it to 'q_empty' makes the check for startswith('q_')
cover all implicit types, whether or not code is generated for them.

As long as we don't declare 'empty' or 'obj' ticklish, it shouldn't
clash with c_name() prepending 'q_' to the user's ticklish names.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1458254921-17042-5-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-18 10:29:25 +01:00
Eric Blake
4040d995e4 qapi: Make c_type() more OO-like
QAPISchemaType.c_type() is a bit awkward: it takes two optional
boolean flags is_param and is_unboxed, and they should never both
be True.

Add a new method for each of the flags, and drop the flags from
c_type().

Most callers pass no flags; they remain unchanged.

One caller passes is_param=True; call the new .c_param_type()
instead.

One caller passes is_unboxed=True, except for simple union types.
This is actually an ugly special case that will go away soon, so
until then, we now have to call either .c_type() or the new
.c_unboxed_type().  Tolerable in the interim.

It requires slightly more Python, but is arguably easier to read.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1458254921-17042-4-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-18 10:29:25 +01:00
Eric Blake
972a110162 qapi: Fix command with named empty argument type
The generator special-cased

 { 'command':'foo', 'data': {} }

to avoid emitting a visitor variable, but failed to see that

 { 'struct':'NamedEmptyType, 'data': {} }
 { 'command':'foo', 'data':'NamedEmptyType' }

needs the same treatment.  There, the generator happily generates a
visitor to get no arguments, and a visitor to destroy no arguments;
and the compiler isn't happy with that, as demonstrated by the updated
qapi-schema-test.json:

  tests/test-qmp-marshal.c: In function ‘qmp_marshal_user_def_cmd0’:
  tests/test-qmp-marshal.c:264:14: error: variable ‘v’ set but not used [-Werror=unused-but-set-variable]
       Visitor *v;
                ^

No change to generated code except for the testsuite addition.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1458254921-17042-3-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-18 10:29:25 +01:00
Eric Blake
29f6bd15eb qapi: Assert in places where variants are not handled
We are getting closer to the point where we could use one union
as the base or variant type within another union type (as long
as there are no collisions between any possible combination of
member names allowed across all discriminator choices).  But
until we get to that point, it is worth asserting that variants
are not present in places where we are not prepared to handle
them: when exploding a type into a parameter list, we do not
expect variants.  The qapi.py code is already checking this,
via the older check_type() method; but someday we hope to get
rid of that and move checking into QAPISchema*.check().  The
two asserts added here make sure any refactoring still catches
problems, and makes it locally obvious why we can iterate over
only type.members without worrying about type.variants.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1458254921-17042-2-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-18 10:29:25 +01:00
Peter Maydell
879c26fb9f Merge remote-tracking branch 'remotes/berrange/tags/pull-qcrypto-2016-03-17-3' into staging
Merge QCrypto 2016/03/17 v3

# gpg: Signature made Thu 17 Mar 2016 16:51:32 GMT using RSA key ID 15104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"

* remotes/berrange/tags/pull-qcrypto-2016-03-17-3:
  crypto: implement the LUKS block encryption format
  crypto: add block encryption framework
  crypto: wire up XTS mode for cipher APIs
  crypto: refactor code for dealing with AES cipher
  crypto: import an implementation of the XTS cipher mode
  crypto: add support for the twofish cipher algorithm
  crypto: add support for the serpent cipher algorithm
  crypto: add support for the cast5-128 cipher algorithm
  crypto: skip testing of unsupported cipher algorithms
  crypto: add support for anti-forensic split algorithm
  crypto: add support for generating initialization vectors
  crypto: add support for PBKDF2 algorithm
  crypto: add cryptographic random byte source

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-17 16:57:50 +00:00
Daniel P. Berrange
3e308f20ed crypto: implement the LUKS block encryption format
Provide a block encryption implementation that follows the
LUKS/dm-crypt specification.

This supports all combinations of hash, cipher algorithm,
cipher mode and iv generator that are implemented by the
current crypto layer.

There is support for opening existing volumes formatted
by dm-crypt, and for formatting new volumes. In the latter
case it will only use key slot 0.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-17 16:50:40 +00:00
Peter Maydell
6741d38ad0 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches

# gpg: Signature made Thu 17 Mar 2016 15:49:29 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (29 commits)
  iotests: Test QUORUM_REPORT_BAD in fifo mode
  quorum: Emit QUORUM_REPORT_BAD for reads in fifo mode
  block: Use blk_co_pwritev() in blk_co_write_zeroes()
  block: Use blk_aio_prwv() for aio_read/write/write_zeroes
  block: Use blk_prw() in blk_pread()/blk_pwrite()
  block: Use blk_co_pwritev() in blk_write_zeroes()
  block: Pull up blk_read_unthrottled() implementation
  block: Use blk_co_pwritev() for blk_write()
  block: Use blk_co_preadv() for blk_read()
  block: Use BdrvChild in BlockBackend
  block: Remove bdrv_states list
  block: Use bdrv_next() instead of bdrv_states
  block: Rewrite bdrv_next()
  block: Add blk_next_root_bs()
  block: Add bdrv_next_monitor_owned()
  block: Move some bdrv_*_all() functions to BB
  blockdev: Remove blk_hide_on_behalf_of_hmp_drive_del()
  blockdev: Split monitor reference from BB creation
  blockdev: Separate BB name management
  blockdev: Add list of all BlockBackends
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-17 15:59:42 +00:00
Kevin Wolf
361dca7a5a Merge remote-tracking branch 'mreitz/tags/pull-block-for-kevin-2016-03-17-v2' into queue-block
Two quorum patches for the block queue, v2.

# gpg: Signature made Thu Mar 17 16:44:11 2016 CET using RSA key ID E838ACAD
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"

* mreitz/tags/pull-block-for-kevin-2016-03-17-v2:
  iotests: Test QUORUM_REPORT_BAD in fifo mode
  quorum: Emit QUORUM_REPORT_BAD for reads in fifo mode

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 16:48:49 +01:00
Alberto Garcia
509565f36f iotests: Test QUORUM_REPORT_BAD in fifo mode
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: c0a8dbfdbe939520cda5f661af6f1cd7b6b4df9d.1458034554.git.berto@igalia.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-03-17 16:43:30 +01:00
Alberto Garcia
6049490df4 quorum: Emit QUORUM_REPORT_BAD for reads in fifo mode
If there's an I/O error in one of Quorum children then QEMU
should emit QUORUM_REPORT_BAD. However this is not working with
read-pattern=fifo. This patch fixes this problem.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: d57e39e8d3e8564003a1e2aadbd29c97286eb2d2.1458034554.git.berto@igalia.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-03-17 16:43:30 +01:00
Kevin Wolf
8896e08814 block: Use blk_co_pwritev() in blk_co_write_zeroes()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 16:30:00 +01:00
Kevin Wolf
57d6a42883 block: Use blk_aio_prwv() for aio_read/write/write_zeroes
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 16:30:00 +01:00
Kevin Wolf
a55d3fba99 block: Use blk_prw() in blk_pread()/blk_pwrite()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:57 +01:00
Kevin Wolf
fc1453cdfc block: Use blk_co_pwritev() in blk_write_zeroes()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:57 +01:00
Kevin Wolf
5bd5119667 block: Pull up blk_read_unthrottled() implementation
Use blk_read(), so that it goes through blk_co_preadv() like all read
requests from the BB to the BDS.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:57 +01:00
Kevin Wolf
a8823a3bfd block: Use blk_co_pwritev() for blk_write()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:57 +01:00
Kevin Wolf
1bf1cbc91f block: Use blk_co_preadv() for blk_read()
This patch introduces blk_co_preadv() as a central function on the
BlockBackend level that is supposed to handle all read requests from the
BB to its root BDS eventually.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:57 +01:00
Kevin Wolf
f21d96d04b block: Use BdrvChild in BlockBackend
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:57 +01:00
Max Reitz
9aaf28c61d block: Remove bdrv_states list
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:57 +01:00
Max Reitz
79720af640 block: Use bdrv_next() instead of bdrv_states
There is no point in manually iterating through the bdrv_states list
when there is bdrv_next().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:57 +01:00
Max Reitz
2626058034 block: Rewrite bdrv_next()
Instead of using the bdrv_states list, iterate over all the
BlockDriverStates attached to BlockBackends, and over all the
monitor-owned BDSs afterwards (except for those attached to a BB).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Max Reitz
981f4f578e block: Add blk_next_root_bs()
This function iterates over all BDSs attached to a BB. We are going to
need it when rewriting bdrv_next() so it no longer uses bdrv_states.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Max Reitz
262b4e8f74 block: Add bdrv_next_monitor_owned()
Add a function for iterating over all monitor-owned BlockDriverStates so
the generic block layer can do so.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Max Reitz
fe1a9cbc33 block: Move some bdrv_*_all() functions to BB
Move bdrv_commit_all() and bdrv_flush_all() to the BlockBackend level.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Max Reitz
7c735873d9 blockdev: Remove blk_hide_on_behalf_of_hmp_drive_del()
We can basically inline it in hmp_drive_del(); monitor_remove_blk() is
called already, so we just need to call bdrv_make_anon(), too.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Max Reitz
efaa7c4eeb blockdev: Split monitor reference from BB creation
Before this patch, blk_new() automatically assigned a name to the new
BlockBackend and considered it referenced by the monitor. This patch
removes the implicit monitor_add_blk() call from blk_new() (and
consequently the monitor_remove_blk() call from blk_delete(), too) and
thus blk_new() (and related functions) no longer take a BB name
argument.

In fact, there is only a single point where blk_new()/blk_new_open() is
called and the new BB is monitor-owned, and that is in blockdev_init().
Besides thus relieving us from having to invent names for all of the BBs
we use in qemu-img, this fixes a bug where qemu cannot create a new
image if there already is a monitor-owned BB named "image".

If a BB and its BDS tree are created in a single operation, as of this
patch the BDS tree will be created before the BB is given a name
(whereas it was the other way around before). This results in minor
change to the output of iotest 087, whose reference output is amended
accordingly.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Max Reitz
e5e785500b blockdev: Separate BB name management
Introduce separate functions (monitor_add_blk() and
monitor_remove_blk()) which set or unset a BB name. Since the name is
equivalent to the monitor's reference to a BB, adding a name the same as
declaring the BB to be monitor-owned and removing it revokes this
status, hence the function names.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Max Reitz
2cf22d6a1a blockdev: Add list of all BlockBackends
While monitor_block_backends contains nearly all BBs, we sometimes
really need all BBs. To this end, this patch adds the block_backend
list.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Max Reitz
9492b0b928 blockdev: Rename blk_backends
The blk_backends list does not contain all BlockBackends but only the
ones which are referenced by the monitor, and that is not necessarily
true for every BlockBackend. Rename the list to monitor_block_backends
to make that fact clear.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Max Reitz
d0e46a5577 block: Drop BB name from bad option error
The information which BB is concerned does not seem useful enough to
justify its existence in most other place (which may be related to qemu
printing the -drive parameter in question anyway, and for blockdev-add
the attribution is naturally unambiguous). Furthermore, as of a future
patch, bdrv_get_device_name(bs) will always return the empty string
before bdrv_open_inherit() returns.

Therefore, just dropping that information seems to be the best course of
action.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Max Reitz
a55448b368 qapi: Drop QERR_UNKNOWN_BLOCK_FORMAT_FEATURE
Just specifying a custom string is simpler in basically all places that
used it, and in addition, specifying the BB or node name is something we
generally do not do in other error messages when opening a BDS, so we
should not do it here.

This changes the output for iotest 036 (to the better, in my opinion),
so the reference output needs to be changed accordingly.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Max Reitz
da31d594cf block: Use blk_{commit,flush}_all() consistently
Replace bdrv_commmit_all() and bdrv_flush_all() by their BlockBackend
equivalents.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Max Reitz
1393f21270 block: Add blk_commit_all()
Later, we will remove bdrv_commit_all() and move its contents here, and
in order to replace bdrv_commit_all() calls by calls to blk_commit_all()
before doing so, we need to add it as an alias now.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Max Reitz
74d1b8fc27 block: Use blk_next() in block-backend.c
Instead of iterating directly through blk_backends, we can use
blk_next() instead. This gives us some abstraction from the list itself
which we can use to rename it, for example.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Max Reitz
da27a00e27 monitor: Use BB list for BB name completion
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-17 15:47:56 +01:00
Kevin Wolf
f8746fb804 block: Fix memory leak in hmp_drive_add_node()
hmp_drive_add_node() leaked qdict in the error path when no node-name is
specified.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-03-17 15:47:56 +01:00
Kevin Wolf
23f7fcb295 block: Fix qemu_root_bds_opts.head initialisation
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-17 15:47:56 +01:00
Daniel P. Berrange
7d9690148a crypto: add block encryption framework
Add a generic framework for supporting different block encryption
formats. Upon instantiating a QCryptoBlock object, it will read
the encryption header and extract the encryption keys. It is
then possible to call methods to encrypt/decrypt data buffers.

There is also a mode whereby it will create/initialize a new
encryption header on a previously unformatted volume.

The initial framework comes with support for the legacy QCow
AES based encryption. This enables code in the QCow driver to
be consolidated later.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-17 14:41:15 +00:00
Daniel P. Berrange
eaec903c5b crypto: wire up XTS mode for cipher APIs
Introduce 'XTS' as a permitted mode for the cipher APIs.
With XTS the key provided must be twice the size of the
key normally required for any given algorithm. This is
because the key will be split into two pieces for use
in XTS mode.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-17 14:41:15 +00:00
Daniel P. Berrange
e3ba0b6701 crypto: refactor code for dealing with AES cipher
The built-in and nettle cipher backends for AES maintain
two separate AES contexts, one for encryption and one for
decryption. This is going to be inconvenient for the future
code dealing with XTS, so wrap them up in a single struct
so there is just one pointer to pass around for both
encryption and decryption.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-17 14:41:15 +00:00
Daniel P. Berrange
84f7f180b0 crypto: import an implementation of the XTS cipher mode
The XTS (XEX with tweaked-codebook and ciphertext stealing)
cipher mode is commonly used in full disk encryption. There
is unfortunately no implementation of it in either libgcrypt
or nettle, so we need to provide our own.

The libtomcrypt project provides a repository of crypto
algorithms under a choice of either "public domain" or
the "what the fuck public license".

So this impl is taken from the libtomcrypt GIT repo and
adapted to be compatible with the way we need to call
ciphers provided by nettle/gcrypt.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-17 14:41:15 +00:00
Daniel P. Berrange
50f6753e27 crypto: add support for the twofish cipher algorithm
New cipher algorithms 'twofish-128', 'twofish-192' and
'twofish-256' are defined for the Twofish algorithm.
The gcrypt backend does not support 'twofish-192'.

The nettle and gcrypt cipher backends are updated to
support the new cipher and a test vector added to the
cipher test suite. The new algorithm is enabled in the
LUKS block encryption driver.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-17 14:41:15 +00:00
Daniel P. Berrange
94318522ed crypto: add support for the serpent cipher algorithm
New cipher algorithms 'serpent-128', 'serpent-192' and
'serpent-256' are defined for the Serpent algorithm.

The nettle and gcrypt cipher backends are updated to
support the new cipher and a test vector added to the
cipher test suite. The new algorithm is enabled in the
LUKS block encryption driver.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-17 14:41:15 +00:00
Daniel P. Berrange
084a85eedd crypto: add support for the cast5-128 cipher algorithm
A new cipher algorithm 'cast-5-128' is defined for the
Cast-5 algorithm with 128 bit key size. Smaller key sizes
are supported by Cast-5, but nothing in QEMU should use
them, so only 128 bit keys are permitted.

The nettle and gcrypt cipher backends are updated to
support the new cipher and a test vector added to the
cipher test suite. The new algorithm is enabled in the
LUKS block encryption driver.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-17 14:41:15 +00:00
Daniel P. Berrange
aa41363598 crypto: skip testing of unsupported cipher algorithms
We don't guarantee that all crypto backends will support
all cipher algorithms, so we should skip tests unless
the crypto backend indicates support.

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-17 14:41:14 +00:00
Daniel P. Berrange
5a95e0fccd crypto: add support for anti-forensic split algorithm
The LUKS format specifies an anti-forensic split algorithm which
is used to artificially expand the size of the key material on
disk. This is an implementation of that algorithm.

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-17 14:41:14 +00:00
Daniel P. Berrange
cb730894ae crypto: add support for generating initialization vectors
There are a number of different algorithms that can be used
to generate initialization vectors for disk encryption. This
introduces a simple internal QCryptoBlockIV object to provide
a consistent internal API to the different algorithms. The
initially implemented algorithms are 'plain', 'plain64' and
'essiv', each matching the same named algorithm provided
by the Linux kernel dm-crypt driver.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-17 14:41:14 +00:00
Daniel P. Berrange
37788f253a crypto: add support for PBKDF2 algorithm
The LUKS data format includes use of PBKDF2 (Password-Based
Key Derivation Function). The Nettle library can provide
an implementation of this, but we don't want code directly
depending on a specific crypto library backend. Introduce
a new include/crypto/pbkdf.h header which defines a QEMU
API for invoking PBKDK2. The initial implementations are
backed by nettle & gcrypt, which are commonly available
with distros shipping GNUTLS.

The test suite data is taken from the cryptsetup codebase
under the LGPLv2.1+ license. This merely aims to verify
that whatever backend we provide for this function in QEMU
will comply with the spec.

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-17 14:41:07 +00:00
Peter Maydell
331ac65963 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Thu 17 Mar 2016 11:08:28 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request:
  Revert "qed: Implement .bdrv_drain"
  aio-posix: Change CONFIG_EPOLL to CONFIG_EPOLL_CREATE1

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-17 11:27:54 +00:00
Stefan Hajnoczi
1f3ddfcb25 Revert "qed: Implement .bdrv_drain"
This reverts commit df9a681dc9.

Note that commit df9a681dc9 included some
unrelated hunks, possibly due to a merge failure or an overlooked
squash.  This only reverts the qed .bdrv_drain() implementation.

The qed .bdrv_drain() implementation is unsafe and can lead to a double
request completion.

Paolo Bonzini reports:
"The problem is that bdrv_qed_drain calls qed_plug_allocating_write_reqs
unconditionally, but this is not correct if an allocating write is
queued.  In this case, qed_unplug_allocating_write_reqs will restart the
allocating write and possibly cause it to complete.  The aiocb however
is still in use for the L2/L1 table writes, and will then be completed
again as soon as the table writes are stable."

For QEMU 2.6 we can simply revert this commit.  A full solution for the
qed need check timer may be added if the bdrv_drain() implementation is
extended.

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1457431876-8475-1-git-send-email-stefanha@redhat.com
2016-03-17 09:50:14 +00:00
Matthew Fortune
147dfab747 aio-posix: Change CONFIG_EPOLL to CONFIG_EPOLL_CREATE1
CONFIG_EPOLL was being used to guard epoll_create1 which results
in build failures on CentOS 5.

Signed-off-by: Matthew Fortune <matthew.fortune@imgtec.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 6D39441BF12EF246A7ABCE6654B023536BB85D08@hhmail02.hh.imgtec.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-03-17 09:50:14 +00:00
Daniel P. Berrange
b917da4cbd crypto: add cryptographic random byte source
There are three backend impls provided. The preferred
is gnutls, which is backed by nettle in modern distros.
The gcrypt impl is provided for cases where QEMU build
against gnutls is disabled, but crypto is still desired.
No nettle impl is provided, since it is non-trivial to
use the nettle APIs for random numbers. Users of nettle
should ensure gnutls is enabled for QEMU.

Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-17 09:49:01 +00:00
Peter Maydell
8c45754724 Merge remote-tracking branch 'remotes/ehabkost/tags/machine-pull-request' into staging
Machine Core queue, 2016-03-16

# gpg: Signature made Wed 16 Mar 2016 18:57:34 GMT using RSA key ID 984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"

* remotes/ehabkost/tags/machine-pull-request:
  module: Rename machine_init() to opts_init()
  machine: Use type_init() to register machine classes

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-17 08:52:58 +00:00
Eduardo Habkost
34294e2f54 module: Rename machine_init() to opts_init()
The only remaining users of machine_init() only call
qemu_add_opts(). Rename machine_init() to opts_init() and move it
closer to the qemu_add_opts() calls on vl.c.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-03-16 15:54:23 -03:00
Eduardo Habkost
0e6aac87fd machine: Use type_init() to register machine classes
Change all machine_init() users that simply call type_register*()
to use type_init().

Cc: Evgeny Voevodin <e.voevodin@samsung.com>
Cc: Maksim Kozlov <m.kozlov@samsung.com>
Cc: Igor Mitsyanko <i.mitsyanko@gmail.com>
Cc: Dmitry Solodkiy <d.solodkiy@samsung.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Rob Herring <robh@kernel.org>
Cc: Andrzej Zaborowski <balrogg@gmail.com>
Cc: Michael Walle <michael@walle.cc>
Cc: "Hervé Poussineau" <hpoussin@reactos.org>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Max Filippov <jcmvbkbc@gmail.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Acked-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-03-16 15:34:05 -03:00
Peter Maydell
33616ace9f Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging
# gpg: Signature made Wed 16 Mar 2016 17:33:44 GMT using RSA key ID C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"

* remotes/cody/tags/block-pull-request:
  MAINTAINERS: Fix typo, block/stream.h -> block/stream.c
  block/sheepdog: fix argument passed to qemu_strtoul()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 18:20:10 +00:00
Peter Maydell
d1f8764099 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20160316-1' into staging
target-arm queue:
 * loader: Fix incorrect parameter name in load_image_mr()
 * Implement MRS (banked) and MSR (banked) instructions
 * virt: Implement versioning for machine model
 * i.MX: some initial patches preparing for i.MX6 support
 * new ASPEED AST2400 SoC and palmetto-bmc machine
 * bcm2835: add some more raspi2 devices
 * sd: fix segfault running "info qtree"

# gpg: Signature made Wed 16 Mar 2016 17:42:43 GMT using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"

* remotes/pmaydell/tags/pull-target-arm-20160316-1: (21 commits)
  sd: Fix "info qtree" on boards with SD cards
  bcm2835_dma: add emulation of Raspberry Pi DMA controller
  bcm2835_property: implement framebuffer control/configuration properties
  bcm2835_fb: add framebuffer device for Raspberry Pi
  bcm2835_aux: add emulation of BCM2835 AUX (aka UART1) block
  bcm2835_peripherals: enable sdhci pending-insert quirk for raspberry pi
  hw/arm: Add palmetto-bmc machine
  hw/arm: Add ASPEED AST2400 SoC model
  hw/intc: Add (new) ASPEED VIC device model
  hw/timer: Add ASPEED timer device model
  i.MX: Add missing descriptions in devices.
  i.MX: Add i.MX6 CCM and ANALOG device.
  i.MX: Add the CLK_IPG_HIGH clock
  i.MX: Remove CCM useless clock computation handling.
  i.MX: Rename CCM NOCLK to CLK_NONE for naming consistency.
  i.MX: Allow GPT timer to rollover.
  arm: virt: Move machine class init code to the abstract machine type
  arm: virt: Add an abstract ARM virt machine type
  target-arm: Fix translation level on early translation faults
  target-arm: Implement MRS (banked) and MSR (banked) instructions
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 17:43:37 +00:00
Peter Maydell
fec44a8c70 sd: Fix "info qtree" on boards with SD cards
The SD card object is not a SysBusDevice, so don't create it with
qdev_create() if we're not assigning it to a specific bus; use
object_new() instead.

This was causing 'info qtree' to segfault on boards with SD cards,
because qdev_create(NULL, TYPE_FOO) puts the created object on the
system bus, and then we may try to run functions like sysbus_dev_print()
on it, which fail when casting the object to SysBusDevice.

(This is the same mistake that we made with the NAND device
and fixed in commit 6749695eaaf346c1.)

Reported-by: xiaoqiang.zhao <zxq_yx_007@163.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: xiaoqiang.zhao <zxq_yx_007@163.com>
Message-id: 1458061009-7733-1-git-send-email-peter.maydell@linaro.org
2016-03-16 17:42:19 +00:00
Grégory ESTRADE
6717f587a4 bcm2835_dma: add emulation of Raspberry Pi DMA controller
At present, all DMA transfers complete inline (so a looping descriptor
queue will lock up the device). We also do not model pause/abort,
arbitrarion/priority, or debug features.

Signed-off-by: Grégory ESTRADE <gregory.estrade@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1457467526-8840-6-git-send-email-Andrew.Baumann@microsoft.com
[AB: implement 2D mode, cleanup/refactoring for upstream submission]
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 17:42:18 +00:00
Grégory ESTRADE
355a8ccc5c bcm2835_property: implement framebuffer control/configuration properties
The property channel driver now interfaces with the framebuffer device
to query and set framebuffer parameters. As a result of this, the "get
ARM RAM size" query now correctly returns the video RAM base address
(not total RAM size), and the ram-size property is no longer relevant
here.

Signed-off-by: Grégory ESTRADE <gregory.estrade@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1457467526-8840-5-git-send-email-Andrew.Baumann@microsoft.com
[AB: cleanup/refactoring for upstream submission]
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 17:42:18 +00:00
Grégory ESTRADE
5e9c2a8dac bcm2835_fb: add framebuffer device for Raspberry Pi
The framebuffer occupies the upper portion of memory (64MiB by
default), but it can only be controlled/configured via a system
mailbox or property channel (to be added by a subsequent patch).

Signed-off-by: Grégory ESTRADE <gregory.estrade@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1457467526-8840-4-git-send-email-Andrew.Baumann@microsoft.com
[AB: added Windows (BGR) support and cleanup/refactoring for upstream submission]
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 17:42:18 +00:00
Andrew Baumann
97398d900c bcm2835_aux: add emulation of BCM2835 AUX (aka UART1) block
At present only the core UART functions (data path for tx/rx) are
implemented, which is enough for UEFI to boot. The following
features/registers are unimplemented:
  * Line/modem control
  * Scratch register
  * Extra control
  * Baudrate
  * SPI interfaces

Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1457467526-8840-3-git-send-email-Andrew.Baumann@microsoft.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 17:42:18 +00:00
Andrew Baumann
a2a8dfa8d8 bcm2835_peripherals: enable sdhci pending-insert quirk for raspberry pi
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1457467526-8840-2-git-send-email-Andrew.Baumann@microsoft.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 17:42:18 +00:00
Andrew Jeffery
327d8e4ed2 hw/arm: Add palmetto-bmc machine
The new machine is a thin layer over the AST2400 ARM926-based SoC[1].
Between the minimal machine and the current SoC implementation there is
enough functionality to boot an aspeed_defconfig Linux kernel to
userspace. Nothing yet is specific to the Palmetto's BMC (other than
using an AST2400 SoC), but creating specific machine types is preferable
to a generic machine that doesn't match any particular hardware.

[1] http://www.aspeedtech.com/products.php?fPath=20&rId=376

Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1458096317-25223-5-git-send-email-andrew@aj.id.au
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 17:42:18 +00:00
Andrew Jeffery
43e3346e43 hw/arm: Add ASPEED AST2400 SoC model
While the ASPEED AST2400 SoC[1] has a broad range of capabilities this
implementation is minimal, comprising an ARM926 processor, ASPEED VIC
and timer devices, and a 8250 UART.

[1] http://www.aspeedtech.com/products.php?fPath=20&rId=376

Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1458096317-25223-4-git-send-email-andrew@aj.id.au
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 17:42:18 +00:00
Andrew Jeffery
0c69996e22 hw/intc: Add (new) ASPEED VIC device model
Implement a basic ASPEED VIC device model for the AST2400 SoC[1], with
enough functionality to boot an aspeed_defconfig Linux kernel. The model
implements the 'new' (revised) register set: While the hardware exposes
both the new and legacy register sets, accesses to the model's legacy
register set will not be serviced (however the access will be logged).

[1] http://www.aspeedtech.com/products.php?fPath=20&rId=376

Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1458096317-25223-3-git-send-email-andrew@aj.id.au
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 17:42:18 +00:00
Andrew Jeffery
c04bd47db6 hw/timer: Add ASPEED timer device model
Implement basic ASPEED timer functionality for the AST2400 SoC[1]: Up to
8 timers can independently be configured, enabled, reset and disabled.
Some hardware features are not implemented, namely clock value matching
and pulse generation, but the implementation is enough to boot the Linux
kernel configured with aspeed_defconfig.

[1] http://www.aspeedtech.com/products.php?fPath=20&rId=376

Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1458096317-25223-2-git-send-email-andrew@aj.id.au
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 17:42:18 +00:00
Jean-Christophe Dubois
eccfa35e9f i.MX: Add missing descriptions in devices.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Message-id: f1f565eb9dffdeb582feb1b15ba9e8b0afcf5468.1456868959.git.jcd@tribudubois.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 17:42:18 +00:00
Jean-Christophe Dubois
a66d815cd5 i.MX: Add i.MX6 CCM and ANALOG device.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Message-id: 9fa80b4d8c5d0f50c94e77d74f952a7a665e168f.1456868959.git.jcd@tribudubois.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 17:42:18 +00:00
Jean-Christophe Dubois
d552f675fb i.MX: Add the CLK_IPG_HIGH clock
EPIT, GPT and other i.MX timers are using "abstract" clocks among which
a CLK_IPG_HIGH clock.

On i.MX25 and i.MX31 CLK_IPG and CLK_IPG_HIGH are mapped to the same clock
but on other SOC like i.MX6 they are mapped to distinct clocks.

This patch add the CLK_IPG_HIGH to prepare for SOC where these 2 clocks are
different.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Message-id: 224bf650194760284cb40630e985867e1373276a.1456868959.git.jcd@tribudubois.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 17:42:18 +00:00
Jean-Christophe Dubois
f4b2add6cc i.MX: Remove CCM useless clock computation handling.
Most clocks supported by the CCM are useless to the qemu framework.

Only clocks related to timers (EPIT, GPT, PWM, WATCHDOG, ...) are usefull
to QEMU code.

Therefore this patch removes clock computation handling for all clocks but:
* CLK_NONE,
* CLK_IPG,
* CLK_32k

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Message-id: 9e7222efb349801032e60c0f6b0fbad0e5dcf648.1456868959.git.jcd@tribudubois.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 17:42:18 +00:00
Jean-Christophe Dubois
c91a5883c3 i.MX: Rename CCM NOCLK to CLK_NONE for naming consistency.
This way all CCM clock defines/enums are named CLK_XXX

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Message-id: 8537df765c1713625c7a8b9aca4c7ca60b42e0c0.1456868959.git.jcd@tribudubois.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 17:42:18 +00:00
Jean-Christophe Dubois
4833e15f74 i.MX: Allow GPT timer to rollover.
GPT timer need to rollover when it reaches 0xffffffff.

It also need to reset to 0 when in "restart mode" and crossing the
compare 1 register.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Message-id: 6e2b36117a249a78bf822dd59a390368f407136e.1456868959.git.jcd@tribudubois.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 17:42:18 +00:00
Wei Huang
9c94d8e6c9 arm: virt: Move machine class init code to the abstract machine type
This patch moves the common class initialization code from
"virt-2.6" to the new abstract class. An empty property is added to
"virt-2.6" machine. In the meanwhile, related funtions are renamed
to "virt_2_6_*" for consistency.

Signed-off-by: Wei Huang <wei@redhat.com>
Message-id: 1457717778-17727-3-git-send-email-wei@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 17:42:18 +00:00
Wei Huang
ed796373b4 arm: virt: Add an abstract ARM virt machine type
In preparation for future ARM virt machine types, this patch creates
an abstract type for all ARM machines. The current machine type in
QEMU (i.e. "virt") is renamed to "virt-2.6", whose naming scheme is
similar to other architectures. For the purpose of backward compatibility,
"virt" is converted to an alias, pointing to "virt-2.6". With this patch,
"qemu -M ?" lists the following virtual machine types along with others:

virt                 QEMU 2.6 ARM Virtual Machine (alias of virt-2.6)
virt-2.6             QEMU 2.6 ARM Virtual Machine

Signed-off-by: Wei Huang <wei@redhat.com>
Message-id: 1457717778-17727-2-git-send-email-wei@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 17:42:18 +00:00
Sergey Sorokin
1b4093ea66 target-arm: Fix translation level on early translation faults
Qemu reports translation fault on 1st level instead of 0th level in case of
AArch64 address translation if the translation table walk is disabled or
the address is in the gap between the two regions.

Signed-off-by: Sergey Sorokin <afarallax@yandex.ru>
Message-id: 1457527503-25958-1-git-send-email-afarallax@yandex.ru
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 17:42:18 +00:00
Jeff Cody
773460256b MAINTAINERS: Fix typo, block/stream.h -> block/stream.c
There is no block/stream.h, the intended filename is block/stream.c
instead.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: b9feeac95301c1b0b1c28a485da5e3781370c31a.1457578261.git.jcody@redhat.com
2016-03-16 13:25:29 -04:00
Jeff Cody
03c698f0a2 block/sheepdog: fix argument passed to qemu_strtoul()
The function qemu_strtoul() reads 'unsigned long' sized data,
which is larger than uint32_t on 64-bit machines.

Even though the snap_id field in the header is 32-bits, we must
accommodate the full size in qemu_strtoul().

This patch also adds more meaningful error handling to the
qemu_strtoul() call, and subsequent results.

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Message-id: e56fc50abedd9a112e0683342c8eafda063cd2f9.1456935548.git.jcody@redhat.com
2016-03-16 13:25:29 -04:00
Peter Maydell
8bfd0550be target-arm: Implement MRS (banked) and MSR (banked) instructions
Starting with the ARMv7 Virtualization Extensions, the A32 and T32
instruction sets provide instructions "MSR (banked)" and "MRS
(banked)" which can be used to access registers for a mode other
than the current one:
 * R<m>_<mode>
 * ELR_hyp
 * SPSR_<mode>

Implement the missing instructions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1456762734-23939-1-git-send-email-peter.maydell@linaro.org
2016-03-16 17:05:58 +00:00
Jens Wiklander
f09f9bd9fa loader: Fix incorrect parameter name in load_image_mr() macro
Fix a typo in the load_image_mr() macro: 'mr' was written when
the parameter name is '_mr'. (This had no visible effects since
the single use of the macro used 'mr' as the argument.)

Fixes 76151cacfe "loader: Add
load_image_mr() to load ROM image to a MemoryRegion"

Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 17:05:58 +00:00
Peter Maydell
0ebc03bc06 util/base64.c: Clean includes
Remove unnecessary include of config-host.h.
(This was missed by the clean-includes script because of the
incorrect use of <> for a QEMU header.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1456237112-32662-5-git-send-email-peter.maydell@linaro.org
2016-03-16 12:48:11 +00:00
Peter Maydell
8bc92a762a update-linux-headers.sh: Fake types.h doesn't need to include anything
We have a fake linux/types.h which we create in update-linux-headers.h.
Now that every QEMU source file includes osdep.h, this fake header
doesn't need to include anything at all.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1456237112-32662-4-git-send-email-peter.maydell@linaro.org
2016-03-16 12:48:11 +00:00
Peter Maydell
8816c600d3 include/config.h: Remove
include/config.h just includes config-target.h (and used to also
include config-host.h).
It is now obsolete and unused, because osdep.h does this job, so
remove it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1456237112-32662-3-git-send-email-peter.maydell@linaro.org
2016-03-16 12:48:11 +00:00
Peter Maydell
4674da1c49 slirp/slirp.h: Remove now-empty #ifdefs
After automatic cleanup to remove unnecessary #includes of headers that
osdep.h provides, slirp.h has a few now unnecessary #ifdef/#endif pairs;
remove them.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1456237112-32662-2-git-send-email-peter.maydell@linaro.org
2016-03-16 12:48:11 +00:00
Peter Maydell
6aeda86890 Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2016-03-16' into staging
Error reporting patches for 2016-03-16

# gpg: Signature made Wed 16 Mar 2016 09:57:00 GMT using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"

* remotes/armbru/tags/pull-error-2016-03-16:
  error: ensure errno detail is printed with error_abort

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 11:09:36 +00:00
Peter Maydell
cad0b273e5 Merge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2016-03-16' into staging
Monitor patches for 2016-03-16

# gpg: Signature made Wed 16 Mar 2016 09:47:23 GMT using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"

* remotes/armbru/tags/pull-monitor-2016-03-16:
  qdev-monitor: add missing aliases for virtio device classes
  qdev-monitor: sort alias table by typename
  qdev-monitor: improve error message when alias device is unavailable

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 10:38:15 +00:00
Peter Maydell
f235538e38 Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.6-20160316' into staging
ppc patch queue for 2016-03-16

Accumulated patches for target-ppc, pseries machine type and related
devices.  As we are now in soft freeze, these are mostly fixes.
   * Fix KVM migration for several SPRs that qemu didn't handle
   * Clean up handling of SDR1, which allows a fix to the gdbstub
   * Fix a race in spapr_rng
   * Fix a bug with multifunction hotplug

The exception is the 7 patches to allow EEH on spapr-pci-host-bridge
devices (rather than the special and poorly designed
spapr-vfio-pci-host-bridge device).  I believe these are low risk of
breaking non-EEH cases, and EEH cases were little used in practice
previously (since libvirt did not support the special device amongst
other things).  It did have a draft posted before the soft freeze,
removes a very ugly VFIO interface, and removes device we'd like to
deprecate sooner rather than later.  So, I'm hoping we can squeeze
these in during the soft freeze.

This includes two patches to the VFIO code, which Alex Williamson has
indicated he's ok with coming through my tree.

# gpg: Signature made Wed 16 Mar 2016 05:04:52 GMT using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.6-20160316:
  vfio: Eliminate vfio_container_ioctl()
  spapr_pci: Remove finish_realize hook
  spapr_pci: (Mostly) remove spapr-pci-vfio-host-bridge
  spapr_pci: Allow EEH on spapr-pci-host-bridge
  spapr_pci: Eliminate class callbacks
  spapr_pci: Switch to vfio_eeh_as_op() interface
  vfio: Start improving VFIO/EEH interface
  spapr_rng: fix race with main loop
  target-ppc: Eliminate kvmppc_kern_htab global
  target-ppc: Add helpers for updating a CPU's SDR1 and external HPT
  target-ppc: Split out SREGS get/put functions
  spapr_pci: fix multifunction hotplug
  target-ppc: Add PVR for POWER8NVL processor
  ppc: Add a few more P8 PMU SPRs
  ppc: Fix migration of the TAR SPR
  ppc: Define the PSPB register on POWER8

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 10:09:26 +00:00
Daniel P. Berrange
20e2dec149 error: ensure errno detail is printed with error_abort
When &error_abort is passed in, the error reporting code
will print the current error message and then abort() the
process. Unfortunately at the time it aborts, we've not
yet appended the errno detail. This makes debugging certain
problems significantly harder as the log is incomplete.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1457544504-8548-22-git-send-email-berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-16 10:55:51 +01:00
Peter Maydell
af1d3ebbef Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
acpi: minor fix

Since previous pull acpi test triggers warnings,
fix it up.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Tue 15 Mar 2016 21:26:38 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  acpi-test: update UID for GSI links

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-16 09:27:58 +00:00
Sascha Silbe
588c36cac7 qdev-monitor: add missing aliases for virtio device classes
virtio-{blk,balloon,net,serial} are aliases for their actual,
architecture-dependent implementations (*-ccw on s390x, *-pci on other
architectures supporting virtio). This makes it a lot easier to craft
qemu invocations that work on all supported architectures. Complete
the set to cover all existing non-abstract virtio device classes.

For virtio-balloon, only the CCW implementation was missing.

Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Message-Id: <1455831854-49013-4-git-send-email-silbe@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-16 10:13:10 +01:00
Sascha Silbe
36e9916811 qdev-monitor: sort alias table by typename
Sort the alias table by typename so it's easier to see which aliases
exist.

Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Message-Id: <1455831854-49013-3-git-send-email-silbe@linux.vnet.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-16 10:13:10 +01:00
Sascha Silbe
f6b5319d41 qdev-monitor: improve error message when alias device is unavailable
When trying to instantiate an alias that points to a device class that
doesn't exist, the error message looks like qemu misunderstood the
request:

$ s390x-softmmu/qemu-system-s390x -device virtio-gpu
qemu-system-s390x: -device virtio-gpu: 'virtio-gpu-ccw' is not a valid
device model name

Special-case the error message to make it explicit that alias
expansion is going on:

$ s390x-softmmu/qemu-system-s390x -device virtio-gpu
qemu-system-s390x: -device virtio-gpu: 'virtio-gpu' (alias
'virtio-gpu-ccw') is not a valid device model name

Suggested-By: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Message-Id: <1455831854-49013-2-git-send-email-silbe@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-16 10:13:10 +01:00
David Gibson
3356128cd1 vfio: Eliminate vfio_container_ioctl()
vfio_container_ioctl() was a bad interface that bypassed abstraction
boundaries, had semantics that sat uneasily with its name, and was unsafe
in many realistic circumstances.  Now that spapr-pci-vfio-host-bridge has
been folded into spapr-pci-host-bridge, there are no more users, so remove
it.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
2016-03-16 09:55:11 +11:00
David Gibson
a36304fdca spapr_pci: Remove finish_realize hook
Now that spapr-pci-vfio-host-bridge is reduced to just a stub, there is
only one implementation of the finish_realize hook in sPAPRPHBClass.  So,
we can fold that implementation into its (single) caller, and remove the
hook.  That's the last thing left in sPAPRPHBClass, so that can go away as
well.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-03-16 09:55:11 +11:00
David Gibson
72700d7e73 spapr_pci: (Mostly) remove spapr-pci-vfio-host-bridge
Now that the regular spapr-pci-host-bridge can handle EEH, there are only
two things that spapr-pci-vfio-host-bridge does differently:
    1. automatically sizes its DMA window to match the host IOMMU
    2. checks if the attached VFIO container is backed by the
       VFIO_SPAPR_TCE_IOMMU type on the host

(1) is not particularly useful, since the default window used by the
regular host bridge will work with the host IOMMU configuration on all
current systems anyway.

Plus, automatically changing guest visible configuration (such as the DMA
window) based on host settings is generally a bad idea.  It's not
definitively broken, since spapr-pci-vfio-host-bridge is only supposed to
support VFIO devices which can't be migrated anyway, but still.

(2) is not really useful, because if a guest tries to configure EEH on a
different host IOMMU, the first call will fail and that will be that.

It's possible there are scripts or tools out there which expect
spapr-pci-vfio-host-bridge, so we don't remove it entirely.  This patch
reduces it to just a stub for backwards compatibility.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-03-16 09:55:11 +11:00
David Gibson
c1fa017c7e spapr_pci: Allow EEH on spapr-pci-host-bridge
Now that the EEH code is independent of the special
spapr-vfio-pci-host-bridge device, we can allow it on all spapr PCI
host bridges instead.  We do this by changing spapr_phb_eeh_available()
to be based on the vfio_eeh_as_ok() call instead of the host bridge class.

Because the value of vfio_eeh_as_ok() can change with devices being
hotplugged or unplugged, this can potentially lead to some strange edge
cases where the guest starts using EEH, then it starts failing because
of a change in status.

However, it's not really any worse than the current situation.  Cases that
would have worked previously will still work (i.e. VFIO devices from at
most one VFIO IOMMU group per vPHB), it's just that it's no longer
necessary to use spapr-vfio-pci-host-bridge with the groupid pre-specified.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-03-16 09:55:11 +11:00
David Gibson
fbb4e98341 spapr_pci: Eliminate class callbacks
The EEH operations in the spapr-vfio-pci-host-bridge no longer rely on the
special groupid field in sPAPRPHBVFIOState.  So we can simplify, removing
the class specific callbacks with direct calls based on a simple
spapr_phb_eeh_enabled() helper.  For now we implement that in terms of
a boolean in the class, but we'll continue to clean that up later.

On its own this is a rather strange way of doing things, but it's a useful
intermediate step to further cleanups.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-03-16 09:55:10 +11:00
David Gibson
76a9e9f680 spapr_pci: Switch to vfio_eeh_as_op() interface
This switches all EEH on VFIO operations in spapr_pci_vfio.c from the
broken vfio_container_ioctl() interface to the new vfio_as_eeh_op()
interface.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-03-16 09:55:10 +11:00
David Gibson
3153119e9b vfio: Start improving VFIO/EEH interface
At present the code handling IBM's Enhanced Error Handling (EEH) interface
on VFIO devices operates by bypassing the usual VFIO logic with
vfio_container_ioctl().  That's a poorly designed interface with unclear
semantics about exactly what can be operated on.

In particular it operates on a single vfio container internally (hence the
name), but takes an address space and group id, from which it deduces the
container in a rather roundabout way.  groupids are something that code
outside vfio shouldn't even be aware of.

This patch creates new interfaces for EEH operations.  Internally we
have vfio_eeh_container_op() which takes a VFIOContainer object
directly.  For external use we have vfio_eeh_as_ok() which determines
if an AddressSpace is usable for EEH (at present this means it has a
single container with exactly one group attached), and vfio_eeh_as_op()
which will perform an operation on an AddressSpace in the unambiguous case,
and otherwise returns an error.

This interface still isn't great, but it's enough of an improvement to
allow a number of cleanups in other places.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
2016-03-16 09:55:10 +11:00
Greg Kurz
f1a6cf3ef7 spapr_rng: fix race with main loop
Since commit "60253ed1e6ec rng: add request queue support to rng-random",
the use of a spapr_rng device may hang vCPU threads.

The following path is taken without holding the lock to the main loop mutex:

h_random()
  rng_backend_request_entropy()
    rng_random_request_entropy()
      qemu_set_fd_handler()

The consequence is that entropy_available() may be called before the vCPU
thread could even queue the request: depending on the scheduling, it may
happen that entropy_available() does not call random_recv()->qemu_sem_post().
The vCPU thread will then sleep forever in h_random()->qemu_sem_wait().

This could not happen before 60253ed1e6 because entropy_available() used
to call random_recv() unconditionally.

This patch ensures the lock is held to avoid the race.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Cédric Le Goater <clg@fr.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-03-16 09:55:06 +11:00
David Gibson
c18ad9a54b target-ppc: Eliminate kvmppc_kern_htab global
fa48b43 "target-ppc: Remove hack for ppc_hash64_load_hpte*() with HV KVM"
purports to remove a hack in the handling of hash page tables (HPTs)
managed by KVM instead of qemu.  However, it actually went in the wrong
direction.

That patch requires anything looking for an external HPT (that is one not
managed by the guest itself) to check both env->external_htab (for a qemu
managed HPT) and kvmppc_kern_htab (for a KVM managed HPT).  That's a
problem because kvmppc_kern_htab is local to mmu-hash64.c, but some places
which need to check for an external HPT are outside that, such as
kvm_arch_get_registers().  The latter was subtly broken by the earlier
patch such that gdbstub can no longer access memory.

Basically a KVM managed HPT is much more like a qemu managed HPT than it is
like a guest managed HPT, so the original "hack" was actually on the right
track.

This partially reverts fa48b43, so we again mark a KVM managed external HPT
by putting a special but non-NULL value in env->external_htab.  It then
goes further, using that marker to eliminate the kvmppc_kern_htab global
entirely.  The ppc_hash64_set_external_hpt() helper function is extended
to set that marker if passed a NULL value (if you're setting an external
HPT, but don't have an actual HPT to set, the assumption is that it must
be a KVM managed HPT).

This also has some flow-on changes to the HPT access helpers, required by
the above changes.

Reported-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
2016-03-16 09:55:06 +11:00
David Gibson
e5c0d3ce40 target-ppc: Add helpers for updating a CPU's SDR1 and external HPT
When a Power cpu with 64-bit hash MMU has it's hash page table (HPT)
pointer updated by a write to the SDR1 register we need to update some
derived variables.  Likewise, when the cpu is configured for an external
HPT (one not in the guest memory space) some derived variables need to be
updated.

Currently the logic for this is (partially) duplicated in ppc_store_sdr1()
and in spapr_cpu_reset().  In future we're going to need it in some other
places, so make some common helpers for this update.

In addition the new ppc_hash64_set_external_hpt() helper also updates
SDR1 in KVM - it's not updated by the normal runtime KVM <-> qemu CPU
synchronization.  In a sense this belongs logically in the
ppc_hash64_set_sdr1() helper, but that is called from
kvm_arch_get_registers() so can't itself call cpu_synchronize_state()
without infinite recursion.  In practice this doesn't matter because
the only other caller is TCG specific.

Currently there aren't situations where updating SDR1 at runtime in KVM
matters, but there are going to be in future.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2016-03-16 09:55:06 +11:00
David Gibson
a7a00a729a target-ppc: Split out SREGS get/put functions
Currently the getting and setting of Power MMU registers (sregs) take up
large inline chunks of the kvm_arch_get_registers() and
kvm_arch_put_registers() functions.  Especially since there are two
variants (for Book-E and Book-S CPUs), only one of which will be used in
practice, this is pretty hard to read.

This patch splits these out into helper functions for clarity.  No
functional change is expected.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
2016-03-16 09:55:05 +11:00
Michael Roth
788d2599de spapr_pci: fix multifunction hotplug
Since 3f1e147, QEMU has adopted a convention of supporting function
hotplug by deferring hotplug events until func 0 is hotplugged.
This is likely how management tools like libvirt would expose
such support going forward.

Since sPAPR guests rely on per-func events rather than
slot-based, our protocol has been to hotplug func 0 *first* to
avoid cases where devices appear within guests without func 0
present to avoid undefined behavior.

To remain compatible with new convention, defer hotplug in a
similar manner, but then generate events in 0-first order as we
did in the past. Once func 0 present, fail any attempts to plug
additional functions (as we do with PCIe).

For unplug, defer unplug operations in a similar manner, but
generate unplug events such that function 0 is removed last in guest.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-03-16 09:55:05 +11:00
Alexey Kardashevskiy
a88dced8eb target-ppc: Add PVR for POWER8NVL processor
This adds a new POWER8+NVLink CPU PVR which core is identical to POWER8
but has a different PVR. The only available machine now has PVR
pvr 004c 0100 so this defines "POWER8NVL" alias as v1.0.

The corresponding kernel commit is
https://github.com/torvalds/linux/commit/ddee09c099c3
"powerpc: Add PVR for POWER8NVL processor"

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-03-16 09:55:05 +11:00
Benjamin Herrenschmidt
14646457ae ppc: Add a few more P8 PMU SPRs
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-03-16 09:55:05 +11:00
Thomas Huth
1e440cbc99 ppc: Fix migration of the TAR SPR
The TAR special purpose register currently does not get migrated
under KVM because it does not get synchronized with the kernel.
Use spr_register_kvm() instead of spr_register() to fix this issue.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-03-16 09:55:05 +11:00
Thomas Huth
d6f1445faf ppc: Define the PSPB register on POWER8
POWER8 / PowerISA 2.07 has a new special purpose register called PSPB
("Problem State Priority Boost Register"). The contents of this register
are currently lost during migration. To be able to migrate this register,
too, we've got to define this SPR along with the other SPRs of POWER8.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-03-16 09:55:05 +11:00
Michael S. Tsirkin
3ba6a710e6 acpi-test: update UID for GSI links
Update acpi test data to match
commit 6a991e07bb
("hw/acpi: fix GSI links UID").

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-15 23:25:52 +02:00
Peter Maydell
4caecccbc1 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* Miscellaneous exec.c fixes (Markus, myself)
* Q35 support for -machine kernel_irqchip=split (Rita)
* Chardev replay support (Pavel)
* icount "warping" cleanups (Pavel)

# gpg: Signature made Tue 15 Mar 2016 17:24:08 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream:
  icount: decouple warp calls
  icount: remove obsolete warp call
  replay: character devices
  exec: fix early return from ram_block_add
  exec: Fix memory allocation when memory path isn't on hugetlbfs
  exec: Fix memory allocation when memory path names new file
  update-linux-headers: Add userfaultfd.h
  kvm: x86: q35: Add support for -machine kernel_irqchip=split for q35

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-15 17:56:14 +00:00
Pavel Dovgalyuk
e76d1798fa icount: decouple warp calls
qemu_clock_warp function is called to update virtual clock when CPU
is sleeping. This function includes replay checkpoint to make execution
deterministic in icount mode.
Record/replay module flushes async event queue at checkpoints.
Some of the events (e.g., block devices operations) include interaction
with hardware. E.g., APIC polled by block devices sets one of IRQ flags.
Flag to be set depends on currently executed thread (CPU or iothread).
Therefore in replay mode we have to process the checkpoints in the same thread
as they were recorded.
qemu_clock_warp function (and its checkpoint) may be called from different
thread. This patch decouples two different execution cases of this function:
call when CPU is sleeping from iothread and call from cpu thread to update
virtual clock.
First task is performed by qemu_start_warp_timer function. It sets warp
timer event to the moment of nearest pending virtual timer.
Second function (qemu_account_warp_timer) is called from cpu thread
before execution of the code. It advances virtual clock by adding the length
of period while CPU was sleeping.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20160310115609.4812.44986.stgit@PASHA-ISP>
[Update docs. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-15 18:23:45 +01:00
Pavel Dovgalyuk
281b2201e4 icount: remove obsolete warp call
qemu_clock_warp call in qemu_tcg_wait_io_event function is not needed
anymore, because it is called in every iteration of main_loop_wait.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20160310115603.4812.67559.stgit@PASHA-ISP>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-15 18:23:42 +01:00
Pavel Dovgalyuk
33577b47c6 replay: character devices
This patch implements record and replay of character devices.
It records chardevs communication in replay mode. Recorded information
include data read from backend and counter of bytes written
from frontend to backend to preserve frontend internal state.
If character device was configured through the command line in record mode,
then in replay mode it should be also added to command line. Backend of
the character device could be changed in replay mode.
Replaying of devices that perform ioctl and get_msgfd operations is not
supported.
gdbstub which also acts as a backend is not recorded to allow controlling
the replaying through gdb. Monitor backends are also not recorded.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20160314074436.4980.83856.stgit@PASHA-ISP>
[Add stubs. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-15 18:23:40 +01:00
Paolo Bonzini
39c350ee12 exec: fix early return from ram_block_add
After reporting an error, ram_block_add was going on with the registration
of the RAMBlock.  The visible effect is that it unlocked the ramlist
mutex twice.

Fixes: 528f46af6e
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-15 18:23:33 +01:00
Markus Armbruster
e1fb647199 exec: Fix memory allocation when memory path isn't on hugetlbfs
gethugepagesize() works reliably only when its argument is on
hugetlbfs.  When it's not, it returns the filesystem's "optimal
transfer block size", which may or may not be the actual page size
you'll get when you mmap().

If the value is too small or not a power of two, we fail
qemu_ram_mmap()'s assertions.  These were added in commit 794e8f3
(v2.5.0).  The bug's impact before that is currently unknown.  Seems
fairly unlikely at least when the normal page size is 4KiB.

Else, if the value is too large, we align more strictly than
necessary.

gethugepagesize() goes back to commit c902760 (v0.13).  That commit
clearly intended gethugepagesize() to be used on hugetlbfs only.  Not
only was it named accordingly, it also printed a warning when used on
anything else.  However, the commit neglected to spell out the
restriction in user documentation of -mem-path.

Commit bfc2a1a (v2.5.0) dropped the warning as bogus "because QEMU
functions perfectly well with the path on a regular tmpfs filesystem".
It sure does when you're sufficiently lucky.  In my testing, I was
lucky, too.

Fix by switching to qemu_fd_getpagesize().  Rename the variable
holding its result from hpagesize to page_size.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1457378754-21649-3-git-send-email-armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-15 18:23:33 +01:00
Markus Armbruster
fd97fd4408 exec: Fix memory allocation when memory path names new file
Commit 8d31d6b extended file_ram_alloc() to accept file names in
addition to directory names.  Even though it passes O_CREAT to open(),
it actually works only for existing files.  Reproducer adapted from
the commit's qemu-doc.texi update:

    $ qemu-system-x86_64 -object memory-backend-file,size=2M,mem-path=/dev/hugepages/my-shmem-file,id=mb1
    qemu-system-x86_64: -object memory-backend-file,size=2M,mem-path=/dev/hugepages/my-shmem-file,id=mb1: failed to get page size of file /dev/hugepages/my-shmem-file: No such file or directory

This is because we first get the page size for @path, then open the
actual file.  Unwise even before the flawed commit, because the
directory could change in between, invalidating the page size.
Unlikely to bite in practice.

Rearrange the code to create the file (if necessary) before getting
its page size.  Carefully avoid TOCTTOU conditions with a method
suggested by Paolo Bonzini.

While there, replace "hugepages" by "guest RAM" in error messages,
because host memory backends can be used for purposes other than huge
pages, e.g. /dev/shm/ shared memory.  Help text of -mem-path agrees.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1457378754-21649-2-git-send-email-armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-15 18:23:33 +01:00
Alexey Kardashevskiy
2ae823d4f7 update-linux-headers: Add userfaultfd.h
userfailtfd.h is used by post-copy migration so include it to
the update-linux-headers.sh as we want it updated altogether with
other kernel headers.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <1455512381-15271-1-git-send-email-aik@ozlabs.ru>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-15 18:23:33 +01:00
Rita Sinha
b094f2e015 kvm: x86: q35: Add support for -machine kernel_irqchip=split for q35
The split IRQ chip mode via KVM_CAP_SPLIT_IRQCHIP was introduced with commit
15eafc2e60 but was broken for q35. This patch makes kernel_irqchip=split
functional for q35.

Signed-off-by: Rita Sinha <rita.sinha89@gmail.com>
Message-Id: <1457378525-16455-1-git-send-email-rita.sinha89@gmail.com>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-15 18:23:33 +01:00
Peter Maydell
a6cdb77f81 Merge remote-tracking branch 'remotes/thibault/tags/samuel-thibault' into staging
slirp: Adding IPv6 support to Qemu -net user mode

# gpg: Signature made Tue 15 Mar 2016 16:06:03 GMT using RSA key ID FB6B2F1D
# gpg: Good signature from "Samuel Thibault <samuel.thibault@gnu.org>"
# gpg:                 aka "Samuel Thibault <sthibault@debian.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@inria.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@labri.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@ens-lyon.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 900C B024 B679 31D4 0F82  304B D017 8C76 7D06 9EE6
#      Subkey fingerprint: F632 74CD C630 0873 CB3D  29D9 E3E5 1CE8 FB6B 2F1D

* remotes/thibault/tags/samuel-thibault:
  slirp: Add IPv6 support to the TFTP code
  qapi-schema, qemu-options & slirp: Adding Qemu options for IPv6 addresses
  slirp: Adding IPv6 address for DNS relay
  slirp: Handle IPv6 in TCP functions
  slirp: Reindent after refactoring
  slirp: Generalizing and neutralizing various TCP functions before adding IPv6 stuff
  slirp: Factorizing tcpiphdr structure with an union
  slirp: Adding IPv6 UDP support
  slirp: Adding ICMPv6 error sending
  slirp: Fix ICMP error sending
  slirp: Adding IPv6, ICMPv6 Echo and NDP autoconfiguration

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-15 17:09:52 +00:00
Peter Maydell
a58a4cb187 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
vhost, virtio, pci, pc, acpi

nvdimm work
sparse cpu id rework
ipmi enhancements
fixes all over the place
pxb option to tweak chassis number

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Tue 15 Mar 2016 14:33:10 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream: (51 commits)
  hw/acpi: fix GSI links UID
  ipmi: add some local variables in ipmi_sdr_init
  ipmi: remove the need of an ending record in the SDR table
  ipmi: use a function to initialize the SDR table
  ipmi: add a realize function to the device class
  ipmi: add rsp_buffer_set_error() helper
  ipmi: remove IPMI_CHECK_RESERVATION() macro
  ipmi: replace IPMI_ADD_RSP_DATA() macro with inline helpers
  ipmi: remove IPMI_CHECK_CMD_LEN() macro
  MAINTAINERS: machine core
  MAINTAINERS: Add an entry for virtio header files
  pc: acpi: clarify why possible LAPIC entries must be present in MADT
  pc: acpi: drop cpu->found_cpus bitmap
  pc: acpi: create Processor and Notify objects only for valid lapics
  pc: acpi: create MADT.lapic entries only for valid lapics
  pc: acpi: SRAT: create only valid processor lapic entries
  pc: acpi: cleanup qdev_get_machine() calls
  machine: introduce MachineClass.possible_cpu_arch_ids() hook
  pc: init pcms->apic_id_limit once and use it throughout pc.c
  pc: acpi: remove NOP assignment
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-15 16:43:48 +00:00
Thomas Huth
fad7fb9ccd slirp: Add IPv6 support to the TFTP code
Add the handler code for incoming TFTP packets to udp6_input(),
and make sure that the TFTP code can send packets with both,
udp_output() and udp6_output() by introducing a wrapper function
called tftp_udp_output().

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2016-03-15 17:05:34 +01:00
Peter Maydell
f84d587111 Merge remote-tracking branch 'remotes/berrange/tags/pull-io-next-2016-03-15-1' into staging
Merge I/O fixes

# gpg: Signature made Tue 15 Mar 2016 14:42:43 GMT using RSA key ID 15104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"

* remotes/berrange/tags/pull-io-next-2016-03-15-1:
  io: stronger check for support for IPv4/6

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-15 15:51:06 +00:00
Marcel Apfelbaum
6a991e07bb hw/acpi: fix GSI links UID
According to the ACPI spec, each UID must be unique.
Use the irq number as UID for GSI links.

Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-15 16:16:57 +02:00
Daniel P. Berrange
cfd47a71df io: stronger check for support for IPv4/6
Instead of just checking for bind(), also check whether
getaddrinfo can resolve IPv6 addresses. This catches
failure when travis runs QEMU builds inside minimal
docker containers

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-15 13:55:52 +00:00
Peter Maydell
d41e0bed7b Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging
X86 fixes

# gpg: Signature made Mon 14 Mar 2016 20:26:25 GMT using RSA key ID 984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"

* remotes/ehabkost/tags/x86-pull-request:
  kvm: Remove x2apic feature from CPU model when kernel_irqchip is off
  hyperv: cpu hotplug fix with HyperV enabled

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-15 11:05:37 +00:00
Peter Maydell
9828f9b6c8 Merge remote-tracking branch 'remotes/rth/tags/pull-i386-20160314' into staging
target-i386 fixes

# gpg: Signature made Mon 14 Mar 2016 17:54:06 GMT using RSA key ID 4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"

* remotes/rth/tags/pull-i386-20160314:
  target-i386: Dump unknown opcodes with -d unimp
  target-i386: Fix inhibit irq mask handling
  target-i386: Use gen_nop_modrm for prefetch instructions
  target-i386: Fix addr16 prefix
  target-i386: Fix SMSW for 64-bit mode
  target-i386: Fix SMSW and LMSW from/to register
  target-i386: Avoid repeated calls to the bnd_jmp helper

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-15 10:08:12 +00:00
Yann Bordenave
7aac531ef2 qapi-schema, qemu-options & slirp: Adding Qemu options for IPv6 addresses
This patch adds parameters to manage some new options in the qemu -net
command.
Slirp IPv6 address, network prefix, and DNS IPv6 address can be given in
argument to the qemu command.
Defaults parameters are respectively fec0::2, fec0::, /64 and fec0::3.

Signed-off-by: Yann Bordenave <meow@meowstars.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2016-03-15 10:35:25 +01:00
Guillaume Subiron
05061d8548 slirp: Adding IPv6 address for DNS relay
This patch adds an IPv6 address to the DNS relay. in6_equal_dns() is
developed using this Slirp attribute.
sotranslate_in/out/accept() are also updated to manage the IPv6 case so the
guest can be able to join the host using one of the Slirp addresses.

For now this only points to localhost. Further development will be needed to
automatically fetch the IPv6 address from resolv.conf, and announce this via
RDNSS.

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2016-03-15 10:35:22 +01:00
Guillaume Subiron
3feea4447f slirp: Handle IPv6 in TCP functions
This patch adds IPv6 case in TCP functions refactored by the last
patches.
This also adds IPv6 pseudo-header in tcpiphdr structure.
Finally, tcp_input() is called by ip6_input().

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2016-03-15 10:35:19 +01:00
Guillaume Subiron
1252cf40a8 slirp: Reindent after refactoring
No code change.

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2016-03-15 10:35:17 +01:00
Guillaume Subiron
9dfbf250d2 slirp: Generalizing and neutralizing various TCP functions before adding IPv6 stuff
Basically, this patch adds some switch in various TCP functions to
prepare them for the IPv6 case.

To have something to "switch" in tcp_input() and tcp_respond(), a new
argument is used to give them the sa_family of the addresses they are
working on.

This patch does not include the entailed reindentation, to make proofread
easier. Reindentation is adressed in the following no-op patch.

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2016-03-15 10:35:14 +01:00
Guillaume Subiron
98c63057d2 slirp: Factorizing tcpiphdr structure with an union
This patch factorizes the tcpiphdr structure to put the IPv4 fields in
an union, for addition of version 6 in further patch.
Using some macros, retrocompatibility of the existing code is assured.

This patch also fixes the SLIRP_MSIZE and margin computation in various
functions, and makes them compatible with the new tcpiphdr structure,
whose size will be bigger than sizeof(struct tcphdr) + sizeof(struct ip)

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2016-03-15 10:35:11 +01:00
Guillaume Subiron
15d62af4b6 slirp: Adding IPv6 UDP support
This adds the sin6 case in the fhost and lhost unions and related macros.
It adds udp6_input() and udp6_output().
It adds the IPv6 case in sorecvfrom().
Finally, udp_input() is called by ip6_input().

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2016-03-15 10:35:08 +01:00
Yann Bordenave
fc6c9257c6 slirp: Adding ICMPv6 error sending
Adding icmp6_send_error to send ICMPv6 Error messages. This function is
simpler than the v4 version.
Adding some calls in various functions to send ICMP errors, when a
received packet is too big, or when its hop limit is 0.

Signed-off-by: Yann Bordenave <meow@meowstars.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2016-03-15 10:35:04 +01:00
Yann Bordenave
de40abfecf slirp: Fix ICMP error sending
Disambiguation : icmp_error is renamed into icmp_send_error, since it
doesn't manage errors, but only sends ICMP Error messages.

Signed-off-by: Yann Bordenave <meow@meowstars.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2016-03-15 10:35:02 +01:00
Guillaume Subiron
0d6ff71ae3 slirp: Adding IPv6, ICMPv6 Echo and NDP autoconfiguration
This patch adds the functions needed to handle IPv6 packets. ICMPv6 and
NDP headers are implemented.

Slirp is now able to send NDP Router or Neighbor Advertisement when it
receives Router or Neighbor Solicitation. Using a 64bit-sized IPv6
prefix, the guest is now able to perform stateless autoconfiguration
(SLAAC) and to compute its IPv6 address.

This patch adds an ndp_table, mainly inspired by arp_table, to keep an
NDP cache and manage network address resolution.
Slirp regularly sends NDP Neighbor Advertisement, as recommended by the
RFC, to make the guest refresh its route.

This also adds ip6_cksum() to compute ICMPv6 checksums using IPv6
pseudo-header.

Some #define ETH_* are moved upper in slirp.h to make them accessible to
other slirp/*.h

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2016-03-15 10:35:00 +01:00
Peter Maydell
1a8b408168 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches

# gpg: Signature made Mon 14 Mar 2016 16:36:52 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (40 commits)
  iotests: Add test for QMP event rates
  monitor: Use QEMU_CLOCK_VIRTUAL for the event queue in qtest mode
  monitor: Separate QUORUM_REPORT_BAD events according to the node name
  quorum: Fix crash in quorum_aio_cb()
  iotests: Correct 081's reference output
  block: Remove unused typedef of BlockDriverDirtyHandler
  block: Move block dirty bitmap code to separate files
  typedefs: Add BdrvDirtyBitmap
  block: Include hbitmap.h in block.h
  backup: Use Bitmap to replace "s->bitmap"
  vpc: Use BB functions in .bdrv_create()
  vmdk: Use BB functions in .bdrv_create()
  vhdx: Use BB functions in .bdrv_create()
  vdi: Use BB functions in .bdrv_create()
  sheepdog: Use BB functions in .bdrv_create()
  qed: Use BB functions in .bdrv_create()
  qcow2: Use BB functions in .bdrv_create()
  qcow: Use BB functions in .bdrv_create()
  parallels: Use BB functions in .bdrv_create()
  block: Introduce blk_set_allow_write_beyond_eof()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-15 09:13:06 +00:00
Lan Tianyu
492a4c94be kvm: Remove x2apic feature from CPU model when kernel_irqchip is off
x2apic feature is in the kvm_default_props and automatically added to all
CPU models when KVM is enabled. But userspace devices don't support x2apic
which can't be enabled without the in-kernel irqchip. It will trigger
warning of "host doesn't support requested feature: CPUID.01H:ECX.x2apic
[bit 21]" when kernel_irqchip is off. This patch is to fix it via removing
x2apic feature when kernel_irqchip is off.

Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-03-14 17:26:06 -03:00
Denis V. Lunev
4467c6c118 hyperv: cpu hotplug fix with HyperV enabled
With Hyper-V enabled CPU hotplug stops working. The CPU appears
in device manager on Windows but does not appear in peformance
monitor and control panel.

The root of the problem is the following. Windows checks
HV_X64_CPU_DYNAMIC_PARTITIONING_AVAILABLE bit in CPUID. The
presence of this bit is enough to cure the situation.

The bit should be set when CPU hotplug is allowed for HyperV VM.
The check that hot_add_cpu callback is defined is enough from the
protocol point of view. Though this callback is defined almost
always thus there is no need to export that knowledge in the
other way.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Richard Henderson <rth@twiddle.net>
CC: Eduardo Habkost <ehabkost@redhat.com>
CC: "Andreas Färber" <afaerber@suse.de>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-03-14 17:26:06 -03:00
Richard Henderson
b9f9c5b41a target-i386: Dump unknown opcodes with -d unimp
We discriminate here between opcodes that are illegal in the current
cpu mode or with illegal arguments (such as modrm.mod == 3) and
encodings that are unknown (such as an unimplemented isa extension).

Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-03-14 10:53:07 -07:00
Richard Henderson
f083d92c03 target-i386: Fix inhibit irq mask handling
The patch in 7f0b714 was too simplistic, in that we wound up setting
the flag and then resetting it immediately in gen_eob.

Fixes the reported boot problem with Windows XP.

Reported-by: Hervé Poussineau <hpoussin@reactos.org>
Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-03-14 10:53:02 -07:00
Richard Henderson
26317698ef target-i386: Use gen_nop_modrm for prefetch instructions
Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-03-14 10:52:56 -07:00
Paolo Bonzini
e2e02a8207 target-i386: Fix addr16 prefix
While ADDSEG will only be false in 16-bit mode for LEA, it can be
false even in other cases when 16-bit addresses are obtained via
the 67h prefix in 32-bit mode.  In this case, gen_lea_v_seg forgets
to add a nonzero FS or GS base if CS/DS/ES/SS are all zero.  This
case is pretty rare but happens when booting Windows 95/98, and
this patch fixes it.

The bug is visible since commit d6a291498, but it was introduced
together with gen_lea_v_seg and it probably could be reproduced
with a "addr16 gs movsb" instruction as early as in commit
ca2f29f555.

Reported-by: Hervé Poussineau <hpoussin@reactos.org>
Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1456931078-21635-1-git-send-email-pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-03-14 10:52:48 -07:00
Richard Henderson
a657f79e32 target-i386: Fix SMSW for 64-bit mode
In non-64-bit modes, the instruction always stores 16 bits.
But in 64-bit mode, when the destination is a register, the
instruction can write 32 or 64 bits.

Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-03-14 10:52:42 -07:00
Paolo Bonzini
880f848650 target-i386: Fix SMSW and LMSW from/to register
SMSW and LMSW accept register operands, but commit 1906b2a ("target-i386:
Rearrange processing of 0F 01", 2016-02-13) did not account for that.

Fixes: 1906b2af7c
Reported-by: Hervé Poussineau <hpoussin@reactos.org>
Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1456845134-18812-1-git-send-email-pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-03-14 10:52:29 -07:00
Paolo Bonzini
8b33e82b86 target-i386: Avoid repeated calls to the bnd_jmp helper
Two flags were tested the wrong way.

Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1456845145-18891-1-git-send-email-pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
[rth: Fixed enable test as well.]
2016-03-14 10:45:41 -07:00
Kevin Wolf
0d611402a1 Merge remote-tracking branch 'mreitz/tags/pull-block-for-kevin-2016-03-14-v2' into queue-block
Block patches for pi day, v2.

# gpg: Signature made Mon Mar 14 17:35:29 2016 CET using RSA key ID E838ACAD
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"

* mreitz/tags/pull-block-for-kevin-2016-03-14-v2:
  iotests: Add test for QMP event rates
  monitor: Use QEMU_CLOCK_VIRTUAL for the event queue in qtest mode
  monitor: Separate QUORUM_REPORT_BAD events according to the node name
  quorum: Fix crash in quorum_aio_cb()
  iotests: Correct 081's reference output
  block: Remove unused typedef of BlockDriverDirtyHandler
  block: Move block dirty bitmap code to separate files
  typedefs: Add BdrvDirtyBitmap
  block: Include hbitmap.h in block.h
  backup: Use Bitmap to replace "s->bitmap"

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 17:36:31 +01:00
Alberto Garcia
7223c48cff iotests: Add test for QMP event rates
This test verifies that the rate-limited QMP events are emitted at a
maximum rate of 1 per second as defined in monitor_qapi_event_conf in
monitor.c

It also checks that QUORUM_REPORT_BAD events generated from different
nodes are kept in separate queues so they don't mask each other.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 0dbd3ee88a59a6363042ad81cfb345037bfbf612.1457610443.git.berto@igalia.com
[mreitz@redhat.com: Renamed test from 146 to 148]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-03-14 17:35:06 +01:00
Alberto Garcia
dc59997871 monitor: Use QEMU_CLOCK_VIRTUAL for the event queue in qtest mode
This allows us to perform tests on the monitor queues to verify that
the rate limits are enforced.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: dde511809e954a5c32d5b648bb184c03c89ed5d5.1457610443.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-03-14 17:35:06 +01:00
Alberto Garcia
6d425eb94d monitor: Separate QUORUM_REPORT_BAD events according to the node name
The QUORUM_REPORT_BAD event is emitted whenever there's an I/O error
in a child of a Quorum device. This event is emitted at a maximum rate
of 1 per second. This means that an error in one of the children will
mask errors in the other children if they happen within the same 1
second interval.

This patch modifies qapi_event_throttle_equal() so QUORUM_REPORT_BAD
events are kept separately if they come from different children.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: b989c0cb3755bc4b6696e796fa8ed2ef6c56606a.1457610443.git.berto@igalia.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-03-14 17:35:06 +01:00
Alberto Garcia
b9c600d207 quorum: Fix crash in quorum_aio_cb()
quorum_aio_cb() emits the QUORUM_REPORT_BAD event if there's
an I/O error in a Quorum child. However sacb->aiocb must be
correctly initialized for this to happen. read_quorum_children() and
read_fifo_child() are not doing this, which results in a QEMU crash.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 8138570d071ba7e25db3736979234a1fd71dbd05.1457610443.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-03-14 17:35:06 +01:00
Max Reitz
e3f66e0368 iotests: Correct 081's reference output
The newly added type parameter for the QUORUM_REPORT_BAD event changed
the output of iotest 081, so the reference should be amended
accordingly.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 1457705687-27122-1-git-send-email-mreitz@redhat.com
Reviewed-by: Alberto Garcia <berto@igalia.com>
2016-03-14 17:35:06 +01:00
Fam Zheng
fcce736719 block: Remove unused typedef of BlockDriverDirtyHandler
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1457412306-18940-6-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-03-14 17:35:05 +01:00
Fam Zheng
ebab225910 block: Move block dirty bitmap code to separate files
The only code change is making bdrv_dirty_bitmap_truncate public. It is
used in block.c.

Also two long lines (bdrv_get_dirty) are wrapped.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1457412306-18940-5-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-03-14 17:35:05 +01:00
Fam Zheng
9a3f5cf1bf typedefs: Add BdrvDirtyBitmap
Following patches to refactor and move block dirty bitmap code could use
this.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1457412306-18940-4-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-03-14 17:35:05 +01:00
Fam Zheng
78f9dc859d block: Include hbitmap.h in block.h
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1457412306-18940-3-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-03-14 17:35:05 +01:00
Fam Zheng
b2f56462d5 backup: Use Bitmap to replace "s->bitmap"
"s->bitmap" tracks done sectors, we only check bit states without using any
iterator which HBitmap is good for. Switch to "Bitmap" which is simpler and
more memory efficient.

Meanwhile, rename it to done_bitmap, to reflect the intention.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1457412306-18940-2-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-03-14 17:35:05 +01:00
Peter Maydell
618a5a8bc5 Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Mon 14 Mar 2016 11:27:01 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/tracing-pull-request:
  trace: separate MMIO tracepoints from TB-access tracepoints
  trace: include CPU index in trace_memory_region_*()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-14 16:22:17 +00:00
Kevin Wolf
b8f45cdf78 vpc: Use BB functions in .bdrv_create()
All users of the block layers are supposed to go through a BlockBackend.
The .bdrv_create() implementation is one such user, so this patch
converts it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:44 +01:00
Kevin Wolf
c4bea1690e vmdk: Use BB functions in .bdrv_create()
All users of the block layers are supposed to go through a BlockBackend.
The .bdrv_create() implementation is one such user, so this patch
converts it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:43 +01:00
Kevin Wolf
10bf03af12 vhdx: Use BB functions in .bdrv_create()
All users of the block layers are supposed to go through a BlockBackend.
The .bdrv_create() implementation is one such user, so this patch
converts it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:43 +01:00
Kevin Wolf
a08f0c3b5f vdi: Use BB functions in .bdrv_create()
All users of the block layers are supposed to go through a BlockBackend.
The .bdrv_create() implementation is one such user, so this patch
converts it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:43 +01:00
Kevin Wolf
fba98d455a sheepdog: Use BB functions in .bdrv_create()
All users of the block layers are supposed to go through a BlockBackend.
The .bdrv_create() implementation is one such user, so this patch
converts it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:43 +01:00
Kevin Wolf
8a56fdadaf qed: Use BB functions in .bdrv_create()
All users of the block layers are supposed to go through a BlockBackend.
The .bdrv_create() implementation is one such user, so this patch
converts it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:43 +01:00
Kevin Wolf
23588797b6 qcow2: Use BB functions in .bdrv_create()
All users of the block layers are supposed to go through a BlockBackend.
The .bdrv_create() implementation is one such user, so this patch
converts it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:43 +01:00
Kevin Wolf
6af4016020 qcow: Use BB functions in .bdrv_create()
All users of the block layers are supposed to go through a BlockBackend.
The .bdrv_create() implementation is one such user, so this patch
converts it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:43 +01:00
Kevin Wolf
8942764f54 parallels: Use BB functions in .bdrv_create()
All users of the block layers are supposed to go through a BlockBackend.
The .bdrv_create() implementation is one such user, so this patch
converts it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:43 +01:00
Kevin Wolf
c10c9d9615 block: Introduce blk_set_allow_write_beyond_eof()
We check that the guest can't write beyond the end of its disk, but for
other internal users it can make sense to allow growing a file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:43 +01:00
Kevin Wolf
6340472c54 block: Use writeback in .bdrv_create() implementations
There's no reason to use a writethrough cache mode while creating an
image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:43 +01:00
Kevin Wolf
2073d410ce hmp: Extend drive_del to delete nodes without BB
Now that we can use drive_add to create new nodes without a BB, we also
want to be able to delete such nodes again.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-14 16:46:43 +01:00
Kevin Wolf
abb21ac3e6 hmp: 'drive_add -n' for creating a node without BB
This patch adds an option to the drive_add HMP command to create only a
BlockDriverState without a BlockBackend on top.

The motivation for this is that libvirt needs to specify options to a
migration target (specifically, detect-zeroes). drive-mirror doesn't
allow specifying options, and the proper way to do this is to create the
target BDS separately with blockdev-add (where you can specify options)
and then use blockdev-mirror to that BDS.

However, libvirt can't use blockdev-add as long as it is still
experimental, and we're expecting that it will still take some time, so
we need to resort to drive_add.

The problem with drive_add is that so far it always created a BB, and
BDSes with a BB can't be used as a mirroring target as long as we don't
support multiple BBs per BDS - and while we're working towards that
goal, it's another thing that will still take some time.

So to achieve the goal, the simplest solution to provide the
functionality now without adding one-off options to the mirror QMP
commands is to extend drive_add to create nodes without BBs.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:43 +01:00
Fam Zheng
71968dbfd8 vmdk: Switch to heap arrays for vmdk_parent_open
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:43 +01:00
Fam Zheng
5997c210b9 vmdk: Switch to heap arrays for vmdk_read_cid
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:43 +01:00
Fam Zheng
965415eb20 vmdk: Switch to heap arrays for vmdk_write_cid
It is only called once for each opened image, so we can do it the easy
way.

Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:43 +01:00
Kevin Wolf
a81d616437 block: Fix cache mode defaults in bds_tree_init()
Without setting explicit defaults in the options, blockdev-add without
an ID ended up defaulting to writethrough. It should be writeback as
documented.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-03-14 16:46:43 +01:00
Kevin Wolf
73176bee99 block: Fix snapshot=on cache modes
Since commit 91a097e, we end up with a somewhat weird cache mode
configuration with snapshot=on: The commit broke the cache mode
inheritance for the snapshot overlay so that it is opened as
writethrough instead of unsafe now. The following bdrv_append() call to
put it on top of the tree swaps the WCE flag with the snapshot's backing
file (i.e. the originally given file), so what we eventually get is
cache=writeback on the temporary overlay and
cache=writethrough,cache.no-flush=on on the real image file.

This patch changes things so that the temporary overlay gets
cache=unsafe again like it used to, and the real images get whatever the
user specified. This means that cache.direct is now respected even with
snapshot=on, and in the case of committing changes, the final flush is
no longer ignored except explicitly requested by the user.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-14 16:46:43 +01:00
Kevin Wolf
f86b8b584b blockdev: Snapshotting must not open second instance of old top
Calling bdrv_img_create() with a size of -1 means that it determines the
size automatically by opening the backing file. However, in the case of
live snapshots, the backing file is already opened and we must avoid
opening the same image twice at the same time. Apart from that, just
getting the size from the already existing BDS is a lot less overhead
than opening a new instance.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2016-03-14 16:46:43 +01:00
Changlong Xie
924e8a2bbc quorum: modify vote rules for flush operation
Keep flush interface the same logic as quorum read/write, Otherwise in
following scenario, we'll encounter unexpected errors.

Quorum has two children(A, B). A do flush sucessfully, but B flush failed.
This cause the filesystem of guest become read-only with following errors:

end_request: I/O error, dev vda, sector 11159960
Aborting journal on device vda3-8
EXT4-fs error (device vda3): ext4_journal_start_sb:327: Detected abort journal
EXT4-fs (vda3): Remounting filesystem read-only

Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:43 +01:00
Changlong Xie
0ae053b7e1 qmp event: Refactor QUORUM_REPORT_BAD
Introduce QuorumOpType, and make QUORUM_REPORT_BAD compatible
with it.

Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:43 +01:00
Changlong Xie
58346b82ed docs: fix invalid node name in qmp event
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:43 +01:00
Jeff Cody
1001dd9f84 block/vpc: add tests for image creation force_size parameter
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:42 +01:00
Jeff Cody
fb9245c261 block/vpc: give option to force the current_size field in .bdrv_create
When QEMU creates a VHD image, it goes by the original spec,
calculating the current_size based on the nearest CHS geometry (with an
exception for disks > 127GB).

Apparently, Azure will only allow images that are sized to the nearest
MB, and the current_size as calculated from CHS cannot guarantee that.

Allow QEMU to create images similar to how Hyper-V creates images, by
setting current_size to the specified virtual disk size.  This
introduces an option, force_size, to be passed to the vpc format during
image creation, e.g.:

    qemu-img convert -f raw -o force_size -O vpc test.img test.vhd

When using the "force_size" option, the creator app field used by
QEMU will be "qem2" instead of "qemu", to indicate the difference.
In light of this, we also add parsing of the "qem2" field during
vpc_open.

Bug reference: https://bugs.launchpad.net/qemu/+bug/1490611

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:42 +01:00
Jeff Cody
798609bbe2 block/vpc: tests for auto-detecting VPC and Hyper-V VHD images
This tests auto-detection, and overrides, of VHD image sizes created
by Virtual PC, Hyper-V, and Disk2vhd.

This adds three sample images:

hyperv2012r2-dynamic.vhd.bz2 - dynamic VHD image created with Hyper-V
virtualpc-dynamic.vhd.bz2    - dynamic VHD image created with Virtual PC
d2v-zerofilled.vhd.bz2       - dynamic VHD image created with Disk2vhd

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:42 +01:00
Jeff Cody
c540d53ac8 block/vpc: choose size calculation method based on creator_app field
The VHD file format is used by both Virtual PC, and Hyper-V.  However,
how the virtual disk size is calculated varies between the two.

Virtual PC uses the CHS drive parameters to determine the drive size.
Hyper-V, on the other hand, uses the current_size field in the footer
when determining image size.

This is problematic for a few reasons:

* VHD images from Hyper-V, using CHS calculations, will likely be
  trunctated.

* If we just rely always on current_size, then QEMU may have data
  compatibility issues with Virtual PC (we may write too much data
  into a VHD file to be used by Virtual PC, for instance).

* Existing VHD images created by QEMU have used the CHS calculations,
  except for images exceeding the 127GB limit.  We want to remain
  compatible with our own generated images.

Luckily, the VHD specification defines a 'Creator App' field, that is
used to indicate what software created the VHD file.

This patch does two things:

    1. Uses the 'Creator App' field to help determine how to calculate
       size, and

    2. Adds a VPC format option 'force_size_calc', so that the user can
       override the 'Creator App' auto-detection, in case there exist
       VHD images with unknown or contradictory 'Creator App' entries.

N.B.: We currently use the maximum CHS value as an indication to use the
current_size field.  This patch does not change that, even with the
'force_size_calc' option.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:42 +01:00
Kevin Wolf
c21cc6ca98 block/qapi: Include empty drives in query-blockstats
Since commit 5ec18f8c, query-blockstats didn't return the statistics of
drives without media any more because such drives have only a BB now,
but not a BDS any more.

This patch fixes the regression so that query-blockstats iterates over
BBs by default and empty drives are displayed again.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-14 16:46:42 +01:00
Kevin Wolf
b07363a1a3 block/qapi: Factor out bdrv_query_bds_stats()
The new functions handles the data that is taken from the
BlockDriverState.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-14 16:46:42 +01:00
Kevin Wolf
2b77e60ab8 block/qapi: Factor out bdrv_query_blk_stats()
The new functions handles the data that is taken from the BlockBackend.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-03-14 16:46:42 +01:00
Paolo Bonzini
396374caea qemu-img: eliminate memory leak
Not particularly important since qemu-img exits immediately after
calling img_rebase, but easily fixed.  Coverity says thanks.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-03-14 16:46:42 +01:00
Peter Maydell
6dcea61425 Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20160311.0' into staging
VFIO updates 2016-03-11

 - Allow devices to be specified via sysfs path (Alex Williamson)
 - vfio region helpers and generalization for future device specific regions
   (Alex Williamson)
 - Automatic ROM device ID and checksum fixup (Alex Williamson)
 - Split VGA setup to allow enabling VGA from quirks (Alex Williamson)
 - Remove fixed string limit for ROM MemoryRegion name (Neo Jia)
 - MAINTAINERS update (Thomas Huth)

# gpg: Signature made Fri 11 Mar 2016 15:55:31 GMT using RSA key ID 3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"

* remotes/awilliam/tags/vfio-update-20160311.0:
  MAINTAINERS: Add entry for the include/hw/vfio/ folder
  vfio/pci: replace fixed string limit by g_strdup_printf
  vfio/pci: Split out VGA setup
  vfio/pci: Fixup PCI option ROMs
  vfio/pci: Convert all MemoryRegion to dynamic alloc and consistent functions
  vfio: Generalize region support
  vfio: Wrap VFIO_DEVICE_GET_REGION_INFO
  vfio: Add sysfsdev property for pci & platform

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-14 15:11:39 +00:00
Peter Maydell
0dcee62261 Merge remote-tracking branch 'remotes/amit-migration/tags/migration-for-2.6-7' into staging
migration:
 - postcopy is no longer experimental
 - fix a use-after-free in postcopy
 - fix a compile warning

# gpg: Signature made Fri 11 Mar 2016 12:29:33 GMT using RSA key ID 854083B6
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg:                 aka "Amit Shah <amit@kernel.org>"
# gpg:                 aka "Amit Shah <amitshah@gmx.net>"

* remotes/amit-migration/tags/migration-for-2.6-7:
  postcopy: Remove the x-
  postcopy: listen thread is never joined
  migration: fix use-after-free in loadvm_postcopy_handle_run_bh
  migration: fix warning for source_return_path_thread

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-14 13:51:21 +00:00
Peter Maydell
8326ec2c83 Merge remote-tracking branch 'remotes/berrange/tags/pull-io-win32-2016-03-11-1' into staging
Merge I/O fixes for win32

# gpg: Signature made Fri 11 Mar 2016 10:03:20 GMT using RSA key ID 15104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"

* remotes/berrange/tags/pull-io-win32-2016-03-11-1:
  osdep: remove use of socket_error() from all code
  osdep: add wrappers for socket functions
  char: remove qemu_chr_open_socket_fd method
  char: remove socket_try_connect method
  char: remove qemu_chr_finish_socket_connection method
  io: implement socket watch for win32 using WSAEventSelect+select
  io: remove checking of EWOULDBLOCK
  io: use qemu_accept to ensure SOCK_CLOEXEC is set
  io: introduce qio_channel_create_socket_watch
  io: pass HANDLE to g_source_add_poll on Win32
  io: fix copy+paste mistake in socket error message
  io: assert errors before asserting content in I/O test
  io: set correct error object in background reader test thread
  io: wait for incoming client in socket test
  io: bind to socket before creating QIOChannelSocket
  io: initialize sockets in test program
  io: use bind() to check for IPv4/6 availability
  osdep: fix socket_error() to work with Mingw64

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-14 11:49:33 +00:00
Peter Maydell
d1ab9681ac Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20160311' into staging
CPU hotplug via cpu-add for s390x, cleanup of the s390x machine
compat code and a bugfix in the s390-ccw bios.

# gpg: Signature made Fri 11 Mar 2016 09:48:02 GMT using RSA key ID C6F02FAF
# gpg: Good signature from "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"

* remotes/cohuck/tags/s390x-20160311:
  s390x/cpu: use g_new0
  s390x: Introduce S390MachineClass
  s390x: Introduce machine definition macros
  pc-bios/s390-ccw: fix old bug in ptr increment
  s390x/cpu: Allow hotplug of CPUs
  s390x/cpu: Add error handling to cpu creation
  s390x/cpu: Add CPU property links
  s390x/cpu: Tolerate max_cpus
  s390x/cpu: Get rid of side effects when creating a vcpu
  s390x/cpu: Set initial CPU state in common routine
  s390x/cpu: Cleanup init in preparation for hotplug

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-14 11:13:11 +00:00
Hollis Blanchard
f2d089425d trace: separate MMIO tracepoints from TB-access tracepoints
Memory accesses to code which has previously been translated into a TB show up
in the MMIO path, so that they may invalidate the TB. It's extremely confusing
to mix those in with device MMIOs, so split them into their own tracepoint.

Signed-off-by: Hollis Blanchard <hollis_blanchard@mentor.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1456949575-1633-2-git-send-email-hollis_blanchard@mentor.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-03-14 09:34:30 +00:00
Hollis Blanchard
5a68be94ac trace: include CPU index in trace_memory_region_*()
Knowing which CPU performed an action is essential for understanding SMP guest
behavior.

However, cpu_physical_memory_rw() may be executed by a machine init function,
before any VCPUs are running, when there is no CPU running ('current_cpu' is
NULL). In this case, store -1 in the trace record as the CPU index. Trace
analysis tools may need to be aware of this special case.

Signed-off-by: Hollis Blanchard <hollis_blanchard@mentor.com>
Message-id: 1456949575-1633-1-git-send-email-hollis_blanchard@mentor.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-03-14 09:34:30 +00:00
Cédric Le Goater
5167560b03 ipmi: add some local variables in ipmi_sdr_init
This patch adds a couple of variables to manipulate the raw sdr
entries. The const attribute is also removed on init_sdrs. This will
ease the introduction of a sdr loader using a file.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:13 +02:00
Cédric Le Goater
52fc01d973 ipmi: remove the need of an ending record in the SDR table
Currently, the code initializing the sdr table relies on an ending
record with a recid of 0xffff. This patch changes the loop to use the
sdr size as a breaking condition.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:13 +02:00
Cédric Le Goater
4fa9f08e96 ipmi: use a function to initialize the SDR table
This patch moves the code section initializing the sdrs in its own
routine to prepare ground for changes in the subsequent patches.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:13 +02:00
Cédric Le Goater
0bc6001f0d ipmi: add a realize function to the device class
This will be useful to define and use properties when the object is
instantiated.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:13 +02:00
Cédric Le Goater
6acb971a94 ipmi: add rsp_buffer_set_error() helper
The third byte in the response buffer of an IPMI command holds the
error code. In many IPMI command handlers, this byte is updated
directly. This patch adds a helper routine to clarify why this byte is
being used.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:13 +02:00
Cédric Le Goater
7f996411ad ipmi: remove IPMI_CHECK_RESERVATION() macro
Some IPMI command handlers in the BMC simulator use a macro
IPMI_CHECK_RESERVATION() to check a SDR reservation but the macro
implicitly uses local variables. This patch simply removes it.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:13 +02:00
Cédric Le Goater
a580d82085 ipmi: replace IPMI_ADD_RSP_DATA() macro with inline helpers
The IPMI command handlers in the BMC simulator use a macro
IPMI_ADD_RSP_DATA() to push bytes in a response buffer. The macro
hides the fact that it implicitly uses variables local to the handler,
which is misleading.

This patch introduces a simple 'struct RspBuffer' and inlined helper
routines to store byte(s) in a response buffer. rsp_buffer_push()
replaces the macro IPMI_ADD_RSP_DATA() and rsp_buffer_pushmore() is
new helper to push multiple bytes. The latest is used in the command
handlers get_msg() and get_sdr() which are manipulating the buffer
directly.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:13 +02:00
Cédric Le Goater
4f298a4b29 ipmi: remove IPMI_CHECK_CMD_LEN() macro
Most IPMI command handlers in the BMC simulator start with a call to
the macro IPMI_CHECK_CMD_LEN() which verifies that a minimal number of
arguments expected by the command are indeed available. To achieve
this task, the macro implicitly uses local variables which is
misleading in the code.

This patch adds a 'cmd_len_min' attribute to the struct IPMICmdHandler
defining the minimal number of arguments expected by the command and
moves this check in the global command handler ipmi_sim_handle_command().

To clarify the checks being done on the received command, the patch
introduces a helper ipmi_get_handler().

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:13 +02:00
Michael S. Tsirkin
5da4fb0018 MAINTAINERS: machine core
Marcel and Eduardo agreed to co-maintain these.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:13 +02:00
Thomas Huth
494f7b572e MAINTAINERS: Add an entry for virtio header files
Files in the include/hw/virtio/ folder should be included in the
"virtio" sections of the MAINTAINERS file.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:13 +02:00
Igor Mammedov
ed2ef10c0c pc: acpi: clarify why possible LAPIC entries must be present in MADT
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:12 +02:00
Igor Mammedov
adcb89d55d pc: acpi: drop cpu->found_cpus bitmap
cpu->found_cpus bitmap is used for setting present
flag in CPON AML package. But it takes a bunch of code
to fill bitmap and could be simplified by getting
presense info from possible CPUs list directly.

So drop cpu->found_cpus bitmap and unroll possible
CPUs list into APIC index array at the place where
CPUON AML package is created.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-03-11 16:59:12 +02:00
Igor Mammedov
2adba0a18a pc: acpi: create Processor and Notify objects only for valid lapics
do not assume that all lapics in range 0..apic_id_limit
are valid and do not create Processor and Notify objects
for not possible lapics.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:12 +02:00
Igor Mammedov
907e7c94d1 pc: acpi: create MADT.lapic entries only for valid lapics
do not assume that all lapics in range 0..apic_id_limit
are valid and do not create lapic entries for not
possible lapics in MADT.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-03-11 16:59:12 +02:00
Igor Mammedov
5803fce389 pc: acpi: SRAT: create only valid processor lapic entries
When APIC IDs are sparse*, in addition to valid LAPIC
entries the SRAT is also filled invalid ones for non
possible APIC IDs.
Fix it by asking machine for all possible APIC IDs
instead of wrongly assuming that all APIC IDs in
range 0..apic_id_limit are possible.

* sparse lapic topology CLI:
     -smp x,sockets=2,cores=3,maxcpus=6
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:12 +02:00
Igor Mammedov
3d3ebcad6a pc: acpi: cleanup qdev_get_machine() calls
cache qdev_get_machine() result in acpi_setup/acpi_build_update
time and pass it as an argument to child functions that need it.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-03-11 16:59:12 +02:00
Igor Mammedov
3811ef14f5 machine: introduce MachineClass.possible_cpu_arch_ids() hook
on x86 currently range 0..max_cpus is used to generate
architecture-dependent CPU ID (APIC Id) for each present
and possible CPUs. However architecture-dependent CPU IDs
list could be sparse and code that needs to enumerate
all IDs (ACPI) ended up doing guess work enumerating all
possible and impossible IDs up to
  apic_id_limit = x86_cpu_apic_id_from_index(max_cpus).

That leads to creation of MADT entries and Processor
objects in ACPI tables for not possible CPUs.
Fix it by allowing board specify a concrete list of
CPU IDs accourding its own rules (which for x86 depends
on topology). So that code that needs this list could
request it from board instead of trying to guess
what IDs are correct on its own.

This interface will also allow to help making AML
part of CPU hotplug target independent so it could
be reused for ARM target.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-03-11 16:59:12 +02:00
Igor Mammedov
ebde2465a9 pc: init pcms->apic_id_limit once and use it throughout pc.c
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-03-11 16:59:12 +02:00
Igor Mammedov
ae29883508 pc: acpi: remove NOP assignment
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:12 +02:00
Cao jin
f9735fd53f pxb: cleanup
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-03-11 16:59:12 +02:00
Marc-André Lureau
342f7a9d05 qemu-char: make tcp_chr_disconnect() reentrant-safe
During CHR_EVENT_CLOSED, the function could be reentered, make this
case safe.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:12 +02:00
Marc-André Lureau
6167ebbd91 qemu-char: remove all msgfds on disconnect
Disconnect should reset context.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:12 +02:00
Marc-André Lureau
869a58af86 qemu-char: avoid potential double-free
If tcp_set_msgfds() is called several time with NULL fds, this
could lead to double-free.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:12 +02:00
Marc-André Lureau
b7fcb3603c vhost-user: remove useless is_server field
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:12 +02:00
Marc-André Lureau
c1bf3531ae vhost-user: fix use after free
"name" is freed after visiting options, instead use the first NetClientState
name. Adds a few assert() for clarifying and checking some impossible states.

READ of size 1 at 0x602000000990 thread T0
    #0 0x7f6b251c570c  (/lib64/libasan.so.2+0x4770c)
    #1 0x5566dc380600 in qemu_find_net_clients_except net/net.c:824
    #2 0x5566dc39bac7 in net_vhost_user_event net/vhost-user.c:193
    #3 0x5566dbee862a in qemu_chr_be_event /home/elmarco/src/qemu/qemu-char.c:201
    #4 0x5566dbef2890 in tcp_chr_disconnect /home/elmarco/src/qemu/qemu-char.c:2790
    #5 0x5566dbef2d0b in tcp_chr_sync_read /home/elmarco/src/qemu/qemu-char.c:2835
    #6 0x5566dbee8a99 in qemu_chr_fe_read_all /home/elmarco/src/qemu/qemu-char.c:295
    #7 0x5566dc39b964 in net_vhost_user_watch net/vhost-user.c:180
    #8 0x5566dc5a06c7 in qio_channel_fd_source_dispatch io/channel-watch.c:70
    #9 0x7f6b1aa2ab87 in g_main_dispatch /home/elmarco/src/gnome/glib/glib/gmain.c:3154
    #10 0x7f6b1aa2b9cb in g_main_context_dispatch /home/elmarco/src/gnome/glib/glib/gmain.c:3769
    #11 0x5566dc475ed4 in glib_pollfds_poll /home/elmarco/src/qemu/main-loop.c:212
    #12 0x5566dc476029 in os_host_main_loop_wait /home/elmarco/src/qemu/main-loop.c:257
    #13 0x5566dc476165 in main_loop_wait /home/elmarco/src/qemu/main-loop.c:505
    #14 0x5566dbf08d31 in main_loop /home/elmarco/src/qemu/vl.c:1932
    #15 0x5566dbf16783 in main /home/elmarco/src/qemu/vl.c:4646
    #16 0x7f6b180bb57f in __libc_start_main (/lib64/libc.so.6+0x2057f)
    #17 0x5566dbbf5348 in _start (/home/elmarco/src/qemu/x86_64-softmmu/qemu-system-x86_64+0x3f9348)

0x602000000990 is located 0 bytes inside of 5-byte region [0x602000000990,0x602000000995)
freed by thread T0 here:
    #0 0x7f6b2521666a in __interceptor_free (/lib64/libasan.so.2+0x9866a)
    #1 0x7f6b1aa332a4 in g_free /home/elmarco/src/gnome/glib/glib/gmem.c:189
    #2 0x5566dc5f416f in qapi_dealloc_type_str qapi/qapi-dealloc-visitor.c:134
    #3 0x5566dc5f3268 in visit_type_str qapi/qapi-visit-core.c:196
    #4 0x5566dc5ced58 in visit_type_Netdev_fields /home/elmarco/src/qemu/qapi-visit.c:5936
    #5 0x5566dc5cef71 in visit_type_Netdev /home/elmarco/src/qemu/qapi-visit.c:5960
    #6 0x5566dc381a8d in net_visit net/net.c:1049
    #7 0x5566dc381c37 in net_client_init net/net.c:1076
    #8 0x5566dc3839e2 in net_init_netdev net/net.c:1473
    #9 0x5566dc63cc0a in qemu_opts_foreach util/qemu-option.c:1112
    #10 0x5566dc383b36 in net_init_clients net/net.c:1499
    #11 0x5566dbf15d86 in main /home/elmarco/src/qemu/vl.c:4397
    #12 0x7f6b180bb57f in __libc_start_main (/lib64/libc.so.6+0x2057f)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:12 +02:00
Xiao Guangrong
f7df22de56 nvdimm acpi: emulate dsm method
Emulate dsm method after IO VM-exit

Currently, we only introduce the framework and no function is actually
supported

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:11 +02:00
Xiao Guangrong
18c440e1e1 nvdimm acpi: let qemu handle _DSM method
If dsm memory is successfully patched, we let qemu fully emulate
the dsm method

This patch saves _DSM input parameters into dsm memory, tell dsm
memory address to QEMU, then fetch the result from the dsm memory

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:11 +02:00
Xiao Guangrong
b99514135b nvdimm acpi: introduce patched dsm memory
The dsm memory is used to save the input parameters and store
the dsm result which is filled by QEMU.

The address of dsm memory is decided by bios and patched into
int32 object named "MEMA"

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:11 +02:00
Xiao Guangrong
5fe79386ba nvdimm acpi: initialize the resource used by NVDIMM ACPI
32 bits IO port starting from 0x0a18 in guest is reserved for NVDIMM
ACPI emulation. The table, NVDIMM_DSM_MEM_FILE, will be patched into
NVDIMM ACPI binary code

OSPM uses this port to tell QEMU the final address of the DSM memory
and notify QEMU to emulate the DSM method

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:11 +02:00
Gerd Hoffmann
b63283d7c3 pci-ids: add virtio 1.0 ids to spec
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:11 +02:00
Michael S. Tsirkin
2c02a48e6d acpi-test-data: add _DIS methods
commit c82f503dd5
("hw/acpi: fix Q35 support for legacy Windows OS")
added _DIS for all link devices.

Update expected test files accordingly.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:59:11 +02:00
Marcel Apfelbaum
c82f503dd5 hw/acpi: fix Q35 support for legacy Windows OS
Legacy Windows operating systems like Windows XP and Windows 2003
require _DIS method to be present for all interrupt links.

PC machines already have a no-op implemented for GSI links, add
it also in Q35.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2016-03-11 16:45:21 +02:00
Cao jin
7335a95abd ich9lpc: fix typo
change some "rbca" to "rcrb"(root complex register block) while
the other to "rcba"(root complex base address).
Bonus: add more comments and fix some indentation.

Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:45:21 +02:00
Michael S. Tsirkin
226419d615 msi_supported -> msi_nonbroken
Rename controller flag to make it clearer what it means.
Add some documentation as well.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:45:21 +02:00
Gerd Hoffmann
75fd6f13af virtio-pci: call pci reset variant when guest requests reset.
Actually fixes linux not finding virtio 1.0 device virtqueues after
reboot.  Which is new I think, any chance linux kernel virtio code
became more strict in 4.3?

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Tested-by: Fam Zheng <famz@redhat.com>
2016-03-11 16:45:21 +02:00
Michael S. Tsirkin
79248c22ad i386: update expected DSDT
DSDT was changed by:
commit 27b9fc54d2 ("i386: populate floppy
drive information in DSDT").

Update expected files accordingly.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 16:44:58 +02:00
Roman Kagan
27b9fc54d2 i386: populate floppy drive information in DSDT
On x86-based systems Linux determines the presence and the type of
floppy drives via a query of a CMOS field.  So does SeaBIOS when
populating the return data for int 0x13 function 0x08.

However Windows doesn't do it. Instead, it requests this information
from BIOS via int 0x13/0x08 or through ACPI objects _FDE (Floppy Drive
Enumerate) and _FDI (Floppy Drive Information) of the floppy controller
object.  On UEFI systems only ACPI-based detection is supported.

QEMU doesn't provide those objects in its ACPI tables and as a result
floppy drives are invisible to Windows on UEFI/OVMF.

This patch adds those objects to the floppy controller in DSDT,
populating them with the information from respective QEMU objects.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: John Snow <jsnow@redhat.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Kevin O'Connor <kevin@koconnor.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 14:55:15 +02:00
Roman Kagan
e08fde0c5e fdc: add function to determine drive chs limits
When populating ACPI objects for floppy drives one needs to provide the
maximum values for cylinder, sector, and head number the drive supports.

This patch adds a function that iterates through the array of predefined
floppy drive formats and returns the maximum values of c, h, s, out of
those matching the given floppy drive type.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: John Snow <jsnow@redhat.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Kevin O'Connor <kevin@koconnor.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
2016-03-11 14:55:15 +02:00
Roman Kagan
bda055096b i386: expose floppy drive CMOS type
Make it possible to query the CMOS type of a floppy drive outside of the
source file where it's defined.

It will allow to properly populate the corresponding ACPI objects and
thus enable Windows on BIOS-less systems to access the floppy drives.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: John Snow <jsnow@redhat.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Kevin O'Connor <kevin@koconnor.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 14:55:15 +02:00
Roman Kagan
9b613f4e40 i386/acpi: make floppy controller object dynamic
Instead of statically declaring the floppy controller in DSDT, with its
_STA method depending on some obscure bit in the parent ISA bridge, add
the object dynamically to DSDT via AML API only when the controller is
present.

The _STA method is no longer necessary and is therefore dropped.  So are
the declarations of the fields indicating whether the contoller is
enabled.

Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: John Snow <jsnow@redhat.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Kevin O'Connor <kevin@koconnor.net>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 14:55:15 +02:00
Igor Mammedov
c9f4b77ad5 pc-dimm: fix error handling in pc_dimm_check_memdev_is_busy()
If host_memory_backend_get_memory() were to return error and
NULL MemoryRegion, pc_dimm_check_memdev_is_busy() would crash
dereferencing NULL pointer in memory_region_is_mapped().
But if error is set and non NULL MemoryRegion is returned
then error_setg() will fail with "error already set" assertion
in error_setv()

To avoid above issues use typical error handling pattern
for property setters:

Error *local_error = NULL;
...
error_propagate(errp, local_err);

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 14:55:15 +02:00
Ilya Maximets
fff4e48ed5 vhost-user: verify that number of queues is less than MAX_QUEUE_NUM
Fix QEMU crash when -netdev vhost-user,queues=n is passed with number
of queues greater than MAX_QUEUE_NUM.

Signed-off-by: Ilya Maximets <i.maximets@samsung.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2016-03-11 14:55:15 +02:00
Denis V. Lunev
a0d06486b4 virtio-balloon: add 'available' counter
The patch for the kernel part is in linux-next already:
commit ac88e7c908b920866e529862f2b2f0129b254ab2
    Author: Igor Redko <redkoi@virtuozzo.com>
    Date:   Thu Feb 18 09:23:01 2016 +1100

    virtio_balloon: export 'available' memory to balloon statistics

    Add a new field, VIRTIO_BALLOON_S_AVAIL, to virtio_balloon memory
    statistics protocol, corresponding to 'Available' in /proc/meminfo.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Igor Redko <redkoi@virtuozzo.com>
CC: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 14:55:15 +02:00
Marcel Apfelbaum
fc1769b758 hw/virtio: group virtio flags into an enum
Minimizes the possibility to assign
the same bit to different features.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2016-03-11 14:54:28 +02:00
Marcel Apfelbaum
631a438755 hw/virtio: fix double use of a virtio flag
Commits 1811e64c and a6df8adf use the same virtio feature bit 4
for different features.

Fix it by using different bits.

Reported-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
2016-03-11 14:54:28 +02:00
Ladi Prosek
4eae2a657d balloon: fix segfault and harden the stats queue
The segfault here is triggered by the driver notifying the stats queue
twice after adding a buffer to it. This effectively resets stats_vq_elem
back to NULL and QEMU crashes on the next stats timer tick in
balloon_stats_poll_cb.

This is a regression introduced in 51b19ebe43, although admittedly
the device assumed too much about the stats queue protocol even before
that commit. This commit adds a few more checks and ensures that the one
stats buffer gets deallocated on device reset.

Cc: qemu-stable@nongnu.org
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 14:54:28 +02:00
Michael S. Tsirkin
f203549108 acpi: add build_append_named_dword, returning an offset in buffer
This is a very limited form of support for runtime patching -
similar in functionality to what we can do with ACPI_EXTRACT
macros in python, but implemented in C.

This is to allow ACPI code direct access to data tables -
which is exactly what DataTableRegion is there for, except
no known windows release so far implements DataTableRegion.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 14:54:28 +02:00
Xiao Guangrong
3f3009c098 acpi: allow using object as offset for OperationRegion
Extend aml_operation_region() to use object as offset

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 14:54:28 +02:00
Xiao Guangrong
9815cba502 acpi: add aml_concatenate()
It will be used by nvdimm acpi

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 14:54:28 +02:00
Xiao Guangrong
39b6dbd8d7 acpi: add aml_create_field()
It will be used by nvdimm acpi

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-11 14:54:27 +02:00
Dr. David Alan Gilbert
32c3db5b26 postcopy: Remove the x-
Postcopy seems to have survived a cycle with only a few fixes,
and Jiri has the current libvirt wired up and working
( https://www.redhat.com/archives/libvir-list/2016-March/msg00080.html )
so remove the experimental tag.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1457690016-9070-3-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-03-11 17:53:59 +05:30
Dr. David Alan Gilbert
a587a3fe6c postcopy: listen thread is never joined
We don't join the listen thread, it does its own cleanup.
Mark as detached not joinable.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1457690016-9070-2-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-03-11 17:53:59 +05:30
Denis V. Lunev
8646992279 migration: fix use-after-free in loadvm_postcopy_handle_run_bh
MigrationState is destroyed before we can come into bottom half.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Amit Shah <amit.shah@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1457537708-8622-1-git-send-email-den@openvz.org>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-03-11 12:58:45 +05:30
Peter Xu
568b01caf3 migration: fix warning for source_return_path_thread
max_len is not necessary, while it brings a warning during compilation
when specify "-Wstack-usage=1000000". Replacing using sizeof().

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1457503932-31763-1-git-send-email-peterx@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-03-11 12:58:37 +05:30
Thomas Huth
99b88c6d1f MAINTAINERS: Add entry for the include/hw/vfio/ folder
The headers in include/hw/vfio/ should be listed in the VFIO
section of the MAINTAINERS file.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-03-10 20:50:44 -07:00
Neo Jia
062ed5d8d6 vfio/pci: replace fixed string limit by g_strdup_printf
A trivial change to remove string limit by using g_strdup_printf

Tested-by: Neo Jia <cjia@nvidia.com>
Signed-off-by: Neo Jia <cjia@nvidia.com>
Signed-off-by: Kirti Wankhede <kwankhede@nvidia.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-03-10 20:50:43 -07:00
Alex Williamson
e593c0211b vfio/pci: Split out VGA setup
This could be setup later by device specific code, such as IGD
initialization.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-03-10 20:50:41 -07:00
Alex Williamson
e2e5ee9c56 vfio/pci: Fixup PCI option ROMs
Devices like Intel graphics are known to not only have bad checksums,
but also the wrong device ID.  This is not so surprising given that
the video BIOS is typically part of the system firmware image rather
that embedded into the device and needs to support any IGD device
installed into the system.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-03-10 20:50:39 -07:00
Alex Williamson
2d82f8a3cd vfio/pci: Convert all MemoryRegion to dynamic alloc and consistent functions
Match common vfio code with setup, exit, and finalize functions for
BAR, quirk, and VGA management.  VGA is also changed to dynamic
allocation to match the other MemoryRegions.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-03-10 20:50:38 -07:00
Alex Williamson
db0da029a1 vfio: Generalize region support
Both platform and PCI vfio drivers create a "slow", I/O memory region
with one or more mmap memory regions overlayed when supported by the
device. Generalize this to a set of common helpers in the core that
pulls the region info from vfio, fills the region data, configures
slow mapping, and adds helpers for comleting the mmap, enable/disable,
and teardown.  This can be immediately used by the PCI MSI-X code,
which needs to mmap around the MSI-X vector table.

This also changes VFIORegion.mem to be dynamically allocated because
otherwise we don't know how the caller has allocated VFIORegion and
therefore don't know whether to unreference it to destroy the
MemoryRegion or not.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-03-10 20:03:16 -07:00
Daniel P. Berrange
b16a44e13e osdep: remove use of socket_error() from all code
Now that QEMU wraps the Win32 sockets methods to automatically
set errno upon failure, there is no reason for callers to use
the socket_error() method. They can rely on accessing errno
even on Win32. Remove all use of socket_error() from general
code, leaving it as a static method in oslib-win32.c only.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-10 17:19:34 +00:00
Daniel P. Berrange
a2d96af4bb osdep: add wrappers for socket functions
The windows socket functions look identical to the normal POSIX
sockets functions, but instead of setting errno, the caller needs
to call WSAGetLastError(). QEMU has tried to deal with this
incompatibility by defining a socket_error() method that callers
must use that abstracts the difference between WSAGetLastError()
and errno.

This approach is somewhat error prone though - many callers of
the sockets functions are just using errno directly because it
is easy to forget the need use a QEMU specific wrapper. It is
not always immediately obvious that a particular function will
in fact call into Windows sockets functions, so the dev may not
even realize they need to use socket_error().

This introduces an alternative approach to portability inspired
by the way GNULIB fixes portability problems. We use a macro to
redefine the original socket function names to refer to a QEMU
wrapper function. The wrapper function calls the original Win32
sockets method and then sets errno from the WSAGetLastError()
value.

Thus all code can simply call the normal POSIX sockets APIs are
have standard errno reporting on error, even on Windows. This
makes the socket_error() method obsolete.

We also bring closesocket & ioctlsocket into this approach. Even
though they are non-standard Win32 names, we can't wrap the normal
close/ioctl methods since there's no reliable way to distinguish
between a file descriptor and HANDLE in Win32.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-10 17:19:07 +00:00
Daniel P. Berrange
08b758b482 char: remove qemu_chr_open_socket_fd method
The qemu_chr_open_socket_fd method takes care of either doing a
synchronous socket connect, or creating a listener socket. Part
of the work when creating the listener socket is to register a
watch for incoming clients. The caller of qemu_chr_open_socket_fd
may not want this watch created, as it might be doing a synchronous
wait for the first client. Rather than passing yet more parameters
into qemu_chr_open_socket_fd to let it handle this, just remove
the qemu_chr_open_socket_fd method an inline its functionality
into the caller. This allows for a clearer control flow and shorter
code.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-10 17:19:07 +00:00
Daniel P. Berrange
317856cac8 char: remove socket_try_connect method
The qemu_chr_open_socket_fd() method multiplexes three different
actions into one method. The socket_try_connect() method is one
of its callers, but it only ever want one specific action
performed. By inlining that action into socket_try_connect()
we see that there is not in fact any failure scenario, so there
is not even any reason for socket_try_connect to exist. Just
inline the asynchronous connection attempts directly at the
places that need them. This shortens & clarifies the code.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-10 17:19:07 +00:00
Daniel P. Berrange
f50dfe457f char: remove qemu_chr_finish_socket_connection method
The qemu_chr_finish_socket_connection method is multiplexing two
different actions into one method. Each caller of it though, only
wants one specific action. The code is shorter & clearer if we
thus remove the method and just inline the specific actions
where needed.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-10 17:19:07 +00:00
Paolo Bonzini
a589720567 io: implement socket watch for win32 using WSAEventSelect+select
On Win32 we cannot directly poll on socket handles. Instead we
create a Win32 event object and associate the socket handle with
the event. When the event signals readyness we then have to
use select to determine which events are ready. Creating Win32
events is moderately heavyweight, so we don't want todo it
every time we create a GSource, so this associates a single
event with a QIOChannel.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-10 17:19:07 +00:00
Daniel P. Berrange
30fd3e2790 io: remove checking of EWOULDBLOCK
Since we now canonicalize WSAEWOULDBLOCK into EAGAIN there is
no longer any need to explicitly check EWOULDBLOCK for Win32.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-10 17:19:05 +00:00
Daniel P. Berrange
de7971ffb9 io: use qemu_accept to ensure SOCK_CLOEXEC is set
The QIOChannelSocket code mistakenly uses the bare accept()
function which does not set SOCK_CLOEXEC.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-10 17:11:40 +00:00
Paolo Bonzini
b83b68a013 io: introduce qio_channel_create_socket_watch
Sockets are not in the same namespace as file descriptors on Windows.
As an initial step, introduce separate APIs for file descriptor and
socket watches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-10 17:10:19 +00:00
Paolo Bonzini
e560d141ab io: pass HANDLE to g_source_add_poll on Win32
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-10 17:10:19 +00:00
Daniel P. Berrange
5151d23e65 io: fix copy+paste mistake in socket error message
s/write/read/ in the error message reported after
readmsg() fails

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-10 17:10:18 +00:00
Daniel P. Berrange
294bbbb425 io: assert errors before asserting content in I/O test
When checking the results of an I/O operation test, assert that
the error objects are NULL before asserting on the content. This
is found to give more useful indication of the problem when
diagnosing test failures.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-10 17:10:18 +00:00
Daniel P. Berrange
256920eb94 io: set correct error object in background reader test thread
The reader thread was accidentally setting the error pointer
intended for the writer thread. If both threads set errors
this would result in QEMU abort'ing due to the error already
being set.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-10 17:10:18 +00:00
Daniel P. Berrange
a9d5aed12d io: wait for incoming client in socket test
Exercise the GSource code for server sockets by calling
qio_channel_wait() prior to accepting the incoming client.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-10 17:10:18 +00:00
Daniel P. Berrange
abc981bf29 io: bind to socket before creating QIOChannelSocket
In the QIOChannelSocket test we create a socket file
descriptor and then try to create a QIOChannelSocket.
This works on Linux, but fails on Win32 because it is
not valid to call getsockname() on an unbound socket.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-10 17:10:18 +00:00
Daniel P. Berrange
5838d66e73 io: initialize sockets in test program
The win32 sockets layer requires that socket_init() is called
otherwise nothing will work.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-10 17:10:18 +00:00
Daniel P. Berrange
0a27af918b io: use bind() to check for IPv4/6 availability
Currently the test-io-channel-socket.c test uses getifaddrs
to see if an IPv4/6 address is present on any host NIC, as
a way to determine if IPv4/6 sockets can be used. This is
problematic because getifaddrs is not available on Win32.

Rather than testing indirectly via getifaddrs, just create
a socket and try to bind() to the loopback address instead.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-10 17:10:18 +00:00
Daniel P. Berrange
c619644067 osdep: fix socket_error() to work with Mingw64
Historically QEMU has had a socket_error() macro that was
defined to map to WSASocketError(). The os-win32.h header
file would define errno constants that mapped to the
WSA error constants. This worked fine with Mingw32 since
its header files never defined any errno values, nor did
it even provide an errno.h.  So callers of socket_error()
could match on traditional Exxxx constants and it would
all "just work".

With Mingw64 though, things work rather differently. First
there is an errno.h file which defines all the traditional
errno constants you'd expect from a UNIX platform. There
is then a winerror.h which defined the WSA error constants.
Crucially the WSAExxxx errno values in winerror.h do not
match the Exxxx errno values in error.h.

If QEMU had only imported winerror.h it would still work,
but the qemu/osdep.h file unconditionally imports errno.h.
So callers of socket_error() will get now WSAExxxx values
back and compare them to the Exxx constants. This will
always fail silently at runtime.

To solve this QEMU needs to stop assuming the WSAExxxx
constant values match the Exxx constant values. Thus the
socket_error() macro is turned into a small function that
re-maps WSAExxxx values into Exxx.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-03-10 17:10:17 +00:00
Alex Williamson
469002263a vfio: Wrap VFIO_DEVICE_GET_REGION_INFO
In preparation for supporting capability chains on regions, wrap
ioctl(VFIO_DEVICE_GET_REGION_INFO) so we don't duplicate the code for
each caller.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-03-10 09:39:07 -07:00
Alex Williamson
7df9381b7a vfio: Add sysfsdev property for pci & platform
vfio-pci currently requires a host= parameter, which comes in the
form of a PCI address in [domain:]<bus:slot.function> notation.  We
expect to find a matching entry in sysfs for that under
/sys/bus/pci/devices/.  vfio-platform takes a similar approach, but
defines the host= parameter to be a string, which can be matched
directly under /sys/bus/platform/devices/.  On the PCI side, we have
some interest in using vfio to expose vGPU devices.  These are not
actual discrete PCI devices, so they don't have a compatible host PCI
bus address or a device link where QEMU wants to look for it.  There's
also really no requirement that vfio can only be used to expose
physical devices, a new vfio bus and iommu driver could expose a
completely emulated device.  To fit within the vfio framework, it
would need a kernel struct device and associated IOMMU group, but
those are easy constraints to manage.

To support such devices, which would include vGPUs, that honor the
VFIO PCI programming API, but are not necessarily backed by a unique
PCI address, add support for specifying any device in sysfs.  The
vfio API already has support for probing the device type to ensure
compatibility with either vfio-pci or vfio-platform.

With this, a vfio-pci device could either be specified as:

-device vfio-pci,host=02:00.0

or

-device vfio-pci,sysfsdev=/sys/devices/pci0000:00/0000:00:1c.0/0000:02:00.0

or even

-device vfio-pci,sysfsdev=/sys/bus/pci/devices/0000:02:00.0

When vGPU support comes along, this might look something more like:

-device vfio-pci,sysfsdev=/sys/devices/virtual/intel-vgpu/vgpu0@0000:00:02.0

NB - This is only a made up example path

The same change is made for vfio-platform, specifying sysfsdev has
precedence over the old host option.

Tested-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Eric Auger <eric.auger@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-03-10 09:39:07 -07:00
Cornelia Huck
75cfb3bb41 s390x/cpu: use g_new0
Let's use g_new0 to allocate cpu_states.

Suggested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-10 12:02:02 +01:00
Janosch Frank
8b8a61ad8c s390x: Introduce S390MachineClass
As we now have the new machine definitions, that let us disable/enable
machine options more easily, we need a way to save them and make them
publicly available.

The new s390-virtio-ccw.h header exports the s390 ccw machine state
and class, so they can be easily used in other C files.

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-10 10:37:16 +01:00
Janosch Frank
4fca654872 s390x: Introduce machine definition macros
Most of the machine definition code looks the same between different
machine versions. The new DEFINE_CCW_MACHINE macro makes defining a
new machine easier by inserting standard machine version
definitions. This also makes it possible to propagate values between
machine versions.

The patch is inspired by code from hw/ppc/spapr.c

Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-10 10:37:16 +01:00
Eugene (jno) Dvurechenski
3a3c752f0b pc-bios/s390-ccw: fix old bug in ptr increment
We need to increment by the size of the structure, whereas 'ns' is 'uint8_t *'.

Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Eugene (jno) Dvurechenski <jno@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-10 10:37:16 +01:00
Matthew Rosato
a006b67fe4 s390x/cpu: Allow hotplug of CPUs
Implement cpu hotplug routine and add the machine hook.

Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Message-Id: <1457112875-5209-8-git-send-email-mjrosato@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-10 10:37:15 +01:00
Matthew Rosato
96b1a8bb55 s390x/cpu: Add error handling to cpu creation
Check for and propogate errors during s390 cpu creation.

Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Message-Id: <1457112875-5209-7-git-send-email-mjrosato@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-10 10:37:15 +01:00
Matthew Rosato
502edbf834 s390x/cpu: Add CPU property links
Link each CPUState as property machine/cpu[n] during initialization.
Add a hotplug handler to s390-virtio-ccw machine and set the
state during plug.

Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Message-Id: <1457112875-5209-6-git-send-email-mjrosato@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-10 10:37:15 +01:00
Matthew Rosato
25637d31f2 s390x/cpu: Tolerate max_cpus
Once hotplug is enabled, interrupts may come in for CPUs
with an address > smp_cpus.  Allocate for this and allow
search routines to look beyond smp_cpus.

Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Message-Id: <1457112875-5209-5-git-send-email-mjrosato@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-10 10:37:15 +01:00
Matthew Rosato
c6644fc88b s390x/cpu: Get rid of side effects when creating a vcpu
In preparation for hotplug, defer some CPU initialization
until the device is actually being realized, including
cpu_exec_init.

Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Message-Id: <1457112875-5209-4-git-send-email-mjrosato@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-10 10:37:15 +01:00
Matthew Rosato
ef3027affc s390x/cpu: Set initial CPU state in common routine
Both initial and hotplugged CPUs need to set the same initial
state.

Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Message-Id: <1457112875-5209-3-git-send-email-mjrosato@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-10 10:37:15 +01:00
Matthew Rosato
d2eae20790 s390x/cpu: Cleanup init in preparation for hotplug
Ensure a valid cpu_model is set upfront by setting the
default value directly into the MachineState when none is
specified.  This is needed to ensure hotplugged CPUs share
the same cpu_model.

Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Message-Id: <1457112875-5209-2-git-send-email-mjrosato@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-10 10:37:15 +01:00
Peter Maydell
a648c13738 Merge remote-tracking branch 'remotes/kraxel/tags/pull-ui-20160309-1' into staging
add linux evdev support, vnc and console fixes.

# gpg: Signature made Wed 09 Mar 2016 09:02:47 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-ui-20160309-1:
  ui/console: add escape sequence \e[5, 6n
  input-linux: add switch to enable auto-repeat events
  input-linux: add option to toggle grab on all devices
  input: linux evdev support
  vnc: send cursor when a new client is connecting

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-10 02:51:14 +00:00
Ren Kimura
58aa7d8e44 ui/console: add escape sequence \e[5, 6n
Add support of escape sequence "\e[5n" and "\e[6n" to console.
"\e[5n" reports status of console and it always succeed
in virtual console.
"\e[6n" reports now cursor position in console.

Signed-off-by: Ren Kimura <rkx1209dev@gmail.com>
Message-id: 1457466681-7714-2-git-send-email-rkx1209dev@gmail.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-03-09 09:35:56 +01:00
Peter Maydell
4ba364b472 Merge remote-tracking branch 'remotes/thibault/tags/samuel-thibault' into staging
Add Samuel Thibault as slirp maintainer

# gpg: Signature made Tue 08 Mar 2016 20:43:01 GMT using RSA key ID FB6B2F1D
# gpg: Good signature from "Samuel Thibault <samuel.thibault@gnu.org>"
# gpg:                 aka "Samuel Thibault <sthibault@debian.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@inria.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@labri.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@ens-lyon.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 900C B024 B679 31D4 0F82  304B D017 8C76 7D06 9EE6
#      Subkey fingerprint: F632 74CD C630 0873 CB3D  29D9 E3E5 1CE8 FB6B 2F1D

* remotes/thibault/tags/samuel-thibault:
  MAINTAINERS: Add Samuel Thibault as slirp maintainer

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-09 05:14:55 +00:00
Peter Maydell
8519c8e073 Merge remote-tracking branch 'remotes/amit-migration/tags/migration-for-2.6-6' into staging
migration:
* add avx2 instruction optimization, speeds up zero-page checking on
  compatible architectures and compilers (gcc 4.9+)
* add additional postcopy stats to 'info migrate' output

# gpg: Signature made Tue 08 Mar 2016 11:29:48 GMT using RSA key ID 854083B6
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg:                 aka "Amit Shah <amit@kernel.org>"
# gpg:                 aka "Amit Shah <amitshah@gmx.net>"

* remotes/amit-migration/tags/migration-for-2.6-6:
  cutils: add avx2 instruction optimization
  configure: detect ifunc and avx2 attribute
  Postcopy: Fix sync count in info migrate

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-09 01:07:16 +00:00
Peter Maydell
3293680dc7 Merge remote-tracking branch 'remotes/kraxel/tags/pull-fw-cfg-20160308-1' into staging
acpi: add fw_cfg device node to dsdt

# gpg: Signature made Tue 08 Mar 2016 11:15:42 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-fw-cfg-20160308-1:
  tests: update acpi test data
  fw_cfg: document ACPI device node information
  acpi: arm: add fw_cfg device node to dsdt
  acpi: pc: add fw_cfg device node to dsdt
  pc: fw_cfg: move ioport base constant to pc.h
  fw_cfg: expose control register size in fw_cfg.h

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-09 00:44:43 +00:00
Peter Maydell
5763795f93 Merge remote-tracking branch 'remotes/amit-virtio-rng/tags/rng-for-2.6-2' into staging
rng: use simpleq instead of gslist

# gpg: Signature made Tue 08 Mar 2016 10:51:23 GMT using RSA key ID 854083B6
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg:                 aka "Amit Shah <amit@kernel.org>"
# gpg:                 aka "Amit Shah <amitshah@gmx.net>"

* remotes/amit-virtio-rng/tags/rng-for-2.6-2:
  rng: switch request queue to QSIMPLEQ

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-09 00:21:17 +00:00
Samuel Thibault
eda509fa0a MAINTAINERS: Add Samuel Thibault as slirp maintainer
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Acked-by: Jan Kiszka <jan.kiszka@siemens.com>
2016-03-08 21:39:04 +01:00
Liang Li
28b90d9c19 cutils: add avx2 instruction optimization
buffer_find_nonzero_offset() is a hot function during live migration.
Now it use SSE2 instructions for optimization. For platform supports
AVX2 instructions, use AVX2 instructions for optimization can help
to improve the performance of buffer_find_nonzero_offset() about 30%
comparing to SSE2.

Live migration can be faster with this optimization, the test result
shows that for an 8GiB RAM idle guest just boots, this patch can help
to shorten the total live migration time about 6%.

This patch use the ifunc mechanism to select the proper function when
running, for platform supports AVX2, execute the AVX2 instructions,
else, execute the original instructions.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1457416397-26671-3-git-send-email-liang.z.li@intel.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-03-08 16:53:26 +05:30
Liang Li
99f2dbd343 configure: detect ifunc and avx2 attribute
Detect if the compiler can support the ifun and avx2, if so, set
CONFIG_AVX2_OPT which will be used to turn on the avx2 instruction
optimization.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Liang Li <liang.z.li@intel.com>
Message-Id: <1457416397-26671-2-git-send-email-liang.z.li@intel.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-03-08 16:53:26 +05:30
Dr. David Alan Gilbert
614e8018ed Postcopy: Fix sync count in info migrate
I'd missed the sync count off in the postcopy case.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Message-id: 1456394631-18010-1-git-send-email-dgilbert@redhat.com
Message-Id: <1456394631-18010-1-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-03-08 16:52:27 +05:30
Gerd Hoffmann
a6ccabd676 input-linux: add switch to enable auto-repeat events
Enable with "-input-linux /dev/input/${device},repeat=on".

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1457087116-4379-4-git-send-email-kraxel@redhat.com
2016-03-08 12:20:11 +01:00
Gerd Hoffmann
46d921bebe input-linux: add option to toggle grab on all devices
Maintain a list of all input devices.  Add an option to make grab
work across all devices (so toggling grab on the keybard can switch
over the mouse too).

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1457087116-4379-3-git-send-email-kraxel@redhat.com
2016-03-08 12:20:11 +01:00
Gerd Hoffmann
e0d2bd5195 input: linux evdev support
This patch adds support for reading input events directly from linux
evdev devices and forward them to the guest.  Unlike virtio-input-host
which simply passes on all events to the guest without looking at them
this will interpret the events and feed them into the qemu input
subsystem.

Therefore this is limited to what the qemu input subsystem and the
emulated input devices are able to handle.  Also there is no support for
absolute coordinates (tablet/touchscreen).  So we are talking here about
basic mouse and keyboard support.

The advantage is that it'll work without virtio-input drivers in the
guest, the events are delivered to the usual ps/2 or usb input devices
(depending on what the machine happens to have).  And for keyboards
qemu is able to switch the keyboard between guest and host on hotkey.
The hotkey is hard-coded for now (both control keys), initialy the
guest owns the keyboard.

Probably most useful when assigning vga devices with vfio and using a
physical monitor instead of vnc/spice/gtk as guest display.

Usage:  Add '-input-linux /dev/input/event<nr>' to the qemu command
line.  Note that udev has rules which populate /dev/input/by-{id,path}
with static names, which might be more convinient to use.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1457087116-4379-2-git-send-email-kraxel@redhat.com
2016-03-08 12:20:11 +01:00
Gerd Hoffmann
a60c785608 tests: update acpi test data
using tests/acpi-test-data/rebuild-expected-aml.sh

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-03-08 12:15:27 +01:00
Gabriel L. Somlo
36a43ea83b fw_cfg: document ACPI device node information
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Marc Marí <markmb@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1455906029-25565-6-git-send-email-somlo@cmu.edu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-03-08 12:15:22 +01:00
Gabriel L. Somlo
70bee80d6b acpi: arm: add fw_cfg device node to dsdt
Add a fw_cfg device node to the ACPI DSDT. This is mostly
informational, as the authoritative fw_cfg MMIO region(s)
are listed in the Device Tree. However, since we are building
ACPI tables, we might as well be thorough while at it...

Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Marc Marí <markmb@redhat.com>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1455906029-25565-5-git-send-email-somlo@cmu.edu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-03-08 12:15:15 +01:00
Gabriel L. Somlo
e2ec75685c acpi: pc: add fw_cfg device node to dsdt
Add a fw_cfg device node to the ACPI DSDT. While the guest-side
firmware can't utilize this information (since it has to access
the hard-coded fw_cfg device to extract ACPI tables to begin with),
having fw_cfg listed in ACPI will help the guest kernel keep a more
accurate inventory of in-use IO port regions.

Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Marc Marí <markmb@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1455906029-25565-4-git-send-email-somlo@cmu.edu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-03-08 12:15:09 +01:00
Gabriel L. Somlo
305ae88895 pc: fw_cfg: move ioport base constant to pc.h
Move BIOS_CFG_IOPORT define from pc.c to pc.h, and rename
it to FW_CFG_IO_BASE.

Cc: Marc Marí <markmb@redhat.com>
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Marc Marí <markmb@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1455906029-25565-3-git-send-email-somlo@cmu.edu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-03-08 12:14:49 +01:00
Peter Maydell
d1cc881d54 Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Tue 08 Mar 2016 07:46:08 GMT using RSA key ID 398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  net: check packet payload length
  filter-buffer: Add status_changed callback processing
  filter: Add 'status' property for filter object
  rocker: allow user to specify rocker world by property
  rocker: add name field into WorldOps ale let world specify its name
  rocker: return -ENOMEM in case of some world alloc fails
  rocker: forbid to change world type
  net: netmap: probe netmap interface for virtio-net header
  net: simplify net_init_tap_one logic
  MAINTAINERS: Add entries for include/net/ files
  net: filter: correctly remove filter from the list during finalization
  net: ne2000: check ring buffer control registers

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-08 10:25:50 +00:00
Gabriel L. Somlo
ce9a2aa372 fw_cfg: expose control register size in fw_cfg.h
Expose the size of the control register (FW_CFG_CTL_SIZE) in fw_cfg.h.
Add comment to fw_cfg_io_realize() pointing out that since the
8-bit data register is always subsumed by the 16-bit control
register in the port I/O case, we use the control register width
as the *total* width of the (classic, non-DMA) port I/O region reserved
for the device.

Cc: Marc Marí <markmb@redhat.com>
Signed-off-by: Gabriel Somlo <somlo@cmu.edu>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Marc Marí <markmb@redhat.com>
Message-id: 1455906029-25565-2-git-send-email-somlo@cmu.edu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-03-08 10:46:30 +01:00
Frediano Ziglio
91ec41dc3f vnc: send cursor when a new client is connecting
If you have hardware cursor and you are reconnecting the VNC client
you need to send the cursor. Failing to do so make the cursor invisible
till is changed.

Signed-off-by: Frediano Ziglio <fziglio@redhat.com>
Message-id: 1456929142-14033-1-git-send-email-fziglio@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-03-08 10:45:01 +01:00
Prasad J Pandit
362786f14a net: check packet payload length
While computing IP checksum, 'net_checksum_calculate' reads
payload length from the packet. It could exceed the given 'data'
buffer size. Add a check to avoid it.

Reported-by: Liu Ling <liuling-it@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-03-08 15:34:18 +08:00
zhanghailiang
f1b2bc601a filter-buffer: Add status_changed callback processing
While the status of filter-buffer changing from 'on' to 'off',
it need to release all the buffered packets, and delete the related
timer, while switch from 'off' to 'on', it need to resume the release
packets timer.

Here, we extract the process of setup timer into a new helper,
which will be used in the new status_changed callback.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-03-08 15:34:18 +08:00
zhanghailiang
338d3f415e filter: Add 'status' property for filter object
With this property, users can control if this filter is 'on'
or 'off'. The default behavior for filter is 'on'.

For some types of filters, they may need to react to status changing,
So here, we introduced status changing callback/notifier for filter class.

We will skip the disabled ('off') filter when delivering packets in net layer.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-03-08 15:34:18 +08:00
Jiri Pirko
9fe7101f1d rocker: allow user to specify rocker world by property
Add property to specify rocker world. All ports will be assigned to this
world.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-03-08 15:34:18 +08:00
Jiri Pirko
031143c8d5 rocker: add name field into WorldOps ale let world specify its name
Also use this in world_name getter function.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-03-08 15:34:18 +08:00
Jiri Pirko
39e0c4f47d rocker: return -ENOMEM in case of some world alloc fails
Until now, 0 is returned in this error case. Fix it ro return -ENOMEM.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-03-08 15:34:18 +08:00
Jiri Pirko
0ab9cd9a4b rocker: forbid to change world type
Port to world assignment should be permitted only by qemu user. Driver
should not be able to do it, so forbid that possibility.

Signed-off-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-03-08 15:34:18 +08:00
Vincenzo Maffione
9fbad2ca36 net: netmap: probe netmap interface for virtio-net header
Previous implementation of has_ufo, has_vnet_hdr, has_vnet_hdr_len, etc.
did not really probe for virtio-net header support for the netmap
interface attached to the backend. These callbacks were correct for
VALE ports, but incorrect for hardware NICs, pipes, monitors, etc.

This patch fixes the implementation to work properly with all kinds
of netmap ports.

Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-03-08 15:34:18 +08:00
Paolo Bonzini
3a2d44f6dd net: simplify net_init_tap_one logic
net_init_tap_one receives in vhostfdname a fd name from vhostfd= or
vhostfds=, or NULL if there is no vhostfd=/vhostfds=.  It is simpler
to just check vhostfdname, than it is to check for vhostfd= or
vhostfds=.  This also calms down Coverity, which otherwise thinks
that monitor_fd_param could dereference a NULL vhostfdname.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-03-08 15:34:09 +08:00
Thomas Huth
d24b2b1ccc MAINTAINERS: Add entries for include/net/ files
The include/net/ files correspond to the files in the net/ directory,
thus there should be corresponding entries in the MAINTAINERS file.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-03-08 15:34:09 +08:00
Jason Wang
5dd2d45e34 net: filter: correctly remove filter from the list during finalization
Qemu may crash when we want to add two filters on the same netdev but
the initialization of second fails (e.g missing parameters):

./qemu-system-x86_64 -netdev user,id=un0 \
 -object filter-buffer,id=f0,netdev=un0,interval=10 \
 -object filter-buffer,id=f1,netdev=un0
Segmentation fault (core dumped)

This is because we don't check whether or not the filter was in the
list of netdev. This patch fixes this.

Cc: Yang Hongyang <hongyang.yang@easystack.cn>
Reviewed-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-03-08 15:34:09 +08:00
Prasad J Pandit
415ab35a44 net: ne2000: check ring buffer control registers
Ne2000 NIC uses ring buffer of NE2000_MEM_SIZE(49152)
bytes to process network packets. Registers PSTART & PSTOP
define ring buffer size & location. Setting these registers
to invalid values could lead to infinite loop or OOB r/w
access issues. Add check to avoid it.

Reported-by: Yang Hongke <yanghongke@huawei.com>
Tested-by: Yang Hongke <yanghongke@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-03-08 15:34:09 +08:00
Ladi Prosek
443590c204 rng: switch request queue to QSIMPLEQ
QSIMPLEQ supports appending to tail in O(1) and is intrusive so
it doesn't require extra memory allocations for the bookkeeping
data.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1457010971-24771-1-git-send-email-lprosek@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-03-08 12:54:14 +05:30
Peter Maydell
97556fe80e Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* RAMBlock vs. MemoryRegion cleanups from Fam
* mru_section optimization from Fam
* memory.txt improvements from Peter and Xiaoqiang
* i8257 fix from Hervé
* -daemonize fix
* Cleanups and small fixes from Alex, Praneith, Wei

# gpg: Signature made Mon 07 Mar 2016 17:08:59 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream:
  scsi-bus: Remove tape command from scsi_req_xfer
  kvm/irqchip: use bitmap utility for gsi tracking
  MAINTAINERS: Add entry for include/sysemu/kvm*.h
  doc/memory.txt: correct description of MemoryRegionOps fields
  doc/memory.txt: correct a logic error
  icount: possible options for sleep are on or off
  exec: Introduce AddressSpaceDispatch.mru_section
  exec: Factor out section_covers_addr
  exec: Pass RAMBlock pointer to qemu_ram_free
  memory: Drop MemoryRegion.ram_addr
  memory: Implement memory_region_get_ram_addr with mr->ram_block
  memory: Move assignment to ram_block to memory_region_init_*
  exec: Return RAMBlock pointer from allocating functions
  i8257: fix Terminal Count status
  log: do not log if QEMU is daemonized but without -D

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-08 04:53:37 +00:00
Alex Pyrgiotis
4792b7e9d5 scsi-bus: Remove tape command from scsi_req_xfer
Remove the RECOVER_BUFFERED_DATA command from the list of commands that
are handled by scsi_req_xfer(). Given that this command is
tape-specific, it should be handled only by scsi_stream_req_xfer().

Signed-off-by: Alex Pyrgiotis <apyrgio@arrikto.com>

Message-Id: <1457365822-22435-1-git-send-email-apyrgio@arrikto.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 17:56:23 +01:00
Wei Yang
8269fb7082 kvm/irqchip: use bitmap utility for gsi tracking
By using utilities in bitops and bitmap, this patch tries to make it more
friendly to audience. No functional change.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Message-Id: <1457229445-25954-1-git-send-email-richard.weiyang@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 15:18:22 +01:00
Thomas Huth
a95e9a485b MAINTAINERS: Add entry for include/sysemu/kvm*.h
The include/sysemu/kvm*.h header files should be part of
the overall KVM section.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1456403605-26587-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:26:38 +01:00
Peter Maydell
ef00bdaf8c doc/memory.txt: correct description of MemoryRegionOps fields
Probably what happened was that when the API was being designed it
started off with an 'aligned' field, and then later the field name
and semantics were changed but the docs weren't updated to match.

Similarly, cpu_register_io_memory() does not exist anymore, so
clarify the documentation for .old_mmio.

Reported-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:26:38 +01:00
xiaoqiang zhao
8210f5f6f5 doc/memory.txt: correct a logic error
In the regions overlap example, region B has a higher priority thus
should has a larger priority number than C.

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-Id: <1456476051-15121-1-git-send-email-zxq_yx_007@163.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:26:38 +01:00
Pranith Kumar
778d9f9b25 icount: possible options for sleep are on or off
icount sleep takes on or off as options. A few places mention sleep=no
which is not accepted. This patch corrects them.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <1456499811-16819-1-git-send-email-bobby.prani@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:26:38 +01:00
Fam Zheng
729633c2bc exec: Introduce AddressSpaceDispatch.mru_section
Under heavy workloads the lookup will likely end up with the same
MemoryRegionSection from last time. Using a pointer to cache the result,
like ram_list.mru_block, significantly reduces cost of
address_space_translate.

During address space topology update, as->dispatch will be reallocated
so the pointer is invalidated automatically.

Perf reports a visible drop on the cpu usage, because phys_page_find is
not called.  Before:

   2.35%  qemu-system-x86_64       [.] phys_page_find
   0.97%  qemu-system-x86_64       [.] address_space_translate_internal
   0.95%  qemu-system-x86_64       [.] address_space_translate
   0.55%  qemu-system-x86_64       [.] address_space_lookup_region

After:

   0.97%  qemu-system-x86_64       [.] address_space_translate_internal
   0.97%  qemu-system-x86_64       [.] address_space_lookup_region
   0.84%  qemu-system-x86_64       [.] address_space_translate

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-8-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:26:37 +01:00
Fam Zheng
29cb533d8c exec: Factor out section_covers_addr
This will be shared by the next patch.

Also add a comment explaining the unobvious condition on "size.hi".

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-7-git-send-email-famz@redhat.com>
[Small change to the comment. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:26:37 +01:00
Fam Zheng
f1060c55bf exec: Pass RAMBlock pointer to qemu_ram_free
The only caller now knows exactly which RAMBlock to free, so it's not
necessary to do the lookup.

Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-6-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:26:37 +01:00
Fam Zheng
8e41fb63c5 memory: Drop MemoryRegion.ram_addr
All references to mr->ram_addr are replaced by
memory_region_get_ram_addr(mr) (except for a few assertions that are
replaced with mr->ram_block).

Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-5-git-send-email-famz@redhat.com>
Acked-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:26:29 +01:00
Fam Zheng
7ebb2745ac memory: Implement memory_region_get_ram_addr with mr->ram_block
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-4-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:18:28 +01:00
Fam Zheng
0a75601853 memory: Move assignment to ram_block to memory_region_init_*
We don't force "const" qualifiers with pointers in QEMU, but it's still
good to keep a clean function interface. Assigning to mr->ram_block is
in this sense ugly - one initializer mutating its owning object's state.

Move it to memory_region_init_*, where mr->ram_addr is assigned.

Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-3-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:18:28 +01:00
Fam Zheng
528f46af6e exec: Return RAMBlock pointer from allocating functions
Previously we return RAMBlock.offset; now return the pointer to the
whole structure.

ram_block_add returns void now, error is completely passed with errp.

Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1456813104-25902-2-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:18:28 +01:00
Hervé Poussineau
bb8f32c031 i8257: fix Terminal Count status
When a DMA transfer is done (ie all bytes have been transfered), the corresponding
Terminal Count bit must be set in the status register.
This bit is already cleared in i8257_read_cont and i8257_write_cont when required.

This fixes (at least) floppy transfer in IBM 40p firmware, which checks in DMA
controller if everything went fine.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-Id: <1456404332-31556-1-git-send-email-hpoussin@reactos.org>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:18:28 +01:00
Paolo Bonzini
c586eac336 log: do not log if QEMU is daemonized but without -D
Commit 96c33a4 ("log: Redirect stderr to logfile if deamonized",
2016-02-22) wanted to move stderr of a daemonized QEMU to the file
specified with -D.

However, if -D was not passed, the patch had the side effect of not
redirecting stderr to /dev/null.  This happened because qemu_logfile
was set to stderr rather than the expected value of NULL.  The fix
is simply in the "if" condition of do_qemu_set_log; the "if" for
closing the file is also changed to match.

Reported-by: Jan Tomko <jtomko@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-07 13:18:28 +01:00
Peter Maydell
1464ad45cd Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2016-03-04' into staging
QAPI patches for 2016-03-04

# gpg: Signature made Sat 05 Mar 2016 09:47:19 GMT using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"

* remotes/armbru/tags/pull-qapi-2016-03-04:
  qapi: Drop useless 'data' member of unions
  chardev: Drop useless ChardevDummy type
  qapi: Avoid use of 'data' member of QAPI unions
  ui: Shorten references into InputEvent
  util: Shorten references into SocketAddress
  chardev: Shorten references into ChardevBackend
  qapi: Update docs to match recent generator changes
  qapi-visit: Expose visit_type_FOO_members()
  qapi: Rename 'fields' to 'members' in generated C code
  qapi: Rename 'fields' to 'members' in generator
  qapi-dealloc: Reduce use outside of generated code
  qmp-shell: fix pretty printing of JSON responses

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-06 11:53:27 +00:00
Eric Blake
48eb62a74f qapi: Drop useless 'data' member of unions
We started moving away from the use of the 'void *data' member
in the C union corresponding to a QAPI union back in commit
544a373; recent commits have gotten rid of other uses.  Now
that it is completely unused, we can remove the member itself
as well as the FIXME comment.  Update the testsuite to drop the
negative test union-clash-data.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1457021813-10704-11-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-05 10:42:06 +01:00
Eric Blake
b1918fbb1c chardev: Drop useless ChardevDummy type
Commit d0d7708b made ChardevDummy be an empty wrapper type around
ChardevCommon.  But there is no technical reason for this indirection,
so simplify the code by directly using the base type.

Also change the fallback assignment to assign u.null rather than
u.data, since a future patch will remove the data member of the C
struct generated for QAPI unions.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1457106160-23614-1-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-05 10:42:03 +01:00
Eric Blake
10f759079e qapi: Avoid use of 'data' member of QAPI unions
QAPI code generators currently create a 'void *data' member as
part of the anonymous union embedded in the C struct corresponding
to a QAPI union.  However, directly assigning to this member of
the union feels a bit fishy, when we can assign to another member
of the struct instead.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1457021813-10704-9-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-05 10:41:58 +01:00
Eric Blake
b5a1b44318 ui: Shorten references into InputEvent
An upcoming patch will alter how simple unions, like InputEvent, are
laid out, which will impact all lines of the form 'evt->u.XXX'
(expanding it to the longer 'evt->u.XXX.data').  For better
legibility in that patch, and less need for line wrapping, it's better
to use a temporary variable to reduce the effect of a layout change to
just the variable initializations, rather than every reference within
an InputEvent.

There was one instance in hid.c:hid_pointer_event() where the code
was referring to evt->u.rel inside the case label where evt->u.abs
is the correct name; thankfully, both members of the union have the
same type, so it happened to work, but it is now cleaner.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1457021813-10704-8-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-05 10:41:55 +01:00
Eric Blake
0399293e5b util: Shorten references into SocketAddress
An upcoming patch will alter how simple unions, like SocketAddress,
are laid out, which will impact all lines of the form 'addr->u.XXX'
(expanding it to the longer 'addr->u.XXX.data').  For better
legibility in that patch, and less need for line wrapping, it's better
to use a temporary variable to reduce the effect of a layout change to
just the variable initializations, rather than every reference within
a SocketAddress.  Also, take advantage of some C99 initialization where
it makes sense (simplifying g_new0() to g_new()).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1457021813-10704-7-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-05 10:41:52 +01:00
Eric Blake
f194a1ae53 chardev: Shorten references into ChardevBackend
An upcoming patch will alter how simple unions, like ChardevBackend,
are laid out, which will impact all lines of the form 'backend->u.XXX'
(expanding it to the longer 'backend->u.XXX.data').  For better
legibility in that patch, and less need for line wrapping, it's better
to use a temporary variable to reduce the effect of a layout change to
just the variable initializations, rather than every reference within
a ChardevBackend.  It doesn't hurt that this also makes the code more
consistent: some clients touched here already had a temporary variable
but weren't using it.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-By: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1457021813-10704-6-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-05 10:41:47 +01:00
Eric Blake
9ee86b8526 qapi: Update docs to match recent generator changes
Several commits have been changing the generator, but not updating
the docs to match:
- The implicit tag member is named "type", not "kind".  Screwed up in
commit 39a1815.
- Commit 9f08c8ec made list types lazy, and thereby dropped
UserDefOneList if nothing explicitly uses the list type.
- Commit 51e72bc1 switched the parameter order with 'name' occurring
earlier.
- Commit e65d89bf changed the layout of UserDefOneList.
- Prefer the term 'member' over 'field'.
- We now expose visit_type_FOO_members() for objects.
- etc.

Rework the examples to show slightly more output (we don't want to
show too much; that's what the testsuite is for), and regenerate the
output to match all recent changes.  Also, rearrange output to show
.h files before .c (understanding the interface first often makes
the implementation easier to follow).

Reported-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1457021813-10704-5-git-send-email-eblake@redhat.com>
2016-03-05 10:41:16 +01:00
Eric Blake
4d91e9115c qapi-visit: Expose visit_type_FOO_members()
Dan Berrange reported a case where he needs to work with a
QCryptoBlockOptions union type using the OptsVisitor, but only
visit one of the branches of that type (the discriminator is not
visited directly, but learned externally).  When things were
boxed, it was easy: just visit the variant directly, which took
care of both allocating the variant and visiting its members, then
store that pointer in the union type.  But now that things are
unboxed, we need a way to visit the members without allocation,
done by exposing visit_type_FOO_members() to the user.

Before the patch, we had quite a bit of code associated with
object_members_seen to make sure that a declaration of the helper
was in scope before any use of the function.  But now that the
helper is public and declared in the header, the .c file no
longer needs to worry about topological sorting (the helper is
always in scope), which leads to some nice cleanups.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1457021813-10704-4-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-05 10:41:13 +01:00
Eric Blake
c81200b014 qapi: Rename 'fields' to 'members' in generated C code
C types and JSON objects don't have fields, but members.  We
shouldn't gratuitously invent terminology.  This patch is a
strict renaming of static genarated functions, plus the naming
of the dummy filler member for empty structs, before the next
patch exposes some of that naming to the rest of the code base.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1457021813-10704-3-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-05 10:41:09 +01:00
Eric Blake
14f00c6c49 qapi: Rename 'fields' to 'members' in generator
C types and JSON objects don't have fields, but members.  We
shouldn't gratuitously invent terminology.  This patch is a
strict renaming of generator code internals (including testsuite
comments), before later patches rename C interfaces.

No change to generated code with this patch.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1457021813-10704-2-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-05 10:40:52 +01:00
Eric Blake
96a1616c85 qapi-dealloc: Reduce use outside of generated code
No need to roll our own use of the dealloc visitors when we can
just directly use the qapi_free_FOO() functions that do what we
want in one line.

In net.c, inline net_visit() into its remaining lone caller.

After this patch, test-visitor-serialization.c is the only
non-generated file that needs to use a dealloc visitor, because
it is testing low level aspects of the visitor interface.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1456262075-3311-2-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-04 17:16:32 +01:00
Daniel P. Berrange
e55250c6cb qmp-shell: fix pretty printing of JSON responses
Pretty printing of JSON responses is important to be able to understand
large responses from query commands in particular. Unfortunately this
was broken during the addition of the verbose flag in

  commit 1ceca07e48
  Author: John Snow <jsnow@redhat.com>
  Date:   Wed Apr 29 15:14:04 2015 -0400

    scripts: qmp-shell: Add verbose flag

This is because that change turned the python data structure into a
formatted JSON string before the pretty print was given it. So we're
just pretty printing a string, which is a no-op.

The original pretty printer would output python objects.

(QEMU) query-chardev
{   u'return': [   {   u'filename': u'vc',
                       u'frontend-open': False,
                       u'label': u'parallel0'},
                   {   u'filename': u'vc',
                       u'frontend-open': True,
                       u'label': u'serial0'},
                   {   u'filename': u'unix:/tmp/qemp,server',
                       u'frontend-open': True,
                       u'label': u'compat_monitor0'}]}

This fixes the problem by switching to outputting pretty formatted JSON
text instead. This has the added benefit that the pretty printed output
is now valid JSON text. Due to the way the verbose flag was handled, the
pretty printing now applies to the command sent, as well as its response:

(QEMU) query-chardev
{
    "execute": "query-chardev",
    "arguments": {}
}
{
    "return": [
        {
            "frontend-open": false,
            "label": "parallel0",
            "filename": "vc"
        },
        {
            "frontend-open": true,
            "label": "serial0",
            "filename": "vc"
        },
        {
            "frontend-open": true,
            "label": "compat_monitor0",
            "filename": "unix:/tmp/qmp,server"
        }
    ]
}

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1456224706-1591-1-git-send-email-berrange@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
[Bonus fix: multiple -p now work]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-03-04 17:16:32 +01:00
Peter Maydell
3c0f12df65 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20160304' into staging
target-arm queue:
 * Correct handling of writes to CPSR from gdbstub in user mode
 * virt: lift maximum RAM limit to 255GB
 * sdhci: implement reset
 * virt: if booting in Secure mode, provide secure-only RAM, make first
   flash device secure-only, and assume the EL3 boot rom will handle PSCI
 * bcm2835: use explicit endianness accessors rather than ldl/stl_phys
 * support big-endian in system mode for ARM
 * implement SETEND instruction
 * arm_gic: implement the GICv2 GICC_DIR register
 * fix SRS bug: only trap from S-EL1 to EL3 if specified mode is Mon

# gpg: Signature made Fri 04 Mar 2016 11:38:53 GMT using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"

* remotes/pmaydell/tags/pull-target-arm-20160304: (30 commits)
  target-arm: Only trap SRS from S-EL1 if specified mode is MON
  hw/intc/arm_gic.c: Implement GICv2 GICC_DIR
  arm: boot: Support big-endian elfs
  loader: Add data swap option to load-elf
  loader: load_elf(): Add doc comment
  loader: add API to load elf header
  target-arm: implement BE32 mode in system emulation
  target-arm: implement setend
  target-arm: introduce tbflag for endianness
  target-arm: a64: Add endianness support
  target-arm: introduce disas flag for endianness
  target-arm: pass DisasContext to gen_aa32_ld*/st*
  target-arm: implement SCTLR.EE
  linux-user: arm: handle CPSR.E correctly in strex emulation
  linux-user: arm: set CPSR.E/SCTLR.E0E correctly for BE mode
  arm: cpu: handle BE32 user-mode as BE
  target-arm: cpu: Move cpu_is_big_endian to header
  target-arm: implement SCTLR.B, drop bswap_code
  linux-user: arm: pass env to get_user_code_*
  linux-user: arm: fix coding style for some linux-user signal functions
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:46:32 +00:00
Ralf-Philipp Weinmann
ba63cf47a9 target-arm: Only trap SRS from S-EL1 if specified mode is MON
Commit cbc0326b6f caused SRS instructions executed from Secure
EL1 to trap to EL3 even if the specified mode was not monitor mode.

According to the ARMv8 Architecture reference manual [F6.1.203], ALL
of the following conditions need to be met for SRS to trap to EL3:
* It is executed at Secure PL1.
* The specified mode is monitor mode.
* EL3 is using AArch64.

Correct the condition governing the trap to EL3 to check the
specified mode.

Signed-off-by: Ralf-Philipp Weinmann <ralf+devel@comsecuris.com>
Message-id: 20160222224251.GA11654@beta.comsecuris.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweaked comment text to read 'specified mode'; edited
 commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:22 +00:00
Peter Maydell
a55c910e0b hw/intc/arm_gic.c: Implement GICv2 GICC_DIR
The GICv2 introduces a new CPU interface register GICC_DIR, which
allows an OS to split the "priority drop" and "deactivate interrupt"
parts of interrupt completion. Implement this register.
(Note that the register is at offset 0x1000 in the CPU interface,
which means it is on a different 4K page from all the other registers.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1456854176-7813-1-git-send-email-peter.maydell@linaro.org
2016-03-04 11:30:22 +00:00
Peter Crosthwaite
9776f63645 arm: boot: Support big-endian elfs
Support ARM big-endian ELF files in system-mode emulation. When loading
an elf, determine the endianness mode expected by the elf, and set the
relevant CPU state accordingly.

With this, big-endian modes are now fully supported via system-mode LE,
so there is no need to restrict the elf loading to the TARGET
endianness so the ifdeffery on TARGET_WORDS_BIGENDIAN goes away.

Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: fix typo in comments]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:21 +00:00
Peter Crosthwaite
7ef295ea5b loader: Add data swap option to load-elf
Some CPUs are of an opposite data-endianness to other components in the
system. Sometimes elfs have the data sections layed out with this CPU
data-endianness accounting for when loaded via the CPU, so byte swaps
(relative to other system components) will occur.

The leading example, is ARM's BE32 mode, which is is basically LE with
address manipulation on half-word and byte accesses to access the
hw/byte reversed address. This means that word data is invariant
across LE and BE32. This also means that instructions are still LE.
The expectation is that the elf will be loaded via the CPU in this
endianness scheme, which means the data in the elf is reversed at
compile time.

As QEMU loads via the system memory directly, rather than the CPU, we
need a mechanism to reverse elf data endianness to implement this
possibility.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:21 +00:00
Peter Crosthwaite
140b7ce5ff loader: load_elf(): Add doc comment
Document the usage of load_elf() for clarity on current features.

Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:21 +00:00
Peter Crosthwaite
04ae712a9f loader: add API to load elf header
Add an API to load an elf header header from a file. Populates a
buffer with the header contents, as well as a boolean for whether the
elf is 64b or not. Both arguments are optional.

Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: Fix typo in comment]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:21 +00:00
Paolo Bonzini
e334bd3190 target-arm: implement BE32 mode in system emulation
System emulation only has a little-endian target; BE32 mode
is implemented by adjusting the low bits of the address
for every byte and halfword load and store.  64-bit accesses
flip the low and high words.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[PC changes:
  * rebased against master (Jan 2016)
]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:21 +00:00
Paolo Bonzini
9886ecdf31 target-arm: implement setend
Since this is not a high-performance path, just use a helper to
flip the E bit and force a lookup in the hash table since the
flags have changed.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:21 +00:00
Peter Crosthwaite
91cca2cda9 target-arm: introduce tbflag for endianness
Introduce a tbflags for endianness, set based upon the CPUs current
endianness. This in turn propagates through to the disas endianness
flag.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:20 +00:00
Peter Crosthwaite
aa6489da4e target-arm: a64: Add endianness support
Set the dc->mo_endianness flag for AA64 and use it in all ldst ops.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:20 +00:00
Paolo Bonzini
dacf0a2ff7 target-arm: introduce disas flag for endianness
Introduce a disas flag for setting the CPU data endianness. This allows
control of the endianness from the CPU state rather than hard-coding it
to TARGET_WORDS_BIGENDIAN.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[ PC changes:
  * Split off as new patch from original:
        "target-arm: introduce tbflag for CPSR.E"
  * Wrote commit message from scratch
]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:20 +00:00
Paolo Bonzini
12dcc3217d target-arm: pass DisasContext to gen_aa32_ld*/st*
We'll need the DisasContext in the next patch to retrieve the
desired endianness, so pass it as a whole to gen_aa32_ld*/st*.

Unfortunately we cannot let those functions call get_mem_index,
because of user-mode load/store instructions.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[ PC changes:
 * Fix long lines
]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:20 +00:00
Peter Crosthwaite
73462dddf6 target-arm: implement SCTLR.EE
Implement SCTLR.EE bit which controls data endianess for exceptions
and page table translations. SCTLR.EE is mirrored to the CPSR.E bit
on exception entry.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:20 +00:00
Paolo Bonzini
c3ae85fc8f linux-user: arm: handle CPSR.E correctly in strex emulation
Now that CPSR.E is set correctly, prepare for when setend will be able
to change it; bswap data in and out of strex manually by comparing
SCTLR.B, CPSR.E and TARGET_WORDS_BIGENDIAN (we do not have the luxury
of using TCGMemOps).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[ PC changes:
  * Moved SCTLR/CPSR logic to arm_cpu_data_is_big_endian
]
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:19 +00:00
Peter Crosthwaite
9c5a746038 linux-user: arm: set CPSR.E/SCTLR.E0E correctly for BE mode
If doing big-endian linux-user mode, set both the CPSR.E and SCTLR.E0E
bits. This sets big-endian mode for data accesses.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:19 +00:00
Peter Crosthwaite
b2e62d9a7b arm: cpu: handle BE32 user-mode as BE
endian with address manipulations on subword accesses (to give the
illusion of BE). But user-mode cannot tell the difference and is
already implemented as straight BE. So handle the difference in the
endianess query, where USER mode is BE and system is not.

Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:19 +00:00
Peter Crosthwaite
ed50ff7875 target-arm: cpu: Move cpu_is_big_endian to header
There is a CPU data endianness test that is used to drive the
virtio_big_endian test.

Move this up to the header so it can be more generally used for endian
tests. The KVM specific cpu_syncronize_state call is left behind in the
virtio specific function.

Rename it arm_cpu-data_is_big_endian() to more accurately capture that
this is for data accesses only.

Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:19 +00:00
Paolo Bonzini
f9fd40ebe4 target-arm: implement SCTLR.B, drop bswap_code
bswap_code is a CPU property of sorts ("is the iside endianness the
opposite way round to TARGET_WORDS_BIGENDIAN?") but it is not the
actual CPU state involved here which is SCTLR.B (set for BE32
binaries, clear for BE8).

Replace bswap_code with SCTLR.B, and pass that to arm_ld*_code.
The next patches will make data fetches honor both SCTLR.B and
CPSR.E appropriately.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[PC changes:
 * rebased on master (Jan 2016)
 * s/TARGET_USER_ONLY/CONFIG_USER_ONLY
 * Use bswap_code() for disas_set_info() instead of raw sctlr_b
]
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:19 +00:00
Paolo Bonzini
49017bd8b4 linux-user: arm: pass env to get_user_code_*
This matches the idiom used by get_user_data_* later in the series,
and will help when bswap_code will be replaced by SCTLR.B.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:18 +00:00
Paolo Bonzini
a0e1e6d705 linux-user: arm: fix coding style for some linux-user signal functions
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:18 +00:00
Andrew Baumann
eab713941a bcm2835_mbox/property: replace ldl_phys/stl_phys with endian-specific accesses
PMM pointed out that ldl_phys and stl_phys are dependent on the CPU's
endianness, whereas device model code should be independent of
it. This changes the relevant Raspberry Pi devices to explicitly call
the little-endian variants.

Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1456880233-22568-1-git-send-email-Andrew.Baumann@microsoft.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-04 11:30:18 +00:00
Peter Maydell
4824a61a6d hw/arm/virt: Assume EL3 boot rom will handle PSCI if one is provided
If the user passes us an EL3 boot rom, then it is going to want to
implement the PSCI interface itself. In this case, disable QEMU's
internal PSCI implementation so it does not get in the way, and
instead start all CPUs in an SMP configuration at once (the boot
rom will catch them all and pen up the secondaries until needed).
The boot rom code is also responsible for editing the device tree
to include any necessary information about its own PSCI implementation
before eventually passing it to a NonSecure guest.

(This "start all CPUs at once" approach is what both ARM Trusted
Firmware and UEFI expect, since it is what the ARM Foundation Model
does; the other approach would be to provide some emulated hardware
for "start the secondaries" but this is simplest.)

This is a compatibility break, but I don't believe that anybody
was using a secure boot ROM with an SMP configuration. Such a setup
would be somewhat broken since there was nothing preventing nonsecure
guest code from calling the QEMU PSCI function to start up a secondary
core in a way that completely bypassed the secure world.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 1456853976-7592-1-git-send-email-peter.maydell@linaro.org
2016-03-04 11:30:18 +00:00
Peter Maydell
738a5d9fbb hw/arm/virt: Make first flash device Secure-only if booting secure
If the virt board is started with the 'secure' property set to
request a Secure setup, then make the first flash device be
visible only to the Secure world.

This is a breaking change, but I don't expect it to be noticed
by anybody, because running TZ-aware guests isn't common and
those guests are generally going to be booting from the flash
and implicitly expecting their Non-secure guests to not touch it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1455288361-30117-5-git-send-email-peter.maydell@linaro.org
2016-03-04 11:30:18 +00:00
Peter Maydell
16f4a8dc5c hw/arm/virt: Load bios image to MemoryRegion, not physaddr
If we're loading a BIOS image into the first flash device,
load it into the flash's memory region specifically, not
into the physical address where the flash resides. This will
make a difference when the flash might be in the Secure
address space rather than the Nonsecure one.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1455288361-30117-4-git-send-email-peter.maydell@linaro.org
2016-03-04 11:30:17 +00:00
Peter Maydell
76151cacfe loader: Add load_image_mr() to load ROM image to a MemoryRegion
Add a new function load_image_mr(), which behaves like
load_image_targphys() except that it loads the ROM image to
a specified MemoryRegion rather than to a specified physical
address. This is useful when a ROM blob needs to be loaded
to a particular flash or ROM device but the address of that
device in the machine's address space is not known. (For
instance, ROMs in devices, or ROMs which might exist in
a different address space to the system address space.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1455288361-30117-3-git-send-email-peter.maydell@linaro.org
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
2016-03-04 11:30:17 +00:00
Peter Maydell
83ec1923cd hw/arm/virt: Provide a secure-only RAM if booting in Secure mode
If we're booting in Secure mode, provide a secure-only RAM
(just 16MB) so that secure firmware has somewhere to run
from that won't be accessible to the Non-secure guest.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1455288361-30117-2-git-send-email-peter.maydell@linaro.org
2016-03-04 11:30:17 +00:00
Peter Maydell
8b41c30525 sdhci: Implement DeviceClass reset
The sdhci device was missing a DeviceClass reset method;
implement it. Poweron reset looks the same as reset commanded
by the guest via the device registers, apart from modelling of
the rpi 'pending insert interrupt on powerup' quirk.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1456493044-10025-3-git-send-email-peter.maydell@linaro.org
2016-03-04 11:30:17 +00:00
Peter Maydell
0719e71e52 sd.c: Handle NULL block backend in sd_get_inserted()
The sd.c SD card emulation code can be in a state where the
SDState BlockBackend pointer is NULL; this is treated as
"card not present". Add a missing check to sd_get_inserted()
so that we don't segfault in this situation.

(This could be provoked by the guest writing to the SDHCI
register to do a reset on a xilinx-zynq-a9 board; it will
also happen at startup when sdhci implements its DeviceClass
reset method.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1456493044-10025-2-git-send-email-peter.maydell@linaro.org
2016-03-04 11:30:17 +00:00
Peter Maydell
71c2768433 virt: Lift the maximum RAM limit from 30GB to 255GB
The virt board restricts guests to only 30GB of RAM. This is a
hangover from the vexpress-a15 board, and there's no inherent reason
for it. 30GB is smaller than you might reasonably want to provision
a VM for on a beefy server machine. Raise the limit to 255GB.

We choose 255GB because the available space we currently have
below the 1TB boundary is up to the 512GB mark, but we don't
want to paint ourselves into a corner by assigning it all to
RAM. So we make half of it available for RAM, with the 256GB..512GB
range available for future non-RAM expansion purposes.

If we need to provide more RAM to VMs in the future then we need to:
 * allocate a second bank of RAM starting at 2TB and working up
 * fix the DT and ACPI table generation code in QEMU to correctly
   report two split lumps of RAM to the guest
 * fix KVM in the host kernel to allow guests with >40 bit address spaces

The last of these is obviously the trickiest, but it seems
reasonable to assume that anybody configuring a VM with a quarter
of a terabyte of RAM will be doing it on a host with more than a
terabyte of physical address space.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Christoffer Dall <christoffer.dall@linaro.org>
Tested-by: Wei Huang <wei@redhat.com>
Message-id: 1456402182-11651-1-git-send-email-peter.maydell@linaro.org
2016-03-04 11:30:16 +00:00
Peter Maydell
8c4f0eb94c target-arm: Correct handling of writes to CPSR mode bits from gdb in usermode
In helper.c the expression
  (env->uncached_cpsr & CPSR_M) != CPSR_USER
is always true; the right hand side was supposed to be ARM_CPU_MODE_USR
(an error in commit cb01d391).

Since the incorrect expression was always true, this just meant that
commit cb01d391 had no effect.

However simply changing the RHS here would reveal a logic error: if
the mode is USR we wish to completely ignore the attempt to set the
mode bits, which means that we must clear the CPSR_M bits from mask
to avoid the uncached_cpsr bits being updated at the end of the
function.

Move the condition into the correct place in the code, fix its RHS
constant, and add a comment about the fact that we must be doing a
gdbstub write if we're in user mode.

Fixes: https://bugs.launchpad.net/qemu/+bug/1550503
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1456764438-30015-1-git-send-email-peter.maydell@linaro.org
2016-03-04 11:30:16 +00:00
Peter Maydell
2d3b7c0164 Merge remote-tracking branch 'remotes/amit-virtio-rng/tags/rng-for-2.6-1' into staging
rng:
- implement a request queue for rng-random so multiple guest requests
  don't result in vq buffers getting forgotten
- remove unused request cancellation code
- a VM with multiple vq buffers, when migrated, could get in a situation
  where not all buffers are handed back to the guest.  This is now
  fixed.

# gpg: Signature made Thu 03 Mar 2016 12:18:54 GMT using RSA key ID 854083B6
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg:                 aka "Amit Shah <amit@kernel.org>"
# gpg:                 aka "Amit Shah <amitshah@gmx.net>"

* remotes/amit-virtio-rng/tags/rng-for-2.6-1:
  virtio-rng: ask for more data if queue is not fully drained
  rng: add request queue support to rng-random
  rng: move request queue cleanup from RngEgd to RngBackend
  rng: move request queue from RngEgd to RngBackend
  rng: remove the unused request cancellation code
  MAINTAINERS: Add an entry for the include/sysemu/rng*.h files

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-03 13:13:36 +00:00
Ladi Prosek
f8693c2cd0 virtio-rng: ask for more data if queue is not fully drained
This commit effectively reverts:

  commit 4621c1768e
  Author: Amit Shah <amit.shah@redhat.com>
  Date:   Wed Nov 21 11:21:19 2012 +0530

  virtio-rng: remove extra request for entropy

but instead of calling virtio_rng_process unconditionally, it
first checks to see if the queue is empty as a little bit of
optimization.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1456998514-19271-1-git-send-email-lprosek@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-03-03 17:42:26 +05:30
Ladi Prosek
60253ed1e6 rng: add request queue support to rng-random
Requests are now created in the RngBackend parent class and the
code path is shared by both rng-egd and rng-random.

This commit fixes the rng-random implementation which processed
only one request at a time and simply discarded all but the most
recent one. In the guest this manifested as delayed completion
of reads from virtio-rng, i.e. a read was completed only after
another read was issued.

By switching rng-random to use the same request queue as rng-egd,
the unsafe stack-based allocation of the entropy buffer is
eliminated and replaced with g_malloc.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1456994238-9585-5-git-send-email-lprosek@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-03-03 17:42:26 +05:30
Ladi Prosek
9f14b0add1 rng: move request queue cleanup from RngEgd to RngBackend
RngBackend is now in charge of cleaning up the linked list on
instance finalization. It also exposes a function to finalize
individual RngRequest instances, called by its child classes.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1456994238-9585-4-git-send-email-lprosek@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-03-03 17:42:26 +05:30
Ladi Prosek
74074e8a7c rng: move request queue from RngEgd to RngBackend
The 'requests' field now lives in the RngBackend parent class.
There are no functional changes in this commit.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1456994238-9585-3-git-send-email-lprosek@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-03-03 17:42:26 +05:30
Ladi Prosek
3c52ddcdc5 rng: remove the unused request cancellation code
rng_backend_cancel_requests had no callers and none of the code
deleted in this commit ever ran.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1456994238-9585-2-git-send-email-lprosek@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-03-03 17:42:26 +05:30
Thomas Huth
750cf86932 MAINTAINERS: Add an entry for the include/sysemu/rng*.h files
These headers are used by the virtio-rng and rng backends code,
so they should be listed in the same section in MAINTAINERS, too.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1456404260-26928-1-git-send-email-thuth@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-03-03 17:42:23 +05:30
Peter Maydell
ed6128ebbd Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Tue 01 Mar 2016 15:48:04 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/tracing-pull-request:
  trace: Add a proper API to manage auto-generated events from the 'tcg' property
  trace: Add 'vcpu' event property to trace guest vCPU
  typedefs: Add CPUState
  trace: Add helper function to cast event arguments
  tcg: Move definition of type TCGv
  tcg: Add type for vCPU pointers
  trace: Remove unnecessary intermediate event copies
  trace: Extend API to manage event arguments
  vl: fix tracing initialization
  trace: use addresses instead of offsets in memory tracepoints
  trace: split subpage MMIOs into their own trace events.
  trace: docs: "simple" backend does support strings
  trace: drop trailing empty strings

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-01 15:54:03 +00:00
Lluís Vilanova
4ade0541de trace: Add a proper API to manage auto-generated events from the 'tcg' property
Formalizes the existence of the 'event_trans' and 'event_exec' event
attributes, which until now were monkey-patched only when necessary.

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-id: 145640558759.20978.6374959404425591089.stgit@localhost
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-03-01 13:34:38 +00:00
Lluís Vilanova
3d211d9f4d trace: Add 'vcpu' event property to trace guest vCPU
This property identifies events that trace vCPU-specific information.

It adds a "CPUState*" argument to events with the property, identifying
the vCPU raising the event. TCG translation events also have a
"TCGv_env" implicit argument that is later used as the "CPUState*"
argument at execution time.

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-id: 145641861797.30295.6991314023181842105.stgit@localhost
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-03-01 13:27:10 +00:00
Lluís Vilanova
b23197f9cf typedefs: Add CPUState
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-id: 145641861239.30295.8564457138934628740.stgit@localhost
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-03-01 13:27:09 +00:00
Lluís Vilanova
bc9beb47c7 trace: Add helper function to cast event arguments
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-id: 145641860680.30295.1873612736245870753.stgit@localhost
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-03-01 13:27:09 +00:00
Lluís Vilanova
5d4e1a1081 tcg: Move definition of type TCGv
The target-dependant type TCGv must be defined in "tcg/tcg.h" before
including the tracing helper wrappers in "tcg/tcg-op.h".

It also makes more sense to define it here, where other TCG types are
defined too.

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-id: 145641860129.30295.17554707227384022653.stgit@localhost
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-03-01 13:27:09 +00:00
Lluís Vilanova
1bcea73e13 tcg: Add type for vCPU pointers
Adds the 'TCGv_env' type for pointers to 'CPUArchState' objects. The
tracing infrastructure later needs to differentiate between regular
pointers and pointers to vCPUs.

Also changes all targets to use the new 'TCGv_env' type instead of the
generic 'TCGv_ptr'. As of now, the change is merely cosmetic ('TCGv_env'
translates into 'TCGv_ptr'), but that could change in the future to
enforce the difference.

Note that a 'TCGv_env' type (for 'CPUState') is not added, since all
helpers currently receive the architecture-specific
pointer ('CPUArchState').

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Acked-by: Richard Henderson <rth@twiddle.net>
Message-id: 145641859552.30295.7821536833590725201.stgit@localhost
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-03-01 13:27:09 +00:00
Lluís Vilanova
56797b1fbc trace: Remove unnecessary intermediate event copies
The current code forces the use of a chain of ".original" dereferences,
which looks odd.

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-id: 145641858988.30295.7223459456488075843.stgit@localhost
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-03-01 13:27:09 +00:00
Lluís Vilanova
3596f524d4 trace: Extend API to manage event arguments
Lets the user manage event arguments as a list, and simplifies argument
concatenation.

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 145641858432.30295.3069911069472672646.stgit@localhost
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-03-01 13:27:09 +00:00
Denis V. Lunev
62cb4145bb vl: fix tracing initialization
we should call trace_init_backends() before trace_init_file() for
CONFIG_TRACE_SIMPLE There is no difference for other cases.

This problem was introduced by the commit
    commit 41fc57e44e
    Author: Paolo Bonzini <pbonzini@redhat.com>
    Date:   Thu Jan 7 16:55:24 2016 +0300

    trace: split trace_init_file out of trace_init_backends

'make check' was failed as a result if configured with
  --enable-trace-backends=simple

Spotted by Alex Bennée.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1455036545-14870-1-git-send-email-den@openvz.org
CC: Alex Bennée <alex.bennee@linaro.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-03-01 13:20:15 +00:00
Hollis Blanchard
4779dc1d19 trace: use addresses instead of offsets in memory tracepoints
When memory_region_ops tracepoints are enabled, calculate and record the
absolute address being accessed. Otherwise, we only get offsets into the
memory region instead of addresses.

[Fixed "offset" -> "addr" in trace event format strings.
--Stefan]

Signed-off-by: Hollis Blanchard <hollis_blanchard@mentor.com>
Message-id: 1454976185-30095-3-git-send-email-hollis_blanchard@mentor.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-03-01 13:20:15 +00:00
Hollis Blanchard
23d92d68e7 trace: split subpage MMIOs into their own trace events.
Previously, a single MMIO could trigger the memory_region_ops tracepoint twice:
once on its way into subpage ops, then later on its way into the model's ops.

Also, the fields previously called "addr" are actually offsets into the memory
region. Rename them to "offset" while we're editing the tracepoint definitions.

Signed-off-by: Hollis Blanchard <hollis_blanchard@mentor.com>
Message-id: 1454976185-30095-2-git-send-email-hollis_blanchard@mentor.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-03-01 13:20:15 +00:00
Hollis Blanchard
2c140f5f2c trace: docs: "simple" backend does support strings
The simple tracing backend has supported strings for more than three years
(62bab73213).

Signed-off-by: Hollis Blanchard <hollis_blanchard@mentor.com>
Message-id: 1454976185-30095-1-git-send-email-hollis_blanchard@mentor.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-03-01 13:20:15 +00:00
Greg Kurz
6411dd1334 trace: drop trailing empty strings
Also fix a typo in the virtio_balloon_handle_output() trace while here.

[The double-quoting was a limitation of the old tracetool.sh script.
The modern tracetool.py script does not require double-quotes at the end
of the line.  See commit cf85cf8e97
("trace: Format strings must begin/end with double quotes").
--Stefan]

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 20160111173036.24764.59878.stgit@bahia.huguette.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-03-01 13:20:15 +00:00
Peter Maydell
9c279bec75 Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20160301' into staging
Assorted fixes, cleanups and enhancements.

# gpg: Signature made Tue 01 Mar 2016 11:45:12 GMT using RSA key ID C6F02FAF
# gpg: Good signature from "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"

* remotes/cohuck/tags/s390x-20160301:
  s390x/css: only suspend when enabled by orb
  MAINTAINERS: Remove entry for hw/s390x/s390-virtio-bus.[ch]
  MAINTAINERS: Remove the old s390-virtio machine
  s390x/pci: use PCI_MSIX_FLAGS on retrieving the MSIX entries
  s390x/css: Use static initialization for channel_subsys fields
  s390x/css: Allocate channel_subsys statically
  s390x/pci: fix reg/dereg irq functions
  s390x/css: introduce indicator refcounting interfaces
  s390x/virtio: old machine leftovers
  watchdog/diag288: avoid race condition on expired watchdog
  s390x: remove {kvm_}s390_virtio_irq()
  s390x: fix debug statement in trigger_page_fault()
  s390x/kvm: sync fprs via kvm_run
  linux-headers: update against kvm/next

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-01 13:09:55 +00:00
Peter Maydell
646fd16865 Merge remote-tracking branch 'remotes/kraxel/tags/pull-seabios-20160301-1' into staging
seabios: update to 1.9.1 stable release

# gpg: Signature made Tue 01 Mar 2016 08:39:53 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-seabios-20160301-1:
  seabios: update to 1.9.1 stable release

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-01 12:18:23 +00:00
Cornelia Huck
ce350f32e4 s390x/css: only suspend when enabled by orb
We must not allow a channel program to suspend if the suspend
control bit in the orb had not been specified.

Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-01 12:15:29 +01:00
Thomas Huth
d90527178c MAINTAINERS: Remove entry for hw/s390x/s390-virtio-bus.[ch]
The files have been deleted recently, no need to keep these entries
anymore.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1456397100-22746-1-git-send-email-thuth@redhat.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-01 12:15:29 +01:00
Thomas Huth
6aaa681c9b MAINTAINERS: Remove the old s390-virtio machine
The old s390-virtio machine has been removed last year, so we don't
need the corresponding section in the MAINTAINERS file anymore.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1456394274-21082-1-git-send-email-thuth@redhat.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-01 12:15:29 +01:00
Wei Yang
ce1307e180 s390x/pci: use PCI_MSIX_FLAGS on retrieving the MSIX entries
Even PCI_CAP_FLAGS has the same value as PCI_MSIX_FLAGS, the later one is
the more proper on retrieving MSIX entries.

This patch uses PCI_MSIX_FLAGS to retrieve the MSIX entries.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
CC: Cornelia Huck <cornelia.huck@de.ibm.com>
CC: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <1455895091-7589-3-git-send-email-richard.weiyang@gmail.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-01 12:15:29 +01:00
Eduardo Habkost
bc994b74ea s390x/css: Use static initialization for channel_subsys fields
machine_init() will be gone, but we don't need it if we just
initialize the channel_subsys fields statically.

Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1455656347-29033-4-git-send-email-ehabkost@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
[adapted on top of indicator changes]
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-01 12:15:29 +01:00
Eduardo Habkost
562f5e0b97 s390x/css: Allocate channel_subsys statically
There's no need to use g_malloc0() to allocate the channel_subsys
struct, just use a static variable.

Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Alexander Graf <agraf@suse.de>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1455656347-29033-3-git-send-email-ehabkost@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
[adapted on top of indicator changes]
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-01 12:15:29 +01:00
Yi Min Zhao
8581c115d2 s390x/pci: fix reg/dereg irq functions
Indicator refcounting interfaces are introduced. This patch fixes
introducing unneeded indicator mappings and failure to release
AISB mappings on deregistration.

Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-01 12:15:29 +01:00
Yi Min Zhao
a28d8391e3 s390x/css: introduce indicator refcounting interfaces
Currently, virtio-ccw uses its own interfaces to keep indicators mapped
just once even if the same address has been registered multiple times.
These interfaces fit the PCI use case as well. Therefore, move them to
css and make them generic interfaces.

Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-01 12:15:28 +01:00
Cornelia Huck
99abd0d6f7 s390x/virtio: old machine leftovers
Remove some now unused #defines.

Reviewed-By: Sascha Silbe <silbe@linux.vnet.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-01 12:15:28 +01:00
Sascha Silbe
fe345a3d5d watchdog/diag288: avoid race condition on expired watchdog
When configured to inject an NMI, watchdog_perform_action() may cause
the BQL to be temporarily relinquished (inject_nmi() → ... →
s390_nmi() → s390_cpu_restart() → run_on_cpu()). When the guest issues
diag 288 again in response to the NMI, the diag 288 operation will
race against wdt_diag288_reset(). Depending on scheduler behaviour,
wdt_diag288_reset() may be run after the guest issued a diag 288
Init. As a result, we will cancel the timer the guest just set up. The
effect observed by the guest is that a second expiry does not trigger
the watchdog action and diag 288 Change operations fail.

Fix this by resetting the timer _before_ invoking the action.

Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Acked-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-01 12:15:28 +01:00
Cornelia Huck
8777f6abdb s390x: remove {kvm_}s390_virtio_irq()
This interface was only used by the old virtio machine and therefore
is not needed anymore.

Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-01 12:15:28 +01:00
David Hildenbrand
c5b2ee4c7a s390x: fix debug statement in trigger_page_fault()
When mmu_translate debugging output is enabled, code won't compile.
Let's just use the same statement as in trigger_prot_fault().

Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-01 12:15:28 +01:00
David Hildenbrand
5ab0e547bf s390x/kvm: sync fprs via kvm_run
We can now also sync the fprs via kvm_run, avoiding one ioctl.

Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-01 12:15:28 +01:00
Cornelia Huck
66fb2d5467 linux-headers: update against kvm/next
Update against commit efef127c, but keep userfaultd.h.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-03-01 12:15:28 +01:00
Peter Maydell
0b85d73583 Merge remote-tracking branch 'remotes/kraxel/tags/pull-input-20160301-1' into staging
qapi: fix input-send-event and promote to stable

# gpg: Signature made Tue 01 Mar 2016 08:19:52 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-input-20160301-1:
  qapi: promote input-send-event to stable
  qapi: rename InputAxis values.
  qapi: rename input buttons
  qapi: switch x-input-send-event from console to device+head
  console: add & use qemu_console_lookup_by_device_name

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-01 11:15:00 +00:00
Peter Maydell
d9c7737e57 Merge remote-tracking branch 'remotes/kraxel/tags/pull-vga-20160301-1' into staging
vga: minor cirrus/qxl bugfixes.

# gpg: Signature made Tue 01 Mar 2016 07:16:22 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-vga-20160301-1:
  qxl: lock current_async update in qxl_soft_reset
  cirrus_vga: fix off-by-one in blit_region_is_unsafe

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-01 10:34:19 +00:00
Peter Maydell
9c74a85304 Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging
# gpg: Signature made Mon 29 Feb 2016 20:08:16 GMT using RSA key ID C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"

* remotes/cody/tags/block-pull-request:
  iotests/124: Add cluster_size mismatch test
  block/backup: avoid copying less than full target clusters
  block/backup: make backup cluster size configurable
  mirror: Add mirror_wait_for_io
  mirror: Rewrite mirror_iteration
  vhdx: Simplify vhdx_set_shift_bits()
  vhdx: DIV_ROUND_UP() in vhdx_calc_bat_entries()
  iscsi: add support for getting CHAP password via QCryptoSecret API
  curl: add support for HTTP authentication parameters
  rbd: add support for getting password from QCryptoSecret object
  sheepdog: allow to delete snapshot
  block/nfs: add support for setting debug level

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-03-01 09:54:53 +00:00
Gerd Hoffmann
fee5b753ff seabios: update to 1.9.1 stable release
git shortlog rel-1.9.0..rel-1.9.1
=================================

Cole Robinson (1):
      biostables: Support SMBIOS 2.6+ UUID format

Kevin O'Connor (7):
      xhci: Check for device disconnects during USB2 reset polling
      xhci: Wait for port enable even for USB3 devices
      sdcard: Only enable error_irq_enable for bits defined in SDHCI v1 spec
      sdcard: fix typo causing 32bit write to 16bit block_size field
      nmi: Don't try to switch onto extra stack in NMI handler
      scsi: Do not call printf() from scsi_is_ready()
      coreboot: Check for unaligned cbfs header

Marcel Apfelbaum (1):
      fw/pci: do not automatically allocate IO region for PCIe bridges

Roger Pau Monne (1):
      build: fix typo in buildversion.py

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-03-01 09:37:07 +01:00
Gerd Hoffmann
6575ccddf4 qapi: promote input-send-event to stable
With all fixups being in place now, we can promote input-send-event
to stable abi by removing the x- prefix.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-03-01 08:20:27 +01:00
Gerd Hoffmann
01df51432e qapi: rename InputAxis values.
Lowercase them.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-03-01 08:19:45 +01:00
Gerd Hoffmann
f22d0af076 qapi: rename input buttons
All lowercase, use-dash instead of CamelCase.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-03-01 08:19:07 +01:00
Gerd Hoffmann
b98d26e333 qapi: switch x-input-send-event from console to device+head
Use display device qdev id and head number instead of console index to
specify the QemuConsole.  This makes things consistent with input
devices (for input routing) and vnc server configuration, which both use
display and head too.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-03-01 07:51:34 +01:00
Gerd Hoffmann
f2c1d54c18 console: add & use qemu_console_lookup_by_device_name
We have two places needing this, and a third one will come shortly.
So factor things out into a helper function to reduce code duplication.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-03-01 07:51:34 +01:00
Gerd Hoffmann
05fa1c742f qxl: lock current_async update in qxl_soft_reset
This should fix a defect report from Coverity.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2016-03-01 07:51:32 +01:00
Paolo Bonzini
d2ba7ecb34 cirrus_vga: fix off-by-one in blit_region_is_unsafe
The "max" value is being compared with >=, but addr + width points to
the first byte that will _not_ be copied.  Laszlo suggested using a
"greater than" comparison, instead of subtracting one like it is
already done above for the height, so that max remains always positive.

The mistake is "safe"---it will reject some blits, but will never cause
out-of-bounds writes.

Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 1455121059-18280-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-03-01 07:51:32 +01:00
John Snow
cc199b16cf iotests/124: Add cluster_size mismatch test
If a backing file isn't specified in the target image and the
cluster_size is larger than the bitmap granularity, we run the risk of
creating bitmaps with allocated clusters but empty/no data which will
prevent the proper reading of the backup in the future.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1456433911-24718-4-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-02-29 14:55:14 -05:00
John Snow
4c9bca7e39 block/backup: avoid copying less than full target clusters
During incremental backups, if the target has a cluster size that is
larger than the backup cluster size and we are backing up to a target
that cannot (for whichever reason) pull clusters up from a backing image,
we may inadvertantly create unusable incremental backup images.

For example:

If the bitmap tracks changes at a 64KB granularity and we transmit 64KB
of data at a time but the target uses a 128KB cluster size, it is
possible that only half of a target cluster will be recognized as dirty
by the backup block job. When the cluster is allocated on the target
image but only half populated with data, we lose the ability to
distinguish between zero padding and uninitialized data.

This does not happen if the target image has a backing file that points
to the last known good backup.

Even if we have a backing file, though, it's likely going to be faster
to just buffer the redundant data ourselves from the live image than
fetching it from the backing file, so let's just always round up to the
target granularity.

The same logic applies to backup modes top, none, and full. Copying
fractional clusters without the guarantee of COW is dangerous, but even
if we can rely on COW, it's likely better to just re-copy the data.

Reported-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1456433911-24718-3-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-02-29 14:55:14 -05:00
John Snow
16096a4d47 block/backup: make backup cluster size configurable
64K might not always be appropriate, make this a runtime value.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1456433911-24718-2-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-02-29 14:55:14 -05:00
Fam Zheng
21cd917ff5 mirror: Add mirror_wait_for_io
The three lines are duplicated a number of times now, refactor a
function.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1454637630-10585-3-git-send-email-famz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-02-29 14:54:31 -05:00
Fam Zheng
e5b43573e2 mirror: Rewrite mirror_iteration
The "pnum < nb_sectors" condition in deciding whether to actually copy
data is unnecessarily strict, and the qiov initialization is
unnecessarily for bdrv_aio_write_zeroes and bdrv_aio_discard.

Rewrite mirror_iteration to fix both flaws.

The output of iotests 109 is updated because we now report the offset
and len slightly differently in mirroring progress.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1454637630-10585-2-git-send-email-famz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-02-29 14:54:31 -05:00
Max Reitz
04a3615860 vhdx: Simplify vhdx_set_shift_bits()
For values which are powers of two (and we do assume all of these to
be), sizeof(x) * 8 - 1 - clz(x) == ctz(x). Therefore, use ctz().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 1450451066-13335-3-git-send-email-mreitz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-02-29 14:54:31 -05:00
Max Reitz
939901dcd2 vhdx: DIV_ROUND_UP() in vhdx_calc_bat_entries()
We have DIV_ROUND_UP(), so we can use it to produce more easily readable
code. It may be slower than the bit shifting currently performed
(because it actually performs a division), but since
vhdx_calc_bat_entries() is never used in a hot path, this is completely
fine.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 1450451066-13335-2-git-send-email-mreitz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-02-29 14:54:31 -05:00
Daniel P. Berrange
b189346eb1 iscsi: add support for getting CHAP password via QCryptoSecret API
The iSCSI driver currently accepts the CHAP password in plain text
as a block driver property. This change adds a new "password-secret"
property that accepts the ID of a QCryptoSecret instance.

  $QEMU \
     -object secret,id=sec0,filename=/home/berrange/example.pw \
     -drive driver=iscsi,url=iscsi://example.com/target-foo/lun1,\
            user=dan,password-secret=sec0

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1453385961-10718-4-git-send-email-berrange@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-02-29 14:54:31 -05:00
Daniel P. Berrange
1bff960642 curl: add support for HTTP authentication parameters
If connecting to a web server which has authentication
turned on, QEMU gets a 401 as curl has not been configured
with any authentication credentials.

This adds 4 new parameters to the curl block driver
options 'username', 'password-secret', 'proxy-username'
and 'proxy-password-secret'. Passwords are provided using
the recently added 'secret' object type

 $QEMU \
     -object secret,id=sec0,filename=/home/berrange/example.pw \
     -object secret,id=sec1,filename=/home/berrange/proxy.pw \
     -drive driver=http,url=http://example.com/some.img,\
            username=dan,password-secret=sec0,\
            proxy-username=dan,proxy-password-secret=sec1

Of course it is possible to use the same secret for both the
proxy & server passwords if desired, or omit the proxy auth
details, or the server auth details as required.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1453385961-10718-3-git-send-email-berrange@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-02-29 14:54:31 -05:00
Daniel P. Berrange
60390a2192 rbd: add support for getting password from QCryptoSecret object
Currently RBD passwords must be provided on the command line
via

  $QEMU -drive file=rbd:pool/image:id=myname:\
               key=QVFDVm41aE82SHpGQWhBQXEwTkN2OGp0SmNJY0UrSE9CbE1RMUE=:\
               auth_supported=cephx

This is insecure because the key is visible in the OS process
listing.

This adds support for an 'password-secret' parameter in the RBD
parameters that can be used with the QCryptoSecret object to
provide the password via a file:

  echo "QVFDVm41aE82SHpGQWhBQXEwTkN2OGp0SmNJY0UrSE9CbE1RMUE=" > poolkey.b64
  $QEMU -object secret,id=secret0,file=poolkey.b64,format=base64 \
        -drive driver=rbd,filename=rbd:pool/image:id=myname:\
               auth_supported=cephx,password-secret=secret0

Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1453385961-10718-2-git-send-email-berrange@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-02-29 14:54:30 -05:00
Vasiliy Tolstov
eab8eb8db3 sheepdog: allow to delete snapshot
This patch implements a blockdriver function bdrv_snapshot_delete() in
the sheepdog driver. With the new function, snapshots of sheepdog can
be deleted from libvirt.

Cc: Jeff Cody <jcody@redhat.com>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Signed-off-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
Message-id: 1450873346-22334-1-git-send-email-mitake.hitoshi@lab.ntt.co.jp
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-02-29 14:54:30 -05:00
Peter Lieven
7725b8bf12 block/nfs: add support for setting debug level
recent libnfs versions support logging debug messages. Add
support for it in qemu through an URL parameter.

Example:
 qemu -cdrom nfs://127.0.0.1/iso/my.iso?debug=2

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1447052973-14513-1-git-send-email-pl@kamp.de
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-02-29 14:54:30 -05:00
Peter Maydell
071608b519 Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20160229-1' into staging
usb: redirect bugfix, MAINTAINERS update.

# gpg: Signature made Mon 29 Feb 2016 11:09:54 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-usb-20160229-1:
  usb-redirect: Avoid double free of data
  MAINTAINERS: Add some missing entries for USB related files

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-29 12:24:26 +00:00
Peter Maydell
1da90c34c9 Merge remote-tracking branch 'remotes/kraxel/tags/pull-ui-20160229-1' into staging
ui: spice dmabuf fix, MAINTAINERS updates.

# gpg: Signature made Mon 29 Feb 2016 10:41:15 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-ui-20160229-1:
  MAINTAINERS: Add an entry for the include/ui/ folder
  MAINTAINERS: Add spice-display.h to the SPICE section
  spice/gl: Enable dmabuf only for spice >= 0.13.1

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-29 11:49:50 +00:00
Peter Maydell
3ff430aa91 Merge remote-tracking branch 'remotes/kraxel/tags/pull-fw-cfg-20160226-1' into staging
fw_cfg: unbreak migration compatibility for 2.4 and earlier machines

# gpg: Signature made Fri 26 Feb 2016 09:45:50 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-fw-cfg-20160226-1:
  fw_cfg: unbreak migration compatibility for 2.4 and earlier machines

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-29 11:24:36 +00:00
Peter Maydell
35227e6a09 Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.6-20160229' into staging
ppc patch queue for 2016-02-29

Some more accumulated patches for target-ppc, pseries machine type and
related devices to fit in before the qemu-2.6 soft freeze.
    * Mostly bugfixes and small cleanups for spapr and Mac platforms

# gpg: Signature made Mon 29 Feb 2016 06:56:34 GMT using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.6-20160229:
  xics: report errors with the QEMU Error API
  migration: allow machine to enforce configuration section migration
  spapr: skip configuration section during migration of older machines
  dbdma: warn when using unassigned channel
  spapr: disable vmdesc submission for old machines
  spapr_pci: fix irq leak in RTAS ibm,change-msi
  spapr_pci: kill useless variable in rtas_ibm_change_msi()
  spapr_rng: disable hotpluggability

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-29 10:51:11 +00:00
Fam Zheng
e8ce12d9ea usb-redirect: Avoid double free of data
If dropping packets, data is freed, the caller's loop should not continue.

Reported by ccc-analyzer.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1456301288-1592-1-git-send-email-famz@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-29 11:45:26 +01:00
Thomas Huth
beded0ff7f MAINTAINERS: Add some missing entries for USB related files
USB-related docs and include files should go into the USB
section of the MAINTAINERS file.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1456392967-20274-2-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-29 11:45:26 +01:00
Thomas Huth
e220656ce1 MAINTAINERS: Add an entry for the include/ui/ folder
The ui/ folder is listed in the "Graphics" section, so I think
the "include/ui/" folder should be listed there, too.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1456392967-20274-4-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-29 10:54:32 +01:00
Thomas Huth
438528a3e7 MAINTAINERS: Add spice-display.h to the SPICE section
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1456392967-20274-3-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-29 10:54:32 +01:00
Michal Privoznik
9f5c6d06ad spice/gl: Enable dmabuf only for spice >= 0.13.1
After 474114b7 the dmabuf feature is enabled whenever spice
greater than or equal to spice 0.13.0 is found. This is because
two new functions are required: spice_qxl_gl_scanout and
spice_qxl_gl_draw_async. These were, however, introduce in 0.13.1
release. Well, technically they haven't been released yet, but
for sure they are not going to be part of 0.13.0 release (for the
ABI stability sake).

Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Message-id: 1a724e97cb587624d6f6009c15395496bccfa32b.1456317738.git.mprivozn@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-29 10:54:32 +01:00
Greg Kurz
a005b3ef50 xics: report errors with the QEMU Error API
Using the return value to report errors is error prone:
- xics_alloc() returns -1 on error but spapr_vio_busdev_realize() errors
  on 0
- xics_alloc_block() returns the unclear value of ics->offset - 1 on error
  but both rtas_ibm_change_msi() and spapr_phb_realize() error on 0

This patch adds an errp argument to xics_alloc() and xics_alloc_block() to
report errors. The return value of these functions is a valid IRQ number
if errp is NULL. It is undefined otherwise.

The corresponding error traces get promotted to error messages. Note that
the "can't allocate IRQ" error message in spapr_vio_busdev_realize() also
moves to xics_alloc(). Similar error message consolidation isn't really
applicable to xics_alloc_block() because callers have extra context (device
config address, MSI or MSIX).

This fixes the issues mentioned above.

Based on previous work from Brian W. Hart.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-28 16:19:02 +11:00
Greg Kurz
902c053d83 migration: allow machine to enforce configuration section migration
Migration of pseries-2.3 doesn't have configuration section. Unfortunately,
QEMU 2.4/2.4.1/2.5 are buggy and always stream and expect the configuration
section, and break migration both ways.

This patch introduces a property which allows to enforce a configuration
section for machines who don't have one.

It can be set at startup:

-machine enforce-config-section=on

or later from the QEMU monitor:

qom-set /machine enforce-config-section on

It is up to the tooling to set or unset this property according to the
version of the QEMU at the other end of the pipe.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-28 16:19:02 +11:00
Greg Kurz
09b5e30da5 spapr: skip configuration section during migration of older machines
Since QEMU 2.4, we have a configuration section in the migration stream.
This must be skipped for older machines, like it is already done for x86.

This patch fixes the migration of pseries-2.3 from/to QEMU 2.3, but it
breaks migration of the same machine from/to QEMU 2.4/2.4.1/2.5. We do
that anyway because QEMU 2.3 is likely to be more widely deployed than
newer QEMU versions.

Fixes: 61964c23e5
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-28 16:19:02 +11:00
Hervé Poussineau
2d7d06d847 dbdma: warn when using unassigned channel
With this, it's easier to know if a guest uses an invalid and/or unimplemented
DMA channel.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-28 16:19:02 +11:00
Greg Kurz
cba0e7796b spapr: disable vmdesc submission for old machines
Since QEMU 2.3, we have a vmdesc section in the migration stream.
This section is not mandatory but when migrating a pseries-2.2
machine from QEMU 2.2, you get a warning at the destination:

qemu-system-ppc64: Expected vmdescription section, but got 0

The warning goes away if we decide to skip vmdesc as well for
older pseries, like it is already done for pc's.

This can only be observed with -cpu POWER7 because POWER8
cannot migrate from QEMU 2.2 to 2.3 (insns_flags2 mismatch).

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-28 16:19:02 +11:00
Greg Kurz
ce266b75fe spapr_pci: fix irq leak in RTAS ibm,change-msi
This RTAS call is used to request new interrupts or to free all interrupts.

If the driver has already allocated interrupts and asks again for a non-null
number of irqs, then the rtas_ibm_change_msi() function will silently leak
the previous interrupts.

It happens because xics_free() is only called when the driver releases all
interrupts (!req_num case). Note that the previously allocated spapr_pci_msi
is not leaked because the GHashTable is created with destroy functions and
g_hash_table_insert() hence frees the old value.

This patch makes sure any previously allocated MSIs are released when a
new allocation succeeds.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-28 16:19:02 +11:00
Greg Kurz
d4a63ac8b1 spapr_pci: kill useless variable in rtas_ibm_change_msi()
The num local variable is initialized to zero and has no writer.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-28 16:19:02 +11:00
Greg Kurz
3d0db3e74d spapr_rng: disable hotpluggability
It is currently possible to hotplug a spapr_rng device but QEMU crashes
when we try to hot unplug:

ERROR:hw/core/qdev.c:295:qdev_unplug: assertion failed: (hotplug_ctrl)
Aborted

This happens because spapr_rng isn't plugged to any bus and sPAPR does
not provide hotplug support for it: qdev_get_hotplug_handler() hence
return NULL and we hit the assertion.

And anyway, it doesn't make much sense to unplug this device since hcalls
cannot be unregistered. Even the idea of hotplugging a RNG device instead
of declaring it on the QEMU command line looks weird.

This patch simply disables hotpluggability for the spapr-rng class.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-28 16:19:02 +11:00
Peter Maydell
6e378dd214 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20160226' into staging
target-arm queue:
 * Clean up handling of bad mode switches writing to CPSR, and implement
   the ARMv8 requirement that they set PSTATE.IL
 * Implement MDCR_EL3.TPM and MDCR_EL2.TPM traps on perf monitor
   register accesses
 * Don't implement stellaris-pl061-only registers on generic-pl061
 * Fix SD card handling for raspi
 * Add missing include files to MAINTAINERS
 * Mark CNTHP_TVAL_EL2 as ARM_CP_NO_RAW
 * Make reserved ranges in ID_AA64* spaces RAZ, not UNDEF

# gpg: Signature made Fri 26 Feb 2016 15:19:07 GMT using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"

* remotes/pmaydell/tags/pull-target-arm-20160226:
  target-arm: Make reserved ranges in ID_AA64* spaces RAZ, not UNDEF
  target-arm: Mark CNTHP_TVAL_EL2 as ARM_CP_NO_RAW
  sdhci: add quirk property for card insert interrupt status on Raspberry Pi
  sdhci: Revert "add optional quirk property to disable card insertion/removal interrupts"
  MAINTAINERS: Add some missing ARM related header files
  raspi: fix SD card with recent sdhci changes
  ARM: PL061: Checking register r/w accesses to reserved area
  target-arm: Implement MDCR_EL3.TPM and MDCR_EL2.TPM traps
  target-arm: Fix handling of SDCR for 32-bit code
  target-arm: Make Monitor->NS PL1 mode changes illegal if HCR.TGE is 1
  target-arm: Make mode switches from Hyp via CPS and MRS illegal
  target-arm: In v8, make illegal AArch32 mode changes set PSTATE.IL
  target-arm: Forbid mode switch to Mon from Secure EL1
  target-arm: Add Hyp mode checks to bad_mode_switch()
  target-arm: Add comment about not implementing NSACR.RFR
  target-arm: In cpsr_write() ignore mode switches from User mode
  linux-user: Use restrictive mask when calling cpsr_write()
  target-arm: Raw CPSR writes should skip checks and bank switching
  target-arm: Add write_type argument to cpsr_write()
  target-arm: Give CPSR setting on 32-bit exception return its own helper

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-26 16:02:00 +00:00
Peter Maydell
aa53d5bfc3 Merge remote-tracking branch 'remotes/amit-migration/tags/migration-for-2.6-5' into staging
migration pull
 - fix a qcow2 assert
 - fix for older distros (CentOS 5)
 - documentation for vmstate flags
 - minor code rearrangement

# gpg: Signature made Fri 26 Feb 2016 15:15:15 GMT using RSA key ID 854083B6
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg:                 aka "Amit Shah <amit@kernel.org>"
# gpg:                 aka "Amit Shah <amitshah@gmx.net>"

* remotes/amit-migration/tags/migration-for-2.6-5:
  migration (postcopy): move bdrv_invalidate_cache_all of of coroutine context
  migration (ordinary): move bdrv_invalidate_cache_all of of coroutine context
  migration/vmstate: document VMStateFlags
  MAINTAINERS: Add docs/migration.txt to the "Migration" section
  migration/postcopy-ram: Guard use of sys/eventfd.h with CONFIG_EVENTFD
  migration: reorder code to make it symmetric

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-26 15:21:26 +00:00
Denis V. Lunev
ea6a55bcc0 migration (postcopy): move bdrv_invalidate_cache_all of of coroutine context
There is a possibility to hit an assert in qcow2_get_specific_info that
s->qcow_version is undefined. This happens when VM in starting from
suspended state, i.e. it processes incoming migration, and in the same
time 'info block' is called.

The problem is that qcow2_invalidate_cache() closes the image and
memset()s BDRVQcowState in the middle.

The patch moves processing of bdrv_invalidate_cache_all out of
coroutine context for postcopy migration to avoid that. This function
is called with the following stack:
  process_incoming_migration_co
  qemu_loadvm_state
  qemu_loadvm_state_main
  loadvm_process_command
  loadvm_postcopy_handle_run

Signed-off-by: Denis V. Lunev <den@openvz.org>
Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Amit Shah <amit.shah@redhat.com>
Message-Id: <1456304019-10507-3-git-send-email-den@openvz.org>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-26 20:40:08 +05:30
Denis V. Lunev
0aa6aefc9c migration (ordinary): move bdrv_invalidate_cache_all of of coroutine context
There is a possibility to hit an assert in qcow2_get_specific_info that
s->qcow_version is undefined. This happens when VM in starting from
suspended state, i.e. it processes incoming migration, and in the same
time 'info block' is called.

The problem is that qcow2_invalidate_cache() closes the image and
memset()s BDRVQcowState in the middle.

The patch moves processing of bdrv_invalidate_cache_all out of
coroutine context for standard migration to avoid that.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Fam Zheng <famz@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Amit Shah <amit.shah@redhat.com>
Message-Id: <1456304019-10507-2-git-send-email-den@openvz.org>

[Amit: Fix a use-after-free bug]

Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-26 20:39:50 +05:30
Peter Maydell
e20d84c140 target-arm: Make reserved ranges in ID_AA64* spaces RAZ, not UNDEF
The v8 ARM ARM defines that unused spaces in the ID_AA64* system
register ranges are Reserved and must RAZ, rather than being UNDEF.
Implement this.

In particular, ARM v8.2 adds a new feature register ID_AA64MMFR2,
and newer versions of the Linux kernel will attempt to read this,
which causes them not to boot up on versions of QEMU missing this fix.

Since the encoding .opc0 = 3, .opc1 = 0, .crn = 0, .crm = 2, .opc2 = 6
is actually defined in ARMv8 (as ID_MMFR4), we give it an entry in
the ARMCPU struct so CPUs can override it, though since none do
this too will just RAZ.

Cc: qemu-stable@nongnu.org
Reported-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1455890863-11203-1-git-send-email-peter.maydell@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Alex Bennée <alex.bennee@linaro.org>
2016-02-26 15:09:42 +00:00
Edgar E. Iglesias
d44ec15630 target-arm: Mark CNTHP_TVAL_EL2 as ARM_CP_NO_RAW
Mark CNTHP_TVAL_EL2 as ARM_CP_NO_RAW due to the register not
having any underlying state. This fixes an issue with booting
KVM enabled kernels when EL2 is on.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1456490739-19343-1-git-send-email-edgar.iglesias@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-26 15:09:42 +00:00
Andrew Baumann
0a7ac9f9e7 sdhci: add quirk property for card insert interrupt status on Raspberry Pi
This quirk is a workaround for the following hardware behaviour, on
which UEFI (specifically, the bootloader for Windows on Pi2) depends:

1. at boot with an SD card present, the interrupt status/enable
   registers are initially zero
2. upon enabling it in the interrupt enable register, the card insert
   bit in the interrupt status register is immediately set
3. after a subsequent controller reset, the card insert interrupt does
   not fire, even if enabled in the interrupt enable register

Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1456436130-7048-3-git-send-email-Andrew.Baumann@microsoft.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-26 15:09:42 +00:00
Andrew Baumann
5c1bc9a234 sdhci: Revert "add optional quirk property to disable card insertion/removal interrupts"
This reverts commit 723697551a.

This change was poorly tested on my part. It squelched card insertion
interrupts on reset, but that was not necessary because sdhci_reset()
clears all the registers (via the call to memset), so the subsequent
sdhci_insert_eject_cb() call never sees the card insert interrupt
enabled. However, not calling the insert_eject_cb results in prnsts
remaining 0, when it actually needs to be updated to indicate card
presence and R/O status.

Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1456436130-7048-2-git-send-email-Andrew.Baumann@microsoft.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-26 15:09:42 +00:00
Thomas Huth
ed0db8663a MAINTAINERS: Add some missing ARM related header files
Some header files in the include/hw/arm/ directory can be assigned
to entries in the MAINTAINERS file.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1456399324-24259-1-git-send-email-thuth@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-26 15:09:42 +00:00
Andrew Baumann
a55b53a2f4 raspi: fix SD card with recent sdhci changes
Recent changes to sdhci broke SD on raspi. This change mirrors
the logic to create the SD card device at the board level.

Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1456351128-5560-1-git-send-email-Andrew.Baumann@microsoft.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-26 15:09:42 +00:00
Wei Huang
09aa3bf382 ARM: PL061: Checking register r/w accesses to reserved area
pl061.c emulates two GPIO devices, ARM PL061 and TI Stellaris, which
share the same read/write functions (pl061_read and pl061_write).
However PL061 and Stellaris have different GPIO register definitions
and pl061_read()/pl061_write() doesn't check it. This patch enforces
checking on offset, preventing R/W into the reserved memory area.

Signed-off-by: Wei Huang <wei@redhat.com>
Message-id: 1455814580-17699-1-git-send-email-wei@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-26 15:09:42 +00:00
Peter Maydell
1fce1ba985 target-arm: Implement MDCR_EL3.TPM and MDCR_EL2.TPM traps
Implement the performance monitor register traps controlled
by MDCR_EL3.TPM and MDCR_EL2.TPM. Most of the performance
registers already have an access function to deal with the
user-enable bit, and the TPM checks can be added there. We
also need a new access function which only implements the
TPM checks for use by the few not-EL0-accessible registers
and by PMUSERENR_EL0 (which is always EL0-readable).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1455892784-11328-3-git-send-email-peter.maydell@linaro.org
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
2016-02-26 15:09:42 +00:00
Peter Maydell
a8d64e7351 target-arm: Fix handling of SDCR for 32-bit code
Fix two issues with our implementation of the SDCR:
 * it is only present from ARMv8 onwards
 * it does not contain several of the trap bits present in its 64-bit
   counterpart the MDCR_EL3

Put the register description in the right place so that it does not
get enabled for ARMv7 and earlier, and give it a write function so that
we can mask out the bits which should not be allowed to have an effect
if EL3 is 32-bit.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1455892784-11328-2-git-send-email-peter.maydell@linaro.org
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
2016-02-26 15:09:42 +00:00
Peter Maydell
10eacda787 target-arm: Make Monitor->NS PL1 mode changes illegal if HCR.TGE is 1
If HCR.TGE is 1 then mode changes via CPS and MSR from Monitor to
NonSecure PL1 modes are illegal mode changes. Implement this check
in bad_mode_switch().

(We don't currently implement HCR.TGE, but this is the only missing
check from the v8 ARM ARM G1.9.3 and so it's worth adding now; the
rest of the HCR.TGE checks can be added later as necessary.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1455556977-3644-12-git-send-email-peter.maydell@linaro.org
2016-02-26 15:09:42 +00:00
Peter Maydell
af393ffc6d target-arm: Make mode switches from Hyp via CPS and MRS illegal
Mode switches from Hyp to any other mode via the CPS and MRS
instructions are illegal mode switches (though obviously switching
via exception return is valid).  Add this check to bad_mode_switch().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1455556977-3644-11-git-send-email-peter.maydell@linaro.org
2016-02-26 15:09:41 +00:00
Peter Maydell
81907a5829 target-arm: In v8, make illegal AArch32 mode changes set PSTATE.IL
In v8, the illegal mode changes which are UNPREDICTABLE in v7 are
given architected behaviour:
 * the mode field is unchanged
 * PSTATE.IL is set (so any subsequent instructions will UNDEF)
 * any other CPSR fields are written to as normal

This is pretty much the same behaviour we picked for our
UNPREDICTABLE handling, with the exception that for v8 we
need to set the IL bit.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1455556977-3644-10-git-send-email-peter.maydell@linaro.org
2016-02-26 15:09:41 +00:00
Peter Maydell
58ae2d1f03 target-arm: Forbid mode switch to Mon from Secure EL1
In v8 trying to switch mode to Mon from Secure EL1 is an
illegal mode switch. (In v7 this is impossible as all secure
modes except User are at EL3.) We can handle this case by
making a switch to Mon valid only if the current EL is 3,
which then gives the correct answer whether EL3 is AArch32
or AArch64.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1455556977-3644-9-git-send-email-peter.maydell@linaro.org
2016-02-26 15:09:41 +00:00
Peter Maydell
e6c8fc07b4 target-arm: Add Hyp mode checks to bad_mode_switch()
We don't actually support Hyp mode yet, but add the correct
checks for it to the bad_mode_switch() function for completeness.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1455556977-3644-8-git-send-email-peter.maydell@linaro.org
2016-02-26 15:09:41 +00:00
Peter Maydell
52ff951b4f target-arm: Add comment about not implementing NSACR.RFR
QEMU doesn't implement the NSACR.RFR bit, which is a permitted
IMPDEF in choice in ARMv7 and the only permitted choice in ARMv8.
Add a comment to bad_mode_switch() to note that this is why
FIQ is always a valid mode regardless of the CPU's Secure state.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1455556977-3644-7-git-send-email-peter.maydell@linaro.org
2016-02-26 15:09:41 +00:00
Peter Maydell
cb01d3912c target-arm: In cpsr_write() ignore mode switches from User mode
The only case where we can attempt a cpsr_write() mode switch from
User is from the gdbstub; all other cases are handled in the
calling code (notably translate.c). Architecturally attempts to
alter the mode bits from user mode are simply ignored (and not
treated as a bad mode switch, which in v8 sets CPSR.IL). Make
mode switches from User ignored in cpsr_write() as well, for
consistency.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1455556977-3644-6-git-send-email-peter.maydell@linaro.org
2016-02-26 15:09:41 +00:00
Peter Maydell
ae08792301 linux-user: Use restrictive mask when calling cpsr_write()
When linux-user code is calling cpsr_write(), use a restrictive
mask to ensure we are limiting the set of CPSR bits we update.
In particular, don't allow the mode bits to be changed.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1455556977-3644-5-git-send-email-peter.maydell@linaro.org
2016-02-26 15:09:41 +00:00
Peter Maydell
f8c88bbcda target-arm: Raw CPSR writes should skip checks and bank switching
Raw CPSR writes should skip the architectural checks for whether
we're allowed to set the A or F bits and should also not do
the switching of register banks if the mode changes. Handle
this inside cpsr_write(), which allows us to drop the "manually
set the mode bits to avoid the bank switch" code from all the
callsites which are using CPSRWriteRaw.

This fixes a bug in 32-bit KVM handling where we had forgotten
the "manually set the mode bits" part and could thus potentially
trash the register state if the mode from the last exit to userspace
differed from the mode on this exit.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1455556977-3644-4-git-send-email-peter.maydell@linaro.org
2016-02-26 15:09:41 +00:00
Peter Maydell
50866ba5a2 target-arm: Add write_type argument to cpsr_write()
Add an argument to cpsr_write() to indicate what kind of CPSR
write is being requested, since the exact behaviour should
differ for the different cases.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1455556977-3644-3-git-send-email-peter.maydell@linaro.org
2016-02-26 15:09:41 +00:00
Peter Maydell
235ea1f5c8 target-arm: Give CPSR setting on 32-bit exception return its own helper
The rules for setting the CPSR on a 32-bit exception return are
subtly different from those for setting the CPSR via an instruction
like MSR or CPS. (In particular, in Hyp mode changing the mode bits
is not valid via MSR or CPS.) Split the exception-return case into
its own helper for setting CPSR, so we can eventually handle them
differently in the helper function.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1455556977-3644-2-git-send-email-peter.maydell@linaro.org
2016-02-26 15:09:41 +00:00
Sascha Silbe
8da5ef579f migration/vmstate: document VMStateFlags
The VMState API is rather sparsely documented. Start by describing the
meaning of all VMStateFlags.

Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Message-Id: <1456474693-11662-1-git-send-email-silbe@linux.vnet.ibm.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-26 18:40:30 +05:30
Thomas Huth
a609ad8b69 MAINTAINERS: Add docs/migration.txt to the "Migration" section
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1456393669-20678-1-git-send-email-thuth@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-26 18:40:30 +05:30
Peter Maydell
4d1e324b22 Merge remote-tracking branch 'remotes/lalrae/tags/mips-20160226' into staging
MIPS patches 2016-02-26

Changes:
* support for FPU and MSA in KVM guest
* support for R6 Virtual Processors

# gpg: Signature made Fri 26 Feb 2016 11:07:37 GMT using RSA key ID 0B29DA6B
# gpg: Good signature from "Leon Alrae <leon.alrae@imgtec.com>"

* remotes/lalrae/tags/mips-20160226:
  target-mips: implement R6 multi-threading
  mips/kvm: Support MSA in MIPS KVM guests
  mips/kvm: Support FPU in MIPS KVM guests
  mips/kvm: Support signed 64-bit KVM registers
  mips/kvm: Support unsigned KVM registers
  mips/kvm: Implement Config CP0 registers
  mips/kvm: Implement PRid CP0 register
  mips/kvm: Remove a couple of noisy DPRINTFs

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-26 12:54:22 +00:00
Peter Maydell
a88a5cd2e8 Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-signed' into staging
Update OpenBIOS images

# gpg: Signature made Fri 26 Feb 2016 10:45:04 GMT using RSA key ID AE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"

* remotes/mcayland/tags/qemu-openbios-signed:
  Update OpenBIOS images

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-26 12:24:03 +00:00
Mark Cave-Ayland
2d4846bd7b Update OpenBIOS images
Update OpenBIOS images to SVN r1391 built from submodule.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2016-02-26 10:44:40 +00:00
Matthew Fortune
d8b9d7719c migration/postcopy-ram: Guard use of sys/eventfd.h with CONFIG_EVENTFD
sys/eventfd.h was being guarded only by a check for linux but does
not exist on older distributions like CentOS 5. Move the include
into the code that uses it and add an appropriate guard.

Signed-off-by: Matthew Fortune <matthew.fortune@imgtec.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <6D39441BF12EF246A7ABCE6654B023536BB85DEB@hhmail02.hh.imgtec.org>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-26 15:05:25 +05:30
Wei Yang
bdf46d6478 migration: reorder code to make it symmetric
In qemu_savevm_state_complete_precopy(), it iterates on each device to add
a json object and transfer related status to destination, while the order
of the last two steps could be refined.

Current order:

    json_start_object()
    	save_section_header()
    	vmstate_save()
    json_end_object()
    	save_section_footer()

After the change:

    json_start_object()
    	save_section_header()
    	vmstate_save()
    	save_section_footer()
    json_end_object()

This patch reorder the code to to make it symmetric. No functional change.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1454626230-16334-1-git-send-email-richard.weiyang@gmail.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-26 15:05:24 +05:30
Laszlo Ersek
e6915b5f3a fw_cfg: unbreak migration compatibility for 2.4 and earlier machines
When I reviewed Marc's fw_cfg DMA patches, I completely missed that the
way we set dma_enabled would break migration.

Gerd explained the right way (see reference below): dma_enabled should be
set to true by default, and only true->false transitions should be
possible:

- when the user requests that with

    -global fw_cfg_mem.dma_enabled=off

  or

   -global fw_cfg_io.dma_enabled=off

  as appropriate for the platform,

- when HW_COMPAT_2_4 dictates it,

- when board code initializes fw_cfg without requesting DMA support.

Cc: Marc Marí <markmb@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Alexandre DERUMIER <aderumier@odiso.com>
Cc: qemu-stable@nongnu.org
Ref: http://thread.gmane.org/gmane.comp.emulators.qemu/390272/focus=391042
Ref: https://bugs.launchpad.net/qemu/+bug/1536487
Suggested-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 1455823860-22268-1-git-send-email-lersek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-26 10:06:40 +01:00
Yongbok Kim
01bc435b44 target-mips: implement R6 multi-threading
MIPS Release 6 provides multi-threading features which replace
pre-R6 MT Module. CP0.Config3.MT is always 0 in R6, instead there is new
CP0.Config5.VP (Virtual Processor) bit which indicates presence of
multi-threading support which includes CP0.GlobalNumber register and
DVP/EVP instructions.

Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-02-26 08:59:17 +00:00
James Hogan
bee62662a3 mips/kvm: Support MSA in MIPS KVM guests
Support the new KVM_CAP_MIPS_MSA capability, which allows MIPS SIMD
Architecture (MSA) to be exposed to the KVM guest.

The capability is enabled if the guest core has MSA according to its
Config3 register. Various config bits are now writeable so that KVM is
aware of the configuration (Config3.MSAP) and so that QEMU can
save/restore the guest modifiable bits (Config5.MSAEn). The MSACSR/MSAIR
registers and the MSA vector registers are now saved/restored. Since the
FP registers are a subset of the vector registers, they are omitted if
the guest has MSA.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-02-26 08:59:17 +00:00
James Hogan
152db36ae6 mips/kvm: Support FPU in MIPS KVM guests
Support the new KVM_CAP_MIPS_FPU capability, which allows the host's FPU
to be exposed to the KVM guest.

The capability is enabled if the guest core has an FPU according to its
Config1 register. Various config bits are now writeable so that KVM is
aware of the configuration (Config1.FP) and so that QEMU can
save/restore the guest modifiable bits (Config5.FRE, Config5.UFR,
Config5.UFE). The FCSR/FIR registers and the floating point registers
are now saved/restored (depending on the FR mode bit).

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-02-26 08:59:17 +00:00
James Hogan
d319f83fe9 mips/kvm: Support signed 64-bit KVM registers
Rename kvm_mips_{get,put}_one_reg64() to kvm_mips_{get,put}_one_ureg64()
since they take an int64_t pointer, and add separate signed 64-bit
accessors. These will be used for double precision floating point
registers.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-02-26 08:59:17 +00:00
James Hogan
0759487b56 mips/kvm: Support unsigned KVM registers
Add KVM register access functions for the uint32_t type. This is
required for FP and MSA control registers, which are represented as
unsigned 32-bit integers.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-02-26 08:59:17 +00:00
James Hogan
03cbfd7b5c mips/kvm: Implement Config CP0 registers
Implement saving and restoring to KVM state of the Config CP0 registers
(namely Config, Config1, Config2, Config3, Config4, and Config5). These
control the features available to a guest, and a few of the fields will
soon be writeable by a guest so QEMU needs to know about them so as not
to clobber them on migration/savevm.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-02-26 08:59:17 +00:00
James Hogan
461a1582f0 mips/kvm: Implement PRid CP0 register
Implement saving and restoring to KVM state of the Processor ID (PRid)
CP0 register. This allows QEMU to control the PRid exposed to the guest
instead of using the default set by KVM.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-02-26 08:59:17 +00:00
James Hogan
c489e5591f mips/kvm: Remove a couple of noisy DPRINTFs
The DPRINTFs in cpu_mips_io_interrupts_pending() and kvm_arch_pre_run()
are particularly noisy during normal execution, and also not
particularly helpful. Remove them so that more important debug messages
can be more easily seen.

Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-02-26 08:59:17 +00:00
Peter Maydell
67ef811ed1 Merge remote-tracking branch 'remotes/mdroth/tags/qga-pull-2016-02-25-tag' into staging
qemu-ga patch queue for 2.6

* fix w32 build breakage when VSS enabled
* fix up wchar handling in guest-set-user-password
* fix re-install handling for w32 MSI installer
* add w32 support for guest-get-vcpus
* add support for enums in guest-file-seek SEEK params
  instead of relying on platform-specific integer values

# gpg: Signature made Thu 25 Feb 2016 16:59:13 GMT using RSA key ID F108B584
# gpg: Good signature from "Michael Roth <flukshun@gmail.com>"
# gpg:                 aka "Michael Roth <mdroth@utexas.edu>"
# gpg:                 aka "Michael Roth <mdroth@linux.vnet.ibm.com>"

* remotes/mdroth/tags/qga-pull-2016-02-25-tag:
  qga: fix w32 breakage due to missing osdep.h includes
  qga: check utf8-to-utf16 conversion
  qga: fix off-by-one length check
  qga: use wide-chars constants for wchar_t comparisons
  qga: use size_t for wcslen() return value
  qga: use more idiomatic qemu-style eol operators
  qga: implement the guest-get-vcpus for windows
  qemu-ga: Fixed minor version switch issue
  qga: Support enum names in guest-file-seek

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-25 17:33:19 +00:00
Michael Roth
e55eb806db qga: fix w32 breakage due to missing osdep.h includes
requester.h relied on qemu/compiler.h definitions to
handle GCC_FMT_ATTR() stub, but this include was removed as part
of scripted clean-ups via 30456d5:

  all: Clean up includes

under the assumption that all C files would have included it via
qemu/osdep.h at that point. requester.cpp was likely missed
due to C++ files requiring manual/special handling as well as
VSS build options needing to be enabled to trigger build failures.

Fix this by including qemu/osdep.h. That in turn pulls in a
macro from qapi/error.h that conflicts with a struct field name
in requester.h, so fix that as well by renaming the field.

While we're at it, fix up provider.cpp/install.cpp to include
osdep.h as well.

Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-25 10:54:32 -06:00
Lluís Vilanova
0c6940d086 build: [bsd-user] Rename "syscall.h" to "target_syscall.h" in target directories
This fixes double-definitions in bsd-user builds when using the UST
tracing backend (which indirectly includes the system's "syscall.h").

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-25 16:41:08 +00:00
Marc-André Lureau
8021de1013 qga: check utf8-to-utf16 conversion
UTF8 to UTF16 conversion can fail for genuine reasons, let's check errors.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-02-25 09:48:52 -06:00
Marc-André Lureau
25d943b957 qga: fix off-by-one length check
Laszlo Ersek said: "The length check is off by one (in the safe direction); it
should be (nchars >= 2). The processing should be active for the wide string
L"\r\n" -- resulting in the empty wide string --, I believe."

Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-02-25 09:48:51 -06:00
Marc-André Lureau
6c6916dac8 qga: use wide-chars constants for wchar_t comparisons
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-02-25 09:48:51 -06:00
Marc-André Lureau
6771197dff qga: use size_t for wcslen() return value
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-02-25 09:48:51 -06:00
Marc-André Lureau
02506e2d54 qga: use more idiomatic qemu-style eol operators
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-02-25 09:48:51 -06:00
Gal Hammer
a7a173624e qga: implement the guest-get-vcpus for windows
Signed-off-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
* report rather than assert when VCPU count == 0
* fix up subject: s/set-vcpus/get-vcpus/
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-02-25 09:48:51 -06:00
Leonid Bloch
01fdadde80 qemu-ga: Fixed minor version switch issue
With automatically generated GUID, on minor version changes, an error
occurred, stating that there is a problem with the installer.
Now, a notification is shown, warning the user that another version of
this product is already installed, and that configuration or removal of
the existing version is possible through Add/Remove Programs on the
Control Panel (expected behavior).

Signed-off-by: Leonid Bloch <leonid@daynix.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-02-25 09:48:51 -06:00
Eric Blake
0b4b49387c qga: Support enum names in guest-file-seek
Magic constants are a pain to use, especially when we run the
risk that our choice of '1' for QGA_SEEK_CUR might differ from
the host or guest's choice of SEEK_CUR.  Better is to use an
enum value, via a qapi alternate type for back-compatibility.

With this,
 {"command":"guest-file-seek", "arguments":{"handle":1,
  "offset":0, "whence":"cur"}}
becomes a synonym for the older
 {"command":"guest-file-seek", "arguments":{"handle":1,
  "offset":0, "whence":1}}

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-02-25 09:48:50 -06:00
Peter Maydell
586fc27e6a Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* Asynchronous dump-guest-memory from Peter
* improved logging with -D -daemonize from Dimitris
* more address_space_* optimization from Gonglei
* TCG xsave/xrstor thinko fix
* chardev bugfix and documentation patch

# gpg: Signature made Thu 25 Feb 2016 15:12:27 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream:
  target-i386: fix confusion in xcr0 bit position vs. mask
  chardev: Properly initialize ChardevCommon components
  memory: Remove unreachable return statement
  memory: optimize qemu_get_ram_ptr and qemu_ram_ptr_length
  exec: store RAMBlock pointer into memory region
  log: Redirect stderr to logfile if deamonized
  dump-guest-memory: add qmp event DUMP_COMPLETED
  Dump: add hmp command "info dump"
  Dump: add qmp command "query-dump"
  DumpState: adding total_size and written_size fields
  dump-guest-memory: add "detach" support
  dump-guest-memory: disable dump when in INMIGRATE state
  dump-guest-memory: introduce dump_process() helper function.
  dump-guest-memory: add dump_in_progress() helper function
  dump-guest-memory: using static DumpState, add DumpStatus
  dump-guest-memory: add "detach" flag for QMP/HMP interfaces.
  dump-guest-memory: cleanup: removing dump_{error|cleanup}().
  scripts/kvm/kvm_stat: Fix missing right parantheses and ".format(...)"
  qemu-options.hx: Improve documentation of chardev multiplexing mode

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-25 15:30:57 +00:00
Paolo Bonzini
cfc3b074de target-i386: fix confusion in xcr0 bit position vs. mask
The xsave and xrstor helpers are accessing the x86_ext_save_areas array
using a bit mask instead of a bit position.  Provide two sets of XSTATE_*
definitions and use XSTATE_*_BIT when a bit position is requested.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-25 16:11:29 +01:00
Eric Blake
21a933ea33 chardev: Properly initialize ChardevCommon components
Commit d0d7708b forgot to parse logging for spice chardevs and
virtual consoles. This requires making qemu_chr_parse_common()
non-static. While at it, use a temporary variable to make the
code shorter, as well as reduce the churn when a later patch
alters the layout of simple unions.

Signed-off-by: Eric Blake <eblake@redhat.com>
CC: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455927587-28033-2-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-25 16:11:29 +01:00
Gonglei
d61524486c memory: Remove unreachable return statement
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <1455935721-8804-4-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-25 16:11:29 +01:00
Gonglei
3655cb9c73 memory: optimize qemu_get_ram_ptr and qemu_ram_ptr_length
these two functions consume too much cpu overhead to
find the RAMBlock by ram address.

After this patch, we can pass the RAMBlock pointer
to them so that they don't need to find the RAMBlock
anymore most of the time. We can get better performance
in address translation processing.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <1455935721-8804-3-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-25 16:11:29 +01:00
Gonglei
58eaa2174e exec: store RAMBlock pointer into memory region
Each RAM memory region has a unique corresponding RAMBlock.
In the current realization, the memory region only stored
the ram_addr which means the offset of RAM address space,
We need to qurey the global ram.list to find the ram block
by ram_addr if we want to get the ram block, which is very
expensive.

Now, we store the RAMBlock pointer into memory region
structure. So, if we know the mr, we can easily get the
RAMBlock.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <1456130097-4208-2-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-25 16:11:26 +01:00
Peter Maydell
774ae4254d Merge remote-tracking branch 'remotes/bkoppelmann/tags/pull-tricore-20160225' into staging
TriCore bugfixes and synchronous trap implementation

# gpg: Signature made Thu 25 Feb 2016 11:57:41 GMT using RSA key ID 6B69CA14
# gpg: Good signature from "Bastian Koppelmann <kbastian@mail.uni-paderborn.de>"

* remotes/bkoppelmann/tags/pull-tricore-20160225:
  target-tricore: add opd trap generation
  target-tricore: add illegal opcode trap generation
  target-tricore: add context managment trap generation
  target-tricore: Add trap handling & SOVF/OVF traps
  target-tricore: Fix wrong precedences on psw_write
  target-tricore: fix save_context_upper using env->PSW

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-25 12:57:22 +00:00
Peter Maydell
df215b59d9 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
vhost, virtio, pci, pc

Fixes all over the place.
virtio dataplane migration support.
Old q35 machine types removed.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Thu 25 Feb 2016 11:16:46 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream: (21 commits)
  q35: No need to check gigabyte_align
  q35: Remove unused q35-acpi-dsdt.aml file
  ich9: Remove enable_tco arguments from init functions
  machine: Remove no_tco field
  q35: Remove old machine versions
  tests/vhost-user-bridge: fix build on 32 bit systems
  vring: remove
  virtio-scsi: do not use vring in dataplane
  virtio-blk: do not use vring in dataplane
  virtio-blk: fix "disabled data plane" mode
  virtio: export vring_notify as virtio_should_notify
  virtio: add AioContext-specific function for host notifiers
  vring: make vring_enable_notification return void
  block-migration: acquire AioContext as necessary
  pci core: function pci_bus_init() cleanup
  pci core: function pci_host_bus_register() cleanup
  balloon: Use only 'pc-dimm' type dimm for ballooning
  virtio-balloon: rewrite get_current_ram_size()
  move get_current_ram_size to virtio-balloon.c
  vhost-user: don't merge regions with different fds
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-25 12:13:49 +00:00
Bastian Koppelmann
828066c78a target-tricore: add opd trap generation
If an instruction uses a 64 bit register which consists of an even-odd pair
of 32 bit registers and if the register specifier in the instruction is
odd an opd trap is raised.

Reviewed-by: Richard Henderson  <rth@twiddle.net>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-Id: <1455889426-1923-5-git-send-email-kbastian@mail.uni-paderborn.de>
2016-02-25 12:54:50 +01:00
Bastian Koppelmann
f678f671ba target-tricore: add illegal opcode trap generation
Reviewed-by: Richard Henderson  <rth@twiddle.net>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-Id: <1455889426-1923-4-git-send-email-kbastian@mail.uni-paderborn.de>
2016-02-25 12:54:47 +01:00
Bastian Koppelmann
3292b4477f target-tricore: add context managment trap generation
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-Id: <1455889426-1923-3-git-send-email-kbastian@mail.uni-paderborn.de>
2016-02-25 12:54:45 +01:00
Bastian Koppelmann
518d7fd2a0 target-tricore: Add trap handling & SOVF/OVF traps
Add the infrastructure needed to generate and handle traps and
implement the generation of SOVF and OVF traps.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Message-Id: <1455889426-1923-2-git-send-email-kbastian@mail.uni-paderborn.de>
2016-02-25 12:54:42 +01:00
Bastian Koppelmann
5dc1fbae70 target-tricore: Fix wrong precedences on psw_write
Wrong braces on the restore of the cached TCGv SV and V bit could lead to
a wrong PSW. While at this it removes unnecessary braces for the restore
of the cached TCGv AV and SAV bits.

Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
2016-02-25 12:51:31 +01:00
Bastian Koppelmann
723733575b target-tricore: fix save_context_upper using env->PSW
If the cached bits for C, V, SV, AV, or SAV were set, they would
not be saved during the context save since env->PSW was stored instead
of properly reading them using psw_read().

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
2016-02-25 12:51:27 +01:00
Peter Maydell
8283f6f821 Merge remote-tracking branch 'remotes/riku/tags/pull-linux-user-20160225' into staging
Second pull req with getrandom fix

# gpg: Signature made Thu 25 Feb 2016 10:57:42 GMT using RSA key ID DE3C9BC0
# gpg: Good signature from "Riku Voipio <riku.voipio@iki.fi>"
# gpg:                 aka "Riku Voipio <riku.voipio@linaro.org>"

* remotes/riku/tags/pull-linux-user-20160225:
  linux-user: add getrandom() syscall
  linux-user: correct timerfd_create syscall numbers
  linux-user: remove unavailable syscalls from aarch64
  linux-user: sync syscall numbers with kernel
  linux-user: Don't assert if guest tries shmdt(0)
  linux-user: set ppc64/ppc64le default CPU to POWER8
  build: [linux-user] Rename "syscall.h" to "target_syscall.h" in target directories
  linux-user: fix realloc size of target_fd_trans.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-25 11:46:53 +00:00
Eduardo Habkost
533e8bbb55 q35: No need to check gigabyte_align
gigabyte_align is always true on q35, so we don't need the
!gigabyte_align compat code anymore.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2016-02-25 13:14:19 +02:00
Eduardo Habkost
75fb3d286e q35: Remove unused q35-acpi-dsdt.aml file
The file was used only by older machine-types, and it is not
needed anymore.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2016-02-25 13:14:19 +02:00
Eduardo Habkost
18d6abae3e ich9: Remove enable_tco arguments from init functions
The enable_tco arguments are always true, so they are not needed
anymore.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2016-02-25 13:14:19 +02:00
Eduardo Habkost
d6b304ba92 machine: Remove no_tco field
The field is always set to zero, so it is not necessary anymore.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2016-02-25 13:14:19 +02:00
Eduardo Habkost
86165b499e q35: Remove old machine versions
Migration with q35 was not possible before commit
04329029a8, because q35
unconditionally creates an ich9-ahci device, that was marked as
unmigratable. So all q35 machine classes before pc-q35-2.4 were
not migratable, so there's no point in keeping compatibility code
for them.

Remove all old pc-q35 machine classes and keep only pc-q35-2.4
and newer.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2016-02-25 13:14:19 +02:00
Michael S. Tsirkin
5602b39ff3 tests/vhost-user-bridge: fix build on 32 bit systems
Mainly casts between void * and uint64_t, and wrong
format for size_t.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-25 13:14:19 +02:00
Paolo Bonzini
fee089e4e2 vring: remove
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-25 13:14:19 +02:00
Paolo Bonzini
e24a47c5b7 virtio-scsi: do not use vring in dataplane
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-25 13:14:19 +02:00
Paolo Bonzini
03de2f5274 virtio-blk: do not use vring in dataplane
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-25 13:14:18 +02:00
Paolo Bonzini
2906cddfec virtio-blk: fix "disabled data plane" mode
In disabled mode, virtio-blk dataplane seems to be enabled, but flow
actually goes through the normal virtio path.  This patch simplifies a bit
the handling of disabled mode.  In disabled mode, virtio_blk_handle_output
might be called even if s->dataplane is not NULL.

This is a bit tricky, because the current check for s->dataplane will
always trigger, causing a continuous stream of calls to
virtio_blk_data_plane_start.  Unfortunately, these calls will not
do anything.  To fix this, set the "started" flag even in disabled
mode, and skip virtio_blk_data_plane_start if the started flag is true.
The resulting changes also prepare the code for the next patch, were
virtio-blk dataplane will reuse the same virtio_blk_handle_output function
as "regular" virtio-blk.

Because struct VirtIOBlockDataPlane is opaque in virtio-blk.c, we have
to move s->dataplane->started inside struct VirtIOBlock.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-25 13:14:18 +02:00
Paolo Bonzini
adb3feda8d virtio: export vring_notify as virtio_should_notify
Virtio dataplane needs to trigger the irq manually through the
guest notifier.  Export virtio_should_notify so that it can be
used around event_notifier_set.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-25 13:14:18 +02:00
Paolo Bonzini
a1afb6062e virtio: add AioContext-specific function for host notifiers
This is used to register ioeventfd with a dataplane thread.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-25 13:14:18 +02:00
Paolo Bonzini
8b1fe1cedf vring: make vring_enable_notification return void
Make the API more similar to the regular virtqueue API.  This will
help when modifying the code to not use vring.c anymore.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-25 13:14:18 +02:00
Paolo Bonzini
ef0716df7f block-migration: acquire AioContext as necessary
This is needed because dataplane will run during block migration as well.

The block device migration code is quite liberal in taking the iothread
mutex.  For simplicity, keep it the same way, even though one could
actually choose between the BQL (for regular BlockDriverStates) and
the AioContext (for dataplane BlockDriverStates).  When the block layer
is made fully thread safe, aio_context_acquire shall go away altogether.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-02-25 13:14:18 +02:00
Cao jin
9ae91bc43f pci core: function pci_bus_init() cleanup
remove unused param

Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
2016-02-25 13:14:18 +02:00
Cao jin
3dbc01ae87 pci core: function pci_host_bus_register() cleanup
remove unused param, and rename the other to a meaningful one.

Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
2016-02-25 13:14:18 +02:00
Vladimir Sementsov-Ogievskiy
2b75f84823 balloon: Use only 'pc-dimm' type dimm for ballooning
For now there are only two dimm's: pc-dimm and nvdimm. This patch is
actually needed to disable ballooning on nvdimm. But, to avoid future
bugs, instead of disallowing nvdimm, we allow only pc-dimm. So, if
someone adds new dimm which should be balloon-able, then this ability
should be explicitly specified here.

Why ballooning for nvdimm should be disabled for now:

NVDIMM for now is planned to use as a backing store for DAX filesystem
in the guest and thus this memory is excluded from guest memory
management and LRUs.

In this case libvirt running QEMU along with configured balloon almost
immediately inflates balloon and effectively kill the guest as
qemu counts nvdimm as part of the ram.

Counting dimm devices as part of the ram for ballooning was started from
commit 463756d03:
 virtio-balloon: Fix balloon not working correctly when hotplug memory

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-25 13:14:18 +02:00
Vladimir Sementsov-Ogievskiy
e8dc06d225 virtio-balloon: rewrite get_current_ram_size()
Use pc_dimm_built_list() instead of qmp_pc_dimm_device_list()

Actually, Qapi is not related to this internal helper.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-25 13:14:18 +02:00
Peter Maydell
d159148b63 Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.6-20160225' into staging
ppc patch queue for 2016-02-25

Hopefully final queue before qemu-2.6 soft freeze.  Currently
accumulated patches for target-ppc, pseries machine type and related
devices:
    * SLOF firmware update
        - Many new features, including virtio 1.0 non-legacy support
    * H_PAGE_INIT hypercall implementation
    * Small cleanups and bugfixes.

# gpg: Signature made Thu 25 Feb 2016 03:00:56 GMT using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.6-20160225:
  ppc/kvm: Tell the user what might be wrong when using bad CPU types with kvm-hv
  ppc/kvm: Use error_report() instead of cpu_abort() for user-triggerable errors
  spapr: initialize local Error pointer
  hw/ppc/spapr: Implement the h_page_init hypercall
  pseries: Update SLOF firmware image to 20160223

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-25 10:46:06 +00:00
Thomas Huth
388e47c75b ppc/kvm: Tell the user what might be wrong when using bad CPU types with kvm-hv
Using a CPU type that does not match the host is not possible when using
the kvm-hv kernel module - the PVR is checked in the kernel function
kvm_arch_vcpu_ioctl_set_sregs_hv() and rejected with -EINVAL if it
does not match the host.
However, when the user tries to specify a non-matching CPU type, QEMU
currently only reports "kvm_init_vcpu failed: Invalid argument", and
this is of course not very helpful for the user to solve the problem.
So this patch adds a more descriptive error message that tells the
user to specify "-cpu host" instead.

Signed-off-by: Thomas Huth <thuth@redhat.com>
[Removed melodramatic '!' :)]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-25 13:58:44 +11:00
Thomas Huth
072ed5f260 ppc/kvm: Use error_report() instead of cpu_abort() for user-triggerable errors
Setting the KVM_CAP_PPC_PAPR capability can fail if either the KVM
kernel module does not support it, or if the specified vCPU type
is not a 64-bit Book3-S CPU type. For example, the user can trigger
it easily with "-M pseries -cpu G2leLS" when using the kvm-pr kernel
module. So the error should not be reported with cpu_abort() since
this function is rather meant for reporting programming errors than
reporting user-triggerable errors (it prints out all CPU registers
and then calls abort() to kills the program - two things that the
normal user does not expect here) . So let's use error_report() with
exit(1) here instead.
A similar problem exists in the code that sets the KVM_CAP_PPC_EPR
capability, so while we're at it, fix that, too.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-25 13:58:44 +11:00
Greg Kurz
9897e46264 spapr: initialize local Error pointer
This fixes a crash in the target QEMU during migration.

Broken in commit c5f54f3.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[reworded commit message]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-25 13:58:44 +11:00
Thomas Huth
3240dd9a69 hw/ppc/spapr: Implement the h_page_init hypercall
This hypercall either initializes a page with zeros, or copies
another page.
According to LoPAPR, the i-cache of the page should also be
flushed if using H_ICACHE_INVALIDATE or H_ICACHE_SYNCHRONIZE,
and the d-cache should be synchronized to the RAM if the
H_ICACHE_SYNCHRONIZE flag is used. For this, two new functions
are introduced, kvmppc_dcbst_range() and kvmppc_icbi()_range, which
use the corresponding assembler instructions to flush the caches
if running with KVM on Power. If the code runs with TCG instead,
the code only uses tb_flush(), assuming that this will be
enough for synchronization.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-25 13:58:44 +11:00
Alexey Kardashevskiy
4f7ab0cdbc pseries: Update SLOF firmware image to 20160223
The main change is virtio 1.0 support.

The complete changelog is:
  > dhcp: fix warning messages when calling strtoip()
  > virtio-scsi: enable virtio 1.0
  > virtio-scsi: use virtio_fill desc api
  > virtio-scsi: use idx during initialization
  > virtio-net: enable virtio 1.0
  > virtio-blk: enable virtio 1.0
  > virtio: 1.0 helper to read 16/32/64 bit value
  > virtio: add and enable 1.0 device setup
  > virtio: 1.0 guest features negotiation
  > virtio: update features set/get register accessor
  > virtio: make all virtio apis 1.0 aware
  > virtio: add 64-bit virtio helpers for 1.0
  > virtio: add virtio 1.0 related struct and defines
  > virtio: get rid of type variable in virtio_device
  > virtio-net: move setup-mac to the open routine
  > virtio-net: make net_hdr_size a variable
  > virtio-net: replace vq array with vq_{tx,rx}
  > virtio-net: use virtio_fill_desc
  > virtio-{net,blk,scsi,9p}: use status variable
  > virtio-blk: add helpers for filling descriptors
  > virtio-{blk,9p}: enable resetting the device
  > virtio: introduce helper for initializing virt queue
  > virtio: fix code style/design issues.
  > fix code style in byteorder.h
  > pci: add byte read/write helper routines
  > virtio-net: fix gcc warnings (-Wextra)
  > virtio-blk: fix gcc warnings (-Wextra)
  > readme: Add a note about coding style
  > dhcp: Remove duplicated strtoip()
  > ethernet: Fix gcc warnings
  > net-snk: Fix gcc warnings
  > net-snk: Fix coding style
  > net-snk: Fix memory leak in dhcp6_process_options()
  > net-snk: Fix memory leak in ip6_to_multicast_mac() / send_ipv6()
  > net-snk: Remove bad NEIGHBOUR_SOLICITATION code in send_ipv6()
  > Fix dma-alloc and dma-map-in functions on board-js2x
  > net-snk: Allow stateless autoconfig IPv6 addresses with IP_INIT_IPV6_MANUAL
  > net-snk: Simplify the ip6_is_multicast() function
  > net-snk: Move global variable definition out of the header file
  > net-snk: Prefer non-link-local unicast IPv6 addresses if possible
  > net-snk: Fix the check for link-local addresses when receiving RAs
  > net-snk: Remove junk at the end of IPv6 TFTP ACK and error packets
  > Fix format strings in usb-ohci.c
  > net-snk: Get rid of junk at the end of sent DHCPv6 packets
  > net-snk: Use transaction IDs in DHCPv4, too
  > net-snk: Make use of DHCPv6 transaction IDs
  > net-snk: Seed the pseudo-random number generator
  > libc: Add srand() call
  > libc: Fix the rand() function to return non-zero values
  > net-snk: Improve printed text when booting via network
  > Increase temporary buffer size of ibm,client-architecture-support call
  > Move archsupport.fs into board-qemu directory
  > boot: stop booting when we encounter HALT
  > fat-files: Fix bug with root-entries = 0 on certain FAT32 file systems
  > usb: print unhandled descriptor in debug mode
  > Improve stack usage with libnvram get_partition function
  > Improve stack usage in libnvram environment variable code
  > libc: Port vsnprintf back from skiboot
  > Move the code for rfill into a separate function
  > Rework wrapper for new_nvram_partition() and fix possible bug in there
  > Stack optimization in libusb: split up setup_new_device()
  > Check for stack overflow in paflof engine
  > Clean up pending packet variable in ipv4 code
  > Fix tracking of pending outgoing packets when handling ARP replies

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-25 13:58:26 +11:00
Laurent Vivier
f894efd199 linux-user: add getrandom() syscall
getrandom() has been introduced in kernel 3.17 and is now used during
the boot sequence of Debian unstable (stretch/sid).

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-02-24 15:22:15 +02:00
Riku Voipio
93a92d3bd6 linux-user: correct timerfd_create syscall numbers
x86, m68k, ppc, sh4 and sparc failed to enable timerfd, because they
didn't have timerfd_create system call defined. Instead QEMU
defined timerfd syscall. Checking with kernel sources, it appears
kernel developers reused timerfd syscall number with timerfd_create,
presumably since no userspace called the old syscall number.

Reported-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-02-23 21:25:10 +02:00
Riku Voipio
13756fb008 linux-user: remove unavailable syscalls from aarch64
QEMU lists deprecated system call numbers in for Aarch64. These
are never enabled for Linux kernel, so don't define them in Qemu
either. Remove the ifdef around host_to_target_stat64 since
all architectures need it now.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-02-23 21:25:10 +02:00
Riku Voipio
7c73d2a3fa linux-user: sync syscall numbers with kernel
Sync syscall numbers to match the linux v4.5-rc1 kernel.

Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-02-23 21:25:10 +02:00
Peter Maydell
b6e17875f2 linux-user: Don't assert if guest tries shmdt(0)
Our implementation of shmat() and shmdt() for linux-user was
using "zero guest address" as its marker for "entry in the
shm_regions[] array is not in use". This meant that if the
guest did a shmdt(0) we would match on an unused array entry
and call page_set_flags() with both start and end addresses zero,
which causes an assertion failure.

Use an explicit in_use flag to manage the shm_regions[] array,
so that we avoid this problem.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reported-by: Pavel Shamis <pasharesearch@gmail.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-02-23 21:25:09 +02:00
Laurent Vivier
de3f1b9841 linux-user: set ppc64/ppc64le default CPU to POWER8
Set the default to the latest CPU version to have the
largest set of available features.

It is also really needed in little-endian mode because
POWER7 is not really supported in this mode and some distros
(at least debian) generate POWER8 code for their ppc64le target.

Fixes: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=813698

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-02-23 21:25:09 +02:00
Lluís Vilanova
460c579f3d build: [linux-user] Rename "syscall.h" to "target_syscall.h" in target directories
This fixes double-definitions in linux-user builds when using the UST
tracing backend (which indirectly includes the system's "syscall.h").

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-02-23 21:25:09 +02:00
Laurent Vivier
5089c7ce82 linux-user: fix realloc size of target_fd_trans.
target_fd_trans is an array of "TargetFdTrans *": compute size
accordingly. Use g_renew() as proposed by Paolo.

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-02-23 21:25:09 +02:00
Peter Maydell
7bd57b5150 Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20160223' into staging
Queued TCG patches

# gpg: Signature made Tue 23 Feb 2016 18:27:44 GMT using RSA key ID 4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"

* remotes/rth/tags/pull-tcg-20160223:
  tcg: Remove unnecessary osdep.h includes from tcg-target.inc.c
  scripts/clean-includes: Ignore .inc.c files
  tcg: Rename tcg-target.c to tcg-target.inc.c
  target-sparc: Use global registers for the register window
  target-sparc: Tidy global register initialization
  tcg: Allocate indirect_base temporaries in a different order
  tcg: Implement indirect memory registers
  tcg: Work around clang bug wrt enum ranges, part 2

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-23 18:49:30 +00:00
Peter Maydell
c3b7f66800 tcg: Remove unnecessary osdep.h includes from tcg-target.inc.c
Commit 757e725b58 added a number of #include "qemu/osdep.h"
files to the tcg-target.c files (as they were named at the time).
These are unnecessary because these files are not standalone C
files, and the tcg/tcg.c file which includes them will have
already included osdep.h on their behalf. Remove the unneeded
include directives.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1456238983-10160-4-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-23 08:31:03 -08:00
Peter Maydell
f8e1f5d6a2 scripts/clean-includes: Ignore .inc.c files
Ignore files which have a .inc.c extension -- these are not headers
but they are not standalone C source files either, so we can't make
any automated decisions about what #include directives they should
have.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1456238983-10160-3-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-23 08:30:59 -08:00
Peter Maydell
ce15110981 tcg: Rename tcg-target.c to tcg-target.inc.c
Rename the per-architecture tcg-target.c files to tcg-target.inc.c.
This makes it clearer that they are not intended to be standalone
C files, but are instead #included into another source file.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1456238983-10160-2-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-23 08:30:38 -08:00
Richard Henderson
d2dc4069e0 target-sparc: Use global registers for the register window
Via indirection off cpu_regwptr.

Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-23 08:28:21 -08:00
Peter Maydell
1b1624092d Merge remote-tracking branch 'remotes/spice/tags/pull-spice-20160223-1' into staging
spice: initial opengl/virgl support, postcopy migration fix.

# gpg: Signature made Tue 23 Feb 2016 12:30:40 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/spice/tags/pull-spice-20160223-1:
  Postcopy+spice: Pass spice migration data earlier
  spice/gl: tweak debug messages.
  spice/gl: add unblock timer
  spice: add opengl/virgl/dmabuf support
  spice: reset cursor on resize
  egl-helpers: add functions for render nodes and dma-buf passing
  configure: add dma-buf support detection.
  spice: init dcl before registering qxl interface

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-23 16:14:17 +00:00
Richard Henderson
0ea63844c2 target-sparc: Tidy global register initialization
Create tables for the various global registers that need allocation.
Remove one level of indirection from  gregnames and fregnames.

Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-23 08:07:14 -08:00
Richard Henderson
91478cefaa tcg: Allocate indirect_base temporaries in a different order
Since we've not got liveness analysis for indirect bases,
placing them at the end of the call-saved registers makes
it more likely that it'll stay live.

Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-23 08:07:14 -08:00
Richard Henderson
b3915dbbdc tcg: Implement indirect memory registers
That is, global_mem registers whose base is another global_mem
register, rather than a fixed register.

Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-23 08:07:14 -08:00
Richard Henderson
869938ae2a tcg: Work around clang bug wrt enum ranges, part 2
A previous patch patch changed the type of REG from int
to enum TCGReg, which provokes the following bug in clang:

  https://llvm.org/bugs/show_bug.cgi?id=16154

Signed-off-by: Richard Henderson  <rth@twiddle.net>
2016-02-23 08:07:14 -08:00
Peter Maydell
3174c64bb7 tracetool: Include osdep.h in generated-ust.c
When generating the trace/generated-ust.c source file, make sure
it includes osdep.h as its first include.

This fixes compilation with --enable-trace-backends=ust

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1456240661-15422-1-git-send-email-peter.maydell@linaro.org
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-23 15:43:30 +00:00
Peter Maydell
90ce6e2644 include: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

NB: If this commit breaks compilation for your out-of-tree
patchseries or fork, then you need to make sure you add
#include "qemu/osdep.h" to any new .c files that you have.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-23 12:43:05 +00:00
Peter Maydell
974dc73d77 all: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
---
This just catches a couple of stragglers since I posted
the last clean-includes patchset last week.
2016-02-23 12:43:05 +00:00
Peter Maydell
30456d5ba3 all: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-23 12:43:05 +00:00
Peter Maydell
b1e34d1c3a osdep.h: Include config-target.h if NEED_CPU_H is defined
NEED_CPU_H is the define we use to distinguish per-target object
compilation from common object compilation. For the former, we must
also include config-target.h so that the .c files see the necessary
CONFIG_ constants.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-23 12:43:05 +00:00
Peter Maydell
d57106a4b6 scripts/clean-includes: Add --all option
Add a --all option which will run the script on every C
source and header file in the repository (except for those
in a few directories which contain standalone guest code).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-23 12:43:05 +00:00
Peter Maydell
fd3e39a40c scripts/clean-includes: Enhance to handle header files
Enhance clean-includes to handle header files as well as .c source
files. For headers we merely remove all the redundant #include
lines, including any includes of qemu/osdep.h itself.

There is a simple mollyguard on the include file processing to
skip a few key headers like osdep.h itself, to avoid producing
bad patches if the script is run on every file in include/.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-23 12:43:05 +00:00
Peter Maydell
e78490c44c disas/arm-a64.cc: Include osdep.h first
Rearrange include directives so that we include osdep.h first.
This has to be done manually because clean-includes doesn't
handle C++.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-23 12:43:04 +00:00
Peter Maydell
79f56d82f8 osdep.h: Define macros for the benefit of C++ before C++11
For C++ before C++11, <stdint.h> requires definition of the macros
__STDC_CONSTANT_MACROS, __STDC_LIMIT_MACROS and __STDC_FORMAT_MACROS
in order to enable definition of various macros by the header file.
Define these in osdep.h, so that we get the right header file
definitions whether osdep.h is being used by plain C, C++11 or
older C++.

In particular libvixl's header files depend on this and won't
compile if osdep.h is included before them otherwise.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-23 12:43:04 +00:00
Peter Maydell
1ef26b1f30 cpu: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-23 12:43:04 +00:00
Dr. David Alan Gilbert
b82fc321bf Postcopy+spice: Pass spice migration data earlier
Spice hooks the migration status changes to figure out when to
transmit information to the new spice server; but the migration
status in postcopy doesn't quite fit - the destination starts
running before the end of the source migration.

It's not a case of hanging off the migration status change to
postcopy-active either, since that happens before we stop the
guest CPU.

Fix it by sending a notify just after sending the device state,
and adding a flag that can be tested by the notify receiver.

Symptom:
   spice handover doesn't work with the error:
   red_worker.c:11540:display_channel_wait_for_migrate_data: timeout

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-id: 1456161452-25318-1-git-send-email-dgilbert@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-23 12:05:02 +01:00
Gerd Hoffmann
22672a3798 spice/gl: tweak debug messages.
Adjust message levels, make messages more verbose.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-23 12:04:40 +01:00
Gerd Hoffmann
8e388e907b spice/gl: add unblock timer
Pure debug aid, print a warning in case unblocking
doesn't happen within one second.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-02-23 12:04:40 +01:00
Gerd Hoffmann
474114b730 spice: add opengl/virgl/dmabuf support
This adds support for dma-buf passing to spice.  This makes virtio-gpu
with 3d acceleration work with spice.

Workflow:
 * virglrenderer renders the guest command stream into a texture.
 * qemu exports the texture as dma-buf and passes on that dma-buf
   to spice-server.
 * spice-server passes the dma-buf to spice-client, using unix
   socket file descriptor passing.
 * spice-client asks the window systems composer to render the
   dma-buf to the screen.

Requires cutting edge spice (server) and spice-gtk (client) builds,
from git master branch.

Also requires libvirt managing your qemu instance, and using
"virt-viewer --attach $guest".  libvirt will connect spice-server and
spice-client using unix sockets instead of tcp sockets then, which
is required for file descriptor passing.

Works for the local case (spice server and client on the same machine)
only.  Supporting remote too is planned (by feeding the dma-bufs into
gpu-assisted video encoder), but not there yet.

gl mode is turned off by default, use "-spice gl=on,$otherargs" to
enable it.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-23 12:04:39 +01:00
Marc-André Lureau
58c7b618f3 spice: reset cursor on resize
Spice server will clear the cursor on resize. QXL driver reset it after
resize, however, virtio and other devices do not. Teach qemu to set it
back.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-23 12:04:39 +01:00
Gerd Hoffmann
1e3165980c egl-helpers: add functions for render nodes and dma-buf passing
Adds helpers to open a drm render node and create a opengl
context for it.  Also add a helper to export a texture as
dma-buf.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-02-23 12:04:39 +01:00
Gerd Hoffmann
014cb152b8 configure: add dma-buf support detection.
Set CONFIG_OPENGL_DMABUF in case both mesa and libepoxy are
new enough to have support for dma-buf import/export.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-02-23 12:04:39 +01:00
Gerd Hoffmann
b5e751b51f spice: init dcl before registering qxl interface
Without this spice might callback into qemu before ssd->dcl.con is
initialized, resulting in a segfault due to NULL pointer dereference.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-02-23 12:04:39 +01:00
Peter Maydell
ea6e4981bf Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20160223-1' into staging
usb: misc bugfixes.

# gpg: Signature made Tue 23 Feb 2016 10:53:01 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-usb-20160223-1:
  ohci: allocate timer only once.
  usb: add pid check at the first of uhci_handle_td()
  usb: check RNDIS buffer offsets & length
  usb: check RNDIS message length
  tusb6010: move from hw/timer to hw/usb
  usb: check USB configuration descriptor object

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-23 10:57:31 +00:00
Vladimir Sementsov-Ogievskiy
39de99843e move get_current_ram_size to virtio-balloon.c
get_current_ram_size() is used only in virtio-balloon.c
This patch moves it into virtio-balloon and make it static, to allow
some balloon-specific tuning.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-23 12:55:16 +02:00
Michael S. Tsirkin
ffe42cc14c vhost-user: don't merge regions with different fds
vhost currently merges regions with contiguious virtual and physical
addresses.  This breaks for vhost-user since that also needs fds to
match.

Add a vhost_ops entry to compare the fds for vhost-user only.

Cc: qemu-stable@nongnu.org
Cc: Victor Kaplansky <victork@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-23 12:55:16 +02:00
Michael S. Tsirkin
b54ca0c3df bios-linker-loader: document+validate input
While guest/host ABI is documented in hw/acpi/bios-linker-loader.c,
the API was left undocumented.

This adds documentation for all API functions.

Additionally, input is validated to make sure all
pointers fall within range of provided files.

To allow this validation for checksum commands,
bios_linker_loader_add_checksum is changed to accept GArray * in place
of void *.

Reported-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-23 12:55:16 +02:00
Gerd Hoffmann
fa1298c2d6 ohci: allocate timer only once.
Allocate timer once, at init time, instead of allocating/freeing
it all the time when starting/stopping the bus.  Simplifies the
code, also fixes bugs (memory leak) due to missing checks whenever
the time is already allocated or not.

Cc: Prasad J Pandit <pjp@fedoraproject.org>
Reported-by: Zuozhi Fzz <zuozhi.fzz@alibaba-inc.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-23 11:13:18 +01:00
Gonglei
5f77e06baa usb: add pid check at the first of uhci_handle_td()
pid can be gotten from uhci device memory in uhci_handle_td(),
so the guest can trigger assert qemu if we get an invalid pid.
And the uhci spec 2.1.2 tells us The Host Controller sets Host
Controller Process Error bit to 1 when it detects a fatal error
and indicates that the Host Controller suffered a consistency
check failure while processing a Transfer Descriptor. An example
of a consistency check failure would be finding an illegal PID
field while processing the packet header portion of the TD.
When this error occurs, the Host Controller clears the Run/Stop
bit in the Command register to prevent further schedule execution.

We'd better to set UHCI_STS_HCPERR and kick an interrupt, check
the pid value at the first of uhci_handle_td function.

https://bugzilla.redhat.com/show_bug.cgi?id=1070027

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-id: 1455867238-4720-1-git-send-email-arei.gonglei@huawei.com

[ applied minor codestyle fix ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-23 10:38:01 +01:00
Prasad J Pandit
fe3c546c5f usb: check RNDIS buffer offsets & length
When processing remote NDIS control message packets,
the USB Net device emulator uses a fixed length(4096) data buffer.
The incoming informationBufferOffset & Length combination could
overflow and cross that range. Check control message buffer
offsets and length to avoid it.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1455648821-17340-3-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-23 10:38:01 +01:00
Prasad J Pandit
64c9bc181f usb: check RNDIS message length
When processing remote NDIS control message packets, the USB Net
device emulator uses a fixed length(4096) data buffer. The incoming
packet length could exceed this limit. Add a check to avoid it.

Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1455648821-17340-2-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-23 10:38:00 +01:00
Peter Maydell
14ec7b2c5b tusb6010: move from hw/timer to hw/usb
The TUSB6010 is a USB controller (as the name suggests). Move it from
hw/timer (where it was accidentally filed in 2013 when we moved
everything out of hw/) to hw/usb.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1455883404-10976-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-23 10:38:00 +01:00
Prasad J Pandit
80eecda8e5 usb: check USB configuration descriptor object
When processing remote NDIS control message packets, the USB Net
device emulator checks to see if the USB configuration descriptor
object is of RNDIS type(2). But it does not check if it is null,
which leads to a null dereference error. Add check to avoid it.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1455188480-14688-1-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-23 10:38:00 +01:00
Dimitris Aragiorgis
96c33a4523 log: Redirect stderr to logfile if deamonized
In case of daemonize, use the logfile passed with the -D option in
order to redirect stderr to it instead of /dev/null.

Also remove some unused code in log.h.

Signed-off-by: Dimitris Aragiorgis <dimara@arrikto.com>
Message-Id: <1455795518-19205-1-git-send-email-dimara@arrikto.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-22 18:40:29 +01:00
Peter Xu
d42a0d1484 dump-guest-memory: add qmp event DUMP_COMPLETED
One new QMP event DUMP_COMPLETED is added. When a dump finishes, one
DUMP_COMPLETED event will occur to notify the user.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by:   Fam Zheng <famz@redhat.com>
Message-Id: <1455772616-8668-12-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-22 18:40:29 +01:00
Peter Xu
4a6b52d67e Dump: add hmp command "info dump"
It will calculate percentage of finished work from completed and
total.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1455772616-8668-11-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-22 18:40:28 +01:00
Peter Xu
39ba2ea61f Dump: add qmp command "query-dump"
When dump-guest-memory is requested with detach flag, after its
return, user could query its status using "query-dump" command (with
no argument). The result contains:

- status: current dump status
- completed: bytes written in the latest dump
- total: bytes to write in the latest dump

From completed and total, we could know how much work
finished by calculating:

  100.0 * completed / total (%)

Reviewed-by:   Fam Zheng <famz@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1455772616-8668-10-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-22 18:40:28 +01:00
Peter Xu
2264c2c96e DumpState: adding total_size and written_size fields
Here, total_size is the size in bytes to be dumped (raw data, which
means before compression), while written_size are bytes handled (raw
size too).

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1455772616-8668-9-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-22 18:40:28 +01:00
Peter Xu
1fbeff72c2 dump-guest-memory: add "detach" support
If "detach" is provided, one thread is created to do the dump work,
while main thread will return immediately. For each GuestPhysBlock,
adding one more field "mr" to points to MemoryRegion that it
belongs, also ref the mr before use.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1455772616-8668-8-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-22 18:40:28 +01:00
Peter Xu
63e27f28f2 dump-guest-memory: disable dump when in INMIGRATE state
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1455772616-8668-7-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-22 18:40:28 +01:00
Peter Xu
ca1fc8c97e dump-guest-memory: introduce dump_process() helper function.
No functional change. Cleanup only.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1455772616-8668-6-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-22 18:40:28 +01:00
Peter Xu
65d64f3623 dump-guest-memory: add dump_in_progress() helper function
For now, it has no effect. It will be used in dump detach support.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1455772616-8668-5-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-22 18:40:28 +01:00
Peter Xu
baf28f57e2 dump-guest-memory: using static DumpState, add DumpStatus
Instead of malloc/free each time for DumpState, make it
static. Added DumpStatus to show status for dump.

This is to be used for detached dump.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1455772616-8668-4-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-22 18:40:28 +01:00
Peter Xu
228de9cf1d dump-guest-memory: add "detach" flag for QMP/HMP interfaces.
This patch only adds the interfaces, but does not implement them.
"detach" parameter is made optional, to make sure that all the old
dump-guest-memory requests will still be able to work.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1455772616-8668-3-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-22 18:40:28 +01:00
Peter Xu
e3517a5299 dump-guest-memory: cleanup: removing dump_{error|cleanup}().
It might be a little bit confusing and error prone to do
dump_cleanup() in these two functions. A better way is to do
dump_cleanup() before dump finish, no matter whether dump has
succeeded or not.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1455772616-8668-2-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-22 18:40:28 +01:00
Fam Zheng
cf7ea1e60c scripts/kvm/kvm_stat: Fix missing right parantheses and ".format(...)"
They seem to have snuck in when applying Janosch Frank
<frankja@linux.vnet.ibm.com>'s previous patch.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1455848416-13177-1-git-send-email-famz@redhat.com>
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Tested-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-22 18:40:22 +01:00
Peter Maydell
8eb779e422 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches

# gpg: Signature made Mon 22 Feb 2016 15:59:25 GMT using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (34 commits)
  qemu-iotests: 140: make description slightly more verbose
  qemu-iotests: 140: don't use IDE device
  qemu-iotests: 067: ignore QMP events
  blockdev: unset inappropriate flags when changing medium
  MAINTAINERS: Add myself as maintainer of the throttling code
  docs: Document the throttling infrastructure
  qapi: Correct the name of the iops_rd parameter
  qemu-iotests: Extend iotest 093 to test bursts
  throttle: Test throttle_compute_wait() during bursts
  throttle: Check that burst_level leaks correctly
  qapi: Add burst length fields to BlockDeviceInfo
  qapi: Add burst length parameters to block_set_io_throttle
  throttle: Add command-line settings to define the burst periods
  throttle: Add support for burst periods
  throttle: Use throttle_config_init() to initialize ThrottleConfig
  throttle: Merge all functions that check the configuration into one
  throttle: Set always an average value when setting a maximum value
  throttle: Make throttle_is_valid() set errp
  throttle: Make throttle_max_is_missing_limit() set errp
  throttle: Make throttle_conflicting() set errp
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-22 16:55:41 +00:00
Kevin Wolf
fe243e4881 Merge remote-tracking branch 'mreitz/tags/pull-block-for-kevin-2016-02-22' into queue-block
Block patches of the last three weeks.

# gpg: Signature made Mon Feb 22 16:55:33 2016 CET using RSA key ID E838ACAD
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"

* mreitz/tags/pull-block-for-kevin-2016-02-22:
  qemu-iotests: 140: make description slightly more verbose
  qemu-iotests: 140: don't use IDE device
  qemu-iotests: 067: ignore QMP events
  blockdev: unset inappropriate flags when changing medium

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 16:57:50 +01:00
Sascha Silbe
43e15ed4fd qemu-iotests: 140: make description slightly more verbose
Describe in a little more detail what the test is supposed to achieve.

Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Message-id: 1455827853-33477-3-git-send-email-silbe@linux.vnet.ibm.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-22 16:54:14 +01:00
Sascha Silbe
4b84fc70ce qemu-iotests: 140: don't use IDE device
IDE is only implemented by very few architectures (mostly PC). The
test doesn't actually need a block device attached to the
BlockBackend, so just drop it and adjust the reference output
accordingly.

Fixes: 16dee418 ("iotests: Add test for eject under NBD server")
Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Message-id: 1455827853-33477-2-git-send-email-silbe@linux.vnet.ibm.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-22 16:54:14 +01:00
Sascha Silbe
f436c94102 qemu-iotests: 067: ignore QMP events
The relative ordering of "device_del" return value and the
"DEVICE_DELETED" QMP event depends on the architecture being
tested. On x86 unplugging virtio disks is asynchronous
(=qdev_unplug()= → =hotplug_handler_unplug_request()=) while on s390x
it is synchronous (=qdev_unplug()= → =hotplug_handler_unplug()=). This
leads to the actual output on s390x consistently differing from the
reference output (that was probably produced on x86).

The easiest way to address this is to filter out QMP events in
067. The DEVICE_DELETED event is already getting explicitly tested by
the Python-based test case 139, so the test coverage should be
unaffected. Make use of the recently introduced _filter_qmp_events()
to remove QMP events from the test case output and adjust the
reference output accordingly.

The tr / sed / tr trick used for filtering was suggested by Max Reitz
<mreitz@redhat.com>.

Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Message-id: 1455886869-139916-2-git-send-email-silbe@linux.vnet.ibm.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-22 16:54:14 +01:00
Alyssa Milburn
156abc2f90 blockdev: unset inappropriate flags when changing medium
Most importantly, this removes BDRV_O_TEMPORARY, to avoid unlink()ing an
image which replaces a snapshotted one.

Signed-off-by: Alyssa Milburn <fuzzie@fuzzie.org>
Message-id: 20160206133618.GA16635@li141-249.members.linode.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-22 16:54:14 +01:00
Alberto Garcia
d310d85bf4 MAINTAINERS: Add myself as maintainer of the throttling code
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:07 +01:00
Alberto Garcia
1ffad77cde docs: Document the throttling infrastructure
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:07 +01:00
Alberto Garcia
f5a845fdb4 qapi: Correct the name of the iops_rd parameter
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:06 +01:00
Alberto Garcia
a90cade023 qemu-iotests: Extend iotest 093 to test bursts
This patch adds a new test that checks that the burst settings
('iops_max', 'iops_max_length', etc.) of the throttling code work as
expected.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:06 +01:00
Alberto Garcia
f9d058852c throttle: Test throttle_compute_wait() during bursts
This test simulates an I/O burst for more than two seconds and checks
that it works as expected.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:06 +01:00
Alberto Garcia
eb8a1a1cbd throttle: Check that burst_level leaks correctly
This patch expands test_leak_bucket() to check that burst_level leaks
correctly.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:06 +01:00
Alberto Garcia
398befdf50 qapi: Add burst length fields to BlockDeviceInfo
This patch adds the new bps_*_max_length and iops_*_max_length
parameters to the BlockDeviceInfo struct.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:06 +01:00
Alberto Garcia
dce13204a0 qapi: Add burst length parameters to block_set_io_throttle
This patch adds the new bps_*_max_length and iops_*_max_length
parameters to the block_set_io_throttle command.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:06 +01:00
Alberto Garcia
8a0fc18d88 throttle: Add command-line settings to define the burst periods
This patch adds all the throttling.*-max-length command-line
parameters to define the length of the burst periods.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:05 +01:00
Alberto Garcia
100f8f2608 throttle: Add support for burst periods
This patch adds support for burst periods to the throttling code.
With this feature the user can keep performing bursts as defined by
the LeakyBucket.max rate for a configurable period of time.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:05 +01:00
Alberto Garcia
1588ab5d0b throttle: Use throttle_config_init() to initialize ThrottleConfig
We can currently initialize ThrottleConfig by zeroing all its fields,
but this will change with the new fields to define the length of the
burst periods.

This patch introduces a new throttle_config_init() function and uses it
to replace all memset() calls that initialize ThrottleConfig directly.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:05 +01:00
Alberto Garcia
d5851089a8 throttle: Merge all functions that check the configuration into one
There's no need to keep throttle_conflicting(), throttle_is_valid()
and throttle_max_is_missing_limit() as separate functions, so this
patch merges all three into one.

As a consequence, check_throttle_config() becomes redundant and can be
replaced with throttle_is_valid().

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:05 +01:00
Alberto Garcia
6f9b6d57ae throttle: Set always an average value when setting a maximum value
When testing the ranges of valid values, set_cfg_value() creates
sometimes invalid throttling configurations by setting bucket.max
while leaving bucket.avg uninitialized.

While this doesn't break the current tests, it will as soon as
we unify all functions that check the validity of the throttling
configuration.

This patch ensures that the value of bucket.avg is valid when setting
bucket.max.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:05 +01:00
Alberto Garcia
03ba36c83d throttle: Make throttle_is_valid() set errp
The caller does not need to set it, and this will allow us to refactor
this function later.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:04 +01:00
Alberto Garcia
45b2d418e0 throttle: Make throttle_max_is_missing_limit() set errp
The caller does not need to set it, and this will allow us to refactor
this function later.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:04 +01:00
Alberto Garcia
6921b18095 throttle: Make throttle_conflicting() set errp
The caller does not need to set it, and this will allow us to refactor
this function later.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:04 +01:00
Alberto Garcia
3c9242f5ae throttle: Make throttle_compute_timer() static
This function is only used internally in throttle.c

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 14:08:04 +01:00
Peter Maydell
a02dabe10a Merge remote-tracking branch 'remotes/kraxel/tags/pull-ui-20160222-1' into staging
gtk: fix uninitialized temporary VirtualConsole

# gpg: Signature made Mon 22 Feb 2016 08:30:39 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-ui-20160222-1:
  gtk: fix uninitialized temporary VirtualConsole

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-22 11:10:47 +00:00
Kevin Wolf
9bd9c7f5b5 block migration: Activate image on destination before writing to it
When using 'migrate -b', we must make sure to take ownership of the
image before writing to it. Otherwise metadata would be thrown away on
migration completion; this was caught by the assertions introduced in
commit 09e0c771.

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 10:21:15 +01:00
Daniel P. Berrange
a513416ecf qemu-io: use no_argument/required_argument constants
When declaring the 'struct option' array, use the standard
constants no_argument/required_argument, instead of magic
values 0 and 1.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 09:50:05 +01:00
Daniel P. Berrange
aa6e546c5a qemu-nbd: use no_argument/required_argument constants
When declaring the 'struct option' array, use the standard
constants no_argument/required_argument, instead of magic
values 0 and 1.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 09:50:05 +01:00
Daniel P. Berrange
fa8b7ce2c6 qemu-nbd: don't overlap long option values with short options
When defining values for long options, the normal practice is
to start numbering from 256, to avoid overlap with the range
of valid values for short options.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 09:50:05 +01:00
Daniel P. Berrange
eb769f7420 qemu-img: allow specifying image as a set of options args
Currently qemu-img allows an image filename to be passed on the
command line, but unless using the JSON format, it does not have
a way to set any options except the format eg

   qemu-img info https://127.0.0.1/images/centos7.iso

This adds a --image-opts arg that indicates that the positional
filename should be interpreted as a full option string, not
just a filename.

   qemu-img info --image-opts driver=https,url=https://127.0.0.1/images,sslverify=off

This flag is mutually exclusive with the '-f' / '-F' flags.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 09:50:04 +01:00
Daniel P. Berrange
77c9aaefd7 qemu-nbd: allow specifying image as a set of options args
Currently qemu-nbd allows an image filename to be passed on the
command line, but unless using the JSON format, it does not have
a way to set any options except the format eg

   qemu-nbd https://127.0.0.1/images/centos7.iso
   qemu-nbd /home/berrange/demo.qcow2

This adds a --image-opts arg that indicates that the positional
filename should be interpreted as a full option string, not
just a filename.

   qemu-nbd --image-opts driver=https,url=https://127.0.0.1/images,sslverify=off
   qemu-nbd --image-opts driver=file,filename=/home/berrange/demo.qcow2

This flag is mutually exclusive with the '-f' flag.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 09:50:04 +01:00
Daniel P. Berrange
499afa2512 qemu-io: allow specifying image as a set of options args
Currently qemu-io allows an image filename to be passed on the
command line, but unless using the JSON format, it does not have
a way to set any options except the format eg

 qemu-io https://127.0.0.1/images/centos7.iso
 qemu-io /home/berrange/demo.qcow2

By contrast when using the interactive shell, it is possible to
use --option with the 'open' command, or to omit the filename.

This adds a --image-opts arg that indicates that the positional
filename should be interpreted as a full option string, not
just a filename.

 qemu-io --image-opts driver=https,url=https://127.0.0.1/images,sslverify=off
 qemu-io --image-opts driver=qcow2,file.filename=/home/berrange/demo.qcow2

This flag is mutually exclusive with the '-f' flag and with
the '-o' flag to the 'open' command

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 09:50:04 +01:00
Daniel P. Berrange
3babeb153c qemu-img: add support for --object command line arg
Allow creation of user creatable object types with qemu-img
via a new --object command line arg. This will be used to supply
passwords and/or encryption keys to the various block driver
backends via the recently added 'secret' object type.

 # printf letmein > mypasswd.txt
 # qemu-img info --object secret,id=sec0,file=mypasswd.txt \
      ...other info args...

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 09:50:04 +01:00
Daniel P. Berrange
9ba371b634 qemu-io: add support for --object command line arg
Allow creation of user creatable object types with qemu-io
via a new --object command line arg. This will be used to supply
passwords and/or encryption keys to the various block driver
backends via the recently added 'secret' object type.

 # printf letmein > mypasswd.txt
 # qemu-io --object secret,id=sec0,file=mypasswd.txt \
      ...other args...

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 09:50:04 +01:00
Kevin Wolf
12d5ee3a7e block: Fix -incoming with snapshot=on
The BDRV_O_INACTIVE flag should only be set for images explicitly opened
by the user. snapshot=on needs to create a new qcow2 image and write
some metadata to it. This is not a problem because it can't come from
the source, so there's no reason to mark it as BDRV_O_INACTIVE, even
though it is opened while waiting for the migration to complete.

This fixes an assertion failure when -incoming and snapshot=on are
combined.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 09:49:46 +01:00
Vladimir Sementsov-Ogievskiy
bca5a8f462 spec: add qcow2 bitmaps extension specification
The new feature for qcow2: storing bitmaps.

This patch adds new header extension to qcow2 - Bitmaps Extension. It
provides an ability to store virtual disk related bitmaps in a qcow2
image. For now there is only one type of such bitmaps: Dirty Tracking
Bitmap, which just tracks virtual disk changes from some moment.

Note: Only bitmaps, relative to the virtual disk, stored in qcow2 file,
should be stored in this qcow2 file. The size of each bitmap
(considering its granularity) is equal to virtual disk size.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 09:49:46 +01:00
Changlong Xie
f38738e212 quorum: fix segfault when read fails in fifo mode
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 09:49:46 +01:00
John Snow
2875645b65 qemu-img: initialize MapEntry object
Commit 16b0d555 introduced an issue where we are not initializing
has_filename for the 'next' MapEntry object, which leads to interesting
errors in both Valgrind and Clang -fsanitize=undefined.

Zero the stack object at allocation AND make sure the utility to
populate the fields properly marks has_filename as false if applicable.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-22 09:49:46 +01:00
Paolo Bonzini
327d83ba71 gtk: fix uninitialized temporary VirtualConsole
Only the echo field is used in the temporary VirtualConsole, so the
damage was limited.  But still, if echo was incorrectly set to true,
the result would be some puzzling output in VTE monitor and serial
consoles.

Fixes: fba958c692
Cc: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1455015557-15106-2-git-send-email-pbonzini@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-22 08:38:42 +01:00
Edgar E. Iglesias
c3bce9d5f9 etraxfs_dma: Dont forward zero-length payload to clients
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-02-20 00:17:48 +01:00
Peter Maydell
586d1a99ff Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20160219.1' into staging
VFIO updates 2016-02-19

 - AER pre-enable and misc fixes (Cao jin and Chen Fan)
 - PCI_CAP_LIST_NEXT & PCI_MSIX_FLAGS cleanup (Wei Yang)
 - AMD XGBE KVM platform passthrough (Eric Auger)

# gpg: Signature made Fri 19 Feb 2016 17:28:36 GMT using RSA key ID 3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"

* remotes/awilliam/tags/vfio-update-20160219.1:
  vfio/pci: use PCI_MSIX_FLAGS on retrieving the MSIX entries
  hw/arm/sysbus-fdt: remove qemu_fdt_setprop returned value check
  hw/arm/sysbus-fdt: enable amd-xgbe dynamic instantiation
  hw/arm/sysbus-fdt: helpers for clock node generation
  device_tree: qemu_fdt_getprop_cell converted to use the error API
  device_tree: qemu_fdt_getprop converted to use the error API
  device_tree: introduce qemu_fdt_node_path
  device_tree: introduce load_device_tree_from_sysfs
  hw/vfio/platform: amd-xgbe device
  vfio/pci: replace 1 with PCI_CAP_LIST_NEXT to make code self-explain
  pcie_aer: expose pcie_aer_msg() interface
  aer: impove pcie_aer_init to support vfio device
  vfio: make the 4 bytes aligned for capability size
  pcie: modify the capability size assert

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-19 17:44:24 +00:00
Peter Maydell
a40db1b36b qemu-options.hx: Improve documentation of chardev multiplexing mode
The current documentation of chardev mux=on is rather brief and opaque;
expand it to hopefully be a bit more helpful.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1455643738-6068-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-19 18:27:56 +01:00
Peter Maydell
3ba32c100a Merge remote-tracking branch 'remotes/pmaydell/tags/pull-softfloat-20160219' into staging
softfloat queue:
 * update MAINTAINERS with a section for softfloat
 * drop all the uses of int_fast*_t types

# gpg: Signature made Fri 19 Feb 2016 16:34:35 GMT using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"

* remotes/pmaydell/tags/pull-softfloat-20160219:
  MAINTAINERS: Add section for FPU emulation
  osdep.h: Remove int_fast*_t Solaris compatibility code
  fpu: Use plain 'int' rather than 'int_fast16_t' for exponents
  fpu: Use plain 'int' rather than 'int_fast16_t' for shift counts
  fpu: Remove use of int_fast16_t in conversions to int16
  target-mips: Stop using uint_fast*_t types in r4k_tlb_t struct

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-19 16:49:49 +00:00
Wei Yang
b58b17f744 vfio/pci: use PCI_MSIX_FLAGS on retrieving the MSIX entries
Even PCI_CAP_FLAGS has the same value as PCI_MSIX_FLAGS, the later one is
the more proper on retrieving MSIX entries.

This patch uses PCI_MSIX_FLAGS to retrieve the MSIX entries.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-19 09:42:32 -07:00
Eric Auger
c89e91a76b hw/arm/sysbus-fdt: remove qemu_fdt_setprop returned value check
qemu_fdt_setprop asserts in case of error hence no need to check
the returned value.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-19 09:42:31 -07:00
Eric Auger
cf5a13e370 hw/arm/sysbus-fdt: enable amd-xgbe dynamic instantiation
This patch allows the instantiation of the vfio-amd-xgbe device
from the QEMU command line (-device vfio-amd-xgbe,host="<device>").

The guest is exposed with a device tree node that combines the description
of both XGBE and PHY (representation supported from 4.2 onwards kernel):
Documentation/devicetree/bindings/net/amd-xgbe.txt.

There are 5 register regions, 6 interrupts including 4 optional
edge-sensitive per-channel interrupts.

Some property values are inherited from host device tree. Host device tree
must feature a combined XGBE/PHY representation (>= 4.2 host kernel).

2 clock nodes (dma and ptp) also are created. It is checked those clocks
are fixed on host side.

AMD XGBE node creation function has a dependency on vfio Linux header and
more generally node creation function for VFIO platform devices only make
sense with CONFIG_LINUX so let's protect this code with #ifdef CONFIG_LINUX.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-19 09:42:31 -07:00
Eric Auger
9481cf2e5f hw/arm/sysbus-fdt: helpers for clock node generation
Some passthrough'ed devices depend on clock nodes. Those need to be
generated in the guest device tree. This patch introduces some helpers
to build a clock node from information retrieved in the host device tree.

- copy_properties_from_host copies properties from a host device tree
  node to a guest device tree node
- fdt_build_clock_node builds a guest clock node and checks the host
  fellow clock is a fixed one.

fdt_build_clock_node will become static as soon as it gets used. A
dummy pre-declaration is needed for compilation of this patch.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-19 09:42:31 -07:00
Eric Auger
58e71097ce device_tree: qemu_fdt_getprop_cell converted to use the error API
This patch aligns the prototype with qemu_fdt_getprop. The caller
can choose whether the function self-asserts on error (passing
&error_fatal as Error ** argument, corresponding to the legacy behavior),
or behaves differently such as simply output a message.

In this later case the caller can use the new lenp parameter to interpret
the error if any.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-19 09:42:30 -07:00
Eric Auger
78e24f235e device_tree: qemu_fdt_getprop converted to use the error API
Current qemu_fdt_getprop exits if the property is not found. It is
sometimes needed to read an optional property, in which case we do
not wish to exit but simply returns a null value.

This patch converts qemu_fdt_getprop to accept an Error **, and existing
users are converted to pass &error_fatal. This preserves the existing
behaviour. Then to use the API with your optional semantic a null
parameter can be conveyed.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-19 09:42:30 -07:00
Eric Auger
6d79566ae6 device_tree: introduce qemu_fdt_node_path
This new helper routine returns a NULL terminated array of
node paths matching a node name and a compat string.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-19 09:42:30 -07:00
Eric Auger
60e43e987c device_tree: introduce load_device_tree_from_sysfs
This function returns the host device tree blob from sysfs
(/proc/device-tree). It uses a recursive function inspired
from dtc read_fstree.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-19 09:42:29 -07:00
Eric Auger
62d9551247 hw/vfio/platform: amd-xgbe device
This patch introduces the amd-xgbe VFIO platform device. It
allows the guest to do passthrough on a device exposing an
"amd,xgbe-seattle-v1a" compat string.

Signed-off-by: Eric Auger <eric.auger@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-19 09:42:29 -07:00
Wei Yang
3fc1c182c1 vfio/pci: replace 1 with PCI_CAP_LIST_NEXT to make code self-explain
Use the macro PCI_CAP_LIST_NEXT instead of 1, so that the code would be
more self-explain.

This patch makes this change and also fixs one typo in comment.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-19 09:42:29 -07:00
Chen Fan
40f8f0c31b pcie_aer: expose pcie_aer_msg() interface
For vfio device, we need to propagate the aer error to
Guest OS. we use the pcie_aer_msg() to send aer error
to guest.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-19 09:42:28 -07:00
Chen Fan
8d86ada2a7 aer: impove pcie_aer_init to support vfio device
pcie_aer_init was used to emulate an aer capability for pcie device,
but for vfio device, the aer config space size is mutable and is not
always equal to PCI_ERR_SIZEOF(0x48). it depends on where the TLP Prefix
register required, so here we add a size argument.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-19 09:42:28 -07:00
Chen Fan
88caf177ac vfio: make the 4 bytes aligned for capability size
this function search the capability from the end, the last
size should 0x100 - pos, not 0xff - pos.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-19 09:42:28 -07:00
Chen Fan
79095ef717 pcie: modify the capability size assert
Device's Offset and size can reach PCIE_CONFIG_SPACE_SIZE,
fix the corresponding assert.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-02-19 09:42:27 -07:00
Peter Maydell
1badb58698 MAINTAINERS: Add section for FPU emulation
Add an entry to the MAINTAINERS file for our softfloat FPU
emulation code. This code is only 'odd fixes' but it's useful to
record who to cc on patches to it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1453814875-440-1-git-send-email-peter.maydell@linaro.org
2016-02-19 16:27:22 +00:00
Peter Maydell
50fe4df8ee osdep.h: Remove int_fast*_t Solaris compatibility code
We now do not use the int_fast*_t types anywhere in QEMU, so we can
remove the compatibility definitions we were providing for the
benefit of ancient Solaris versions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 1453807806-32698-5-git-send-email-peter.maydell@linaro.org
2016-02-19 16:27:22 +00:00
Peter Maydell
0c48262d47 fpu: Use plain 'int' rather than 'int_fast16_t' for exponents
Use the plain 'int' type rather than 'int_fast16_t' for handling
exponents. Exponents don't need to be exactly 16 bits, so using int16_t
for them would confuse more than it clarified.

This should be a safe change because int_fast16_t semantics
permit use of 'int' (and on 32-bit glibc that is what you get).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 1453807806-32698-4-git-send-email-peter.maydell@linaro.org
2016-02-19 16:27:22 +00:00
Peter Maydell
07d792d2b0 fpu: Use plain 'int' rather than 'int_fast16_t' for shift counts
Use the plain 'int' type rather than 'int_fast16_t' for shift counts
in the various shift related functions, since we don't actually care
about the size of the integer at all here, and using int16_t would
be confusing.

This should be a safe change because int_fast16_t semantics
permit use of 'int' (and on 32-bit glibc that is what you get).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 1453807806-32698-3-git-send-email-peter.maydell@linaro.org
2016-02-19 16:27:22 +00:00
Peter Maydell
0bb721d721 fpu: Remove use of int_fast16_t in conversions to int16
Make the functions which convert floating point to 16 bit integer
return int16_t rather than int_fast16_t, and correspondingly use
int_fast16_t in their internal implementations where appropriate.

(These functions are used only by the ARM target.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 1453807806-32698-2-git-send-email-peter.maydell@linaro.org
2016-02-19 16:27:21 +00:00
Peter Maydell
d783f78933 target-mips: Stop using uint_fast*_t types in r4k_tlb_t struct
The r4k_tlb_t structure uses the uint_fast*_t types. Most of these
uses are in bitfields and are thus pointless, because the bitfield
itself specifies the width of the type; just use 'unsigned int'
instead. (On glibc uint_fast16_t is defined as either 32 or 64 bits,
so we know the code is not reliant on it being exactly 16 bits.)
There is also one use of uint_fast8_t, which we replace with uint8_t,
because both are exactly 8 bits on glibc and this is the only
place outside the softfloat code which uses an int_fast*_t type.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
2016-02-19 16:27:06 +00:00
Peter Maydell
1b3337bb1d Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2016-02-19' into staging
Error reporting patches for 2016-02-19

# gpg: Signature made Fri 19 Feb 2016 12:47:50 GMT using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"

* remotes/armbru/tags/pull-error-2016-02-19:
  vl: Clean up machine selection in main().
  vl: Set error location when parsing memory options
  replay: Set error location properly when parsing options
  vl: Reset location after handling command-line arguments
  vl.c: Fix regression in machine error message

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-19 15:19:13 +00:00
Peter Maydell
5cfffc30de Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2016-02-19' into staging
QAPI patches for 2016-02-19

# gpg: Signature made Fri 19 Feb 2016 10:10:18 GMT using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"

* remotes/armbru/tags/pull-qapi-2016-02-19:
  qapi: Change visit_start_implicit_struct to visit_start_alternate
  qapi: Don't box branches of flat unions
  qapi: Don't box struct branch of alternate
  qapi-visit: Use common idiom in gen_visit_fields_decl()
  qapi: Emit structs used as variants in topological order
  qapi: Adjust layout of FooList types
  qapi-visit: Less indirection in visit_type_Foo_fields()
  qapi-visit: Unify struct and union visit
  qapi: Visit variants in visit_type_FOO_fields()
  qapi-visit: Simplify how we visit common union members
  qapi: Add tests of complex objects within alternate
  qapi: Forbid 'any' inside an alternate
  qapi: Forbid empty unions and useless alternates
  qapi: Simplify excess input reporting in input visitors
  qapi-visit: Honor prefix of discriminator enum

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-19 14:18:21 +00:00
Markus Armbruster
7580f231cf vl: Clean up machine selection in main().
We set machine_class to the default first, and update it to the real
one later.  Any use of machine_class in between is almost certainly
wrong (there are no such uses right now).  Set it once and for all
instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-02-19 13:46:44 +01:00
Eduardo Habkost
bbe2d25c8f vl: Set error location when parsing memory options
Set error location so the error_report() calls will show
appropriate command-line argument or config file info.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1455303747-19776-5-git-send-email-ehabkost@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-19 13:46:44 +01:00
Eduardo Habkost
890ad5508e replay: Set error location properly when parsing options
Set error location so the error_report() calls will show
appropriate command-line argument or config file info.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1455303747-19776-4-git-send-email-ehabkost@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-19 13:46:44 +01:00
Eduardo Habkost
43fa1e0bd9 vl: Reset location after handling command-line arguments
After looping through all command-line arguments, error location
info becomes obsolete, and any function calling error_report()
will print misleading information. This breaks error reporting
for some option handling, like:

  $ qemu-system-x86_64 -icount rr=x -vnc :0
  qemu-system-x86_64: -vnc :0: Invalid icount rr option: x

  $ qemu-system-x86_64 -m size= -vnc :0
  qemu-system-x86_64: -vnc :0: missing 'size' option value

Fix this by resetting location info as soon as we exit the
command-line handling loop.

With this, replay_configure() and set_memory_options() won't
print any location info yet, but at least they won't print
incorrect information.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1455303747-19776-3-git-send-email-ehabkost@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
["Do not insert code here" comment added to prevent regressions]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-19 13:46:44 +01:00
Marcel Apfelbaum
34f405ae6d vl.c: Fix regression in machine error message
Commit e1ce0c3cb (vl.c: fix regression when reading machine type
from config file) fixed the error message when the machine type
was supplied inside the config file. However now the option name
is not displayed correctly if the error happens when the machine
is specified at command line.

Running
    ./x86_64-softmmu/qemu-system-x86_64 -M q35-1.5 -redir tcp:8022::22
will result in the error message:
    qemu-system-x86_64: -redir tcp:8022::22: unsupported machine type
    Use -machine help to list supported machines

Fixed it by restoring the error location and also extracted the code
dealing with machine options into a separate function.

Reported-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1455303747-19776-2-git-send-email-ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-19 13:46:44 +01:00
Peter Maydell
09125c5e76 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
vhost, virtio, pci, pxe

Fixes all over the place.
New tests for pxe.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Thu 18 Feb 2016 15:46:39 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream:
  tests/vhost-user-bridge: add scattering of incoming packets
  vhost-user interrupt management fixes
  rules: filter out irrelevant files
  change type of pci_bridge_initfn() to void
  dec: convert to realize()
  tests: add pxe e1000 and virtio-pci tests
  msix: fix msix_vector_masked
  virtio: optimize virtio_access_is_big_endian() for little-endian targets
  vhost: simplify vhost_needs_vring_endian()
  vhost: move virtio 1.0 check to cross-endian helper
  virtio: move cross-endian helper to vhost
  vhost-net: revert support of cross-endian vnet headers
  virtio-net: use the backend cross-endian capabilities

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-19 10:50:37 +00:00
Eric Blake
dbf1192262 qapi: Change visit_start_implicit_struct to visit_start_alternate
After recent changes, the only remaining use of
visit_start_implicit_struct() is for allocating the space needed
when visiting an alternate.  Since the term 'implicit struct' is
hard to explain, rename the function to its current usage.  While
at it, we can merge the functionality of visit_get_next_type()
into the same function, making it more like visit_start_struct().

Generated code is now slightly smaller:

| {
|     Error *err = NULL;
|
|-    visit_start_implicit_struct(v, (void**) obj, sizeof(BlockdevRef), &err);
|+    visit_start_alternate(v, name, (GenericAlternate **)obj, sizeof(**obj),
|+                          true, &err);
|     if (err) {
|         goto out;
|     }
|-    visit_get_next_type(v, name, &(*obj)->type, true, &err);
|-    if (err) {
|-        goto out_obj;
|-    }
|     switch ((*obj)->type) {
|     case QTYPE_QDICT:
|         visit_start_struct(v, name, NULL, 0, &err);
...
|     }
|-out_obj:
|-    visit_end_implicit_struct(v);
|+    visit_end_alternate(v);
| out:
|     error_propagate(errp, err);
| }

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1455778109-6278-16-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-19 11:08:57 +01:00
Eric Blake
544a373159 qapi: Don't box branches of flat unions
There's no reason to do two malloc's for a flat union; let's just
inline the branch struct directly into the C union branch of the
flat union.

Surprisingly, fewer clients were actually using explicit references
to the branch types in comparison to the number of flat unions
thus modified.

This lets us reduce the hack in qapi-types:gen_variants() added in
the previous patch; we no longer need to distinguish between
alternates and flat unions.

The change to unboxed structs means that u.data (added in commit
cee2dedb) is now coincident with random fields of each branch of
the flat union, whereas beforehand it was only coincident with
pointers (since all branches of a flat union have to be objects).
Note that this was already the case for simple unions - but there
we got lucky.  Remember, visit_start_union() blindly returns true
for all visitors except for the dealloc visitor, where it returns
the value !!obj->u.data, and that this result then controls
whether to proceed with the visit to the variant.  Pre-patch,
this meant that flat unions were testing whether the boxed pointer
was still NULL, and thereby skipping visit_end_implicit_struct()
and avoiding a NULL dereference if the pointer had not been
allocated.  The same was true for simple unions where the current
branch had pointer type, except there we bypassed visit_type_FOO().
But for simple unions where the current branch had scalar type, the
contents of that scalar meant that the decision to call
visit_type_FOO() was data-dependent - the reason we got lucky there
is that visit_type_FOO() for all scalar types in the dealloc visitor
is a no-op (only the pointer variants had anything to free), so it
did not matter whether the dealloc visit was skipped.  But with this
patch, we would risk leaking memory if we could skip a call to
visit_type_FOO_fields() based solely on a data-dependent decision.

But notice: in the dealloc visitor, visit_type_FOO() already handles
a NULL obj - it was only the visit_type_implicit_FOO() that was
failing to check for NULL. And now that we have refactored things to
have the branch be part of the parent struct, we no longer have a
separate pointer that can be NULL in the first place.  So we can just
delete the call to visit_start_union() altogether, and blindly visit
the branch type; there is no change in behavior except to the dealloc
visitor, where we now unconditionally visit the branch, but where that
visit is now always safe (for a flat union, we can no longer
dereference NULL, and for a simple union, visit_type_FOO() was already
safely handling NULL on pointer types).

Unfortunately, simple unions are not as easy to switch to unboxed
layout; because we are special-casing the hidden implicit type with
a single 'data' member, we really DO need to keep calling another
layer of visit_start_struct(), with a second malloc; although there
are some cleanups planned for simple unions in later patches.

visit_start_union() and gen_visit_implicit_struct() are now unused.
Drop them.

Note that after this patch, the only remaining use of
visit_start_implicit_struct() is for alternate types; the next patch
will do further cleanup based on that fact.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1455778109-6278-14-git-send-email-eblake@redhat.com>
[Dead code deletion squashed in, commit message updated accordingly]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-19 11:08:57 +01:00
Eric Blake
becceedc4d qapi: Don't box struct branch of alternate
There's no reason to do two malloc's for an alternate type visiting
a QAPI struct; let's just inline the struct directly as the C union
branch of the struct.

Surprisingly, no clients were actually using the struct member prior
to this patch outside of the testsuite; an earlier patch in the series
added some testsuite coverage to make the effect of this patch more
obvious.

In qapi.py, c_type() gains a new is_unboxed flag to control when we
are emitting a C struct unboxed within the context of an outer
struct (different from our other two modes of usage with no flags
for normal local variable declarations, and with is_param for adding
'const' in a parameter list).  I don't know if there is any more
pythonic way of collapsing the two flags into a single parameter,
as we never have a caller setting both flags at once.

Ultimately, we want to also unbox branches for QAPI unions, but as
that touches a lot more client code, it is better as separate
patches.  But since unions and alternates share gen_variants(), I
had to hack in a way to test if we are visiting an alternate type
for setting the is_unboxed flag: look for a non-object branch.
This works because alternates have at least two branches, with at
most one object branch, while unions have only object branches.
The hack will go away in a later patch.

The generated code difference to qapi-types.h is relatively small:

| struct BlockdevRef {
|     QType type;
|     union { /* union tag is @type */
|         void *data;
|-        BlockdevOptions *definition;
|+        BlockdevOptions definition;
|         char *reference;
|     } u;
| };

The corresponding spot in qapi-visit.c calls visit_type_FOO(), which
first calls visit_start_struct() to allocate or deallocate the member
and handle a layer of {} from the JSON stream, then visits the
members.  To peel off the indirection and the memory management that
comes with it, we inline this call, then suppress allocation /
deallocation by passing NULL to visit_start_struct(), and adjust the
member visit:

|     switch ((*obj)->type) {
|     case QTYPE_QDICT:
|-        visit_type_BlockdevOptions(v, name, &(*obj)->u.definition, &err);
|+        visit_start_struct(v, name, NULL, 0, &err);
|+        if (err) {
|+            break;
|+        }
|+        visit_type_BlockdevOptions_fields(v, &(*obj)->u.definition, &err);
|+        error_propagate(errp, err);
|+        err = NULL;
|+        visit_end_struct(v, &err);
|         break;
|     case QTYPE_QSTRING:
|         visit_type_str(v, name, &(*obj)->u.reference, &err);

The visit of non-object fields is unchanged.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1455778109-6278-13-git-send-email-eblake@redhat.com>
[Commit message tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-19 11:08:57 +01:00
Eric Blake
2208d64998 qapi-visit: Use common idiom in gen_visit_fields_decl()
We have several instances of methods that do an early exit if
output is not needed, then log that output is being generated,
and finally produce the output; see qapi-types.py:gen_object()
and qapi-visit.py:gen_visit_implicit_struct().  The odd man
out was gen_visit_fields_decl(); rearrange it to be more like
the others.  No semantic change or difference to generated code.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1455778109-6278-12-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-19 11:08:57 +01:00
Eric Blake
1de5d4ca07 qapi: Emit structs used as variants in topological order
Right now, we emit the branches of union types as a boxed pointer,
and it suffices to have a forward declaration of the type.  However,
a future patch will swap things to directly use the branch type,
instead of hiding it behind a pointer.  For this to work, the
compiler needs the full definition of the type, not just a forward
declaration, prior to the union that is including the branch type.
This patch just adds topological sorting to hoist all types
mentioned in a branch of a union to be fully declared before the
union itself.  The sort is always possible, because we do not
allow circular union types that include themselves as a direct
branch (it is, however, still possible to include a branch type
that itself has a pointer to the union, for a type that can
indirectly recursively nest itself - that remains safe, because
that the member of the branch type will remain a pointer, and the
QMP representation of such a type adds another {} for each recurring
layer of the union type).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1455778109-6278-11-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-19 11:08:57 +01:00
Eric Blake
e65d89bf1a qapi: Adjust layout of FooList types
By sticking the next pointer first, we don't need a union with
64-bit padding for smaller types.  On 32-bit platforms, this
can reduce the size of uint8List from 16 bytes (or 12, depending
on whether 64-bit ints can tolerate 4-byte alignment) down to 8.
It has no effect on 64-bit platforms (where alignment still
dictates a 16-byte struct); but fewer anonymous unions is still
a win in my book.

It requires visit_next_list() to gain a size parameter, to know
what size element to allocate; comparable to the size parameter
of visit_start_struct().

I debated about going one step further, to allow for fewer casts,
by doing:
    typedef GenericList GenericList;
    struct GenericList {
        GenericList *next;
    };
    struct FooList {
        GenericList base;
        Foo *value;
    };
so that you convert to 'GenericList *' by '&foolist->base', and
back by 'container_of(generic, GenericList, base)' (as opposed to
the existing '(GenericList *)foolist' and '(FooList *)generic').
But doing that would require hoisting the declaration of
GenericList prior to inclusion of qapi-types.h, rather than its
current spot in visitor.h; it also makes iteration a bit more
verbose through 'foolist->base.next' instead of 'foolist->next'.

Note that for lists of objects, the 'value' payload is still
hidden behind a boxed pointer.  Someday, it would be nice to do:

struct FooList {
    FooList *next;
    Foo value;
};

for one less level of malloc for each list element.  This patch
is a step in that direction (now that 'next' is no longer at a
fixed non-zero offset within the struct, we can store more than
just a pointer's-worth of data as the value payload), but the
actual conversion would be a task for another series, as it will
touch a lot of code.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1455778109-6278-10-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-19 11:08:57 +01:00
Eric Blake
655519030b qapi-visit: Less indirection in visit_type_Foo_fields()
We were passing 'Foo **obj' to the internal helper function, but
all uses within the helper were via reads of '*obj'.  Refactor
things to pass one less level of indirection, by having the
callers dereference before calling.

For an example of the generated code change:

|-static void visit_type_BalloonInfo_fields(Visitor *v, BalloonInfo **obj, Error **errp)
|+static void visit_type_BalloonInfo_fields(Visitor *v, BalloonInfo *obj, Error **errp)
| {
|     Error *err = NULL;
|
|-    visit_type_int(v, "actual", &(*obj)->actual, &err);
|+    visit_type_int(v, "actual", &obj->actual, &err);
|     error_propagate(errp, err);
| }
|
|@@ -261,7 +261,7 @@ void visit_type_BalloonInfo(Visitor *v,
|     if (!*obj) {
|         goto out_obj;
|     }
|-    visit_type_BalloonInfo_fields(v, obj, &err);
|+    visit_type_BalloonInfo_fields(v, *obj, &err);
| out_obj:

The refactoring will also make it easier to reuse the helpers in
a future patch when implicit structs are stored directly in the
parent struct rather than boxed through a pointer.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1455778109-6278-9-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-19 11:08:57 +01:00
Markus Armbruster
59d9e84cc9 qapi-visit: Unify struct and union visit
gen_visit_union() is now just like gen_visit_struct().  Rename
it to gen_visit_object(), use it for structs, and drop
gen_visit_struct().  Output is unchanged.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1453902888-20457-4-git-send-email-armbru@redhat.com>
[split out variant handling, rebase to earlier changes]
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1455778109-6278-8-git-send-email-eblake@redhat.com>
2016-02-19 11:08:57 +01:00
Eric Blake
9a5cd424d5 qapi: Visit variants in visit_type_FOO_fields()
We initially created the static visit_type_FOO_fields() helper
function for reuse of code - we have cases where the initial
setup for a visit has different allocation (depending on whether
the fields represent a stand-alone type or are embedded as part
of a larger type), but where the actual field visits are
identical once a pointer is available.

Up until the previous patch, visit_type_FOO_fields() was only
used for structs (no variants), so it was covering every field
for each type where it was emitted.

Meanwhile, the code for visiting unions looks like:

static visit_type_U_fields() {
    visit base;
    visit local_members;
}
visit_type_U() {
    visit_start_struct();
    visit_type_U_fields();
    visit variants;
    visit_end_struct();
}

which splits the fields of the union visit across two functions.
Move the code to visit variants to live inside visit_type_U_fields(),
while making it conditional on having variants so that all other
instances of the helper function remain unchanged.  This is also
a step closer towards unifying struct and union visits, and towards
allowing one union type to be the branch of another flat union.

The resulting diff to the generated code is a bit hard to read,
but it can be verified that it touches only union types, and that
the end result is the following general structure:

static visit_type_U_fields() {
    visit base;
    visit local_members;
    visit variants;
}
visit_type_U() {
    visit_start_struct();
    visit_type_U_fields();
    visit_end_struct();
}

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1455778109-6278-7-git-send-email-eblake@redhat.com>
[gen_visit_struct_fields() parameter variants made mandatory]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-19 11:08:57 +01:00
Markus Armbruster
d7445b57f4 qapi-visit: Simplify how we visit common union members
For a simple union SU, gen_visit_union() generates a visit of its
single tag member, like this:

    visit_type_SUKind(v, "type", &(*obj)->type, &err);

For a flat union FU with base B, it generates a visit of its base
fields:

    visit_type_B_fields(v, (B **)obj, &err);

Instead, we can simply visit the common members using the same fields
visit function we use for structs, generated with
gen_visit_struct_fields().  This function visits the base if any, then
the local members.

For a simple union SU, visit_type_SU_fields() contains exactly the old
tag member visit, because there is no base, and the tag member is the
only member.  For instance, the code generated for qapi-schema.json's
KeyValue changes like this:

    +static void visit_type_KeyValue_fields(Visitor *v, KeyValue **obj, Error **errp)
    +{
    +    Error *err = NULL;
    +
    +    visit_type_KeyValueKind(v, "type", &(*obj)->type, &err);
    +    if (err) {
    +        goto out;
    +    }
    +
    +out:
    +    error_propagate(errp, err);
    +}
    +
     void visit_type_KeyValue(Visitor *v, const char *name, KeyValue **obj, Error **errp)
     {
         Error *err = NULL;
    @@ -4863,7 +4911,7 @@ void visit_type_KeyValue(Visitor *v, con
         if (!*obj) {
             goto out_obj;
         }
    -    visit_type_KeyValueKind(v, "type", &(*obj)->type, &err);
    +    visit_type_KeyValue_fields(v, obj, &err);
         if (err) {
             goto out_obj;
         }

For a flat union FU, visit_type_FU_fields() contains exactly the old
base fields visit, because there is a base, but no members.  For
instance, the code generated for qapi-schema.json's CpuInfo changes
like this:

     static void visit_type_CpuInfoBase_fields(Visitor *v, CpuInfoBase **obj, Error **errp);

    +static void visit_type_CpuInfo_fields(Visitor *v, CpuInfo **obj, Error **errp)
    +{
    +    Error *err = NULL;
    +
    +    visit_type_CpuInfoBase_fields(v, (CpuInfoBase **)obj, &err);
    +    if (err) {
    +        goto out;
    +    }
    +
    +out:
    +    error_propagate(errp, err);
    +}
    +
     static void visit_type_CpuInfoX86_fields(Visitor *v, CpuInfoX86 **obj, Error **errp)
...
    @@ -3485,7 +3509,7 @@ void visit_type_CpuInfo(Visitor *v, cons
         if (!*obj) {
             goto out_obj;
         }
    -    visit_type_CpuInfoBase_fields(v, (CpuInfoBase **)obj, &err);
    +    visit_type_CpuInfo_fields(v, obj, &err);
         if (err) {
             goto out_obj;
         }

As you see, the generated code grows a bit, but in practice, it's lost
in the noise: qapi-schema.json's qapi-visit.c gains roughly 1%.

This simplification became possible with commit 441cbac "qapi-visit:
Convert to QAPISchemaVisitor, fixing bugs".  It's a step towards
unifying gen_struct() and gen_union().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1453902888-20457-2-git-send-email-armbru@redhat.com>
[improve commit message examples]
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1455778109-6278-6-git-send-email-eblake@redhat.com>
[Commit message tweaked]
2016-02-19 11:08:57 +01:00
Eric Blake
68d078395d qapi: Add tests of complex objects within alternate
Upcoming patches will adjust how we visit an object branch of an
alternate; but we were completely lacking testsuite coverage.
Rectify this, so that the future patches will be able to highlight
the changes and still prove that we avoided regressions.

In particular, the use of a flat union UserDefFlatUnion rather
than a simple struct UserDefA as the branch will give us coverage
of an object with variants.  And visiting an alternate as both
the top level and as a nested member gives confidence in correct
memory allocation handling, especially if the test is run under
valgrind.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1455778109-6278-5-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-19 11:08:57 +01:00
Eric Blake
46534309e6 qapi: Forbid 'any' inside an alternate
The whole point of an alternate is to allow some type-safety while
still accepting more than one JSON type.  Meanwhile, the 'any'
type exists to bypass type-safety altogether.  The two are
incompatible: you can't accept every type, and still tell which
branch of the alternate to use for the parse; fix this to give a
sane error instead of a Python stack trace.

Note that other types that can't be alternate members are caught
earlier, by check_type().

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1455778109-6278-4-git-send-email-eblake@redhat.com>
[Commit message tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-19 11:08:56 +01:00
Eric Blake
02a57ae32b qapi: Forbid empty unions and useless alternates
Empty unions serve no purpose, and while we compile with gcc
which permits them, strict C99 forbids them.  We happen to inject
a dummy 'void *data' member into the C unions that represent QAPI
unions and alternates, but we want to get rid of that member (it
pollutes the namespace for no good reason), which would leave us
with an empty union if the user didn't provide any branches.  While
empty structs make sense in QAPI, empty unions don't add any
expressiveness to the QMP language.  So prohibit them at parse
time.  Update the documentation and testsuite to match.

Note that the documentation already mentioned that alternates
should have "two or more JSON data types"; so this also fixes
the code to enforce that.  However, we have existing uses of a
union type with only one branch, so the 2-or-more strictness
is intentionally limited to alternates.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1455778109-6278-3-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-19 11:08:56 +01:00
Eric Blake
f96493b1ab qapi: Simplify excess input reporting in input visitors
When reporting that an unvisited member remains at the end of an
input visit for a struct, we were using g_hash_table_find()
coupled with a callback function that always returns true, to
locate an arbitrary member of the hash table.  But if all we
need is an arbitrary entry, we can get that from a single-use
iterator, without needing a tautological callback function.

Technically, our cast of &(GQueue *) to (void **) is not strict
C (while void * must be able to hold all other pointers, nothing
says a void ** has to be the same width or representation as a
GQueue **).  The kosher way to write it would be the verbose:

    void *tmp;
    GQueue *any;
    if (g_hash_table_iter_next(&iter, NULL, &tmp)) {
        any = tmp;

But our code base (not to mention glib itself) already has other
cases of assuming that ALL pointers have the same width and
representation, where a compiler would have to go out of its way
to mis-compile our borderline behavior.

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1455778109-6278-2-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-19 11:08:56 +01:00
Eric Blake
9d3524b39e qapi-visit: Honor prefix of discriminator enum
When we added support for a user-specified prefix for an enum
type (commit 351d36e), we forgot to teach the qapi-visit code
to honor that prefix in the case of using a prefixed enum as
the discriminator for a flat union.  While there is still some
on-list debate on whether we want to keep prefixes, we should
at least make it work as long as it is still part of the code
base.

Reported-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1455665965-27638-1-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-19 11:08:56 +01:00
Victor Kaplansky
a28c393cc2 tests/vhost-user-bridge: add scattering of incoming packets
This patch adds to the vubr test the scattering of incoming
packets to the chain of RX buffer.  Also, this patch corrects the
size of the header preceding the packet in RX buffers.

Note that this patch doesn't add the support for mergeable
buffers.

Signed-off-by: Victor Kaplansky <victork@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-18 17:42:05 +02:00
Peter Maydell
dd5e38b19d Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20160218-1' into staging
target-arm queue:
 * implement or fix various EL3 trap behaviour for system registers
 * clean up the trap/undef handling of the SRS instruction
 * add some missing AArch64 performance monitor system registers
 * implement reset for the PL061 GPIO device
 * QOMify sd.c and the pxa2xx_mmci device
 * SD card emulation fixes for booting Tianocore UEFI on RPi2
 * QOMify various ARM timer devices

# gpg: Signature made Thu 18 Feb 2016 15:19:31 GMT using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"

* remotes/pmaydell/tags/pull-target-arm-20160218-1: (36 commits)
  hw/timer: QOM'ify pxa2xx_timer
  hw/timer: QOM'ify pl031
  hw/timer: QOM'ify exynos4210_rtc
  hw/timer: QOM'ify exynos4210_pwm
  hw/timer: QOM'ify exynos4210_mct
  hw/timer: QOM'ify arm_timer (pass 2)
  hw/timer: QOM'ify arm_timer (pass 1)
  hw/sd: use guest error logging rather than fprintf to stderr
  hw/sd: model a power-up delay, as a workaround for an EDK2 bug
  hw/sd: implement CMD23 (SET_BLOCK_COUNT) for MMC compatibility
  hw/sd/pxa2xx_mmci: Add reset function
  hw/sd/pxa2xx_mmci: Convert to VMStateDescription
  hw/sd/pxa2xx_mmci: Update to use new SDBus APIs
  hw/sd/pxa2xx_mmci: convert to SysBusDevice object
  sdhci_sysbus: Create SD card device in users, not the device itself
  hw/sd/sdhci.c: Update to use SDBus APIs
  hw/sd: Add QOM bus which SD cards plug in to
  hw/sd/sd.c: Convert sd_reset() function into Device reset method
  hw/sd/sd.c: QOMify
  hw/sd/sdhci.c: Remove x-drive property
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-18 15:20:35 +00:00
xiaoqiang.zhao
5d83e348e7 hw/timer: QOM'ify pxa2xx_timer
* split the old SysBus init function into an instance_init
  and a Device realize function
* use DeviceClass::realize instead of SysBusDeviceClass::init

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-18 14:50:51 +00:00
xiaoqiang.zhao
81dcc49463 hw/timer: QOM'ify pl031
assign pl031_init to pl031_info.instance_init and drop the
SysBusDeviceClass::init

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-18 14:50:51 +00:00
xiaoqiang.zhao
c9d64639dd hw/timer: QOM'ify exynos4210_rtc
assign exynos4210_rtc_init to exynos4210_rtc_info.instance_init
and drop the SysBusDeviceClass::init

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-18 14:50:50 +00:00
xiaoqiang.zhao
ff6ee49511 hw/timer: QOM'ify exynos4210_pwm
assign exynos4210_pwm_init to exynos4210_pwm_info.instance_init
and drop the SysBusDeviceClass::init

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-18 14:50:50 +00:00
xiaoqiang.zhao
7a53a140f0 hw/timer: QOM'ify exynos4210_mct
assign exynos4210_mct_init to exynos4210_mct_info.instance_init
and drop the SysBusDeviceClass::init

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-18 14:50:50 +00:00
xiaoqiang.zhao
d712a5a2a4 hw/timer: QOM'ify arm_timer (pass 2)
assign DeviceClass::vmsd instead of using vmstate_register function

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-18 14:50:50 +00:00
xiaoqiang.zhao
0d175e745f hw/timer: QOM'ify arm_timer (pass 1)
* assign icp_pit_init to icp_pit_info.instance_init
* split the old SysBus init function into an instance_init
  and a Device realize function
* use DeviceClass::realize instead of SysBusDeviceClass::init

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-18 14:50:50 +00:00
Andrew Baumann
9800ad88c8 hw/sd: use guest error logging rather than fprintf to stderr
Some of these errors may be harmless (e.g. probing unimplemented
commands, or issuing CMD12 in the wrong state), and may also be quite
frequent. Spamming the standard error output isn't desirable in such
cases.

Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1454902521-21164-4-git-send-email-Andrew.Baumann@microsoft.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-18 14:50:50 +00:00
Andrew Baumann
dd26eb4333 hw/sd: model a power-up delay, as a workaround for an EDK2 bug
The SD spec for ACMD41 says that a zero argument is an "inquiry"
ACMD41, which does not start initialisation and is used only for
retrieving the OCR. However, Tianocore EDK2 (UEFI) has a bug [1]: it
first sends an inquiry (zero) ACMD41. If that first request returns an
OCR value with the power up bit (0x80000000) set, it assumes the card
is ready and continues, leaving the card in the wrong state. (My
assumption is that this works on hardware, because no real card is
immediately powered up upon reset.)

This change models a delay of 0.5ms from the first ACMD41 to the power
being up. However, it also immediately sets the power on upon seeing a
non-zero (non-enquiry) ACMD41. This speeds up UEFI boot, it should
also account for guests that simply delay after card reset and then
issue an ACMD41 that they expect will succeed.

[1] https://github.com/tianocore/edk2/blob/master/EmbeddedPkg/Universal/MmcDxe/MmcIdentification.c#L279
(This is the loop starting with "We need to wait for the MMC or SD
card is ready")

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1454902521-21164-3-git-send-email-Andrew.Baumann@microsoft.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-18 14:50:50 +00:00
Andrew Baumann
4481bbc79d hw/sd: implement CMD23 (SET_BLOCK_COUNT) for MMC compatibility
CMD23 is optional for SD but required for MMC, and the UEFI bootloader
used for Windows on Raspberry Pi 2 issues it.

Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1454902521-21164-2-git-send-email-Andrew.Baumann@microsoft.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-18 14:50:50 +00:00
Peter Maydell
6002915e0c hw/sd/pxa2xx_mmci: Add reset function
Add a reset function to the pxa2xx_mmci device; previously it had
no handling for system reset at all.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1455646193-13238-11-git-send-email-peter.maydell@linaro.org
2016-02-18 14:50:50 +00:00
Peter Maydell
19d25e0a6d hw/sd/pxa2xx_mmci: Convert to VMStateDescription
Convert the pxa2xx_mmci device from manual save/load
functions to a VMStateDescription structure.

This is a migration compatibility break.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1455646193-13238-10-git-send-email-peter.maydell@linaro.org
2016-02-18 14:49:55 +00:00
Peter Maydell
a9563e75e4 hw/sd/pxa2xx_mmci: Update to use new SDBus APIs
Now the PXA2xx MMCI device is QOMified itself, we can
update it to use the SDBus APIs to talk to the SD card.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1455646193-13238-9-git-send-email-peter.maydell@linaro.org
2016-02-18 14:49:21 +00:00
Peter Maydell
7a9468c925 hw/sd/pxa2xx_mmci: convert to SysBusDevice object
Convert the pxa2xx_mmci device to be a sysbus device.

In this commit we only change the device itself, and leave
the interface to the SD card using the old non-SDBus APIs.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1455646193-13238-8-git-send-email-peter.maydell@linaro.org
2016-02-18 14:26:33 +00:00
Peter Maydell
eb4f566bbb sdhci_sysbus: Create SD card device in users, not the device itself
Move the creation of the SD card device from the sdhci_sysbus
device itself into the boards that create these devices.
This allows us to remove the cannot_instantiate_with_device_add
notation because we no longer call drive_get_next in the device
model.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1455646193-13238-7-git-send-email-peter.maydell@linaro.org
2016-02-18 14:26:33 +00:00
Peter Maydell
40bbc19437 hw/sd/sdhci.c: Update to use SDBus APIs
Update the SDHCI code to use the new SDBus APIs.

This commit introduces the new command line options required
to connect a disk to sdhci-pci:

 -device sdhci-pci -drive id=mydrive,[...] -device sd,drive=mydrive

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1455646193-13238-6-git-send-email-peter.maydell@linaro.org
2016-02-18 14:26:33 +00:00
Peter Maydell
c759a790b6 hw/sd: Add QOM bus which SD cards plug in to
Add a QOM bus for SD cards to plug in to.

Note that since sd_enable() is used only by one board and there
only as part of a broken implementation, we do not provide it in
the SDBus API (but instead add a warning comment about the old
function). Whoever converts OMAP and the nseries boards to QOM
will need to either implement the card switch properly or move
the enable hack into the OMAP MMC controller model.

In the SDBus API, the old-style use of sd_set_cb to register some
qemu_irqs for notification of card insertion and write-protect
toggling is replaced with methods in the SDBusClass which the
card calls on status changes and methods in the SDClass which
the controller can call to find out the current status. The
query methods will allow us to remove the abuse of the 'register
irqs' API by controllers in their reset methods to trigger
the card to tell them about the current status again.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1455646193-13238-5-git-send-email-peter.maydell@linaro.org
2016-02-18 14:26:33 +00:00
Peter Maydell
ba3ed0fa94 hw/sd/sd.c: Convert sd_reset() function into Device reset method
Convert the sd_reset() function into a proper Device reset method.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1455646193-13238-4-git-send-email-peter.maydell@linaro.org
2016-02-18 14:26:33 +00:00
Peter Maydell
260bc9d8aa hw/sd/sd.c: QOMify
Turn the SD card into a QOM device.
This conversion only changes the device itself; the various
functions which are effectively methods on the device are not
touched at this point.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1455646193-13238-3-git-send-email-peter.maydell@linaro.org
2016-02-18 14:26:33 +00:00
Peter Maydell
ac6de31acd hw/sd/sdhci.c: Remove x-drive property
The following commits will remove support for the old sdhci-pci
command line syntax using the x-drive property:
 -device sdhci-pci,x-drive=mydrive -drive id=mydrive,[...]
and replace it with an explicit sd device:
 -device sdhci-pci -drive id=mydrive,[...] -device sd,drive=mydrive

(This is OK because x-drive is experimental.)

This commit removes the x-drive property so that old style
command lines will fail with a reasonable error message:
  -device sdhci-pci,x-drive=mydrive: Property '.x-drive' not found

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 1455646193-13238-2-git-send-email-peter.maydell@linaro.org
2016-02-18 14:26:33 +00:00
Wei Huang
c3a86b35f2 ARM: PL061: Cleaning field of PL061 device state
This patch removes the float_high field of PL061State, which doesn't
seem to be used anywhere. Because this changes the device state, the
version ID is also bumped up for the reason of compatiblity.

Signed-off-by: Wei Huang <wei@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1455729552-28026-3-git-send-email-wei@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-18 14:26:33 +00:00
Wei Huang
b527db44ad ARM: PL061: Clear PL061 device state after reset
Current QEMU doesn't clear PL061 state after reset. This causes a
weird issue with guest reboot via GPIO. Here is the device state
with two reboot requests:

  (PL061State fields)           data   old_in_data   istate
VM boot                         0      0             0
After 1st ACPI reboot request   8      8             8
After VM PL061 driver ACK       8      8             0
After VM reboot                 8      8             0
------------------------------------------------------------
2nd ACPI reboot request         8

In the second reboot request above, because the old_in_data field is 8,
QEMU decides that there is a pending edge IRQ already (see
pl061_update()) in input; so it doesn't raise up IRQ again. As a result
the second reboot request is lost. The correct way is to clear PL061
device state after reset.

The default reset state is found from the documents listed below. Per
Peter's suggestion that QEMU automatically calls reset function after
device initialization, this patch removes calling pl061_reset() from
pl061_initfn().

Reference:
[1] PL061 Technical Reference Manual
[2] Stellaris LM3S8962 Microcontroller Data Sheet
[3] Stellaris LM3S5P31 Microcontroller Data Sheet

Signed-off-by: Wei Huang <wei@redhat.com>
Message-id: 1455729552-28026-2-git-send-email-wei@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-18 14:26:33 +00:00
Alistair Francis
8a83ffc2da target-arm: Add PMUSERENR_EL0 register
The Linux kernel accesses this register early in its setup.

Signed-off-by: Christopher Covington <christopher.covington@linaro.org>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: b30d536cb16ec57b4412172bb6dbc3f00d293e7d.1455060548.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-18 14:26:20 +00:00
Alistair Francis
978364f12a target-arm: Add the pmovsclr_el0 and pmintenclr_el1 registers
Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Tested-by: Nathan Rossi <nathan@nathanrossi.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 50deeafb24958a5b6d7f594b5dda399a022c0e5b.1455060548.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-18 14:16:17 +00:00
Alistair Francis
4054bfa9e7 target-arm: Add the pmceid0 and pmceid1 registers
Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Tested-by: Nathan Rossi <nathan@nathanrossi.com>
Message-id: da0563119a9f56fd5fbdc26e7ed19a8a8457c5b9.1455060548.git.alistair.francis@xilinx.com
[PMM: Use 0 for PMCEID0 values for A15 and A57 since our PMU
 does not currently implement any events.]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-18 14:16:17 +00:00
Peter Maydell
f01377f591 target-arm: UNDEF in the UNPREDICTABLE SRS-from-System case
Make get_r13_banked() raise an exception at runtime for the
corner case of SRS from System mode, so that we can UNDEF it;
this brings us in to line with the ARM ARM's set of permitted
CONSTRAINED UNPREDICTABLE choices.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-02-18 14:16:17 +00:00
Peter Maydell
d86d57d4fe target-arm: Combine user-only and softmmu get/set_r13_banked()
The user-mode versions of get/set_r13_banked() exist just to assert
if they're ever called -- the translate time code should never
emit calls to them because SRS from user mode always UNDEF.
There's no code in the softmmu versions that can't compile in
CONFIG_USER_ONLY, and the assertion is not particularly useful,
so combine the two functions rather than having completely split
versions under ifdefs.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
2016-02-18 14:16:16 +00:00
Peter Maydell
c766568d36 target-arm: Move bank_number() into internals.h
Move bank_number()'s implementation into internals.h, so
it's available in the user-mode-only compile as well.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
2016-02-18 14:16:16 +00:00
Peter Maydell
72309cee48 target-arm: Move get/set_r13_banked() to op_helper.c
Move get/set_r13_banked() from helper.c to op_helper.c. This will
let us add exception-raising code to them, and also puts them
in the same file as get/set_user_reg(), which makes some conceptual
sense.

(The original reason for the helper.c/op_helper.c split was that
only op_helper.c had access to the CPU env pointer; this distinction
has not been true for a long time, though, and so the split is
now rather arbitrary.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-02-18 14:16:16 +00:00
Peter Maydell
cbc0326b6f target-arm: Clean up trap/undef handling of SRS
The SRS instruction is:
 * UNDEFINED in Hyp mode
 * UNPREDICTABLE in User or System mode
 * UNPREDICTABLE if the specified mode isn't accessible
 * trapped to EL3 if EL3 is AArch64 and we are at Secure EL1

Clean up the code to handle all these cases cleanly, including
picking UNDEF as our choice of UNPREDICTABLE behaviour rather
blindly trusting the mode field passed in the instruction.
As part of this, move the check for IS_USER into gen_srs()
itself rather than having it done by the caller.

The exception is that we don't UNDEF for calls from System
mode, which need a runtime check. This will be dealt with in
the following commits.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-02-18 14:16:16 +00:00
Peter Maydell
f2cae60927 target-arm: Report correct syndrome for FPEXC32_EL2 traps
If access to FPEXC32_EL2 is trapped by CPTR_EL2.TFP or CPTR_EL3.TFP,
this should be reported with a syndrome register indicating an
FP access trap, not one indicating a system register access trap.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
2016-02-18 14:16:16 +00:00
Peter Maydell
d6c8cf8151 target-arm: Implement MDCR_EL3.TDA and MDCR_EL2.TDA traps
Implement the debug register traps controlled by MDCR_EL2.TDA
and MDCR_EL3.TDA.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
2016-02-18 14:16:15 +00:00
Peter Maydell
91b0a23865 target-arm: Implement MDCR_EL2.TDRA traps
Implement trapping of the "debug ROM" registers, which are controlled
by MDCR_EL2.TDRA for EL2 but by the more general MDCR_EL3.TDA for EL3.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
2016-02-18 14:16:15 +00:00
Peter Maydell
187f678d5c target-arm: Implement MDCR_EL3.TDOSA and MDCR_EL2.TDOSA traps
Implement the traps to EL2 and EL3 controlled by the bits
MDCR_EL2.TDOSA MDCR_EL3.TDOSA. These can configurably trap
accesses to the "powerdown debug" registers.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
2016-02-18 14:16:15 +00:00
Peter Maydell
f096e92b63 target-arm: Fix handling of SCR.SMD
We weren't quite implementing the handling of SCR.SMD correctly.
The condition governing whether the SMD bit should apply only
for NS state is "is EL3 is AArch32", not "is the current EL AArch32".
Fix the condition, and clarify the comment both to reflect this and
to expand slightly on what's going on for the v7-no-Virtualization case.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-02-18 14:16:15 +00:00
Peter Maydell
755026728a target-arm: correct CNTFRQ access rights
Correct some corner cases we were getting wrong for
CNTFRQ access rights:
 * should UNDEF from 32-bit Secure EL1
 * only writable from the highest implemented exception level,
   which might not be EL1 now

To clarify the code, provide a new utility function
arm_highest_el() which returns the highest implemented
exception level.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
2016-02-18 14:16:15 +00:00
Victor Kaplansky
5669655aaf vhost-user interrupt management fixes
Since guest_mask_notifier can not be used in vhost-user mode due
to buffering implied by unix control socket, force
use_mask_notifier on virtio devices of vhost-user interfaces, and
send correct callfd to the guest at vhost start.

Using guest_notifier_mask function in vhost-user case may
break interrupt mask paradigm, because mask/unmask is not
really done when returning from guest_notifier_mask call, instead
message is posted in a unix socket, and processed later.

Add an option boolean flag 'use_mask_notifier' to disable the use
of guest_notifier_mask in virtio pci.

Signed-off-by: Didier Pallard <didier.pallard@6wind.com>
Signed-off-by: Victor Kaplansky <victork@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-18 16:13:56 +02:00
Peter Maydell
339b665c88 Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.6-20160218' into staging
ppc patch queue for 2016-02-18

Currently accumulated patches for target-ppc, pseries machine type and
related devices.
  * Some cleanups to management of SDR1 and the hashed page table
  * Implementations of a number of simple PAPR hypercalls
  * Significant improvements to the Macintosh CUDA device
  * Several bugfixes

# gpg: Signature made Thu 18 Feb 2016 04:16:51 GMT using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.6-20160218: (26 commits)
  hw/ppc/spapr: Halt CPU when powering off via RTAS call
  pseries: Include missing pseries-2.5 compat properties in pseries-2.4
  cuda: remove CUDA_GET_SET_IIC/CUDA_COMBINED_FORMAT_IIC commands
  cuda: remove GET_6805_ADDR command
  cuda: port SET_TIME command to new framework
  cuda: port GET_TIME command to new framework
  cuda: port SET_POWER_MESSAGES command to new framework
  cuda: port FILE_SERVER_FLAG command to new framework
  cuda: port RESET_SYSTEM command to new framework
  cuda: port POWERDOWN command to new framework
  cuda: port SET_DEVICE_LIST command to new framework
  cuda: port SET_AUTO_RATE command to new framework
  cuda: port AUTOPOLL command to new framework
  cuda: move unknown commands reject out of switch
  cuda: add a framework to handle commands
  hw/ppc/spapr: Implement the h_set_xdabr hypercall
  hw/ppc/spapr: Implement h_set_dabr
  hw/ppc/spapr: Add h_set_sprg0 hypercall
  migration: ensure htab_save_first completes after timeout
  target-ppc: Remove hack for ppc_hash64_load_hpte*() with HV KVM
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-18 10:29:47 +00:00
Thomas Huth
8a9c1b77e9 hw/ppc/spapr: Halt CPU when powering off via RTAS call
The LoPAPR specification defines the following for the RTAS
power-off call: "On successful operation, does not return".
However, the implementation in QEMU currently returns and runs
the guest CPU again for some more cycles. This caused some
trouble with the new ppc implementation of the kvm-unit-tests
recently. So let's make sure that the QEMU implementation
follows the spec, thus stop the CPU to make sure that the
RTAS call does not return to the guest anymore.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Tested-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-18 11:08:43 +11:00
Michael S. Tsirkin
cefa2bbd6a rules: filter out irrelevant files
It's often handy to make executables depend on each other, e.g. make a
test depend on a helper. This doesn't work now, as linker
will attempt to use the helper as an object.
To fix, filter only relevant file types before linking an executable.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-17 16:59:36 +02:00
David Gibson
1c81003acc pseries: Include missing pseries-2.5 compat properties in pseries-2.4
Commit 4b23699 "pseries: Add pseries-2.6 machine type" added a new
SPAPR_COMPAT_2_5 macro in the usual way.  However, it didn't add this
macro to the existing SPAPR_COMPAT_2_4 macro so that pseries-2.4
inherits newer compatibility properties which are needed for 2.5 and
earlier.

This corrects the oversight.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-02-17 10:25:37 +11:00
Hervé Poussineau
e4d162d72f cuda: remove CUDA_GET_SET_IIC/CUDA_COMBINED_FORMAT_IIC commands
We currently don't emulate the I2C bus provided by CUDA.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:31 +11:00
Hervé Poussineau
e230d43e80 cuda: remove GET_6805_ADDR command
It doesn't seem to be used, and operating systems should accept a 'unknown command' answer.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:31 +11:00
Hervé Poussineau
e647317892 cuda: port SET_TIME command to new framework
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:31 +11:00
Hervé Poussineau
547a4d1969 cuda: port GET_TIME command to new framework
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:31 +11:00
Hervé Poussineau
15b7b09b1d cuda: port SET_POWER_MESSAGES command to new framework
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:31 +11:00
Hervé Poussineau
f5b941120e cuda: port FILE_SERVER_FLAG command to new framework
This command tells if computer should automatically wake-up after a power loss.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:30 +11:00
Hervé Poussineau
54e894442e cuda: port RESET_SYSTEM command to new framework
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:30 +11:00
Hervé Poussineau
017da0b568 cuda: port POWERDOWN command to new framework
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:30 +11:00
Hervé Poussineau
216c906e62 cuda: port SET_DEVICE_LIST command to new framework
Also implement the command, by taking device list mask into account
when polling ADB devices.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:30 +11:00
Hervé Poussineau
374312e7c5 cuda: port SET_AUTO_RATE command to new framework
Also implement the command, by removing the hardcoded period of 20 ms/50 Hz
and replacing it by the one requested by user.
Update VMState version to store this new parameter.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:30 +11:00
Hervé Poussineau
1cdab10446 cuda: port AUTOPOLL command to new framework
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:30 +11:00
Hervé Poussineau
0e8176e809 cuda: move unknown commands reject out of switch
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:30 +11:00
Hervé Poussineau
d20efaeb13 cuda: add a framework to handle commands
Next commits will port existing CUDA commands to this framework.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:30 +11:00
Thomas Huth
e49ff266f8 hw/ppc/spapr: Implement the h_set_xdabr hypercall
The H_SET_XDABR hypercall is similar to H_SET_DABR, but also sets
the extended DABR (DABRX) register.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:30 +11:00
Thomas Huth
af08a58f0c hw/ppc/spapr: Implement h_set_dabr
According to LoPAPR, h_set_dabr should simply set DABRX to 3
(if the register is available), and load the parameter into DABR.
If DABRX is not available, the hypervisor has to check the
"Breakpoint Translation" bit of the DABR register first.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:30 +11:00
Thomas Huth
423576f771 hw/ppc/spapr: Add h_set_sprg0 hypercall
This is a very simple hypercall that only sets up the SPRG0
register for the guest (since writing to SPRG0 was only permitted
to the hypervisor in older versions of the PowerISA).

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:30 +11:00
David Gibson
378bc21756 migration: ensure htab_save_first completes after timeout
htab_save_first_pass could return without finishing its work due to
timeout. The patch checks if another invocation of it is necessary and
will call it in htab_save_complete if necessary.

Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[removed overlong line]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:30 +11:00
David Gibson
fa48b4328c target-ppc: Remove hack for ppc_hash64_load_hpte*() with HV KVM
With HV KVM, the guest's hash page table (HPT) is managed by the kernel and
not directly accessible to QEMU.  This means that spapr->htab is NULL
and normally env->external_htab would also be NULL for each cpu.

However, that would cause ppc_hash64_load_hpte*() to do the wrong thing in
the few cases where QEMU does need to load entries from the in-kernel HPT.
Specifically, seeing external_htab is NULL, they would look for an HPT
within the guest's address space instead.

To stop that we have an ugly hack in the pseries machine type code to
set external htab to (void *)1 instead.

This patch removes that hack by having ppc_hash64_load_hpte*() explicitly
check kvmppc_kern_htab instead, which makes more sense.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-02-17 09:59:30 +11:00
David Gibson
c5f54f3e31 pseries: Move hash page table allocation to reset time
At the moment the size of the hash page table (HPT) is fixed based on the
maximum memory allowed to the guest.  As such, we allocate the table during
machine construction, and just clear it at reset.

However, we're planning to implement a PAPR extension allowing the hash
page table to be resized at runtime.  This will mean that on reset we want
to revert it to the default size.  It also means that when migrating, we
need to make sure the destination allocates an HPT of size matching the
host, since the guest could have changed it before the migration.

This patch replaces the spapr_alloc_htab() and spapr_reset_htab() functions
with a new spapr_reallocate_hpt() function.  This is called at reset and
inbound migration only, not during machine init any more.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-02-17 09:59:30 +11:00
David Gibson
8dfe8e7f4f pseries: Add helper to calculate recommended hash page table size
At present we calculate the recommended hash page table (HPT) size for a
pseries guest just once in ppc_spapr_init() before allocating the HPT.
In future patches we're going to want this calculation in other places, so
this splits it out into a helper function.  While we're at it, change the
calculation to use ctz() instead of an explicit loop.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-02-17 09:59:30 +11:00
David Gibson
715c54071a pseries: Simplify handling of the hash page table fd
When migrating the 'pseries' machine type with KVM, we use a special fd
to access the hash page table stored within KVM.  Usually, this fd is
opened at the beginning of migration, and kept open until the migration
is complete.

However, if there is a guest reset during the migration, the fd can become
stale and we need to re-open it.  At the moment we use an 'htab_fd_stale'
flag in sPAPRMachineState to signal this, which is checked in the migration
iterators.

But that's rather ugly.  It's simpler to just close and invalidate the
fd on reset, and lazily re-open it in migration if necessary.  This patch
implements that change.

This requires a small addition to the machine state's instance_init,
so that htab_fd is initialized to -1 (telling the migration code it
needs to open it) instead of 0, which could be a valid fd.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-02-17 09:59:30 +11:00
David Gibson
808bc3b069 target-ppc: Include missing MMU models for SDR1 in info registers
The HMP command "info registers" produces somewhat different information on
different ppc cpu variants.  For those with a hash MMU it's supposed to
include the SDR1, DAR and DSISR registers related to the MMU.  However,
the switch is missing a couple of MMU model variants, meaning we will
miss out this information on certain CPUs which should have it.

This patch corrects the oversight.  (Really these MMU model IDs need a big
cleanup, but we might as well fix the bug in the interim).

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-02-17 09:59:30 +11:00
David Gibson
b7f0bbd259 target-ppc: Remove unused kvmppc_update_sdr1() stub
This KVM stub implementation isn't used anywhere.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
2016-02-17 09:59:29 +11:00
Alyssa Milburn
2f448e415f hw: fix some debug message format strings
Signed-off-by: Alyssa Milburn <fuzzie@fuzzie.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-17 09:59:29 +11:00
Peter Maydell
3fc63c3f33 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* Coverity fixes for IPMI and mptsas
* qemu-char fixes from Daniel and Marc-André
* Bug fixes that break qemu-iotests
* Changes to fix reset from panicked state
* checkpatch false positives for designated initializers
* TLS support in the NBD servers and clients

# gpg: Signature made Tue 16 Feb 2016 16:27:17 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream: (28 commits)
  nbd: enable use of TLS with nbd-server-start command
  nbd: enable use of TLS with qemu-nbd server
  nbd: enable use of TLS with NBD block driver
  nbd: implement TLS support in the protocol negotiation
  nbd: use "" as a default export name if none provided
  nbd: always query export list in fixed new style protocol
  nbd: allow setting of an export name for qemu-nbd server
  nbd: make client request fixed new style if advertised
  nbd: make server compliant with fixed newstyle spec
  nbd: invert client logic for negotiating protocol version
  nbd: convert to using I/O channels for actual socket I/O
  nbd: convert blockdev NBD server to use I/O channels for connection setup
  nbd: convert qemu-nbd server to use I/O channels for connection setup
  nbd: convert block client to use I/O channels for connection setup
  qemu-nbd: add support for --object command line arg
  qom: add helpers for UserCreatable object types
  ipmi: sensor number should not exceed MAX_SENSORS
  mptsas: fix wrong formula
  mptsas: fix memory leak
  mptsas: add missing va_end
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-16 17:31:56 +00:00
Daniel P. Berrange
ddffee3904 nbd: enable use of TLS with nbd-server-start command
This modifies the nbd-server-start QMP command so that it
is possible to request use of TLS. This is done by adding
a new optional parameter "tls-creds" which provides the ID
of a previously created QCryptoTLSCreds object instance.

TLS is only supported when using an IPv4/IPv6 socket listener.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-17-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 17:17:49 +01:00
Daniel P. Berrange
145614a112 nbd: enable use of TLS with qemu-nbd server
This modifies the qemu-nbd program so that it is possible to
request the use of TLS with the server. It simply adds a new
command line option --tls-creds which is used to provide the
ID of a QCryptoTLSCreds object previously created via the
--object command line option.

For example

  qemu-nbd --object tls-creds-x509,id=tls0,endpoint=server,\
                    dir=/home/berrange/security/qemutls \
           --tls-creds tls0 \
           --exportname default

TLS requires the new style NBD protocol, so if no export name
is set (via --export-name), then we use the default NBD protocol
export name ""

TLS is only supported when using an IPv4/IPv6 socket listener.
It is not possible to use with UNIX sockets, which includes
when connecting the NBD server to a host device.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-16-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 17:17:42 +01:00
Daniel P. Berrange
75822a12c0 nbd: enable use of TLS with NBD block driver
This modifies the NBD driver so that it is possible to request
use of TLS. This is done by providing the 'tls-creds' parameter
with the ID of a previously created QCryptoTLSCreds object.

For example

  $QEMU -object tls-creds-x509,id=tls0,endpoint=client,\
                dir=/home/berrange/security/qemutls \
        -drive driver=nbd,host=localhost,port=9000,tls-creds=tls0

The client will drop the connection if the NBD server does not
provide TLS.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-15-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 17:16:33 +01:00
Daniel P. Berrange
f95910fe6b nbd: implement TLS support in the protocol negotiation
This extends the NBD protocol handling code so that it is capable
of negotiating TLS support during the connection setup. This involves
requesting the STARTTLS protocol option before any other NBD options.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-14-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 17:16:28 +01:00
Daniel P. Berrange
69b49502d8 nbd: use "" as a default export name if none provided
If the user does not provide an export name and the server
is running the new style protocol, where export names are
mandatory, use "" as the default export name if the user
has not specified any. "" is defined in the NBD protocol
as the default name to use in such scenarios.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-13-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 17:16:20 +01:00
Daniel P. Berrange
9344e5f554 nbd: always query export list in fixed new style protocol
With the new style protocol, the NBD client will currenetly
send NBD_OPT_EXPORT_NAME as the first (and indeed only)
option it wants. The problem is that the NBD protocol spec
does not allow for returning an error message with the
NBD_OPT_EXPORT_NAME option. So if the server mandates use
of TLS, the client will simply see an immediate connection
close after issuing NBD_OPT_EXPORT_NAME which is not user
friendly.

To improve this situation, if we have the fixed new style
protocol, we can sent NBD_OPT_LIST as the first option
to query the list of server exports. We can check for our
named export in this list and raise an error if it is not
found, instead of going ahead and sending NBD_OPT_EXPORT_NAME
with a name that we know will be rejected.

This improves the error reporting both in the case that the
server required TLS, and in the case that the client requested
export name does not exist on the server.

If the server does not support NBD_OPT_LIST, we just ignore
that and carry on with NBD_OPT_EXPORT_NAME as before.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-12-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 17:16:11 +01:00
Daniel P. Berrange
3d4b2f9c94 nbd: allow setting of an export name for qemu-nbd server
The qemu-nbd server currently always uses the old style protocol
since it never sets any export name. This is a problem because
future TLS support will require use of the new style protocol
negotiation.

This adds "--exportname NAME" / "-x NAME" arguments to qemu-nbd
which allow the user to set an explicit export name. When an
export name is set the server will always use the new style
NBD protocol.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-11-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 17:16:00 +01:00
Daniel P. Berrange
e2a9d9a39d nbd: make client request fixed new style if advertised
If the server advertises support for the fixed new style
negotiation, the client should in turn enable new style.
This will allow the client to negotiate further NBD
options besides the export name.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-10-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 17:14:33 +01:00
Daniel P. Berrange
26afa868db nbd: make server compliant with fixed newstyle spec
If the client does not request the fixed new style protocol,
then we should only accept NBD_OPT_EXPORT_NAME. All other
options are only valid when fixed new style has been activated.

The qemu-nbd client doesn't currently request fixed new style
protocol, but this change won't break qemu-nbd, because it
fortunately only ever uses NBD_OPT_EXPORT_NAME, so was never
triggering the non-compliant server behaviour.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-9-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 17:14:24 +01:00
Daniel P. Berrange
f72d705f0d nbd: invert client logic for negotiating protocol version
The nbd_receive_negotiate() method takes different code
paths based on whether 'name == NULL', and then checks
the expected protocol version in each branch.

This patch inverts the logic, so that it takes different
code paths based on what protocol version it receives and
then checks if name is NULL or not as needed.

This facilitates later code which allows the client to
be capable of using the new style protocol regardless
of whether an export name is listed or not.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-8-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 17:14:08 +01:00
Daniel P. Berrange
1c778ef729 nbd: convert to using I/O channels for actual socket I/O
Now that all callers are converted to use I/O channels for
initial connection setup, it is possible to switch the core
NBD protocol handling core over to use QIOChannel APIs for
actual sockets I/O.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-7-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 17:13:57 +01:00
Daniel P. Berrange
ae39827802 nbd: convert blockdev NBD server to use I/O channels for connection setup
This converts the blockdev NBD server to use the QIOChannelSocket
class for initial listener socket setup and accepting of client
connections. Actual I/O is still being performed against the
socket file descriptor using the POSIX socket APIs.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-6-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 17:13:49 +01:00
Daniel P. Berrange
d0d6ff584d nbd: convert qemu-nbd server to use I/O channels for connection setup
This converts the qemu-nbd server to use the QIOChannelSocket
class for initial listener socket setup and accepting of client
connections. Actual I/O is still being performed against the
socket file descriptor using the POSIX socket APIs.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-5-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 17:13:40 +01:00
Daniel P. Berrange
064097d919 nbd: convert block client to use I/O channels for connection setup
This converts the NBD block driver client to use the QIOChannelSocket
class for initial connection setup. The NbdClientSession struct has
two pointers, one to the master QIOChannelSocket providing the raw
data channel, and one to a QIOChannel which is the current channel
used for I/O. Initially the two point to the same object, but when
TLS support is added, they will point to different objects.

The qemu-img & qemu-io tools now need to use MODULE_INIT_QOM to
ensure the QIOChannel object classes are registered. The qemu-nbd
tool already did this.

In this initial conversion though, all I/O is still actually done
using the raw POSIX sockets APIs.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-4-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 17:13:22 +01:00
Daniel P. Berrange
0ab3b3375b qemu-nbd: add support for --object command line arg
Allow creation of user creatable object types with qemu-nbd
via a new --object command line arg. This will be used to supply
passwords and/or encryption keys to the various block driver
backends via the recently added 'secret' object type.

 # printf letmein > mypasswd.txt
 # qemu-nbd --object secret,id=sec0,file=mypasswd.txt \
      ...other nbd args...

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-3-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 17:13:06 +01:00
Daniel P. Berrange
90998d5896 qom: add helpers for UserCreatable object types
The QMP monitor code has two helper methods object_add
and qmp_object_del that are called from several places
in the code (QMP, HMP and main emulator startup).

The HMP and main emulator startup code also share
further logic that extracts the qom-type & id
values from a qdict.

We soon need to use this logic from qemu-img, qemu-io
and qemu-nbd too, but don't want those to depend on
the monitor, nor do we want to duplicate the code.

To avoid this, move some code out of qmp.c and hmp.c
adding new methods to qom/object_interfaces.c

 - user_creatable_add - takes a QDict holding a full
   object definition & instantiates it
 - user_creatable_add_type - takes an ID, type name,
   and QDict holding object properties & instantiates
   it
 - user_creatable_add_opts - takes a QemuOpts holding
   a full object definition & instantiates it
 - user_creatable_add_opts_foreach - variant on
   user_creatable_add_opts which can be directly used
   in conjunction with qemu_opts_foreach.
 - user_creatable_del - takes an ID and deletes the
   corresponding object

The existing code is updated to use these new methods.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455129674-17255-2-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 17:12:57 +01:00
Peter Maydell
250f53ddaa Merge remote-tracking branch 'remotes/berrange/tags/pull-io-next-2016-02-16-1' into staging
Merge I/O fixes 2016/02/16 v1

# gpg: Signature made Tue 16 Feb 2016 15:42:29 GMT using RSA key ID 15104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"

* remotes/berrange/tags/pull-io-next-2016-02-16-1:
  io: convert QIOChannelBuffer to use uint8_t instead of char
  io: introduce helper for creating channels from file descriptors
  io: improve docs for QIOChannelSocket async functions

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-16 15:47:35 +00:00
Cédric Le Goater
73d60fa5fa ipmi: sensor number should not exceed MAX_SENSORS
Fix a number of off-by-ones, one of them spotted by Coverity.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 16:41:25 +01:00
Paolo Bonzini
9155b7606a mptsas: fix wrong formula
MPI_DOORBELL_WHO_INIT_SHIFT is being repeated twice.  Reported
by Coverity.

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 16:41:22 +01:00
Paolo Bonzini
18557e646b mptsas: fix memory leak
Reported by Coverity.

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 16:41:20 +01:00
Paolo Bonzini
b44bbeb44b mptsas: add missing va_end
Reported by Coverity.

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 16:41:17 +01:00
Paolo Bonzini
4987783400 migration: fix incorrect memory_global_dirty_log_start outside BQL
This can cause various segmentation faults or aborts in qemu-iotests
test 091.

Fixes: 5b82b703b6
Cc: Dave Gilbert <dgilbert@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 15:34:43 +01:00
Peter Maydell
d5db2ec177 oslib-posix.c: Move workaround for OSX daemon() deprecation to osdep.h
The right place for "work around issues with system headers" code
is osdep.h. Move the workaround for OSX's stdlib.h emitting a
deprecation warning for daemon() to that header.

This also fixes a problem where running clean-includes on
oslib-posix.c would erroneously remove the #include <stdlib.h>
from it, breaking the workaround.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-16 14:29:28 +00:00
Peter Maydell
c964b66022 all: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-16 14:29:28 +00:00
Peter Maydell
2aef8c9134 scripts/tracetool: Include qemu/osdep.h in generated .c files
Include qemu/osdep.h as the first include in generated .c files,
so they don't implicitly rely on some other included header
to pull it in.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-16 14:29:27 +00:00
Peter Maydell
253785e3b9 scripts/feature_to_c.sh: Include qemu/osdep.h rather than config.h
In the .c files generated by this script, include qemu/osdep.h
as the first included header, not config.h.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-16 14:29:27 +00:00
Eric Blake
9167ebd98f qapi: Clean up includes in generated files
As a followup to commit cbf2115, clean up the includes in files
generated by QAPI so that osdep.h is included first in .c files,
and headers which it implies are not included manually.  This
patch is done manually, since Coccinelle (and therefore
scripts/clean-includes) doesn't see into the generator scripts.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-16 14:29:27 +00:00
Peter Maydell
681c28a33e tests: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
2016-02-16 14:29:27 +00:00
Peter Maydell
07b096b418 tests/i440fx-test: Don't define ARRAY_SIZE locally
Don't define ARRAY_SIZE locally; instead include osdep.h for it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-16 14:29:27 +00:00
Peter Maydell
7a4e543de6 libdecnumber: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-16 14:29:27 +00:00
Peter Maydell
66d79920b9 cris: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-16 14:29:26 +00:00
Peter Maydell
70e6b879d7 target-cris: Remove unnecessary ifdef from mmu.c
mmu.c is only built for CONFIG_SOFTMMU targets, so there is
no need to redundantly surround the whole file contents with
an #ifndef CONFIG_USER_ONLY. The ifdef also confuses the
Coccinelle tool.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-16 14:29:26 +00:00
Peter Maydell
74c0e47441 hw/block/nand.c: Include osdep.h first
Include osdep.h as the first header in nand.c; this has to be
done manually because coccinelle gets confused by the way that
this C file includes itself.

We fix some odd spacing in #includes while we are in the area.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-16 14:29:26 +00:00
Eric Blake
888ea96aae build: Don't redefine 'inline'
Actively redefining 'inline' is wrong for C++, where gcc has an
extension 'inline namespace' which fails to compile if the
keyword 'inline' is replaced by a macro expansion.  This will
matter once we start to include "qemu/osdep.h" first from C++
files, depending also on whether the system headers are new
enough to be using the gcc extension.

But rather than just guard things by __cplusplus, let's look at
the overall picture.  Commit df2542c737 in 2007 defined 'inline'
to the gcc attribute __always_inline__, with the rationale "To
avoid discarded inlining bug".  But compilers have improved since
then, and we are probably better off trusting the compiler rather
than trying to force its hand.

So just nuke our craziness.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1455043788-28112-1-git-send-email-eblake@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-16 12:07:03 +00:00
Cao jin
9cfaa0079f change type of pci_bridge_initfn() to void
Since it can`t fail. Also modify the callers.

Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-02-16 12:05:18 +02:00
Cao jin
33c28f3bde dec: convert to realize()
Also because pci_bridge_initfn() can`t fail.

Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-16 12:05:18 +02:00
Victor Kaplansky
4e082566a9 tests: add pxe e1000 and virtio-pci tests
The test is based on bios-tables-test.c.  It creates a file with
the boot sector image and loads it into a guest using PXE and TFTP
functionality.

Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Victor Kaplansky <victork@redhat.com>
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-16 12:05:18 +02:00
Michael S. Tsirkin
e1e4bf2252 msix: fix msix_vector_masked
commit 428c3ece97 ("fix MSI injection on Xen")
inadvertently enabled the xen-specific logic unconditionally.
Limit it to only when xen is enabled.
Additionally, msix data should be read with pci_get_log
since the format is pci little-endian.

Reported-by: "Daniel P. Berrange" <berrange@redhat.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-16 12:05:18 +02:00
Greg Kurz
e5157e313c virtio: optimize virtio_access_is_big_endian() for little-endian targets
When adding cross-endian support, we introduced the TARGET_IS_BIENDIAN macro
and the virtio_access_is_big_endian() helper to have a branchless fast path
in the virtio memory accessors for targets that don't switch endian.

This was considered as a strong requirement at the time.

Now we have added a runtime check for virtio 1.0, which ruins the benefit
of the virtio_access_is_big_endian() helper for always little-endian targets.

With this patch, always little-endian targets stop checking for virtio 1.0,
since the result is little-endian in all cases.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-02-16 12:05:18 +02:00
Greg Kurz
46f70ff148 vhost: simplify vhost_needs_vring_endian()
After the call to virtio_vdev_has_feature(), we only care for legacy
devices, so we don't need the extra check in virtio_is_big_endian().

Also the device_endian field is always set (VIRTIO_DEVICE_ENDIAN_UNKNOWN
may only happen on a virtio_load() path that cannot lead here), so we
don't need the assert() either.

This open codes the device_endian checking in vhost_needs_vring_endian().
It also adds a comment to explain the logic, as recent reviews showed the
cross-endian tweaks aren't that obvious.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-02-16 12:05:18 +02:00
Greg Kurz
e58481234e vhost: move virtio 1.0 check to cross-endian helper
Indeed vhost doesn't need to ask for vring endian fixing if the device is
virtio 1.0, since it is already handled by the in-kernel vhost driver. This
patch simply consolidates the logic into the existing helper.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-02-16 12:05:17 +02:00
Greg Kurz
a122ab2472 virtio: move cross-endian helper to vhost
If target is bi-endian (ppc64, arm), the virtio_legacy_is_cross_endian()
indeed returns the runtime state of the virtio device. However, it returns
false unconditionally in the general case. This sounds a bit strange
given the name of the function.

This helper is only useful for vhost actually, where indeed non bi-endian
targets don't have to deal with cross-endian issues.

This patch moves the helper to vhost.c and gives it a more appropriate name.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-02-16 12:05:17 +02:00
Greg Kurz
3154d1e426 vhost-net: revert support of cross-endian vnet headers
Cross-endian is now handled by the core virtio-net code.

This patch reverts:

commit 5be7d9f1b1
	vhost-net: tell tap backend about the vnet endianness

and

commit cf0a628f6e81bfc9b7a944fa0b80c3594836df56
	net: set endianness on all backend devices

Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-02-16 12:05:17 +02:00
Greg Kurz
1bfa316ce7 virtio-net: use the backend cross-endian capabilities
When running a fully emulated device in cross-endian conditions, including
a virtio 1.0 device offered to a big endian guest, we need to fix the vnet
headers. This is currently handled by the virtio_net_hdr_swap() function
in the core virtio-net code but it should actually be handled by the net
backend.

With this patch, virtio-net now tries to configure the backend to do the
endian fixing when the device starts (i.e. drivers sets the CONFIG_OK bit).
If the backend cannot support the requested endiannes, we have to fallback
onto virtio_net_hdr_swap(): this is recorded in the needs_vnet_hdr_swap flag,
to be used in the TX and RX paths.

Note that we reset the backend to the default behaviour (guest native
endianness) when the device stops (i.e. device status had CONFIG_OK bit and
driver unsets it). This is needed, with the linux tap backend at least,
otherwise the guest may lose network connectivity if rebooted into a
different endianness.

The current vhost-net code also tries to configure net backends. This will
be no more needed and will be reverted in a subsequent patch.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
2016-02-16 12:05:17 +02:00
Paolo Bonzini
98799b0d4b vl: fix migration from prelaunch state
Reproducer is simply to migrate a virtual machine that was started with -S,
or that was already migrated.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 09:27:59 +01:00
Denis V. Lunev
7ec13c798a vl: change QEMU state machine for system reset
This patch implements proposal from Paolo to handle system reset when
the guest is not running.

"After a reset, main_loop_should_exit should actually transition
to VM_STATE_PRELAUNCH (*not* RUN_STATE_PAUSED) for *all* states except
RUN_STATE_INMIGRATE, RUN_STATE_SAVE_VM (which I think cannot happen
there) and (of course) RUN_STATE_RUNNING."

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1455369986-20353-1-git-send-email-den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 09:27:59 +01:00
Eric Blake
b11d029b0a build: Don't redefine 'inline'
Actively redefining 'inline' is wrong for C++, where gcc has an
extension 'inline namespace' which fails to compile if the
keyword 'inline' is replaced by a macro expansion.  This will
matter once we start to include "qemu/osdep.h" first from C++
files, depending also on whether the system headers are new
enough to be using the gcc extension.

But rather than just guard things by __cplusplus, let's look at
the overall picture.  Commit df2542c737 in 2007 defined 'inline'
to the gcc attribute __always_inline__, with the rationale "To
avoid discarded inlining bug".  But compilers have improved since
then, and we are probably better off trusting the compiler rather
than trying to force its hand.

So just nuke our craziness.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1455043788-28112-1-git-send-email-eblake@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 09:27:59 +01:00
Daniel P. Berrange
e046fb4499 char: fix handling of QIO_CHANNEL_ERR_BLOCK
If io_channel_send_full gets QIO_CHANNEL_ERR_BLOCK it
and has already sent some of the data, it should return
that amount of data, not EAGAIN, as that would cause
the caller to re-try already sent data.

Unfortunately due to a previous rebase conflict resolution
error, the code for dealing with this was in the wrong
part of the conditional, and so mistakenly ran on other
I/O errors.

This be seen running

   qemu-system-x86_64 -monitor stdio

and entering 'info mtree', when running on a slow console
(eg a slow remote ssh session). The monitor would get into
an indefinite loop writing the same data until it managed
to send it all without getting EAGAIN.

Reported-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1455288410-27046-1-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 09:27:59 +01:00
Paolo Bonzini
837a183f00 Revert "qemu-char: Keep pty slave file descriptor open until the master is closed"
This reverts commit 34689e206a.

Marc-André Lureau provided the following commentary: "It looks like if
a the slave is opened, then Linux will buffer the master writes, up to
a few kb and then throttle, so it's not entirely blocked but eventually
the guest VM dies.  However, not having any slave open it will simply let
the write go and discard the data.  At least, virt-install configures
a pty for the serial but viewers like virt-manager do not necessarily
open it.  And, if there are no viewers, it will just hang.  If qemu
starts reading all the data from the slave, I don't think interactions
with other slaves will work. I don't see much options but to close the
slave, thus reverting this patch."

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-16 09:27:59 +01:00
Leonid Bloch
8800cf0a33 checkpatch: Eliminate false positive in case of space before square bracket in a definition
Now, macro definition such as "#define abc(x) [x] = y" should pass
without an error.

Signed-off-by: Leonid Bloch <leonid@daynix.com>
Message-Id: <1446112118-12376-3-git-send-email-leonid@daynix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-15 20:02:09 +01:00
Leonid Bloch
409db6eb71 checkpatch: Eliminate false positive in case of comma-space-square bracket
Previously, an error was printed in cases such as:
{ [1] = 5, [2] = 6 }
The space passed OK after a curly brace, but not after a comma.
Now, a space before a square bracket is allowed, if a comma comes before
it.

Signed-off-by: Leonid Bloch <leonid@daynix.com>
Message-Id: <1446112118-12376-2-git-send-email-leonid@daynix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-15 20:02:09 +01:00
Daniel P. Berrange
e8f117f3b3 io: convert QIOChannelBuffer to use uint8_t instead of char
The QIOChannelBuffer struct uses a 'char *' for its data
buffer. It will give simpler type compatibility with the
migration APIs if it uses 'uint8_t *' instead, avoiding
several casts.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-02-15 14:49:18 +00:00
Daniel P. Berrange
c767ae62b9 io: introduce helper for creating channels from file descriptors
Depending on what object a file descriptor refers to a different
type of IO channel will be needed - either a QIOChannelFile or
a QIOChannelSocket. Introduce a qio_channel_new_fd() method
which will return the appropriate channel implementation.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-02-15 14:49:00 +00:00
Daniel P. Berrange
fe81e932ec io: improve docs for QIOChannelSocket async functions
In the docs for qio_channel_socket_connect_async,
qio_channel_socket_listen_async and
qio_channel_socket_dgram_async, mention that the
SocketAddress parameters are copied, so can be freed
immediately.

Reviewed-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-02-15 14:48:25 +00:00
Peter Maydell
80b5d6bfc1 Merge remote-tracking branch 'remotes/rth/tags/pull-i386-20160215' into staging
Add XSAVE, MPX, FSGSBASE.

# gpg: Signature made Mon 15 Feb 2016 11:21:50 GMT using RSA key ID 4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"

* remotes/rth/tags/pull-i386-20160215:
  target-i386: Implement FSGSBASE
  target-i386: Enable CR4/XCR0 features for user-mode
  target-i386: Clear bndregs during legacy near jumps
  target-i386: Implement BNDLDX, BNDSTX
  target-i386: Update BNDSTATUS for exceptions raised by BOUND
  target-i386: Implement BNDCL, BNDCU, BNDCN
  target-i386: Implement BNDMOV
  target-i386: Implement BNDMK
  target-i386: Split up gen_lea_modrm
  target-i386: Perform set/reset_inhibit_irq inline
  target-i386: Enable control registers for MPX
  target-i386: Implement XSAVEOPT
  target-i386: Add XSAVE extension
  target-i386: Rearrange processing of 0F AE
  target-i386: Rearrange processing of 0F 01
  target-i386: Split fxsave/fxrstor implementation

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-15 11:45:11 +00:00
Richard Henderson
07929f2ab2 target-i386: Implement FSGSBASE
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-15 14:50:00 +11:00
Richard Henderson
a114d25d5b target-i386: Enable CR4/XCR0 features for user-mode
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-15 14:50:00 +11:00
Richard Henderson
7d117ce81e target-i386: Clear bndregs during legacy near jumps
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-15 14:50:00 +11:00
Richard Henderson
bdd87b3b59 target-i386: Implement BNDLDX, BNDSTX
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-15 14:50:00 +11:00
Richard Henderson
75d14edcf5 target-i386: Update BNDSTATUS for exceptions raised by BOUND
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-15 14:50:00 +11:00
Richard Henderson
523e28d761 target-i386: Implement BNDCL, BNDCU, BNDCN
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-15 14:50:00 +11:00
Richard Henderson
62b58ba58b target-i386: Implement BNDMOV
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-15 14:50:00 +11:00
Richard Henderson
149b427b32 target-i386: Implement BNDMK
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-15 14:49:52 +11:00
Richard Henderson
a074ce42a3 target-i386: Split up gen_lea_modrm
This is immediately usable by lea and multi-byte nop,
and will be required to implement parts of the mpx spec.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-13 07:59:59 +11:00
Richard Henderson
7f0b7141b4 target-i386: Perform set/reset_inhibit_irq inline
With helpers that can be reused for other things.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-13 07:59:59 +11:00
Richard Henderson
f4f1110e4b target-i386: Enable control registers for MPX
Enable and disable at CPL changes, MSR changes, and XRSTOR changes.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-13 07:59:59 +11:00
Richard Henderson
c9cfe8f9fb target-i386: Implement XSAVEOPT
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-13 07:59:59 +11:00
Richard Henderson
19dc85dba2 target-i386: Add XSAVE extension
This includes XSAVE, XRSTOR, XGETBV, XSETBV, which are all related,
as well as the associate cpuid bits.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-13 07:59:59 +11:00
Richard Henderson
121f315788 target-i386: Rearrange processing of 0F AE
Rather than nesting tests of OP, MOD, and RM, decode them all at once
with a switch.  Also, add some missing #UD checks for e.g. incorrect
LOCK prefix.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-13 07:59:59 +11:00
Richard Henderson
1906b2af7c target-i386: Rearrange processing of 0F 01
Rather than nesting tests of OP, MOD, and RM, decode them
all at once with a switch.  Fixes incorrect decoding of
AMD Pacifica extensions (aka vmrun et al) via op==2 path.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-13 07:59:59 +11:00
Richard Henderson
64dbaff09b target-i386: Split fxsave/fxrstor implementation
We will be able to reuse these pieces for XSAVE/XRSTOR.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-13 07:59:59 +11:00
Peter Maydell
a5af12871f Merge remote-tracking branch 'remotes/sstabellini/tags/xen-2016-02-12' into staging
Xen 2016-02-12

# gpg: Signature made Fri 12 Feb 2016 17:28:09 GMT using RSA key ID 70E1AE90
# gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"

* remotes/sstabellini/tags/xen-2016-02-12:
  xen: Drop __XEN_LATEST_INTERFACE_VERSION__ checks from prior to Xen 4.2
  xen: move xenforeignmemory compat layer into common place
  xen: drop XenXC and associated interface wrappers
  xen: drop xen_xc_hvm_inject_msi wrapper
  xen: drop support for Xen 4.1 and older.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-12 17:36:12 +00:00
Peter Maydell
fc1ec1acff Merge remote-tracking branch 'remotes/mjt/tags/pull-trivial-patches-2016-02-11' into staging
trivial patches for 2016-02-11

# gpg: Signature made Thu 11 Feb 2016 12:16:04 GMT using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"

* remotes/mjt/tags/pull-trivial-patches-2016-02-11:
  w32: include winsock2.h before windows.h
  Adds keycode 86 to the hid_usage_keys translation table.
  s390x: remove s390-zipl.rom
  Passthru CCID card: QOMify
  Emulated CCID card: QOMify
  ES1370: QOMify
  char: fix parameter name / type in BSD codepath
  qmp-spec: fix index in doc
  rdma: remove check on time_spent when calculating mbs
  qemu-sockets: simplify error handling
  cpu: cpu_save/cpu_load is no more
  qom: Correct object_property_get_int() description
  man: virtfs-proxy-helper: Rework awkward sentence
  remove libtool support

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-11 15:09:33 +00:00
Peter Maydell
f163684599 Merge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into staging
# gpg: Signature made Wed 10 Feb 2016 19:23:29 GMT using RSA key ID AAFC390E
# gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>"

* remotes/jnsnow/tags/ide-pull-request:
  ahci: prohibit "restarting" the FIS or CLB engines
  ahci: explicitly reject bad engine states on post_load
  ahci: handle LIST_ON and FIS_ON in map helpers
  ahci: Do not unmap NULL addresses
  fdc: always compile-check debug prints
  ide: fix device_reset to not ignore pending AIO
  ide: Add silent DRQ cancellation
  ide: replace blk_drain_all by blk_drain
  ide: move buffered DMA cancel to core
  ide: code motion
  ide: Prohibit RESET on IDE drives

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-11 13:02:28 +00:00
Paolo Bonzini
1834ed3afc w32: include winsock2.h before windows.h
Recent Fedora complains while compiling ui/sdl.c:

    /usr/x86_64-w64-mingw32/sys-root/mingw/include/winsock2.h:15:2: warning: #warning Please include winsock2.h before windows.h [-Wcpp]

And with this patch we dutifully obey.

Stefan Weil:

Without that patch, windows.h will include winsock.h
(which conflicts with winsock2.h) when compiling sdl.c.

Normally we define WIN32_LEAN_AND_MEAN, and
windows.h won't include winsock.h.

include/ui/sdl2.h and ui/sdl.c undefine that macro,
so the order of the include files is important.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-02-11 15:15:47 +03:00
Daniel Serpell
91dbeeda2d Adds keycode 86 to the hid_usage_keys translation table.
This key is present in international keyboards, between left shift and
the 'Z' key, ant is described in the HID usage tables as "Keyboard
Non-US \ and |": http://www.usb.org/developers/hidpage/Hut1_12v2.pdf

This patch fixes the usb-kbd devices.

Signed-off-by: Daniel Serpell <daniel.serpell@gmail.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-02-11 15:15:47 +03:00
Michael Tokarev
6e9965d429 s390x: remove s390-zipl.rom
This is an s390 boot rom which was used in s390-virtio machine.
but since commit 3538fb6f89
"s390x: remove s390-virtio machine", this file isn't used.
The only place it is referenced in the code is an unused
define ZIPL_FILENAME.  There's also comment in hw/s390/ipl.c
which I'm modifying too, to refer to s390-ccw.img instead.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-02-11 15:15:47 +03:00
Cao jin
059db20419 Passthru CCID card: QOMify
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-02-11 15:15:47 +03:00
Cao jin
35997599aa Emulated CCID card: QOMify
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-02-11 15:15:46 +03:00
Cao jin
0d769044d6 ES1370: QOMify
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-02-11 15:15:46 +03:00
Daniel P. Berrange
0850d49cb6 char: fix parameter name / type in BSD codepath
The BSD impl of qemu_chr_open_pp_fd had mis-declared
its parameter type as ChardevBackend instead of
ChardevCommon. It had also mistakenly used the variable
name 'common' instead of 'backend'.

Tested-by: Sean Bruno <sbruno@freebsd.org>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-02-11 15:15:46 +03:00
Wei Yang
190f34f81e qmp-spec: fix index in doc
The index is duplicated. Just change it.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-02-11 15:15:46 +03:00
Wei Yang
5b648de0ee rdma: remove check on time_spent when calculating mbs
Within the if statement, time_spent is assured to be non-zero.

This patch just removes the check on time_spent when calculating mbs.

Signed-off-by: Wei Yang <richard.weiyang@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-02-11 15:15:46 +03:00
Paolo Bonzini
58c652c08a qemu-sockets: simplify error handling
Just go always through the err label.  (Noticed because Coverity
complains that peer is always non-NULL in the error cleanup code,
but removing the "if" is arguably more prone to introducing the
opposite bug in the future).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-02-11 15:15:46 +03:00
Paolo Bonzini
945123a554 cpu: cpu_save/cpu_load is no more
Everything has been converted to vmstate.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-02-11 15:15:46 +03:00
Alistair Francis
b29b47e9b3 qom: Correct object_property_get_int() description
The description of object_property_get_int() stated that on an error
it returns NULL. This is not the case and the function will return -1
if an error occurs. Update the commented documentation accordingly.

Reported-By: Christian Liebhardt <christian.liebhardt@keysight.com>
Signed-off-by: Christian Liebhardt <christian.liebhardt@keysight.com>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-02-11 15:15:46 +03:00
Christophe Fergeau
b8d8e8fde3 man: virtfs-proxy-helper: Rework awkward sentence
There was a 'capbilities' typo in this man page. This commit
reformulates the sentence the typo was in to make it easier to grasp.
This is based on a suggestion from Eric Blake.

Signed-off-by: Christophe Fergeau <cfergeau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-02-11 15:15:46 +03:00
Michael Tokarev
e999ee4434 remove libtool support
Libtool support was needed to build shared library for libcacard.
Now there's no need to use libtool, and since the build system is
already complicated enough, we have a way to slightly de-complicate
it.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
2016-02-11 15:15:46 +03:00
Peter Maydell
36a9abd9be Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20160211' into staging
target-arm queue:
 * fix some missing traps for EL3 support
 * enable EL3 on Cortex-A53 and Cortex-A57
 * fix syndrome IL bit for Thumb coprocessor, VFP and Neon traps
 * fix mishandling of architectural watchpoints
 * avoid buffer overflow in sd.c
 * fix max-cpus check in virt board
 * implement 'get board revision' query for BCM2835

# gpg: Signature made Thu 11 Feb 2016 11:23:47 GMT using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"

* remotes/pmaydell/tags/pull-target-arm-20160211:
  bcm2835_property: implement "get board revision" query
  hw/arm/virt: fix max-cpus check
  sd: limit 'req.cmd' while using as an array index
  target-arm: Implement checking of fired watchpoint
  cpu: Add callback to check architectural watchpoint match
  target-arm: Fix IL bit reported for Thumb VFP and Neon traps
  target-arm: Fix IL bit reported for Thumb coprocessor traps
  target-arm: Correct misleading 'is_thumb' syn_* parameter names
  target-arm: Enable EL3 for Cortex-A53 and Cortex-A57
  target-arm: Implement NSACR trapping behaviour
  target-arm: Add isread parameter to CPAccessFns
  target-arm: Update arm_generate_debug_exceptions() to handle EL2/EL3
  target-arm: Use access_trap_aa32s_el1() for SCR and MVBAR
  target-arm: Implement MDCR_EL3 and SDCR
  target-arm: Fix typo in comment in arm_is_secure_below_el3()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-11 11:24:16 +00:00
Stephen Warren
f0afa73164 bcm2835_property: implement "get board revision" query
Return a valid value from the BCM2835 property mailbox query "get board
revision". This query is used by U-Boot. Implementing it fixes the first
obvious difference between qemu and real HW.

The value returned is currently hard-coded to match the RPi2 I own. Other
values are legal, e.g. different board manufacturer field values are
likely to exist in the wild.

Cc: Andrew Baumann <Andrew.Baumann@microsoft.com>
Signed-off-by: Stephen Warren <swarren@wwwdotorg.org>
Reviewed-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1454993910-24077-1-git-send-email-swarren@wwwdotorg.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-11 11:17:32 +00:00
Andrew Jones
7ea686f5dd hw/arm/virt: fix max-cpus check
mach-virt doesn't yet support hotplug, but command lines specifying
-smp <num>,maxcpus=<bigger-num> don't fail. Of course specifying
bigger-num as something bigger than the machine supports, e.g. > 8
on a gicv2 machine, should fail though. This fix also makes mach-
virt's max-cpus check truly consistent with the one in vl.c:main,
as the one there was already correctly checking max-cpus instead
of smp-cpus.

Reported-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1454511578-24863-1-git-send-email-drjones@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-11 11:17:32 +00:00
Prasad J Pandit
97f4ed3b71 sd: limit 'req.cmd' while using as an array index
While processing standard SD commands, the 'req.cmd' value could
lead to OOB read when used as an index into 'sd_cmd_type' or
'sd_cmd_class' arrays. Limit 'req.cmd' value to avoid such an
access.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1453315857-1352-1-git-send-email-ppandit@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-11 11:17:32 +00:00
Sergey Fedorov
3826121d92 target-arm: Implement checking of fired watchpoint
ARM stops before access to a location covered by watchpoint. Also, QEMU
watchpoint fire is not necessarily an architectural watchpoint match.
Unfortunately, that is hardly possible to ignore a fired watchpoint in
debug exception handler. So move watchpoint check from debug exception
handler to the dedicated watchpoint checking callback.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454256948-10485-3-git-send-email-serge.fdrv@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-11 11:17:32 +00:00
Sergey Fedorov
568496c0c0 cpu: Add callback to check architectural watchpoint match
When QEMU watchpoint matches, that is not definitely an architectural
watchpoint match yet. If it is a stop-before-access watchpoint then that
is hardly possible to ignore it after throwing a TCG exception.

A special callback is introduced to check for architectural watchpoint
match before raising a TCG exception.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1454256948-10485-2-git-send-email-serge.fdrv@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-11 11:17:32 +00:00
Peter Maydell
7d197d2db5 target-arm: Fix IL bit reported for Thumb VFP and Neon traps
All Thumb Neon and VFP instructions are 32 bits, so the IL
bit in the syndrome register should be set. Pass false to the
syn_* function's is_16bit argument rather than s->thumb
so we report the correct IL bit.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1454683067-16001-4-git-send-email-peter.maydell@linaro.org
2016-02-11 11:17:32 +00:00
Peter Maydell
4df3225930 target-arm: Fix IL bit reported for Thumb coprocessor traps
All Thumb coprocessor instructions are 32 bits, so the IL
bit in the syndrome register should be set. Pass false to the
syn_* function's is_16bit argument rather than s->thumb
so we report the correct IL bit.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1454683067-16001-3-git-send-email-peter.maydell@linaro.org
2016-02-11 11:17:31 +00:00
Peter Maydell
fc05f4a62c target-arm: Correct misleading 'is_thumb' syn_* parameter names
In syndrome register values, the IL bit indicates the instruction
length, and is 1 for 4-byte instructions and 0 for 2-byte
instructions. All A64 and A32 instructions are 4-byte, but
Thumb instructions may be either 2 or 4 bytes long. Unfortunately
we named the parameter to the syn_* functions for constructing
syndromes "is_thumb", which falsely implies that it should be
set for all Thumb instructions, rather than only the 16-bit ones.
Fix the functions to name the parameter 'is_16bit' instead.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1454683067-16001-2-git-send-email-peter.maydell@linaro.org
2016-02-11 11:17:31 +00:00
Peter Maydell
3ad901bc2b target-arm: Enable EL3 for Cortex-A53 and Cortex-A57
Enable EL3 support for our Cortex-A53 and Cortex-A57 CPU models.
We have enough implemented now to be able to run real world code
at least to some extent (I can boot ARM Trusted Firmware to the
point where it pulls in OP-TEE and then falls over because it
doesn't have a UEFI image it can chain to).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1454506721-11843-8-git-send-email-peter.maydell@linaro.org
2016-02-11 11:17:31 +00:00
Peter Maydell
2f027fc52d target-arm: Implement NSACR trapping behaviour
Implement some corner cases of the behaviour of the NSACR
register on ARMv8:
 * if EL3 is AArch64 then accessing the NSACR from Secure EL1
   with AArch32 should trap to EL3
 * if EL3 is not present or is AArch64 then reads from NS EL1 and
   NS EL2 return constant 0xc00

It would in theory be possible to implement all these with
a single reginfo definition, but for clarity we use three
separate definitions for the three cases and install the
right one based on the CPU feature flags.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1454506721-11843-7-git-send-email-peter.maydell@linaro.org
2016-02-11 11:17:31 +00:00
Peter Maydell
3f208fd76b target-arm: Add isread parameter to CPAccessFns
System registers might have access requirements which need to
be described via a CPAccessFn and which differ for reads and
writes. For this to be possible we need to pass the access
function a parameter to tell it whether the access being checked
is a read or a write.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1454506721-11843-6-git-send-email-peter.maydell@linaro.org
2016-02-11 11:17:31 +00:00
Peter Maydell
533e93f1cf target-arm: Update arm_generate_debug_exceptions() to handle EL2/EL3
The arm_generate_debug_exceptions() function as originally implemented
assumes no EL2 or EL3. Since we now have much more of an implementation
of those now, fix this assumption.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1454506721-11843-5-git-send-email-peter.maydell@linaro.org
2016-02-11 11:17:30 +00:00
Peter Maydell
efe4a27408 target-arm: Use access_trap_aa32s_el1() for SCR and MVBAR
The registers MVBAR and SCR should have the behaviour of trapping to
EL3 if accessed from Secure EL1, but we were incorrectly implementing
them to UNDEF (which would trap to EL1).  Fix this by using the new
access_trap_aa32s_el1() access function.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1454506721-11843-4-git-send-email-peter.maydell@linaro.org
2016-02-11 11:17:30 +00:00
Peter Maydell
5513c3abed target-arm: Implement MDCR_EL3 and SDCR
Implement the MDCR_EL3 register (which is SDCR for AArch32).
For the moment we implement it as reads-as-written.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1454506721-11843-3-git-send-email-peter.maydell@linaro.org
2016-02-11 11:17:30 +00:00
Peter Maydell
6b7f0b61f0 target-arm: Fix typo in comment in arm_is_secure_below_el3()
Fix a typo where "EL2" was written but "EL3" intended.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1454506721-11843-2-git-send-email-peter.maydell@linaro.org
2016-02-11 11:17:30 +00:00
Paolo Bonzini
88c73d16ad memory: fix usage of find_next_bit and find_next_zero_bit
The last two arguments to these functions are the last and first bit to
check relative to the base.  The code was using incorrectly the first
bit and the number of bits.  Fix this in cpu_physical_memory_get_dirty
and cpu_physical_memory_all_dirty.  This requires a few changes in the
iteration; change the code in cpu_physical_memory_set_dirty_range to
match.

Fixes: 5b82b70
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Leon Alrae <leon.alrae@imgtec.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Message-id: 1455113505-11237-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-10 22:38:24 +00:00
John Snow
d590474922 ahci: prohibit "restarting" the FIS or CLB engines
If the FIS or DMA engines are already started, do not allow them to be
"restarted." As a side-effect of this change, the migration post-load
routine must be modified to cope. If the engines are listed as "on"
in the migrated registers, they must be cleared to allow the startup
routine to see the transition from "off" to "on".

As a second side-effect, the extra argument to ahci_cond_engine_start
is removed in favor of consistent behavior.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1454103689-13042-5-git-send-email-jsnow@redhat.com
2016-02-10 13:29:40 -05:00
John Snow
f8a6c5f318 ahci: explicitly reject bad engine states on post_load
Currently, we let ahci_cond_start_engines reject weird configurations
where either the DMA (CLB) or FIS engines are said to be started, but
their matching on/off control bit is toggled off.

There should be no way to achieve this, since any time you toggle the
control bit off, the status bit should always follow synchronously.

Preparing for a refactor in cond_start_engines, move the rejection logic
straight up into post_load.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1454103689-13042-4-git-send-email-jsnow@redhat.com
2016-02-10 13:29:40 -05:00
John Snow
f32a2f33c2 ahci: handle LIST_ON and FIS_ON in map helpers
Instead of relying on ahci_cond_start_engines to maintain the
engine status indicators itself, have the lower-layer CLB and FIS mapper
helpers do it themselves.

This makes the cond_start routine slightly nicer to read, and makes sure
that the status indicators will always be correct.

Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1454103689-13042-3-git-send-email-jsnow@redhat.com
2016-02-10 13:29:40 -05:00
John Snow
99b4cb7106 ahci: Do not unmap NULL addresses
Definitely don't try to unmap a garbage address.

Reported-by: Zuozhi fzz <zuozhi.fzz@alibaba-inc.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1454103689-13042-2-git-send-email-jsnow@redhat.com
2016-02-10 13:29:40 -05:00
John Snow
c691320faa fdc: always compile-check debug prints
Coverity noticed that some variables are only used by debug prints, and
called them unused. Always compile the print statements. While we're
here, print to stderr as well.

Bonus: Fix a debug printf I broke in f31937aa8

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Touched up commit message. --js]
Message-id: 1454971529-14830-1-git-send-email-jsnow@redhat.com
2016-02-10 13:29:40 -05:00
John Snow
f34ae00d6d ide: fix device_reset to not ignore pending AIO
Signed-off-by: John Snow <jsnow@redhat.com>
Reported-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1453225191-11871-7-git-send-email-jsnow@redhat.com
2016-02-10 13:29:39 -05:00
John Snow
e3044e2383 ide: Add silent DRQ cancellation
Split apart the ide_transfer_stop function into two versions: one that
interrupts and one that doesn't. The one that doesn't can be used to
halt any PIO transfers that are in the DRQ phase. It will not halt
any PIO transfers that are currently in the process of buffering data
for the guest to read.

Signed-off-by: John Snow <jsnow@redhat.com>
Reported-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[Renamed 'etf' to 'end_transfer_func' --js]
Message-id: 1453225191-11871-6-git-send-email-jsnow@redhat.com
2016-02-10 13:29:39 -05:00
John Snow
51f7b5b883 ide: replace blk_drain_all by blk_drain
Target the drain for just one device.

Signed-off-by: John Snow <jsnow@redhat.com>
Reported-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1453225191-11871-5-git-send-email-jsnow@redhat.com
2016-02-10 13:29:39 -05:00
John Snow
86698a12f7 ide: move buffered DMA cancel to core
Buffered DMA cancellation was added to ATAPI devices and implemented
for the BMDMA HBA. Move the code over to common IDE code and allow
it to be used for any HBA.

Signed-off-by: John Snow <jsnow@redhat.com>
Reported-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1453225191-11871-4-git-send-email-jsnow@redhat.com
2016-02-10 13:29:39 -05:00
John Snow
4590355bb7 ide: code motion
Shuffle the reset function upwards.

Signed-off-by: John Snow <jsnow@redhat.com>
Reported-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1453225191-11871-3-git-send-email-jsnow@redhat.com
2016-02-10 13:29:39 -05:00
John Snow
266e77812c ide: Prohibit RESET on IDE drives
This command is meant for ATAPI devices only, prohibit acknowledging it with
a command aborted response when an IDE device is busy.

Signed-off-by: John Snow <jsnow@redhat.com>
Reported-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1453225191-11871-2-git-send-email-jsnow@redhat.com
2016-02-10 13:29:38 -05:00
Ian Campbell
47d3df2387 xen: Drop __XEN_LATEST_INTERFACE_VERSION__ checks from prior to Xen 4.2
We assume (and check for in configure) 4.2 or later now. In reality
all of the removed checks are for far older versions.

FMT_ioreq_size is no longer needed.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2016-02-10 12:01:32 +00:00
Ian Campbell
6aa0205e49 xen: move xenforeignmemory compat layer into common place
Now that we no longer support Xen 4.2 and earlier only the <470 case
needs this so it can live with all the others.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2016-02-10 12:01:29 +00:00
Ian Campbell
81daba5880 xen: drop XenXC and associated interface wrappers
Now that 4.2 and earlier are no longer supported "xc_interface *" is
always the right type for the xc interface handle.

With this we can also simplify the handling of the xenforeignmemory
compatibility wrapper by making xenforeignmemory_handle ==
xc_interface, instead of an xc_interface* and remove various uses of &
and *h.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2016-02-10 12:01:24 +00:00
Ian Campbell
2ac9f6d4b1 xen: drop xen_xc_hvm_inject_msi wrapper
The xc version is now always present.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2016-02-10 12:01:22 +00:00
Ian Campbell
edfb07ed22 xen: drop support for Xen 4.1 and older.
Xen 4.2 become unsupported upstream in 09/2015 (see
http://wiki.xen.org/wiki/Xen_Release_Features). However as far as the
interfaces provided by the toolstack libraries go 4.2 and 4.3 are
indistinguishable.

Therefore drop support for Xen 4.1 and earlier which removes a whole
pile of compatibility code which makes future work (to use stable
library interfaces provided by upstream) more difficult. In particular
all supported versions now use a pointer as a libxc handle (4.1 and
earlier used an integer, resulting in various shim layers).

Also Xen 4.2 was the first version of Xen to formally support upstream
QEMU (as a preview) so that makes sense as a cut-off now.

This change drops all the configure-y and resulting ifdefs in a mostly
mechanical way. A follow up will refactor wrappers which are now
unused.

Signed-off-by: Ian Campbell <ian.campbell@citrix.com>
Reviewed-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
2016-02-10 12:01:16 +00:00
Peter Maydell
c9f19dff10 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* switch to C11 atomics (Alex)
* Coverity fixes for IPMI (Corey), i386 (Paolo), qemu-char (Paolo)
* at long last, fail on wrong .pc files if -m32 is in use (Daniel)
* qemu-char regression fix (Daniel)
* SAS1068 device (Paolo)
* memory region docs improvements (Peter)
* target-i386 cleanups (Richard)
* qemu-nbd docs improvements (Sitsofe)
* thread-safe memory hotplug (Stefan)

# gpg: Signature made Tue 09 Feb 2016 16:09:30 GMT using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream: (33 commits)
  qemu-char, io: fix ordering of arguments for UDP socket creation
  MAINTAINERS: add all-match entry for qemu-devel@
  get_maintainer.pl: fall back to git if only lists are found
  target-i386: fix PSE36 mode
  docs/memory.txt: Improve list of different memory regions
  ipmi_bmc_sim: Add break to correct watchdog NMI check
  ipmi_bmc_sim: Fix off by one in check.
  ipmi: do not take/drop iothread lock
  target-i386: Deconstruct the cpu_T array
  target-i386: Tidy gen_add_A0_im
  target-i386: Rewrite leave
  target-i386: Rewrite gen_enter inline
  target-i386: Use gen_lea_v_seg in pusha/popa
  target-i386: Access segs via TCG registers
  target-i386: Use gen_lea_v_seg in stack subroutines
  target-i386: Use gen_lea_v_seg in gen_lea_modrm
  target-i386: Introduce mo_stacksize
  target-i386: Create gen_lea_v_seg
  char: fix repeated registration of tcp chardev I/O handlers
  kvm-all: trace: strerror fixup
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-09 19:34:46 +00:00
Peter Maydell
f075c89f0a Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Tue 09 Feb 2016 15:11:25 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request:
  block: add missing call to bdrv_drain_recurse
  blockjob: Fix hang in block_job_finish_sync
  iov: avoid memcpy for "simple" iov_from_buf/iov_to_buf

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-09 17:56:46 +00:00
Paolo Bonzini
150dcd1aed qemu-char, io: fix ordering of arguments for UDP socket creation
Two wrongs make a right, but they should be fixed anyway.

Cc: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1455015557-15106-1-git-send-email-pbonzini@redhat.com>
2016-02-09 17:09:15 +01:00
Peter Maydell
84c0781103 Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2016-02-09' into staging
Error reporting patches for 2016-02-09

# gpg: Signature made Tue 09 Feb 2016 12:38:33 GMT using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"

* remotes/armbru/tags/pull-error-2016-02-09:
  HACKING: Add a section on error handling and reporting
  error: Improve documentation some more
  Use error_fatal to simplify obvious fatal errors (again)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-09 16:09:15 +00:00
Stephen Warren
c9a19d5b95 MAINTAINERS: add all-match entry for qemu-devel@
Add an entry to MAINTAINERS that matches every patch, and requests the
user send patches to qemu-devel@nongnu.org.

It's not 100% obvious to project newcomers that all patches should be sent
there; checkpatch doesn't say so, and since it mentions other lists to CC,
the wording "the list" from the SubmitAPatch wiki page can be taken
to mean only those lists, not the main list too.

The F: entries were taken from a similar entry in the Linux kernel.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: John Snow <jsnow@redhat.com>
Signed-off-by: Stephen Warren <swarren@wwwdotorg.org>
Message-Id: <1454987065-12961-1-git-send-email-swarren@wwwdotorg.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 17:08:56 +01:00
Paolo Bonzini
4db84796e7 get_maintainer.pl: fall back to git if only lists are found
It's not 100% obvious to project newcomers that all patches should be sent
there; checkpatch doesn't say so, and since it mentions other lists to CC,
the wording "the list" from the SubmitAPatch wiki page can be taken
to mean only those lists, not the main list too.  We would like therefore
to add a catch-all entry for qemu-devel@nongnu.org.

On its own, this would break fallback to git, because now every file
has a maintainer of sorts.  Modify get_maintainer.pl so that mailing
lists (L: lines) no longer prevent the fallback, only humans (M:
entries).

Several pre-existing entries have a list but no human.  These now
fall back to git.  That's a feature.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: John Snow <jsnow@redhat.com>
Signed-off-by: Stephen Warren <swarren@wwwdotorg.org>
Message-Id: <1454987065-12961-1-git-send-email-swarren@wwwdotorg.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 17:07:55 +01:00
Paolo Bonzini
388ee48a88 target-i386: fix PSE36 mode
(pde & 0x1fe000) is a 32-bit integer; when shifting it
into bits 39-32 the result is zero.  Fix it by making the
mask (and thus the result of the AND) a 64-bit integer.

Reported by Coverity.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:46:55 +01:00
Peter Maydell
5056c0c3de docs/memory.txt: Improve list of different memory regions
Improve the part of the memory region documentation which describes
the various different kinds of memory region:
 * add the missing types ROM, IOMMU and reservation
 * mention the functions used to initialize each type, as a hint
   for finding the API docs and examples of use

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1454007297-3971-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:46:55 +01:00
Corey Minyard
37eebb8693 ipmi_bmc_sim: Add break to correct watchdog NMI check
It was falling through when it should have been a break.  Found by
Coverity.  The logic could be simplified a bit with a fallthrough,
probably the original thought, but that would be less clear, I think.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <zhaoshenglong@huawei.com>
Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Message-Id: <1452519152-6500-3-git-send-email-minyard@acm.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:46:54 +01:00
Corey Minyard
93a5364620 ipmi_bmc_sim: Fix off by one in check.
Found by Paolo.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <zhaoshenglong@huawei.com>
Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Message-Id: <1452519152-6500-2-git-send-email-minyard@acm.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:46:54 +01:00
Paolo Bonzini
ac5e8acdae ipmi: do not take/drop iothread lock
This is not necessary and actually causes a hang; it was probably copied
and pasted from KVM code, that is one of the very few places that run
outside iothread lock.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:46:54 +01:00
Richard Henderson
1d1cc4d0f4 target-i386: Deconstruct the cpu_T array
All references to cpu_T are done with a constant index.  It aids
readability to decompose the array into two scalar variables.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1436426122-12276-11-git-send-email-rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:46:54 +01:00
Richard Henderson
4e85057b92 target-i386: Tidy gen_add_A0_im
Merge gen_op_addl_A0_im and gen_op_addq_A0_im into gen_add_A0_im
and clean up the ifdef.

Replace the one remaining user of gen_op_addl_A0_im with gen_add_A0_im.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1450379966-28198-10-git-send-email-rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:46:54 +01:00
Richard Henderson
2045f04c3a target-i386: Rewrite leave
Unify the code across stack pointer widths.  Fix the note about
not updating ESP before the potential exception.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1450379966-28198-9-git-send-email-rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:46:54 +01:00
Richard Henderson
743e398e2f target-i386: Rewrite gen_enter inline
Use gen_lea_v_seg for centralized segment base knowledge.  Unify
code across 32- and 64-bit.  Fix note about "must save state"
before using the out-of-line helpers.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1450379966-28198-8-git-send-email-rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:46:54 +01:00
Richard Henderson
d37ea0c047 target-i386: Use gen_lea_v_seg in pusha/popa
More centralization of handling of segment bases.
Also fixes the note about 16-bit wrap around not fully handled.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1450379966-28198-7-git-send-email-rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:46:54 +01:00
Richard Henderson
3558f8055f target-i386: Access segs via TCG registers
Having segs[].base as a register significantly improves code
generation for real and protected modes, particularly for TBs
that have multiple memory references where the segment base
can be held in a hard register through the TB.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1450379966-28198-6-git-send-email-rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:46:52 +01:00
Richard Henderson
77ebcad04f target-i386: Use gen_lea_v_seg in stack subroutines
I.e. gen_push_v, gen_pop_T0, gen_stack_A0.
More centralization of handling of segment bases.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1450379966-28198-5-git-send-email-rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:27 +01:00
Richard Henderson
d6a2914984 target-i386: Use gen_lea_v_seg in gen_lea_modrm
Centralize handling of segment bases.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1450379966-28198-4-git-send-email-rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:27 +01:00
Richard Henderson
64ae256c24 target-i386: Introduce mo_stacksize
Centralize computation of a MO_SIZE for the stack pointer.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1450379966-28198-3-git-send-email-rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:27 +01:00
Richard Henderson
ca2f29f555 target-i386: Create gen_lea_v_seg
Add forgotten zero-extension in the TARGET_X86_64, !CODE64, ss32 case;
use this new function to implement gen_string_movl_A0_EDI,
gen_string_movl_A0_ESI, gen_add_A0_ds_seg.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1450379966-28198-2-git-send-email-rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:26 +01:00
Daniel P. Berrange
1e94f23d82 char: fix repeated registration of tcp chardev I/O handlers
In previous commit:

  commit f2001a7e05
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Tue Jan 19 11:14:30 2016 +0000

    char: don't assume telnet initialization will not block

The code which writes the telnet initialization sequence moved
to an event loop callback. If the TCP chardev is opened as a
server in blocking mode (ie -serial telnet:0.0.0.0:3000,server,wait)
this results in a state where the TCP chardev is connected, but not
yet ready to send/recv data when virtual hardware is created.

When the virtual hardware initialization registers its chardev
callbacks, it triggers tcp_chr_update_read_handler, which will
add I/O watches to the connection.

When the telnet initialization finally runs, it will then call
tcp_chr_connect to finish the connection setup. This will in
turn add I/O watches to the connection too.

There are now two sets of I/O watches registered on the same
connection. This ultimately causes data loss on the connection,
for example, when typing into the telnet console only every
second byte is echoed back to the client.

The same flaw can affect channels running with TLS encryption
too, since they also have delayed connection setup completion.

The fix is to update tcp_chr_update_read_handler so that it
avoids registering watches if the connection is not fully
setup yet.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1454939707-10869-1-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:26 +01:00
Andrew Jones
844a3d34d6 kvm-all: trace: strerror fixup
Signed-off-by: Andrew Jones <drjones@redhat.com>
Message-Id: <1454355464-14999-1-git-send-email-drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:26 +01:00
John Snow
667ad26ff8 nbd: avoid unaligned uint64_t store
cpu_to_be64w can't be used to make unaligned stores, but stq_be_p can.
Also, the st?_be_p takes a void* so it is more clearly suited to the
case where you're writing into a byte buffer.

Use the st?_be_p family of functions everywhere in nbd/server.c.

Signed-off-by: John Snow <jsnow@redhat.com>
[Changed to use st?_be_p everywhere. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:26 +01:00
Janosch Frank
e3dd68df52 scripts/kvm/kvm_stat: Fix tracefs access checking
On kernels build without CONFIG_TRACING kvm_stat will bail out even
when traces are not used. This is not very helpful, especially if the
user can't install a new kernel. Instead, we should warn the user and
fall back to debugfs statistics.

These changes check if trace statistics were selected without kernel
support, warn with a small timeout, set the debugfs statistics option
to True and the tracefs one to False.

Fixes: 7aa4ee5 ('scripts/kvm/kvm_stat: Improve debugfs access checking')
Signed-off-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Message-Id: <1454485291-43849-2-git-send-email-frankja@linux.vnet.ibm.com>
[Exit if -t is passed explicitly. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:26 +01:00
Sitsofe Wheeler
5090121845 qemu-nbd: Fix texi sentence capitalisation
Capitalise the first letter of sentences (and reword for grammar) the
options section of qemu-nbd.texi.

Signed-off-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Message-Id: <1451979212-25479-4-git-send-email-sitsofe@yahoo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:26 +01:00
Sitsofe Wheeler
7e8911bb40 qemu-nbd: Minor texi updates
- Change some spacing.
- Add disconnect usage to synopsis.
- Highlight the command and its options in the synopsis.
- Fix up the grammar in the description.
- Move filename variable description out of the option table.
- Add a description of the dev variable.
- Remove duplicate entry for --format.
- Reword --discard documentation.
- Add --detect-zeroes documentation.
- Add reference to qemu man page to see also section.

Signed-off-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Message-Id: <1451979212-25479-3-git-send-email-sitsofe@yahoo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:26 +01:00
Sitsofe Wheeler
b9dbb61757 qemu-nbd: Fix unintended texi verbatim formatting
Indented lines in the texi meant the perlpod produced interpreted the
paragraph as being verbatim (thus formatting codes were not
interpreted). Fix this by un-indenting problem lines.

Signed-off-by: Sitsofe Wheeler <sitsofe@yahoo.com>
Message-Id: <1451979212-25479-2-git-send-email-sitsofe@yahoo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:26 +01:00
Paolo Bonzini
e351b82611 hw: Add support for LSI SAS1068 (mptsas) device
This adds the SAS1068 device, a SAS disk controller used in VMware that
is oldish but widely supported and has decent performance.  Unlike
megasas, it presents itself as a SAS controller and not as a RAID
controller.  The device corresponds to the mptsas kernel driver in
Linux.

A few small things in the device setup are based on Don Slutz's old
patch, but the device emulation was written from scratch based on Don's
SeaBIOS patch and on the FreeBSD and Linux drivers.  It is 2400 lines
shorter than Don's patch (and roughly the same size as MegaSAS---also
because it doesn't support the similar SPI controller), implements SCSI
task management functions (with asynchronous cancellation), supports
big-endian hosts, has complete support for migration and follows the
QEMU coding standards much more closely.

To write the driver, I first split Don's patch in two parts, with
the configuration bits in one file and the rest in a separate file.
I first left mptconfig.c in place and rewrote the rest, then deleted
mptconfig.c as well.  The configuration pages are still based mostly on
VirtualBox's, though not exactly the same.  However, the implementation
is completely different.  The contents of the pages themselves should
not be copyrightable.

Signed-off-by: Don Slutz <Don@CloudSwitch.com>
Message-Id: <1347382813-5662-1-git-send-email-Don@CloudSwitch.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:26 +01:00
Paolo Bonzini
9fd7e85938 scsi-generic: grab device and port SAS addresses from backend
This lets a SAS adapter expose them through its own configuration
mechanism.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:26 +01:00
Paolo Bonzini
2ecab4084f scsi: push WWN fields up to SCSIDevice
SAS adapters need to access them in order to publish the SAS addresses
of the end devices connected to them.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:26 +01:00
Alex Bennée
a0aa44b488 include/qemu/atomic.h: default to __atomic functions
The __atomic primitives have been available since GCC 4.7 and provide
a richer interface for describing memory ordering requirements. As a
bonus by using the primitives instead of hand-rolled functions we can
use tools such as the ThreadSanitizer which need the use of well
defined APIs for its analysis.

If we have __ATOMIC defines we exclusively use the __atomic primitives
for all our atomic access. Otherwise we fall back to the mixture of
__sync and hand-rolled barrier cases.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <1453976119-24372-4-git-send-email-alex.bennee@linaro.org>
[Use __ATOMIC_SEQ_CST for atomic_mb_read/atomic_mb_set on !POWER. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:26 +01:00
Daniel P. Berrange
977a82ab56 configure: sanity check the glib library that pkg-config finds
Developers on 64-bit machines will often try to perform a
32-bit build of QEMU by running

  ./configure --extra-cflags="-m32"

Unfortunately if PKG_CONFIG_LIBDIR is not set to point to
the location of the 32-bit pkg-config files, then configure
will silently pick up the 64-bit pkg-config files and still
succeed.

This causes a problem for glib because it means QEMU will
be pulling in /usr/lib64/glib-2.0/include/glibconfig.h
instead of /usr/lib/glib-2.0/include/glibconfig.h

This causes problems because the 'gsize' type (defined as
'unsigned long') will no longer be fully compatible with
the 'size_t' type (defined as 'unsigned int'). Although
both are the same size, the compiler refuses to allow
casts from 'unsigned long *' to 'unsigned int *' as they
are different pointer types. This results in non-obvious
compiler errors when building QEMU eg

qga/commands-posix.c: In function ‘qmp_guest_set_user_password’:
qga/commands-posix.c:1912:55: error: passing argument 2 of ‘g_base64_decode’ from incompatible pointer type [-Werror=incompatible-pointer-types]
     rawpasswddata = (char *)g_base64_decode(password, &rawpasswdlen);
                                                            ^
In file included from /usr/include/glib-2.0/glib.h:35:0,
                 from qga/commands-posix.c:14:
/usr/include/glib-2.0/glib/gbase64.h:52:9: note: expected ‘gsize * {aka long unsigned int *}’ but argument is of type ‘size_t * {aka unsigned int *}’
 guchar *g_base64_decode         (const gchar  *text,
         ^
cc1: all warnings being treated as errors

To detect this problem, add a check to configure that
verifies that GLIB_SIZEOF_SIZE_T matches sizeof(size_t).
If this fails print a warning suggesting that the dev
probably needs to set PKG_CONFIG_LIBDIR.

On Fedora x86_64 it passes with any of:

 # ./configure
 # PKG_CONFIG_LIBDIR=/usr/lib/pkgconfig ./configure --extra-cflags="-m32"
 # PKG_CONFIG_LIBDIR=/usr/lib64/pkgconfig ./configure --extra-cflags="-m64"

And fails with a mis-match

 # PKG_CONFIG_LIBDIR=/usr/lib64/pkgconfig ./configure --extra-cflags="-m32"
 # PKG_CONFIG_LIBDIR=/usr/lib/pkgconfig ./configure --extra-cflags="-m64"

ERROR: sizeof(size_t) doesn't match GLIB_SIZEOF_SIZE_T.
       You probably need to set PKG_CONFIG_LIBDIR
       to point to the right pkg-config files for your
       build target

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1453885245-15562-1-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:26 +01:00
Paolo Bonzini
34689e206a qemu-char: Keep pty slave file descriptor open until the master is closed
If a process opens the slave pts device, writes data to it, then
immediately closes it, the data doesn't reliably get delivered to the
emulated serial port. This seems to be because a read of the master
pty device returns EIO on Linux if no process has the pts device open,
even when data is waiting "in the pipe".

A fix seems to be for QEMU to keep the pts file descriptor open until
the pty is closed, as per the below patch.

Signed-off-by: Ashley Jonathan <jonathan.ashley@altran.com>
Message-Id: <AC19797808C8D548ABDE0CA4A97AA30A30DEB409@XMB-DCFR-37.europe.corp.altran.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:26 +01:00
Stefan Hajnoczi
5b82b703b6 memory: RCU ram_list.dirty_memory[] for safe RAM hotplug
Although accesses to ram_list.dirty_memory[] use atomics so multiple
threads can safely dirty the bitmap, the data structure is not fully
thread-safe yet.

This patch handles the RAM hotplug case where ram_list.dirty_memory[] is
grown.  ram_list.dirty_memory[] is change from a regular bitmap to an
RCU array of pointers to fixed-size bitmap blocks.  Threads can continue
accessing bitmap blocks while the array is being extended.  See the
comments in the code for an in-depth explanation of struct
DirtyMemoryBlocks.

I have tested that live migration with virtio-blk dataplane works.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1453728801-5398-2-git-send-email-stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:26 +01:00
Paolo Bonzini
8bafcb2164 memory: add early bail out from cpu_physical_memory_set_dirty_range
This condition is true in the common case, so we can cut out the body of
the function.  In addition, this makes it easier for the compiler to do
at least partial inlining, even if it decides that fully inlining the
function is unreasonable.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-02-09 15:45:26 +01:00
Peter Maydell
2f71f79ccd Merge remote-tracking branch 'remotes/stsquad/tags/pull-build-test-20160209' into staging
This is the third attempt for this pull request.

Since the v4 was posted:
 - fixed merge conflict with ed7f5f1d8d
 - added cleaner separation line to MAINTAINERS at Fam's request
 - skip "make check" for --enable-trace-backends=simple (see 41fc57e44e)

# gpg: Signature made Tue 09 Feb 2016 12:33:45 GMT using RSA key ID 5A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>"

* remotes/stsquad/tags/pull-build-test-20160209:
  MAINTAINERS: Add .travis.yml
  .travis.yml: reduce the test matrix a little
  .travis.yml: enable ccache for the builds
  .travis.yml: enable each of the co-routine backends
  .travis.yml: run make check for all matrix targets
  .travis.yml: migrate to container builds

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-09 14:21:20 +00:00
Paolo Bonzini
9dcf8ecd9e block: add missing call to bdrv_drain_recurse
This is also needed in bdrv_drain_all, not just in bdrv_drain.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1450867706-19860-3-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-09 13:52:26 +00:00
Fam Zheng
794f01414f blockjob: Fix hang in block_job_finish_sync
With a mirror job running on a virtio-blk dataplane disk, sending "q" to
HMP will cause a dead loop in block_job_finish_sync.

This is because the aio_poll() only processes the AIO context of bs
which has no more work to do, while the main loop BH that is scheduled
for setting the job->completed flag is never processed.

Fix this by adding a flag in BlockJob structure, to track which context
to poll for the block job to make progress. Its value is set to true
when block_job_coroutine_complete() is called, and is checked in
block_job_finish_sync to determine which context to poll.

Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1454379144-29807-1-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-09 13:52:26 +00:00
Paolo Bonzini
ad523bca56 iov: avoid memcpy for "simple" iov_from_buf/iov_to_buf
memcpy can take a large amount of time for small reads and writes.
For virtio it is a common case that the first iovec can satisfy the
whole read or write.  In that case, and if bytes is a constant to
avoid excessive growth of code, inline the first iteration
into the caller.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1450782213-14227-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-09 13:52:26 +00:00
Markus Armbruster
d76a3bf5c4 HACKING: Add a section on error handling and reporting
Inspired by an RFC PATCH from Lluís Vilanova.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1454522628-28294-3-git-send-email-armbru@redhat.com>
Reviewed-by: Lluís Vilanova <vilanova@ac.upc.edu>
2016-02-09 13:19:49 +01:00
Markus Armbruster
10303f04b9 error: Improve documentation some more
Don't claim error_report_err() always reports to stderr.  It actually
reports to the current monitor when we have one.

Clarify intended use of error_abort and error_fatal.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1454522628-28294-2-git-send-email-armbru@redhat.com>
Reviewed-by: Lluís Vilanova <vilanova@ac.upc.edu>
2016-02-09 13:19:41 +01:00
Peter Maydell
ac1be2ae6b Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2016-02-09' into staging
QAPI patches for 2016-02-09

# gpg: Signature made Tue 09 Feb 2016 10:55:51 GMT using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"

* remotes/armbru/tags/pull-qapi-2016-02-09: (31 commits)
  qapi: Add missing JSON files in build dependencies
  qapi: Fix compilation failure on MIPS and SPARC
  qmp: Don't abuse stack to track qmp-output root
  qmp: Fix reference-counting of qnull on empty output visit
  qapi: Drop unused error argument for list and implicit struct
  qapi: Tighten qmp_input_end_list()
  qapi: Drop unused 'kind' for struct/enum visit
  qapi: Swap 'name' in visit_* callbacks to match public API
  qom: Swap 'name' next to visitor in ObjectPropertyAccessor
  qapi: Swap visit_* arguments for consistent 'name' placement
  qom: Use typedef for Visitor
  qapi: Don't cast Enum* to int*
  qapi: Consolidate visitor small integer callbacks
  qapi: Make all visitors supply uint64 callbacks
  qapi: Prefer type_int64 over type_int in visitors
  qapi-visit: Kill unused visit_end_union()
  qapi: Track all failures between visit_start/stop
  qapi: Improve generated event use of qapi visitor
  balloon: Improve use of qapi visitor
  vl: Ensure qapi visitor properly ends struct visit
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-09 11:42:43 +00:00
Peter Maydell
74f30f153f Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20160209' into staging
Queued TCG patches

# gpg: Signature made Mon 08 Feb 2016 23:57:30 GMT using RSA key ID 4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"

* remotes/rth/tags/pull-tcg-20160209:
  tcg: Introduce temp_load
  tcg: Change temp_save argument to TCGTemp
  tcg: Change temp_sync argument to TCGTemp
  tcg: Change temp_dead argument to TCGTemp
  tcg: Change reg_to_temp to TCGTemp pointer
  tcg: Remove tcg_get_arg_str_i32/64
  tcg: More use of TCGReg where appropriate
  tcg: Work around clang bug wrt enum ranges
  tcg: Tidy temporary allocation
  tcg: Change ts->mem_reg to ts->mem_base
  tcg: Change tcg_global_mem_new_* to take a TCGv_ptr
  tcg: Remove lingering references to gen_opc_buf
  tcg: Respect highwater in tcg_out_tb_finalize

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-09 09:22:24 +00:00
Richard Henderson
40ae5c62eb tcg: Introduce temp_load
Unify all of the places that realize a temporary into a register.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-09 10:45:34 +11:00
Richard Henderson
b13eb728d3 tcg: Change temp_save argument to TCGTemp
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-09 10:45:34 +11:00
Richard Henderson
12b9b11a27 tcg: Change temp_sync argument to TCGTemp
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-09 10:45:34 +11:00
Richard Henderson
f8bf00f102 tcg: Change temp_dead argument to TCGTemp
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-09 10:45:34 +11:00
Richard Henderson
f8b2f20234 tcg: Change reg_to_temp to TCGTemp pointer
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-09 10:45:34 +11:00
Richard Henderson
e4ce0d4eb7 tcg: Remove tcg_get_arg_str_i32/64
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-09 10:45:34 +11:00
Richard Henderson
b663866231 tcg: More use of TCGReg where appropriate
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-09 10:45:34 +11:00
Richard Henderson
c807402320 tcg: Work around clang bug wrt enum ranges
A subsequent patch patch will change the type of REG from int
to enum TCGReg, which provokes the following bug in clang:

  https://llvm.org/bugs/show_bug.cgi?id=16154

Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-09 10:45:34 +11:00
Richard Henderson
7ca4b752fe tcg: Tidy temporary allocation
In particular, make sure the memory is memset before use.
Continues the increased use of TCGTemp pointers instead of
integer indices where appropriate.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-09 10:19:32 +11:00
Richard Henderson
b3a6293956 tcg: Change ts->mem_reg to ts->mem_base
Chain the temporaries together via pointers intstead of indices.
The mem_reg value is now mem_base->reg.  This will be important later.

This does require that the frame pointer have a global temporary
allocated for it.  This is simple bar the existing reserved_regs check.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-09 10:19:32 +11:00
Richard Henderson
e1ccc05444 tcg: Change tcg_global_mem_new_* to take a TCGv_ptr
Thus, use cpu_env as the parameter, not TCG_AREG0 directly.
Update all uses in the translators.

Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-09 10:19:32 +11:00
Richard Henderson
2015770593 tcg: Remove lingering references to gen_opc_buf
Three in comments and one in code in the stub tcg_liveness_analysis.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-09 10:19:32 +11:00
Richard Henderson
23dceda62a tcg: Respect highwater in tcg_out_tb_finalize
Undo the workaround at b17a6d3390.

If there are lots of memory operations in a TB, the slow path code
can exceed the highwater reservation.  Add a check within the loop.

Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-02-09 10:19:32 +11:00
Alex Bennée
b9e02c061b MAINTAINERS: Add .travis.yml
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-08 18:50:31 +00:00
Alex Bennée
119721907d .travis.yml: reduce the test matrix a little
As we are now running "make check" on more of the matrix it is worth
making more of an effort to reduce the overall load on Travis. I've done
a few things:

  - Combining a number of the targets
  - Building one target for each ancillary build

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-08 18:50:25 +00:00
Alex Bennée
4c33d42d0c .travis.yml: enable ccache for the builds
Travis support ccache on a cache-per-branch basis. Given not much of the
build changes between pushes as well as the duplication in each build it
seems worthwhile enabling this.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-08 18:49:58 +00:00
Alex Bennée
15552dbbee .travis.yml: enable each of the co-routine backends
We disable "make check" for the gthread backend as it is broken.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-08 18:49:58 +00:00
Alex Bennée
01337fbd7f .travis.yml: run make check for all matrix targets
We only ran make check once before it used to be an unreliable target.
It was only a stop gap measure and we should be able to revert it now.
This also stops us needing a large all-MMU build.

We disable "make check" for a couple of the extra config targets which
are currently broken.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-08 18:49:52 +00:00
Lluís Vilanova
423aeaf219 qapi: Add missing JSON files in build dependencies
Forgotten in commit 1dde0f4 (trace.json) and commit fafa4d5
(rocker.json).

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-Id: <145461055662.15201.2702170180078718114.stgit@localhost>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Commit message tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:57 +01:00
Eric Blake
86ae191163 qapi: Fix compilation failure on MIPS and SPARC
Commit 86f4b687 broke compilation on MIPS and SPARC, which have a
preprocessor pollution of '#define mips 1' and '#define sparc 1',
respectively.  Treat it the same way as we do for the pollution with
'unix', so that QMP remains backwards compatible and only the C code
needs to use the alternative 'q_mips', 'q_sparc' spelling.

CC: James Hogan <james.hogan@imgtec.com>
CC: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
Tested-by: James Hogan <james.hogan@imgtec.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:57 +01:00
Eric Blake
455ba08afd qmp: Don't abuse stack to track qmp-output root
The previous commit documented an inconsistency in how we are
using the stack of qmp-output-visitor.  Normally, pushing a
single top-level object puts the object on the stack twice:
once as the root, and once as the current container being
appended to; but popping that struct only pops once.  However,
qmp_ouput_add() was trying to either set up the added object
as the new root (works if you parse two top-level scalars in a
row: the second replaces the first as the root) or as a member
of the current container (works as long as you have an open
container on the stack; but if you have popped the first
top-level container, it then resolves to the root and still
tries to add into that existing container).

Fix the stupidity by not tracking two separate things in the
stack.  Drop the now-useless qmp_output_first() and
qmp_output_last() while at it.

Saved for a later patch: we still are rather sloppy in that
qmp_output_get_object() can be called in the middle of a parse,
rather than requiring that a visit is complete.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1454075341-13658-26-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:57 +01:00
Eric Blake
a861564015 qmp: Fix reference-counting of qnull on empty output visit
Commit 6c2f9a15 ensured that we would not return NULL when the
caller used an output visitor but had nothing to visit. But
in doing so, it added a FIXME about a reference count leak
that could abort qemu in the (unlikely) case of SIZE_MAX such
visits (more plausible on 32-bit).  (Although that commit
suggested we might fix it in time for 2.5, we ran out of time;
fortunately, it is unlikely enough to bite that it was not
worth worrying about during the 2.5 release.)

This fixes things by documenting the internal contracts, and
explaining why the internal function can return NULL and only
the public facing interface needs to worry about qnull(),
thus avoiding over-referencing the qnull_ global object.

It does not, however, fix the stupidity of the stack mixing
up two separate pieces of information; add a FIXME to explain
that issue, which will be fixed shortly in a future patch.

Signed-off-by: Eric Blake <eblake@redhat.com>
Cc: qemu-stable@nongnu.org
Message-Id: <1454075341-13658-25-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:57 +01:00
Eric Blake
08f9541dec qapi: Drop unused error argument for list and implicit struct
No backend was setting an error when ending the visit of a list or
implicit struct, or when moving to the next list node.  Make the
callers a bit easier to follow by making this a part of the contract,
and removing the errp argument - callers can then unconditionally end
an object as part of cleanup without having to think about whether a
second error is dominated by a first, because there is no second
error.

A later patch will then tackle the larger task of splitting
visit_end_struct(), which can indeed set an error.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1454075341-13658-24-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:57 +01:00
Eric Blake
bdd8e6b5d8 qapi: Tighten qmp_input_end_list()
The only way that qmp_input_pop() will set errp is if a dictionary
was the most recent thing pushed.  Since we don't have any
push(struct)/pop(list) or push(list)/pop(struct) mismatches (such
a mismatch is a programming bug), we therefore cannot set errp
inside qmp_input_end_list().  Make this obvious by
using &error_abort.  A later patch will then remove the errp
parameter of qmp_input_pop(), but that will first require the
larger task of splitting visit_end_struct().

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1454075341-13658-23-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:57 +01:00
Eric Blake
337283dffb qapi: Drop unused 'kind' for struct/enum visit
visit_start_struct() and visit_type_enum() had a 'kind' argument
that was usually set to either the stringized version of the
corresponding qapi type name, or to NULL (although some clients
didn't even get that right).  But nothing ever used the argument.
It's even hard to argue that it would be useful in a debugger,
as a stack backtrace also tells which type is being visited.

Therefore, drop the 'kind' argument as dead.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1454075341-13658-22-git-send-email-eblake@redhat.com>
[Harmless rebase mistake cleaned up]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:57 +01:00
Eric Blake
0b2a0d6bb2 qapi: Swap 'name' in visit_* callbacks to match public API
As explained in the previous patches, matching argument order of
'name, &value' to JSON's "name":value makes sense.  However,
while the last two patches were easy with Coccinelle, I ended up
doing this one all by hand.  Now all the visitor callbacks match
the main interface.

The compiler is able to enforce that all clients match the changed
interface in visitor-impl.h, even where two pointers are being
swapped, because only one of the two pointers is const (if that
were not the case, then C's looseness on treating 'char *' like
'void *' would have made review a bit harder).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1454075341-13658-21-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:56 +01:00
Eric Blake
d7bce9999d qom: Swap 'name' next to visitor in ObjectPropertyAccessor
Similar to the previous patch, it's nice to have all functions
in the tree that involve a visitor and a name for conversion to
or from QAPI to consistently stick the 'name' parameter next
to the Visitor parameter.

Done by manually changing include/qom/object.h and qom/object.c,
then running this Coccinelle script and touching up the fallout
(Coccinelle insisted on adding some trailing whitespace).

    @ rule1 @
    identifier fn;
    typedef Object, Visitor, Error;
    identifier obj, v, opaque, name, errp;
    @@
     void fn
    - (Object *obj, Visitor *v, void *opaque, const char *name,
    + (Object *obj, Visitor *v, const char *name, void *opaque,
       Error **errp) { ... }

    @@
    identifier rule1.fn;
    expression obj, v, opaque, name, errp;
    @@
     fn(obj, v,
    -   opaque, name,
    +   name, opaque,
        errp)

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1454075341-13658-20-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:56 +01:00
Eric Blake
51e72bc1dd qapi: Swap visit_* arguments for consistent 'name' placement
JSON uses "name":value, but many of our visitor interfaces were
called with visit_type_FOO(v, &value, name, errp).  This can be
a bit confusing to have to mentally swap the parameter order to
match JSON order.  It's particularly bad for visit_start_struct(),
where the 'name' parameter is smack in the middle of the
otherwise-related group of 'obj, kind, size' parameters! It's
time to do a global swap of the parameter ordering, so that the
'name' parameter is always immediately after the Visitor argument.

Additional reason in favor of the swap: the existing include/qjson.h
prefers listing 'name' first in json_prop_*(), and I have plans to
unify that file with the qapi visitors; listing 'name' first in
qapi will minimize churn to the (admittedly few) qjson.h clients.

Later patches will then fix docs, object.h, visitor-impl.h, and
those clients to match.

Done by first patching scripts/qapi*.py by hand to make generated
files do what I want, then by running the following Coccinelle
script to affect the rest of the code base:
 $ spatch --sp-file script `git grep -l '\bvisit_' -- '**/*.[ch]'`
I then had to apply some touchups (Coccinelle insisted on TAB
indentation in visitor.h, and botched the signature of
visit_type_enum() by rewriting 'const char *const strings[]' to
the syntactically invalid 'const char*const[] strings').  The
movement of parameters is sufficient to provoke compiler errors
if any callers were missed.

    // Part 1: Swap declaration order
    @@
    type TV, TErr, TObj, T1, T2;
    identifier OBJ, ARG1, ARG2;
    @@
     void visit_start_struct
    -(TV v, TObj OBJ, T1 ARG1, const char *name, T2 ARG2, TErr errp)
    +(TV v, const char *name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp)
     { ... }

    @@
    type bool, TV, T1;
    identifier ARG1;
    @@
     bool visit_optional
    -(TV v, T1 ARG1, const char *name)
    +(TV v, const char *name, T1 ARG1)
     { ... }

    @@
    type TV, TErr, TObj, T1;
    identifier OBJ, ARG1;
    @@
     void visit_get_next_type
    -(TV v, TObj OBJ, T1 ARG1, const char *name, TErr errp)
    +(TV v, const char *name, TObj OBJ, T1 ARG1, TErr errp)
     { ... }

    @@
    type TV, TErr, TObj, T1, T2;
    identifier OBJ, ARG1, ARG2;
    @@
     void visit_type_enum
    -(TV v, TObj OBJ, T1 ARG1, T2 ARG2, const char *name, TErr errp)
    +(TV v, const char *name, TObj OBJ, T1 ARG1, T2 ARG2, TErr errp)
     { ... }

    @@
    type TV, TErr, TObj;
    identifier OBJ;
    identifier VISIT_TYPE =~ "^visit_type_";
    @@
     void VISIT_TYPE
    -(TV v, TObj OBJ, const char *name, TErr errp)
    +(TV v, const char *name, TObj OBJ, TErr errp)
     { ... }

    // Part 2: swap caller order
    @@
    expression V, NAME, OBJ, ARG1, ARG2, ERR;
    identifier VISIT_TYPE =~ "^visit_type_";
    @@
    (
    -visit_start_struct(V, OBJ, ARG1, NAME, ARG2, ERR)
    +visit_start_struct(V, NAME, OBJ, ARG1, ARG2, ERR)
    |
    -visit_optional(V, ARG1, NAME)
    +visit_optional(V, NAME, ARG1)
    |
    -visit_get_next_type(V, OBJ, ARG1, NAME, ERR)
    +visit_get_next_type(V, NAME, OBJ, ARG1, ERR)
    |
    -visit_type_enum(V, OBJ, ARG1, ARG2, NAME, ERR)
    +visit_type_enum(V, NAME, OBJ, ARG1, ARG2, ERR)
    |
    -VISIT_TYPE(V, OBJ, NAME, ERR)
    +VISIT_TYPE(V, NAME, OBJ, ERR)
    )

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1454075341-13658-19-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:56 +01:00
Eric Blake
4fa45492c3 qom: Use typedef for Visitor
No need to repeat 'struct Visitor' when we already have it in
typedefs.h.  Omitting the redundant 'struct' also makes a later
patch easier to search for all object property callbacks that
are associated with a Visitor.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1454075341-13658-18-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:56 +01:00
Eric Blake
395a233f7c qapi: Don't cast Enum* to int*
C compilers are allowed to represent enums as a smaller type
than int, if all enum values fit in the smaller type.  There
are even compiler flags that force the use of this smaller
representation, although using them changes the ABI of a
binary. Therefore, our generated code for visit_type_ENUM()
(for all qapi enums) was wrong for casting Enum* to int* when
calling visit_type_enum().

It appears that no one has been using compiler ABI switches
for qemu, because if they had, we are potentially dereferencing
beyond bounds or even risking a SIGBUS on platforms where
unaligned pointer dereferencing is fatal.  But it is still
better to avoid the practice entirely, and just use the correct
types.

This matches the fix for alternate qapi types, done earlier in
commit 0426d53 "qapi: Simplify visiting of alternate types",
with generated code changing as:

| void visit_type_QType(Visitor *v, QType *obj, const char *name, Error **errp)
| {
|-    visit_type_enum(v, (int *)obj, QType_lookup, "QType", name, errp);
|+    int value = *obj;
|+    visit_type_enum(v, &value, QType_lookup, "QType", name, errp);
|+    *obj = value;
| }

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1454075341-13658-17-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:55 +01:00
Eric Blake
04e070d217 qapi: Consolidate visitor small integer callbacks
Commit 4e27e819 introduced optional visitor callbacks for all
sorts of int types, but no visitor has supplied any of the
callbacks for sizes less than 64 bits.  In other words, the
generic implementation based on using type_[u]int64() followed
by bounds-checking works just fine. In the interest of
simplicity, it's easier to make the visitor callback interface
not have to worry about the other sizes.

Adding some helper functions minimizes the boilerplate required
to correct FIXMEs added earlier with regards to questionable
reuse of errp, particularly now that we can guarantee from a
single file audit that value is unchanged if an error is set.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1454075341-13658-16-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:55 +01:00
Eric Blake
f755dea79d qapi: Make all visitors supply uint64 callbacks
Our qapi visitor contract supports multiple integer visitors,
but left the type_uint64 visitor as optional (falling back on
type_int64); which in turn can lead to awkward behavior with
numbers larger than INT64_MAX (the user has to be aware of
twos complement, and deal with negatives).

This patch does not address the disparity in handling large
values as negatives.  It merely moves the fallback from uint64
to int64 from the visitor core to the visitors, where the issue
can actually be fixed, by implementing the missing type_uint64()
callbacks on top of the respective type_int64() callbacks, and
with a FIXME comment explaining why that's wrong.

With that done, we now have a type_uint64() callback in every
driver, so we can make it mandatory from the core.  And although
the type_int64() callback can cover the entire valid range of
type_uint{8,16,32} on valid user input, using type_uint64() to
avoid mixed signedness makes more sense.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1454075341-13658-15-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:55 +01:00
Eric Blake
4c40314a35 qapi: Prefer type_int64 over type_int in visitors
The qapi builtin type 'int' is basically shorthand for the type
'int64'.  In fact, since no visitor was providing the optional
type_int64() callback, visit_type_int64() was just always falling
back to type_int(), cementing the equivalence between the types.

However, some visitors are providing a type_uint64() callback.
For purposes of code consistency, it is nicer if all visitors
use the paired type_int64/type_uint64 names rather than the
mismatched type_int/type_uint64.  So this patch just renames
the signed int callbacks in place, dropping the type_int()
callback as redundant, and a later patch will focus on the
unsigned int callbacks.

Add some FIXMEs to questionable reuse of errp in code touched
by the rename, while at it (the reuse works as long as the
callbacks don't modify value when setting an error, but it's not
a good example to set) - a later patch will then fix those.

No change in functionality here, although further cleanups are
in the pipeline.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1454075341-13658-14-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:55 +01:00
Eric Blake
7c91aabd89 qapi-visit: Kill unused visit_end_union()
The generated code can call visit_end_union() without having called
visit_start_union().  Example:

        if (!*obj) {
            goto out_obj;
        }
        visit_type_CpuInfoBase_fields(v, (CpuInfoBase **)obj, &err);
        if (err) {
            goto out_obj; // if we go from here...
        }
        if (!visit_start_union(v, !!(*obj)->u.data, &err) || err) {
            goto out_obj;
        }
        switch ((*obj)->arch) {
    [...]
        }
    out_obj:
        // ... then *obj is true, and ...
        error_propagate(errp, err);
        err = NULL;
        if (*obj) {
            // we end up here
            visit_end_union(v, !!(*obj)->u.data, &err);
        }
        error_propagate(errp, err);

Harmless only because no visitor implements end_union().  Clean it up
anyway, by deleting the function as useless.

Messed up since we have visit_end_union (commit cee2ded).

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1453902888-20457-3-git-send-email-armbru@redhat.com>
[expand scope of patch to delete rather than repair]
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1454075341-13658-13-git-send-email-eblake@redhat.com>
2016-02-08 17:29:55 +01:00
Eric Blake
92b09babc1 qapi: Track all failures between visit_start/stop
Inside the generated code between visit_start_struct() and
visit_end_struct(), we were blindly setting the error into
the caller's errp parameter.  But a future patch to split
visit_end_struct() will require that we take action based
on whether an error has occurred, which requires us to track
all actions through a local err.  Rewrite the visits to be
more in line with the other generated calls.

Generated code changes look like:

|     visit_start_struct(v, (void **)obj, "Abort", name, sizeof(Abort), &err);
|-    if (!err) {
|-        if (*obj) {
|-            visit_type_Abort_fields(v, obj, errp);
|-        }
|-        visit_end_struct(v, &err);
|+    if (err) {
|+        goto out;
|     }
|+    if (!*obj) {
|+        goto out_obj;
|+    }
|+    visit_type_Abort_fields(v, obj, &err);
|+    error_propagate(errp, err);
|+    err = NULL;
|+out_obj:
|+    visit_end_struct(v, &err);
|+out:
|     error_propagate(errp, err);
| }

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1454075341-13658-12-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:55 +01:00
Eric Blake
a16e3e5c58 qapi: Improve generated event use of qapi visitor
All other successful clients of visit_start_struct() were paired
with an unconditional visit_end_struct(); but the generated
code for events was relying on qmp_output_visitor_cleanup() to
work on an incomplete visit.  Alter the code to guarantee that
the struct is completed, which will make a future patch to
split visit_end_struct() easier to reason about.  While at it,
drop some assertions and comments that are not present in other
uses of the qmp output visitor, and pass NULL rather than "" as
the 'kind' parameter (matching most other uses where obj is NULL).

The changes to the generated code look like:

|     qmp = qmp_event_build_dict("DEVICE_TRAY_MOVED");
|
|     qov = qmp_output_visitor_new();
|-    g_assert(qov);
|-
|     v = qmp_output_get_visitor(qov);
|-    g_assert(v);
|
|-    /* Fake visit, as if all members are under a structure */
|-    visit_start_struct(v, NULL, "", "DEVICE_TRAY_MOVED", 0, &err);
|+    visit_start_struct(v, NULL, NULL, "DEVICE_TRAY_MOVED", 0, &err);
|     if (err) {
|         goto out;
|     }
|     visit_type_str(v, (char **)&device, "device", &err);
|     if (err) {
|-        goto out;
|+        goto out_obj;
|     }
|     visit_type_bool(v, &tray_open, "tray-open", &err);
|     if (err) {
|-        goto out;
|+        goto out_obj;
|     }
|-    visit_end_struct(v, &err);
|+out_obj:
|+    visit_end_struct(v, err ? NULL : &err);
|     if (err) {
|         goto out;
|     }
|
|     obj = qmp_output_get_qobject(qov);
|-    g_assert(obj != NULL);
|+    g_assert(obj);
|
|     qdict_put_obj(qmp, "data", obj);
|     emit(QAPI_EVENT_DEVICE_TRAY_MOVED, qmp, &err);

Note that the 'goto out_obj' with no intervening code before the
label, as well as the construct of 'err ? NULL : &err', are both
a bit unusual but also temporary; they get fixed in a later patch
that splits visit_end_struct() to drop its errp parameter by moving
some checking before the label.  But until that time, this was the
simplest way to avoid the appearance of passing a possibly-set
error to visit_end_struct(), even though actual code inspection
shows that visit_end_struct() for a QMP output visitor will never
set an error.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1454075341-13658-11-git-send-email-eblake@redhat.com>
[Commit message's code diff tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:55 +01:00
Eric Blake
9dbb8fa7ef balloon: Improve use of qapi visitor
Rework the control flow of balloon_stats_get_all() to make it
easier for a later patch to split visit_end_struct().  Also
switch to the uint64 visitor to match the data type.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1454075341-13658-10-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:55 +01:00
Eric Blake
014791b0df vl: Ensure qapi visitor properly ends struct visit
Guarantee that visit_end_struct() is called if
visit_start_struct() succeeded.  This matches the behavior of
most other uses of visitors, and is a step towards the possibility
of a future patch that adds and enforces some tighter semantics to
the visitor interface (namely, cleanup of the visitor would no
longer have to mop up as many leftovers from an aborted partial
visit).

The change to code here matches the flow of hmp.c:hmp_object_add();
a later patch will then further simplify the cleanup logic of both
places by refactoring visit_end_struct() to not require a second
local error object.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1454075341-13658-9-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:55 +01:00
Eric Blake
9b65859d5e hmp: Cache use of qapi visitor
Cache the visitor in a local variable instead of repeatedly
calling the accessor.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1454075341-13658-8-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:55 +01:00
Eric Blake
7019738d4c hmp: Drop pointless allocation during qapi visit
The qapi visitor contract allows us to visit a virtual structure,
where we don't have any corresponding qapi struct.  Most such uses
pass NULL for @obj; but these two callers were passing a dummy
pointer, which then gets allocated to heap memory but then
immediately freed without use.  Clean this up to suppress unwanted
allocation, like we do elsewhere.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1454075341-13658-7-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:55 +01:00
Eric Blake
e408311546 qapi: Drop dead parameter in gen_params()
Commit 5cdc8831 reworked gen_params() to be simpler, but forgot
to clean up a now-unused errp named argument.

No change to generated code.

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1454075341-13658-6-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:55 +01:00
Eric Blake
4894b00b27 qapi: Dealloc visitor does not need a type_size()
The intent of having the visitor type_size() callback differ
from type_uint64() is to allow special handling for sizes; the
visitor core gracefully falls back to type_uint64() if there is
no need for the distinction.  Since the dealloc visitor does
nothing for any of the int visits, drop the pointless size
handler.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1454075341-13658-5-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:54 +01:00
Eric Blake
77577cb8d6 qapi: Drop dead dealloc visitor variable
Commit 0b9d8542 added StackEntry.is_list_head, but forgot to
delete the now-unused QapiDeallocVisitor.is_list_head.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1454075341-13658-4-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:54 +01:00
Eric Blake
d7bea75d35 qapi: Avoid use of misnamed DO_UPCAST()
The macro DO_UPCAST() is incorrectly named: it converts from a
parent class to a derived class (which is a downcast).  Better,
and more consistent with some of the other qapi visitors, is
to use the container_of() macro through a to_FOO() helper.  Names
like 'to_ov()' may be a bit short, but for a static helper it
doesn't hurt too much, and matches existing practice in files
like qmp-input-visitor.c.

Our current definition of container_of() is weaker than
DO_UPCAST(), in that it does not require the derived class to
have Visitor as its first member, but this does not hurt our
usage patterns in qapi visitors.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1454075341-13658-3-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:54 +01:00
Eric Blake
6e8e5cb9aa qobject: Document more shortcomings in our number handling
We've already documented that our JSON parsing is locale dependent;
but we should also document that our JSON output has the same
problem.  Additionally, JSON requires finite values (you have to
upgrade to JSON5 to get support for Inf or NaN), and our output
truncates floating point numbers to the point of losing significant
precision that could cause the receiver to read a different value.

Sadly, this series is not going to be the one that addresses these
problems.

Fix some trailing whitespace I noticed in the vicinity.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1454075341-13658-2-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-08 17:29:54 +01:00
Markus Armbruster
03e188102c tests: Use Python 2.6 "except E as ..." syntax
PEP 8 calls for it, because it's forward compatible with Python 3.
Supported since Python 2.6, which we require (commit fec2103).

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1450425164-24969-5-git-send-email-armbru@redhat.com>
2016-02-08 17:29:54 +01:00
Markus Armbruster
86b227d984 Revert "tracetool: use Python 2.4-compatible exception handling syntax"
This reverts commit 662da3854e.

We require Python 2.6 now (commit fec2103).

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1450425164-24969-4-git-send-email-armbru@redhat.com>
2016-02-08 17:29:54 +01:00
Markus Armbruster
cf6c63456b scripts/qmp: Use Python 2.6 "except E as ..." syntax
PEP 8 calls for it, because it's forward compatible with Python 3.
Supported since Python 2.6, which we require (commit fec2103).

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1450425164-24969-3-git-send-email-armbru@redhat.com>
2016-02-08 17:29:54 +01:00
Markus Armbruster
291928a80f qapi: Use Python 2.6 "except E as ..." syntax
PEP 8 calls for it, because it's forward compatible with Python 3.
Supported since Python 2.6, which we require (commit fec2103).

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1450425164-24969-2-git-send-email-armbru@redhat.com>
2016-02-08 17:29:54 +01:00
Markus Armbruster
07d04a0219 Use error_fatal to simplify obvious fatal errors (again)
Done with the Coccinelle semantic patch from commit 007b065, plus
manual clean up of dead variables.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1452783732-6581-1-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-08 17:22:00 +01:00
Peter Maydell
e4a096b1cd ui/cocoa.m: Include qemu/osdep.h
Include "qemu/osdep.h". (This is a manual commit equivalent
to what the clean-includes script would do, because that
script can't handle ObjectiveC source files.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454084614-5365-1-git-send-email-peter.maydell@linaro.org
2016-02-08 13:14:40 +00:00
Peter Maydell
bdad0f3977 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc and misc cleanups and fixes, virtio optimizations

Included here:
Refactoring and bugfix patches in PC/ACPI.
New commands for ipmi.
Virtio optimizations.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Sat 06 Feb 2016 18:44:26 GMT using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream: (45 commits)
  net: set endianness on all backend devices
  fix MSI injection on Xen
  intel_iommu: large page support
  dimm: Correct type of MemoryHotplugState->base
  pc: set the OEM fields in the RSDT and the FADT from the SLIC
  acpi: add function to extract oem_id and oem_table_id from the user's SLIC
  acpi: expose oem_id and oem_table_id in build_rsdt()
  acpi: take oem_id in build_header(), optionally
  pc: Eliminate PcGuestInfo struct
  pc: Move APIC and NUMA data from PcGuestInfo to PCMachineState
  pc: Move PcGuestInfo.fw_cfg to PCMachineState
  pc: Remove PcGuestInfo.isapc_ram_fw field
  pc: Remove RAM size fields from PcGuestInfo
  pc: Remove compat fields from PcGuestInfo
  acpi: Don't save PcGuestInfo on AcpiBuildState
  acpi: Remove guest_info parameters from functions
  pc: Simplify xen_load_linux() signature
  pc: Simplify pc_memory_init() signature
  pc: Eliminate struct PcGuestInfoState
  pc: Move PcGuestInfo declaration to top of file
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-08 11:25:31 +00:00
Laurent Vivier
a407644079 net: set endianness on all backend devices
commit 5be7d9f1b1
       vhost-net: tell tap backend about the vnet endianness

makes vhost net to set the endianness of the device, but only for
the first device.

In case of multiqueue, we have multiple devices... This patch sets the
endianness for all the devices of the interface.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
2016-02-06 20:44:10 +02:00
Stefano Stabellini
428c3ece97 fix MSI injection on Xen
On Xen MSIs can be remapped into pirqs, which are a type of event
channels. It's mostly for the benefit of PCI passthrough devices, to
avoid the overhead of interacting with the emulated lapic.

However remapping interrupts and MSIs is also supported for emulated
devices, such as the e1000 and virtio-net.

When an interrupt or an MSI is remapped into a pirq, masking and
unmasking is done by masking and unmasking the event channel. The
masking bit on the PCI config space or MSI-X table should be ignored,
but it isn't at the moment.

As a consequence emulated devices which use MSI or MSI-X, such as
virtio-net, don't work properly (the guest doesn't receive any
notifications). The mechanism was working properly when xen_apic was
introduced, but I haven't narrowed down which commit in particular is
causing the regression.

Fix the issue by ignoring the masking bit for MSI and MSI-X which have
been remapped into pirqs.

Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:10 +02:00
Jason Wang
d66b969b0d intel_iommu: large page support
Current intel_iommu only supports 4K page which may not be sufficient
to cover guest working set. This patch tries to enable 2M and 1G mapping
for intel_iommu. This is also useful for future device IOTLB
implementation to have a better hit rate.

Major work is adding a page mask field on IOTLB entry to make it
support large page. And also use the slpte level as key to do IOTLB
lookup. MAMV was increased to 18 to support direct invalidation for 1G
mapping.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:10 +02:00
David Gibson
adcb4ee660 dimm: Correct type of MemoryHotplugState->base
The 'base' field of MemoryHotplugState is ram_addr_t, which indicates that
it exists in the abstract address space of RAM regions.

However, the actual usage of this field indicates that it is a concrete
physical address (it's passed as an offset to memory_region_add_subgregion
for example).

So, correct its type to 'hwaddr'.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
2016-02-06 20:44:10 +02:00
Laszlo Ersek
ae12374951 pc: set the OEM fields in the RSDT and the FADT from the SLIC
The Microsoft spec about the SLIC and MSDM ACPI tables at
<http://go.microsoft.com/fwlink/p/?LinkId=234834> requires the OEM ID and
OEM Table ID fields to be consistent between the SLIC and the RSDT/XSDT.
That further affects the FADT, because a similar match between the FADT
and the RSDT/XSDT is required by the ACPI spec in general.

This patch wires up the previous three patches.

Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:ACPI/SMBIOS)
Cc: Igor Mammedov <imammedo@redhat.com> (supporter:ACPI/SMBIOS)
Cc: Paolo Bonzini <pbonzini@redhat.com> (maintainer:X86)
Cc: Richard W.M. Jones <rjones@redhat.com>
Cc: Aleksei Kovura <alex3kov@zoho.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>
Cc: Steven Newbury <steve@snewbury.org.uk>
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1248758
LP: https://bugs.launchpad.net/qemu/+bug/1533848
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Steven Newbury <steve@snewbury.org.uk>
2016-02-06 20:44:10 +02:00
Laszlo Ersek
88594e4fd1 acpi: add function to extract oem_id and oem_table_id from the user's SLIC
The acpi_get_slic_oem() function stores pointers to these fields in the
(first) SLIC table that the user passes in with the -acpitable switch.

Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:ACPI/SMBIOS)
Cc: Igor Mammedov <imammedo@redhat.com> (supporter:ACPI/SMBIOS)
Cc: Richard W.M. Jones <rjones@redhat.com>
Cc: Aleksei Kovura <alex3kov@zoho.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>
Cc: Steven Newbury <steve@snewbury.org.uk>
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1248758
LP: https://bugs.launchpad.net/qemu/+bug/1533848
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Steven Newbury <steve@snewbury.org.uk>
2016-02-06 20:44:10 +02:00
Laszlo Ersek
5151355898 acpi: expose oem_id and oem_table_id in build_rsdt()
Since build_rsdt() is implemented as common utility code (in
"hw/acpi/aml-build.c"), it should expose -- and forward -- the oem_id and
oem_table_id parameters between board code and the generic build_header()
function.

Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:ACPI/SMBIOS)
Cc: Igor Mammedov <imammedo@redhat.com> (supporter:ACPI/SMBIOS)
Cc: Shannon Zhao <zhaoshenglong@huawei.com> (maintainer:ARM ACPI Subsystem)
Cc: Paolo Bonzini <pbonzini@redhat.com> (maintainer:X86)
Cc: Richard W.M. Jones <rjones@redhat.com>
Cc: Aleksei Kovura <alex3kov@zoho.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>
Cc: Steven Newbury <steve@snewbury.org.uk>
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1248758
LP: https://bugs.launchpad.net/qemu/+bug/1533848
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
2016-02-06 20:44:10 +02:00
Laszlo Ersek
37ad223c51 acpi: take oem_id in build_header(), optionally
This patch is the continuation of commit 8870ca0e94 ("acpi: support
specified oem table id for build_header"). It will allow us to control the
OEM ID field too in the SDT header.

Cc: "Michael S. Tsirkin" <mst@redhat.com> (supporter:ACPI/SMBIOS)
Cc: Igor Mammedov <imammedo@redhat.com> (supporter:ACPI/SMBIOS)
Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com> (maintainer:NVDIMM)
Cc: Shannon Zhao <zhaoshenglong@huawei.com> (maintainer:ARM ACPI Subsystem)
Cc: Paolo Bonzini <pbonzini@redhat.com> (maintainer:X86)
Cc: Richard W.M. Jones <rjones@redhat.com>
Cc: Aleksei Kovura <alex3kov@zoho.com>
Cc: Michael Tokarev <mjt@tls.msk.ru>
Cc: Steven Newbury <steve@snewbury.org.uk>
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1248758
LP: https://bugs.launchpad.net/qemu/+bug/1533848
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
2016-02-06 20:44:10 +02:00
Eduardo Habkost
e4e8ba04c2 pc: Eliminate PcGuestInfo struct
The struct is not used for anything, now.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:10 +02:00
Eduardo Habkost
dd4c2f01ab pc: Move APIC and NUMA data from PcGuestInfo to PCMachineState
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-02-06 20:44:10 +02:00
Eduardo Habkost
f264d360e0 pc: Move PcGuestInfo.fw_cfg to PCMachineState
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-02-06 20:44:09 +02:00
Eduardo Habkost
5db3f0deaf pc: Remove PcGuestInfo.isapc_ram_fw field
The code can use the PCMachineClass.pci_enabled field directly.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-02-06 20:44:09 +02:00
Eduardo Habkost
5299f1c70a pc: Remove RAM size fields from PcGuestInfo
The ACPI code can use the PCMachineState fields directly.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-02-06 20:44:09 +02:00
Eduardo Habkost
bb292f5a9b pc: Remove compat fields from PcGuestInfo
Remove the fields: legacy_acpi_table_size, has_acpi_build,
has_reserved_memory, and rsdp_in_ram from PcGuestInfo, and let
the existing code use the PCMachineClass fields directly.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-02-06 20:44:09 +02:00
Eduardo Habkost
f944d4798c acpi: Don't save PcGuestInfo on AcpiBuildState
We don't need to save the pointer on AcpiBuildState, as it is not
used anymore.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-02-06 20:44:09 +02:00
Eduardo Habkost
fb306ffeba acpi: Remove guest_info parameters from functions
We can use PC_MACHINE(qdev_get_machine())->acpi_guest_info to get
guest_info.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:09 +02:00
Eduardo Habkost
7bc35e0f20 pc: Simplify xen_load_linux() signature
We can get the PcGuestInfo struct directly from PCMachineState,
and the return value is not needed at all.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-02-06 20:44:09 +02:00
Eduardo Habkost
5934e2169a pc: Simplify pc_memory_init() signature
We can get the PcGuestInfo struct directly from PCMachineState,
and the return value is not needed at all.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-02-06 20:44:09 +02:00
Eduardo Habkost
9ebeed0c1e pc: Eliminate struct PcGuestInfoState
Instead of allocating a new struct just for PcGuestInfo and the
mchine_done Notifier, place them inside PCMachineState.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-02-06 20:44:09 +02:00
Eduardo Habkost
281b104702 pc: Move PcGuestInfo declaration to top of file
The struct will be used inside PCMachineState.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-02-06 20:44:09 +02:00
Cédric Le Goater
52ba4d509d ipmi: add ACPI power and GUID commands
>From the specs (20.8 Get Device GUID Command), the command needs to
return a GUID (Globally Unique ID), or UUID, that should never change
over the lifetime of the device. qemu_uuid looked like a good
candidate to start with but we could use a specific BMC property also
if needed.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:09 +02:00
Cédric Le Goater
b708839223 ipmi: add GET_SYS_RESTART_CAUSE chassis command
This is a simulator. Just return an unknown cause (0).

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:09 +02:00
Cédric Le Goater
728710e1b0 ipmi: add get and set SENSOR_TYPE commands
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:09 +02:00
Cédric Le Goater
a2295f0a58 ipmi: introduce a struct ipmi_sdr_compact
Currently, sdr attributes are identified using byte offsets and this
can be a bit confusing.

This patch adds a struct ipmi_sdr_compact conforming to the IPMI specs
and replaces byte offsets with names. It also introduces and uses a
struct ipmi_sdr_header in sections of the code where no assumption is
made on the type of SDR. This leave rooms to potential usage of other
types in the future.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:09 +02:00
Cédric Le Goater
792afddb4a ipmi: fix SDR length value
The IPMI BMC simulator populates the SDR table with a set of initial
SDRs. The length of each SDR is taken from the record itself (byte 4)
which does not include the size of the header. But, the full length
(header + data) is required by the sdr_add_entry() routine.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:09 +02:00
Cédric Le Goater
7cfa06a2f1 ipmi: cleanup error_report messages
Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:09 +02:00
Cédric Le Goater
62a4931d1e ipmi: replace *_MAXCMD defines
ARRAY_SIZE() is simple to use and removes the need to pre-define
the size of the command arrays.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:08 +02:00
Cédric Le Goater
d13ada5d8f ipmi: replace goto by a return statement
Each routine using the IPMI_ADD_RSP_DATA, IPMI_CHECK_CMD_LEN or
IPMI_CHECK_RESERVATION macros needs to define a goto label 'out' to
handle hidden errors. Using directly a return statement has the same
effect and it removes the fact that 'out' needs to be defined.

The code exits in ipmi_sim_handle_command() are a little different
from the rest and a "possible" error in the macro IPMI_ADD_RSP_DATA is
handled before making use of it. This might be a bit excessive as a
minimum response len is currently 300 bytes and the patch checks that
at least 3 are available.

Signed-off-by: Cédric Le Goater <clg@fr.ibm.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Corey Minyard <cminyard@mvista.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:08 +02:00
Marcel Apfelbaum
0144f6f1ce hw/pci: ensure that only PCI/PCIe bridges can be attached to pxb/pxb-pcie devices
PCI devices can't be plugged directly into PCI extra root bridges
because their resources can't be computed by firmware before the ACPI
tables are loaded.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:08 +02:00
Paolo Bonzini
b5c6eaf173 vhost-user-test: use correct ROM to speed up and avoid spurious failures
The mechanism to get the option ROM for virtio-net does not block the
PCI ROM from being loaded. Therefore, in vhost-user-test there are
two entries in the boot menu for the virtio-net card: one as an
embedded option ROM, one from the ROM BAR.

The embedded option ROM in vhost-user-test is the non-EFI-enabled,
while the ROM BAR has an EFI-enabled ROM. The two are compiled with
slightly different parameters, where only the old BIOS-only one doesn't
have a timeout for the "Press Ctrl-B" banner. When using a new
machine type, therefore, the vhost-user-test has to wait for the
EFI-enabled ROM's banner to go away. There are several ways to fix
this:

1) fix the ROMs to have the same configuration

2) add ",romfile=" to the -device line

3) remove --option-rom and add the ROM file name to the -device line

4) use an old machine type

This patch chooses 3. In addition, the file name was wrong because
qtest runs QEMU relative to the top build directory, not to the
x86_64-softmmu/ subdirectory, which is fixed too.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:08 +02:00
Marcel Apfelbaum
13d11b0ba8 hw/pxb: add pxb devices to the bridge category
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:08 +02:00
Vincenzo Maffione
1cdd2ee54a virtio: combine write of an entry into used ring
Fill in an element of the used ring with a single combined access to the
guest physical memory, rather than using two separated accesses.
This reduces the overhead due to expensive address translation.

Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com>
Message-Id: <e4a89a767a4a92cbb6bcc551e151487eb36e1722.1450218353.git.v.maffione@gmail.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:08 +02:00
Vincenzo Maffione
be1fea9bc2 virtio: read avail_idx from VQ only when necessary
The virtqueue_pop() implementation needs to check if the avail ring
contains some pending buffers. To perform this check, it is not
always necessary to fetch the avail_idx in the VQ memory, which is
expensive. This patch introduces a shadow variable tracking avail_idx
and modifies virtio_queue_empty() to access avail_idx in physical
memory only when necessary.

Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com>
Message-Id: <b617d6459902773d9f4ab843bfaca764f5af8eda.1450218353.git.v.maffione@gmail.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:08 +02:00
Vincenzo Maffione
b796fcd1bf virtio: cache used_idx in a VirtQueue field
Accessing used_idx in the VQ requires an expensive access to
guest physical memory. Before this patch, 3 accesses are normally
done for each pop/push/notify call. However, since the used_idx is
only written by us, we can track it in our internal data structure.

Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com>
Message-Id: <3d062ec54e9a7bf9fb325c1fd693564951f2b319.1450218353.git.v.maffione@gmail.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:08 +02:00
Paolo Bonzini
aa570d6fb6 virtio: combine the read of a descriptor
Compared to vring, virtio has a performance penalty of 10%.  Fix it
by combining all the reads for a descriptor in a single address_space_read
call.  This also simplifies the code nicely.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:08 +02:00
Paolo Bonzini
5dba97ebdc vring: slim down allocation of VirtQueueElements
Build the addresses and s/g lists on the stack, and then copy them
to a VirtQueueElement that is just as big as required to contain this
particular s/g list.  The cost of the copy is minimal compared to that
of a large malloc.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:08 +02:00
Paolo Bonzini
3b3b062821 virtio: slim down allocation of VirtQueueElements
Build the addresses and s/g lists on the stack, and then copy them
to a VirtQueueElement that is just as big as required to contain this
particular s/g list.  The cost of the copy is minimal compared to that
of a large malloc.

When virtqueue_map is used on the destination side of migration or on
loadvm, the iovecs have already been split at memory region boundary,
so we can just reuse the out_num/in_num we find in the file.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:08 +02:00
Paolo Bonzini
3724650db0 virtio: introduce virtqueue_alloc_element
Allocate the arrays for in_addr/out_addr/in_sg/out_sg outside the
VirtQueueElement.  For now, virtqueue_pop and vring_pop keep
allocating a very large VirtQueueElement.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:08 +02:00
Paolo Bonzini
ab281c1781 virtio: introduce qemu_get/put_virtqueue_element
Move allocation to virtio functions also when loading/saving a
VirtQueueElement.  This will also let the load/save functions
keep backwards compatibility when the VirtQueueElement layout
is changed.

Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-06 20:44:08 +02:00
Paolo Bonzini
51b19ebe43 virtio: move allocation to virtqueue_pop/vring_pop
The return code of virtqueue_pop/vring_pop is unused except to check for
errors or 0.  We can thus easily move allocation inside the functions
and just return a pointer to the VirtQueueElement.

The advantage is that we will be able to allocate only the space that
is needed for the actual size of the s/g list instead of the full
VIRTQUEUE_MAX_SIZE items.  Currently VirtQueueElement takes about 48K
of memory, and this kind of allocation puts a lot of stress on malloc.
By cutting the size by two or three orders of magnitude, malloc can
use much more efficient algorithms.

The patch is pretty large, but changes to each device are testable
more or less independently.  Splitting it would mostly add churn.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-02-06 20:39:07 +02:00
Alex Bennée
692d162cb2 .travis.yml: migrate to container builds
This moves the Travis tests from the legacy VM infrastructure (which
only seems to run 5-6 jobs at once) to the new container based approach.

The principle difference is there is no sudo in the containers so all
packages are installed using the apt add-on. This means one of the build
combinations can be dropped as it was only for checking the build with
additional packages.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
2016-02-05 16:45:56 +00:00
Peter Maydell
ee8e8f92a7 Merge remote-tracking branch 'remotes/amit-migration/tags/migration-for-2.6-2' into staging
Migration pull req.

Small fixes, nothing major.

# gpg: Signature made Fri 05 Feb 2016 13:51:30 GMT using RSA key ID 854083B6
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg:                 aka "Amit Shah <amit@kernel.org>"
# gpg:                 aka "Amit Shah <amitshah@gmx.net>"

* remotes/amit-migration/tags/migration-for-2.6-2:
  migration: fix bad string passed to error_report()
  static checker: e1000-82540em got aliased to e1000
  migration: remove useless code.
  qmp-commands.hx: Document the missing options for migration capability commands
  qmp-commands.hx: Fix the missing options for migration parameters commands
  migration/ram: Fix some helper functions' parameter to use PageSearchStatus
  savevm: Split load vm state function qemu_loadvm_state
  migration: rename 'file' in MigrationState to 'to_dst_file'
  ram: Split host_from_stream_offset() into two helper functions

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-05 14:20:46 +00:00
Greg Kurz
15d61692da migration: fix bad string passed to error_report()
state->name does not contain a terminating '\0' and you may get:

Machine type received is 'pseries-2.3y�?' and local is 'pseries-2.4'
load of migration failed: Invalid argument

Let's add a precision modifier to fix this.

Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Message-Id: <20160205083201.2201.76109.stgit@bahia.huguette.org>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-05 19:09:51 +05:30
Amit Shah
1483e0d74d static checker: e1000-82540em got aliased to e1000
Commit 8304402033 changed the name of the
e1000-82540em device to e1000.  This was flagged:

   Section "e1000-82540em" does not exist in dest

Add the mapping to the changed section names dictionary so the checker
can proceed.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-Id: <7ccfe834c897142dceaa4da87c13b7059fa12aa8.1450416947.git.amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-05 19:09:50 +05:30
Liang Li
b33dc45c3f migration: remove useless code.
Since 's->state' will be set in migrate_init(), there is no
need to set it before calling migrate_init(). The code and
the related comments can be removed.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1453875065-24326-1-git-send-email-liang.z.li@intel.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-05 19:09:50 +05:30
zhanghailiang
164f59e86e qmp-commands.hx: Document the missing options for migration capability commands
Add the missing descriptions for the options of migration capability commands,
and fix the example for query-migrate-capabilities command.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1452829066-9764-7-git-send-email-zhang.zhanghailiang@huawei.com>
[Amit: Strip whitespace]
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-05 19:09:50 +05:30
zhanghailiang
9c994a976f qmp-commands.hx: Fix the missing options for migration parameters commands
We didn't document x-cpu-throttle-initial/x-cpu-throttle-increment for
commands migrate-set-parameters and query-migrate-parameters.

Here we add the descriptions for these two options and fix the wrong example
for query-migrate-parameters qmp commands.
Besides, this will also fix the bug that we can't set x-cpu-throttle-initial
and x-cpu-throttle-increment through migrate-set-parameters qmp command.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1452829066-9764-6-git-send-email-zhang.zhanghailiang@huawei.com>
[Amit: fix typo in 'auto-converge']
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-05 19:09:50 +05:30
zhanghailiang
a08f689034 migration/ram: Fix some helper functions' parameter to use PageSearchStatus
Some helper functions use parameters 'RAMBlock *block' and 'ram_addr_t *offset',
We can use 'PageSearchStatus *pss' directly instead, with this change, we
can reduce the number of parameters for these helper function, also
it is easily to add new parameters for these helper functions.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1452829066-9764-5-git-send-email-zhang.zhanghailiang@huawei.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-05 19:09:50 +05:30
zhanghailiang
fb3520a84e savevm: Split load vm state function qemu_loadvm_state
qemu_loadvm_state is too long, and we can simplify it by splitting up
with three helper functions.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1452829066-9764-4-git-send-email-zhang.zhanghailiang@huawei.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-05 19:09:50 +05:30
zhanghailiang
89a02a9f7b migration: rename 'file' in MigrationState to 'to_dst_file'
Rename the 'file' member of MigrationState to 'to_dst_file' to
be consistent with to_src_file, from_src_file and from_dst_file.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1452829066-9764-3-git-send-email-zhang.zhanghailiang@huawei.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-05 19:09:50 +05:30
zhanghailiang
4c4bad4861 ram: Split host_from_stream_offset() into two helper functions
Split host_from_stream_offset() into two parts:
One is to get ram block, which the block idstr may be get from migration
stream, the other is to get hva (host) address from block and the offset.
Besides, we will do the check working in a new helper offset_in_ramblock().

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1452829066-9764-2-git-send-email-zhang.zhanghailiang@huawei.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-02-05 19:09:50 +05:30
Paolo Bonzini
6aa46d8ff1 virtio: move VirtQueueElement at the beginning of the structs
The next patch will make virtqueue_pop/vring_pop allocate memory for
the VirtQueueElement. In some cases (blk, scsi, gpu) the device wants
to extend VirtQueueElement with device-specific fields and, until now,
the place of the VirtQueueElement within the containing struct didn't
matter. When allocating the entire block in virtqueue_pop/vring_pop,
however, the containing struct must basically be a "subclass" of
VirtQueueElement, with the VirtQueueElement as the first field. Make
that the case for blk and scsi; gpu is already doing it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-02-04 19:53:02 +02:00
Igor Mammedov
0734fb083c tests: pc: acpi: add expected DSDT.bridge blobs and update DSDT blobs
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-04 19:53:02 +02:00
Igor Mammedov
caf50c7166 tests: pc: acpi: drop not needed 'expected SSDT' blobs
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-04 19:53:02 +02:00
Igor Mammedov
41fa5c0410 pc: acpi: merge SSDT into DSDT
Since both tables are built dynamically now,
there is no point in keeping ASL in them in separate
tables.
So do the same as we do for ARM where we have only
DSDT table, i.e. move SSDT ASL into DSDT and
drop SSDT altogether.
This patch doesn't change moved SSDT ASL in any way,
but it opens a way to relatively independently simplify
generated ASL on per device/subsystem basis in
followup series.
It also simplifies bios-tables-test where expected
SSDT blobs could be dropped and only DSDT ones
have to be maintained.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-02-04 19:53:02 +02:00
Dr. David Alan Gilbert
3e996cc583 Fix virtio migration
I misunderstood the vmstate macro definition when I reworked the
virtio .get/.put.
The VMSTATE_STRUCT_VARRAY_KNOWN, was described as being for "a
variable length array (i.e. _type *_field) but we know the
length".  However it actually specified operation for arrays embedded in
the struct (i.e. _type _field[]) since it lacked the VMS_POINTER
flag. This caused offset calculation to be completely off, examining and
potentially sending random data instead of the VirtQueue content.

Replace the otherwise unused VMSTATE_STRUCT_VARRAY_KNOWN with a
VMSTATE_STRUCT_VARRAY_POINTER_KNOWN that includes the VMS_POINTER flag
(so now actually doing what it advertises) and use it in the virtio
migration code.

Fixes and description as per Sascha's suggestions/debug.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reported-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Tested-By: Sascha Silbe <silbe@linux.vnet.ibm.com>
Reviewed-By: Sascha Silbe <silbe@linux.vnet.ibm.com>

Fixes: 50e5ae4dc3
Fixes: 2cf0148674
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-02-04 19:53:02 +02:00
Peter Maydell
d38ea87ac5 all: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454089805-5470-16-git-send-email-peter.maydell@linaro.org
2016-02-04 17:41:30 +00:00
Peter Maydell
ccd241b5a2 contrib: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454089805-5470-15-git-send-email-peter.maydell@linaro.org
2016-02-04 17:41:30 +00:00
Peter Maydell
cae9fc567d io: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454089805-5470-14-git-send-email-peter.maydell@linaro.org
2016-02-04 17:41:30 +00:00
Peter Maydell
9bbc853bd4 qom: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454089805-5470-13-git-send-email-peter.maydell@linaro.org
2016-02-04 17:41:30 +00:00
Peter Maydell
f2ad72b30e qobject: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1454089805-5470-12-git-send-email-peter.maydell@linaro.org
2016-02-04 17:41:30 +00:00
Peter Maydell
2744d9207f net: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454089805-5470-11-git-send-email-peter.maydell@linaro.org
2016-02-04 17:41:30 +00:00
Peter Maydell
7df7482bf6 slirp: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454089805-5470-10-git-send-email-peter.maydell@linaro.org
2016-02-04 17:41:30 +00:00
Peter Maydell
4459bf3866 qga: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454089805-5470-9-git-send-email-peter.maydell@linaro.org
2016-02-04 17:41:30 +00:00
Peter Maydell
cbf2115190 qapi: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1454089805-5470-8-git-send-email-peter.maydell@linaro.org
2016-02-04 17:41:30 +00:00
Peter Maydell
48d4ab25e7 disas: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454089805-5470-7-git-send-email-peter.maydell@linaro.org
2016-02-04 17:41:30 +00:00
Peter Maydell
aafd758410 util: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454089805-5470-6-git-send-email-peter.maydell@linaro.org
2016-02-04 17:01:04 +00:00
Peter Maydell
9c058332f3 backends: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454089805-5470-5-git-send-email-peter.maydell@linaro.org
2016-02-04 17:01:04 +00:00
Peter Maydell
2231197c87 bsd-user: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454089805-5470-4-git-send-email-peter.maydell@linaro.org
2016-02-04 17:01:04 +00:00
Peter Maydell
87c9b5e047 stubs: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454089805-5470-3-git-send-email-peter.maydell@linaro.org
2016-02-04 17:01:04 +00:00
Peter Maydell
e16f4c8770 ui: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1454089805-5470-2-git-send-email-peter.maydell@linaro.org
2016-02-04 17:01:04 +00:00
Peter Maydell
5a3be00c9a Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-signed' into staging
Update OpenBIOS images

# gpg: Signature made Thu 04 Feb 2016 11:18:01 GMT using RSA key ID AE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"

* remotes/mcayland/tags/qemu-openbios-signed:
  Update OpenBIOS images

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-04 16:16:00 +00:00
Peter Maydell
bac8e20367 Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Thu 04 Feb 2016 08:26:24 GMT using RSA key ID 398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  net/filter: Fix the output information for command 'info network'
  net: always walk through filters in reverse if traffic is egress
  net: netmap: use nm_open() to open netmap ports
  e1000: eliminate infinite loops on out-of-bounds transfer start
  slirp: Adding family argument to tcp_fconnect()
  slirp: Make udp_attach IPv6 compatible
  slirp: Add sockaddr_equal, make solookup family-agnostic
  slirp: Factorizing and cleaning solookup()
  slirp: Factorizing address translation
  slirp: Make Socket structure IPv6 compatible
  slirp: Adding address family switch for produced frames
  slirp: Generalizing and neutralizing ARP code
  slirp: goto bad in udp_input if sosendto fails
  cadence_gem: fix buffer overflow
  net: cadence_gem: check packet size in gem_recieve
  qemu-doc: Do not promote deprecated -smb and -redir options
  net/slirp: Tell the users when they are using deprecated options

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-04 14:17:11 +00:00
Peter Maydell
ae533a46a1 Merge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into staging
# gpg: Signature made Wed 03 Feb 2016 20:29:54 GMT using RSA key ID AAFC390E
# gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>"

* remotes/jnsnow/tags/ide-pull-request:
  dma: remove now useless DMA_* functions
  sb16: use IsaDma interface instead of global DMA_* functions
  gus: use IsaDma interface instead of global DMA_* functions
  cs4231a: use IsaDma interface instead of global DMA_* functions
  fdc: use IsaDma interface instead of global DMA_* functions
  sparc64: disable floppy DMA
  sparc: disable floppy DMA
  magnum: disable floppy DMA for now
  i8257: implement the IsaDma interface
  isa: add an ISA DMA interface, and store it within the ISA bus
  i8257: move state definition to new independent header
  i8257: QOM'ify
  i8257: add missing const
  i8257: make the DMA running method per controller
  i8257: rename functions to start with i8257_ prefix
  i8257: rename struct dma_regs to I8257Regs
  i8257: rename struct dma_cont to I8257State
  i8257: pass ISA bus to DMA_init() function
  i82374: device only existed as ISA device, so simplify device
  fdc: fix detection under Linux

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-04 12:50:43 +00:00
Mark Cave-Ayland
44c44eceea Update OpenBIOS images
Update OpenBIOS images to SVN r1378 built from submodule.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2016-02-04 11:17:44 +00:00
Peter Maydell
071aacc9c9 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20160203' into staging
target-arm queue:
 * virt-acpi-build: add always-on property for timer
 * various fixes for EL2 and EL3 behaviour
 * arm: virt-acpi: each MADT.GICC entry as enabled unconditionally
 * target-arm: Don't report presence of EL2 if it doesn't exist
 * raspi: add raspberry pi 2 machine

# gpg: Signature made Wed 03 Feb 2016 18:58:02 GMT using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"

* remotes/pmaydell/tags/pull-target-arm-20160203:
  raspi: add raspberry pi 2 machine
  arm/boot: move highbank secure board setup code to common routine
  bcm2836: add bcm2836 SoC device
  bcm2836_control: add bcm2836 ARM control logic
  bcm2835_peripherals: add rollup device for bcm2835 peripherals
  bcm2835_ic: add bcm2835 interrupt controller
  bcm2835_property: add bcm2835 property channel
  bcm2835_mbox: add BCM2835 mailboxes
  target-arm: Don't report presence of EL2 if it doesn't exist
  libvixl: Avoid std::abs() of 64-bit type
  arm: virt-acpi: each MADT.GICC entry as enabled unconditionally
  target-arm: Implement the S2 MMU inputsize > pamax check
  target-arm: Rename check_s2_startlevel to check_s2_mmu_setup
  target-arm: Apply S2 MMU startlevel table size check to AArch64
  hw/arm: Setup EL1 and EL2 in AArch64 mode for 64bit Linux boots
  target-arm: Make various system registers visible to EL3
  virt-acpi-build: add always-on property for timer

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-04 11:06:35 +00:00
zhanghailiang
aa9156f4b1 net/filter: Fix the output information for command 'info network'
The properties of netfilter object could be changed by 'qom-set'
command, but the output of 'info network' command is not updated,
because it got the old information through nf->info_str, it will
not be updated while we change the value of netfilter's property.

Here we split a helper function that could collect the output
information for filter, and also remove the useless member
'info_str' from struct NetFilterState.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Yang Hongyang <hongyang.yang@easystack.cn>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-02-04 14:13:11 +08:00
Li Zhijian
25aaadf063 net: always walk through filters in reverse if traffic is egress
Previously, if we attach more than one filters for a single netdev,
both ingress and egress traffic will go through net filters in same
order like:

ingress: netdev ->filter1 ->filter2 ->...filter[n] ->emulated device
egress: emulated device ->filter1 ->filter2 ->...filter[n] ->netdev.

This is against the natural feeling and will complicate filters
configuration since in some scenes, we hope filters handle the egress
traffic in a reverse order. For example, in colo-proxy (will be
implemented later), we have a redirector filter and a colo-rewriter
filter, we need the filter behave like:

ingress(->)/egress(<-): chardev<->redirector<->colo-rewriter<->emulated device

Since both buffer filter and dump do not require strict order of
filters, this patch switches to always let egress traffic walk through
net filters in reverse to simplify the possible filters configuration
in the future.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Yang Hongyang <hongyang.yang@easystack.cn>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-02-04 14:13:11 +08:00
Vincenzo Maffione
ab685220f6 net: netmap: use nm_open() to open netmap ports
This patch simplifies the netmap backend code by means of the nm_open()
helper function provided by netmap_user.h, which hides the details of
open(), iotcl() and mmap() carried out on the netmap device.

Moreover, the semantic of nm_open() makes it possible to open special
netmap ports (e.g. pipes, monitors) and use special modes (e.g. host rings
only, single queue mode, exclusive access).

Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-02-04 14:13:11 +08:00
Laszlo Ersek
dd793a7488 e1000: eliminate infinite loops on out-of-bounds transfer start
The start_xmit() and e1000_receive_iov() functions implement DMA transfers
iterating over a set of descriptors that the guest's e1000 driver
prepares:

- the TDLEN and RDLEN registers store the total size of the descriptor
  area,

- while the TDH and RDH registers store the offset (in whole tx / rx
  descriptors) into the area where the transfer is supposed to start.

Each time a descriptor is processed, the TDH and RDH register is bumped
(as appropriate for the transfer direction).

QEMU already contains logic to deal with bogus transfers submitted by the
guest:

- Normally, the transmit case wants to increase TDH from its initial value
  to TDT. (TDT is allowed to be numerically smaller than the initial TDH
  value; wrapping at or above TDLEN bytes to zero is normal.) The failsafe
  that QEMU currently has here is a check against reaching the original
  TDH value again -- a complete wraparound, which should never happen.

- In the receive case RDH is increased from its initial value until
  "total_size" bytes have been received; preferably in a single step, or
  in "s->rxbuf_size" byte steps, if the latter is smaller. However, null
  RX descriptors are skipped without receiving data, while RDH is
  incremented just the same. QEMU tries to prevent an infinite loop
  (processing only null RX descriptors) by detecting whether RDH assumes
  its original value during the loop. (Again, wrapping from RDLEN to 0 is
  normal.)

What both directions miss is that the guest could program TDLEN and RDLEN
so low, and the initial TDH and RDH so high, that these registers will
immediately be truncated to zero, and then never reassume their initial
values in the loop -- a full wraparound will never occur.

The condition that expresses this is:

  xdh_start >= s->mac_reg[XDLEN] / sizeof(desc)

i.e., TDH or RDH start out after the last whole rx or tx descriptor that
fits into the TDLEN or RDLEN sized area.

This condition could be checked before we enter the loops, but
pci_dma_read() / pci_dma_write() knows how to fill in buffers safely for
bogus DMA addresses, so we just extend the existing failsafes with the
above condition.

This is CVE-2016-1981.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Prasad Pandit <ppandit@redhat.com>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: qemu-stable@nongnu.org
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1296044
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-02-04 14:13:11 +08:00
Guillaume Subiron
cc573a6924 slirp: Adding family argument to tcp_fconnect()
This patch simply adds a unsigned short family argument to remove the hardcoded
"AF_INET" in the call of qemu_socket().

This prepares for IPv6 support.

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-02-04 14:13:11 +08:00
Guillaume Subiron
9b5a30dc41 slirp: Make udp_attach IPv6 compatible
A unsigned short is now passed in argument to udp_attach instead of using a
hardcoded "AF_INET" to call qemu_socket().

This prepares for IPv6 support.

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-02-04 14:13:11 +08:00
Guillaume Subiron
8a87f121ca slirp: Add sockaddr_equal, make solookup family-agnostic
This patch makes solookup() compatible with varying address
families, by using a new sockaddr_equal() function that compares
two sockaddr_storage.

This prepares for IPv6 support.

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-02-04 13:22:06 +08:00
Guillaume Subiron
a5fd24aa6d slirp: Factorizing and cleaning solookup()
solookup() was only compatible with TCP. Having the socket list in
argument, it is now compatible with UDP too.

Some optimization code is factorized inside the function (the function
look at the last returned result before browsing the complete socket
list).

This prepares for IPv6 support.

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-02-04 13:22:06 +08:00
Guillaume Subiron
5379229a27 slirp: Factorizing address translation
This patch factorizes some duplicate code into a new function,
sotranslate_out(). This function perform the address translation when a
packet is transmitted to the host network. If the packet is destinated
to the host, the loopback address is used, and if the packet is
destinated to the virtual DNS, the real DNS address is used. This code
is just a copy of the existent, but factorized and ready to manage the
IPv6 case.

On the same model, the major part of udp_output() code is moved into a
new sotranslate_in(). This function is directly used in sorecvfrom(),
like sotranslate_out() in sosendto().
udp_output() becoming useless, it is removed and udp_output2() is
renamed into udp_output(). This adds consistency with the udp6_output()
function introduced by further patches.

Lastly, this factorizes some duplicate code into sotranslate_accept(), which
performs the address translation when a connection is established on the host
for port forwarding: if it comes from localhost, the host virtual address is
used instead.

This prepares for IPv6 support.

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-02-04 13:22:06 +08:00
Guillaume Subiron
eae303ff23 slirp: Make Socket structure IPv6 compatible
This patch replaces foreign and local address/port couples in Socket
structure by 2 sockaddr_storage which can be casted in sockaddr_in.
Direct access to address and port is still possible thanks to some
\#define, so retrocompatibility of the existing code is assured.

The ss_family field of sockaddr_storage is declared after each socket
creation.

The whole structure is also saved/restored when a Qemu session is
saved/restored.

This prepares for IPv6 support.

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-02-04 13:22:06 +08:00
Guillaume Subiron
18137fba35 slirp: Adding address family switch for produced frames
In if_encap, a switch is added to prepare for the IPv6 case. Some code
is factorized.

This prepares for IPv6 support.

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2016-02-04 13:22:06 +08:00
Guillaume Subiron
fc3779a118 slirp: Generalizing and neutralizing ARP code
Basically, this patch replaces "arp" by "resolution" every time "arp"
means "mac resolution" and not specifically ARP.

This prepares for IPv6 support.

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-02-04 13:22:06 +08:00
Guillaume Subiron
86c9e1e9d7 slirp: goto bad in udp_input if sosendto fails
Before this patch, if sosendto fails, udp_input is executed as if the
packet was sent, recording the packet for icmp errors, which does not
makes sense since the packet was not actually sent, errors would be
related to a previous packet.

This patch adds a goto bad to cut the execution of this function.

Signed-off-by: Guillaume Subiron <maethor@subiron.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-02-04 13:22:06 +08:00
Michael S. Tsirkin
d7f053652f cadence_gem: fix buffer overflow
gem_transmit copies a packet from guest into an tx_packet[2048]
array on stack, with size limited by descriptor length set by guest.  If
guest is malicious and specifies a descriptor length that is too large,
and should packet size exceed array size, this results in a buffer
overflow.

Reported-by: 刘令 <liuling-it@360.cn>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-02-04 13:22:06 +08:00
Prasad J Pandit
244381ec19 net: cadence_gem: check packet size in gem_recieve
While receiving packets in 'gem_receive' routine, if Frame Check
Sequence(FCS) is enabled, it copies the packet into a local
buffer without checking its size. Add check to validate packet
length against the buffer size to avoid buffer overflow.

Reported-by: Ling Liu <liuling-it@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-02-04 13:22:06 +08:00
Thomas Huth
c8c6afa886 qemu-doc: Do not promote deprecated -smb and -redir options
Since -smb and -redir are deprecated options, we should not
use them as examples in the documentation anymore.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-02-04 13:22:06 +08:00
Thomas Huth
f853ac66c7 net/slirp: Tell the users when they are using deprecated options
We don't want to support the legacy -tftp, -bootp, -smb and
-net channel options forever. So let's start telling the users
that they are deprecated and what option should be used instead.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-02-04 13:22:06 +08:00
Peter Maydell
382d34ff9f Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Wed 03 Feb 2016 15:47:34 GMT using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/tracing-pull-request:
  log: add "-d trace:PATTERN"
  trace: switch default backend to "log"
  trace: convert stderr backend to log
  log: move qemu-log.c into util/ directory
  log: do not unnecessarily include qom/cpu.h
  trace: add "-trace help"
  trace: add "-trace enable=..."
  trace: no need to call trace_backend_init in different branches now
  trace: split trace_init_file out of trace_init_backends
  trace: split trace_init_events out of trace_init_backends
  trace: fix documentation
  trace: track enabled events in a separate array
  trace: count number of enabled events

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-03 19:00:33 +00:00
Hervé Poussineau
ba0a71022c dma: remove now useless DMA_* functions
Keep only DMA_init function as a wrapper around DMA controllers creation.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-20-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
2016-02-03 11:28:58 -05:00
Hervé Poussineau
f203c16ea2 sb16: use IsaDma interface instead of global DMA_* functions
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-19-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
2016-02-03 11:28:58 -05:00
Hervé Poussineau
467be5f2f0 gus: use IsaDma interface instead of global DMA_* functions
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-18-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
2016-02-03 11:28:58 -05:00
Hervé Poussineau
2d01109133 cs4231a: use IsaDma interface instead of global DMA_* functions
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-17-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
2016-02-03 11:28:58 -05:00
Hervé Poussineau
c8a35f1cf0 fdc: use IsaDma interface instead of global DMA_* functions
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-16-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
2016-02-03 11:28:58 -05:00
Hervé Poussineau
c3ae40e12c sparc64: disable floppy DMA
All functions relative to DMA (DMA_*() functions) are stubs on sparc64 platform.
Disable the DMA of the floppy controller, instead of calling these stubs.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-15-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
2016-02-03 11:28:57 -05:00
Hervé Poussineau
dd446051b7 sparc: disable floppy DMA
All functions relative to DMA (DMA_*() functions) are stubs on sparc platform.
Disable the DMA in the floppy controller, instead of calling these stubs.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-14-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
2016-02-03 11:28:57 -05:00
Hervé Poussineau
020e298699 magnum: disable floppy DMA for now
Floppy uses the DMA controller in rc4030 chipset, and not the i8259 from the ISA bus.
It's better to disable DMA than to call the wrong DMA controller.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-13-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
2016-02-03 11:28:57 -05:00
Hervé Poussineau
16ffe36360 i8257: implement the IsaDma interface
Rewrite the global DMA_*() functions to use the IsaDma interface.
Note that these functions will be deleted in a few commits.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-12-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
2016-02-03 11:28:57 -05:00
Hervé Poussineau
5484f30b2c isa: add an ISA DMA interface, and store it within the ISA bus
This will permit to deprecate global DMA_*() functions.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-11-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
2016-02-03 11:28:57 -05:00
Hervé Poussineau
f5f19ee2e4 i8257: move state definition to new independent header
We will now be able to embed the i8257 interrupt controller in another object.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-10-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
2016-02-03 11:28:56 -05:00
Hervé Poussineau
340e19ebf2 i8257: QOM'ify
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-9-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
2016-02-03 11:28:56 -05:00
Hervé Poussineau
8d3c4c81f3 i8257: add missing const
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-8-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
2016-02-03 11:28:56 -05:00
Hervé Poussineau
b9ebd28c62 i8257: make the DMA running method per controller
This removes some static/global variables, and we're now running only the
required controller (master or slave)

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-7-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
2016-02-03 11:28:56 -05:00
Hervé Poussineau
74c47de010 i8257: rename functions to start with i8257_ prefix
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-6-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
2016-02-03 11:28:56 -05:00
Hervé Poussineau
0eee6d6262 i8257: rename struct dma_regs to I8257Regs
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-5-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
2016-02-03 11:28:55 -05:00
Hervé Poussineau
6a128b1330 i8257: rename struct dma_cont to I8257State
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-4-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
2016-02-03 11:28:55 -05:00
Hervé Poussineau
5714694192 i8257: pass ISA bus to DMA_init() function
i8257 DMA controller exists on one ISA bus, so let's specify it at initialization.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-3-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
2016-02-03 11:28:55 -05:00
Hervé Poussineau
449ae7eca9 i82374: device only existed as ISA device, so simplify device
Merge ISAi82374State fields into parent structure I82374State.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453843944-26833-2-git-send-email-hpoussin@reactos.org
Signed-off-by: John Snow <jsnow@redhat.com>
2016-02-03 11:28:55 -05:00
John Snow
fd9bdbd345 fdc: fix detection under Linux
Accidentally, I removed a "feature" where empty drives had geometry
values applied to them, which allows seek on empty drives to work
"by accident," as QEMU actually tries to disallow that.

Seeks on empty drives should work, though, but the easiest thing is to
restore the misfeature where empty drives have non-zero geometries
applied.

Document the hack accordingly.

[Maintainer edit]

This fix corrects a regression introduced in d5d47efc, where
pick_geometry was modified such that it would not operate on empty
drives, and as a result if there is no diskette inserted, QEMU
no longer populates it with geometry bounds. As a result, seek fails
when QEMU denies to move the current track, but reports success anyway.
This can confuse the guest, leading to kernel panics in the guest.


Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1454106932-17236-1-git-send-email-jsnow@redhat.com
2016-02-03 11:28:55 -05:00
Andrew Baumann
1df7d1f930 raspi: add raspberry pi 2 machine
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-03 15:00:47 +00:00
Andrew Baumann
716536a9b6 arm/boot: move highbank secure board setup code to common routine
The new version is slightly different, to support Rasbperry Pi (in
particular, Pi1's arm11 core which doesn't support v7 instructions
such as MOVW).

Tested-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-03 15:00:46 +00:00
Andrew Baumann
bad5623690 bcm2836: add bcm2836 SoC device
This is the SoC for Raspberry Pi 2.

Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-03 15:00:46 +00:00
Andrew Baumann
cc28296d82 bcm2836_control: add bcm2836 ARM control logic
This module is specific to the bcm2836 (Pi2). It implements the top
level interrupt controller, and mailboxes used for inter-processor
synchronisation.

Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-03 15:00:45 +00:00
Andrew Baumann
7c62aeb82a bcm2835_peripherals: add rollup device for bcm2835 peripherals
This device maintains all the non-CPU peripherals on bcm2835 (Pi1)
which are also present on bcm2836 (Pi2). It also implements the
private address spaces used for DMA and mailboxes.

Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-03 15:00:45 +00:00
Andrew Baumann
e3ece3e34d bcm2835_ic: add bcm2835 interrupt controller
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-03 15:00:44 +00:00
Andrew Baumann
04f1ab15b9 bcm2835_property: add bcm2835 property channel
This sits behind the mailbox interface, and implements
request/response queries for system properties. The
framebuffer-related properties will be added in a later patch.

Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-03 15:00:44 +00:00
Andrew Baumann
99494e696e bcm2835_mbox: add BCM2835 mailboxes
This adds the system mailboxes which are used to communicate with a
number of GPU peripherals on Pi/Pi2.

Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-03 14:56:32 +00:00
Peter Maydell
3c2f7bb32b target-arm: Don't report presence of EL2 if it doesn't exist
We already modify the processor feature bits to not report EL3
support to the guest if EL3 isn't enabled for the CPU we're emulating.
Add similar support for not reporting EL2 unless it is enabled.
This is necessary because real world guest code running at EL3
(trusted firmware or bootloaders) will query the ID registers to
determine whether it should start a guest Linux kernel in EL2 or EL3.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1454437242-10262-1-git-send-email-peter.maydell@linaro.org
2016-02-03 13:54:41 +00:00
Peter Maydell
0602f420e4 libvixl: Avoid std::abs() of 64-bit type
The std::abs() function did not get a version that works on
'long long' until C++11. Avoid it, so that we can compile on
32-bit platforms (where int64_t is 'long long') with older
compilers (which don't support C++11).

Reported-by: Franz-Josef Haider <Franz-Josef.Haider@student.uibk.ac.at>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1453739429-31477-1-git-send-email-peter.maydell@linaro.org
2016-02-03 13:46:34 +00:00
Igor Mammedov
6d152ebaf4 arm: virt-acpi: each MADT.GICC entry as enabled unconditionally
in current impl. condition

build_madt() {
  ...
  if (test_bit(i, cpuinfo->found_cpus))

is always true since loop handles only present CPUs
in range [0..smp_cpus).
But to fill usless cpuinfo->found_cpus we do unnecessary
scan over QOM tree to find the same CPUs.
So mark GICC as present always and drop not needed
code that fills cpuinfo->found_cpus.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1454323689-248759-1-git-send-email-imammedo@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-03 13:46:34 +00:00
Edgar E. Iglesias
3526423e86 target-arm: Implement the S2 MMU inputsize > pamax check
Implement the inputsize > pamax check for Stage 2 translations.
This is CONSTRAINED UNPREDICTABLE and we choose to fault.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1453932970-14576-4-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-03 13:46:33 +00:00
Edgar E. Iglesias
a0e966c93a target-arm: Rename check_s2_startlevel to check_s2_mmu_setup
Rename check_s2_startlevel to check_s2_mmu_setup in preparation
for additional checks.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1453932970-14576-3-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-03 13:46:33 +00:00
Edgar E. Iglesias
98d68ec289 target-arm: Apply S2 MMU startlevel table size check to AArch64
The S2 starting level table size check applies to both AArch32
and AArch64. Move it to common code.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1453932970-14576-2-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-03 13:46:33 +00:00
Edgar E. Iglesias
48d21a576a hw/arm: Setup EL1 and EL2 in AArch64 mode for 64bit Linux boots
When booting Linux on AArch64 enabled cores, setup EL1 and
EL2 to use AArch64.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-03 13:46:33 +00:00
Peter Maydell
6a43e0b6e1 target-arm: Make various system registers visible to EL3
The AArch64 system registers DACR32_EL2, IFSR32_EL2, SPSR_IRQ,
SPSR_ABT, SPSR_UND and SPSR_FIQ are visible and fully functional from
EL3 even if the CPU has no EL2 (unlike some others which are RES0
from EL3 in that configuration).  Move them from el2_cp_reginfo[] to
v8_cp_reginfo[] so they are always present.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Message-id: 1453227802-9991-1-git-send-email-peter.maydell@linaro.org
2016-02-03 13:46:33 +00:00
Andrew Jones
a43e68a08b virt-acpi-build: add always-on property for timer
This patch is the ACPI equivalent of "hw/arm/virt: Add always-on
property to the virt board timer". The timer is always on, and
thus setting this informs Linux that it may switch off the periodic
timer. Switching off the periodic timer substantially reduces the
number of interrupts the host needs to inject.

Testing note: AArch64 guests (the only ones currently booting with
ACPI) do not actually need this patch to determine it can turn the
periodic timer off. I therefore used a hacked guest kernel to ensure
this patch works as the equivalent DT patch does.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1453380893-26174-1-git-send-email-drjones@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-03 13:46:32 +00:00
Peter Maydell
87574621b1 Merge remote-tracking branch 'remotes/kraxel/tags/pull-vga-20160203-1' into staging
virtio-gpu: bugfixes and spice support preparation

# gpg: Signature made Wed 03 Feb 2016 09:47:13 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-vga-20160203-1:
  virtio-gpu: block any rendering until client (ui) is done
  virtio-gpu: add support to enable/disable command processing
  virtio-gpu: maintain command queue
  virtio-gpu: fix memory leak in error path
  console: block rendering until client is done
  zap qemu_egl_has_ext in include/ui/egl-helpers.h

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-03 12:23:48 +00:00
Peter Maydell
ad9e1dab20 Merge remote-tracking branch 'remotes/armbru/tags/pull-monitor-2016-02-03' into staging
Monitor patches for 2016-02-03

# gpg: Signature made Wed 03 Feb 2016 09:13:48 GMT using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"

* remotes/armbru/tags/pull-monitor-2016-02-03:
  hmp: fix sendkey out of bounds write (CVE-2015-8619)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-03 10:50:06 +00:00
Paolo Bonzini
c84ea00dc2 log: add "-d trace:PATTERN"
This is a bit easier to use than "-trace" if you are also enabling
other kinds of logging.  It is also more discoverable for experienced
QEMU users, and accessible from user-mode emulators.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1452174932-28657-12-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-03 10:37:50 +00:00
Paolo Bonzini
baf86d6b3c trace: switch default backend to "log"
This enables integration with other QEMU logging facilities.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1452174932-28657-11-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-03 10:37:50 +00:00
Paolo Bonzini
ed7f5f1d8d trace: convert stderr backend to log
[Also update .travis.yml --enable-trace-backends=stderr
--Stefan]

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1452174932-28657-10-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-03 10:37:10 +00:00
Gerd Hoffmann
321c9adba5 virtio-gpu: block any rendering until client (ui) is done
Wire up gl_block callback, so ui code can request to stop
virtio-gpu rendering.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-03 10:41:36 +01:00
Gerd Hoffmann
0c55a1cfd3 virtio-gpu: add support to enable/disable command processing
So we can stop rendering for a while in case we have to.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-02-03 10:41:36 +01:00
Gerd Hoffmann
3eb769fd1c virtio-gpu: maintain command queue
We'll go take out the commands we receive out of the virt queue and put
them into a linked list, to decouple virtio queue handling from actual
command processing.

Also move cmd processing to new virtio_gpu_handle_ctrl func, so we can
easily kick it from different places.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-03 10:41:36 +01:00
Gerd Hoffmann
8d94c1ca53 virtio-gpu: fix memory leak in error path
Found by Coverity Scan, buf not freed on error.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-02-03 10:41:36 +01:00
Gerd Hoffmann
bba19b88a6 console: block rendering until client is done
Allow gl user interfaces to block display device gl rendering.
The ui code might want to do that in case it takes a little
longer to bring things to screen, for example because we'll
hand over a dma-buf to another process (spice will do that).

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-02-03 10:41:36 +01:00
Gerd Hoffmann
cb9ab7caae zap qemu_egl_has_ext in include/ui/egl-helpers.h
Drop leftover prototype which sneaked in by mistake

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-02-03 10:41:36 +01:00
Denis V. Lunev
d890d50d18 log: move qemu-log.c into util/ directory
log will become common facility with tracepoints support in next step.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1452174932-28657-9-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-03 09:19:10 +00:00
Paolo Bonzini
508127e243 log: do not unnecessarily include qom/cpu.h
Split the bits that require it to exec/log.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1452174932-28657-8-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-03 09:19:10 +00:00
Paolo Bonzini
e9527dd399 trace: add "-trace help"
Print a list of trace points

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1452174932-28657-7-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-03 09:19:09 +00:00
Paolo Bonzini
10578a257d trace: add "-trace enable=..."
Allow enabling events without going through a file, for example:

   qemu-system-x86_64 -trace bdrv_aio_writev -trace bdrv_aio_readv

or with globbing too:

   qemu-system-x86_64 -trace 'bdrv_aio_*'

if an appropriate backend is enabled (simple, stderr, ftrace).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1452174932-28657-6-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-03 09:19:09 +00:00
Denis V. Lunev
f246b86672 trace: no need to call trace_backend_init in different branches now
original idea to split calling locations was to spawn tracing thread
in the final child process according to

    commit 8a745f2a92
    Author: Michael Mueller
    Date:   Mon Sep 23 16:36:54 2013 +0200

os_daemonize is now on top of both locations. Drop unneeded ifs.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1452174932-28657-5-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-03 09:19:09 +00:00
Paolo Bonzini
41fc57e44e trace: split trace_init_file out of trace_init_backends
This is cleaner, and improves error reporting with -daemonize.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1452174932-28657-4-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-03 09:19:09 +00:00
Paolo Bonzini
45bd0b41bd trace: split trace_init_events out of trace_init_backends
This is cleaner and has two advantages.  First, it improves error
reporting with -daemonize.  Second, multiple "-trace events" options
now cumulate.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1452174932-28657-3-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-03 09:19:09 +00:00
Paolo Bonzini
52449a314e trace: fix documentation
Mention the ftrace backend too.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1452174932-28657-2-git-send-email-den@openvz.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-03 09:19:09 +00:00
Paolo Bonzini
585ec7273e trace: track enabled events in a separate array
This is more cache friendly on the fast path, where we already have
the event id available.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-03 09:19:09 +00:00
Paolo Bonzini
43b48cfc3e trace: count number of enabled events
This lets trace_event_get_state_dynamic quickly return false.  Right
now there is hardly any benefit because there are also many assertions
and indirections, but the next patch will streamline all of this.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-02-03 09:19:08 +00:00
Wolfgang Bumiller
64ffbe04ea hmp: fix sendkey out of bounds write (CVE-2015-8619)
When processing 'sendkey' command, hmp_sendkey routine null
terminates the 'keyname_buf' array. This results in an OOB
write issue, if 'keyname_len' was to fall outside of
'keyname_buf' array.

Since the keyname's length is known the keyname_buf can be
removed altogether by adding a length parameter to
index_from_key() and using it for the error output as well.

Reported-by: Ling Liu <liuling-it@360.cn>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Message-Id: <20160113080958.GA18934@olga>
[Comparison with "<" dumbed down, test for junk after strtoul()
tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-02-03 10:13:06 +01:00
Peter Maydell
c65db7705b Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-for-peter-2016-02-02' into staging
Block patches

# gpg: Signature made Tue 02 Feb 2016 17:23:44 GMT using RSA key ID E838ACAD
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"

* remotes/maxreitz/tags/pull-block-for-peter-2016-02-02: (50 commits)
  block: qemu-iotests - add test for snapshot, commit, snapshot bug
  block: set device_list.tqe_prev to NULL on BDS removal
  iotests: Add "qemu-img map" test for VMDK extents
  qemu-img: Make MapEntry a QAPI struct
  qemu-img: In "map", use the returned "file" from bdrv_get_block_status
  block: Use returned *file in bdrv_co_get_block_status
  vmdk: Return extent's file in bdrv_get_block_status
  vmdk: Fix calculation of block status's offset
  vpc: Assign bs->file->bs to file in vpc_co_get_block_status
  vdi: Assign bs->file->bs to file in vdi_co_get_block_status
  sheepdog: Assign bs to file in sd_co_get_block_status
  qed: Assign bs->file->bs to file in bdrv_qed_co_get_block_status
  parallels: Assign bs->file->bs to file in parallels_co_get_block_status
  iscsi: Assign bs to file in iscsi_co_get_block_status
  raw: Assign bs to file in raw_co_get_block_status
  qcow2: Assign bs->file->bs to file in qcow2_co_get_block_status
  qcow: Assign bs->file->bs to file in qcow_co_get_block_status
  block: Add "file" output parameter to block status query functions
  block: acquire in bdrv_query_image_info
  iotests: Add test for block jobs and BDS ejection
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-02 18:04:04 +00:00
Jeff Cody
8983b670f6 block: qemu-iotests - add test for snapshot, commit, snapshot bug
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: 2dbc05efba2f683cb3aaf71aaa9b776ebf7ec57c.1454376655.git.jcody@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
[Moved test number from 143 to 144]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 18:07:27 +01:00
Jeff Cody
f8aa905a4f block: set device_list.tqe_prev to NULL on BDS removal
This fixes a regression introduced with commit 3f09bfbc7.  Multiple
bugs arise in conjunction with live snapshots and mirroring operations
(which include active layer commit).

After a live snapshot occurs, the active layer and the base layer both
have a non-NULL tqe_prev field in the device_list, although the base
node's tqe_prev field points to a NULL entry.  This non-NULL tqe_prev
field occurs after the bdrv_append() in the external snapshot calls
change_parent_backing_link().

In change_parent_backing_link(), when the previous active layer is
removed from device_list, the device_list.tqe_prev pointer is not
set to NULL.

The operating scheme in the block layer is to indicate that a BDS belongs
in the bdrv_states device_list iff the device_list.tqe_prev pointer
is non-NULL.

This patch does two things:

1.) Introduces a new block layer helper bdrv_device_remove() to remove a
    BDS from the device_list, and
2.) uses that new API, which also fixes the regression once used in
    change_parent_backing_link().

Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: 0cd51e11c0666c04ddb7c05293fe94afeb551e89.1454376655.git.jcody@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 18:04:47 +01:00
Peter Maydell
3bb1e822ca Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20160202-1' into staging
usb: two ehci fixes.

# gpg: Signature made Tue 02 Feb 2016 13:12:00 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-usb-20160202-1:
  ehci: update irq on reset
  usb: check page select value while processing iTD

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-02 17:01:56 +00:00
Fam Zheng
c7fc50d376 iotests: Add "qemu-img map" test for VMDK extents
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-17-git-send-email-famz@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:50:48 +01:00
Fam Zheng
16b0d55586 qemu-img: Make MapEntry a QAPI struct
The "flags" bit mask is expanded to two booleans, "data" and "zero";
"bs" is replaced with "filename" string.

Refactor the merge conditions in img_map() into entry_mergeable().

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-16-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:50:48 +01:00
Fam Zheng
9e43034008 qemu-img: In "map", use the returned "file" from bdrv_get_block_status
Now all drivers should return a correct "file", we can make use of it,
even with the recursion into backing chain above.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-15-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:50:47 +01:00
Fam Zheng
ac987b30d0 block: Use returned *file in bdrv_co_get_block_status
Now that all drivers return the right "file" pointer, we can use it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1453780743-16806-14-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:50:47 +01:00
Fam Zheng
e0f100f57c vmdk: Return extent's file in bdrv_get_block_status
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-13-git-send-email-famz@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:50:47 +01:00
Fam Zheng
d0a18f1025 vmdk: Fix calculation of block status's offset
"offset" is the offset of cluster and sector_num doesn't necessarily
refer to the start of it, it should add index_in_cluster.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-12-git-send-email-famz@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:50:47 +01:00
Fam Zheng
7429e20788 vpc: Assign bs->file->bs to file in vpc_co_get_block_status
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-11-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:50:47 +01:00
Fam Zheng
8bfb137152 vdi: Assign bs->file->bs to file in vdi_co_get_block_status
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-10-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:50:47 +01:00
Fam Zheng
d234c92931 sheepdog: Assign bs to file in sd_co_get_block_status
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-9-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:50:47 +01:00
Fam Zheng
53f1dfd1ff qed: Assign bs->file->bs to file in bdrv_qed_co_get_block_status
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-8-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:50:47 +01:00
Fam Zheng
ddf4987d76 parallels: Assign bs->file->bs to file in parallels_co_get_block_status
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-7-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:50:47 +01:00
Fam Zheng
3399833f14 iscsi: Assign bs to file in iscsi_co_get_block_status
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-6-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:50:47 +01:00
Fam Zheng
02650acbc6 raw: Assign bs to file in raw_co_get_block_status
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-5-git-send-email-famz@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:50:47 +01:00
Fam Zheng
178b4db7e5 qcow2: Assign bs->file->bs to file in qcow2_co_get_block_status
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-4-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:50:47 +01:00
Fam Zheng
3064bf6fff qcow: Assign bs->file->bs to file in qcow_co_get_block_status
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-3-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:50:47 +01:00
Fam Zheng
67a0fd2a9b block: Add "file" output parameter to block status query functions
The added parameter can be used to return the BDS pointer which the
valid offset is referring to. Its value should be ignored unless
BDRV_BLOCK_OFFSET_VALID in ret is set.

Until block drivers fill in the right value, let's clear it explicitly
right before calling .bdrv_get_block_status.

The "bs->file" condition in bdrv_co_get_block_status is kept now to keep iotest
case 102 passing, and will be fixed once all drivers return the right file
pointer.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1453780743-16806-2-git-send-email-famz@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:50:47 +01:00
Paolo Bonzini
1963f8d52e block: acquire in bdrv_query_image_info
NFS calls aio_poll inside bdrv_get_allocated_size.  This requires
acquiring the AioContext.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1450867706-19860-1-git-send-email-pbonzini@redhat.com
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:50:47 +01:00
Max Reitz
c78dc18295 iotests: Add test for block jobs and BDS ejection
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:50:46 +01:00
Max Reitz
15a2b18fe5 iotests: Add test for multiple BB on BDS tree
This adds a test for having multiple BlockBackends in one BDS tree. In
this case, there is one BB for the protocol BDS and one BB for the
format BDS in a simple two-BDS tree (with the protocol BDS and BB added
first).

When bdrv_close_all() is executed, no cached data from any BDS should be
lost; the protocol BDS may not be closed until the format BDS is closed.
Otherwise, metadata updates may be lost.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:50:46 +01:00
Max Reitz
ca9bd24cf1 block: Rewrite bdrv_close_all()
This patch rewrites bdrv_close_all(): Until now, all root BDSs have been
force-closed. This is bad because it can lead to cached data not being
flushed to disk.

Instead, try to make all reference holders relinquish their reference
voluntarily:

1. All BlockBackend users are handled by making all BBs simply eject
   their BDS tree. Since a BDS can never be on top of a BB, this will
   not cause any of the issues as seen with the force-closing of BDSs.
   The references will be relinquished and any further access to the BB
   will fail gracefully.
2. All BDSs which are owned by the monitor itself (because they do not
   have a BB) are relinquished next.
3. Besides BBs and the monitor, block jobs and other BDSs are the only
   things left that can hold a reference to BDSs. After every remaining
   block job has been canceled, there should not be any BDSs left (and
   the loop added here will always terminate (as long as NDEBUG is not
   defined), because either all_bdrv_states will be empty or there will
   not be any block job left to cancel, failing the assertion).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:50:46 +01:00
Max Reitz
d8da3cef3b block: Add blk_remove_all_bs()
When bdrv_close_all() is called, instead of force-closing all root
BlockDriverStates, it is better to just drop the reference from all
BlockBackends and let them be closed automatically. This prevents BDS
from getting closed that are still referenced by other BDS, which may
result in loss of cached data.

This patch adds a function for doing that, but does not yet incorporate
it in bdrv_close_all().

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:50:46 +01:00
Max Reitz
9c4218e957 blockdev: Keep track of monitor-owned BDS
As a side effect, we can now make x-blockdev-del's check whether a BDS
is actually owned by the monitor explicit.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:50:46 +01:00
Max Reitz
2c1d04e002 block: Add list of all BlockDriverStates
We need this list so that bdrv_close_all() can keep track of which BDSs
are still open after having removed the BDSs from all of the BBs and
having released all monitor BDS references.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:50:46 +01:00
Max Reitz
64dff52019 block: Make bdrv_close() static
There are no users of bdrv_close() left, except for one of bdrv_open()'s
failure paths, bdrv_close_all() and bdrv_delete(), and that is good.
Make bdrv_close() static so nobody makes the mistake of directly using
bdrv_close() again.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:50:46 +01:00
Max Reitz
938abd4325 blockdev: Use blk_remove_bs() in do_drive_del()
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:50:46 +01:00
Max Reitz
13855c6b9f block: Use blk_remove_bs() in blk_delete()
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:50:46 +01:00
Max Reitz
033cb5659a block: Remove BDS close notifier
It is unused now, so we can remove it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:50:46 +01:00
Max Reitz
741cc43133 nbd: Switch from close to eject notifier
The NBD code uses the BDS close notifier to determine when a medium is
ejected. However, now it should use the BB's BDS removal notifier for
that instead of the BDS's close notifier.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:50:46 +01:00
Max Reitz
5b9e0e4693 virtio-scsi: Catch BDS-BB removal/insertion
Make use of the BDS-BB removal and insertion notifiers to remove or set
up, respectively, virtio-scsi's op blockers.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:50:46 +01:00
Max Reitz
1b1e0659a4 virtio-blk: Functions for op blocker management
Put the code for setting up and removing op blockers into an own
function, respectively. Then, we can invoke those functions whenever a
BDS is removed from an virtio-blk BB or inserted into it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:50:46 +01:00
Max Reitz
3301f6c6e9 block: Add BB-BDS remove/insert notifiers
bdrv_close() no longer signifies ejection of a medium, this is now done
by removing the BDS from the BB. Therefore, we want to have a notifier
for that in the BB instead of a close notifier in the BDS. The former is
added now, the latter is removed later.

Symmetrically, another notifier list is added that is invoked whenever a
BDS is inserted. We will need that for virtio-blk and virtio-scsi, which
can then remove their op blockers on BDS ejection and set them up on
insertion.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:50:46 +01:00
Max Reitz
16dee4183a iotests: Add test for eject under NBD server
This patch adds a test for ejecting the BlockBackend an NBD server is
connected to (the NBD server is supposed to stop).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:50:46 +01:00
Max Reitz
c5acdc9ab4 block: Release named dirty bitmaps in bdrv_close()
bdrv_delete() is not very happy about deleting BlockDriverStates with
dirty bitmaps still attached to them. In the past, we got around that
very easily by relying on bdrv_close_all() bypassing bdrv_delete(), and
bdrv_close() simply ignoring that condition. We should fix that by
releasing all named dirty bitmaps in bdrv_close() (there should not be
any unnamed bitmaps left) and moving the assertion from bdrv_delete()
there.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:50:46 +01:00
Fam Zheng
e43f7f6f46 block: Remove unused struct definition BlockFinishData
Unused since 94db6d2d3.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:50:38 +01:00
Max Reitz
34250395fe iotests: Add test for a nonexistent NBD export
Trying to connect to a nonexistent NBD export should not crash the
server.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:49:43 +01:00
Max Reitz
15cfba693b iotests: Make redirecting qemu's stderr optional
Redirecting qemu's stderr to stdout makes working with the stderr output
difficult due to the other file descriptor magic performed in
_launch_qemu ("ambiguous redirect").

Add an option which specifies whether stderr should be redirected to
stdout or not (allowing for other modes to be added in the future).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:49:43 +01:00
Max Reitz
4a940d14b3 iotests: Make _filter_nbd support more URL types
This function should support URLs of the "nbd://" format (without
swallowing the export name), and for "nbd:///" URLs it should replace
"?socket=$TEST_DIR" by "?socket=TEST_DIR" because putting the Unix
socket files into the test directory makes sense.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:49:42 +01:00
Max Reitz
dd170c0677 iotests: Make _filter_nbd drop log lines
The NBD log lines ("/your/source/dir/nbd/xyz.c:function():line: error")
should not be converted to empty lines but removed altogether.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:49:42 +01:00
Max Reitz
60d446881d iotests: Move _filter_nbd into common.filter
_filter_nbd can be useful for other NBD tests, too, therefore it should
reside in common.filter.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:49:42 +01:00
Max Reitz
d1f9cd7084 iotests: Change coding style of _filter_nbd in 083
In order to be able to move _filter_nbd to common.filter in the next
patch, its coding style needs to be adapted to that of common.filter.
That means, we have to convert tabs to four spaces, adjust the alignment
of the last line (done with spaces already, assuming one tab equals
eight spaces), fix the line length of the comment, and add a line break
before the opening brace.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:49:42 +01:00
Max Reitz
05d0fce497 iotests: Rename filter_nbd to _filter_nbd in 083
In the patch after the next, this function is moved to common.filter.
Therefore, its name should be preceded by an underscore to signify its
global availability.

To keep the code motion patch clean, we cannot rename it in the same
patch, so we need to choose some order of renaming vs. motion. It is
better to keep a supposedly global function used by only a single test
in that test than to keep a supposedly local function in a common* file
and use it from a test, so we should rename the function before moving
it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:49:42 +01:00
Max Reitz
d3780c2dce nbd: client_close on error in nbd_co_client_start
Use client_close() if an error in nbd_co_client_start() occurs instead
of manually inlining parts of it. This fixes an assertion error on the
server side if nbd_negotiate() fails.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:49:42 +01:00
Max Reitz
cc8c46b7c5 iotests: Limit supported formats for 118
Image formats used in test 118 need to support image creation.

Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-02-02 17:49:42 +01:00
Fam Zheng
3db1d98a20 vmdk: Fix converting to streamOptimized
Commit d62d9dc4b8 lifted streamOptimized images's version to 3, but we
now refuse to open version 3 images read-write.  We need to make
streamOptimized an exception to allow converting to it. This fixes the
accidentally broken iotests case 059 for the same reason.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-02-02 17:49:34 +01:00
Max Reitz
327032ce74 block/qapi: Emit tray_open only if there is a tray
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1454096953-31773-5-git-send-email-mreitz@redhat.com
2016-02-02 17:47:06 +01:00
Max Reitz
abb3e55b5b Revert "hw/block/fdc: Implement tray status"
This reverts the changes that commit
2e1280e8ff applied to hw/block/fdc.c;
also, an additional case of drv->media_inserted use has crept in since,
which is replaced by a call to blk_is_inserted().

That commit changed tests/fdc-test.c, too, because after it, one less
TRAY_MOVED event would be emitted when executing 'change' on an empty
drive. However, now, no TRAY_MOVED events will be emitted at all, and
the tray_open status returned by query-block will always be false,
necessitating (different) changes to tests/fdc-test.c and iotest 118,
which is why this patch is not a pure revert of said commit.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 1454096953-31773-4-git-send-email-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-02 17:47:04 +01:00
Max Reitz
12c7ec87a7 blockdev: Fix 'change' for slot devices
'change' and related operations did not work when used on guest devices
featuring removable media but no actual tray, because
blk_dev_is_tray_open() always returned false for them and the
blockdev-{insert,remove}-medium commands required it to return true.

Fix this by making blockdev-{insert,remove}-medium work on tray-less
devices. Also, blockdev-{open,close}-tray are now explicitly no-ops when
invoked on such devices, and blk_dev_change_media_cb() is instead
called by blockdev-{insert,remove}-medium (for tray-less devices only).

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1454096953-31773-3-git-send-email-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-02-02 17:47:00 +01:00
Max Reitz
8f3a73bc57 block: Add blk_dev_has_tray()
Pull out the check whether a block device has a tray from
blk_dev_is_tray_open() into its own function so both attributes (whether
there is a tray vs. whether that tray is open) can be queried
independently.

Cc: qemu-stable <qemu-stable@nongnu.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1454096953-31773-2-git-send-email-mreitz@redhat.com
2016-02-02 17:46:56 +01:00
Peter Maydell
d2ea854c38 Merge remote-tracking branch 'remotes/berrange/tags/pull-qcrypto-next-2016-02-02-1' into staging
Merge qcrypto-next 2016/2/2 v1

# gpg: Signature made Tue 02 Feb 2016 13:13:05 GMT using RSA key ID 15104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"

* remotes/berrange/tags/pull-qcrypto-next-2016-02-02-1:
  crypto: ensure qcrypto_hash_digest_len is always defined
  crypto: register properties against the class instead of object
  crypto: fix description of @errp parameter initialization

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-02 15:55:01 +00:00
Peter Maydell
baa3f63827 Merge remote-tracking branch 'remotes/kraxel/tags/pull-ui-20160202-1' into staging
ui: gtk vc fix, adaptive sdl refresh.

# gpg: Signature made Tue 02 Feb 2016 13:06:07 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-ui-20160202-1:
  sdl: shorten the GUI refresh interval when mouse or keyboard is active
  gtk: use qemu_chr_alloc() to allocate CharDriverState

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-02 15:18:39 +00:00
Peter Maydell
958e369360 Merge remote-tracking branch 'remotes/kraxel/tags/pull-audio-20160202-1' into staging
audio: Clean up includes

# gpg: Signature made Tue 02 Feb 2016 12:58:06 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-audio-20160202-1:
  audio: Clean up includes

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-02 14:55:01 +00:00
Peter Maydell
dce0238c74 Merge remote-tracking branch 'remotes/kraxel/tags/pull-fwcfg-20160202-1' into staging
nvme: generate OpenFirmware device path in the "bootorder" fw_cfg file

# gpg: Signature made Tue 02 Feb 2016 12:54:04 GMT using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-fwcfg-20160202-1:
  nvme: generate OpenFirmware device path in the "bootorder" fw_cfg file

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-02 14:27:12 +00:00
Peter Maydell
074d1ccb42 Merge remote-tracking branch 'remotes/elmarco/tags/ivshmem-pull-request' into staging
# gpg: Signature made Tue 02 Feb 2016 12:43:03 GMT using RSA key ID 75969CE5
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>"
# gpg:                 aka "Marc-André Lureau <marcandre.lureau@gmail.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276  F62D DAE8 E109 7596 9CE5

* remotes/elmarco/tags/ivshmem-pull-request:
  char: remove qemu_chr_open_eventfd
  ivshmem: use a single eventfd callback, get rid of CharDriver
  ivshmem: generalize ivshmem_setup_interrupts
  ivshmem-test: test both msi & irq cases
  libqos: remove some leaks
  ivshmem-test: leak fixes
  ivshmem: remove redundant assignment, fix crash with msi=off
  ivshmem: no need for opaque argument

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-02-02 13:31:19 +00:00
Gerd Hoffmann
5a8660741a ehci: update irq on reset
After clearing the status register we also have to update the irq line
status.  Otherwise a irq which happends to be pending at reset time
causes a interrupt storm.  And the guest can't stop as the status
register doesn't indicate any pending interrupt.

Both NetBSD and FreeBSD hang on shutdown because of that.

Cc: qemu-stable@nongnu.org
Reported-by: Andrey Korolyov <andrey@xdel.ru>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1453203884-4125-1-git-send-email-kraxel@redhat.com
2016-02-02 14:11:01 +01:00
Prasad J Pandit
49d925ce50 usb: check page select value while processing iTD
While processing isochronous transfer descriptors(iTD), the page
select(PG) field value could lead to an OOB read access. Add
check to avoid it.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1453233406-12165-1-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-02 14:11:01 +01:00
Jindřich Makovička
56bdd4b69a sdl: shorten the GUI refresh interval when mouse or keyboard is active
Signed-off-by: Jindřich Makovička <makovick@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-02 14:05:07 +01:00
Daniel P. Berrange
919e11f373 gtk: use qemu_chr_alloc() to allocate CharDriverState
The gd_vc_handler() callback is using g_malloc0() to
allocate the CharDriverState struct. As a result the
logfd field is getting initialized to 0, instead of
-1 when no logfile is requested.

The result is that when running

 $ qemu-system-i386 -nodefaults -chardev vc,id=mon0 -mon chardev=mon0

qemu duplicates all monitor output to stdout as well
as the GTK window.

Not using qemu_chr_alloc() was already a bug, but harmless
until this commit

  commit d0d7708ba2
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Mon Jan 11 12:44:41 2016 +0000

    qemu-char: add logfile facility to all chardev backends

which exposed the problem as a behaviour regression

Reported-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Message-id: 1453377386-10190-1-git-send-email-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-02 14:05:07 +01:00
Daniel P. Berrange
c0377a7cc6 crypto: ensure qcrypto_hash_digest_len is always defined
The qcrypto_hash_digest_len method was accidentally inside
a CONFIG_GNUTLS_HASH block, even though it doesn't depend
on gnutls. Re-arrange it to be unconditionally defined.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-02-02 13:02:56 +00:00
Marc-André Lureau
6db2625572 char: remove qemu_chr_open_eventfd
Broken since d0d7708ba2, since the backend is NULL.

And now no longer needed by ivshmem.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-02-02 13:28:58 +01:00
Marc-André Lureau
9940c3236f ivshmem: use a single eventfd callback, get rid of CharDriver
Simplify the interrupt handling by having a single callback on irq&msi
cases. Remove usage of CharDriver, replace it with
qemu_set_fd_handler(). Use event_notifier_test_and_clear() to read the
eventfd.

Before this patch, ivshmem writes the first byte received to
s->intrstatus. But ivshmem_device_spec.txt says "The status register is
set to 1 when an interrupt occurs." Fortunately, the byte usually comes
from another ivshmem device, and those always write 1.

After this commit, follows the specification, set to 1 when an interrupt
occurs.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Markus Armbruster <armbru@redhat.com>
2016-02-02 13:28:58 +01:00
Marc-André Lureau
fd47bfe5ad ivshmem: generalize ivshmem_setup_interrupts
Call ivshmem_setup_interrupts() with or without MSI, always allocate
msi_vectors that is going to be used in all case in the following patch.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-02-02 13:28:58 +01:00
Marc-André Lureau
00ffc3c166 ivshmem-test: test both msi & irq cases
Recent commit 660c97ee introduced a regression in irq case, make
sure this code path is also tested.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-02-02 13:28:58 +01:00
Marc-André Lureau
ea53854a54 libqos: remove some leaks
qpci_device_find() returns allocated data, don't leak it.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-02-02 13:28:58 +01:00
Marc-André Lureau
1760048a5d ivshmem-test: leak fixes
Add a cleanup_vm() function to free QPCIDevice & QPCIBus when cleaning
up the IVState.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-02-02 13:28:58 +01:00
Marc-André Lureau
47213eb110 ivshmem: remove redundant assignment, fix crash with msi=off
Fix crash when msi=false introduced in 660c97ee (msi_vectors is NULL in
this case)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-02-02 13:28:58 +01:00
Marc-André Lureau
2c64846972 ivshmem: no need for opaque argument
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-02-02 13:28:58 +01:00
Laszlo Ersek
a907ec52cc nvme: generate OpenFirmware device path in the "bootorder" fw_cfg file
Background on QEMU boot indices
-------------------------------

Normally, the "bootindex" property is configured for bootable devices
with:

  DEVICE_instance_init()
    device_add_bootindex_property(..., "bootindex", ...)
      object_property_add(..., device_get_bootindex,
                          device_set_bootindex, ...)

and when the bootindex is set on the QEMU command line, with

  -device DEVICE,...,bootindex=N

the setter that was configured above is invoked:

  device_set_bootindex()
    /* parse boot index */
    visit_type_int32()

    /* verify unicity */
    check_boot_index()

    /* store parsed boot index */
    ...

    /* insert device path to boot order */
    add_boot_device_path()

In the last step, add_boot_device_path() ensures that an OpenFirmware
device path will show up in the "bootorder" fw_cfg file, at a position
corresponding to the device's boot index. Thus guest firmware (SeaBIOS and
OVMF) can try to boot off the device with the right priority.

NVMe boot index
---------------

In QEMU commit 33739c7129,

  nvma: ide: add bootindex to qom property

the following generic setters / getters:
- device_set_bootindex()
- device_get_bootindex()

were open-coded for NVMe, under the names
- nvme_set_bootindex()
- nvme_get_bootindex()

Plus nvme_instance_init() was added to configure the "bootindex" property
manually, designating the open-coded getter & setter, rather than calling
device_add_bootindex_property().

Crucially, nvme_set_bootindex() avoided the final add_boot_device_path()
call. This fact is spelled out in the message of commit 33739c7129, and
it was presumably the entire reason for all of the code duplication.

Now, Vladislav filed an RFE for OVMF
<https://github.com/tianocore/edk2/issues/48>; OVMF should boot off NVMe
devices. It is simple to build edk2's existent NvmExpressDxe driver into
OVMF, but the boot order matching logic in OVMF can only handle NVMe if
the "bootorder" fw_cfg file includes such devices.

Therefore this patch converts the NVMe device model to
device_set_bootindex() all the way.

Device paths
------------

device_set_bootindex() accepts an optional parameter called "suffix". When
present, it is expected to take the form of an OpenFirmware device path
node, and it gets appended as last node to the otherwise auto-generated
OFW path.

For NVMe, the auto-generated part is

  /pci@i0cf8/pci8086,5845@6[,1]
       ^     ^            ^  ^
       |     |            PCI slot and (present when nonzero)
       |     |            function of the NVMe controller, both hex
       |     "driver name" component, built from PCI vendor & device IDs
       PCI root at system bus port, PIO

to which here we append the suffix

  /namespace@1,0
             ^ ^
             | big endian (MSB at lowest address) numeric interpretation
             | of the 64-bit IEEE Extended Unique Identifier, aka EUI-64,
             | hex
             32-bit NVMe namespace identifier, aka NSID, hex

resulting in the OFW device path

  /pci@i0cf8/pci8086,5845@6[,1]/namespace@1,0

The reason for including the NSID and the EUI-64 is that an NVMe device
can in theory produce several different namespaces (distinguished by
NSID). Additionally, each of those may (optionally) have an EUI-64 value.

For now, QEMU only provides namespace 1.

Furthermore, QEMU doesn't even represent the EUI-64 as a standalone field;
it is embedded (and left unused) inside the "NvmeIdNs.res30" array, at the
last eight bytes. (Which is fine, since EUI-64 can be left zero-filled if
unsupported by the device.)

Based on the above, we set the "unit address" part of the last
("namespace") node to fixed "1,0".

OVMF will then map the above OFW device path to the following UEFI device
path fragment, for boot order processing:

  PciRoot(0x0)/Pci(0x6,0x1)/NVMe(0x1,00-00-00-00-00-00-00-00)
          ^        ^   ^    ^    ^   ^
          |        |   |    |    |   octets of the EUI-64 in address order
          |        |   |    |    NSID
          |        |   |    NVMe namespace messaging device path node
          |        PCI slot and function
          PCI root bridge

Cc: Keith Busch <keith.busch@intel.com> (supporter:nvme)
Cc: Kevin Wolf <kwolf@redhat.com> (supporter:Block layer core)
Cc: qemu-block@nongnu.org (open list:nvme)
Cc: Gonglei <arei.gonglei@huawei.com>
Cc: Vladislav Vovchenko <vladislav.vovchenko@sk.com>
Cc: Feng Tian <feng.tian@intel.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Kevin O'Connor <kevin@koconnor.net>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Acked-by: Gonglei <arei.gonglei@huawei.com>
Acked-by: Keith Busch <keith.busch@intel.com>
Tested-by: Vladislav Vovchenko <vladislav.vovchenko@sk.com>
Message-id: 1453850483-27511-1-git-send-email-lersek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-02-02 12:45:01 +01:00
Daniel P. Berrange
9884abee8f crypto: register properties against the class instead of object
This converts the tlscredsx509, tlscredsanon and secret objects
to register their properties against the class rather than object.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-02-01 14:11:35 +00:00
Daniel P. Berrange
07982d2ee9 crypto: fix description of @errp parameter initialization
The "Error **errp" parameters must be NULL initialized
not uninitialized.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-02-01 14:11:35 +00:00
1389 changed files with 50641 additions and 18574 deletions

View File

@@ -1,9 +1,38 @@
sudo: false
language: c
python:
- "2.4"
compiler:
- gcc
- clang
cache: ccache
addons:
apt:
packages:
- libaio-dev
- libattr1-dev
- libbrlapi-dev
- libcap-ng-dev
- libgnutls-dev
- libgtk-3-dev
- libiscsi-dev
- liblttng-ust-dev
- libncurses5-dev
- libnss3-dev
- libpixman-1-dev
- libpng12-dev
- librados-dev
- libsdl1.2-dev
- libseccomp-dev
- libspice-protocol-dev
- libspice-server-dev
- libssh2-1-dev
- liburcu-dev
- libusb-1.0-0-dev
- libvte-2.90-dev
- sparse
- uuid-dev
notifications:
irc:
channels:
@@ -12,29 +41,16 @@ notifications:
on_failure: always
env:
global:
- TEST_CMD=""
- TEST_CMD="make check"
- EXTRA_CONFIG=""
# Development packages, EXTRA_PKGS saved for additional builds
- CORE_PKGS="libusb-1.0-0-dev libiscsi-dev librados-dev libncurses5-dev"
- NET_PKGS="libseccomp-dev libgnutls-dev libssh2-1-dev libspice-server-dev libspice-protocol-dev libnss3-dev"
- GUI_PKGS="libgtk-3-dev libvte-2.90-dev libsdl1.2-dev libpng12-dev libpixman-1-dev"
- EXTRA_PKGS=""
matrix:
# Group major targets together with their linux-user counterparts
- TARGETS=alpha-softmmu,alpha-linux-user
- TARGETS=alpha-softmmu,alpha-linux-user,cris-softmmu,cris-linux-user,m68k-softmmu,m68k-linux-user,microblaze-softmmu,microblazeel-softmmu,microblaze-linux-user,microblazeel-linux-user
- TARGETS=arm-softmmu,arm-linux-user,armeb-linux-user,aarch64-softmmu,aarch64-linux-user
- TARGETS=cris-softmmu,cris-linux-user
- TARGETS=i386-softmmu,i386-linux-user,x86_64-softmmu,x86_64-linux-user
- TARGETS=m68k-softmmu,m68k-linux-user
- TARGETS=microblaze-softmmu,microblazeel-softmmu,microblaze-linux-user,microblazeel-linux-user
- TARGETS=mips-softmmu,mips64-softmmu,mips64el-softmmu,mipsel-softmmu
- TARGETS=mips-linux-user,mips64-linux-user,mips64el-linux-user,mipsel-linux-user,mipsn32-linux-user,mipsn32el-linux-user
- TARGETS=or32-softmmu,or32-linux-user
- TARGETS=ppc-softmmu,ppc64-softmmu,ppcemb-softmmu,ppc-linux-user,ppc64-linux-user,ppc64abi32-linux-user,ppc64le-linux-user
- TARGETS=s390x-softmmu,s390x-linux-user
- TARGETS=sh4-softmmu,sh4eb-softmmu,sh4-linux-user sh4eb-linux-user
- TARGETS=sparc-softmmu,sparc64-softmmu,sparc-linux-user,sparc32plus-linux-user,sparc64-linux-user
- TARGETS=unicore32-softmmu,unicore32-linux-user
- TARGETS=mips-softmmu,mips64-softmmu,mips64el-softmmu,mipsel-softmmu,mips-linux-user,mips64-linux-user,mips64el-linux-user,mipsel-linux-user,mipsn32-linux-user,mipsn32el-linux-user
- TARGETS=or32-softmmu,or32-linux-user,ppc-softmmu,ppc64-softmmu,ppcemb-softmmu,ppc-linux-user,ppc64-linux-user,ppc64abi32-linux-user,ppc64le-linux-user
- TARGETS=s390x-softmmu,s390x-linux-user,sh4-softmmu,sh4eb-softmmu,sh4-linux-user,sh4eb-linux-user,sparc-softmmu,sparc64-softmmu,sparc-linux-user,sparc32plus-linux-user,sparc64-linux-user,unicore32-softmmu,unicore32-linux-user
# Group remaining softmmu only targets into one build
- TARGETS=lm32-softmmu,moxie-softmmu,tricore-softmmu,xtensa-softmmu,xtensaeb-softmmu
git:
@@ -43,8 +59,6 @@ git:
before_install:
- wget -O - http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar -xvJ
- git submodule update --init --recursive
- sudo apt-get update -qq
- sudo apt-get install -qq ${CORE_PKGS} ${NET_PKGS} ${GUI_PKGS} ${EXTRA_PKGS}
before_script:
- ./configure --target-list=${TARGETS} --enable-debug-tcg ${EXTRA_CONFIG}
script:
@@ -52,44 +66,59 @@ script:
matrix:
# We manually include a number of additional build for non-standard bits
include:
# Make check target (we only do this once)
- env:
- TARGETS=alpha-softmmu,arm-softmmu,aarch64-softmmu,cris-softmmu,i386-softmmu,x86_64-softmmu,m68k-softmmu,microblaze-softmmu,microblazeel-softmmu,mips-softmmu,mips64-softmmu,mips64el-softmmu,mipsel-softmmu,or32-softmmu,ppc-softmmu,ppc64-softmmu,ppcemb-softmmu,s390x-softmmu,sh4-softmmu,sh4eb-softmmu,sparc-softmmu,sparc64-softmmu,unicore32-softmmu,unicore32-linux-user,lm32-softmmu,moxie-softmmu,tricore-softmmu,xtensa-softmmu,xtensaeb-softmmu
TEST_CMD="make check"
compiler: gcc
# Debug related options
- env: TARGETS=i386-softmmu,x86_64-softmmu
- env: TARGETS=x86_64-softmmu
EXTRA_CONFIG="--enable-debug"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
# We currently disable "make check"
- env: TARGETS=alpha-softmmu
EXTRA_CONFIG="--enable-debug --enable-tcg-interpreter"
TEST_CMD=""
compiler: gcc
# All the extra -dev packages
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_PKGS="libaio-dev libcap-ng-dev libattr1-dev libbrlapi-dev uuid-dev libusb-1.0.0-dev"
# Disable a few of the optional features
- env: TARGETS=x86_64-softmmu
EXTRA_CONFIG="--disable-linux-aio --disable-cap-ng --disable-attr --disable-brlapi --disable-uuid --disable-libusb"
compiler: gcc
# Currently configure doesn't force --disable-pie
- env: TARGETS=i386-softmmu,x86_64-softmmu
- env: TARGETS=x86_64-softmmu
EXTRA_CONFIG="--enable-gprof --enable-gcov --disable-pie"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_PKGS="sparse"
# Sparse
- env: TARGETS=x86_64-softmmu
EXTRA_CONFIG="--enable-sparse"
compiler: gcc
# All the trace backends (apart from dtrace)
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-trace-backends=stderr"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-trace-backends=simple"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-trace-backends=ftrace"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
EXTRA_PKGS="liblttng-ust-dev liburcu-dev"
EXTRA_CONFIG="--enable-trace-backends=ust"
compiler: gcc
- env: TARGETS=i386-softmmu,x86_64-softmmu
# Modules
- env: TARGETS=arm-softmmu,x86_64-softmmu
EXTRA_CONFIG="--enable-modules"
compiler: gcc
# All the trace backends (apart from dtrace)
- env: TARGETS=i386-softmmu
EXTRA_CONFIG="--enable-trace-backends=log"
compiler: gcc
# We currently disable "make check" (until 41fc57e44ed regression fixed)
- env: TARGETS=x86_64-softmmu
EXTRA_CONFIG="--enable-trace-backends=simple"
TEST_CMD=""
compiler: gcc
# We currently disable "make check"
- env: TARGETS=x86_64-softmmu
EXTRA_CONFIG="--enable-trace-backends=ftrace"
TEST_CMD=""
compiler: gcc
# We currently disable "make check"
- env: TARGETS=x86_64-softmmu
EXTRA_CONFIG="--enable-trace-backends=ust"
TEST_CMD=""
compiler: gcc
# All the co-routine backends (apart from windows)
# We currently disable "make check"
- env: TARGETS=x86_64-softmmu
EXTRA_CONFIG="--with-coroutine=gthread"
TEST_CMD=""
compiler: gcc
- env: TARGETS=x86_64-softmmu
EXTRA_CONFIG="--with-coroutine=ucontext"
compiler: gcc
- env: TARGETS=x86_64-softmmu
EXTRA_CONFIG="--with-coroutine=sigaltstack"
compiler: gcc

55
HACKING
View File

@@ -157,3 +157,58 @@ painful. These are:
* you may assume that integers are 2s complement representation
* you may assume that right shift of a signed integer duplicates
the sign bit (ie it is an arithmetic shift, not a logical shift)
7. Error handling and reporting
7.1 Reporting errors to the human user
Do not use printf(), fprintf() or monitor_printf(). Instead, use
error_report() or error_vreport() from error-report.h. This ensures the
error is reported in the right place (current monitor or stderr), and in
a uniform format.
Use error_printf() & friends to print additional information.
error_report() prints the current location. In certain common cases
like command line parsing, the current location is tracked
automatically. To manipulate it manually, use the loc_*() from
error-report.h.
7.2 Propagating errors
An error can't always be reported to the user right where it's detected,
but often needs to be propagated up the call chain to a place that can
handle it. This can be done in various ways.
The most flexible one is Error objects. See error.h for usage
information.
Use the simplest suitable method to communicate success / failure to
callers. Stick to common methods: non-negative on success / -1 on
error, non-negative / -errno, non-null / null, or Error objects.
Example: when a function returns a non-null pointer on success, and it
can fail only in one way (as far as the caller is concerned), returning
null on failure is just fine, and certainly simpler and a lot easier on
the eyes than propagating an Error object through an Error ** parameter.
Example: when a function's callers need to report details on failure
only the function really knows, use Error **, and set suitable errors.
Do not report an error to the user when you're also returning an error
for somebody else to handle. Leave the reporting to the place that
consumes the error returned.
7.3 Handling errors
Calling exit() is fine when handling configuration errors during
startup. It's problematic during normal operation. In particular,
monitor commands should never exit().
Do not call exit() or abort() to handle an error that can be triggered
by the guest (e.g., some unimplemented corner case in guest code
translation or device emulation). Guests should not be able to
terminate QEMU.
Note that &error_fatal is just another way to exit(1), and &error_abort
is just another way to abort().

View File

@@ -52,6 +52,11 @@ General Project Administration
------------------------------
M: Peter Maydell <peter.maydell@linaro.org>
All patches CC here
L: qemu-devel@nongnu.org
F: *
F: */
Responsible Disclosure, Reporting Security Issues
------------------------------
W: http://wiki.qemu.org/SecurityProcess
@@ -79,6 +84,13 @@ F: include/exec/exec-all.h
F: include/exec/helper*.h
F: include/exec/tb-hash.h
FPU emulation
M: Aurelien Jarno <aurelien@aurel32.net>
M: Peter Maydell <peter.maydell@linaro.org>
S: Odd Fixes
F: fpu/
F: include/fpu/
Alpha
M: Richard Henderson <rth@twiddle.net>
S: Maintained
@@ -222,6 +234,7 @@ L: kvm@vger.kernel.org
S: Supported
F: kvm-*
F: */kvm.*
F: include/sysemu/kvm*.h
ARM
M: Peter Maydell <peter.maydell@linaro.org>
@@ -351,6 +364,7 @@ M: Dmitry Solodkiy <d.solodkiy@samsung.com>
L: qemu-arm@nongnu.org
S: Maintained
F: hw/*/exynos*
F: include/hw/arm/exynos4210.h
Calxeda Highbank
M: Rob Herring <robh@kernel.org>
@@ -378,6 +392,7 @@ L: qemu-arm@nongnu.org
S: Odd fixes
F: hw/*/imx*
F: hw/arm/kzm.c
F: include/hw/arm/fsl-imx31.h
Integrator CP
M: Peter Maydell <peter.maydell@linaro.org>
@@ -420,6 +435,7 @@ F: hw/arm/spitz.c
F: hw/arm/tosa.c
F: hw/arm/z2.c
F: hw/*/pxa2xx*
F: include/hw/arm/pxa.h
Stellaris
M: Peter Maydell <peter.maydell@linaro.org>
@@ -641,12 +657,6 @@ F: hw/*/grlib*
S390 Machines
-------------
S390 Virtio
M: Alexander Graf <agraf@suse.de>
S: Maintained
F: hw/s390x/s390-*.c
X: hw/s390x/*pci*.[hc]
S390 Virtio-ccw
M: Cornelia Huck <cornelia.huck@de.ibm.com>
M: Christian Borntraeger <borntraeger@de.ibm.com>
@@ -654,7 +664,6 @@ M: Alexander Graf <agraf@suse.de>
S: Supported
F: hw/char/sclp*.[hc]
F: hw/s390x/
X: hw/s390x/s390-virtio-bus.[ch]
F: include/hw/s390x/
F: pc-bios/s390-ccw/
F: hw/watchdog/wdt_diag288.c
@@ -708,6 +717,12 @@ F: hw/timer/hpet*
F: hw/timer/i8254*
F: hw/timer/mc146818rtc*
Machine core
M: Eduardo Habkost <ehabkost@redhat.com>
M: Marcel Apfelbaum <marcel@redhat.com>
S: Supported
F: hw/core/machine.c
F: include/hw/boards.h
Xtensa Machines
---------------
@@ -756,6 +771,7 @@ OMAP
M: Peter Maydell <peter.maydell@linaro.org>
S: Maintained
F: hw/*/omap*
F: include/hw/arm/omap.h
IPack
M: Alberto Garcia <berto@igalia.com>
@@ -841,6 +857,10 @@ M: Gerd Hoffmann <kraxel@redhat.com>
S: Maintained
F: hw/usb/*
F: tests/usb-*-test.c
F: docs/usb2.txt
F: docs/usb-storage.txt
F: include/hw/usb.h
F: include/hw/usb/
USB (serial adapter)
M: Gerd Hoffmann <kraxel@redhat.com>
@@ -852,6 +872,7 @@ VFIO
M: Alex Williamson <alex.williamson@redhat.com>
S: Supported
F: hw/vfio/*
F: include/hw/vfio/
vhost
M: Michael S. Tsirkin <mst@redhat.com>
@@ -863,6 +884,7 @@ M: Michael S. Tsirkin <mst@redhat.com>
S: Supported
F: hw/*/virtio*
F: net/vhost-user.c
F: include/hw/virtio/
virtio-9p
M: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
@@ -908,6 +930,7 @@ M: Amit Shah <amit.shah@redhat.com>
S: Supported
F: hw/virtio/virtio-rng.c
F: include/hw/virtio/virtio-rng.h
F: include/sysemu/rng*.h
F: backends/rng*.c
nvme
@@ -993,7 +1016,7 @@ F: blockjob.c
F: include/block/blockjob.h
F: block/backup.c
F: block/commit.c
F: block/stream.h
F: block/stream.c
F: block/mirror.c
T: git git://github.com/codyprime/qemu-kvm-jtc.git block
@@ -1071,6 +1094,7 @@ SPICE
M: Gerd Hoffmann <kraxel@redhat.com>
S: Supported
F: include/ui/qemu-spice.h
F: include/ui/spice-display.h
F: ui/spice-*.c
F: audio/spiceaudio.c
F: hw/display/qxl*
@@ -1079,6 +1103,7 @@ Graphics
M: Gerd Hoffmann <kraxel@redhat.com>
S: Odd Fixes
F: ui/
F: include/ui/
Cocoa graphics
M: Andreas Färber <andreas.faerber@web.de>
@@ -1106,6 +1131,7 @@ Network device backends
M: Jason Wang <jasowang@redhat.com>
S: Maintained
F: net/
F: include/net/
T: git git://github.com/jasowang/qemu.git net
Netmap network backend
@@ -1201,10 +1227,12 @@ F: scripts/qmp/
T: git git://repo.or.cz/qemu/armbru.git qapi-next
SLIRP
M: Samuel Thibault <samuel.thibault@ens-lyon.org>
M: Jan Kiszka <jan.kiszka@siemens.com>
S: Maintained
F: slirp/
F: net/slirp.c
F: include/net/slirp.h
T: git git://git.kiszka.org/qemu.git queues/slirp
Tracing
@@ -1229,6 +1257,7 @@ F: include/migration/
F: migration/
F: scripts/vmstate-static-checker.py
F: tests/vmstate-static-checker-data/
F: docs/migration.txt
Seccomp
M: Eduardo Otubo <eduardo.otubo@profitbricks.com>
@@ -1271,6 +1300,15 @@ S: Maintained
F: include/qemu/sockets.h
F: util/qemu-sockets.c
Throttling infrastructure
M: Alberto Garcia <berto@igalia.com>
S: Supported
F: block/throttle-groups.c
F: include/block/throttle-groups.h
F: include/qemu/throttle.h
F: util/throttle.c
L: qemu-block@nongnu.org
Usermode Emulation
------------------
Overall
@@ -1566,6 +1604,12 @@ L: qemu-block@nongnu.org
S: Supported
F: tests/image-fuzzer/
Build and test automation
-------------------------
M: Alex Bennée <alex.bennee@linaro.org>
L: qemu-devel@nongnu.org
S: Supported
F: .travis.yml
Documentation
-------------

View File

@@ -234,11 +234,11 @@ util/module.o-cflags = -D'CONFIG_BLOCK_MODULES=$(block-modules)'
qemu-img.o: qemu-img-cmds.h
qemu-img$(EXESUF): qemu-img.o $(block-obj-y) $(crypto-obj-y) $(qom-obj-y) libqemuutil.a libqemustub.a
qemu-nbd$(EXESUF): qemu-nbd.o $(block-obj-y) $(crypto-obj-y) $(qom-obj-y) libqemuutil.a libqemustub.a
qemu-io$(EXESUF): qemu-io.o $(block-obj-y) $(crypto-obj-y) $(qom-obj-y) libqemuutil.a libqemustub.a
qemu-img$(EXESUF): qemu-img.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) libqemuutil.a libqemustub.a
qemu-nbd$(EXESUF): qemu-nbd.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) libqemuutil.a libqemustub.a
qemu-io$(EXESUF): qemu-io.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) libqemuutil.a libqemustub.a
qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o
qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o libqemuutil.a libqemustub.a
fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o fsdev/9p-marshal.o fsdev/9p-iov-marshal.o libqemuutil.a libqemustub.a
fsdev/virtfs-proxy-helper$(EXESUF): LIBS += -lcap
@@ -272,7 +272,8 @@ $(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
qapi-modules = $(SRC_PATH)/qapi-schema.json $(SRC_PATH)/qapi/common.json \
$(SRC_PATH)/qapi/block.json $(SRC_PATH)/qapi/block-core.json \
$(SRC_PATH)/qapi/event.json $(SRC_PATH)/qapi/introspect.json \
$(SRC_PATH)/qapi/crypto.json
$(SRC_PATH)/qapi/crypto.json $(SRC_PATH)/qapi/rocker.json \
$(SRC_PATH)/qapi/trace.json
qapi-types.c qapi-types.h :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
@@ -328,7 +329,7 @@ ifneq ($(EXESUF),)
qemu-ga: qemu-ga$(EXESUF) $(QGA_VSS_PROVIDER) $(QEMU_GA_MSI)
endif
ivshmem-client$(EXESUF): $(ivshmem-client-obj-y)
ivshmem-client$(EXESUF): $(ivshmem-client-obj-y) libqemuutil.a libqemustub.a
$(call LINK, $^)
ivshmem-server$(EXESUF): $(ivshmem-server-obj-y) libqemuutil.a libqemustub.a
$(call LINK, $^)
@@ -390,7 +391,7 @@ bepo cz
ifdef INSTALL_BLOBS
BLOBS=bios.bin bios-256k.bin sgabios.bin vgabios.bin vgabios-cirrus.bin \
vgabios-stdvga.bin vgabios-vmware.bin vgabios-qxl.bin vgabios-virtio.bin \
acpi-dsdt.aml q35-acpi-dsdt.aml \
acpi-dsdt.aml \
ppc_rom.bin openbios-sparc32 openbios-sparc64 openbios-ppc QEMU,tcx.bin QEMU,cgthree.bin \
pxe-e1000.rom pxe-eepro100.rom pxe-ne2k_pci.rom \
pxe-pcnet.rom pxe-rtl8139.rom pxe-virtio.rom \
@@ -399,7 +400,6 @@ efi-pcnet.rom efi-rtl8139.rom efi-virtio.rom \
qemu-icon.bmp qemu_logo_no_text.svg \
bamboo.dtb petalogix-s3adsp1800.dtb petalogix-ml605.dtb \
multiboot.bin linuxboot.bin kvmvapic.bin \
s390-zipl.rom \
s390-ccw.img \
spapr-rtas.bin slof.bin \
palcode-clipper \

View File

@@ -1,6 +1,6 @@
#######################################################################
# Common libraries for tools and emulators
stub-obj-y = stubs/
stub-obj-y = stubs/ crypto/
util-obj-y = util/ qobject/ qapi/
util-obj-y += qmp-introspect.o qapi-types.o qapi-visit.o qapi-event.o
@@ -89,7 +89,6 @@ endif
#######################################################################
# Target-independent parts used in system and user emulation
common-obj-y += qemu-log.o
common-obj-y += tcg-runtime.o
common-obj-y += hw/
common-obj-y += qom/

View File

@@ -23,6 +23,7 @@
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "sysemu/accel.h"
#include "hw/boards.h"
#include "qemu-common.h"

View File

@@ -13,11 +13,12 @@
* GNU GPL, version 2 or (at your option) any later version.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "block/block.h"
#include "qemu/queue.h"
#include "qemu/sockets.h"
#ifdef CONFIG_EPOLL
#ifdef CONFIG_EPOLL_CREATE1
#include <sys/epoll.h>
#endif
@@ -32,7 +33,7 @@ struct AioHandler
QLIST_ENTRY(AioHandler) node;
};
#ifdef CONFIG_EPOLL
#ifdef CONFIG_EPOLL_CREATE1
/* The fd number threashold to switch to epoll */
#define EPOLL_ENABLE_THRESHOLD 64
@@ -482,7 +483,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
void aio_context_setup(AioContext *ctx, Error **errp)
{
#ifdef CONFIG_EPOLL
#ifdef CONFIG_EPOLL_CREATE1
assert(!ctx->epollfd);
ctx->epollfd = epoll_create1(EPOLL_CLOEXEC);
if (ctx->epollfd == -1) {

View File

@@ -15,6 +15,7 @@
* GNU GPL, version 2 or (at your option) any later version.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "block/block.h"
#include "qemu/queue.h"

View File

@@ -21,7 +21,7 @@
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include <stdint.h>
#include "qemu/osdep.h"
#include "sysemu/sysemu.h"
#include "sysemu/arch_init.h"
#include "hw/pci/pci.h"

View File

@@ -22,6 +22,7 @@
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "block/aio.h"
#include "block/thread-pool.h"

View File

@@ -24,7 +24,6 @@
#ifndef QEMU_AUDIO_H
#define QEMU_AUDIO_H
#include "config-host.h"
#include "qemu/queue.h"
typedef void (*audio_callback_fn) (void *opaque, int avail);

View File

@@ -21,6 +21,7 @@
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "sysemu/char.h"
#include "qemu/timer.h"
@@ -566,7 +567,7 @@ static CharDriverState *chr_baum_init(const char *id,
ChardevReturn *ret,
Error **errp)
{
ChardevCommon *common = qapi_ChardevDummy_base(backend->u.braille);
ChardevCommon *common = backend->u.braille.data;
BaumDriverState *baum;
CharDriverState *chr;
brlapi_handle_t *handle;

View File

@@ -9,6 +9,7 @@
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "sysemu/hostmem.h"
#include "sysemu/sysemu.h"

View File

@@ -9,6 +9,7 @@
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "qemu/osdep.h"
#include "sysemu/hostmem.h"
#include "qom/object_interfaces.h"

View File

@@ -9,6 +9,7 @@
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "qemu/osdep.h"
#include "sysemu/hostmem.h"
#include "hw/boards.h"
#include "qapi/visitor.h"
@@ -26,18 +27,18 @@ QEMU_BUILD_BUG_ON(HOST_MEM_POLICY_INTERLEAVE != MPOL_INTERLEAVE);
#endif
static void
host_memory_backend_get_size(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
host_memory_backend_get_size(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
uint64_t value = backend->size;
visit_type_size(v, &value, name, errp);
visit_type_size(v, name, &value, errp);
}
static void
host_memory_backend_set_size(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
host_memory_backend_set_size(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
Error *local_err = NULL;
@@ -48,7 +49,7 @@ host_memory_backend_set_size(Object *obj, Visitor *v, void *opaque,
goto out;
}
visit_type_size(v, &value, name, &local_err);
visit_type_size(v, name, &value, &local_err);
if (local_err) {
goto out;
}
@@ -63,8 +64,8 @@ out:
}
static void
host_memory_backend_get_host_nodes(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
host_memory_backend_get_host_nodes(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
uint16List *host_nodes = NULL;
@@ -91,18 +92,18 @@ host_memory_backend_get_host_nodes(Object *obj, Visitor *v, void *opaque,
node = &(*node)->next;
} while (true);
visit_type_uint16List(v, &host_nodes, name, errp);
visit_type_uint16List(v, name, &host_nodes, errp);
}
static void
host_memory_backend_set_host_nodes(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
host_memory_backend_set_host_nodes(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
#ifdef CONFIG_NUMA
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
uint16List *l = NULL;
visit_type_uint16List(v, &l, name, errp);
visit_type_uint16List(v, name, &l, errp);
while (l) {
bitmap_set(backend->host_nodes, l->value, 1);

View File

@@ -21,7 +21,7 @@
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include <stdlib.h>
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "sysemu/char.h"
#include "ui/console.h"
@@ -68,7 +68,7 @@ static CharDriverState *qemu_chr_open_msmouse(const char *id,
ChardevReturn *ret,
Error **errp)
{
ChardevCommon *common = qapi_ChardevDummy_base(backend->u.msmouse);
ChardevCommon *common = backend->u.msmouse.data;
CharDriverState *chr;
chr = qemu_chr_alloc(common, errp);

View File

@@ -10,6 +10,7 @@
* See the COPYING file in the top-level directory.
*/
#include "qemu/osdep.h"
#include "sysemu/rng.h"
#include "sysemu/char.h"
#include "qapi/qmp/qerror.h"
@@ -24,33 +25,12 @@ typedef struct RngEgd
CharDriverState *chr;
char *chr_name;
GSList *requests;
} RngEgd;
typedef struct RngRequest
{
EntropyReceiveFunc *receive_entropy;
uint8_t *data;
void *opaque;
size_t offset;
size_t size;
} RngRequest;
static void rng_egd_request_entropy(RngBackend *b, size_t size,
EntropyReceiveFunc *receive_entropy,
void *opaque)
static void rng_egd_request_entropy(RngBackend *b, RngRequest *req)
{
RngEgd *s = RNG_EGD(b);
RngRequest *req;
req = g_malloc(sizeof(*req));
req->offset = 0;
req->size = size;
req->receive_entropy = receive_entropy;
req->opaque = opaque;
req->data = g_malloc(req->size);
size_t size = req->size;
while (size > 0) {
uint8_t header[2];
@@ -64,24 +44,15 @@ static void rng_egd_request_entropy(RngBackend *b, size_t size,
size -= len;
}
s->requests = g_slist_append(s->requests, req);
}
static void rng_egd_free_request(RngRequest *req)
{
g_free(req->data);
g_free(req);
}
static int rng_egd_chr_can_read(void *opaque)
{
RngEgd *s = RNG_EGD(opaque);
GSList *i;
RngRequest *req;
int size = 0;
for (i = s->requests; i; i = i->next) {
RngRequest *req = i->data;
QSIMPLEQ_FOREACH(req, &s->parent.requests, next) {
size += req->size - req->offset;
}
@@ -93,8 +64,8 @@ static void rng_egd_chr_read(void *opaque, const uint8_t *buf, int size)
RngEgd *s = RNG_EGD(opaque);
size_t buf_offset = 0;
while (size > 0 && s->requests) {
RngRequest *req = s->requests->data;
while (size > 0 && !QSIMPLEQ_EMPTY(&s->parent.requests)) {
RngRequest *req = QSIMPLEQ_FIRST(&s->parent.requests);
int len = MIN(size, req->size - req->offset);
memcpy(req->data + req->offset, buf + buf_offset, len);
@@ -103,38 +74,13 @@ static void rng_egd_chr_read(void *opaque, const uint8_t *buf, int size)
size -= len;
if (req->offset == req->size) {
s->requests = g_slist_remove_link(s->requests, s->requests);
req->receive_entropy(req->opaque, req->data, req->size);
rng_egd_free_request(req);
rng_backend_finalize_request(&s->parent, req);
}
}
}
static void rng_egd_free_requests(RngEgd *s)
{
GSList *i;
for (i = s->requests; i; i = i->next) {
rng_egd_free_request(i->data);
}
g_slist_free(s->requests);
s->requests = NULL;
}
static void rng_egd_cancel_requests(RngBackend *b)
{
RngEgd *s = RNG_EGD(b);
/* We simply delete the list of pending requests. If there is data in the
* queue waiting to be read, this is okay, because there will always be
* more data than we requested originally
*/
rng_egd_free_requests(s);
}
static void rng_egd_opened(RngBackend *b, Error **errp)
{
RngEgd *s = RNG_EGD(b);
@@ -203,8 +149,6 @@ static void rng_egd_finalize(Object *obj)
}
g_free(s->chr_name);
rng_egd_free_requests(s);
}
static void rng_egd_class_init(ObjectClass *klass, void *data)
@@ -212,7 +156,6 @@ static void rng_egd_class_init(ObjectClass *klass, void *data)
RngBackendClass *rbc = RNG_BACKEND_CLASS(klass);
rbc->request_entropy = rng_egd_request_entropy;
rbc->cancel_requests = rng_egd_cancel_requests;
rbc->opened = rng_egd_opened;
}

View File

@@ -10,6 +10,7 @@
* See the COPYING file in the top-level directory.
*/
#include "qemu/osdep.h"
#include "sysemu/rng-random.h"
#include "sysemu/rng.h"
#include "qapi/qmp/qerror.h"
@@ -21,10 +22,6 @@ struct RndRandom
int fd;
char *filename;
EntropyReceiveFunc *receive_func;
void *opaque;
size_t size;
};
/**
@@ -37,36 +34,35 @@ struct RndRandom
static void entropy_available(void *opaque)
{
RndRandom *s = RNG_RANDOM(opaque);
uint8_t buffer[s->size];
ssize_t len;
len = read(s->fd, buffer, s->size);
if (len < 0 && errno == EAGAIN) {
return;
while (!QSIMPLEQ_EMPTY(&s->parent.requests)) {
RngRequest *req = QSIMPLEQ_FIRST(&s->parent.requests);
ssize_t len;
len = read(s->fd, req->data, req->size);
if (len < 0 && errno == EAGAIN) {
return;
}
g_assert(len != -1);
req->receive_entropy(req->opaque, req->data, len);
rng_backend_finalize_request(&s->parent, req);
}
g_assert(len != -1);
s->receive_func(s->opaque, buffer, len);
s->receive_func = NULL;
/* We've drained all requests, the fd handler can be reset. */
qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
}
static void rng_random_request_entropy(RngBackend *b, size_t size,
EntropyReceiveFunc *receive_entropy,
void *opaque)
static void rng_random_request_entropy(RngBackend *b, RngRequest *req)
{
RndRandom *s = RNG_RANDOM(b);
if (s->receive_func) {
s->receive_func(s->opaque, NULL, 0);
if (QSIMPLEQ_EMPTY(&s->parent.requests)) {
/* If there are no pending requests yet, we need to
* install our fd handler. */
qemu_set_fd_handler(s->fd, entropy_available, NULL, s);
}
s->receive_func = receive_entropy;
s->opaque = opaque;
s->size = size;
qemu_set_fd_handler(s->fd, entropy_available, NULL, s);
}
static void rng_random_opened(RngBackend *b, Error **errp)

View File

@@ -10,6 +10,7 @@
* See the COPYING file in the top-level directory.
*/
#include "qemu/osdep.h"
#include "sysemu/rng.h"
#include "qapi/qmp/qerror.h"
#include "qom/object_interfaces.h"
@@ -19,18 +20,20 @@ void rng_backend_request_entropy(RngBackend *s, size_t size,
void *opaque)
{
RngBackendClass *k = RNG_BACKEND_GET_CLASS(s);
RngRequest *req;
if (k->request_entropy) {
k->request_entropy(s, size, receive_entropy, opaque);
}
}
req = g_malloc(sizeof(*req));
void rng_backend_cancel_requests(RngBackend *s)
{
RngBackendClass *k = RNG_BACKEND_GET_CLASS(s);
req->offset = 0;
req->size = size;
req->receive_entropy = receive_entropy;
req->opaque = opaque;
req->data = g_malloc(req->size);
if (k->cancel_requests) {
k->cancel_requests(s);
k->request_entropy(s, req);
QSIMPLEQ_INSERT_TAIL(&s->requests, req, next);
}
}
@@ -72,14 +75,48 @@ static void rng_backend_prop_set_opened(Object *obj, bool value, Error **errp)
s->opened = true;
}
static void rng_backend_free_request(RngRequest *req)
{
g_free(req->data);
g_free(req);
}
static void rng_backend_free_requests(RngBackend *s)
{
RngRequest *req, *next;
QSIMPLEQ_FOREACH_SAFE(req, &s->requests, next, next) {
rng_backend_free_request(req);
}
QSIMPLEQ_INIT(&s->requests);
}
void rng_backend_finalize_request(RngBackend *s, RngRequest *req)
{
QSIMPLEQ_REMOVE(&s->requests, req, RngRequest, next);
rng_backend_free_request(req);
}
static void rng_backend_init(Object *obj)
{
RngBackend *s = RNG_BACKEND(obj);
QSIMPLEQ_INIT(&s->requests);
object_property_add_bool(obj, "opened",
rng_backend_prop_get_opened,
rng_backend_prop_set_opened,
NULL);
}
static void rng_backend_finalize(Object *obj)
{
RngBackend *s = RNG_BACKEND(obj);
rng_backend_free_requests(s);
}
static void rng_backend_class_init(ObjectClass *oc, void *data)
{
UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
@@ -92,6 +129,7 @@ static const TypeInfo rng_backend_info = {
.parent = TYPE_OBJECT,
.instance_size = sizeof(RngBackend),
.instance_init = rng_backend_init,
.instance_finalize = rng_backend_finalize,
.class_size = sizeof(RngBackendClass),
.class_init = rng_backend_class_init,
.abstract = true,

View File

@@ -23,6 +23,7 @@
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "sysemu/char.h"

View File

@@ -12,6 +12,7 @@
* Based on backends/rng.c by Anthony Liguori
*/
#include "qemu/osdep.h"
#include "sysemu/tpm_backend.h"
#include "qapi/qmp/qerror.h"
#include "sysemu/tpm.h"

View File

@@ -24,6 +24,7 @@
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "exec/cpu-common.h"
#include "sysemu/kvm.h"

567
block.c
View File

@@ -21,7 +21,7 @@
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "config-host.h"
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "trace.h"
#include "block/block_int.h"
@@ -42,8 +42,6 @@
#include "block/throttle-groups.h"
#ifdef CONFIG_BSD
#include <sys/types.h>
#include <sys/stat.h>
#include <sys/ioctl.h>
#include <sys/queue.h>
#ifndef __DragonFly__
@@ -55,30 +53,14 @@
#include <windows.h>
#endif
/**
* A BdrvDirtyBitmap can be in three possible states:
* (1) successor is NULL and disabled is false: full r/w mode
* (2) successor is NULL and disabled is true: read only mode ("disabled")
* (3) successor is set: frozen mode.
* A frozen bitmap cannot be renamed, deleted, anonymized, cleared, set,
* or enabled. A frozen bitmap can only abdicate() or reclaim().
*/
struct BdrvDirtyBitmap {
HBitmap *bitmap; /* Dirty sector bitmap implementation */
BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
char *name; /* Optional non-empty unique ID */
int64_t size; /* Size of the bitmap (Number of sectors) */
bool disabled; /* Bitmap is read-only */
QLIST_ENTRY(BdrvDirtyBitmap) list;
};
#define NOT_DONE 0x7fffffff /* used while emulated sync operation in progress */
struct BdrvStates bdrv_states = QTAILQ_HEAD_INITIALIZER(bdrv_states);
static QTAILQ_HEAD(, BlockDriverState) graph_bdrv_states =
QTAILQ_HEAD_INITIALIZER(graph_bdrv_states);
static QTAILQ_HEAD(, BlockDriverState) all_bdrv_states =
QTAILQ_HEAD_INITIALIZER(all_bdrv_states);
static QLIST_HEAD(, BlockDriver) bdrv_drivers =
QLIST_HEAD_INITIALIZER(bdrv_drivers);
@@ -87,10 +69,11 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const char *filename,
BlockDriverState *parent,
const BdrvChildRole *child_role, Error **errp);
static void bdrv_dirty_bitmap_truncate(BlockDriverState *bs);
/* If non-zero, use only whitelisted block drivers */
static int use_bdrv_whitelist;
static void bdrv_close(BlockDriverState *bs);
#ifdef _WIN32
static int is_windows_drive_prefix(const char *filename)
{
@@ -241,10 +224,7 @@ void bdrv_register(BlockDriver *bdrv)
BlockDriverState *bdrv_new_root(void)
{
BlockDriverState *bs = bdrv_new();
QTAILQ_INSERT_TAIL(&bdrv_states, bs, device_list);
return bs;
return bdrv_new();
}
BlockDriverState *bdrv_new(void)
@@ -257,19 +237,15 @@ BlockDriverState *bdrv_new(void)
for (i = 0; i < BLOCK_OP_TYPE_MAX; i++) {
QLIST_INIT(&bs->op_blockers[i]);
}
notifier_list_init(&bs->close_notifiers);
notifier_with_return_list_init(&bs->before_write_notifiers);
qemu_co_queue_init(&bs->throttled_reqs[0]);
qemu_co_queue_init(&bs->throttled_reqs[1]);
bs->refcnt = 1;
bs->aio_context = qemu_get_aio_context();
return bs;
}
QTAILQ_INSERT_TAIL(&all_bdrv_states, bs, bs_list);
void bdrv_add_close_notifier(BlockDriverState *bs, Notifier *notify)
{
notifier_list_add(&bs->close_notifiers, notify);
return bs;
}
BlockDriver *bdrv_find_format(const char *format_name)
@@ -686,13 +662,19 @@ int bdrv_parse_cache_flags(const char *mode, int *flags)
}
/*
* Returns the flags that a temporary snapshot should get, based on the
* originally requested flags (the originally requested image will have flags
* like a backing file)
* Returns the options and flags that a temporary snapshot should get, based on
* the originally requested flags (the originally requested image will have
* flags like a backing file)
*/
static int bdrv_temp_snapshot_flags(int flags)
static void bdrv_temp_snapshot_options(int *child_flags, QDict *child_options,
int parent_flags, QDict *parent_options)
{
return (flags & ~BDRV_O_SNAPSHOT) | BDRV_O_TEMPORARY;
*child_flags = (parent_flags & ~BDRV_O_SNAPSHOT) | BDRV_O_TEMPORARY;
/* For temporary files, unconditional cache=unsafe is fine */
qdict_set_default_str(child_options, BDRV_OPT_CACHE_WB, "on");
qdict_set_default_str(child_options, BDRV_OPT_CACHE_DIRECT, "off");
qdict_set_default_str(child_options, BDRV_OPT_CACHE_NO_FLUSH, "on");
}
/*
@@ -1190,17 +1172,12 @@ static int bdrv_fill_options(QDict **options, const char *filename,
}
}
if (runstate_check(RUN_STATE_INMIGRATE)) {
*flags |= BDRV_O_INACTIVE;
}
return 0;
}
static BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
BlockDriverState *child_bs,
const char *child_name,
const BdrvChildRole *child_role)
BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
const char *child_name,
const BdrvChildRole *child_role)
{
BdrvChild *child = g_new(BdrvChild, 1);
*child = (BdrvChild) {
@@ -1209,24 +1186,43 @@ static BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
.role = child_role,
};
QLIST_INSERT_HEAD(&parent_bs->children, child, next);
QLIST_INSERT_HEAD(&child_bs->parents, child, next_parent);
return child;
}
static BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
BlockDriverState *child_bs,
const char *child_name,
const BdrvChildRole *child_role)
{
BdrvChild *child = bdrv_root_attach_child(child_bs, child_name, child_role);
QLIST_INSERT_HEAD(&parent_bs->children, child, next);
return child;
}
static void bdrv_detach_child(BdrvChild *child)
{
QLIST_REMOVE(child, next);
if (child->next.le_prev) {
QLIST_REMOVE(child, next);
child->next.le_prev = NULL;
}
QLIST_REMOVE(child, next_parent);
g_free(child->name);
g_free(child);
}
void bdrv_unref_child(BlockDriverState *parent, BdrvChild *child)
void bdrv_root_unref_child(BdrvChild *child)
{
BlockDriverState *child_bs;
child_bs = child->bs;
bdrv_detach_child(child);
bdrv_unref(child_bs);
}
void bdrv_unref_child(BlockDriverState *parent, BdrvChild *child)
{
if (child == NULL) {
return;
}
@@ -1235,9 +1231,7 @@ void bdrv_unref_child(BlockDriverState *parent, BdrvChild *child)
child->bs->inherits_from = NULL;
}
child_bs = child->bs;
bdrv_detach_child(child);
bdrv_unref(child_bs);
bdrv_root_unref_child(child);
}
/*
@@ -1427,13 +1421,13 @@ done:
return c;
}
int bdrv_append_temp_snapshot(BlockDriverState *bs, int flags, Error **errp)
static int bdrv_append_temp_snapshot(BlockDriverState *bs, int flags,
QDict *snapshot_options, Error **errp)
{
/* TODO: extra byte is a hack to ensure MAX_PATH space on Windows. */
char *tmp_filename = g_malloc0(PATH_MAX + 1);
int64_t total_size;
QemuOpts *opts = NULL;
QDict *snapshot_options;
BlockDriverState *bs_snapshot;
Error *local_err = NULL;
int ret;
@@ -1467,8 +1461,7 @@ int bdrv_append_temp_snapshot(BlockDriverState *bs, int flags, Error **errp)
goto out;
}
/* Prepare a new options QDict for the temporary file */
snapshot_options = qdict_new();
/* Prepare options QDict for the temporary file */
qdict_put(snapshot_options, "file.driver",
qstring_from_str("file"));
qdict_put(snapshot_options, "file.filename",
@@ -1480,6 +1473,7 @@ int bdrv_append_temp_snapshot(BlockDriverState *bs, int flags, Error **errp)
ret = bdrv_open(&bs_snapshot, NULL, NULL, snapshot_options,
flags, &local_err);
snapshot_options = NULL;
if (ret < 0) {
error_propagate(errp, local_err);
goto out;
@@ -1488,6 +1482,7 @@ int bdrv_append_temp_snapshot(BlockDriverState *bs, int flags, Error **errp)
bdrv_append(bs_snapshot, bs);
out:
QDECREF(snapshot_options);
g_free(tmp_filename);
return ret;
}
@@ -1519,6 +1514,7 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const char *filename,
const char *drvname;
const char *backing;
Error *local_err = NULL;
QDict *snapshot_options = NULL;
int snapshot_flags = 0;
assert(pbs);
@@ -1610,7 +1606,9 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const char *filename,
flags |= BDRV_O_ALLOW_RDWR;
}
if (flags & BDRV_O_SNAPSHOT) {
snapshot_flags = bdrv_temp_snapshot_flags(flags);
snapshot_options = qdict_new();
bdrv_temp_snapshot_options(&snapshot_flags, snapshot_options,
flags, options);
bdrv_backing_options(&flags, options, flags, options);
}
@@ -1684,9 +1682,9 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const char *filename,
error_setg(errp, "Block protocol '%s' doesn't support the option "
"'%s'", drv->format_name, entry->key);
} else {
error_setg(errp, "Block format '%s' used by device '%s' doesn't "
"support the option '%s'", drv->format_name,
bdrv_get_device_name(bs), entry->key);
error_setg(errp,
"Block format '%s' does not support the option '%s'",
drv->format_name, entry->key);
}
ret = -EINVAL;
@@ -1712,7 +1710,9 @@ static int bdrv_open_inherit(BlockDriverState **pbs, const char *filename,
/* For snapshot=on, create a temporary qcow2 overlay. bs points to the
* temporary snapshot afterwards. */
if (snapshot_flags) {
ret = bdrv_append_temp_snapshot(bs, snapshot_flags, &local_err);
ret = bdrv_append_temp_snapshot(bs, snapshot_flags, snapshot_options,
&local_err);
snapshot_options = NULL;
if (local_err) {
goto close_and_fail;
}
@@ -1724,6 +1724,7 @@ fail:
if (file != NULL) {
bdrv_unref_child(bs, file);
}
QDECREF(snapshot_options);
QDECREF(bs->explicit_options);
QDECREF(bs->options);
QDECREF(options);
@@ -1746,6 +1747,7 @@ close_and_fail:
} else {
bdrv_unref(bs);
}
QDECREF(snapshot_options);
QDECREF(options);
if (local_err) {
error_propagate(errp, local_err);
@@ -2138,13 +2140,11 @@ void bdrv_reopen_abort(BDRVReopenState *reopen_state)
}
void bdrv_close(BlockDriverState *bs)
static void bdrv_close(BlockDriverState *bs)
{
BdrvAioNotifier *ban, *ban_next;
if (bs->job) {
block_job_cancel_sync(bs->job);
}
assert(!bs->job);
/* Disable I/O limits and drain all pending throttled requests */
if (bs->throttle_state) {
@@ -2155,7 +2155,8 @@ void bdrv_close(BlockDriverState *bs)
bdrv_flush(bs);
bdrv_drain(bs); /* in case flush left pending I/O */
notifier_list_notify(&bs->close_notifiers, bs);
bdrv_release_named_dirty_bitmaps(bs);
assert(QLIST_EMPTY(&bs->dirty_bitmaps));
if (bs->blk) {
blk_dev_change_media_cb(bs->blk, false);
@@ -2210,32 +2211,40 @@ void bdrv_close(BlockDriverState *bs)
void bdrv_close_all(void)
{
BlockDriverState *bs;
AioContext *aio_context;
QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
AioContext *aio_context = bdrv_get_aio_context(bs);
/* Drop references from requests still in flight, such as canceled block
* jobs whose AIO context has not been polled yet */
bdrv_drain_all();
aio_context_acquire(aio_context);
bdrv_close(bs);
aio_context_release(aio_context);
blk_remove_all_bs();
blockdev_close_all_bdrv_states();
/* Cancel all block jobs */
while (!QTAILQ_EMPTY(&all_bdrv_states)) {
QTAILQ_FOREACH(bs, &all_bdrv_states, bs_list) {
aio_context = bdrv_get_aio_context(bs);
aio_context_acquire(aio_context);
if (bs->job) {
block_job_cancel_sync(bs->job);
aio_context_release(aio_context);
break;
}
aio_context_release(aio_context);
}
/* All the remaining BlockDriverStates are referenced directly or
* indirectly from block jobs, so there needs to be at least one BDS
* directly used by a block job */
assert(bs);
}
}
/* make a BlockDriverState anonymous by removing from bdrv_state and
* graph_bdrv_state list.
Also, NULL terminate the device_name to prevent double remove */
/* make a BlockDriverState anonymous by removing from graph_bdrv_state list.
* Also, NULL terminate the device_name to prevent double remove */
void bdrv_make_anon(BlockDriverState *bs)
{
/*
* Take care to remove bs from bdrv_states only when it's actually
* in it. Note that bs->device_list.tqe_prev is initially null,
* and gets set to non-null by QTAILQ_INSERT_TAIL(). Establish
* the useful invariant "bs in bdrv_states iff bs->tqe_prev" by
* resetting it to null on remove.
*/
if (bs->device_list.tqe_prev) {
QTAILQ_REMOVE(&bdrv_states, bs, device_list);
bs->device_list.tqe_prev = NULL;
}
if (bs->node_name[0] != '\0') {
QTAILQ_REMOVE(&graph_bdrv_states, bs, node_list);
}
@@ -2262,6 +2271,14 @@ static void change_parent_backing_link(BlockDriverState *from,
{
BdrvChild *c, *next;
if (from->blk) {
/* FIXME We bypass blk_set_bs(), so we need to make these updates
* manually. The root problem is not in this change function, but the
* existence of BlockDriverState.blk. */
to->blk = from->blk;
from->blk = NULL;
}
QLIST_FOREACH_SAFE(c, &from->parents, next_parent, next) {
assert(c->role != &child_backing);
c->bs = to;
@@ -2270,13 +2287,6 @@ static void change_parent_backing_link(BlockDriverState *from,
bdrv_ref(to);
bdrv_unref(from);
}
if (from->blk) {
blk_set_bs(from->blk, to);
if (!to->device_list.tqe_prev) {
QTAILQ_INSERT_BEFORE(from, to, device_list);
}
QTAILQ_REMOVE(&bdrv_states, from, device_list);
}
}
static void swap_feature_fields(BlockDriverState *bs_top,
@@ -2366,13 +2376,14 @@ static void bdrv_delete(BlockDriverState *bs)
assert(!bs->job);
assert(bdrv_op_blocker_is_empty(bs));
assert(!bs->refcnt);
assert(QLIST_EMPTY(&bs->dirty_bitmaps));
bdrv_close(bs);
/* remove from list, if necessary */
bdrv_make_anon(bs);
QTAILQ_REMOVE(&all_bdrv_states, bs, bs_list);
g_free(bs);
}
@@ -2506,26 +2517,6 @@ ro_cleanup:
return ret;
}
int bdrv_commit_all(void)
{
BlockDriverState *bs;
QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
AioContext *aio_context = bdrv_get_aio_context(bs);
aio_context_acquire(aio_context);
if (bs->drv && bs->backing) {
int ret = bdrv_commit(bs);
if (ret < 0) {
aio_context_release(aio_context);
return ret;
}
}
aio_context_release(aio_context);
}
return 0;
}
/*
* Return values:
* 0 - success
@@ -2974,12 +2965,23 @@ BlockDriverState *bdrv_next_node(BlockDriverState *bs)
return QTAILQ_NEXT(bs, node_list);
}
/* Iterates over all top-level BlockDriverStates, i.e. BDSs that are owned by
* the monitor or attached to a BlockBackend */
BlockDriverState *bdrv_next(BlockDriverState *bs)
{
if (!bs) {
return QTAILQ_FIRST(&bdrv_states);
if (!bs || bs->blk) {
bs = blk_next_root_bs(bs);
if (bs) {
return bs;
}
}
return QTAILQ_NEXT(bs, device_list);
/* Ignore all BDSs that are attached to a BlockBackend here; they have been
* handled by the above block already */
do {
bs = bdrv_next_monitor_owned(bs);
} while (bs && bs->blk);
return bs;
}
const char *bdrv_get_node_name(const BlockDriverState *bs)
@@ -3287,10 +3289,10 @@ void bdrv_invalidate_cache(BlockDriverState *bs, Error **errp)
void bdrv_invalidate_cache_all(Error **errp)
{
BlockDriverState *bs;
BlockDriverState *bs = NULL;
Error *local_err = NULL;
QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
while ((bs = bdrv_next(bs)) != NULL) {
AioContext *aio_context = bdrv_get_aio_context(bs);
aio_context_acquire(aio_context);
@@ -3320,10 +3322,10 @@ static int bdrv_inactivate(BlockDriverState *bs)
int bdrv_inactivate_all(void)
{
BlockDriverState *bs;
BlockDriverState *bs = NULL;
int ret;
QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
while ((bs = bdrv_next(bs)) != NULL) {
AioContext *aio_context = bdrv_get_aio_context(bs);
aio_context_acquire(aio_context);
@@ -3410,327 +3412,6 @@ void bdrv_lock_medium(BlockDriverState *bs, bool locked)
}
}
BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs, const char *name)
{
BdrvDirtyBitmap *bm;
assert(name);
QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
if (bm->name && !strcmp(name, bm->name)) {
return bm;
}
}
return NULL;
}
void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap)
{
assert(!bdrv_dirty_bitmap_frozen(bitmap));
g_free(bitmap->name);
bitmap->name = NULL;
}
BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
uint32_t granularity,
const char *name,
Error **errp)
{
int64_t bitmap_size;
BdrvDirtyBitmap *bitmap;
uint32_t sector_granularity;
assert((granularity & (granularity - 1)) == 0);
if (name && bdrv_find_dirty_bitmap(bs, name)) {
error_setg(errp, "Bitmap already exists: %s", name);
return NULL;
}
sector_granularity = granularity >> BDRV_SECTOR_BITS;
assert(sector_granularity);
bitmap_size = bdrv_nb_sectors(bs);
if (bitmap_size < 0) {
error_setg_errno(errp, -bitmap_size, "could not get length of device");
errno = -bitmap_size;
return NULL;
}
bitmap = g_new0(BdrvDirtyBitmap, 1);
bitmap->bitmap = hbitmap_alloc(bitmap_size, ctz32(sector_granularity));
bitmap->size = bitmap_size;
bitmap->name = g_strdup(name);
bitmap->disabled = false;
QLIST_INSERT_HEAD(&bs->dirty_bitmaps, bitmap, list);
return bitmap;
}
bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap)
{
return bitmap->successor;
}
bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap)
{
return !(bitmap->disabled || bitmap->successor);
}
DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap)
{
if (bdrv_dirty_bitmap_frozen(bitmap)) {
return DIRTY_BITMAP_STATUS_FROZEN;
} else if (!bdrv_dirty_bitmap_enabled(bitmap)) {
return DIRTY_BITMAP_STATUS_DISABLED;
} else {
return DIRTY_BITMAP_STATUS_ACTIVE;
}
}
/**
* Create a successor bitmap destined to replace this bitmap after an operation.
* Requires that the bitmap is not frozen and has no successor.
*/
int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap, Error **errp)
{
uint64_t granularity;
BdrvDirtyBitmap *child;
if (bdrv_dirty_bitmap_frozen(bitmap)) {
error_setg(errp, "Cannot create a successor for a bitmap that is "
"currently frozen");
return -1;
}
assert(!bitmap->successor);
/* Create an anonymous successor */
granularity = bdrv_dirty_bitmap_granularity(bitmap);
child = bdrv_create_dirty_bitmap(bs, granularity, NULL, errp);
if (!child) {
return -1;
}
/* Successor will be on or off based on our current state. */
child->disabled = bitmap->disabled;
/* Install the successor and freeze the parent */
bitmap->successor = child;
return 0;
}
/**
* For a bitmap with a successor, yield our name to the successor,
* delete the old bitmap, and return a handle to the new bitmap.
*/
BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap,
Error **errp)
{
char *name;
BdrvDirtyBitmap *successor = bitmap->successor;
if (successor == NULL) {
error_setg(errp, "Cannot relinquish control if "
"there's no successor present");
return NULL;
}
name = bitmap->name;
bitmap->name = NULL;
successor->name = name;
bitmap->successor = NULL;
bdrv_release_dirty_bitmap(bs, bitmap);
return successor;
}
/**
* In cases of failure where we can no longer safely delete the parent,
* we may wish to re-join the parent and child/successor.
* The merged parent will be un-frozen, but not explicitly re-enabled.
*/
BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
BdrvDirtyBitmap *parent,
Error **errp)
{
BdrvDirtyBitmap *successor = parent->successor;
if (!successor) {
error_setg(errp, "Cannot reclaim a successor when none is present");
return NULL;
}
if (!hbitmap_merge(parent->bitmap, successor->bitmap)) {
error_setg(errp, "Merging of parent and successor bitmap failed");
return NULL;
}
bdrv_release_dirty_bitmap(bs, successor);
parent->successor = NULL;
return parent;
}
/**
* Truncates _all_ bitmaps attached to a BDS.
*/
static void bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
{
BdrvDirtyBitmap *bitmap;
uint64_t size = bdrv_nb_sectors(bs);
QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
assert(!bdrv_dirty_bitmap_frozen(bitmap));
hbitmap_truncate(bitmap->bitmap, size);
bitmap->size = size;
}
}
void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
{
BdrvDirtyBitmap *bm, *next;
QLIST_FOREACH_SAFE(bm, &bs->dirty_bitmaps, list, next) {
if (bm == bitmap) {
assert(!bdrv_dirty_bitmap_frozen(bm));
QLIST_REMOVE(bitmap, list);
hbitmap_free(bitmap->bitmap);
g_free(bitmap->name);
g_free(bitmap);
return;
}
}
}
void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
{
assert(!bdrv_dirty_bitmap_frozen(bitmap));
bitmap->disabled = true;
}
void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
{
assert(!bdrv_dirty_bitmap_frozen(bitmap));
bitmap->disabled = false;
}
BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
{
BdrvDirtyBitmap *bm;
BlockDirtyInfoList *list = NULL;
BlockDirtyInfoList **plist = &list;
QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
BlockDirtyInfo *info = g_new0(BlockDirtyInfo, 1);
BlockDirtyInfoList *entry = g_new0(BlockDirtyInfoList, 1);
info->count = bdrv_get_dirty_count(bm);
info->granularity = bdrv_dirty_bitmap_granularity(bm);
info->has_name = !!bm->name;
info->name = g_strdup(bm->name);
info->status = bdrv_dirty_bitmap_status(bm);
entry->value = info;
*plist = entry;
plist = &entry->next;
}
return list;
}
int bdrv_get_dirty(BlockDriverState *bs, BdrvDirtyBitmap *bitmap, int64_t sector)
{
if (bitmap) {
return hbitmap_get(bitmap->bitmap, sector);
} else {
return 0;
}
}
/**
* Chooses a default granularity based on the existing cluster size,
* but clamped between [4K, 64K]. Defaults to 64K in the case that there
* is no cluster size information available.
*/
uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs)
{
BlockDriverInfo bdi;
uint32_t granularity;
if (bdrv_get_info(bs, &bdi) >= 0 && bdi.cluster_size > 0) {
granularity = MAX(4096, bdi.cluster_size);
granularity = MIN(65536, granularity);
} else {
granularity = 65536;
}
return granularity;
}
uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap)
{
return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
}
void bdrv_dirty_iter_init(BdrvDirtyBitmap *bitmap, HBitmapIter *hbi)
{
hbitmap_iter_init(hbi, bitmap->bitmap, 0);
}
void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
int64_t cur_sector, int nr_sectors)
{
assert(bdrv_dirty_bitmap_enabled(bitmap));
hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
}
void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
int64_t cur_sector, int nr_sectors)
{
assert(bdrv_dirty_bitmap_enabled(bitmap));
hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
}
void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out)
{
assert(bdrv_dirty_bitmap_enabled(bitmap));
if (!out) {
hbitmap_reset_all(bitmap->bitmap);
} else {
HBitmap *backup = bitmap->bitmap;
bitmap->bitmap = hbitmap_alloc(bitmap->size,
hbitmap_granularity(backup));
*out = backup;
}
}
void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap *in)
{
HBitmap *tmp = bitmap->bitmap;
assert(bdrv_dirty_bitmap_enabled(bitmap));
bitmap->bitmap = in;
hbitmap_free(tmp);
}
void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
int nr_sectors)
{
BdrvDirtyBitmap *bitmap;
QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
if (!bdrv_dirty_bitmap_enabled(bitmap)) {
continue;
}
hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
}
}
/**
* Advance an HBitmapIter to an arbitrary offset.
*/
void bdrv_set_dirty_iter(HBitmapIter *hbi, int64_t offset)
{
assert(hbi->hb);
hbitmap_iter_init(hbi, hbi->hb, offset);
}
int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap)
{
return hbitmap_count(bitmap->bitmap);
}
/* Get a reference to bs */
void bdrv_ref(BlockDriverState *bs)
{
@@ -4150,10 +3831,10 @@ bool bdrv_recurse_is_first_non_filter(BlockDriverState *bs,
*/
bool bdrv_is_first_non_filter(BlockDriverState *candidate)
{
BlockDriverState *bs;
BlockDriverState *bs = NULL;
/* walk down the bs forest recursively */
QTAILQ_FOREACH(bs, &bdrv_states, device_list) {
while ((bs = bdrv_next(bs)) != NULL) {
bool perm;
/* try to recurse in this top level bs */

View File

@@ -20,7 +20,7 @@ block-obj-$(CONFIG_RBD) += rbd.o
block-obj-$(CONFIG_GLUSTERFS) += gluster.o
block-obj-$(CONFIG_ARCHIPELAGO) += archipelago.o
block-obj-$(CONFIG_LIBSSH2) += ssh.o
block-obj-y += accounting.o
block-obj-y += accounting.o dirty-bitmap.o
block-obj-y += write-threshold.o
common-obj-y += stream.o

View File

@@ -20,11 +20,9 @@
#include "qapi/qmp/qerror.h"
#include "qemu/ratelimit.h"
#include "sysemu/block-backend.h"
#include "qemu/bitmap.h"
#define BACKUP_CLUSTER_BITS 16
#define BACKUP_CLUSTER_SIZE (1 << BACKUP_CLUSTER_BITS)
#define BACKUP_SECTORS_PER_CLUSTER (BACKUP_CLUSTER_SIZE / BDRV_SECTOR_SIZE)
#define BACKUP_CLUSTER_SIZE_DEFAULT (1 << 16)
#define SLICE_TIME 100000000ULL /* ns */
typedef struct CowRequest {
@@ -45,10 +43,17 @@ typedef struct BackupBlockJob {
BlockdevOnError on_target_error;
CoRwlock flush_rwlock;
uint64_t sectors_read;
HBitmap *bitmap;
unsigned long *done_bitmap;
int64_t cluster_size;
QLIST_HEAD(, CowRequest) inflight_reqs;
} BackupBlockJob;
/* Size of a cluster in sectors, instead of bytes. */
static inline int64_t cluster_size_sectors(BackupBlockJob *job)
{
return job->cluster_size / BDRV_SECTOR_SIZE;
}
/* See if in-flight requests overlap and wait for them to complete */
static void coroutine_fn wait_for_overlapping_requests(BackupBlockJob *job,
int64_t start,
@@ -97,13 +102,14 @@ static int coroutine_fn backup_do_cow(BlockDriverState *bs,
QEMUIOVector bounce_qiov;
void *bounce_buffer = NULL;
int ret = 0;
int64_t sectors_per_cluster = cluster_size_sectors(job);
int64_t start, end;
int n;
qemu_co_rwlock_rdlock(&job->flush_rwlock);
start = sector_num / BACKUP_SECTORS_PER_CLUSTER;
end = DIV_ROUND_UP(sector_num + nb_sectors, BACKUP_SECTORS_PER_CLUSTER);
start = sector_num / sectors_per_cluster;
end = DIV_ROUND_UP(sector_num + nb_sectors, sectors_per_cluster);
trace_backup_do_cow_enter(job, start, sector_num, nb_sectors);
@@ -111,19 +117,19 @@ static int coroutine_fn backup_do_cow(BlockDriverState *bs,
cow_request_begin(&cow_request, job, start, end);
for (; start < end; start++) {
if (hbitmap_get(job->bitmap, start)) {
if (test_bit(start, job->done_bitmap)) {
trace_backup_do_cow_skip(job, start);
continue; /* already copied */
}
trace_backup_do_cow_process(job, start);
n = MIN(BACKUP_SECTORS_PER_CLUSTER,
n = MIN(sectors_per_cluster,
job->common.len / BDRV_SECTOR_SIZE -
start * BACKUP_SECTORS_PER_CLUSTER);
start * sectors_per_cluster);
if (!bounce_buffer) {
bounce_buffer = qemu_blockalign(bs, BACKUP_CLUSTER_SIZE);
bounce_buffer = qemu_blockalign(bs, job->cluster_size);
}
iov.iov_base = bounce_buffer;
iov.iov_len = n * BDRV_SECTOR_SIZE;
@@ -131,10 +137,10 @@ static int coroutine_fn backup_do_cow(BlockDriverState *bs,
if (is_write_notifier) {
ret = bdrv_co_readv_no_serialising(bs,
start * BACKUP_SECTORS_PER_CLUSTER,
start * sectors_per_cluster,
n, &bounce_qiov);
} else {
ret = bdrv_co_readv(bs, start * BACKUP_SECTORS_PER_CLUSTER, n,
ret = bdrv_co_readv(bs, start * sectors_per_cluster, n,
&bounce_qiov);
}
if (ret < 0) {
@@ -147,11 +153,11 @@ static int coroutine_fn backup_do_cow(BlockDriverState *bs,
if (buffer_is_zero(iov.iov_base, iov.iov_len)) {
ret = bdrv_co_write_zeroes(job->target,
start * BACKUP_SECTORS_PER_CLUSTER,
start * sectors_per_cluster,
n, BDRV_REQ_MAY_UNMAP);
} else {
ret = bdrv_co_writev(job->target,
start * BACKUP_SECTORS_PER_CLUSTER, n,
start * sectors_per_cluster, n,
&bounce_qiov);
}
if (ret < 0) {
@@ -162,7 +168,7 @@ static int coroutine_fn backup_do_cow(BlockDriverState *bs,
goto out;
}
hbitmap_set(job->bitmap, start, 1);
set_bit(start, job->done_bitmap);
/* Publish progress, guest I/O counts as progress too. Note that the
* offset field is an opaque progress value, it is not a disk offset.
@@ -322,21 +328,22 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
int64_t cluster;
int64_t end;
int64_t last_cluster = -1;
int64_t sectors_per_cluster = cluster_size_sectors(job);
BlockDriverState *bs = job->common.bs;
HBitmapIter hbi;
granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
clusters_per_iter = MAX((granularity / BACKUP_CLUSTER_SIZE), 1);
clusters_per_iter = MAX((granularity / job->cluster_size), 1);
bdrv_dirty_iter_init(job->sync_bitmap, &hbi);
/* Find the next dirty sector(s) */
while ((sector = hbitmap_iter_next(&hbi)) != -1) {
cluster = sector / BACKUP_SECTORS_PER_CLUSTER;
cluster = sector / sectors_per_cluster;
/* Fake progress updates for any clusters we skipped */
if (cluster != last_cluster + 1) {
job->common.offset += ((cluster - last_cluster - 1) *
BACKUP_CLUSTER_SIZE);
job->cluster_size);
}
for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
@@ -344,8 +351,8 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
if (yield_and_check(job)) {
return ret;
}
ret = backup_do_cow(bs, cluster * BACKUP_SECTORS_PER_CLUSTER,
BACKUP_SECTORS_PER_CLUSTER, &error_is_read,
ret = backup_do_cow(bs, cluster * sectors_per_cluster,
sectors_per_cluster, &error_is_read,
false);
if ((ret < 0) &&
backup_error_action(job, error_is_read, -ret) ==
@@ -357,17 +364,17 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
/* If the bitmap granularity is smaller than the backup granularity,
* we need to advance the iterator pointer to the next cluster. */
if (granularity < BACKUP_CLUSTER_SIZE) {
bdrv_set_dirty_iter(&hbi, cluster * BACKUP_SECTORS_PER_CLUSTER);
if (granularity < job->cluster_size) {
bdrv_set_dirty_iter(&hbi, cluster * sectors_per_cluster);
}
last_cluster = cluster - 1;
}
/* Play some final catchup with the progress meter */
end = DIV_ROUND_UP(job->common.len, BACKUP_CLUSTER_SIZE);
end = DIV_ROUND_UP(job->common.len, job->cluster_size);
if (last_cluster + 1 < end) {
job->common.offset += ((end - last_cluster - 1) * BACKUP_CLUSTER_SIZE);
job->common.offset += ((end - last_cluster - 1) * job->cluster_size);
}
return ret;
@@ -384,15 +391,16 @@ static void coroutine_fn backup_run(void *opaque)
.notify = backup_before_write_notify,
};
int64_t start, end;
int64_t sectors_per_cluster = cluster_size_sectors(job);
int ret = 0;
QLIST_INIT(&job->inflight_reqs);
qemu_co_rwlock_init(&job->flush_rwlock);
start = 0;
end = DIV_ROUND_UP(job->common.len, BACKUP_CLUSTER_SIZE);
end = DIV_ROUND_UP(job->common.len, job->cluster_size);
job->bitmap = hbitmap_alloc(end, 0);
job->done_bitmap = bitmap_new(end);
bdrv_set_enable_write_cache(target, true);
if (target->blk) {
@@ -427,7 +435,7 @@ static void coroutine_fn backup_run(void *opaque)
/* Check to see if these blocks are already in the
* backing file. */
for (i = 0; i < BACKUP_SECTORS_PER_CLUSTER;) {
for (i = 0; i < sectors_per_cluster;) {
/* bdrv_is_allocated() only returns true/false based
* on the first set of sectors it comes across that
* are are all in the same state.
@@ -436,8 +444,8 @@ static void coroutine_fn backup_run(void *opaque)
* needed but at some point that is always the case. */
alloced =
bdrv_is_allocated(bs,
start * BACKUP_SECTORS_PER_CLUSTER + i,
BACKUP_SECTORS_PER_CLUSTER - i, &n);
start * sectors_per_cluster + i,
sectors_per_cluster - i, &n);
i += n;
if (alloced == 1 || n == 0) {
@@ -452,8 +460,8 @@ static void coroutine_fn backup_run(void *opaque)
}
}
/* FULL sync mode we copy the whole drive. */
ret = backup_do_cow(bs, start * BACKUP_SECTORS_PER_CLUSTER,
BACKUP_SECTORS_PER_CLUSTER, &error_is_read, false);
ret = backup_do_cow(bs, start * sectors_per_cluster,
sectors_per_cluster, &error_is_read, false);
if (ret < 0) {
/* Depending on error action, fail now or retry cluster */
BlockErrorAction action =
@@ -473,7 +481,7 @@ static void coroutine_fn backup_run(void *opaque)
/* wait until pending backup_do_cow() calls have completed */
qemu_co_rwlock_wrlock(&job->flush_rwlock);
qemu_co_rwlock_unlock(&job->flush_rwlock);
hbitmap_free(job->bitmap);
g_free(job->done_bitmap);
if (target->blk) {
blk_iostatus_disable(target->blk);
@@ -494,6 +502,8 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
BlockJobTxn *txn, Error **errp)
{
int64_t len;
BlockDriverInfo bdi;
int ret;
assert(bs);
assert(target);
@@ -563,14 +573,32 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
goto error;
}
bdrv_op_block_all(target, job->common.blocker);
job->on_source_error = on_source_error;
job->on_target_error = on_target_error;
job->target = target;
job->sync_mode = sync_mode;
job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_INCREMENTAL ?
sync_bitmap : NULL;
/* If there is no backing file on the target, we cannot rely on COW if our
* backup cluster size is smaller than the target cluster size. Even for
* targets with a backing file, try to avoid COW if possible. */
ret = bdrv_get_info(job->target, &bdi);
if (ret < 0 && !target->backing) {
error_setg_errno(errp, -ret,
"Couldn't determine the cluster size of the target image, "
"which has no backing file");
error_append_hint(errp,
"Aborting, since this may create an unusable destination image\n");
goto error;
} else if (ret < 0 && target->backing) {
/* Not fatal; just trudge on ahead. */
job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT;
} else {
job->cluster_size = MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size);
}
bdrv_op_block_all(target, job->common.blocker);
job->common.len = len;
job->common.co = qemu_coroutine_create(backup_run);
block_job_txn_add_job(txn, &job->common);

File diff suppressed because it is too large Load Diff

View File

@@ -27,6 +27,7 @@
#include "block/block_int.h"
#include "qapi/qmp/qbool.h"
#include "qapi/qmp/qstring.h"
#include "crypto/secret.h"
#include <curl/curl.h>
// #define DEBUG_CURL
@@ -78,6 +79,10 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
#define CURL_BLOCK_OPT_SSLVERIFY "sslverify"
#define CURL_BLOCK_OPT_TIMEOUT "timeout"
#define CURL_BLOCK_OPT_COOKIE "cookie"
#define CURL_BLOCK_OPT_USERNAME "username"
#define CURL_BLOCK_OPT_PASSWORD_SECRET "password-secret"
#define CURL_BLOCK_OPT_PROXY_USERNAME "proxy-username"
#define CURL_BLOCK_OPT_PROXY_PASSWORD_SECRET "proxy-password-secret"
struct BDRVCURLState;
@@ -120,6 +125,10 @@ typedef struct BDRVCURLState {
char *cookie;
bool accept_range;
AioContext *aio_context;
char *username;
char *password;
char *proxyusername;
char *proxypassword;
} BDRVCURLState;
static void curl_clean_state(CURLState *s);
@@ -419,6 +428,21 @@ static CURLState *curl_init_state(BlockDriverState *bs, BDRVCURLState *s)
curl_easy_setopt(state->curl, CURLOPT_ERRORBUFFER, state->errmsg);
curl_easy_setopt(state->curl, CURLOPT_FAILONERROR, 1);
if (s->username) {
curl_easy_setopt(state->curl, CURLOPT_USERNAME, s->username);
}
if (s->password) {
curl_easy_setopt(state->curl, CURLOPT_PASSWORD, s->password);
}
if (s->proxyusername) {
curl_easy_setopt(state->curl,
CURLOPT_PROXYUSERNAME, s->proxyusername);
}
if (s->proxypassword) {
curl_easy_setopt(state->curl,
CURLOPT_PROXYPASSWORD, s->proxypassword);
}
/* Restrict supported protocols to avoid security issues in the more
* obscure protocols. For example, do not allow POP3/SMTP/IMAP see
* CVE-2013-0249.
@@ -525,10 +549,31 @@ static QemuOptsList runtime_opts = {
.type = QEMU_OPT_STRING,
.help = "Pass the cookie or list of cookies with each request"
},
{
.name = CURL_BLOCK_OPT_USERNAME,
.type = QEMU_OPT_STRING,
.help = "Username for HTTP auth"
},
{
.name = CURL_BLOCK_OPT_PASSWORD_SECRET,
.type = QEMU_OPT_STRING,
.help = "ID of secret used as password for HTTP auth",
},
{
.name = CURL_BLOCK_OPT_PROXY_USERNAME,
.type = QEMU_OPT_STRING,
.help = "Username for HTTP proxy auth"
},
{
.name = CURL_BLOCK_OPT_PROXY_PASSWORD_SECRET,
.type = QEMU_OPT_STRING,
.help = "ID of secret used as password for HTTP proxy auth",
},
{ /* end of list */ }
},
};
static int curl_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
@@ -539,6 +584,7 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
const char *file;
const char *cookie;
double d;
const char *secretid;
static int inited = 0;
@@ -580,6 +626,26 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
goto out_noclean;
}
s->username = g_strdup(qemu_opt_get(opts, CURL_BLOCK_OPT_USERNAME));
secretid = qemu_opt_get(opts, CURL_BLOCK_OPT_PASSWORD_SECRET);
if (secretid) {
s->password = qcrypto_secret_lookup_as_utf8(secretid, errp);
if (!s->password) {
goto out_noclean;
}
}
s->proxyusername = g_strdup(
qemu_opt_get(opts, CURL_BLOCK_OPT_PROXY_USERNAME));
secretid = qemu_opt_get(opts, CURL_BLOCK_OPT_PROXY_PASSWORD_SECRET);
if (secretid) {
s->proxypassword = qcrypto_secret_lookup_as_utf8(secretid, errp);
if (!s->proxypassword) {
goto out_noclean;
}
}
if (!inited) {
curl_global_init(CURL_GLOBAL_ALL);
inited = 1;

387
block/dirty-bitmap.c Normal file
View File

@@ -0,0 +1,387 @@
/*
* Block Dirty Bitmap
*
* Copyright (c) 2016 Red Hat. Inc
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "config-host.h"
#include "qemu-common.h"
#include "trace.h"
#include "block/block_int.h"
#include "block/blockjob.h"
/**
* A BdrvDirtyBitmap can be in three possible states:
* (1) successor is NULL and disabled is false: full r/w mode
* (2) successor is NULL and disabled is true: read only mode ("disabled")
* (3) successor is set: frozen mode.
* A frozen bitmap cannot be renamed, deleted, anonymized, cleared, set,
* or enabled. A frozen bitmap can only abdicate() or reclaim().
*/
struct BdrvDirtyBitmap {
HBitmap *bitmap; /* Dirty sector bitmap implementation */
BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
char *name; /* Optional non-empty unique ID */
int64_t size; /* Size of the bitmap (Number of sectors) */
bool disabled; /* Bitmap is read-only */
QLIST_ENTRY(BdrvDirtyBitmap) list;
};
BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs, const char *name)
{
BdrvDirtyBitmap *bm;
assert(name);
QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
if (bm->name && !strcmp(name, bm->name)) {
return bm;
}
}
return NULL;
}
void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap)
{
assert(!bdrv_dirty_bitmap_frozen(bitmap));
g_free(bitmap->name);
bitmap->name = NULL;
}
BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
uint32_t granularity,
const char *name,
Error **errp)
{
int64_t bitmap_size;
BdrvDirtyBitmap *bitmap;
uint32_t sector_granularity;
assert((granularity & (granularity - 1)) == 0);
if (name && bdrv_find_dirty_bitmap(bs, name)) {
error_setg(errp, "Bitmap already exists: %s", name);
return NULL;
}
sector_granularity = granularity >> BDRV_SECTOR_BITS;
assert(sector_granularity);
bitmap_size = bdrv_nb_sectors(bs);
if (bitmap_size < 0) {
error_setg_errno(errp, -bitmap_size, "could not get length of device");
errno = -bitmap_size;
return NULL;
}
bitmap = g_new0(BdrvDirtyBitmap, 1);
bitmap->bitmap = hbitmap_alloc(bitmap_size, ctz32(sector_granularity));
bitmap->size = bitmap_size;
bitmap->name = g_strdup(name);
bitmap->disabled = false;
QLIST_INSERT_HEAD(&bs->dirty_bitmaps, bitmap, list);
return bitmap;
}
bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap)
{
return bitmap->successor;
}
bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap)
{
return !(bitmap->disabled || bitmap->successor);
}
DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap)
{
if (bdrv_dirty_bitmap_frozen(bitmap)) {
return DIRTY_BITMAP_STATUS_FROZEN;
} else if (!bdrv_dirty_bitmap_enabled(bitmap)) {
return DIRTY_BITMAP_STATUS_DISABLED;
} else {
return DIRTY_BITMAP_STATUS_ACTIVE;
}
}
/**
* Create a successor bitmap destined to replace this bitmap after an operation.
* Requires that the bitmap is not frozen and has no successor.
*/
int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap, Error **errp)
{
uint64_t granularity;
BdrvDirtyBitmap *child;
if (bdrv_dirty_bitmap_frozen(bitmap)) {
error_setg(errp, "Cannot create a successor for a bitmap that is "
"currently frozen");
return -1;
}
assert(!bitmap->successor);
/* Create an anonymous successor */
granularity = bdrv_dirty_bitmap_granularity(bitmap);
child = bdrv_create_dirty_bitmap(bs, granularity, NULL, errp);
if (!child) {
return -1;
}
/* Successor will be on or off based on our current state. */
child->disabled = bitmap->disabled;
/* Install the successor and freeze the parent */
bitmap->successor = child;
return 0;
}
/**
* For a bitmap with a successor, yield our name to the successor,
* delete the old bitmap, and return a handle to the new bitmap.
*/
BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap,
Error **errp)
{
char *name;
BdrvDirtyBitmap *successor = bitmap->successor;
if (successor == NULL) {
error_setg(errp, "Cannot relinquish control if "
"there's no successor present");
return NULL;
}
name = bitmap->name;
bitmap->name = NULL;
successor->name = name;
bitmap->successor = NULL;
bdrv_release_dirty_bitmap(bs, bitmap);
return successor;
}
/**
* In cases of failure where we can no longer safely delete the parent,
* we may wish to re-join the parent and child/successor.
* The merged parent will be un-frozen, but not explicitly re-enabled.
*/
BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
BdrvDirtyBitmap *parent,
Error **errp)
{
BdrvDirtyBitmap *successor = parent->successor;
if (!successor) {
error_setg(errp, "Cannot reclaim a successor when none is present");
return NULL;
}
if (!hbitmap_merge(parent->bitmap, successor->bitmap)) {
error_setg(errp, "Merging of parent and successor bitmap failed");
return NULL;
}
bdrv_release_dirty_bitmap(bs, successor);
parent->successor = NULL;
return parent;
}
/**
* Truncates _all_ bitmaps attached to a BDS.
*/
void bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
{
BdrvDirtyBitmap *bitmap;
uint64_t size = bdrv_nb_sectors(bs);
QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
assert(!bdrv_dirty_bitmap_frozen(bitmap));
hbitmap_truncate(bitmap->bitmap, size);
bitmap->size = size;
}
}
static void bdrv_do_release_matching_dirty_bitmap(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap,
bool only_named)
{
BdrvDirtyBitmap *bm, *next;
QLIST_FOREACH_SAFE(bm, &bs->dirty_bitmaps, list, next) {
if ((!bitmap || bm == bitmap) && (!only_named || bm->name)) {
assert(!bdrv_dirty_bitmap_frozen(bm));
QLIST_REMOVE(bm, list);
hbitmap_free(bm->bitmap);
g_free(bm->name);
g_free(bm);
if (bitmap) {
return;
}
}
}
}
void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
{
bdrv_do_release_matching_dirty_bitmap(bs, bitmap, false);
}
/**
* Release all named dirty bitmaps attached to a BDS (for use in bdrv_close()).
* There must not be any frozen bitmaps attached.
*/
void bdrv_release_named_dirty_bitmaps(BlockDriverState *bs)
{
bdrv_do_release_matching_dirty_bitmap(bs, NULL, true);
}
void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
{
assert(!bdrv_dirty_bitmap_frozen(bitmap));
bitmap->disabled = true;
}
void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
{
assert(!bdrv_dirty_bitmap_frozen(bitmap));
bitmap->disabled = false;
}
BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
{
BdrvDirtyBitmap *bm;
BlockDirtyInfoList *list = NULL;
BlockDirtyInfoList **plist = &list;
QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
BlockDirtyInfo *info = g_new0(BlockDirtyInfo, 1);
BlockDirtyInfoList *entry = g_new0(BlockDirtyInfoList, 1);
info->count = bdrv_get_dirty_count(bm);
info->granularity = bdrv_dirty_bitmap_granularity(bm);
info->has_name = !!bm->name;
info->name = g_strdup(bm->name);
info->status = bdrv_dirty_bitmap_status(bm);
entry->value = info;
*plist = entry;
plist = &entry->next;
}
return list;
}
int bdrv_get_dirty(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
int64_t sector)
{
if (bitmap) {
return hbitmap_get(bitmap->bitmap, sector);
} else {
return 0;
}
}
/**
* Chooses a default granularity based on the existing cluster size,
* but clamped between [4K, 64K]. Defaults to 64K in the case that there
* is no cluster size information available.
*/
uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs)
{
BlockDriverInfo bdi;
uint32_t granularity;
if (bdrv_get_info(bs, &bdi) >= 0 && bdi.cluster_size > 0) {
granularity = MAX(4096, bdi.cluster_size);
granularity = MIN(65536, granularity);
} else {
granularity = 65536;
}
return granularity;
}
uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap)
{
return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
}
void bdrv_dirty_iter_init(BdrvDirtyBitmap *bitmap, HBitmapIter *hbi)
{
hbitmap_iter_init(hbi, bitmap->bitmap, 0);
}
void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
int64_t cur_sector, int nr_sectors)
{
assert(bdrv_dirty_bitmap_enabled(bitmap));
hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
}
void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
int64_t cur_sector, int nr_sectors)
{
assert(bdrv_dirty_bitmap_enabled(bitmap));
hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
}
void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out)
{
assert(bdrv_dirty_bitmap_enabled(bitmap));
if (!out) {
hbitmap_reset_all(bitmap->bitmap);
} else {
HBitmap *backup = bitmap->bitmap;
bitmap->bitmap = hbitmap_alloc(bitmap->size,
hbitmap_granularity(backup));
*out = backup;
}
}
void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap *in)
{
HBitmap *tmp = bitmap->bitmap;
assert(bdrv_dirty_bitmap_enabled(bitmap));
bitmap->bitmap = in;
hbitmap_free(tmp);
}
void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
int nr_sectors)
{
BdrvDirtyBitmap *bitmap;
QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
if (!bdrv_dirty_bitmap_enabled(bitmap)) {
continue;
}
hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
}
}
/**
* Advance an HBitmapIter to an arbitrary offset.
*/
void bdrv_set_dirty_iter(HBitmapIter *hbi, int64_t offset)
{
assert(hbi->hb);
hbitmap_iter_init(hbi, hbi->hb, offset);
}
int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap)
{
return hbitmap_count(bitmap->bitmap);
}

View File

@@ -44,12 +44,6 @@ static int coroutine_fn bdrv_co_readv_em(BlockDriverState *bs,
static int coroutine_fn bdrv_co_writev_em(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
QEMUIOVector *iov);
static int coroutine_fn bdrv_co_do_preadv(BlockDriverState *bs,
int64_t offset, unsigned int bytes, QEMUIOVector *qiov,
BdrvRequestFlags flags);
static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
int64_t offset, unsigned int bytes, QEMUIOVector *qiov,
BdrvRequestFlags flags);
static BlockAIOCB *bdrv_co_aio_rw_vector(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
@@ -301,6 +295,7 @@ void bdrv_drain_all(void)
if (bs->job) {
block_job_pause(bs->job);
}
bdrv_drain_recurse(bs);
aio_context_release(aio_context);
if (!g_slist_find(aio_ctxs, aio_context)) {
@@ -620,20 +615,6 @@ int bdrv_read(BlockDriverState *bs, int64_t sector_num,
return bdrv_rw_co(bs, sector_num, buf, nb_sectors, false, 0);
}
/* Just like bdrv_read(), but with I/O throttling temporarily disabled */
int bdrv_read_unthrottled(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
bool enabled;
int ret;
enabled = bs->io_limits_enabled;
bs->io_limits_enabled = false;
ret = bdrv_read(bs, sector_num, buf, nb_sectors);
bs->io_limits_enabled = enabled;
return ret;
}
/* Return < 0 if error. Important errors are:
-EIO generic I/O error (may happen for all errors)
-ENOMEDIUM No media inserted.
@@ -664,6 +645,7 @@ int bdrv_write_zeroes(BlockDriverState *bs, int64_t sector_num,
int bdrv_make_zero(BlockDriverState *bs, BdrvRequestFlags flags)
{
int64_t target_sectors, ret, nb_sectors, sector_num = 0;
BlockDriverState *file;
int n;
target_sectors = bdrv_nb_sectors(bs);
@@ -676,7 +658,7 @@ int bdrv_make_zero(BlockDriverState *bs, BdrvRequestFlags flags)
if (nb_sectors <= 0) {
return 0;
}
ret = bdrv_get_block_status(bs, sector_num, nb_sectors, &n);
ret = bdrv_get_block_status(bs, sector_num, nb_sectors, &n, &file);
if (ret < 0) {
error_report("error getting block status at sector %" PRId64 ": %s",
sector_num, strerror(-ret));
@@ -937,7 +919,7 @@ out:
/*
* Handle a read request in coroutine context
*/
static int coroutine_fn bdrv_co_do_preadv(BlockDriverState *bs,
int coroutine_fn bdrv_co_do_preadv(BlockDriverState *bs,
int64_t offset, unsigned int bytes, QEMUIOVector *qiov,
BdrvRequestFlags flags)
{
@@ -1282,7 +1264,7 @@ fail:
/*
* Handle a write request in coroutine context
*/
static int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
int coroutine_fn bdrv_co_do_pwritev(BlockDriverState *bs,
int64_t offset, unsigned int bytes, QEMUIOVector *qiov,
BdrvRequestFlags flags)
{
@@ -1443,29 +1425,10 @@ int coroutine_fn bdrv_co_write_zeroes(BlockDriverState *bs,
BDRV_REQ_ZERO_WRITE | flags);
}
int bdrv_flush_all(void)
{
BlockDriverState *bs = NULL;
int result = 0;
while ((bs = bdrv_next(bs))) {
AioContext *aio_context = bdrv_get_aio_context(bs);
int ret;
aio_context_acquire(aio_context);
ret = bdrv_flush(bs);
if (ret < 0 && !result) {
result = ret;
}
aio_context_release(aio_context);
}
return result;
}
typedef struct BdrvCoGetBlockStatusData {
BlockDriverState *bs;
BlockDriverState *base;
BlockDriverState **file;
int64_t sector_num;
int nb_sectors;
int *pnum;
@@ -1487,10 +1450,14 @@ typedef struct BdrvCoGetBlockStatusData {
*
* 'nb_sectors' is the max value 'pnum' should be set to. If nb_sectors goes
* beyond the end of the disk image it will be clamped.
*
* If returned value is positive and BDRV_BLOCK_OFFSET_VALID bit is set, 'file'
* points to the BDS which the sector range is allocated in.
*/
static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors, int *pnum)
int nb_sectors, int *pnum,
BlockDriverState **file)
{
int64_t total_sectors;
int64_t n;
@@ -1520,7 +1487,9 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
return ret;
}
ret = bs->drv->bdrv_co_get_block_status(bs, sector_num, nb_sectors, pnum);
*file = NULL;
ret = bs->drv->bdrv_co_get_block_status(bs, sector_num, nb_sectors, pnum,
file);
if (ret < 0) {
*pnum = 0;
return ret;
@@ -1529,7 +1498,7 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
if (ret & BDRV_BLOCK_RAW) {
assert(ret & BDRV_BLOCK_OFFSET_VALID);
return bdrv_get_block_status(bs->file->bs, ret >> BDRV_SECTOR_BITS,
*pnum, pnum);
*pnum, pnum, file);
}
if (ret & (BDRV_BLOCK_DATA | BDRV_BLOCK_ZERO)) {
@@ -1546,13 +1515,14 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
}
}
if (bs->file &&
if (*file && *file != bs &&
(ret & BDRV_BLOCK_DATA) && !(ret & BDRV_BLOCK_ZERO) &&
(ret & BDRV_BLOCK_OFFSET_VALID)) {
BlockDriverState *file2;
int file_pnum;
ret2 = bdrv_co_get_block_status(bs->file->bs, ret >> BDRV_SECTOR_BITS,
*pnum, &file_pnum);
ret2 = bdrv_co_get_block_status(*file, ret >> BDRV_SECTOR_BITS,
*pnum, &file_pnum, &file2);
if (ret2 >= 0) {
/* Ignore errors. This is just providing extra information, it
* is useful but not necessary.
@@ -1577,14 +1547,15 @@ static int64_t coroutine_fn bdrv_co_get_block_status_above(BlockDriverState *bs,
BlockDriverState *base,
int64_t sector_num,
int nb_sectors,
int *pnum)
int *pnum,
BlockDriverState **file)
{
BlockDriverState *p;
int64_t ret = 0;
assert(bs != base);
for (p = bs; p != base; p = backing_bs(p)) {
ret = bdrv_co_get_block_status(p, sector_num, nb_sectors, pnum);
ret = bdrv_co_get_block_status(p, sector_num, nb_sectors, pnum, file);
if (ret < 0 || ret & BDRV_BLOCK_ALLOCATED) {
break;
}
@@ -1603,7 +1574,8 @@ static void coroutine_fn bdrv_get_block_status_above_co_entry(void *opaque)
data->ret = bdrv_co_get_block_status_above(data->bs, data->base,
data->sector_num,
data->nb_sectors,
data->pnum);
data->pnum,
data->file);
data->done = true;
}
@@ -1615,12 +1587,14 @@ static void coroutine_fn bdrv_get_block_status_above_co_entry(void *opaque)
int64_t bdrv_get_block_status_above(BlockDriverState *bs,
BlockDriverState *base,
int64_t sector_num,
int nb_sectors, int *pnum)
int nb_sectors, int *pnum,
BlockDriverState **file)
{
Coroutine *co;
BdrvCoGetBlockStatusData data = {
.bs = bs,
.base = base,
.file = file,
.sector_num = sector_num,
.nb_sectors = nb_sectors,
.pnum = pnum,
@@ -1644,16 +1618,19 @@ int64_t bdrv_get_block_status_above(BlockDriverState *bs,
int64_t bdrv_get_block_status(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors, int *pnum)
int nb_sectors, int *pnum,
BlockDriverState **file)
{
return bdrv_get_block_status_above(bs, backing_bs(bs),
sector_num, nb_sectors, pnum);
sector_num, nb_sectors, pnum, file);
}
int coroutine_fn bdrv_is_allocated(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, int *pnum)
{
int64_t ret = bdrv_get_block_status(bs, sector_num, nb_sectors, pnum);
BlockDriverState *file;
int64_t ret = bdrv_get_block_status(bs, sector_num, nb_sectors, pnum,
&file);
if (ret < 0) {
return ret;
}

View File

@@ -39,6 +39,7 @@
#include "sysemu/sysemu.h"
#include "qmp-commands.h"
#include "qapi/qmp/qstring.h"
#include "crypto/secret.h"
#include <iscsi/iscsi.h>
#include <iscsi/scsi-lowlevel.h>
@@ -532,7 +533,8 @@ static bool iscsi_allocationmap_is_allocated(IscsiLun *iscsilun,
static int64_t coroutine_fn iscsi_co_get_block_status(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors, int *pnum)
int nb_sectors, int *pnum,
BlockDriverState **file)
{
IscsiLun *iscsilun = bs->opaque;
struct scsi_get_lba_status *lbas = NULL;
@@ -624,6 +626,9 @@ out:
if (iTask.task != NULL) {
scsi_free_scsi_task(iTask.task);
}
if (ret > 0 && ret & BDRV_BLOCK_OFFSET_VALID) {
*file = bs;
}
return ret;
}
@@ -650,7 +655,8 @@ static int coroutine_fn iscsi_co_readv(BlockDriverState *bs,
!iscsi_allocationmap_is_allocated(iscsilun, sector_num, nb_sectors)) {
int64_t ret;
int pnum;
ret = iscsi_co_get_block_status(bs, sector_num, INT_MAX, &pnum);
BlockDriverState *file;
ret = iscsi_co_get_block_status(bs, sector_num, INT_MAX, &pnum, &file);
if (ret < 0) {
return ret;
}
@@ -1075,6 +1081,8 @@ static void parse_chap(struct iscsi_context *iscsi, const char *target,
QemuOpts *opts;
const char *user = NULL;
const char *password = NULL;
const char *secretid;
char *secret = NULL;
list = qemu_find_opts("iscsi");
if (!list) {
@@ -1094,8 +1102,20 @@ static void parse_chap(struct iscsi_context *iscsi, const char *target,
return;
}
secretid = qemu_opt_get(opts, "password-secret");
password = qemu_opt_get(opts, "password");
if (!password) {
if (secretid && password) {
error_setg(errp, "'password' and 'password-secret' properties are "
"mutually exclusive");
return;
}
if (secretid) {
secret = qcrypto_secret_lookup_as_utf8(secretid, errp);
if (!secret) {
return;
}
password = secret;
} else if (!password) {
error_setg(errp, "CHAP username specified but no password was given");
return;
}
@@ -1103,6 +1123,8 @@ static void parse_chap(struct iscsi_context *iscsi, const char *target,
if (iscsi_set_initiator_username_pwd(iscsi, user, password)) {
error_setg(errp, "Failed to set initiator username and password");
}
g_free(secret);
}
static void parse_header_digest(struct iscsi_context *iscsi, const char *target,
@@ -1852,6 +1874,11 @@ static QemuOptsList qemu_iscsi_opts = {
.name = "password",
.type = QEMU_OPT_STRING,
.help = "password for CHAP authentication to target",
},{
.name = "password-secret",
.type = QEMU_OPT_STRING,
.help = "ID of the secret providing password for CHAP "
"authentication to target",
},{
.name = "header-digest",
.type = QEMU_OPT_STRING,

View File

@@ -47,7 +47,6 @@ typedef struct MirrorBlockJob {
BlockdevOnError on_source_error, on_target_error;
bool synced;
bool should_complete;
int64_t sector_num;
int64_t granularity;
size_t buf_size;
int64_t bdev_length;
@@ -64,6 +63,8 @@ typedef struct MirrorBlockJob {
int ret;
bool unmap;
bool waiting_for_io;
int target_cluster_sectors;
int max_iov;
} MirrorBlockJob;
typedef struct MirrorOp {
@@ -159,114 +160,84 @@ static void mirror_read_complete(void *opaque, int ret)
mirror_write_complete, op);
}
static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
/* Round sector_num and/or nb_sectors to target cluster if COW is needed, and
* return the offset of the adjusted tail sector against original. */
static int mirror_cow_align(MirrorBlockJob *s,
int64_t *sector_num,
int *nb_sectors)
{
bool need_cow;
int ret = 0;
int chunk_sectors = s->granularity >> BDRV_SECTOR_BITS;
int64_t align_sector_num = *sector_num;
int align_nb_sectors = *nb_sectors;
int max_sectors = chunk_sectors * s->max_iov;
need_cow = !test_bit(*sector_num / chunk_sectors, s->cow_bitmap);
need_cow |= !test_bit((*sector_num + *nb_sectors - 1) / chunk_sectors,
s->cow_bitmap);
if (need_cow) {
bdrv_round_to_clusters(s->target, *sector_num, *nb_sectors,
&align_sector_num, &align_nb_sectors);
}
if (align_nb_sectors > max_sectors) {
align_nb_sectors = max_sectors;
if (need_cow) {
align_nb_sectors = QEMU_ALIGN_DOWN(align_nb_sectors,
s->target_cluster_sectors);
}
}
ret = align_sector_num + align_nb_sectors - (*sector_num + *nb_sectors);
*sector_num = align_sector_num;
*nb_sectors = align_nb_sectors;
assert(ret >= 0);
return ret;
}
static inline void mirror_wait_for_io(MirrorBlockJob *s)
{
assert(!s->waiting_for_io);
s->waiting_for_io = true;
qemu_coroutine_yield();
s->waiting_for_io = false;
}
/* Submit async read while handling COW.
* Returns: nb_sectors if no alignment is necessary, or
* (new_end - sector_num) if tail is rounded up or down due to
* alignment or buffer limit.
*/
static int mirror_do_read(MirrorBlockJob *s, int64_t sector_num,
int nb_sectors)
{
BlockDriverState *source = s->common.bs;
int nb_sectors, sectors_per_chunk, nb_chunks, max_iov;
int64_t end, sector_num, next_chunk, next_sector, hbitmap_next_sector;
uint64_t delay_ns = 0;
int sectors_per_chunk, nb_chunks;
int ret = nb_sectors;
MirrorOp *op;
int pnum;
int64_t ret;
max_iov = MIN(source->bl.max_iov, s->target->bl.max_iov);
s->sector_num = hbitmap_iter_next(&s->hbi);
if (s->sector_num < 0) {
bdrv_dirty_iter_init(s->dirty_bitmap, &s->hbi);
s->sector_num = hbitmap_iter_next(&s->hbi);
trace_mirror_restart_iter(s, bdrv_get_dirty_count(s->dirty_bitmap));
assert(s->sector_num >= 0);
}
hbitmap_next_sector = s->sector_num;
sector_num = s->sector_num;
sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
end = s->bdev_length / BDRV_SECTOR_SIZE;
/* Extend the QEMUIOVector to include all adjacent blocks that will
* be copied in this operation.
*
* We have to do this if we have no backing file yet in the destination,
* and the cluster size is very large. Then we need to do COW ourselves.
* The first time a cluster is copied, copy it entirely. Note that,
* because both the granularity and the cluster size are powers of two,
* the number of sectors to copy cannot exceed one cluster.
*
* We also want to extend the QEMUIOVector to include more adjacent
* dirty blocks if possible, to limit the number of I/O operations and
* run efficiently even with a small granularity.
*/
nb_chunks = 0;
nb_sectors = 0;
next_sector = sector_num;
next_chunk = sector_num / sectors_per_chunk;
/* We can only handle as much as buf_size at a time. */
nb_sectors = MIN(s->buf_size >> BDRV_SECTOR_BITS, nb_sectors);
assert(nb_sectors);
/* Wait for I/O to this cluster (from a previous iteration) to be done. */
while (test_bit(next_chunk, s->in_flight_bitmap)) {
trace_mirror_yield_in_flight(s, sector_num, s->in_flight);
s->waiting_for_io = true;
qemu_coroutine_yield();
s->waiting_for_io = false;
if (s->cow_bitmap) {
ret += mirror_cow_align(s, &sector_num, &nb_sectors);
}
assert(nb_sectors << BDRV_SECTOR_BITS <= s->buf_size);
/* The sector range must meet granularity because:
* 1) Caller passes in aligned values;
* 2) mirror_cow_align is used only when target cluster is larger. */
assert(!(nb_sectors % sectors_per_chunk));
assert(!(sector_num % sectors_per_chunk));
nb_chunks = nb_sectors / sectors_per_chunk;
do {
int added_sectors, added_chunks;
if (!bdrv_get_dirty(source, s->dirty_bitmap, next_sector) ||
test_bit(next_chunk, s->in_flight_bitmap)) {
assert(nb_sectors > 0);
break;
}
added_sectors = sectors_per_chunk;
if (s->cow_bitmap && !test_bit(next_chunk, s->cow_bitmap)) {
bdrv_round_to_clusters(s->target,
next_sector, added_sectors,
&next_sector, &added_sectors);
/* On the first iteration, the rounding may make us copy
* sectors before the first dirty one.
*/
if (next_sector < sector_num) {
assert(nb_sectors == 0);
sector_num = next_sector;
next_chunk = next_sector / sectors_per_chunk;
}
}
added_sectors = MIN(added_sectors, end - (sector_num + nb_sectors));
added_chunks = (added_sectors + sectors_per_chunk - 1) / sectors_per_chunk;
/* When doing COW, it may happen that there is not enough space for
* a full cluster. Wait if that is the case.
*/
while (nb_chunks == 0 && s->buf_free_count < added_chunks) {
trace_mirror_yield_buf_busy(s, nb_chunks, s->in_flight);
s->waiting_for_io = true;
qemu_coroutine_yield();
s->waiting_for_io = false;
}
if (s->buf_free_count < nb_chunks + added_chunks) {
trace_mirror_break_buf_busy(s, nb_chunks, s->in_flight);
break;
}
if (max_iov < nb_chunks + added_chunks) {
trace_mirror_break_iov_max(s, nb_chunks, added_chunks);
break;
}
/* We have enough free space to copy these sectors. */
bitmap_set(s->in_flight_bitmap, next_chunk, added_chunks);
nb_sectors += added_sectors;
nb_chunks += added_chunks;
next_sector += added_sectors;
next_chunk += added_chunks;
if (!s->synced && s->common.speed) {
delay_ns = ratelimit_calculate_delay(&s->limit, added_sectors);
}
} while (delay_ns == 0 && next_sector < end);
while (s->buf_free_count < nb_chunks) {
trace_mirror_yield_in_flight(s, sector_num, s->in_flight);
mirror_wait_for_io(s);
}
/* Allocate a MirrorOp that is used as an AIO callback. */
op = g_new(MirrorOp, 1);
@@ -278,47 +249,151 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
* from s->buf_free.
*/
qemu_iovec_init(&op->qiov, nb_chunks);
next_sector = sector_num;
while (nb_chunks-- > 0) {
MirrorBuffer *buf = QSIMPLEQ_FIRST(&s->buf_free);
size_t remaining = (nb_sectors * BDRV_SECTOR_SIZE) - op->qiov.size;
size_t remaining = nb_sectors * BDRV_SECTOR_SIZE - op->qiov.size;
QSIMPLEQ_REMOVE_HEAD(&s->buf_free, next);
s->buf_free_count--;
qemu_iovec_add(&op->qiov, buf, MIN(s->granularity, remaining));
/* Advance the HBitmapIter in parallel, so that we do not examine
* the same sector twice.
*/
if (next_sector > hbitmap_next_sector
&& bdrv_get_dirty(source, s->dirty_bitmap, next_sector)) {
hbitmap_next_sector = hbitmap_iter_next(&s->hbi);
}
next_sector += sectors_per_chunk;
}
bdrv_reset_dirty_bitmap(s->dirty_bitmap, sector_num, nb_sectors);
/* Copy the dirty cluster. */
s->in_flight++;
s->sectors_in_flight += nb_sectors;
trace_mirror_one_iteration(s, sector_num, nb_sectors);
ret = bdrv_get_block_status_above(source, NULL, sector_num,
nb_sectors, &pnum);
if (ret < 0 || pnum < nb_sectors ||
(ret & BDRV_BLOCK_DATA && !(ret & BDRV_BLOCK_ZERO))) {
bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
mirror_read_complete, op);
} else if (ret & BDRV_BLOCK_ZERO) {
bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
mirror_read_complete, op);
return ret;
}
static void mirror_do_zero_or_discard(MirrorBlockJob *s,
int64_t sector_num,
int nb_sectors,
bool is_discard)
{
MirrorOp *op;
/* Allocate a MirrorOp that is used as an AIO callback. The qiov is zeroed
* so the freeing in mirror_iteration_done is nop. */
op = g_new0(MirrorOp, 1);
op->s = s;
op->sector_num = sector_num;
op->nb_sectors = nb_sectors;
s->in_flight++;
s->sectors_in_flight += nb_sectors;
if (is_discard) {
bdrv_aio_discard(s->target, sector_num, op->nb_sectors,
mirror_write_complete, op);
} else {
bdrv_aio_write_zeroes(s->target, sector_num, op->nb_sectors,
s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
mirror_write_complete, op);
} else {
assert(!(ret & BDRV_BLOCK_DATA));
bdrv_aio_discard(s->target, sector_num, op->nb_sectors,
mirror_write_complete, op);
}
}
static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
{
BlockDriverState *source = s->common.bs;
int64_t sector_num;
uint64_t delay_ns = 0;
/* At least the first dirty chunk is mirrored in one iteration. */
int nb_chunks = 1;
int64_t end = s->bdev_length / BDRV_SECTOR_SIZE;
int sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
sector_num = hbitmap_iter_next(&s->hbi);
if (sector_num < 0) {
bdrv_dirty_iter_init(s->dirty_bitmap, &s->hbi);
sector_num = hbitmap_iter_next(&s->hbi);
trace_mirror_restart_iter(s, bdrv_get_dirty_count(s->dirty_bitmap));
assert(sector_num >= 0);
}
/* Find the number of consective dirty chunks following the first dirty
* one, and wait for in flight requests in them. */
while (nb_chunks * sectors_per_chunk < (s->buf_size >> BDRV_SECTOR_BITS)) {
int64_t hbitmap_next;
int64_t next_sector = sector_num + nb_chunks * sectors_per_chunk;
int64_t next_chunk = next_sector / sectors_per_chunk;
if (next_sector >= end ||
!bdrv_get_dirty(source, s->dirty_bitmap, next_sector)) {
break;
}
if (test_bit(next_chunk, s->in_flight_bitmap)) {
if (nb_chunks > 0) {
break;
}
trace_mirror_yield_in_flight(s, next_sector, s->in_flight);
mirror_wait_for_io(s);
/* Now retry. */
} else {
hbitmap_next = hbitmap_iter_next(&s->hbi);
assert(hbitmap_next == next_sector);
nb_chunks++;
}
}
/* Clear dirty bits before querying the block status, because
* calling bdrv_get_block_status_above could yield - if some blocks are
* marked dirty in this window, we need to know.
*/
bdrv_reset_dirty_bitmap(s->dirty_bitmap, sector_num,
nb_chunks * sectors_per_chunk);
bitmap_set(s->in_flight_bitmap, sector_num / sectors_per_chunk, nb_chunks);
while (nb_chunks > 0 && sector_num < end) {
int ret;
int io_sectors;
BlockDriverState *file;
enum MirrorMethod {
MIRROR_METHOD_COPY,
MIRROR_METHOD_ZERO,
MIRROR_METHOD_DISCARD
} mirror_method = MIRROR_METHOD_COPY;
assert(!(sector_num % sectors_per_chunk));
ret = bdrv_get_block_status_above(source, NULL, sector_num,
nb_chunks * sectors_per_chunk,
&io_sectors, &file);
if (ret < 0) {
io_sectors = nb_chunks * sectors_per_chunk;
}
io_sectors -= io_sectors % sectors_per_chunk;
if (io_sectors < sectors_per_chunk) {
io_sectors = sectors_per_chunk;
} else if (ret >= 0 && !(ret & BDRV_BLOCK_DATA)) {
int64_t target_sector_num;
int target_nb_sectors;
bdrv_round_to_clusters(s->target, sector_num, io_sectors,
&target_sector_num, &target_nb_sectors);
if (target_sector_num == sector_num &&
target_nb_sectors == io_sectors) {
mirror_method = ret & BDRV_BLOCK_ZERO ?
MIRROR_METHOD_ZERO :
MIRROR_METHOD_DISCARD;
}
}
switch (mirror_method) {
case MIRROR_METHOD_COPY:
io_sectors = mirror_do_read(s, sector_num, io_sectors);
break;
case MIRROR_METHOD_ZERO:
mirror_do_zero_or_discard(s, sector_num, io_sectors, false);
break;
case MIRROR_METHOD_DISCARD:
mirror_do_zero_or_discard(s, sector_num, io_sectors, true);
break;
default:
abort();
}
assert(io_sectors);
sector_num += io_sectors;
nb_chunks -= io_sectors / sectors_per_chunk;
delay_ns += ratelimit_calculate_delay(&s->limit, io_sectors);
}
return delay_ns;
}
@@ -343,9 +418,7 @@ static void mirror_free_init(MirrorBlockJob *s)
static void mirror_drain(MirrorBlockJob *s)
{
while (s->in_flight > 0) {
s->waiting_for_io = true;
qemu_coroutine_yield();
s->waiting_for_io = false;
mirror_wait_for_io(s);
}
}
@@ -419,6 +492,7 @@ static void coroutine_fn mirror_run(void *opaque)
checking for a NULL string */
int ret = 0;
int n;
int target_cluster_size = BDRV_SECTOR_SIZE;
if (block_job_is_cancelled(&s->common)) {
goto immediate_exit;
@@ -448,16 +522,16 @@ static void coroutine_fn mirror_run(void *opaque)
*/
bdrv_get_backing_filename(s->target, backing_filename,
sizeof(backing_filename));
if (backing_filename[0] && !s->target->backing) {
ret = bdrv_get_info(s->target, &bdi);
if (ret < 0) {
goto immediate_exit;
}
if (s->granularity < bdi.cluster_size) {
s->buf_size = MAX(s->buf_size, bdi.cluster_size);
s->cow_bitmap = bitmap_new(length);
}
if (!bdrv_get_info(s->target, &bdi) && bdi.cluster_size) {
target_cluster_size = bdi.cluster_size;
}
if (backing_filename[0] && !s->target->backing
&& s->granularity < target_cluster_size) {
s->buf_size = MAX(s->buf_size, target_cluster_size);
s->cow_bitmap = bitmap_new(length);
}
s->target_cluster_sectors = target_cluster_size >> BDRV_SECTOR_BITS;
s->max_iov = MIN(s->common.bs->bl.max_iov, s->target->bl.max_iov);
end = s->bdev_length / BDRV_SECTOR_SIZE;
s->buf = qemu_try_blockalign(bs, s->buf_size);
@@ -532,9 +606,7 @@ static void coroutine_fn mirror_run(void *opaque)
if (s->in_flight == MAX_IN_FLIGHT || s->buf_free_count == 0 ||
(cnt == 0 && s->in_flight > 0)) {
trace_mirror_yield(s, s->in_flight, s->buf_free_count, cnt);
s->waiting_for_io = true;
qemu_coroutine_yield();
s->waiting_for_io = false;
mirror_wait_for_io(s);
continue;
} else if (cnt != 0) {
delay_ns = mirror_iteration(s);

View File

@@ -28,7 +28,6 @@
#include "qemu/osdep.h"
#include "nbd-client.h"
#include "qemu/sockets.h"
#define HANDLE_TO_INDEX(bs, handle) ((handle) ^ ((uint64_t)(intptr_t)bs))
#define INDEX_TO_HANDLE(bs, index) ((index) ^ ((uint64_t)(intptr_t)bs))
@@ -48,13 +47,21 @@ static void nbd_teardown_connection(BlockDriverState *bs)
{
NbdClientSession *client = nbd_get_client_session(bs);
if (!client->ioc) { /* Already closed */
return;
}
/* finish any pending coroutines */
shutdown(client->sock, 2);
qio_channel_shutdown(client->ioc,
QIO_CHANNEL_SHUTDOWN_BOTH,
NULL);
nbd_recv_coroutines_enter_all(client);
nbd_client_detach_aio_context(bs);
closesocket(client->sock);
client->sock = -1;
object_unref(OBJECT(client->sioc));
client->sioc = NULL;
object_unref(OBJECT(client->ioc));
client->ioc = NULL;
}
static void nbd_reply_ready(void *opaque)
@@ -64,12 +71,16 @@ static void nbd_reply_ready(void *opaque)
uint64_t i;
int ret;
if (!s->ioc) { /* Already closed */
return;
}
if (s->reply.handle == 0) {
/* No reply already in flight. Fetch a header. It is possible
* that another thread has done the same thing in parallel, so
* the socket is not readable anymore.
*/
ret = nbd_receive_reply(s->sock, &s->reply);
ret = nbd_receive_reply(s->ioc, &s->reply);
if (ret == -EAGAIN) {
return;
}
@@ -120,32 +131,35 @@ static int nbd_co_send_request(BlockDriverState *bs,
}
}
g_assert(qemu_in_coroutine());
assert(i < MAX_NBD_REQUESTS);
request->handle = INDEX_TO_HANDLE(s, i);
if (!s->ioc) {
qemu_co_mutex_unlock(&s->send_mutex);
return -EPIPE;
}
s->send_coroutine = qemu_coroutine_self();
aio_context = bdrv_get_aio_context(bs);
aio_set_fd_handler(aio_context, s->sock, false,
aio_set_fd_handler(aio_context, s->sioc->fd, false,
nbd_reply_ready, nbd_restart_write, bs);
if (qiov) {
if (!s->is_unix) {
socket_set_cork(s->sock, 1);
}
rc = nbd_send_request(s->sock, request);
qio_channel_set_cork(s->ioc, true);
rc = nbd_send_request(s->ioc, request);
if (rc >= 0) {
ret = qemu_co_sendv(s->sock, qiov->iov, qiov->niov,
offset, request->len);
ret = nbd_wr_syncv(s->ioc, qiov->iov, qiov->niov,
offset, request->len, 0);
if (ret != request->len) {
rc = -EIO;
}
}
if (!s->is_unix) {
socket_set_cork(s->sock, 0);
}
qio_channel_set_cork(s->ioc, false);
} else {
rc = nbd_send_request(s->sock, request);
rc = nbd_send_request(s->ioc, request);
}
aio_set_fd_handler(aio_context, s->sock, false,
aio_set_fd_handler(aio_context, s->sioc->fd, false,
nbd_reply_ready, NULL, bs);
s->send_coroutine = NULL;
qemu_co_mutex_unlock(&s->send_mutex);
@@ -162,12 +176,13 @@ static void nbd_co_receive_reply(NbdClientSession *s,
* peek at the next reply and avoid yielding if it's ours? */
qemu_coroutine_yield();
*reply = s->reply;
if (reply->handle != request->handle) {
if (reply->handle != request->handle ||
!s->ioc) {
reply->error = EIO;
} else {
if (qiov && reply->error == 0) {
ret = qemu_co_recvv(s->sock, qiov->iov, qiov->niov,
offset, request->len);
ret = nbd_wr_syncv(s->ioc, qiov->iov, qiov->niov,
offset, request->len, 1);
if (ret != request->len) {
reply->error = EIO;
}
@@ -350,14 +365,14 @@ int nbd_client_co_discard(BlockDriverState *bs, int64_t sector_num,
void nbd_client_detach_aio_context(BlockDriverState *bs)
{
aio_set_fd_handler(bdrv_get_aio_context(bs),
nbd_get_client_session(bs)->sock,
nbd_get_client_session(bs)->sioc->fd,
false, NULL, NULL, NULL);
}
void nbd_client_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
aio_set_fd_handler(new_context, nbd_get_client_session(bs)->sock,
aio_set_fd_handler(new_context, nbd_get_client_session(bs)->sioc->fd,
false, nbd_reply_ready, NULL, bs);
}
@@ -370,16 +385,20 @@ void nbd_client_close(BlockDriverState *bs)
.len = 0
};
if (client->sock == -1) {
if (client->ioc == NULL) {
return;
}
nbd_send_request(client->sock, &request);
nbd_send_request(client->ioc, &request);
nbd_teardown_connection(bs);
}
int nbd_client_init(BlockDriverState *bs, int sock, const char *export,
int nbd_client_init(BlockDriverState *bs,
QIOChannelSocket *sioc,
const char *export,
QCryptoTLSCreds *tlscreds,
const char *hostname,
Error **errp)
{
NbdClientSession *client = nbd_get_client_session(bs);
@@ -387,22 +406,32 @@ int nbd_client_init(BlockDriverState *bs, int sock, const char *export,
/* NBD handshake */
logout("session init %s\n", export);
qemu_set_block(sock);
ret = nbd_receive_negotiate(sock, export,
&client->nbdflags, &client->size, errp);
qio_channel_set_blocking(QIO_CHANNEL(sioc), true, NULL);
ret = nbd_receive_negotiate(QIO_CHANNEL(sioc), export,
&client->nbdflags,
tlscreds, hostname,
&client->ioc,
&client->size, errp);
if (ret < 0) {
logout("Failed to negotiate with the NBD server\n");
closesocket(sock);
return ret;
}
qemu_co_mutex_init(&client->send_mutex);
qemu_co_mutex_init(&client->free_sema);
client->sock = sock;
client->sioc = sioc;
object_ref(OBJECT(client->sioc));
if (!client->ioc) {
client->ioc = QIO_CHANNEL(sioc);
object_ref(OBJECT(client->ioc));
}
/* Now that we're connected, set the socket to be non-blocking and
* kick the reply mechanism. */
qemu_set_nonblock(sock);
qio_channel_set_blocking(QIO_CHANNEL(sioc), false, NULL);
nbd_client_attach_aio_context(bs, bdrv_get_aio_context(bs));
logout("Established connection with NBD server\n");

View File

@@ -4,6 +4,7 @@
#include "qemu-common.h"
#include "block/nbd.h"
#include "block/block_int.h"
#include "io/channel-socket.h"
/* #define DEBUG_NBD */
@@ -17,7 +18,8 @@
#define MAX_NBD_REQUESTS 16
typedef struct NbdClientSession {
int sock;
QIOChannelSocket *sioc; /* The master data channel */
QIOChannel *ioc; /* The current I/O channel which may differ (eg TLS) */
uint32_t nbdflags;
off_t size;
@@ -34,7 +36,11 @@ typedef struct NbdClientSession {
NbdClientSession *nbd_get_client_session(BlockDriverState *bs);
int nbd_client_init(BlockDriverState *bs, int sock, const char *export_name,
int nbd_client_init(BlockDriverState *bs,
QIOChannelSocket *sock,
const char *export_name,
QCryptoTLSCreds *tlscreds,
const char *hostname,
Error **errp);
void nbd_client_close(BlockDriverState *bs);

View File

@@ -31,7 +31,6 @@
#include "qemu/uri.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "qemu/sockets.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qjson.h"
#include "qapi/qmp/qint.h"
@@ -205,18 +204,20 @@ static SocketAddress *nbd_config(BDRVNBDState *s, QDict *options, char **export,
saddr = g_new0(SocketAddress, 1);
if (qdict_haskey(options, "path")) {
UnixSocketAddress *q_unix;
saddr->type = SOCKET_ADDRESS_KIND_UNIX;
saddr->u.q_unix = g_new0(UnixSocketAddress, 1);
saddr->u.q_unix->path = g_strdup(qdict_get_str(options, "path"));
q_unix = saddr->u.q_unix.data = g_new0(UnixSocketAddress, 1);
q_unix->path = g_strdup(qdict_get_str(options, "path"));
qdict_del(options, "path");
} else {
InetSocketAddress *inet;
saddr->type = SOCKET_ADDRESS_KIND_INET;
saddr->u.inet = g_new0(InetSocketAddress, 1);
saddr->u.inet->host = g_strdup(qdict_get_str(options, "host"));
inet = saddr->u.inet.data = g_new0(InetSocketAddress, 1);
inet->host = g_strdup(qdict_get_str(options, "host"));
if (!qdict_get_try_str(options, "port")) {
saddr->u.inet->port = g_strdup_printf("%d", NBD_DEFAULT_PORT);
inet->port = g_strdup_printf("%d", NBD_DEFAULT_PORT);
} else {
saddr->u.inet->port = g_strdup(qdict_get_str(options, "port"));
inet->port = g_strdup(qdict_get_str(options, "port"));
}
qdict_del(options, "host");
qdict_del(options, "port");
@@ -238,55 +239,113 @@ NbdClientSession *nbd_get_client_session(BlockDriverState *bs)
return &s->client;
}
static int nbd_establish_connection(BlockDriverState *bs,
SocketAddress *saddr,
Error **errp)
static QIOChannelSocket *nbd_establish_connection(SocketAddress *saddr,
Error **errp)
{
BDRVNBDState *s = bs->opaque;
int sock;
QIOChannelSocket *sioc;
Error *local_err = NULL;
sock = socket_connect(saddr, errp, NULL, NULL);
sioc = qio_channel_socket_new();
if (sock < 0) {
logout("Failed to establish connection to NBD server\n");
return -EIO;
qio_channel_socket_connect_sync(sioc,
saddr,
&local_err);
if (local_err) {
error_propagate(errp, local_err);
return NULL;
}
if (!s->client.is_unix) {
socket_set_nodelay(sock);
}
qio_channel_set_delay(QIO_CHANNEL(sioc), false);
return sock;
return sioc;
}
static QCryptoTLSCreds *nbd_get_tls_creds(const char *id, Error **errp)
{
Object *obj;
QCryptoTLSCreds *creds;
obj = object_resolve_path_component(
object_get_objects_root(), id);
if (!obj) {
error_setg(errp, "No TLS credentials with id '%s'",
id);
return NULL;
}
creds = (QCryptoTLSCreds *)
object_dynamic_cast(obj, TYPE_QCRYPTO_TLS_CREDS);
if (!creds) {
error_setg(errp, "Object with id '%s' is not TLS credentials",
id);
return NULL;
}
if (creds->endpoint != QCRYPTO_TLS_CREDS_ENDPOINT_CLIENT) {
error_setg(errp,
"Expecting TLS credentials with a client endpoint");
return NULL;
}
object_ref(obj);
return creds;
}
static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVNBDState *s = bs->opaque;
char *export = NULL;
int result, sock;
QIOChannelSocket *sioc = NULL;
SocketAddress *saddr;
const char *tlscredsid;
QCryptoTLSCreds *tlscreds = NULL;
const char *hostname = NULL;
int ret = -EINVAL;
/* Pop the config into our state object. Exit if invalid. */
saddr = nbd_config(s, options, &export, errp);
if (!saddr) {
return -EINVAL;
goto error;
}
tlscredsid = g_strdup(qdict_get_try_str(options, "tls-creds"));
if (tlscredsid) {
qdict_del(options, "tls-creds");
tlscreds = nbd_get_tls_creds(tlscredsid, errp);
if (!tlscreds) {
goto error;
}
if (saddr->type != SOCKET_ADDRESS_KIND_INET) {
error_setg(errp, "TLS only supported over IP sockets");
goto error;
}
hostname = saddr->u.inet.data->host;
}
/* establish TCP connection, return error if it fails
* TODO: Configurable retry-until-timeout behaviour.
*/
sock = nbd_establish_connection(bs, saddr, errp);
qapi_free_SocketAddress(saddr);
if (sock < 0) {
g_free(export);
return sock;
sioc = nbd_establish_connection(saddr, errp);
if (!sioc) {
ret = -ECONNREFUSED;
goto error;
}
/* NBD handshake */
result = nbd_client_init(bs, sock, export, errp);
ret = nbd_client_init(bs, sioc, export,
tlscreds, hostname, errp);
error:
if (sioc) {
object_unref(OBJECT(sioc));
}
if (tlscreds) {
object_unref(OBJECT(tlscreds));
}
qapi_free_SocketAddress(saddr);
g_free(export);
return result;
return ret;
}
static int nbd_co_readv(BlockDriverState *bs, int64_t sector_num,
@@ -348,6 +407,7 @@ static void nbd_refresh_filename(BlockDriverState *bs, QDict *options)
const char *host = qdict_get_try_str(options, "host");
const char *port = qdict_get_try_str(options, "port");
const char *export = qdict_get_try_str(options, "export");
const char *tlscreds = qdict_get_try_str(options, "tls-creds");
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("nbd")));
@@ -382,6 +442,9 @@ static void nbd_refresh_filename(BlockDriverState *bs, QDict *options)
if (export) {
qdict_put_obj(opts, "export", QOBJECT(qstring_from_str(export)));
}
if (tlscreds) {
qdict_put_obj(opts, "tls-creds", QOBJECT(qstring_from_str(tlscreds)));
}
bs->full_open_options = opts;
}

View File

@@ -36,6 +36,7 @@
#include <nfsc/libnfs.h>
#define QEMU_NFS_MAX_READAHEAD_SIZE 1048576
#define QEMU_NFS_MAX_DEBUG_LEVEL 2
typedef struct NFSClient {
struct nfs_context *context;
@@ -333,6 +334,17 @@ static int64_t nfs_client_open(NFSClient *client, const char *filename,
val = QEMU_NFS_MAX_READAHEAD_SIZE;
}
nfs_set_readahead(client->context, val);
#endif
#ifdef LIBNFS_FEATURE_DEBUG
} else if (!strcmp(qp->p[i].name, "debug")) {
/* limit the maximum debug level to avoid potential flooding
* of our log files. */
if (val > QEMU_NFS_MAX_DEBUG_LEVEL) {
error_report("NFS Warning: Limiting NFS debug level"
" to %d", QEMU_NFS_MAX_DEBUG_LEVEL);
val = QEMU_NFS_MAX_DEBUG_LEVEL;
}
nfs_set_debug(client->context, val);
#endif
} else {
error_setg(errp, "Unknown NFS parameter name: %s",

View File

@@ -30,6 +30,7 @@
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "block/block_int.h"
#include "sysemu/block-backend.h"
#include "qemu/module.h"
#include "qemu/bitmap.h"
#include "qapi/util.h"
@@ -261,7 +262,7 @@ static coroutine_fn int parallels_co_flush_to_os(BlockDriverState *bs)
static int64_t coroutine_fn parallels_co_get_block_status(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *pnum)
int64_t sector_num, int nb_sectors, int *pnum, BlockDriverState **file)
{
BDRVParallelsState *s = bs->opaque;
int64_t offset;
@@ -274,6 +275,7 @@ static int64_t coroutine_fn parallels_co_get_block_status(BlockDriverState *bs,
return 0;
}
*file = bs->file->bs;
return (offset << BDRV_SECTOR_BITS) |
BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID;
}
@@ -460,7 +462,7 @@ static int parallels_create(const char *filename, QemuOpts *opts, Error **errp)
int64_t total_size, cl_size;
uint8_t tmp[BDRV_SECTOR_SIZE];
Error *local_err = NULL;
BlockDriverState *file;
BlockBackend *file;
uint32_t bat_entries, bat_sectors;
ParallelsHeader header;
int ret;
@@ -476,14 +478,17 @@ static int parallels_create(const char *filename, QemuOpts *opts, Error **errp)
return ret;
}
file = NULL;
ret = bdrv_open(&file, filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_PROTOCOL, &local_err);
if (ret < 0) {
file = blk_new_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_PROTOCOL,
&local_err);
if (file == NULL) {
error_propagate(errp, local_err);
return ret;
return -EIO;
}
ret = bdrv_truncate(file, 0);
blk_set_allow_write_beyond_eof(file, true);
ret = blk_truncate(file, 0);
if (ret < 0) {
goto exit;
}
@@ -507,18 +512,18 @@ static int parallels_create(const char *filename, QemuOpts *opts, Error **errp)
memset(tmp, 0, sizeof(tmp));
memcpy(tmp, &header, sizeof(header));
ret = bdrv_pwrite(file, 0, tmp, BDRV_SECTOR_SIZE);
ret = blk_pwrite(file, 0, tmp, BDRV_SECTOR_SIZE);
if (ret < 0) {
goto exit;
}
ret = bdrv_write_zeroes(file, 1, bat_sectors - 1, 0);
ret = blk_write_zeroes(file, 1, bat_sectors - 1, 0);
if (ret < 0) {
goto exit;
}
ret = 0;
done:
bdrv_unref(file);
blk_unref(file);
return ret;
exit:

View File

@@ -92,6 +92,26 @@ BlockDeviceInfo *bdrv_block_device_info(BlockDriverState *bs, Error **errp)
info->has_iops_wr_max = cfg.buckets[THROTTLE_OPS_WRITE].max;
info->iops_wr_max = cfg.buckets[THROTTLE_OPS_WRITE].max;
info->has_bps_max_length = info->has_bps_max;
info->bps_max_length =
cfg.buckets[THROTTLE_BPS_TOTAL].burst_length;
info->has_bps_rd_max_length = info->has_bps_rd_max;
info->bps_rd_max_length =
cfg.buckets[THROTTLE_BPS_READ].burst_length;
info->has_bps_wr_max_length = info->has_bps_wr_max;
info->bps_wr_max_length =
cfg.buckets[THROTTLE_BPS_WRITE].burst_length;
info->has_iops_max_length = info->has_iops_max;
info->iops_max_length =
cfg.buckets[THROTTLE_OPS_TOTAL].burst_length;
info->has_iops_rd_max_length = info->has_iops_rd_max;
info->iops_rd_max_length =
cfg.buckets[THROTTLE_OPS_READ].burst_length;
info->has_iops_wr_max_length = info->has_iops_wr_max;
info->iops_wr_max_length =
cfg.buckets[THROTTLE_OPS_WRITE].burst_length;
info->has_iops_size = cfg.op_size;
info->iops_size = cfg.op_size;
@@ -211,11 +231,13 @@ void bdrv_query_image_info(BlockDriverState *bs,
Error *err = NULL;
ImageInfo *info;
aio_context_acquire(bdrv_get_aio_context(bs));
size = bdrv_getlength(bs);
if (size < 0) {
error_setg_errno(errp, -size, "Can't get size of device '%s'",
bdrv_get_device_name(bs));
return;
goto out;
}
info = g_new0(ImageInfo, 1);
@@ -283,10 +305,13 @@ void bdrv_query_image_info(BlockDriverState *bs,
default:
error_propagate(errp, err);
qapi_free_ImageInfo(info);
return;
goto out;
}
*p_info = info;
out:
aio_context_release(bdrv_get_aio_context(bs));
}
/* @p_info will be set only on success. */
@@ -300,7 +325,7 @@ static void bdrv_query_info(BlockBackend *blk, BlockInfo **p_info,
info->locked = blk_dev_is_medium_locked(blk);
info->removable = blk_dev_has_removable_media(blk);
if (blk_dev_has_removable_media(blk)) {
if (blk_dev_has_tray(blk)) {
info->has_tray_open = true;
info->tray_open = blk_dev_is_tray_open(blk);
}
@@ -330,100 +355,116 @@ static void bdrv_query_info(BlockBackend *blk, BlockInfo **p_info,
qapi_free_BlockInfo(info);
}
static BlockStats *bdrv_query_stats(const BlockDriverState *bs,
bool query_backing)
static BlockStats *bdrv_query_stats(BlockBackend *blk,
const BlockDriverState *bs,
bool query_backing);
static void bdrv_query_blk_stats(BlockStats *s, BlockBackend *blk)
{
BlockStats *s;
BlockAcctStats *stats = blk_get_stats(blk);
BlockAcctTimedStats *ts = NULL;
s = g_malloc0(sizeof(*s));
s->has_device = true;
s->device = g_strdup(blk_name(blk));
if (bdrv_get_device_name(bs)[0]) {
s->has_device = true;
s->device = g_strdup(bdrv_get_device_name(bs));
s->stats->rd_bytes = stats->nr_bytes[BLOCK_ACCT_READ];
s->stats->wr_bytes = stats->nr_bytes[BLOCK_ACCT_WRITE];
s->stats->rd_operations = stats->nr_ops[BLOCK_ACCT_READ];
s->stats->wr_operations = stats->nr_ops[BLOCK_ACCT_WRITE];
s->stats->failed_rd_operations = stats->failed_ops[BLOCK_ACCT_READ];
s->stats->failed_wr_operations = stats->failed_ops[BLOCK_ACCT_WRITE];
s->stats->failed_flush_operations = stats->failed_ops[BLOCK_ACCT_FLUSH];
s->stats->invalid_rd_operations = stats->invalid_ops[BLOCK_ACCT_READ];
s->stats->invalid_wr_operations = stats->invalid_ops[BLOCK_ACCT_WRITE];
s->stats->invalid_flush_operations =
stats->invalid_ops[BLOCK_ACCT_FLUSH];
s->stats->rd_merged = stats->merged[BLOCK_ACCT_READ];
s->stats->wr_merged = stats->merged[BLOCK_ACCT_WRITE];
s->stats->flush_operations = stats->nr_ops[BLOCK_ACCT_FLUSH];
s->stats->wr_total_time_ns = stats->total_time_ns[BLOCK_ACCT_WRITE];
s->stats->rd_total_time_ns = stats->total_time_ns[BLOCK_ACCT_READ];
s->stats->flush_total_time_ns = stats->total_time_ns[BLOCK_ACCT_FLUSH];
s->stats->has_idle_time_ns = stats->last_access_time_ns > 0;
if (s->stats->has_idle_time_ns) {
s->stats->idle_time_ns = block_acct_idle_time_ns(stats);
}
s->stats->account_invalid = stats->account_invalid;
s->stats->account_failed = stats->account_failed;
while ((ts = block_acct_interval_next(stats, ts))) {
BlockDeviceTimedStatsList *timed_stats =
g_malloc0(sizeof(*timed_stats));
BlockDeviceTimedStats *dev_stats = g_malloc0(sizeof(*dev_stats));
timed_stats->next = s->stats->timed_stats;
timed_stats->value = dev_stats;
s->stats->timed_stats = timed_stats;
TimedAverage *rd = &ts->latency[BLOCK_ACCT_READ];
TimedAverage *wr = &ts->latency[BLOCK_ACCT_WRITE];
TimedAverage *fl = &ts->latency[BLOCK_ACCT_FLUSH];
dev_stats->interval_length = ts->interval_length;
dev_stats->min_rd_latency_ns = timed_average_min(rd);
dev_stats->max_rd_latency_ns = timed_average_max(rd);
dev_stats->avg_rd_latency_ns = timed_average_avg(rd);
dev_stats->min_wr_latency_ns = timed_average_min(wr);
dev_stats->max_wr_latency_ns = timed_average_max(wr);
dev_stats->avg_wr_latency_ns = timed_average_avg(wr);
dev_stats->min_flush_latency_ns = timed_average_min(fl);
dev_stats->max_flush_latency_ns = timed_average_max(fl);
dev_stats->avg_flush_latency_ns = timed_average_avg(fl);
dev_stats->avg_rd_queue_depth =
block_acct_queue_depth(ts, BLOCK_ACCT_READ);
dev_stats->avg_wr_queue_depth =
block_acct_queue_depth(ts, BLOCK_ACCT_WRITE);
}
}
static void bdrv_query_bds_stats(BlockStats *s, const BlockDriverState *bs,
bool query_backing)
{
if (bdrv_get_node_name(bs)[0]) {
s->has_node_name = true;
s->node_name = g_strdup(bdrv_get_node_name(bs));
}
s->stats = g_malloc0(sizeof(*s->stats));
if (bs->blk) {
BlockAcctStats *stats = blk_get_stats(bs->blk);
BlockAcctTimedStats *ts = NULL;
s->stats->rd_bytes = stats->nr_bytes[BLOCK_ACCT_READ];
s->stats->wr_bytes = stats->nr_bytes[BLOCK_ACCT_WRITE];
s->stats->rd_operations = stats->nr_ops[BLOCK_ACCT_READ];
s->stats->wr_operations = stats->nr_ops[BLOCK_ACCT_WRITE];
s->stats->failed_rd_operations = stats->failed_ops[BLOCK_ACCT_READ];
s->stats->failed_wr_operations = stats->failed_ops[BLOCK_ACCT_WRITE];
s->stats->failed_flush_operations = stats->failed_ops[BLOCK_ACCT_FLUSH];
s->stats->invalid_rd_operations = stats->invalid_ops[BLOCK_ACCT_READ];
s->stats->invalid_wr_operations = stats->invalid_ops[BLOCK_ACCT_WRITE];
s->stats->invalid_flush_operations =
stats->invalid_ops[BLOCK_ACCT_FLUSH];
s->stats->rd_merged = stats->merged[BLOCK_ACCT_READ];
s->stats->wr_merged = stats->merged[BLOCK_ACCT_WRITE];
s->stats->flush_operations = stats->nr_ops[BLOCK_ACCT_FLUSH];
s->stats->wr_total_time_ns = stats->total_time_ns[BLOCK_ACCT_WRITE];
s->stats->rd_total_time_ns = stats->total_time_ns[BLOCK_ACCT_READ];
s->stats->flush_total_time_ns = stats->total_time_ns[BLOCK_ACCT_FLUSH];
s->stats->has_idle_time_ns = stats->last_access_time_ns > 0;
if (s->stats->has_idle_time_ns) {
s->stats->idle_time_ns = block_acct_idle_time_ns(stats);
}
s->stats->account_invalid = stats->account_invalid;
s->stats->account_failed = stats->account_failed;
while ((ts = block_acct_interval_next(stats, ts))) {
BlockDeviceTimedStatsList *timed_stats =
g_malloc0(sizeof(*timed_stats));
BlockDeviceTimedStats *dev_stats = g_malloc0(sizeof(*dev_stats));
timed_stats->next = s->stats->timed_stats;
timed_stats->value = dev_stats;
s->stats->timed_stats = timed_stats;
TimedAverage *rd = &ts->latency[BLOCK_ACCT_READ];
TimedAverage *wr = &ts->latency[BLOCK_ACCT_WRITE];
TimedAverage *fl = &ts->latency[BLOCK_ACCT_FLUSH];
dev_stats->interval_length = ts->interval_length;
dev_stats->min_rd_latency_ns = timed_average_min(rd);
dev_stats->max_rd_latency_ns = timed_average_max(rd);
dev_stats->avg_rd_latency_ns = timed_average_avg(rd);
dev_stats->min_wr_latency_ns = timed_average_min(wr);
dev_stats->max_wr_latency_ns = timed_average_max(wr);
dev_stats->avg_wr_latency_ns = timed_average_avg(wr);
dev_stats->min_flush_latency_ns = timed_average_min(fl);
dev_stats->max_flush_latency_ns = timed_average_max(fl);
dev_stats->avg_flush_latency_ns = timed_average_avg(fl);
dev_stats->avg_rd_queue_depth =
block_acct_queue_depth(ts, BLOCK_ACCT_READ);
dev_stats->avg_wr_queue_depth =
block_acct_queue_depth(ts, BLOCK_ACCT_WRITE);
}
}
s->stats->wr_highest_offset = bs->wr_highest_offset;
if (bs->file) {
s->has_parent = true;
s->parent = bdrv_query_stats(bs->file->bs, query_backing);
s->parent = bdrv_query_stats(NULL, bs->file->bs, query_backing);
}
if (query_backing && bs->backing) {
s->has_backing = true;
s->backing = bdrv_query_stats(bs->backing->bs, query_backing);
s->backing = bdrv_query_stats(NULL, bs->backing->bs, query_backing);
}
}
static BlockStats *bdrv_query_stats(BlockBackend *blk,
const BlockDriverState *bs,
bool query_backing)
{
BlockStats *s;
s = g_malloc0(sizeof(*s));
s->stats = g_malloc0(sizeof(*s->stats));
if (blk) {
bdrv_query_blk_stats(s, blk);
}
if (bs) {
bdrv_query_bds_stats(s, bs, query_backing);
}
return s;
@@ -452,22 +493,38 @@ BlockInfoList *qmp_query_block(Error **errp)
return head;
}
static bool next_query_bds(BlockBackend **blk, BlockDriverState **bs,
bool query_nodes)
{
if (query_nodes) {
*bs = bdrv_next_node(*bs);
return !!*bs;
}
*blk = blk_next(*blk);
*bs = *blk ? blk_bs(*blk) : NULL;
return !!*blk;
}
BlockStatsList *qmp_query_blockstats(bool has_query_nodes,
bool query_nodes,
Error **errp)
{
BlockStatsList *head = NULL, **p_next = &head;
BlockBackend *blk = NULL;
BlockDriverState *bs = NULL;
/* Just to be safe if query_nodes is not always initialized */
query_nodes = has_query_nodes && query_nodes;
while ((bs = query_nodes ? bdrv_next_node(bs) : bdrv_next(bs))) {
while (next_query_bds(&blk, &bs, query_nodes)) {
BlockStatsList *info = g_malloc0(sizeof(*info));
AioContext *ctx = bdrv_get_aio_context(bs);
AioContext *ctx = blk ? blk_get_aio_context(blk)
: bdrv_get_aio_context(bs);
aio_context_acquire(ctx);
info->value = bdrv_query_stats(bs, !query_nodes);
info->value = bdrv_query_stats(blk, bs, !query_nodes);
aio_context_release(ctx);
*p_next = info;
@@ -636,7 +693,7 @@ void bdrv_image_info_specific_dump(fprintf_function func_fprintf, void *f,
QmpOutputVisitor *ov = qmp_output_visitor_new();
QObject *obj, *data;
visit_type_ImageInfoSpecific(qmp_output_get_visitor(ov), &info_spec, NULL,
visit_type_ImageInfoSpecific(qmp_output_get_visitor(ov), NULL, &info_spec,
&error_abort);
obj = qmp_output_get_qobject(ov);
assert(qobject_type(obj) == QTYPE_QDICT);

View File

@@ -24,6 +24,7 @@
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "block/block_int.h"
#include "sysemu/block-backend.h"
#include "qemu/module.h"
#include <zlib.h>
#include "qapi/qmp/qerror.h"
@@ -120,11 +121,7 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
goto fail;
}
if (header.version != QCOW_VERSION) {
char version[64];
snprintf(version, sizeof(version), "QCOW version %" PRIu32,
header.version);
error_setg(errp, QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
bdrv_get_device_or_node_name(bs), "qcow", version);
error_setg(errp, "Unsupported qcow version %" PRIu32, header.version);
ret = -ENOTSUP;
goto fail;
}
@@ -489,7 +486,7 @@ static uint64_t get_cluster_offset(BlockDriverState *bs,
}
static int64_t coroutine_fn qcow_co_get_block_status(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *pnum)
int64_t sector_num, int nb_sectors, int *pnum, BlockDriverState **file)
{
BDRVQcowState *s = bs->opaque;
int index_in_cluster, n;
@@ -510,6 +507,7 @@ static int64_t coroutine_fn qcow_co_get_block_status(BlockDriverState *bs,
return BDRV_BLOCK_DATA;
}
cluster_offset |= (index_in_cluster << BDRV_SECTOR_BITS);
*file = bs->file->bs;
return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | cluster_offset;
}
@@ -779,7 +777,7 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
int flags = 0;
Error *local_err = NULL;
int ret;
BlockDriverState *qcow_bs;
BlockBackend *qcow_blk;
/* Read out options */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
@@ -795,15 +793,18 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
goto cleanup;
}
qcow_bs = NULL;
ret = bdrv_open(&qcow_bs, filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_PROTOCOL, &local_err);
if (ret < 0) {
qcow_blk = blk_new_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_PROTOCOL,
&local_err);
if (qcow_blk == NULL) {
error_propagate(errp, local_err);
ret = -EIO;
goto cleanup;
}
ret = bdrv_truncate(qcow_bs, 0);
blk_set_allow_write_beyond_eof(qcow_blk, true);
ret = blk_truncate(qcow_blk, 0);
if (ret < 0) {
goto exit;
}
@@ -843,13 +844,13 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
}
/* write all the data */
ret = bdrv_pwrite(qcow_bs, 0, &header, sizeof(header));
ret = blk_pwrite(qcow_blk, 0, &header, sizeof(header));
if (ret != sizeof(header)) {
goto exit;
}
if (backing_file) {
ret = bdrv_pwrite(qcow_bs, sizeof(header),
ret = blk_pwrite(qcow_blk, sizeof(header),
backing_file, backing_filename_len);
if (ret != backing_filename_len) {
goto exit;
@@ -859,7 +860,7 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
tmp = g_malloc0(BDRV_SECTOR_SIZE);
for (i = 0; i < ((sizeof(uint64_t)*l1_size + BDRV_SECTOR_SIZE - 1)/
BDRV_SECTOR_SIZE); i++) {
ret = bdrv_pwrite(qcow_bs, header_size +
ret = blk_pwrite(qcow_blk, header_size +
BDRV_SECTOR_SIZE*i, tmp, BDRV_SECTOR_SIZE);
if (ret != BDRV_SECTOR_SIZE) {
g_free(tmp);
@@ -870,7 +871,7 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
g_free(tmp);
ret = 0;
exit:
bdrv_unref(qcow_bs);
blk_unref(qcow_blk);
cleanup:
g_free(backing_file);
return ret;

View File

@@ -24,6 +24,7 @@
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "block/block_int.h"
#include "sysemu/block-backend.h"
#include "qemu/module.h"
#include <zlib.h>
#include "block/qcow2.h"
@@ -197,22 +198,8 @@ static void cleanup_unknown_header_ext(BlockDriverState *bs)
}
}
static void GCC_FMT_ATTR(3, 4) report_unsupported(BlockDriverState *bs,
Error **errp, const char *fmt, ...)
{
char msg[64];
va_list ap;
va_start(ap, fmt);
vsnprintf(msg, sizeof(msg), fmt, ap);
va_end(ap);
error_setg(errp, QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
bdrv_get_device_or_node_name(bs), "qcow2", msg);
}
static void report_unsupported_feature(BlockDriverState *bs,
Error **errp, Qcow2Feature *table, uint64_t mask)
static void report_unsupported_feature(Error **errp, Qcow2Feature *table,
uint64_t mask)
{
char *features = g_strdup("");
char *old;
@@ -237,7 +224,7 @@ static void report_unsupported_feature(BlockDriverState *bs,
g_free(old);
}
report_unsupported(bs, errp, "%s", features);
error_setg(errp, "Unsupported qcow2 feature(s): %s", features);
g_free(features);
}
@@ -854,7 +841,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
goto fail;
}
if (header.version < 2 || header.version > 3) {
report_unsupported(bs, errp, "QCOW version %" PRIu32, header.version);
error_setg(errp, "Unsupported qcow2 version %" PRIu32, header.version);
ret = -ENOTSUP;
goto fail;
}
@@ -934,7 +921,7 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
void *feature_table = NULL;
qcow2_read_extensions(bs, header.header_length, ext_end,
&feature_table, NULL);
report_unsupported_feature(bs, errp, feature_table,
report_unsupported_feature(errp, feature_table,
s->incompatible_features &
~QCOW2_INCOMPAT_MASK);
ret = -ENOTSUP;
@@ -1330,7 +1317,7 @@ static void qcow2_join_options(QDict *options, QDict *old_options)
}
static int64_t coroutine_fn qcow2_co_get_block_status(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *pnum)
int64_t sector_num, int nb_sectors, int *pnum, BlockDriverState **file)
{
BDRVQcow2State *s = bs->opaque;
uint64_t cluster_offset;
@@ -1349,6 +1336,7 @@ static int64_t coroutine_fn qcow2_co_get_block_status(BlockDriverState *bs,
!s->cipher) {
index_in_cluster = sector_num & (s->cluster_sectors - 1);
cluster_offset |= (index_in_cluster << BDRV_SECTOR_BITS);
*file = bs->file->bs;
status |= BDRV_BLOCK_OFFSET_VALID | cluster_offset;
}
if (ret == QCOW2_CLUSTER_ZERO) {
@@ -2096,7 +2084,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
* 2 GB for 64k clusters, and we don't want to have a 2 GB initial file
* size for any qcow2 image.
*/
BlockDriverState* bs;
BlockBackend *blk;
QCowHeader *header;
uint64_t* refcount_table;
Error *local_err = NULL;
@@ -2171,14 +2159,16 @@ static int qcow2_create2(const char *filename, int64_t total_size,
return ret;
}
bs = NULL;
ret = bdrv_open(&bs, filename, NULL, NULL, BDRV_O_RDWR | BDRV_O_PROTOCOL,
&local_err);
if (ret < 0) {
blk = blk_new_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_PROTOCOL,
&local_err);
if (blk == NULL) {
error_propagate(errp, local_err);
return ret;
return -EIO;
}
blk_set_allow_write_beyond_eof(blk, true);
/* Write the header */
QEMU_BUILD_BUG_ON((1 << MIN_CLUSTER_BITS) < sizeof(*header));
header = g_malloc0(cluster_size);
@@ -2206,7 +2196,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
cpu_to_be64(QCOW2_COMPAT_LAZY_REFCOUNTS);
}
ret = bdrv_pwrite(bs, 0, header, cluster_size);
ret = blk_pwrite(blk, 0, header, cluster_size);
g_free(header);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not write qcow2 header");
@@ -2216,7 +2206,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
/* Write a refcount table with one refcount block */
refcount_table = g_malloc0(2 * cluster_size);
refcount_table[0] = cpu_to_be64(2 * cluster_size);
ret = bdrv_pwrite(bs, cluster_size, refcount_table, 2 * cluster_size);
ret = blk_pwrite(blk, cluster_size, refcount_table, 2 * cluster_size);
g_free(refcount_table);
if (ret < 0) {
@@ -2224,8 +2214,8 @@ static int qcow2_create2(const char *filename, int64_t total_size,
goto out;
}
bdrv_unref(bs);
bs = NULL;
blk_unref(blk);
blk = NULL;
/*
* And now open the image and make it consistent first (i.e. increase the
@@ -2234,15 +2224,16 @@ static int qcow2_create2(const char *filename, int64_t total_size,
*/
options = qdict_new();
qdict_put(options, "driver", qstring_from_str("qcow2"));
ret = bdrv_open(&bs, filename, NULL, options,
BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH,
&local_err);
if (ret < 0) {
blk = blk_new_open(filename, NULL, options,
BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_NO_FLUSH,
&local_err);
if (blk == NULL) {
error_propagate(errp, local_err);
ret = -EIO;
goto out;
}
ret = qcow2_alloc_clusters(bs, 3 * cluster_size);
ret = qcow2_alloc_clusters(blk_bs(blk), 3 * cluster_size);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not allocate clusters for qcow2 "
"header and refcount table");
@@ -2254,14 +2245,14 @@ static int qcow2_create2(const char *filename, int64_t total_size,
}
/* Create a full header (including things like feature table) */
ret = qcow2_update_header(bs);
ret = qcow2_update_header(blk_bs(blk));
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not update qcow2 header");
goto out;
}
/* Okay, now that we have a valid image, let's give it the right size */
ret = bdrv_truncate(bs, total_size);
ret = blk_truncate(blk, total_size);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not resize image");
goto out;
@@ -2269,7 +2260,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
/* Want a backing file? There you go.*/
if (backing_file) {
ret = bdrv_change_backing_file(bs, backing_file, backing_format);
ret = bdrv_change_backing_file(blk_bs(blk), backing_file, backing_format);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not assign backing file '%s' "
"with format '%s'", backing_file, backing_format);
@@ -2279,9 +2270,9 @@ static int qcow2_create2(const char *filename, int64_t total_size,
/* And if we're supposed to preallocate metadata, do that now */
if (prealloc != PREALLOC_MODE_OFF) {
BDRVQcow2State *s = bs->opaque;
BDRVQcow2State *s = blk_bs(blk)->opaque;
qemu_co_mutex_lock(&s->lock);
ret = preallocate(bs);
ret = preallocate(blk_bs(blk));
qemu_co_mutex_unlock(&s->lock);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not preallocate metadata");
@@ -2289,24 +2280,25 @@ static int qcow2_create2(const char *filename, int64_t total_size,
}
}
bdrv_unref(bs);
bs = NULL;
blk_unref(blk);
blk = NULL;
/* Reopen the image without BDRV_O_NO_FLUSH to flush it before returning */
options = qdict_new();
qdict_put(options, "driver", qstring_from_str("qcow2"));
ret = bdrv_open(&bs, filename, NULL, options,
BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_NO_BACKING,
&local_err);
if (local_err) {
blk = blk_new_open(filename, NULL, options,
BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_NO_BACKING,
&local_err);
if (blk == NULL) {
error_propagate(errp, local_err);
ret = -EIO;
goto out;
}
ret = 0;
out:
if (bs) {
bdrv_unref(bs);
if (blk) {
blk_unref(blk);
}
return ret;
}
@@ -2808,15 +2800,15 @@ static ImageInfoSpecific *qcow2_get_specific_info(BlockDriverState *bs)
*spec_info = (ImageInfoSpecific){
.type = IMAGE_INFO_SPECIFIC_KIND_QCOW2,
.u.qcow2 = g_new(ImageInfoSpecificQCow2, 1),
.u.qcow2.data = g_new(ImageInfoSpecificQCow2, 1),
};
if (s->qcow_version == 2) {
*spec_info->u.qcow2 = (ImageInfoSpecificQCow2){
*spec_info->u.qcow2.data = (ImageInfoSpecificQCow2){
.compat = g_strdup("0.10"),
.refcount_bits = s->refcount_bits,
};
} else if (s->qcow_version == 3) {
*spec_info->u.qcow2 = (ImageInfoSpecificQCow2){
*spec_info->u.qcow2.data = (ImageInfoSpecificQCow2){
.compat = g_strdup("1.1"),
.lazy_refcounts = s->compatible_features &
QCOW2_COMPAT_LAZY_REFCOUNTS,

View File

@@ -18,6 +18,7 @@
#include "qed.h"
#include "qapi/qmp/qerror.h"
#include "migration/migration.h"
#include "sysemu/block-backend.h"
static const AIOCBInfo qed_aiocb_info = {
.aiocb_size = sizeof(QEDAIOCB),
@@ -376,18 +377,6 @@ static void bdrv_qed_attach_aio_context(BlockDriverState *bs,
}
}
static void bdrv_qed_drain(BlockDriverState *bs)
{
BDRVQEDState *s = bs->opaque;
/* Cancel timer and start doing I/O that were meant to happen as if it
* fired, that way we get bdrv_drain() taking care of the ongoing requests
* correctly. */
qed_cancel_need_check_timer(s);
qed_plug_allocating_write_reqs(s);
bdrv_aio_flush(s->bs, qed_clear_need_check, s);
}
static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
@@ -411,11 +400,8 @@ static int bdrv_qed_open(BlockDriverState *bs, QDict *options, int flags,
}
if (s->header.features & ~QED_FEATURE_MASK) {
/* image uses unsupported feature bits */
char buf[64];
snprintf(buf, sizeof(buf), "%" PRIx64,
s->header.features & ~QED_FEATURE_MASK);
error_setg(errp, QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
bdrv_get_device_or_node_name(bs), "QED", buf);
error_setg(errp, "Unsupported QED features: %" PRIx64,
s->header.features & ~QED_FEATURE_MASK);
return -ENOTSUP;
}
if (!qed_is_cluster_size_valid(s->header.cluster_size)) {
@@ -580,7 +566,7 @@ static int qed_create(const char *filename, uint32_t cluster_size,
size_t l1_size = header.cluster_size * header.table_size;
Error *local_err = NULL;
int ret = 0;
BlockDriverState *bs;
BlockBackend *blk;
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
@@ -588,17 +574,18 @@ static int qed_create(const char *filename, uint32_t cluster_size,
return ret;
}
bs = NULL;
ret = bdrv_open(&bs, filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_PROTOCOL,
&local_err);
if (ret < 0) {
blk = blk_new_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_PROTOCOL,
&local_err);
if (blk == NULL) {
error_propagate(errp, local_err);
return ret;
return -EIO;
}
blk_set_allow_write_beyond_eof(blk, true);
/* File must start empty and grow, check truncate is supported */
ret = bdrv_truncate(bs, 0);
ret = blk_truncate(blk, 0);
if (ret < 0) {
goto out;
}
@@ -614,18 +601,18 @@ static int qed_create(const char *filename, uint32_t cluster_size,
}
qed_header_cpu_to_le(&header, &le_header);
ret = bdrv_pwrite(bs, 0, &le_header, sizeof(le_header));
ret = blk_pwrite(blk, 0, &le_header, sizeof(le_header));
if (ret < 0) {
goto out;
}
ret = bdrv_pwrite(bs, sizeof(le_header), backing_file,
header.backing_filename_size);
ret = blk_pwrite(blk, sizeof(le_header), backing_file,
header.backing_filename_size);
if (ret < 0) {
goto out;
}
l1_table = g_malloc0(l1_size);
ret = bdrv_pwrite(bs, header.l1_table_offset, l1_table, l1_size);
ret = blk_pwrite(blk, header.l1_table_offset, l1_table, l1_size);
if (ret < 0) {
goto out;
}
@@ -633,7 +620,7 @@ static int qed_create(const char *filename, uint32_t cluster_size,
ret = 0; /* success */
out:
g_free(l1_table);
bdrv_unref(bs);
blk_unref(blk);
return ret;
}
@@ -693,6 +680,7 @@ typedef struct {
uint64_t pos;
int64_t status;
int *pnum;
BlockDriverState **file;
} QEDIsAllocatedCB;
static void qed_is_allocated_cb(void *opaque, int ret, uint64_t offset, size_t len)
@@ -704,6 +692,7 @@ static void qed_is_allocated_cb(void *opaque, int ret, uint64_t offset, size_t l
case QED_CLUSTER_FOUND:
offset |= qed_offset_into_cluster(s, cb->pos);
cb->status = BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | offset;
*cb->file = cb->bs->file->bs;
break;
case QED_CLUSTER_ZERO:
cb->status = BDRV_BLOCK_ZERO;
@@ -725,7 +714,8 @@ static void qed_is_allocated_cb(void *opaque, int ret, uint64_t offset, size_t l
static int64_t coroutine_fn bdrv_qed_co_get_block_status(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors, int *pnum)
int nb_sectors, int *pnum,
BlockDriverState **file)
{
BDRVQEDState *s = bs->opaque;
size_t len = (size_t)nb_sectors * BDRV_SECTOR_SIZE;
@@ -734,6 +724,7 @@ static int64_t coroutine_fn bdrv_qed_co_get_block_status(BlockDriverState *bs,
.pos = (uint64_t)sector_num * BDRV_SECTOR_SIZE,
.status = BDRV_BLOCK_OFFSET_MASK,
.pnum = pnum,
.file = file,
};
QEDRequest request = { .l2_table = NULL };
@@ -1688,7 +1679,6 @@ static BlockDriver bdrv_qed = {
.bdrv_check = bdrv_qed_check,
.bdrv_detach_aio_context = bdrv_qed_detach_aio_context,
.bdrv_attach_aio_context = bdrv_qed_attach_aio_context,
.bdrv_drain = bdrv_qed_drain,
};
static void bdrv_qed_init(void)

View File

@@ -215,14 +215,16 @@ static QuorumAIOCB *quorum_aio_get(BDRVQuorumState *s,
return acb;
}
static void quorum_report_bad(QuorumAIOCB *acb, char *node_name, int ret)
static void quorum_report_bad(QuorumOpType type, uint64_t sector_num,
int nb_sectors, char *node_name, int ret)
{
const char *msg = NULL;
if (ret < 0) {
msg = strerror(-ret);
}
qapi_event_send_quorum_report_bad(!!msg, msg, node_name,
acb->sector_num, acb->nb_sectors, &error_abort);
qapi_event_send_quorum_report_bad(type, !!msg, msg, node_name,
sector_num, nb_sectors, &error_abort);
}
static void quorum_report_failure(QuorumAIOCB *acb)
@@ -284,9 +286,19 @@ static void quorum_aio_cb(void *opaque, int ret)
BDRVQuorumState *s = acb->common.bs->opaque;
bool rewrite = false;
if (ret == 0) {
acb->success_count++;
} else {
QuorumOpType type;
type = acb->is_read ? QUORUM_OP_TYPE_READ : QUORUM_OP_TYPE_WRITE;
quorum_report_bad(type, acb->sector_num, acb->nb_sectors,
sacb->aiocb->bs->node_name, ret);
}
if (acb->is_read && s->read_pattern == QUORUM_READ_PATTERN_FIFO) {
/* We try to read next child in FIFO order if we fail to read */
if (ret < 0 && ++acb->child_iter < s->num_children) {
if (ret < 0 && (acb->child_iter + 1) < s->num_children) {
acb->child_iter++;
read_fifo_child(acb);
return;
}
@@ -301,11 +313,6 @@ static void quorum_aio_cb(void *opaque, int ret)
sacb->ret = ret;
acb->count++;
if (ret == 0) {
acb->success_count++;
} else {
quorum_report_bad(acb, sacb->aiocb->bs->node_name, ret);
}
assert(acb->count <= s->num_children);
assert(acb->success_count <= s->num_children);
if (acb->count < s->num_children) {
@@ -337,7 +344,9 @@ static void quorum_report_bad_versions(BDRVQuorumState *s,
continue;
}
QLIST_FOREACH(item, &version->items, next) {
quorum_report_bad(acb, s->children[item->index]->bs->node_name, 0);
quorum_report_bad(QUORUM_OP_TYPE_READ, acb->sector_num,
acb->nb_sectors,
s->children[item->index]->bs->node_name, 0);
}
}
}
@@ -647,8 +656,9 @@ static BlockAIOCB *read_quorum_children(QuorumAIOCB *acb)
}
for (i = 0; i < s->num_children; i++) {
bdrv_aio_readv(s->children[i]->bs, acb->sector_num, &acb->qcrs[i].qiov,
acb->nb_sectors, quorum_aio_cb, &acb->qcrs[i]);
acb->qcrs[i].aiocb = bdrv_aio_readv(s->children[i]->bs, acb->sector_num,
&acb->qcrs[i].qiov, acb->nb_sectors,
quorum_aio_cb, &acb->qcrs[i]);
}
return &acb->common;
@@ -663,9 +673,10 @@ static BlockAIOCB *read_fifo_child(QuorumAIOCB *acb)
qemu_iovec_init(&acb->qcrs[acb->child_iter].qiov, acb->qiov->niov);
qemu_iovec_clone(&acb->qcrs[acb->child_iter].qiov, acb->qiov,
acb->qcrs[acb->child_iter].buf);
bdrv_aio_readv(s->children[acb->child_iter]->bs, acb->sector_num,
&acb->qcrs[acb->child_iter].qiov, acb->nb_sectors,
quorum_aio_cb, &acb->qcrs[acb->child_iter]);
acb->qcrs[acb->child_iter].aiocb =
bdrv_aio_readv(s->children[acb->child_iter]->bs, acb->sector_num,
&acb->qcrs[acb->child_iter].qiov, acb->nb_sectors,
quorum_aio_cb, &acb->qcrs[acb->child_iter]);
return &acb->common;
}
@@ -759,19 +770,30 @@ static coroutine_fn int quorum_co_flush(BlockDriverState *bs)
QuorumVoteValue result_value;
int i;
int result = 0;
int success_count = 0;
QLIST_INIT(&error_votes.vote_list);
error_votes.compare = quorum_64bits_compare;
for (i = 0; i < s->num_children; i++) {
result = bdrv_co_flush(s->children[i]->bs);
result_value.l = result;
quorum_count_vote(&error_votes, &result_value, i);
if (result) {
quorum_report_bad(QUORUM_OP_TYPE_FLUSH, 0,
bdrv_nb_sectors(s->children[i]->bs),
s->children[i]->bs->node_name, result);
result_value.l = result;
quorum_count_vote(&error_votes, &result_value, i);
} else {
success_count++;
}
}
winner = quorum_get_vote_winner(&error_votes);
result = winner->value.l;
if (success_count >= s->threshold) {
result = 0;
} else {
winner = quorum_get_vote_winner(&error_votes);
result = winner->value.l;
}
quorum_free_vote_list(&error_votes);
return result;

View File

@@ -1818,7 +1818,8 @@ static int find_allocation(BlockDriverState *bs, off_t start,
*/
static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors, int *pnum)
int nb_sectors, int *pnum,
BlockDriverState **file)
{
off_t start, data = 0, hole = 0;
int64_t total_size;
@@ -1860,6 +1861,7 @@ static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
*pnum = MIN(nb_sectors, (data - start) / BDRV_SECTOR_SIZE);
ret = BDRV_BLOCK_ZERO;
}
*file = bs;
return ret | BDRV_BLOCK_OFFSET_VALID | start;
}

View File

@@ -115,9 +115,11 @@ fail:
static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors, int *pnum)
int nb_sectors, int *pnum,
BlockDriverState **file)
{
*pnum = nb_sectors;
*file = bs->file->bs;
return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_DATA |
(sector_num << BDRV_SECTOR_BITS);
}

View File

@@ -16,6 +16,7 @@
#include "qemu-common.h"
#include "qemu/error-report.h"
#include "block/block_int.h"
#include "crypto/secret.h"
#include <rbd/librbd.h>
@@ -228,6 +229,27 @@ static char *qemu_rbd_parse_clientname(const char *conf, char *clientname)
return NULL;
}
static int qemu_rbd_set_auth(rados_t cluster, const char *secretid,
Error **errp)
{
if (secretid == 0) {
return 0;
}
gchar *secret = qcrypto_secret_lookup_as_base64(secretid,
errp);
if (!secret) {
return -1;
}
rados_conf_set(cluster, "key", secret);
g_free(secret);
return 0;
}
static int qemu_rbd_set_conf(rados_t cluster, const char *conf,
bool only_read_conf_file,
Error **errp)
@@ -299,10 +321,13 @@ static int qemu_rbd_create(const char *filename, QemuOpts *opts, Error **errp)
char conf[RBD_MAX_CONF_SIZE];
char clientname_buf[RBD_MAX_CONF_SIZE];
char *clientname;
const char *secretid;
rados_t cluster;
rados_ioctx_t io_ctx;
int ret;
secretid = qemu_opt_get(opts, "password-secret");
if (qemu_rbd_parsename(filename, pool, sizeof(pool),
snap_buf, sizeof(snap_buf),
name, sizeof(name),
@@ -350,6 +375,11 @@ static int qemu_rbd_create(const char *filename, QemuOpts *opts, Error **errp)
return -EIO;
}
if (qemu_rbd_set_auth(cluster, secretid, errp) < 0) {
rados_shutdown(cluster);
return -EIO;
}
if (rados_connect(cluster) < 0) {
error_setg(errp, "error connecting");
rados_shutdown(cluster);
@@ -423,6 +453,11 @@ static QemuOptsList runtime_opts = {
.type = QEMU_OPT_STRING,
.help = "Specification of the rbd image",
},
{
.name = "password-secret",
.type = QEMU_OPT_STRING,
.help = "ID of secret providing the password",
},
{ /* end of list */ }
},
};
@@ -436,6 +471,7 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
char conf[RBD_MAX_CONF_SIZE];
char clientname_buf[RBD_MAX_CONF_SIZE];
char *clientname;
const char *secretid;
QemuOpts *opts;
Error *local_err = NULL;
const char *filename;
@@ -450,6 +486,7 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
}
filename = qemu_opt_get(opts, "filename");
secretid = qemu_opt_get(opts, "password-secret");
if (qemu_rbd_parsename(filename, pool, sizeof(pool),
snap_buf, sizeof(snap_buf),
@@ -488,6 +525,11 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
}
}
if (qemu_rbd_set_auth(s->cluster, secretid, errp) < 0) {
r = -EIO;
goto failed_shutdown;
}
/*
* Fallback to more conservative semantics if setting cache
* options fails. Ignore errors from setting rbd_cache because the
@@ -919,6 +961,11 @@ static QemuOptsList qemu_rbd_create_opts = {
.type = QEMU_OPT_SIZE,
.help = "RBD object size"
},
{
.name = "password-secret",
.type = QEMU_OPT_STRING,
.help = "ID of secret providing the password",
},
{ /* end of list */ }
}
};

View File

@@ -18,6 +18,7 @@
#include "qemu/error-report.h"
#include "qemu/sockets.h"
#include "block/block_int.h"
#include "sysemu/block-backend.h"
#include "qemu/bitops.h"
#define SD_PROTO_VER 0x01
@@ -284,6 +285,12 @@ static inline bool is_snapshot(struct SheepdogInode *inode)
return !!inode->snap_ctime;
}
static inline size_t count_data_objs(const struct SheepdogInode *inode)
{
return DIV_ROUND_UP(inode->vdi_size,
(1UL << inode->block_size_shift));
}
#undef DPRINTF
#ifdef DEBUG_SDOG
#define DPRINTF(fmt, args...) \
@@ -609,14 +616,13 @@ static coroutine_fn int send_co_req(int sockfd, SheepdogReq *hdr, void *data,
ret = qemu_co_send(sockfd, hdr, sizeof(*hdr));
if (ret != sizeof(*hdr)) {
error_report("failed to send a req, %s", strerror(errno));
ret = -socket_error();
return ret;
return -errno;
}
ret = qemu_co_send(sockfd, data, *wlen);
if (ret != *wlen) {
ret = -socket_error();
error_report("failed to send a req, %s", strerror(errno));
return -errno;
}
return ret;
@@ -1631,7 +1637,7 @@ static int do_sd_create(BDRVSheepdogState *s, uint32_t *vdi_id, int snapshot,
static int sd_prealloc(const char *filename, Error **errp)
{
BlockDriverState *bs = NULL;
BlockBackend *blk = NULL;
BDRVSheepdogState *base = NULL;
unsigned long buf_size;
uint32_t idx, max_idx;
@@ -1640,19 +1646,23 @@ static int sd_prealloc(const char *filename, Error **errp)
void *buf = NULL;
int ret;
ret = bdrv_open(&bs, filename, NULL, NULL, BDRV_O_RDWR | BDRV_O_PROTOCOL,
errp);
if (ret < 0) {
blk = blk_new_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_PROTOCOL,
errp);
if (blk == NULL) {
ret = -EIO;
goto out_with_err_set;
}
vdi_size = bdrv_getlength(bs);
blk_set_allow_write_beyond_eof(blk, true);
vdi_size = blk_getlength(blk);
if (vdi_size < 0) {
ret = vdi_size;
goto out;
}
base = bs->opaque;
base = blk_bs(blk)->opaque;
object_size = (UINT32_C(1) << base->inode.block_size_shift);
buf_size = MIN(object_size, SD_DATA_OBJ_SIZE);
buf = g_malloc0(buf_size);
@@ -1664,23 +1674,24 @@ static int sd_prealloc(const char *filename, Error **errp)
* The created image can be a cloned image, so we need to read
* a data from the source image.
*/
ret = bdrv_pread(bs, idx * buf_size, buf, buf_size);
ret = blk_pread(blk, idx * buf_size, buf, buf_size);
if (ret < 0) {
goto out;
}
ret = bdrv_pwrite(bs, idx * buf_size, buf, buf_size);
ret = blk_pwrite(blk, idx * buf_size, buf, buf_size);
if (ret < 0) {
goto out;
}
}
ret = 0;
out:
if (ret < 0) {
error_setg_errno(errp, -ret, "Can't pre-allocate");
}
out_with_err_set:
if (bs) {
bdrv_unref(bs);
if (blk) {
blk_unref(blk);
}
g_free(buf);
@@ -1820,7 +1831,7 @@ static int sd_create(const char *filename, QemuOpts *opts,
}
if (backing_file) {
BlockDriverState *bs;
BlockBackend *blk;
BDRVSheepdogState *base;
BlockDriver *drv;
@@ -1832,22 +1843,23 @@ static int sd_create(const char *filename, QemuOpts *opts,
goto out;
}
bs = NULL;
ret = bdrv_open(&bs, backing_file, NULL, NULL, BDRV_O_PROTOCOL, errp);
if (ret < 0) {
blk = blk_new_open(backing_file, NULL, NULL,
BDRV_O_PROTOCOL | BDRV_O_CACHE_WB, errp);
if (blk == NULL) {
ret = -EIO;
goto out;
}
base = bs->opaque;
base = blk_bs(blk)->opaque;
if (!is_snapshot(&base->inode)) {
error_setg(errp, "cannot clone from a non snapshot vdi");
bdrv_unref(bs);
blk_unref(blk);
ret = -EINVAL;
goto out;
}
s->inode.vdi_id = base->inode.vdi_id;
bdrv_unref(bs);
blk_unref(blk);
}
s->aio_context = qemu_get_aio_context();
@@ -2478,13 +2490,131 @@ out:
return ret;
}
#define NR_BATCHED_DISCARD 128
static bool remove_objects(BDRVSheepdogState *s)
{
int fd, i = 0, nr_objs = 0;
Error *local_err = NULL;
int ret = 0;
bool result = true;
SheepdogInode *inode = &s->inode;
fd = connect_to_sdog(s, &local_err);
if (fd < 0) {
error_report_err(local_err);
return false;
}
nr_objs = count_data_objs(inode);
while (i < nr_objs) {
int start_idx, nr_filled_idx;
while (i < nr_objs && !inode->data_vdi_id[i]) {
i++;
}
start_idx = i;
nr_filled_idx = 0;
while (i < nr_objs && nr_filled_idx < NR_BATCHED_DISCARD) {
if (inode->data_vdi_id[i]) {
inode->data_vdi_id[i] = 0;
nr_filled_idx++;
}
i++;
}
ret = write_object(fd, s->aio_context,
(char *)&inode->data_vdi_id[start_idx],
vid_to_vdi_oid(s->inode.vdi_id), inode->nr_copies,
(i - start_idx) * sizeof(uint32_t),
offsetof(struct SheepdogInode,
data_vdi_id[start_idx]),
false, s->cache_flags);
if (ret < 0) {
error_report("failed to discard snapshot inode.");
result = false;
goto out;
}
}
out:
closesocket(fd);
return result;
}
static int sd_snapshot_delete(BlockDriverState *bs,
const char *snapshot_id,
const char *name,
Error **errp)
{
/* FIXME: Delete specified snapshot id. */
return 0;
unsigned long snap_id = 0;
char snap_tag[SD_MAX_VDI_TAG_LEN];
Error *local_err = NULL;
int fd, ret;
char buf[SD_MAX_VDI_LEN + SD_MAX_VDI_TAG_LEN];
BDRVSheepdogState *s = bs->opaque;
unsigned int wlen = SD_MAX_VDI_LEN + SD_MAX_VDI_TAG_LEN, rlen = 0;
uint32_t vid;
SheepdogVdiReq hdr = {
.opcode = SD_OP_DEL_VDI,
.data_length = wlen,
.flags = SD_FLAG_CMD_WRITE,
};
SheepdogVdiRsp *rsp = (SheepdogVdiRsp *)&hdr;
if (!remove_objects(s)) {
return -1;
}
memset(buf, 0, sizeof(buf));
memset(snap_tag, 0, sizeof(snap_tag));
pstrcpy(buf, SD_MAX_VDI_LEN, s->name);
ret = qemu_strtoul(snapshot_id, NULL, 10, &snap_id);
if (ret || snap_id > UINT32_MAX) {
error_setg(errp, "Invalid snapshot ID: %s",
snapshot_id ? snapshot_id : "<null>");
return -EINVAL;
}
if (snap_id) {
hdr.snapid = (uint32_t) snap_id;
} else {
pstrcpy(snap_tag, sizeof(snap_tag), snapshot_id);
pstrcpy(buf + SD_MAX_VDI_LEN, SD_MAX_VDI_TAG_LEN, snap_tag);
}
ret = find_vdi_name(s, s->name, snap_id, snap_tag, &vid, true,
&local_err);
if (ret) {
return ret;
}
fd = connect_to_sdog(s, &local_err);
if (fd < 0) {
error_report_err(local_err);
return -1;
}
ret = do_req(fd, s->aio_context, (SheepdogReq *)&hdr,
buf, &wlen, &rlen);
closesocket(fd);
if (ret) {
return ret;
}
switch (rsp->result) {
case SD_RES_NO_VDI:
error_report("%s was already deleted", s->name);
case SD_RES_SUCCESS:
break;
default:
error_report("%s, %s", sd_strerror(rsp->result), s->name);
return -1;
}
return ret;
}
static int sd_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
@@ -2708,7 +2838,7 @@ retry:
static coroutine_fn int64_t
sd_co_get_block_status(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
int *pnum)
int *pnum, BlockDriverState **file)
{
BDRVSheepdogState *s = bs->opaque;
SheepdogInode *inode = &s->inode;
@@ -2739,6 +2869,9 @@ sd_co_get_block_status(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
if (*pnum > nb_sectors) {
*pnum = nb_sectors;
}
if (ret > 0 && ret & BDRV_BLOCK_OFFSET_VALID) {
*file = bs;
}
return ret;
}

View File

@@ -52,6 +52,7 @@
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "block/block_int.h"
#include "sysemu/block-backend.h"
#include "qemu/module.h"
#include "migration/migration.h"
#include "qemu/coroutine.h"
@@ -527,7 +528,7 @@ static int vdi_reopen_prepare(BDRVReopenState *state,
}
static int64_t coroutine_fn vdi_co_get_block_status(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *pnum)
int64_t sector_num, int nb_sectors, int *pnum, BlockDriverState **file)
{
/* TODO: Check for too large sector_num (in bdrv_is_allocated or here). */
BDRVVdiState *s = (BDRVVdiState *)bs->opaque;
@@ -551,6 +552,7 @@ static int64_t coroutine_fn vdi_co_get_block_status(BlockDriverState *bs,
offset = s->header.offset_data +
(uint64_t)bmap_entry * s->block_size +
sector_in_block * SECTOR_SIZE;
*file = bs->file->bs;
return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | offset;
}
@@ -732,7 +734,7 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
size_t bmap_size;
int64_t offset = 0;
Error *local_err = NULL;
BlockDriverState *bs = NULL;
BlockBackend *blk = NULL;
uint32_t *bmap = NULL;
logout("\n");
@@ -765,13 +767,18 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
error_propagate(errp, local_err);
goto exit;
}
ret = bdrv_open(&bs, filename, NULL, NULL, BDRV_O_RDWR | BDRV_O_PROTOCOL,
&local_err);
if (ret < 0) {
blk = blk_new_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_PROTOCOL,
&local_err);
if (blk == NULL) {
error_propagate(errp, local_err);
ret = -EIO;
goto exit;
}
blk_set_allow_write_beyond_eof(blk, true);
/* We need enough blocks to store the given disk size,
so always round up. */
blocks = DIV_ROUND_UP(bytes, block_size);
@@ -801,7 +808,7 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
vdi_header_print(&header);
#endif
vdi_header_to_le(&header);
ret = bdrv_pwrite_sync(bs, offset, &header, sizeof(header));
ret = blk_pwrite(blk, offset, &header, sizeof(header));
if (ret < 0) {
error_setg(errp, "Error writing header to %s", filename);
goto exit;
@@ -822,7 +829,7 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
bmap[i] = VDI_UNALLOCATED;
}
}
ret = bdrv_pwrite_sync(bs, offset, bmap, bmap_size);
ret = blk_pwrite(blk, offset, bmap, bmap_size);
if (ret < 0) {
error_setg(errp, "Error writing bmap to %s", filename);
goto exit;
@@ -831,7 +838,7 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
}
if (image_type == VDI_TYPE_STATIC) {
ret = bdrv_truncate(bs, offset + blocks * block_size);
ret = blk_truncate(blk, offset + blocks * block_size);
if (ret < 0) {
error_setg(errp, "Failed to statically allocate %s", filename);
goto exit;
@@ -839,7 +846,7 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
}
exit:
bdrv_unref(bs);
blk_unref(blk);
g_free(bmap);
return ret;
}

View File

@@ -18,6 +18,7 @@
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "block/block_int.h"
#include "sysemu/block-backend.h"
#include "qemu/module.h"
#include "qemu/crc32c.h"
#include "block/vhdx.h"
@@ -264,10 +265,10 @@ static void vhdx_region_unregister_all(BDRVVHDXState *s)
static void vhdx_set_shift_bits(BDRVVHDXState *s)
{
s->logical_sector_size_bits = 31 - clz32(s->logical_sector_size);
s->sectors_per_block_bits = 31 - clz32(s->sectors_per_block);
s->chunk_ratio_bits = 63 - clz64(s->chunk_ratio);
s->block_size_bits = 31 - clz32(s->block_size);
s->logical_sector_size_bits = ctz32(s->logical_sector_size);
s->sectors_per_block_bits = ctz32(s->sectors_per_block);
s->chunk_ratio_bits = ctz64(s->chunk_ratio);
s->block_size_bits = ctz32(s->block_size);
}
/*
@@ -857,14 +858,8 @@ static void vhdx_calc_bat_entries(BDRVVHDXState *s)
{
uint32_t data_blocks_cnt, bitmap_blocks_cnt;
data_blocks_cnt = s->virtual_disk_size >> s->block_size_bits;
if (s->virtual_disk_size - (data_blocks_cnt << s->block_size_bits)) {
data_blocks_cnt++;
}
bitmap_blocks_cnt = data_blocks_cnt >> s->chunk_ratio_bits;
if (data_blocks_cnt - (bitmap_blocks_cnt << s->chunk_ratio_bits)) {
bitmap_blocks_cnt++;
}
data_blocks_cnt = DIV_ROUND_UP(s->virtual_disk_size, s->block_size);
bitmap_blocks_cnt = DIV_ROUND_UP(data_blocks_cnt, s->chunk_ratio);
if (s->parent_entries) {
s->bat_entries = bitmap_blocks_cnt * (s->chunk_ratio + 1);
@@ -1778,7 +1773,7 @@ static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
gunichar2 *creator = NULL;
glong creator_items;
BlockDriverState *bs;
BlockBackend *blk;
char *type = NULL;
VHDXImageType image_type;
Error *local_err = NULL;
@@ -1843,14 +1838,17 @@ static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
goto exit;
}
bs = NULL;
ret = bdrv_open(&bs, filename, NULL, NULL, BDRV_O_RDWR | BDRV_O_PROTOCOL,
&local_err);
if (ret < 0) {
blk = blk_new_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_PROTOCOL,
&local_err);
if (blk == NULL) {
error_propagate(errp, local_err);
ret = -EIO;
goto exit;
}
blk_set_allow_write_beyond_eof(blk, true);
/* Create (A) */
/* The creator field is optional, but may be useful for
@@ -1858,13 +1856,13 @@ static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
creator = g_utf8_to_utf16("QEMU v" QEMU_VERSION, -1, NULL,
&creator_items, NULL);
signature = cpu_to_le64(VHDX_FILE_SIGNATURE);
ret = bdrv_pwrite(bs, VHDX_FILE_ID_OFFSET, &signature, sizeof(signature));
ret = blk_pwrite(blk, VHDX_FILE_ID_OFFSET, &signature, sizeof(signature));
if (ret < 0) {
goto delete_and_exit;
}
if (creator) {
ret = bdrv_pwrite(bs, VHDX_FILE_ID_OFFSET + sizeof(signature),
creator, creator_items * sizeof(gunichar2));
ret = blk_pwrite(blk, VHDX_FILE_ID_OFFSET + sizeof(signature),
creator, creator_items * sizeof(gunichar2));
if (ret < 0) {
goto delete_and_exit;
}
@@ -1872,13 +1870,13 @@ static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
/* Creates (B),(C) */
ret = vhdx_create_new_headers(bs, image_size, log_size);
ret = vhdx_create_new_headers(blk_bs(blk), image_size, log_size);
if (ret < 0) {
goto delete_and_exit;
}
/* Creates (D),(E),(G) explicitly. (F) created as by-product */
ret = vhdx_create_new_region_table(bs, image_size, block_size, 512,
ret = vhdx_create_new_region_table(blk_bs(blk), image_size, block_size, 512,
log_size, use_zero_blocks, image_type,
&metadata_offset);
if (ret < 0) {
@@ -1886,7 +1884,7 @@ static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
}
/* Creates (H) */
ret = vhdx_create_new_metadata(bs, image_size, block_size, 512,
ret = vhdx_create_new_metadata(blk_bs(blk), image_size, block_size, 512,
metadata_offset, image_type);
if (ret < 0) {
goto delete_and_exit;
@@ -1894,7 +1892,7 @@ static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
delete_and_exit:
bdrv_unref(bs);
blk_unref(blk);
exit:
g_free(type);
g_free(creator);

View File

@@ -26,6 +26,7 @@
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "block/block_int.h"
#include "sysemu/block-backend.h"
#include "qapi/qmp/qerror.h"
#include "qemu/error-report.h"
#include "qemu/module.h"
@@ -242,15 +243,17 @@ static void vmdk_free_last_extent(BlockDriverState *bs)
static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
{
char desc[DESC_SIZE];
char *desc;
uint32_t cid = 0xffffffff;
const char *p_name, *cid_str;
size_t cid_str_size;
BDRVVmdkState *s = bs->opaque;
int ret;
desc = g_malloc0(DESC_SIZE);
ret = bdrv_pread(bs->file->bs, s->desc_offset, desc, DESC_SIZE);
if (ret < 0) {
g_free(desc);
return 0;
}
@@ -269,41 +272,45 @@ static uint32_t vmdk_read_cid(BlockDriverState *bs, int parent)
sscanf(p_name, "%" SCNx32, &cid);
}
g_free(desc);
return cid;
}
static int vmdk_write_cid(BlockDriverState *bs, uint32_t cid)
{
char desc[DESC_SIZE], tmp_desc[DESC_SIZE];
char *desc, *tmp_desc;
char *p_name, *tmp_str;
BDRVVmdkState *s = bs->opaque;
int ret;
int ret = 0;
desc = g_malloc0(DESC_SIZE);
tmp_desc = g_malloc0(DESC_SIZE);
ret = bdrv_pread(bs->file->bs, s->desc_offset, desc, DESC_SIZE);
if (ret < 0) {
return ret;
goto out;
}
desc[DESC_SIZE - 1] = '\0';
tmp_str = strstr(desc, "parentCID");
if (tmp_str == NULL) {
return -EINVAL;
ret = -EINVAL;
goto out;
}
pstrcpy(tmp_desc, sizeof(tmp_desc), tmp_str);
pstrcpy(tmp_desc, DESC_SIZE, tmp_str);
p_name = strstr(desc, "CID");
if (p_name != NULL) {
p_name += sizeof("CID");
snprintf(p_name, sizeof(desc) - (p_name - desc), "%" PRIx32 "\n", cid);
pstrcat(desc, sizeof(desc), tmp_desc);
snprintf(p_name, DESC_SIZE - (p_name - desc), "%" PRIx32 "\n", cid);
pstrcat(desc, DESC_SIZE, tmp_desc);
}
ret = bdrv_pwrite_sync(bs->file->bs, s->desc_offset, desc, DESC_SIZE);
if (ret < 0) {
return ret;
}
return 0;
out:
g_free(desc);
g_free(tmp_desc);
return ret;
}
static int vmdk_is_cid_valid(BlockDriverState *bs)
@@ -337,15 +344,16 @@ static int vmdk_reopen_prepare(BDRVReopenState *state,
static int vmdk_parent_open(BlockDriverState *bs)
{
char *p_name;
char desc[DESC_SIZE + 1];
char *desc;
BDRVVmdkState *s = bs->opaque;
int ret;
desc[DESC_SIZE] = '\0';
desc = g_malloc0(DESC_SIZE + 1);
ret = bdrv_pread(bs->file->bs, s->desc_offset, desc, DESC_SIZE);
if (ret < 0) {
return ret;
goto out;
}
ret = 0;
p_name = strstr(desc, "parentFileNameHint");
if (p_name != NULL) {
@@ -354,16 +362,20 @@ static int vmdk_parent_open(BlockDriverState *bs)
p_name += sizeof("parentFileNameHint") + 1;
end_name = strchr(p_name, '\"');
if (end_name == NULL) {
return -EINVAL;
ret = -EINVAL;
goto out;
}
if ((end_name - p_name) > sizeof(bs->backing_file) - 1) {
return -EINVAL;
ret = -EINVAL;
goto out;
}
pstrcpy(bs->backing_file, end_name - p_name + 1, p_name);
}
return 0;
out:
g_free(desc);
return ret;
}
/* Create and append extent to the extent array. Return the added VmdkExtent
@@ -571,6 +583,7 @@ static int vmdk_open_vmdk4(BlockDriverState *bs,
VmdkExtent *extent;
BDRVVmdkState *s = bs->opaque;
int64_t l1_backup_offset = 0;
bool compressed;
ret = bdrv_pread(file->bs, sizeof(magic), &header, sizeof(header));
if (ret < 0) {
@@ -645,14 +658,14 @@ static int vmdk_open_vmdk4(BlockDriverState *bs,
header = footer.header;
}
compressed =
le16_to_cpu(header.compressAlgorithm) == VMDK4_COMPRESSION_DEFLATE;
if (le32_to_cpu(header.version) > 3) {
char buf[64];
snprintf(buf, sizeof(buf), "VMDK version %" PRId32,
le32_to_cpu(header.version));
error_setg(errp, QERR_UNKNOWN_BLOCK_FORMAT_FEATURE,
bdrv_get_device_or_node_name(bs), "vmdk", buf);
error_setg(errp, "Unsupported VMDK version %" PRIu32,
le32_to_cpu(header.version));
return -ENOTSUP;
} else if (le32_to_cpu(header.version) == 3 && (flags & BDRV_O_RDWR)) {
} else if (le32_to_cpu(header.version) == 3 && (flags & BDRV_O_RDWR) &&
!compressed) {
/* VMware KB 2064959 explains that version 3 added support for
* persistent changed block tracking (CBT), and backup software can
* read it as version=1 if it doesn't care about the changed area
@@ -1257,7 +1270,7 @@ static inline uint64_t vmdk_find_index_in_cluster(VmdkExtent *extent,
}
static int64_t coroutine_fn vmdk_co_get_block_status(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *pnum)
int64_t sector_num, int nb_sectors, int *pnum, BlockDriverState **file)
{
BDRVVmdkState *s = bs->opaque;
int64_t index_in_cluster, n, ret;
@@ -1274,6 +1287,7 @@ static int64_t coroutine_fn vmdk_co_get_block_status(BlockDriverState *bs,
0, 0);
qemu_co_mutex_unlock(&s->lock);
index_in_cluster = vmdk_find_index_in_cluster(extent, sector_num);
switch (ret) {
case VMDK_ERROR:
ret = -EIO;
@@ -1286,14 +1300,15 @@ static int64_t coroutine_fn vmdk_co_get_block_status(BlockDriverState *bs,
break;
case VMDK_OK:
ret = BDRV_BLOCK_DATA;
if (extent->file == bs->file && !extent->compressed) {
ret |= BDRV_BLOCK_OFFSET_VALID | offset;
if (!extent->compressed) {
ret |= BDRV_BLOCK_OFFSET_VALID;
ret |= (offset + (index_in_cluster << BDRV_SECTOR_BITS))
& BDRV_BLOCK_OFFSET_MASK;
}
*file = extent->file->bs;
break;
}
index_in_cluster = vmdk_find_index_in_cluster(extent, sector_num);
n = extent->cluster_sectors - index_in_cluster;
if (n > nb_sectors) {
n = nb_sectors;
@@ -1633,7 +1648,7 @@ static int vmdk_create_extent(const char *filename, int64_t filesize,
QemuOpts *opts, Error **errp)
{
int ret, i;
BlockDriverState *bs = NULL;
BlockBackend *blk = NULL;
VMDK4Header header;
Error *local_err = NULL;
uint32_t tmp, magic, grains, gd_sectors, gt_size, gt_count;
@@ -1646,16 +1661,19 @@ static int vmdk_create_extent(const char *filename, int64_t filesize,
goto exit;
}
assert(bs == NULL);
ret = bdrv_open(&bs, filename, NULL, NULL, BDRV_O_RDWR | BDRV_O_PROTOCOL,
&local_err);
if (ret < 0) {
blk = blk_new_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_PROTOCOL,
&local_err);
if (blk == NULL) {
error_propagate(errp, local_err);
ret = -EIO;
goto exit;
}
blk_set_allow_write_beyond_eof(blk, true);
if (flat) {
ret = bdrv_truncate(bs, filesize);
ret = blk_truncate(blk, filesize);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not truncate file");
}
@@ -1710,18 +1728,18 @@ static int vmdk_create_extent(const char *filename, int64_t filesize,
header.check_bytes[3] = 0xa;
/* write all the data */
ret = bdrv_pwrite(bs, 0, &magic, sizeof(magic));
ret = blk_pwrite(blk, 0, &magic, sizeof(magic));
if (ret < 0) {
error_setg(errp, QERR_IO_ERROR);
goto exit;
}
ret = bdrv_pwrite(bs, sizeof(magic), &header, sizeof(header));
ret = blk_pwrite(blk, sizeof(magic), &header, sizeof(header));
if (ret < 0) {
error_setg(errp, QERR_IO_ERROR);
goto exit;
}
ret = bdrv_truncate(bs, le64_to_cpu(header.grain_offset) << 9);
ret = blk_truncate(blk, le64_to_cpu(header.grain_offset) << 9);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not truncate file");
goto exit;
@@ -1734,8 +1752,8 @@ static int vmdk_create_extent(const char *filename, int64_t filesize,
i < gt_count; i++, tmp += gt_size) {
gd_buf[i] = cpu_to_le32(tmp);
}
ret = bdrv_pwrite(bs, le64_to_cpu(header.rgd_offset) * BDRV_SECTOR_SIZE,
gd_buf, gd_buf_size);
ret = blk_pwrite(blk, le64_to_cpu(header.rgd_offset) * BDRV_SECTOR_SIZE,
gd_buf, gd_buf_size);
if (ret < 0) {
error_setg(errp, QERR_IO_ERROR);
goto exit;
@@ -1746,8 +1764,8 @@ static int vmdk_create_extent(const char *filename, int64_t filesize,
i < gt_count; i++, tmp += gt_size) {
gd_buf[i] = cpu_to_le32(tmp);
}
ret = bdrv_pwrite(bs, le64_to_cpu(header.gd_offset) * BDRV_SECTOR_SIZE,
gd_buf, gd_buf_size);
ret = blk_pwrite(blk, le64_to_cpu(header.gd_offset) * BDRV_SECTOR_SIZE,
gd_buf, gd_buf_size);
if (ret < 0) {
error_setg(errp, QERR_IO_ERROR);
goto exit;
@@ -1755,8 +1773,8 @@ static int vmdk_create_extent(const char *filename, int64_t filesize,
ret = 0;
exit:
if (bs) {
bdrv_unref(bs);
if (blk) {
blk_unref(blk);
}
g_free(gd_buf);
return ret;
@@ -1805,7 +1823,7 @@ static int filename_decompose(const char *filename, char *path, char *prefix,
static int vmdk_create(const char *filename, QemuOpts *opts, Error **errp)
{
int idx = 0;
BlockDriverState *new_bs = NULL;
BlockBackend *new_blk = NULL;
Error *local_err = NULL;
char *desc = NULL;
int64_t total_size = 0, filesize;
@@ -1916,7 +1934,7 @@ static int vmdk_create(const char *filename, QemuOpts *opts, Error **errp)
goto exit;
}
if (backing_file) {
BlockDriverState *bs = NULL;
BlockBackend *blk;
char *full_backing = g_new0(char, PATH_MAX);
bdrv_get_full_backing_filename_from_filename(filename, backing_file,
full_backing, PATH_MAX,
@@ -1927,18 +1945,21 @@ static int vmdk_create(const char *filename, QemuOpts *opts, Error **errp)
ret = -ENOENT;
goto exit;
}
ret = bdrv_open(&bs, full_backing, NULL, NULL, BDRV_O_NO_BACKING, errp);
blk = blk_new_open(full_backing, NULL, NULL,
BDRV_O_NO_BACKING | BDRV_O_CACHE_WB, errp);
g_free(full_backing);
if (ret != 0) {
if (blk == NULL) {
ret = -EIO;
goto exit;
}
if (strcmp(bs->drv->format_name, "vmdk")) {
bdrv_unref(bs);
if (strcmp(blk_bs(blk)->drv->format_name, "vmdk")) {
blk_unref(blk);
ret = -EINVAL;
goto exit;
}
parent_cid = vmdk_read_cid(bs, 0);
bdrv_unref(bs);
parent_cid = vmdk_read_cid(blk_bs(blk), 0);
blk_unref(blk);
snprintf(parent_desc_line, BUF_SIZE,
"parentFileNameHint=\"%s\"", backing_file);
}
@@ -1996,14 +2017,19 @@ static int vmdk_create(const char *filename, QemuOpts *opts, Error **errp)
goto exit;
}
}
assert(new_bs == NULL);
ret = bdrv_open(&new_bs, filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_PROTOCOL, &local_err);
if (ret < 0) {
new_blk = blk_new_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_PROTOCOL,
&local_err);
if (new_blk == NULL) {
error_propagate(errp, local_err);
ret = -EIO;
goto exit;
}
ret = bdrv_pwrite(new_bs, desc_offset, desc, desc_len);
blk_set_allow_write_beyond_eof(new_blk, true);
ret = blk_pwrite(new_blk, desc_offset, desc, desc_len);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not write description");
goto exit;
@@ -2011,14 +2037,14 @@ static int vmdk_create(const char *filename, QemuOpts *opts, Error **errp)
/* bdrv_pwrite write padding zeros to align to sector, we don't need that
* for description file */
if (desc_offset == 0) {
ret = bdrv_truncate(new_bs, desc_len);
ret = blk_truncate(new_blk, desc_len);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not truncate file");
}
}
exit:
if (new_bs) {
bdrv_unref(new_bs);
if (new_blk) {
blk_unref(new_blk);
}
g_free(adapter_type);
g_free(backing_file);
@@ -2177,18 +2203,18 @@ static ImageInfoSpecific *vmdk_get_specific_info(BlockDriverState *bs)
*spec_info = (ImageInfoSpecific){
.type = IMAGE_INFO_SPECIFIC_KIND_VMDK,
{
.vmdk = g_new0(ImageInfoSpecificVmdk, 1),
.u = {
.vmdk.data = g_new0(ImageInfoSpecificVmdk, 1),
},
};
*spec_info->u.vmdk = (ImageInfoSpecificVmdk) {
*spec_info->u.vmdk.data = (ImageInfoSpecificVmdk) {
.create_type = g_strdup(s->create_type),
.cid = s->cid,
.parent_cid = s->parent_cid,
};
next = &spec_info->u.vmdk->extents;
next = &spec_info->u.vmdk.data->extents;
for (i = 0; i < s->num_extents; i++) {
*next = g_new0(ImageInfoList, 1);
(*next)->value = vmdk_get_extent_info(&s->extents[i]);

View File

@@ -25,6 +25,7 @@
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "block/block_int.h"
#include "sysemu/block-backend.h"
#include "qemu/module.h"
#include "migration/migration.h"
#if defined(CONFIG_UUID)
@@ -46,8 +47,14 @@ enum vhd_type {
// Seconds since Jan 1, 2000 0:00:00 (UTC)
#define VHD_TIMESTAMP_BASE 946684800
#define VHD_CHS_MAX_C 65535LL
#define VHD_CHS_MAX_H 16
#define VHD_CHS_MAX_S 255
#define VHD_MAX_SECTORS (65535LL * 255 * 255)
#define VHD_MAX_GEOMETRY (65535LL * 16 * 255)
#define VHD_MAX_GEOMETRY (VHD_CHS_MAX_C * VHD_CHS_MAX_H * VHD_CHS_MAX_S)
#define VPC_OPT_FORCE_SIZE "force_size"
// always big-endian
typedef struct vhd_footer {
@@ -128,6 +135,8 @@ typedef struct BDRVVPCState {
uint32_t block_size;
uint32_t bitmap_size;
bool force_use_chs;
bool force_use_sz;
#ifdef CACHE
uint8_t *pageentry_u8;
@@ -140,6 +149,22 @@ typedef struct BDRVVPCState {
Error *migration_blocker;
} BDRVVPCState;
#define VPC_OPT_SIZE_CALC "force_size_calc"
static QemuOptsList vpc_runtime_opts = {
.name = "vpc-runtime-opts",
.head = QTAILQ_HEAD_INITIALIZER(vpc_runtime_opts.head),
.desc = {
{
.name = VPC_OPT_SIZE_CALC,
.type = QEMU_OPT_STRING,
.help = "Force disk size calculation to use either CHS geometry, "
"or use the disk current_size specified in the VHD footer. "
"{chs, current_size}"
},
{ /* end of list */ }
}
};
static uint32_t vpc_checksum(uint8_t* buf, size_t size)
{
uint32_t res = 0;
@@ -159,6 +184,25 @@ static int vpc_probe(const uint8_t *buf, int buf_size, const char *filename)
return 0;
}
static void vpc_parse_options(BlockDriverState *bs, QemuOpts *opts,
Error **errp)
{
BDRVVPCState *s = bs->opaque;
const char *size_calc;
size_calc = qemu_opt_get(opts, VPC_OPT_SIZE_CALC);
if (!size_calc) {
/* no override, use autodetect only */
} else if (!strcmp(size_calc, "current_size")) {
s->force_use_sz = true;
} else if (!strcmp(size_calc, "chs")) {
s->force_use_chs = true;
} else {
error_setg(errp, "Invalid size calculation mode: '%s'", size_calc);
}
}
static int vpc_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
@@ -166,6 +210,9 @@ static int vpc_open(BlockDriverState *bs, QDict *options, int flags,
int i;
VHDFooter *footer;
VHDDynDiskHeader *dyndisk_header;
QemuOpts *opts = NULL;
Error *local_err = NULL;
bool use_chs;
uint8_t buf[HEADER_SIZE];
uint32_t checksum;
uint64_t computed_size;
@@ -173,6 +220,21 @@ static int vpc_open(BlockDriverState *bs, QDict *options, int flags,
int disk_type = VHD_DYNAMIC;
int ret;
opts = qemu_opts_create(&vpc_runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
vpc_parse_options(bs, opts, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
ret = bdrv_pread(bs->file->bs, 0, s->footer_buf, HEADER_SIZE);
if (ret < 0) {
goto fail;
@@ -218,12 +280,36 @@ static int vpc_open(BlockDriverState *bs, QDict *options, int flags,
bs->total_sectors = (int64_t)
be16_to_cpu(footer->cyls) * footer->heads * footer->secs_per_cyl;
/* Images that have exactly the maximum geometry are probably bigger and
* would be truncated if we adhered to the geometry for them. Rely on
* footer->current_size for them. */
if (bs->total_sectors == VHD_MAX_GEOMETRY) {
/* Microsoft Virtual PC and Microsoft Hyper-V produce and read
* VHD image sizes differently. VPC will rely on CHS geometry,
* while Hyper-V and disk2vhd use the size specified in the footer.
*
* We use a couple of approaches to try and determine the correct method:
* look at the Creator App field, and look for images that have CHS
* geometry that is the maximum value.
*
* If the CHS geometry is the maximum CHS geometry, then we assume that
* the size is the footer->current_size to avoid truncation. Otherwise,
* we follow the table based on footer->creator_app:
*
* Known creator apps:
* 'vpc ' : CHS Virtual PC (uses disk geometry)
* 'qemu' : CHS QEMU (uses disk geometry)
* 'qem2' : current_size QEMU (uses current_size)
* 'win ' : current_size Hyper-V
* 'd2v ' : current_size Disk2vhd
*
* The user can override the table values via drive options, however
* even with an override we will still use current_size for images
* that have CHS geometry of the maximum size.
*/
use_chs = (!!strncmp(footer->creator_app, "win ", 4) &&
!!strncmp(footer->creator_app, "qem2", 4) &&
!!strncmp(footer->creator_app, "d2v ", 4)) || s->force_use_chs;
if (!use_chs || bs->total_sectors == VHD_MAX_GEOMETRY || s->force_use_sz) {
bs->total_sectors = be64_to_cpu(footer->current_size) /
BDRV_SECTOR_SIZE;
BDRV_SECTOR_SIZE;
}
/* Allow a maximum disk size of approximately 2 TB */
@@ -579,7 +665,7 @@ static coroutine_fn int vpc_co_write(BlockDriverState *bs, int64_t sector_num,
}
static int64_t coroutine_fn vpc_co_get_block_status(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int *pnum)
int64_t sector_num, int nb_sectors, int *pnum, BlockDriverState **file)
{
BDRVVPCState *s = bs->opaque;
VHDFooter *footer = (VHDFooter*) s->footer_buf;
@@ -589,6 +675,7 @@ static int64_t coroutine_fn vpc_co_get_block_status(BlockDriverState *bs,
if (be32_to_cpu(footer->type) == VHD_FIXED) {
*pnum = nb_sectors;
*file = bs->file->bs;
return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_DATA |
(sector_num << BDRV_SECTOR_BITS);
}
@@ -610,6 +697,7 @@ static int64_t coroutine_fn vpc_co_get_block_status(BlockDriverState *bs,
/* *pnum can't be greater than one block for allocated
* sectors since there is always a bitmap in between. */
if (allocated) {
*file = bs->file->bs;
return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | start;
}
if (nb_sectors == 0) {
@@ -671,7 +759,7 @@ static int calculate_geometry(int64_t total_sectors, uint16_t* cyls,
return 0;
}
static int create_dynamic_disk(BlockDriverState *bs, uint8_t *buf,
static int create_dynamic_disk(BlockBackend *blk, uint8_t *buf,
int64_t total_sectors)
{
VHDDynDiskHeader *dyndisk_header =
@@ -685,13 +773,13 @@ static int create_dynamic_disk(BlockDriverState *bs, uint8_t *buf,
block_size = 0x200000;
num_bat_entries = (total_sectors + block_size / 512) / (block_size / 512);
ret = bdrv_pwrite_sync(bs, offset, buf, HEADER_SIZE);
ret = blk_pwrite(blk, offset, buf, HEADER_SIZE);
if (ret) {
goto fail;
}
offset = 1536 + ((num_bat_entries * 4 + 511) & ~511);
ret = bdrv_pwrite_sync(bs, offset, buf, HEADER_SIZE);
ret = blk_pwrite(blk, offset, buf, HEADER_SIZE);
if (ret < 0) {
goto fail;
}
@@ -701,7 +789,7 @@ static int create_dynamic_disk(BlockDriverState *bs, uint8_t *buf,
memset(buf, 0xFF, 512);
for (i = 0; i < (num_bat_entries * 4 + 511) / 512; i++) {
ret = bdrv_pwrite_sync(bs, offset, buf, 512);
ret = blk_pwrite(blk, offset, buf, 512);
if (ret < 0) {
goto fail;
}
@@ -728,7 +816,7 @@ static int create_dynamic_disk(BlockDriverState *bs, uint8_t *buf,
// Write the header
offset = 512;
ret = bdrv_pwrite_sync(bs, offset, buf, 1024);
ret = blk_pwrite(blk, offset, buf, 1024);
if (ret < 0) {
goto fail;
}
@@ -737,7 +825,7 @@ static int create_dynamic_disk(BlockDriverState *bs, uint8_t *buf,
return ret;
}
static int create_fixed_disk(BlockDriverState *bs, uint8_t *buf,
static int create_fixed_disk(BlockBackend *blk, uint8_t *buf,
int64_t total_size)
{
int ret;
@@ -745,12 +833,12 @@ static int create_fixed_disk(BlockDriverState *bs, uint8_t *buf,
/* Add footer to total size */
total_size += HEADER_SIZE;
ret = bdrv_truncate(bs, total_size);
ret = blk_truncate(blk, total_size);
if (ret < 0) {
return ret;
}
ret = bdrv_pwrite_sync(bs, total_size - HEADER_SIZE, buf, HEADER_SIZE);
ret = blk_pwrite(blk, total_size - HEADER_SIZE, buf, HEADER_SIZE);
if (ret < 0) {
return ret;
}
@@ -771,8 +859,9 @@ static int vpc_create(const char *filename, QemuOpts *opts, Error **errp)
int64_t total_size;
int disk_type;
int ret = -EIO;
bool force_size;
Error *local_err = NULL;
BlockDriverState *bs = NULL;
BlockBackend *blk = NULL;
/* Read out options */
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
@@ -791,30 +880,44 @@ static int vpc_create(const char *filename, QemuOpts *opts, Error **errp)
disk_type = VHD_DYNAMIC;
}
force_size = qemu_opt_get_bool_del(opts, VPC_OPT_FORCE_SIZE, false);
ret = bdrv_create_file(filename, opts, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto out;
}
ret = bdrv_open(&bs, filename, NULL, NULL, BDRV_O_RDWR | BDRV_O_PROTOCOL,
&local_err);
if (ret < 0) {
blk = blk_new_open(filename, NULL, NULL,
BDRV_O_RDWR | BDRV_O_CACHE_WB | BDRV_O_PROTOCOL,
&local_err);
if (blk == NULL) {
error_propagate(errp, local_err);
ret = -EIO;
goto out;
}
blk_set_allow_write_beyond_eof(blk, true);
/*
* Calculate matching total_size and geometry. Increase the number of
* sectors requested until we get enough (or fail). This ensures that
* qemu-img convert doesn't truncate images, but rather rounds up.
*
* If the image size can't be represented by a spec conform CHS geometry,
* If the image size can't be represented by a spec conformant CHS geometry,
* we set the geometry to 65535 x 16 x 255 (CxHxS) sectors and use
* the image size from the VHD footer to calculate total_sectors.
*/
total_sectors = MIN(VHD_MAX_GEOMETRY, total_size / BDRV_SECTOR_SIZE);
for (i = 0; total_sectors > (int64_t)cyls * heads * secs_per_cyl; i++) {
calculate_geometry(total_sectors + i, &cyls, &heads, &secs_per_cyl);
if (force_size) {
/* This will force the use of total_size for sector count, below */
cyls = VHD_CHS_MAX_C;
heads = VHD_CHS_MAX_H;
secs_per_cyl = VHD_CHS_MAX_S;
} else {
total_sectors = MIN(VHD_MAX_GEOMETRY, total_size / BDRV_SECTOR_SIZE);
for (i = 0; total_sectors > (int64_t)cyls * heads * secs_per_cyl; i++) {
calculate_geometry(total_sectors + i, &cyls, &heads, &secs_per_cyl);
}
}
if ((int64_t)cyls * heads * secs_per_cyl == VHD_MAX_GEOMETRY) {
@@ -833,8 +936,11 @@ static int vpc_create(const char *filename, QemuOpts *opts, Error **errp)
memset(buf, 0, 1024);
memcpy(footer->creator, "conectix", 8);
/* TODO Check if "qemu" creator_app is ok for VPC */
memcpy(footer->creator_app, "qemu", 4);
if (force_size) {
memcpy(footer->creator_app, "qem2", 4);
} else {
memcpy(footer->creator_app, "qemu", 4);
}
memcpy(footer->creator_os, "Wi2k", 4);
footer->features = cpu_to_be32(0x02);
@@ -864,13 +970,13 @@ static int vpc_create(const char *filename, QemuOpts *opts, Error **errp)
footer->checksum = cpu_to_be32(vpc_checksum(buf, HEADER_SIZE));
if (disk_type == VHD_DYNAMIC) {
ret = create_dynamic_disk(bs, buf, total_sectors);
ret = create_dynamic_disk(blk, buf, total_sectors);
} else {
ret = create_fixed_disk(bs, buf, total_size);
ret = create_fixed_disk(blk, buf, total_size);
}
out:
bdrv_unref(bs);
blk_unref(blk);
g_free(disk_type_param);
return ret;
}
@@ -915,6 +1021,13 @@ static QemuOptsList vpc_create_opts = {
"Type of virtual hard disk format. Supported formats are "
"{dynamic (default) | fixed} "
},
{
.name = VPC_OPT_FORCE_SIZE,
.type = QEMU_OPT_BOOL,
.help = "Force disk size calculation to use the actual size "
"specified, rather than using the nearest CHS-based "
"calculation"
},
{ /* end of list */ }
}
};

View File

@@ -2884,7 +2884,7 @@ static coroutine_fn int vvfat_co_write(BlockDriverState *bs, int64_t sector_num,
}
static int64_t coroutine_fn vvfat_co_get_block_status(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, int* n)
int64_t sector_num, int nb_sectors, int *n, BlockDriverState **file)
{
BDRVVVFATState* s = bs->opaque;
*n = s->sector_count - sector_num;

View File

@@ -9,6 +9,7 @@
* later. See the COPYING file in the top-level directory.
*/
#include "qemu/osdep.h"
#include "sysemu/blockdev.h"
#include "sysemu/block-backend.h"
#include "hw/block/block.h"
@@ -17,57 +18,128 @@
#include "qmp-commands.h"
#include "trace.h"
#include "block/nbd.h"
#include "qemu/sockets.h"
#include "io/channel-socket.h"
static int server_fd = -1;
typedef struct NBDServerData {
QIOChannelSocket *listen_ioc;
int watch;
QCryptoTLSCreds *tlscreds;
} NBDServerData;
static void nbd_accept(void *opaque)
static NBDServerData *nbd_server;
static gboolean nbd_accept(QIOChannel *ioc, GIOCondition condition,
gpointer opaque)
{
struct sockaddr_in addr;
socklen_t addr_len = sizeof(addr);
QIOChannelSocket *cioc;
int fd = accept(server_fd, (struct sockaddr *)&addr, &addr_len);
if (fd >= 0) {
nbd_client_new(NULL, fd, nbd_client_put);
if (!nbd_server) {
return FALSE;
}
cioc = qio_channel_socket_accept(QIO_CHANNEL_SOCKET(ioc),
NULL);
if (!cioc) {
return TRUE;
}
nbd_client_new(NULL, cioc,
nbd_server->tlscreds, NULL,
nbd_client_put);
object_unref(OBJECT(cioc));
return TRUE;
}
void qmp_nbd_server_start(SocketAddress *addr, Error **errp)
static void nbd_server_free(NBDServerData *server)
{
if (server_fd != -1) {
if (!server) {
return;
}
if (server->watch != -1) {
g_source_remove(server->watch);
}
object_unref(OBJECT(server->listen_ioc));
if (server->tlscreds) {
object_unref(OBJECT(server->tlscreds));
}
g_free(server);
}
static QCryptoTLSCreds *nbd_get_tls_creds(const char *id, Error **errp)
{
Object *obj;
QCryptoTLSCreds *creds;
obj = object_resolve_path_component(
object_get_objects_root(), id);
if (!obj) {
error_setg(errp, "No TLS credentials with id '%s'",
id);
return NULL;
}
creds = (QCryptoTLSCreds *)
object_dynamic_cast(obj, TYPE_QCRYPTO_TLS_CREDS);
if (!creds) {
error_setg(errp, "Object with id '%s' is not TLS credentials",
id);
return NULL;
}
if (creds->endpoint != QCRYPTO_TLS_CREDS_ENDPOINT_SERVER) {
error_setg(errp,
"Expecting TLS credentials with a server endpoint");
return NULL;
}
object_ref(obj);
return creds;
}
void qmp_nbd_server_start(SocketAddress *addr,
bool has_tls_creds, const char *tls_creds,
Error **errp)
{
if (nbd_server) {
error_setg(errp, "NBD server already running");
return;
}
server_fd = socket_listen(addr, errp);
if (server_fd != -1) {
qemu_set_fd_handler(server_fd, nbd_accept, NULL, NULL);
nbd_server = g_new0(NBDServerData, 1);
nbd_server->watch = -1;
nbd_server->listen_ioc = qio_channel_socket_new();
if (qio_channel_socket_listen_sync(
nbd_server->listen_ioc, addr, errp) < 0) {
goto error;
}
}
/*
* Hook into the BlockBackend notifiers to close the export when the
* backend is closed.
*/
typedef struct NBDCloseNotifier {
Notifier n;
NBDExport *exp;
QTAILQ_ENTRY(NBDCloseNotifier) next;
} NBDCloseNotifier;
if (has_tls_creds) {
nbd_server->tlscreds = nbd_get_tls_creds(tls_creds, errp);
if (!nbd_server->tlscreds) {
goto error;
}
static QTAILQ_HEAD(, NBDCloseNotifier) close_notifiers =
QTAILQ_HEAD_INITIALIZER(close_notifiers);
if (addr->type != SOCKET_ADDRESS_KIND_INET) {
error_setg(errp, "TLS is only supported with IPv4/IPv6");
goto error;
}
}
static void nbd_close_notifier(Notifier *n, void *data)
{
NBDCloseNotifier *cn = DO_UPCAST(NBDCloseNotifier, n, n);
nbd_server->watch = qio_channel_add_watch(
QIO_CHANNEL(nbd_server->listen_ioc),
G_IO_IN,
nbd_accept,
NULL,
NULL);
notifier_remove(&cn->n);
QTAILQ_REMOVE(&close_notifiers, cn, next);
return;
nbd_export_close(cn->exp);
nbd_export_put(cn->exp);
g_free(cn);
error:
nbd_server_free(nbd_server);
nbd_server = NULL;
}
void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
@@ -75,9 +147,8 @@ void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
{
BlockBackend *blk;
NBDExport *exp;
NBDCloseNotifier *n;
if (server_fd == -1) {
if (!nbd_server) {
error_setg(errp, "NBD server not running");
return;
}
@@ -113,23 +184,16 @@ void qmp_nbd_server_add(const char *device, bool has_writable, bool writable,
nbd_export_set_name(exp, device);
n = g_new0(NBDCloseNotifier, 1);
n->n.notify = nbd_close_notifier;
n->exp = exp;
blk_add_close_notifier(blk, &n->n);
QTAILQ_INSERT_TAIL(&close_notifiers, n, next);
/* The list of named exports has a strong reference to this export now and
* our only way of accessing it is through nbd_export_find(), so we can drop
* the strong reference that is @exp. */
nbd_export_put(exp);
}
void qmp_nbd_server_stop(Error **errp)
{
while (!QTAILQ_EMPTY(&close_notifiers)) {
NBDCloseNotifier *cn = QTAILQ_FIRST(&close_notifiers);
nbd_close_notifier(&cn->n, nbd_export_get_blockdev(cn->exp));
}
nbd_export_close_all();
if (server_fd != -1) {
qemu_set_fd_handler(server_fd, NULL, NULL, NULL);
close(server_fd);
server_fd = -1;
}
nbd_server_free(nbd_server);
nbd_server = NULL;
}

View File

@@ -30,6 +30,7 @@
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "sysemu/block-backend.h"
#include "sysemu/blockdev.h"
#include "hw/block/block.h"
@@ -50,6 +51,9 @@
#include "trace.h"
#include "sysemu/arch_init.h"
static QTAILQ_HEAD(, BlockDriverState) monitor_bdrv_states =
QTAILQ_HEAD_INITIALIZER(monitor_bdrv_states);
static const char *const if_name[IF_COUNT] = {
[IF_NONE] = "none",
[IF_IDE] = "ide",
@@ -143,6 +147,7 @@ void blockdev_auto_del(BlockBackend *blk)
DriveInfo *dinfo = blk_legacy_dinfo(blk);
if (dinfo && dinfo->auto_del) {
monitor_remove_blk(blk);
blk_unref(blk);
}
}
@@ -339,29 +344,6 @@ static bool parse_stats_intervals(BlockAcctStats *stats, QList *intervals,
return true;
}
static bool check_throttle_config(ThrottleConfig *cfg, Error **errp)
{
if (throttle_conflicting(cfg)) {
error_setg(errp, "bps/iops/max total values and read/write values"
" cannot be used at the same time");
return false;
}
if (!throttle_is_valid(cfg)) {
error_setg(errp, "bps/iops/max values must be within [0, %lld]",
THROTTLE_VALUE_MAX);
return false;
}
if (throttle_max_is_missing_limit(cfg)) {
error_setg(errp, "bps_max/iops_max require corresponding"
" bps/iops values");
return false;
}
return true;
}
typedef enum { MEDIA_DISK, MEDIA_CDROM } DriveMediaType;
/* All parameters but @opts are optional and may be set to NULL. */
@@ -406,7 +388,7 @@ static void extract_common_blockdev_options(QemuOpts *opts, int *bdrv_flags,
}
if (throttle_cfg) {
memset(throttle_cfg, 0, sizeof(*throttle_cfg));
throttle_config_init(throttle_cfg);
throttle_cfg->buckets[THROTTLE_BPS_TOTAL].avg =
qemu_opt_get_number(opts, "throttling.bps-total", 0);
throttle_cfg->buckets[THROTTLE_BPS_READ].avg =
@@ -433,10 +415,23 @@ static void extract_common_blockdev_options(QemuOpts *opts, int *bdrv_flags,
throttle_cfg->buckets[THROTTLE_OPS_WRITE].max =
qemu_opt_get_number(opts, "throttling.iops-write-max", 0);
throttle_cfg->buckets[THROTTLE_BPS_TOTAL].burst_length =
qemu_opt_get_number(opts, "throttling.bps-total-max-length", 1);
throttle_cfg->buckets[THROTTLE_BPS_READ].burst_length =
qemu_opt_get_number(opts, "throttling.bps-read-max-length", 1);
throttle_cfg->buckets[THROTTLE_BPS_WRITE].burst_length =
qemu_opt_get_number(opts, "throttling.bps-write-max-length", 1);
throttle_cfg->buckets[THROTTLE_OPS_TOTAL].burst_length =
qemu_opt_get_number(opts, "throttling.iops-total-max-length", 1);
throttle_cfg->buckets[THROTTLE_OPS_READ].burst_length =
qemu_opt_get_number(opts, "throttling.iops-read-max-length", 1);
throttle_cfg->buckets[THROTTLE_OPS_WRITE].burst_length =
qemu_opt_get_number(opts, "throttling.iops-write-max-length", 1);
throttle_cfg->op_size =
qemu_opt_get_number(opts, "throttling.iops-size", 0);
if (!check_throttle_config(throttle_cfg, errp)) {
if (!throttle_is_valid(throttle_cfg, errp)) {
return;
}
}
@@ -567,7 +562,7 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
if ((!file || !*file) && !qdict_size(bs_opts)) {
BlockBackendRootState *blk_rs;
blk = blk_new(qemu_opts_id(opts), errp);
blk = blk_new(errp);
if (!blk) {
goto early_err;
}
@@ -599,15 +594,11 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
qdict_set_default_str(bs_opts, BDRV_OPT_CACHE_DIRECT, "off");
qdict_set_default_str(bs_opts, BDRV_OPT_CACHE_NO_FLUSH, "off");
if (snapshot) {
/* always use cache=unsafe with snapshot */
qdict_put(bs_opts, BDRV_OPT_CACHE_WB, qstring_from_str("on"));
qdict_put(bs_opts, BDRV_OPT_CACHE_DIRECT, qstring_from_str("off"));
qdict_put(bs_opts, BDRV_OPT_CACHE_NO_FLUSH, qstring_from_str("on"));
if (runstate_check(RUN_STATE_INMIGRATE)) {
bdrv_flags |= BDRV_O_INACTIVE;
}
blk = blk_new_open(qemu_opts_id(opts), file, NULL, bs_opts, bdrv_flags,
errp);
blk = blk_new_open(file, NULL, bs_opts, bdrv_flags, errp);
if (!blk) {
goto err_no_bs_opts;
}
@@ -639,6 +630,12 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
blk_set_on_error(blk, on_read_error, on_write_error);
if (!monitor_add_blk(blk, qemu_opts_id(opts), errp)) {
blk_unref(blk);
blk = NULL;
goto err_no_bs_opts;
}
err_no_bs_opts:
qemu_opts_del(opts);
QDECREF(interval_dict);
@@ -684,6 +681,17 @@ static BlockDriverState *bds_tree_init(QDict *bs_opts, Error **errp)
goto fail;
}
/* bdrv_open() defaults to the values in bdrv_flags (for compatibility
* with other callers) rather than what we want as the real defaults.
* Apply the defaults here instead. */
qdict_set_default_str(bs_opts, BDRV_OPT_CACHE_WB, "on");
qdict_set_default_str(bs_opts, BDRV_OPT_CACHE_DIRECT, "off");
qdict_set_default_str(bs_opts, BDRV_OPT_CACHE_NO_FLUSH, "off");
if (runstate_check(RUN_STATE_INMIGRATE)) {
bdrv_flags |= BDRV_O_INACTIVE;
}
bs = NULL;
ret = bdrv_open(&bs, NULL, NULL, bs_opts, bdrv_flags, errp);
if (ret < 0) {
@@ -702,6 +710,26 @@ fail:
return NULL;
}
void blockdev_close_all_bdrv_states(void)
{
BlockDriverState *bs, *next_bs;
QTAILQ_FOREACH_SAFE(bs, &monitor_bdrv_states, monitor_list, next_bs) {
AioContext *ctx = bdrv_get_aio_context(bs);
aio_context_acquire(ctx);
bdrv_unref(bs);
aio_context_release(ctx);
}
}
/* Iterates over the list of monitor-owned BlockDriverStates */
BlockDriverState *bdrv_next_monitor_owned(BlockDriverState *bs)
{
return bs ? QTAILQ_NEXT(bs, monitor_list)
: QTAILQ_FIRST(&monitor_bdrv_states);
}
static void qemu_opt_rename(QemuOpts *opts, const char *from, const char *to,
Error **errp)
{
@@ -1158,7 +1186,7 @@ void hmp_commit(Monitor *mon, const QDict *qdict)
int ret;
if (!strcmp(device, "all")) {
ret = bdrv_commit_all();
ret = blk_commit_all();
} else {
BlockDriverState *bs;
AioContext *aio_context;
@@ -1187,15 +1215,11 @@ void hmp_commit(Monitor *mon, const QDict *qdict)
}
}
static void blockdev_do_action(TransactionActionKind type, void *data,
Error **errp)
static void blockdev_do_action(TransactionAction *action, Error **errp)
{
TransactionAction action;
TransactionActionList list;
action.type = type;
action.u.data = data;
list.value = &action;
list.value = action;
list.next = NULL;
qmp_transaction(&list, false, NULL, errp);
}
@@ -1221,8 +1245,11 @@ void qmp_blockdev_snapshot_sync(bool has_device, const char *device,
.has_mode = has_mode,
.mode = mode,
};
blockdev_do_action(TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC,
&snapshot, errp);
TransactionAction action = {
.type = TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC,
.u.blockdev_snapshot_sync.data = &snapshot,
};
blockdev_do_action(&action, errp);
}
void qmp_blockdev_snapshot(const char *node, const char *overlay,
@@ -1232,9 +1259,11 @@ void qmp_blockdev_snapshot(const char *node, const char *overlay,
.node = (char *) node,
.overlay = (char *) overlay
};
blockdev_do_action(TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT,
&snapshot_data, errp);
TransactionAction action = {
.type = TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT,
.u.blockdev_snapshot.data = &snapshot_data,
};
blockdev_do_action(&action, errp);
}
void qmp_blockdev_snapshot_internal_sync(const char *device,
@@ -1245,9 +1274,11 @@ void qmp_blockdev_snapshot_internal_sync(const char *device,
.device = (char *) device,
.name = (char *) name
};
blockdev_do_action(TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_INTERNAL_SYNC,
&snapshot, errp);
TransactionAction action = {
.type = TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_INTERNAL_SYNC,
.u.blockdev_snapshot_internal_sync.data = &snapshot,
};
blockdev_do_action(&action, errp);
}
SnapshotInfo *qmp_blockdev_snapshot_delete_internal_sync(const char *device,
@@ -1484,7 +1515,7 @@ static void internal_snapshot_prepare(BlkActionState *common,
g_assert(common->action->type ==
TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_INTERNAL_SYNC);
internal = common->action->u.blockdev_snapshot_internal_sync;
internal = common->action->u.blockdev_snapshot_internal_sync.data;
state = DO_UPCAST(InternalSnapshotState, common, common);
/* 1. parse input */
@@ -1634,7 +1665,7 @@ static void external_snapshot_prepare(BlkActionState *common,
switch (action->type) {
case TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT:
{
BlockdevSnapshot *s = action->u.blockdev_snapshot;
BlockdevSnapshot *s = action->u.blockdev_snapshot.data;
device = s->node;
node_name = s->node;
new_image_file = NULL;
@@ -1643,7 +1674,7 @@ static void external_snapshot_prepare(BlkActionState *common,
break;
case TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC:
{
BlockdevSnapshotSync *s = action->u.blockdev_snapshot_sync;
BlockdevSnapshotSync *s = action->u.blockdev_snapshot_sync.data;
device = s->has_device ? s->device : NULL;
node_name = s->has_node_name ? s->node_name : NULL;
new_image_file = s->snapshot_file;
@@ -1692,7 +1723,7 @@ static void external_snapshot_prepare(BlkActionState *common,
}
if (action->type == TRANSACTION_ACTION_KIND_BLOCKDEV_SNAPSHOT_SYNC) {
BlockdevSnapshotSync *s = action->u.blockdev_snapshot_sync;
BlockdevSnapshotSync *s = action->u.blockdev_snapshot_sync.data;
const char *format = s->has_format ? s->format : "qcow2";
enum NewImageMode mode;
const char *snapshot_node_name =
@@ -1714,10 +1745,15 @@ static void external_snapshot_prepare(BlkActionState *common,
/* create new image w/backing file */
mode = s->has_mode ? s->mode : NEW_IMAGE_MODE_ABSOLUTE_PATHS;
if (mode != NEW_IMAGE_MODE_EXISTING) {
int64_t size = bdrv_getlength(state->old_bs);
if (size < 0) {
error_setg_errno(errp, -size, "bdrv_getlength failed");
return;
}
bdrv_img_create(new_image_file, format,
state->old_bs->filename,
state->old_bs->drv->format_name,
NULL, -1, flags, &local_err, false);
NULL, size, flags, &local_err, false);
if (local_err) {
error_propagate(errp, local_err);
return;
@@ -1825,7 +1861,7 @@ static void drive_backup_prepare(BlkActionState *common, Error **errp)
Error *local_err = NULL;
assert(common->action->type == TRANSACTION_ACTION_KIND_DRIVE_BACKUP);
backup = common->action->u.drive_backup;
backup = common->action->u.drive_backup.data;
blk = blk_by_name(backup->device);
if (!blk) {
@@ -1907,7 +1943,7 @@ static void blockdev_backup_prepare(BlkActionState *common, Error **errp)
Error *local_err = NULL;
assert(common->action->type == TRANSACTION_ACTION_KIND_BLOCKDEV_BACKUP);
backup = common->action->u.blockdev_backup;
backup = common->action->u.blockdev_backup.data;
blk = blk_by_name(backup->device);
if (!blk) {
@@ -1993,7 +2029,7 @@ static void block_dirty_bitmap_add_prepare(BlkActionState *common,
return;
}
action = common->action->u.block_dirty_bitmap_add;
action = common->action->u.block_dirty_bitmap_add.data;
/* AIO context taken and released within qmp_block_dirty_bitmap_add */
qmp_block_dirty_bitmap_add(action->node, action->name,
action->has_granularity, action->granularity,
@@ -2012,7 +2048,7 @@ static void block_dirty_bitmap_add_abort(BlkActionState *common)
BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
common, common);
action = common->action->u.block_dirty_bitmap_add;
action = common->action->u.block_dirty_bitmap_add.data;
/* Should not be able to fail: IF the bitmap was added via .prepare(),
* then the node reference and bitmap name must have been valid.
*/
@@ -2032,7 +2068,7 @@ static void block_dirty_bitmap_clear_prepare(BlkActionState *common,
return;
}
action = common->action->u.block_dirty_bitmap_clear;
action = common->action->u.block_dirty_bitmap_clear.data;
state->bitmap = block_dirty_bitmap_lookup(action->node,
action->name,
&state->bs,
@@ -2304,6 +2340,11 @@ void qmp_blockdev_open_tray(const char *device, bool has_force, bool force,
return;
}
if (!blk_dev_has_tray(blk)) {
/* Ignore this command on tray-less devices */
return;
}
if (blk_dev_is_tray_open(blk)) {
return;
}
@@ -2334,6 +2375,11 @@ void qmp_blockdev_close_tray(const char *device, Error **errp)
return;
}
if (!blk_dev_has_tray(blk)) {
/* Ignore this command on tray-less devices */
return;
}
if (!blk_dev_is_tray_open(blk)) {
return;
}
@@ -2363,7 +2409,7 @@ void qmp_x_blockdev_remove_medium(const char *device, Error **errp)
return;
}
if (has_device && !blk_dev_is_tray_open(blk)) {
if (has_device && blk_dev_has_tray(blk) && !blk_dev_is_tray_open(blk)) {
error_setg(errp, "Tray of device '%s' is not open", device);
return;
}
@@ -2380,14 +2426,16 @@ void qmp_x_blockdev_remove_medium(const char *device, Error **errp)
goto out;
}
/* This follows the convention established by bdrv_make_anon() */
if (bs->device_list.tqe_prev) {
QTAILQ_REMOVE(&bdrv_states, bs, device_list);
bs->device_list.tqe_prev = NULL;
}
blk_remove_bs(blk);
if (!blk_dev_has_tray(blk)) {
/* For tray-less devices, blockdev-open-tray is a no-op (or may not be
* called at all); therefore, the medium needs to be ejected here.
* Do it after blk_remove_bs() so blk_is_inserted(blk) returns the @load
* value passed here (i.e. false). */
blk_dev_change_media_cb(blk, false);
}
out:
aio_context_release(aio_context);
}
@@ -2413,7 +2461,7 @@ static void qmp_blockdev_insert_anon_medium(const char *device,
return;
}
if (has_device && !blk_dev_is_tray_open(blk)) {
if (has_device && blk_dev_has_tray(blk) && !blk_dev_is_tray_open(blk)) {
error_setg(errp, "Tray of device '%s' is not open", device);
return;
}
@@ -2425,7 +2473,14 @@ static void qmp_blockdev_insert_anon_medium(const char *device,
blk_insert_bs(blk, bs);
QTAILQ_INSERT_TAIL(&bdrv_states, bs, device_list);
if (!blk_dev_has_tray(blk)) {
/* For tray-less devices, blockdev-close-tray is a no-op (or may not be
* called at all); therefore, the medium needs to be pushed into the
* slot here.
* Do it after blk_insert_bs() so blk_is_inserted(blk) returns the @load
* value passed here (i.e. true). */
blk_dev_change_media_cb(blk, true);
}
}
void qmp_x_blockdev_insert_medium(const char *device, const char *node_name,
@@ -2472,6 +2527,8 @@ void qmp_blockdev_change_medium(const char *device, const char *filename,
}
bdrv_flags = blk_get_open_flags_from_root_state(blk);
bdrv_flags &= ~(BDRV_O_TEMPORARY | BDRV_O_SNAPSHOT | BDRV_O_NO_BACKING |
BDRV_O_PROTOCOL);
if (!has_read_only) {
read_only = BLOCKDEV_CHANGE_READ_ONLY_MODE_RETAIN;
@@ -2557,6 +2614,18 @@ void qmp_block_set_io_throttle(const char *device, int64_t bps, int64_t bps_rd,
int64_t iops_rd_max,
bool has_iops_wr_max,
int64_t iops_wr_max,
bool has_bps_max_length,
int64_t bps_max_length,
bool has_bps_rd_max_length,
int64_t bps_rd_max_length,
bool has_bps_wr_max_length,
int64_t bps_wr_max_length,
bool has_iops_max_length,
int64_t iops_max_length,
bool has_iops_rd_max_length,
int64_t iops_rd_max_length,
bool has_iops_wr_max_length,
int64_t iops_wr_max_length,
bool has_iops_size,
int64_t iops_size,
bool has_group,
@@ -2583,7 +2652,7 @@ void qmp_block_set_io_throttle(const char *device, int64_t bps, int64_t bps_rd,
goto out;
}
memset(&cfg, 0, sizeof(cfg));
throttle_config_init(&cfg);
cfg.buckets[THROTTLE_BPS_TOTAL].avg = bps;
cfg.buckets[THROTTLE_BPS_READ].avg = bps_rd;
cfg.buckets[THROTTLE_BPS_WRITE].avg = bps_wr;
@@ -2611,11 +2680,30 @@ void qmp_block_set_io_throttle(const char *device, int64_t bps, int64_t bps_rd,
cfg.buckets[THROTTLE_OPS_WRITE].max = iops_wr_max;
}
if (has_bps_max_length) {
cfg.buckets[THROTTLE_BPS_TOTAL].burst_length = bps_max_length;
}
if (has_bps_rd_max_length) {
cfg.buckets[THROTTLE_BPS_READ].burst_length = bps_rd_max_length;
}
if (has_bps_wr_max_length) {
cfg.buckets[THROTTLE_BPS_WRITE].burst_length = bps_wr_max_length;
}
if (has_iops_max_length) {
cfg.buckets[THROTTLE_OPS_TOTAL].burst_length = iops_max_length;
}
if (has_iops_rd_max_length) {
cfg.buckets[THROTTLE_OPS_READ].burst_length = iops_rd_max_length;
}
if (has_iops_wr_max_length) {
cfg.buckets[THROTTLE_OPS_WRITE].burst_length = iops_wr_max_length;
}
if (has_iops_size) {
cfg.op_size = iops_size;
}
if (!check_throttle_config(&cfg, errp)) {
if (!throttle_is_valid(&cfg, errp)) {
goto out;
}
@@ -2742,6 +2830,15 @@ void hmp_drive_del(Monitor *mon, const QDict *qdict)
AioContext *aio_context;
Error *local_err = NULL;
bs = bdrv_find_node(id);
if (bs) {
qmp_x_blockdev_del(false, NULL, true, id, &local_err);
if (local_err) {
error_report_err(local_err);
}
return;
}
blk = blk_by_name(id);
if (!blk) {
error_report("Device '%s' not found", id);
@@ -2765,16 +2862,19 @@ void hmp_drive_del(Monitor *mon, const QDict *qdict)
return;
}
bdrv_close(bs);
blk_remove_bs(blk);
}
/* if we have a device attached to this BlockDriverState
* then we need to make the drive anonymous until the device
* can be removed. If this is a drive with no device backing
* then we can just get rid of the block driver state right here.
/* Make the BlockBackend and the attached BlockDriverState anonymous */
monitor_remove_blk(blk);
if (blk_bs(blk)) {
bdrv_make_anon(blk_bs(blk));
}
/* If this BlockBackend has a device attached to it, its refcount will be
* decremented when the device is removed; otherwise we have to do so here.
*/
if (blk_get_attached_dev(blk)) {
blk_hide_on_behalf_of_hmp_drive_del(blk);
/* Further I/O must not pause the guest */
blk_set_on_error(blk, BLOCKDEV_ON_ERROR_REPORT,
BLOCKDEV_ON_ERROR_REPORT);
@@ -3793,6 +3893,37 @@ out:
aio_context_release(aio_context);
}
void hmp_drive_add_node(Monitor *mon, const char *optstr)
{
QemuOpts *opts;
QDict *qdict;
Error *local_err = NULL;
opts = qemu_opts_parse_noisily(&qemu_drive_opts, optstr, false);
if (!opts) {
return;
}
qdict = qemu_opts_to_qdict(opts, NULL);
if (!qdict_get_try_str(qdict, "node-name")) {
QDECREF(qdict);
error_report("'node-name' needs to be specified");
goto out;
}
BlockDriverState *bs = bds_tree_init(qdict, &local_err);
if (!bs) {
error_report_err(local_err);
goto out;
}
QTAILQ_INSERT_TAIL(&monitor_bdrv_states, bs, monitor_list);
out:
qemu_opts_del(opts);
}
void qmp_blockdev_add(BlockdevOptions *options, Error **errp)
{
QmpOutputVisitor *ov = qmp_output_visitor_new();
@@ -3817,8 +3948,8 @@ void qmp_blockdev_add(BlockdevOptions *options, Error **errp)
}
}
visit_type_BlockdevOptions(qmp_output_get_visitor(ov),
&options, NULL, &local_err);
visit_type_BlockdevOptions(qmp_output_get_visitor(ov), NULL, &options,
&local_err);
if (local_err) {
error_propagate(errp, local_err);
goto fail;
@@ -3848,12 +3979,16 @@ void qmp_blockdev_add(BlockdevOptions *options, Error **errp)
if (!bs) {
goto fail;
}
QTAILQ_INSERT_TAIL(&monitor_bdrv_states, bs, monitor_list);
}
if (bs && bdrv_key_required(bs)) {
if (blk) {
monitor_remove_blk(blk);
blk_unref(blk);
} else {
QTAILQ_REMOVE(&monitor_bdrv_states, bs, monitor_list);
bdrv_unref(bs);
}
error_setg(errp, "blockdev-add doesn't support encrypted devices");
@@ -3880,6 +4015,7 @@ void qmp_x_blockdev_del(bool has_id, const char *id,
}
if (has_id) {
/* blk_by_name() never returns a BB that is not owned by the monitor */
blk = blk_by_name(id);
if (!blk) {
error_setg(errp, "Cannot find block backend %s", id);
@@ -3913,7 +4049,13 @@ void qmp_x_blockdev_del(bool has_id, const char *id,
goto out;
}
if (bs->refcnt > 1 || !QLIST_EMPTY(&bs->parents)) {
if (!blk && !bs->monitor_list.tqe_prev) {
error_setg(errp, "Node %s is not owned by the monitor",
bs->node_name);
goto out;
}
if (bs->refcnt > 1) {
error_setg(errp, "Block device %s is in use",
bdrv_get_device_or_node_name(bs));
goto out;
@@ -3921,8 +4063,10 @@ void qmp_x_blockdev_del(bool has_id, const char *id,
}
if (blk) {
monitor_remove_blk(blk);
blk_unref(blk);
} else {
QTAILQ_REMOVE(&monitor_bdrv_states, bs, monitor_list);
bdrv_unref(bs);
}
@@ -4033,6 +4177,30 @@ QemuOptsList qemu_common_drive_opts = {
.name = "throttling.bps-write-max",
.type = QEMU_OPT_NUMBER,
.help = "total bytes write burst",
},{
.name = "throttling.iops-total-max-length",
.type = QEMU_OPT_NUMBER,
.help = "length of the iops-total-max burst period, in seconds",
},{
.name = "throttling.iops-read-max-length",
.type = QEMU_OPT_NUMBER,
.help = "length of the iops-read-max burst period, in seconds",
},{
.name = "throttling.iops-write-max-length",
.type = QEMU_OPT_NUMBER,
.help = "length of the iops-write-max burst period, in seconds",
},{
.name = "throttling.bps-total-max-length",
.type = QEMU_OPT_NUMBER,
.help = "length of the bps-total-max burst period, in seconds",
},{
.name = "throttling.bps-read-max-length",
.type = QEMU_OPT_NUMBER,
.help = "length of the bps-read-max burst period, in seconds",
},{
.name = "throttling.bps-write-max-length",
.type = QEMU_OPT_NUMBER,
.help = "length of the bps-write-max burst period, in seconds",
},{
.name = "throttling.iops-size",
.type = QEMU_OPT_NUMBER,
@@ -4066,7 +4234,7 @@ QemuOptsList qemu_common_drive_opts = {
static QemuOptsList qemu_root_bds_opts = {
.name = "root-bds",
.head = QTAILQ_HEAD_INITIALIZER(qemu_common_drive_opts.head),
.head = QTAILQ_HEAD_INITIALIZER(qemu_root_bds_opts.head),
.desc = {
{
.name = "discard",

View File

@@ -23,7 +23,7 @@
* THE SOFTWARE.
*/
#include "config-host.h"
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "trace.h"
#include "block/block.h"
@@ -278,14 +278,6 @@ void block_job_iostatus_reset(BlockJob *job)
}
}
struct BlockFinishData {
BlockJob *job;
BlockCompletionFunc *cb;
void *opaque;
bool cancelled;
int ret;
};
static int block_job_finish_sync(BlockJob *job,
void (*finish)(BlockJob *, Error **errp),
Error **errp)
@@ -304,7 +296,9 @@ static int block_job_finish_sync(BlockJob *job,
return -EBUSY;
}
while (!job->completed) {
aio_poll(bdrv_get_aio_context(bs), true);
aio_poll(job->deferred_to_main_loop ? qemu_get_aio_context() :
bdrv_get_aio_context(bs),
true);
}
ret = (job->cancelled && job->ret == 0) ? -ECANCELED : job->ret;
block_job_unref(job);
@@ -478,6 +472,7 @@ static void block_job_defer_to_main_loop_bh(void *opaque)
aio_context = bdrv_get_aio_context(data->job->bs);
aio_context_acquire(aio_context);
data->job->deferred_to_main_loop = false;
data->fn(data->job, data->opaque);
aio_context_release(aio_context);
@@ -497,6 +492,7 @@ void block_job_defer_to_main_loop(BlockJob *job,
data->aio_context = bdrv_get_aio_context(job->bs);
data->fn = fn;
data->opaque = opaque;
job->deferred_to_main_loop = true;
qemu_bh_schedule(data->bh);
}

View File

@@ -22,6 +22,7 @@
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "sysemu/sysemu.h"
#include "qapi/visitor.h"
#include "qemu/error-report.h"
@@ -270,21 +271,21 @@ typedef struct {
DeviceState *dev;
} BootIndexProperty;
static void device_get_bootindex(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
static void device_get_bootindex(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
BootIndexProperty *prop = opaque;
visit_type_int32(v, prop->bootindex, name, errp);
visit_type_int32(v, name, prop->bootindex, errp);
}
static void device_set_bootindex(Object *obj, Visitor *v, void *opaque,
const char *name, Error **errp)
static void device_set_bootindex(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
BootIndexProperty *prop = opaque;
int32_t boot_index;
Error *local_err = NULL;
visit_type_int32(v, &boot_index, name, &local_err);
visit_type_int32(v, name, &boot_index, &local_err);
if (local_err) {
goto out;
}

View File

@@ -1,12 +1,6 @@
/* Code for loading BSD executables. Mostly linux kernel code. */
#include <sys/types.h>
#include <sys/stat.h>
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include "qemu/osdep.h"
#include "qemu.h"

View File

@@ -1,13 +1,7 @@
/* This is the Linux kernel elf-loading code, ported into user space */
#include <stdio.h>
#include <sys/types.h>
#include <fcntl.h>
#include <errno.h>
#include <unistd.h>
#include "qemu/osdep.h"
#include <sys/mman.h>
#include <stdlib.h>
#include <string.h>
#include "qemu.h"
#include "disas/disas.h"

View File

@@ -1,3 +1,6 @@
#ifndef TARGET_SYSCALL_H
#define TARGET_SYSCALL_H
/* default linux values for the selectors */
#define __USER_CS (0x23)
#define __USER_DS (0x2B)
@@ -159,3 +162,4 @@ struct target_vm86plus_struct {
#define UNAME_MACHINE "i386"
#endif /* TARGET_SYSCALL_H */

View File

@@ -16,14 +16,8 @@
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#include <stdlib.h>
#include <stdio.h>
#include <stdarg.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include "qemu/osdep.h"
#include <machine/trap.h>
#include <sys/types.h>
#include <sys/mman.h>
#include "qemu.h"
@@ -33,6 +27,7 @@
#include "tcg.h"
#include "qemu/timer.h"
#include "qemu/envlist.h"
#include "exec/log.h"
int singlestep;
unsigned long mmap_min_addr;

View File

@@ -16,12 +16,7 @@
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#include <stdlib.h>
#include <stdio.h>
#include <stdarg.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include "qemu/osdep.h"
#include <sys/mman.h>
#include "qemu.h"

View File

@@ -17,15 +17,12 @@
#ifndef QEMU_H
#define QEMU_H
#include <signal.h>
#include <string.h>
#include "cpu.h"
#include "exec/cpu_ldst.h"
#undef DEBUG_REMAP
#ifdef DEBUG_REMAP
#include <stdlib.h>
#endif /* DEBUG_REMAP */
#include "exec/user/abitypes.h"
@@ -38,7 +35,7 @@ enum BSDType {
extern enum BSDType bsd_type;
#include "syscall_defs.h"
#include "syscall.h"
#include "target_syscall.h"
#include "target_signal.h"
#include "exec/gdbstub.h"

View File

@@ -16,12 +16,7 @@
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#include <stdlib.h>
#include <stdio.h>
#include <string.h>
#include <stdarg.h>
#include <unistd.h>
#include <errno.h>
#include "qemu/osdep.h"
#include "qemu.h"
#include "target_signal.h"

View File

@@ -1,3 +1,6 @@
#ifndef TARGET_SYSCALL_H
#define TARGET_SYSCALL_H
struct target_pt_regs {
abi_ulong psr;
abi_ulong pc;
@@ -7,3 +10,5 @@ struct target_pt_regs {
};
#define UNAME_MACHINE "sun4"
#endif /* TARGET_SYSCALL_H */

View File

@@ -1,3 +1,6 @@
#ifndef TARGET_SYSCALL_H
#define TARGET_SYSCALL_H
struct target_pt_regs {
abi_ulong u_regs[16];
abi_ulong tstate;
@@ -8,3 +11,5 @@ struct target_pt_regs {
};
#define UNAME_MACHINE "sun4u"
#endif /* TARGET_SYSCALL_H */

View File

@@ -16,14 +16,10 @@
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#include <stdio.h>
#include <errno.h>
#include "qemu/osdep.h"
#include <sys/select.h>
#include <sys/types.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <sys/ioccom.h>
#include <ctype.h>
#include "qemu.h"

View File

@@ -16,17 +16,7 @@
* You should have received a copy of the GNU General Public License
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#include <stdlib.h>
#include <stdio.h>
#include <stdint.h>
#include <stdarg.h>
#include <string.h>
#include <errno.h>
#include <unistd.h>
#include <fcntl.h>
#include <time.h>
#include <limits.h>
#include <sys/types.h>
#include "qemu/osdep.h"
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/param.h>

View File

@@ -1,6 +1,5 @@
/* User memory access */
#include <stdio.h>
#include <string.h>
#include "qemu/osdep.h"
#include "qemu.h"

View File

@@ -1,3 +1,6 @@
#ifndef TARGET_SYSCALL_H
#define TARGET_SYSCALL_H
#define __USER_CS (0x33)
#define __USER_DS (0x2B)
@@ -114,3 +117,5 @@ struct target_msqid64_ds {
#define TARGET_ARCH_SET_FS 0x1002
#define TARGET_ARCH_GET_FS 0x1003
#define TARGET_ARCH_GET_GS 0x1004
#endif /* TARGET_SYSCALL_H */

View File

@@ -17,12 +17,12 @@
* with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "sysemu/bt.h"
#include "qemu/main-loop.h"
#ifndef _WIN32
# include <errno.h>
# include <sys/ioctl.h>
# include <sys/uio.h>
# ifdef CONFIG_BLUEZ

View File

@@ -17,6 +17,7 @@
* with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "sysemu/bt.h"
#include "hw/bt.h"

292
configure vendored
View File

@@ -116,38 +116,6 @@ compile_prog() {
do_cc $QEMU_CFLAGS $local_cflags -o $TMPE $TMPC $LDFLAGS $local_ldflags
}
do_libtool() {
local mode=$1
shift
# Run the compiler, capturing its output to the log.
echo $libtool $mode --tag=CC $cc "$@" >> config.log
$libtool $mode --tag=CC $cc "$@" >> config.log 2>&1 || return $?
# Test passed. If this is an --enable-werror build, rerun
# the test with -Werror and bail out if it fails. This
# makes warning-generating-errors in configure test code
# obvious to developers.
if test "$werror" != "yes"; then
return 0
fi
# Don't bother rerunning the compile if we were already using -Werror
case "$*" in
*-Werror*)
return 0
;;
esac
echo $libtool $mode --tag=CC $cc -Werror "$@" >> config.log
$libtool $mode --tag=CC $cc -Werror "$@" >> config.log 2>&1 && return $?
error_exit "configure test passed without -Werror but failed with -Werror." \
"This is probably a bug in the configure script. The failing command" \
"will be at the bottom of config.log." \
"You can run configure with --disable-werror to bypass this check."
}
libtool_prog() {
do_libtool --mode=compile $QEMU_CFLAGS -c -fPIE -DPIE -o $TMPO $TMPC || return $?
do_libtool --mode=link $LDFLAGS -o $TMPA $TMPL -rpath /usr/local/lib
}
# symbolically link $1 to $2. Portable version of "ln -sf".
symlink() {
rm -rf "$2"
@@ -303,7 +271,7 @@ pkgversion=""
pie=""
zero_malloc=""
qom_cast_debug="yes"
trace_backends="nop"
trace_backends="log"
trace_file="trace"
spice=""
rbd=""
@@ -311,6 +279,8 @@ smartcard=""
libusb=""
usb_redir=""
opengl=""
opengl_dmabuf="no"
avx2_opt="no"
zlib="yes"
lzo=""
snappy=""
@@ -336,8 +306,10 @@ gtkabi=""
gtk_gl="no"
gnutls=""
gnutls_hash=""
gnutls_rnd=""
nettle=""
gcrypt=""
gcrypt_kdf="no"
vte=""
virglrenderer=""
tpm="yes"
@@ -398,7 +370,6 @@ as="${AS-${cross_prefix}as}"
cpp="${CPP-$cc -E}"
objcopy="${OBJCOPY-${cross_prefix}objcopy}"
ld="${LD-${cross_prefix}ld}"
libtool="${LIBTOOL-${cross_prefix}libtool}"
nm="${NM-${cross_prefix}nm}"
strip="${STRIP-${cross_prefix}strip}"
windres="${WINDRES-${cross_prefix}windres}"
@@ -1514,7 +1485,6 @@ EOF
if do_cc $QEMU_CFLAGS -Werror $flag -c -o $TMPO $TMPC &&
compile_prog "-Werror $flag" ""; then
QEMU_CFLAGS="$QEMU_CFLAGS $flag"
LIBTOOLFLAGS="$LIBTOOLFLAGS -Wc,$flag"
sp_on=1
break
fi
@@ -1609,32 +1579,6 @@ EOF
fi
fi
# check for broken gcc and libtool in RHEL5
if test -n "$libtool" -a "$pie" != "no" ; then
cat > $TMPC <<EOF
void *f(unsigned char *buf, int len);
void *g(unsigned char *buf, int len);
void *
f(unsigned char *buf, int len)
{
return (void*)0L;
}
void *
g(unsigned char *buf, int len)
{
return f(buf, len);
}
EOF
if ! libtool_prog; then
echo "Disabling libtool due to broken toolchain support"
libtool=
fi
fi
##########################################
# __sync_fetch_and_and requires at least -march=i486. Many toolchains
# use i686 as default anyway, but for those that don't, an explicit
@@ -1832,6 +1776,21 @@ EOF
fi
##########################################
# avx2 optimization requirement check
cat > $TMPC << EOF
static void bar(void) {}
static void *bar_ifunc(void) {return (void*) bar;}
static void foo(void) __attribute__((ifunc("bar_ifunc")));
int main(void) { foo(); return 0; }
EOF
if compile_prog "-mavx2" "" ; then
if readelf --syms $TMPE |grep "IFUNC.*foo" >/dev/null 2>&1; then
avx2_opt="yes"
fi
fi
#########################################
# zlib check
if test "$zlib" != "no" ; then
@@ -2111,100 +2070,10 @@ EOF
xen_ctrl_version=420
xen=yes
elif
cat > $TMPC <<EOF &&
#include <xenctrl.h>
#include <xs.h>
#include <stdint.h>
#include <xen/hvm/hvm_info_table.h>
#if !defined(HVM_MAX_VCPUS)
# error HVM_MAX_VCPUS not defined
#endif
int main(void) {
xs_daemon_open();
xc_interface_open(0, 0, 0);
xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0);
xc_gnttab_open(NULL, 0);
xc_domain_add_to_physmap(0, 0, XENMAPSPACE_gmfn, 0, 0);
return 0;
}
EOF
compile_prog "" "$xen_libs"
then
xen_ctrl_version=410
xen=yes
# Xen 4.0.0
elif
cat > $TMPC <<EOF &&
#include <xenctrl.h>
#include <xs.h>
#include <stdint.h>
#include <xen/hvm/hvm_info_table.h>
#if !defined(HVM_MAX_VCPUS)
# error HVM_MAX_VCPUS not defined
#endif
int main(void) {
struct xen_add_to_physmap xatp = {
.domid = 0, .space = XENMAPSPACE_gmfn, .idx = 0, .gpfn = 0,
};
xs_daemon_open();
xc_interface_open();
xc_gnttab_open();
xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0);
xc_memory_op(0, XENMEM_add_to_physmap, &xatp);
return 0;
}
EOF
compile_prog "" "$xen_libs"
then
xen_ctrl_version=400
xen=yes
# Xen 3.4.0
elif
cat > $TMPC <<EOF &&
#include <xenctrl.h>
#include <xs.h>
int main(void) {
struct xen_add_to_physmap xatp = {
.domid = 0, .space = XENMAPSPACE_gmfn, .idx = 0, .gpfn = 0,
};
xs_daemon_open();
xc_interface_open();
xc_gnttab_open();
xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0);
xc_memory_op(0, XENMEM_add_to_physmap, &xatp);
return 0;
}
EOF
compile_prog "" "$xen_libs"
then
xen_ctrl_version=340
xen=yes
# Xen 3.3.0
elif
cat > $TMPC <<EOF &&
#include <xenctrl.h>
#include <xs.h>
int main(void) {
xs_daemon_open();
xc_interface_open();
xc_gnttab_open();
xc_hvm_set_mem_type(0, 0, HVMMEM_ram_ro, 0, 0);
return 0;
}
EOF
compile_prog "" "$xen_libs"
then
xen_ctrl_version=330
xen=yes
# Xen version unsupported
else
if test "$xen" = "yes" ; then
feature_not_found "xen (unsupported version)" "Install supported xen (e.g. 4.0, 3.4, 3.3)"
feature_not_found "xen (unsupported version)" \
"Install a supported xen (xen 4.2 or newer)"
fi
xen=no
fi
@@ -2218,15 +2087,10 @@ EOF
fi
if test "$xen_pci_passthrough" != "no"; then
if test "$xen" = "yes" && test "$linux" = "yes" &&
test "$xen_ctrl_version" -ge 340; then
if test "$xen" = "yes" && test "$linux" = "yes"; then
xen_pci_passthrough=yes
else
if test "$xen_pci_passthrough" = "yes"; then
if test "$xen_ctrl_version" -lt 340; then
error_exit "User requested feature Xen PCI Passthrough" \
"This feature does not work with Xen 3.3"
fi
error_exit "User requested feature Xen PCI Passthrough" \
" but this feature requires /sys from Linux"
fi
@@ -2240,21 +2104,6 @@ if test "$xen_pv_domain_build" = "yes" &&
"which requires Xen support."
fi
##########################################
# libtool probe
if ! has $libtool; then
libtool=
fi
# MacOSX ships with a libtool which isn't the GNU one; weed this
# out by checking whether libtool supports the --version switch
if test -n "$libtool"; then
if ! "$libtool" --version >/dev/null 2>&1; then
libtool=
fi
fi
##########################################
# Sparse probe
if test "$sparse" != "no" ; then
@@ -2354,6 +2203,13 @@ if test "$gnutls" != "no"; then
gnutls_hash="no"
fi
# gnutls_rnd requires >= 2.11.0
if $pkg_config --exists "gnutls >= 2.11.0"; then
gnutls_rnd="yes"
else
gnutls_rnd="no"
fi
if $pkg_config --exists 'gnutls >= 3.0'; then
gnutls_gcrypt=no
gnutls_nettle=yes
@@ -2381,9 +2237,11 @@ if test "$gnutls" != "no"; then
else
gnutls="no"
gnutls_hash="no"
gnutls_rnd="no"
fi
else
gnutls_hash="no"
gnutls_rnd="no"
fi
@@ -2445,6 +2303,19 @@ if test "$gcrypt" != "no"; then
if test -z "$nettle"; then
nettle="no"
fi
cat > $TMPC << EOF
#include <gcrypt.h>
int main(void) {
gcry_kdf_derive(NULL, 0, GCRY_KDF_PBKDF2,
GCRY_MD_SHA256,
NULL, 0, 0, 0, NULL);
return 0;
}
EOF
if compile_prog "$gcrypt_cflags" "$gcrypt_libs" ; then
gcrypt_kdf=yes
fi
else
if test "$gcrypt" = "yes"; then
feature_not_found "gcrypt" "Install gcrypt devel"
@@ -2965,7 +2836,7 @@ fi
# curses probe
if test "$curses" != "no" ; then
if test "$mingw32" = "yes" ; then
curses_list="-lpdcurses"
curses_list="$($pkg_config --libs ncurses 2>/dev/null):-lpdcurses"
else
curses_list="$($pkg_config --libs ncurses 2>/dev/null):-lncurses:-lcurses"
fi
@@ -3063,6 +2934,30 @@ for i in $glib_modules; do
fi
done
# Sanity check that the current size_t matches the
# size that glib thinks it should be. This catches
# problems on multi-arch where people try to build
# 32-bit QEMU while pointing at 64-bit glib headers
cat > $TMPC <<EOF
#include <glib.h>
#include <unistd.h>
#define QEMU_BUILD_BUG_ON(x) \
typedef char qemu_build_bug_on[(x)?-1:1] __attribute__((unused));
int main(void) {
QEMU_BUILD_BUG_ON(sizeof(size_t) != GLIB_SIZEOF_SIZE_T);
return 0;
}
EOF
if ! compile_prog "-Werror $CFLAGS" "$LIBS" ; then
error_exit "sizeof(size_t) doesn't match GLIB_SIZEOF_SIZE_T."\
"You probably need to set PKG_CONFIG_LIBDIR"\
"to point to the right pkg-config files for your"\
"build target"
fi
# g_test_trap_subprocess added in 2.38. Used by some tests.
glib_subprocess=yes
if ! $pkg_config --atleast-version=2.38 glib-2.0; then
@@ -3420,7 +3315,7 @@ libs_softmmu="$libs_softmmu $fdt_libs"
# opengl probe (for sdl2, gtk, milkymist-tmu2)
if test "$opengl" != "no" ; then
opengl_pkgs="epoxy"
opengl_pkgs="epoxy libdrm gbm"
if $pkg_config $opengl_pkgs x11; then
opengl_cflags="$($pkg_config --cflags $opengl_pkgs) $x11_cflags"
opengl_libs="$($pkg_config --libs $opengl_pkgs) $x11_libs"
@@ -3438,6 +3333,18 @@ if test "$opengl" != "no" ; then
fi
fi
if test "$opengl" = "yes"; then
cat > $TMPC << EOF
#include <epoxy/egl.h>
#ifndef EGL_MESA_image_dma_buf_export
# error mesa/epoxy lacks support for dmabufs (mesa 10.6+)
#endif
int main(void) { return 0; }
EOF
if compile_prog "" "" ; then
opengl_dmabuf=yes
fi
fi
##########################################
# archipelago probe
@@ -4831,7 +4738,9 @@ echo "GTK support $gtk"
echo "GTK GL support $gtk_gl"
echo "GNUTLS support $gnutls"
echo "GNUTLS hash $gnutls_hash"
echo "GNUTLS rnd $gnutls_rnd"
echo "libgcrypt $gcrypt"
echo "libgcrypt kdf $gcrypt_kdf"
if test "$nettle" = "yes"; then
echo "nettle $nettle ($nettle_version)"
else
@@ -4898,6 +4807,7 @@ echo "smartcard support $smartcard"
echo "libusb $libusb"
echo "usb net redir $usb_redir"
echo "OpenGL support $opengl"
echo "OpenGL dmabufs $opengl_dmabuf"
echo "libiscsi support $libiscsi"
echo "libnfs support $libnfs"
echo "build guest agent $guest_agent"
@@ -4922,6 +4832,7 @@ echo "bzip2 support $bzip2"
echo "NUMA host support $numa"
echo "tcmalloc support $tcmalloc"
echo "jemalloc support $jemalloc"
echo "avx2 optimization $avx2_opt"
if test "$sdl_too_old" = "yes"; then
echo "-> Your SDL version is too old - please upgrade to have SDL support"
@@ -5196,6 +5107,7 @@ if test "$gtk" = "yes" ; then
echo "CONFIG_GTK=y" >> $config_host_mak
echo "CONFIG_GTKABI=$gtkabi" >> $config_host_mak
echo "GTK_CFLAGS=$gtk_cflags" >> $config_host_mak
echo "GTK_LIBS=$gtk_libs" >> $config_host_mak
if test "$gtk_gl" = "yes" ; then
echo "CONFIG_GTK_GL=y" >> $config_host_mak
fi
@@ -5206,8 +5118,14 @@ fi
if test "$gnutls_hash" = "yes" ; then
echo "CONFIG_GNUTLS_HASH=y" >> $config_host_mak
fi
if test "$gnutls_rnd" = "yes" ; then
echo "CONFIG_GNUTLS_RND=y" >> $config_host_mak
fi
if test "$gcrypt" = "yes" ; then
echo "CONFIG_GCRYPT=y" >> $config_host_mak
if test "$gcrypt_kdf" = "yes" ; then
echo "CONFIG_GCRYPT_KDF=y" >> $config_host_mak
fi
fi
if test "$nettle" = "yes" ; then
echo "CONFIG_NETTLE=y" >> $config_host_mak
@@ -5304,6 +5222,13 @@ if test "$opengl" = "yes" ; then
echo "CONFIG_OPENGL=y" >> $config_host_mak
echo "OPENGL_CFLAGS=$opengl_cflags" >> $config_host_mak
echo "OPENGL_LIBS=$opengl_libs" >> $config_host_mak
if test "$opengl_dmabuf" = "yes" ; then
echo "CONFIG_OPENGL_DMABUF=y" >> $config_host_mak
fi
fi
if test "$avx2_opt" = "yes" ; then
echo "CONFIG_AVX2_OPT=y" >> $config_host_mak
fi
if test "$lzo" = "yes" ; then
@@ -5445,8 +5370,8 @@ if have_backend "simple"; then
# Set the appropriate trace file.
trace_file="\"$trace_file-\" FMT_pid"
fi
if have_backend "stderr"; then
echo "CONFIG_TRACE_STDERR=y" >> $config_host_mak
if have_backend "log"; then
echo "CONFIG_TRACE_LOG=y" >> $config_host_mak
fi
if have_backend "ust"; then
echo "CONFIG_TRACE_UST=y" >> $config_host_mak
@@ -5501,13 +5426,8 @@ echo "MAKE=$make" >> $config_host_mak
echo "INSTALL=$install" >> $config_host_mak
echo "INSTALL_DIR=$install -d -m 0755" >> $config_host_mak
echo "INSTALL_DATA=$install -c -m 0644" >> $config_host_mak
if test -n "$libtool"; then
echo "INSTALL_PROG=\$(LIBTOOL) --mode=install $install -c -m 0755" >> $config_host_mak
echo "INSTALL_LIB=\$(LIBTOOL) --mode=install $install -c -m 0644" >> $config_host_mak
else
echo "INSTALL_PROG=$install -c -m 0755" >> $config_host_mak
echo "INSTALL_LIB=$install -c -m 0644" >> $config_host_mak
fi
echo "INSTALL_PROG=$install -c -m 0755" >> $config_host_mak
echo "INSTALL_LIB=$install -c -m 0644" >> $config_host_mak
echo "PYTHON=$python" >> $config_host_mak
echo "CC=$cc" >> $config_host_mak
if $iasl -h > /dev/null 2>&1; then
@@ -5525,7 +5445,6 @@ echo "OBJCOPY=$objcopy" >> $config_host_mak
echo "LD=$ld" >> $config_host_mak
echo "NM=$nm" >> $config_host_mak
echo "WINDRES=$windres" >> $config_host_mak
echo "LIBTOOL=$libtool" >> $config_host_mak
echo "CFLAGS=$CFLAGS" >> $config_host_mak
echo "CFLAGS_NOPIE=$CFLAGS_NOPIE" >> $config_host_mak
echo "QEMU_CFLAGS=$QEMU_CFLAGS" >> $config_host_mak
@@ -5544,7 +5463,6 @@ else
fi
echo "LDFLAGS=$LDFLAGS" >> $config_host_mak
echo "LDFLAGS_NOPIE=$LDFLAGS_NOPIE" >> $config_host_mak
echo "LIBTOOLFLAGS=$LIBTOOLFLAGS" >> $config_host_mak
echo "LIBS+=$LIBS" >> $config_host_mak
echo "LIBS_TOOLS+=$libs_tools" >> $config_host_mak
echo "EXESUF=$EXESUF" >> $config_host_mak

View File

@@ -6,7 +6,7 @@
* top-level directory.
*/
#include <sys/types.h>
#include "qemu/osdep.h"
#include <sys/socket.h>
#include <sys/un.h>

View File

@@ -19,7 +19,6 @@
* purposes.
*/
#include <limits.h>
#include <sys/select.h>
#include "qemu/queue.h"

View File

@@ -6,6 +6,7 @@
* top-level directory.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "ivshmem-client.h"

View File

@@ -5,16 +5,13 @@
* (at your option) any later version. See the COPYING file in the
* top-level directory.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "qemu/sockets.h"
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <sys/un.h>
#ifdef CONFIG_LINUX
#include <sys/vfs.h>
#endif
#include "ivshmem-server.h"
@@ -257,7 +254,8 @@ ivshmem_server_ftruncate(int fd, unsigned shmsize)
/* Init a new ivshmem server */
int
ivshmem_server_init(IvshmemServer *server, const char *unix_sock_path,
const char *shm_path, size_t shm_size, unsigned n_vectors,
const char *shm_path, bool use_shm_open,
size_t shm_size, unsigned n_vectors,
bool verbose)
{
int ret;
@@ -278,6 +276,7 @@ ivshmem_server_init(IvshmemServer *server, const char *unix_sock_path,
return -1;
}
server->use_shm_open = use_shm_open;
server->shm_size = shm_size;
server->n_vectors = n_vectors;
@@ -286,31 +285,6 @@ ivshmem_server_init(IvshmemServer *server, const char *unix_sock_path,
return 0;
}
#ifdef CONFIG_LINUX
#define HUGETLBFS_MAGIC 0x958458f6
static long gethugepagesize(const char *path)
{
struct statfs fs;
int ret;
do {
ret = statfs(path, &fs);
} while (ret != 0 && errno == EINTR);
if (ret != 0) {
return -1;
}
if (fs.f_type != HUGETLBFS_MAGIC) {
return -1;
}
return fs.f_bsize;
}
#endif
/* open shm, create and bind to the unix socket */
int
ivshmem_server_start(IvshmemServer *server)
@@ -319,27 +293,17 @@ ivshmem_server_start(IvshmemServer *server)
int shm_fd, sock_fd, ret;
/* open shm file */
#ifdef CONFIG_LINUX
long hpagesize;
hpagesize = gethugepagesize(server->shm_path);
if (hpagesize < 0 && errno != ENOENT) {
IVSHMEM_SERVER_DEBUG(server, "cannot stat shm file %s: %s\n",
server->shm_path, strerror(errno));
}
if (hpagesize > 0) {
if (server->use_shm_open) {
IVSHMEM_SERVER_DEBUG(server, "Using POSIX shared memory: %s\n",
server->shm_path);
shm_fd = shm_open(server->shm_path, O_CREAT | O_RDWR, S_IRWXU);
} else {
gchar *filename = g_strdup_printf("%s/ivshmem.XXXXXX", server->shm_path);
IVSHMEM_SERVER_DEBUG(server, "Using hugepages: %s\n", server->shm_path);
IVSHMEM_SERVER_DEBUG(server, "Using file-backed shared memory: %s\n",
server->shm_path);
shm_fd = mkstemp(filename);
unlink(filename);
g_free(filename);
} else
#endif
{
IVSHMEM_SERVER_DEBUG(server, "Using POSIX shared memory: %s\n",
server->shm_path);
shm_fd = shm_open(server->shm_path, O_CREAT|O_RDWR, S_IRWXU);
}
if (shm_fd < 0) {

View File

@@ -26,10 +26,7 @@
* associated to the ivshmem shared memory.
*/
#include <limits.h>
#include <sys/select.h>
#include <stdint.h>
#include <stdbool.h>
#include "qemu/event_notifier.h"
#include "qemu/queue.h"
@@ -69,6 +66,7 @@ typedef struct IvshmemServer {
char unix_sock_path[PATH_MAX]; /**< path to unix socket */
int sock_fd; /**< unix sock file descriptor */
char shm_path[PATH_MAX]; /**< path to shm */
bool use_shm_open;
size_t shm_size; /**< size of shm */
int shm_fd; /**< shm file descriptor */
unsigned n_vectors; /**< number of vectors */
@@ -92,7 +90,8 @@ typedef struct IvshmemServer {
*/
int
ivshmem_server_init(IvshmemServer *server, const char *unix_sock_path,
const char *shm_path, size_t shm_size, unsigned n_vectors,
const char *shm_path, bool use_shm_open,
size_t shm_size, unsigned n_vectors,
bool verbose);
/**

View File

@@ -6,6 +6,7 @@
* top-level directory.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "ivshmem-server.h"
@@ -28,35 +29,38 @@ typedef struct IvshmemServerArgs {
const char *pid_file;
const char *unix_socket_path;
const char *shm_path;
bool use_shm_open;
uint64_t shm_size;
unsigned n_vectors;
} IvshmemServerArgs;
/* show ivshmem_server_usage and exit with given error code */
static void
ivshmem_server_usage(const char *name, int code)
ivshmem_server_usage(const char *progname)
{
fprintf(stderr, "%s [opts]\n", name);
fprintf(stderr, " -h: show this help\n");
fprintf(stderr, " -v: verbose mode\n");
fprintf(stderr, " -F: foreground mode (default is to daemonize)\n");
fprintf(stderr, " -p <pid_file>: path to the PID file (used in daemon\n"
" mode only).\n"
" Default=%s\n", IVSHMEM_SERVER_DEFAULT_SHM_PATH);
fprintf(stderr, " -S <unix_socket_path>: path to the unix socket\n"
" to listen to.\n"
" Default=%s\n", IVSHMEM_SERVER_DEFAULT_UNIX_SOCK_PATH);
fprintf(stderr, " -m <shm_path>: path to the shared memory.\n"
" The path corresponds to a POSIX shm name or a\n"
" hugetlbfs mount point.\n"
" default=%s\n", IVSHMEM_SERVER_DEFAULT_SHM_PATH);
fprintf(stderr, " -l <size>: size of shared memory in bytes. The suffix\n"
" K, M and G can be used (ex: 1K means 1024).\n"
" default=%u\n", IVSHMEM_SERVER_DEFAULT_SHM_SIZE);
fprintf(stderr, " -n <n_vects>: number of vectors.\n"
" default=%u\n", IVSHMEM_SERVER_DEFAULT_N_VECTORS);
printf("Usage: %s [OPTION]...\n"
" -h: show this help\n"
" -v: verbose mode\n"
" -F: foreground mode (default is to daemonize)\n"
" -p <pid-file>: path to the PID file (used in daemon mode only)\n"
" default " IVSHMEM_SERVER_DEFAULT_PID_FILE "\n"
" -S <unix-socket-path>: path to the unix socket to listen to\n"
" default " IVSHMEM_SERVER_DEFAULT_UNIX_SOCK_PATH "\n"
" -M <shm-name>: POSIX shared memory object to use\n"
" default " IVSHMEM_SERVER_DEFAULT_SHM_PATH "\n"
" -m <dir-name>: where to create shared memory\n"
" -l <size>: size of shared memory in bytes\n"
" suffixes K, M and G can be used, e.g. 1K means 1024\n"
" default %u\n"
" -n <nvectors>: number of vectors\n"
" default %u\n",
progname, IVSHMEM_SERVER_DEFAULT_SHM_SIZE,
IVSHMEM_SERVER_DEFAULT_N_VECTORS);
}
exit(code);
static void
ivshmem_server_help(const char *progname)
{
fprintf(stderr, "Try '%s -h' for more information.\n", progname);
}
/* parse the program arguments, exit on error */
@@ -67,20 +71,12 @@ ivshmem_server_parse_args(IvshmemServerArgs *args, int argc, char *argv[])
unsigned long long v;
Error *err = NULL;
while ((c = getopt(argc, argv,
"h" /* help */
"v" /* verbose */
"F" /* foreground */
"p:" /* pid_file */
"S:" /* unix_socket_path */
"m:" /* shm_path */
"l:" /* shm_size */
"n:" /* n_vectors */
)) != -1) {
while ((c = getopt(argc, argv, "hvFp:S:m:M:l:n:")) != -1) {
switch (c) {
case 'h': /* help */
ivshmem_server_usage(argv[0], 0);
ivshmem_server_usage(argv[0]);
exit(0);
break;
case 'v': /* verbose */
@@ -91,36 +87,41 @@ ivshmem_server_parse_args(IvshmemServerArgs *args, int argc, char *argv[])
args->foreground = 1;
break;
case 'p': /* pid_file */
case 'p': /* pid file */
args->pid_file = optarg;
break;
case 'S': /* unix_socket_path */
case 'S': /* unix socket path */
args->unix_socket_path = optarg;
break;
case 'm': /* shm_path */
case 'M': /* shm name */
case 'm': /* dir name */
args->shm_path = optarg;
args->use_shm_open = c == 'M';
break;
case 'l': /* shm_size */
case 'l': /* shm size */
parse_option_size("shm_size", optarg, &args->shm_size, &err);
if (err) {
error_report_err(err);
ivshmem_server_usage(argv[0], 1);
ivshmem_server_help(argv[0]);
exit(1);
}
break;
case 'n': /* n_vectors */
case 'n': /* number of vectors */
if (parse_uint_full(optarg, &v, 0) < 0) {
fprintf(stderr, "cannot parse n_vectors\n");
ivshmem_server_usage(argv[0], 1);
ivshmem_server_help(argv[0]);
exit(1);
}
args->n_vectors = v;
break;
default:
ivshmem_server_usage(argv[0], 1);
ivshmem_server_usage(argv[0]);
exit(1);
break;
}
}
@@ -128,12 +129,14 @@ ivshmem_server_parse_args(IvshmemServerArgs *args, int argc, char *argv[])
if (args->n_vectors > IVSHMEM_SERVER_MAX_VECTORS) {
fprintf(stderr, "too many requested vectors (max is %d)\n",
IVSHMEM_SERVER_MAX_VECTORS);
ivshmem_server_usage(argv[0], 1);
ivshmem_server_help(argv[0]);
exit(1);
}
if (args->verbose == 1 && args->foreground == 0) {
fprintf(stderr, "cannot use verbose in daemon mode\n");
ivshmem_server_usage(argv[0], 1);
ivshmem_server_help(argv[0]);
exit(1);
}
}
@@ -191,11 +194,18 @@ main(int argc, char *argv[])
.pid_file = IVSHMEM_SERVER_DEFAULT_PID_FILE,
.unix_socket_path = IVSHMEM_SERVER_DEFAULT_UNIX_SOCK_PATH,
.shm_path = IVSHMEM_SERVER_DEFAULT_SHM_PATH,
.use_shm_open = true,
.shm_size = IVSHMEM_SERVER_DEFAULT_SHM_SIZE,
.n_vectors = IVSHMEM_SERVER_DEFAULT_N_VECTORS,
};
int ret = 1;
/*
* Do not remove this notice without adding proper error handling!
* Start with handling ivshmem_server_send_one_msg() failure.
*/
printf("*** Example code, do not use in production ***\n");
/* parse arguments, will exit on error */
ivshmem_server_parse_args(&args, argc, argv);
@@ -218,7 +228,8 @@ main(int argc, char *argv[])
}
/* init the ivshms structure */
if (ivshmem_server_init(&server, args.unix_socket_path, args.shm_path,
if (ivshmem_server_init(&server, args.unix_socket_path,
args.shm_path, args.use_shm_open,
args.shm_size, args.n_vectors, args.verbose) < 0) {
fprintf(stderr, "cannot init server\n");
goto err;

View File

@@ -27,6 +27,7 @@
#include "exec/address-spaces.h"
#include "qemu/rcu.h"
#include "exec/tb-hash.h"
#include "exec/log.h"
#if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
#include "hw/i386/apic.h"
#endif

88
cpus.c
View File

@@ -29,6 +29,7 @@
#include "qapi/qmp/qerror.h"
#include "qemu/error-report.h"
#include "sysemu/sysemu.h"
#include "sysemu/block-backend.h"
#include "exec/gdbstub.h"
#include "sysemu/dma.h"
#include "sysemu/kvm.h"
@@ -370,9 +371,12 @@ static void icount_warp_rt(void)
}
}
static void icount_dummy_timer(void *opaque)
static void icount_timer_cb(void *opaque)
{
(void)opaque;
/* No need for a checkpoint because the timer already synchronizes
* with CHECKPOINT_CLOCK_VIRTUAL_RT.
*/
icount_warp_rt();
}
void qtest_clock_warp(int64_t dest)
@@ -396,17 +400,12 @@ void qtest_clock_warp(int64_t dest)
qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
}
void qemu_clock_warp(QEMUClockType type)
void qemu_start_warp_timer(void)
{
int64_t clock;
int64_t deadline;
/*
* There are too many global variables to make the "warp" behavior
* applicable to other clocks. But a clock argument removes the
* need for if statements all over the place.
*/
if (type != QEMU_CLOCK_VIRTUAL || !use_icount) {
if (!use_icount) {
return;
}
@@ -418,29 +417,17 @@ void qemu_clock_warp(QEMUClockType type)
}
/* warp clock deterministically in record/replay mode */
if (!replay_checkpoint(CHECKPOINT_CLOCK_WARP)) {
if (!replay_checkpoint(CHECKPOINT_CLOCK_WARP_START)) {
return;
}
if (icount_sleep) {
/*
* If the CPUs have been sleeping, advance QEMU_CLOCK_VIRTUAL timer now.
* This ensures that the deadline for the timer is computed correctly
* below.
* This also makes sure that the insn counter is synchronized before
* the CPU starts running, in case the CPU is woken by an event other
* than the earliest QEMU_CLOCK_VIRTUAL timer.
*/
icount_warp_rt();
timer_del(icount_warp_timer);
}
if (!all_cpu_threads_idle()) {
return;
}
if (qtest_enabled()) {
/* When testing, qtest commands advance icount. */
return;
return;
}
/* We want to use the earliest deadline from ALL vm_clocks */
@@ -496,6 +483,28 @@ void qemu_clock_warp(QEMUClockType type)
}
}
static void qemu_account_warp_timer(void)
{
if (!use_icount || !icount_sleep) {
return;
}
/* Nothing to do if the VM is stopped: QEMU_CLOCK_VIRTUAL timers
* do not fire, so computing the deadline does not make sense.
*/
if (!runstate_is_running()) {
return;
}
/* warp clock deterministically in record/replay mode */
if (!replay_checkpoint(CHECKPOINT_CLOCK_WARP_ACCOUNT)) {
return;
}
timer_del(icount_warp_timer);
icount_warp_rt();
}
static bool icount_state_needed(void *opaque)
{
return use_icount;
@@ -624,13 +633,13 @@ void configure_icount(QemuOpts *opts, Error **errp)
icount_sleep = qemu_opt_get_bool(opts, "sleep", true);
if (icount_sleep) {
icount_warp_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL_RT,
icount_dummy_timer, NULL);
icount_timer_cb, NULL);
}
icount_align_option = qemu_opt_get_bool(opts, "align", false);
if (icount_align_option && !icount_sleep) {
error_setg(errp, "align=on and sleep=no are incompatible");
error_setg(errp, "align=on and sleep=off are incompatible");
}
if (strcmp(option, "auto") != 0) {
errno = 0;
@@ -643,7 +652,7 @@ void configure_icount(QemuOpts *opts, Error **errp)
} else if (icount_align_option) {
error_setg(errp, "shift=auto and align=on are incompatible");
} else if (!icount_sleep) {
error_setg(errp, "shift=auto and sleep=no are incompatible");
error_setg(errp, "shift=auto and sleep=off are incompatible");
}
use_icount = 2;
@@ -726,7 +735,7 @@ static int do_vm_stop(RunState state)
}
bdrv_drain_all();
ret = bdrv_flush_all();
ret = blk_flush_all();
return ret;
}
@@ -995,9 +1004,6 @@ static void qemu_wait_io_event_common(CPUState *cpu)
static void qemu_tcg_wait_io_event(CPUState *cpu)
{
while (all_cpu_threads_idle()) {
/* Start accounting real time to the virtual clock if the CPUs
are idle. */
qemu_clock_warp(QEMU_CLOCK_VIRTUAL);
qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);
}
@@ -1428,7 +1434,7 @@ int vm_stop_force_state(RunState state)
bdrv_drain_all();
/* Make sure to return an error if the flush in a previous vm_stop()
* failed. */
return bdrv_flush_all();
return blk_flush_all();
}
}
@@ -1499,7 +1505,7 @@ static void tcg_exec_all(void)
int r;
/* Account partial waits to QEMU_CLOCK_VIRTUAL. */
qemu_clock_warp(QEMU_CLOCK_VIRTUAL);
qemu_account_warp_timer();
if (next_cpu == NULL) {
next_cpu = first_cpu;
@@ -1568,28 +1574,22 @@ CpuInfoList *qmp_query_cpus(Error **errp)
info->value->thread_id = cpu->thread_id;
#if defined(TARGET_I386)
info->value->arch = CPU_INFO_ARCH_X86;
info->value->u.x86 = g_new0(CpuInfoX86, 1);
info->value->u.x86->pc = env->eip + env->segs[R_CS].base;
info->value->u.x86.pc = env->eip + env->segs[R_CS].base;
#elif defined(TARGET_PPC)
info->value->arch = CPU_INFO_ARCH_PPC;
info->value->u.ppc = g_new0(CpuInfoPPC, 1);
info->value->u.ppc->nip = env->nip;
info->value->u.ppc.nip = env->nip;
#elif defined(TARGET_SPARC)
info->value->arch = CPU_INFO_ARCH_SPARC;
info->value->u.sparc = g_new0(CpuInfoSPARC, 1);
info->value->u.sparc->pc = env->pc;
info->value->u.sparc->npc = env->npc;
info->value->u.q_sparc.pc = env->pc;
info->value->u.q_sparc.npc = env->npc;
#elif defined(TARGET_MIPS)
info->value->arch = CPU_INFO_ARCH_MIPS;
info->value->u.mips = g_new0(CpuInfoMIPS, 1);
info->value->u.mips->PC = env->active_tc.PC;
info->value->u.q_mips.PC = env->active_tc.PC;
#elif defined(TARGET_TRICORE)
info->value->arch = CPU_INFO_ARCH_TRICORE;
info->value->u.tricore = g_new0(CpuInfoTricore, 1);
info->value->u.tricore->PC = env->PC;
info->value->u.tricore.PC = env->PC;
#else
info->value->arch = CPU_INFO_ARCH_OTHER;
info->value->u.other = g_new0(CpuInfoOther, 1);
#endif
/* XXX: waiting for the qapi to support GSList */

View File

@@ -416,8 +416,8 @@ void tlb_set_page_with_attrs(CPUState *cpu, target_ulong vaddr,
/* Write access calls the I/O callback. */
te->addr_write = address | TLB_MMIO;
} else if (memory_region_is_ram(section->mr)
&& cpu_physical_memory_is_clean(section->mr->ram_addr
+ xlat)) {
&& cpu_physical_memory_is_clean(
memory_region_get_ram_addr(section->mr) + xlat)) {
te->addr_write = address | TLB_NOTDIRTY;
} else {
te->addr_write = address;

View File

@@ -8,6 +8,23 @@ crypto-obj-y += tlscredsanon.o
crypto-obj-y += tlscredsx509.o
crypto-obj-y += tlssession.o
crypto-obj-y += secret.o
crypto-obj-$(CONFIG_GCRYPT) += random-gcrypt.o
crypto-obj-$(if $(CONFIG_GCRYPT),n,$(CONFIG_GNUTLS_RND)) += random-gnutls.o
crypto-obj-y += pbkdf.o
crypto-obj-$(CONFIG_NETTLE) += pbkdf-nettle.o
crypto-obj-$(if $(CONFIG_NETTLE),n,$(CONFIG_GCRYPT_KDF)) += pbkdf-gcrypt.o
crypto-obj-y += ivgen.o
crypto-obj-y += ivgen-essiv.o
crypto-obj-y += ivgen-plain.o
crypto-obj-y += ivgen-plain64.o
crypto-obj-y += afsplit.o
crypto-obj-y += xts.o
crypto-obj-y += block.o
crypto-obj-y += block-qcow.o
crypto-obj-y += block-luks.o
# Let the userspace emulators avoid linking gnutls/etc
crypto-aes-obj-y = aes.o
stub-obj-y += random-stub.o
stub-obj-y += pbkdf-stub.o

158
crypto/afsplit.c Normal file
View File

@@ -0,0 +1,158 @@
/*
* QEMU Crypto anti forensic information splitter
*
* Copyright (c) 2015-2016 Red Hat, Inc.
*
* Derived from cryptsetup package lib/luks1/af.c
*
* Copyright (C) 2004, Clemens Fruhwirth <clemens@endorphin.org>
* Copyright (C) 2009-2012, Red Hat, Inc. All rights reserved.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public License
* as published by the Free Software Foundation; either version 2
* of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*/
#include "qemu/osdep.h"
#include "crypto/afsplit.h"
#include "crypto/random.h"
static void qcrypto_afsplit_xor(size_t blocklen,
const uint8_t *in1,
const uint8_t *in2,
uint8_t *out)
{
size_t i;
for (i = 0; i < blocklen; i++) {
out[i] = in1[i] ^ in2[i];
}
}
static int qcrypto_afsplit_hash(QCryptoHashAlgorithm hash,
size_t blocklen,
uint8_t *block,
Error **errp)
{
size_t digestlen = qcrypto_hash_digest_len(hash);
size_t hashcount = blocklen / digestlen;
size_t finallen = blocklen % digestlen;
uint32_t i;
if (finallen) {
hashcount++;
} else {
finallen = digestlen;
}
for (i = 0; i < hashcount; i++) {
uint8_t *out = NULL;
size_t outlen = 0;
uint32_t iv = cpu_to_be32(i);
struct iovec in[] = {
{ .iov_base = &iv,
.iov_len = sizeof(iv) },
{ .iov_base = block + (i * digestlen),
.iov_len = (i == (hashcount - 1)) ? finallen : digestlen },
};
if (qcrypto_hash_bytesv(hash,
in,
G_N_ELEMENTS(in),
&out, &outlen,
errp) < 0) {
return -1;
}
assert(outlen == digestlen);
memcpy(block + (i * digestlen), out,
(i == (hashcount - 1)) ? finallen : digestlen);
g_free(out);
}
return 0;
}
int qcrypto_afsplit_encode(QCryptoHashAlgorithm hash,
size_t blocklen,
uint32_t stripes,
const uint8_t *in,
uint8_t *out,
Error **errp)
{
uint8_t *block = g_new0(uint8_t, blocklen);
size_t i;
int ret = -1;
for (i = 0; i < (stripes - 1); i++) {
if (qcrypto_random_bytes(out + (i * blocklen), blocklen, errp) < 0) {
goto cleanup;
}
qcrypto_afsplit_xor(blocklen,
out + (i * blocklen),
block,
block);
if (qcrypto_afsplit_hash(hash, blocklen, block,
errp) < 0) {
goto cleanup;
}
}
qcrypto_afsplit_xor(blocklen,
in,
block,
out + (i * blocklen));
ret = 0;
cleanup:
g_free(block);
return ret;
}
int qcrypto_afsplit_decode(QCryptoHashAlgorithm hash,
size_t blocklen,
uint32_t stripes,
const uint8_t *in,
uint8_t *out,
Error **errp)
{
uint8_t *block = g_new0(uint8_t, blocklen);
size_t i;
int ret = -1;
for (i = 0; i < (stripes - 1); i++) {
qcrypto_afsplit_xor(blocklen,
in + (i * blocklen),
block,
block);
if (qcrypto_afsplit_hash(hash, blocklen, block,
errp) < 0) {
goto cleanup;
}
}
qcrypto_afsplit_xor(blocklen,
in + (i * blocklen),
block,
out);
ret = 0;
cleanup:
g_free(block);
return ret;
}

1328
crypto/block-luks.c Normal file

File diff suppressed because it is too large Load Diff

28
crypto/block-luks.h Normal file
View File

@@ -0,0 +1,28 @@
/*
* QEMU Crypto block device encryption LUKS format
*
* Copyright (c) 2015-2016 Red Hat, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*/
#ifndef QCRYPTO_BLOCK_LUKS_H__
#define QCRYPTO_BLOCK_LUKS_H__
#include "crypto/blockpriv.h"
extern const QCryptoBlockDriver qcrypto_block_driver_luks;
#endif /* QCRYPTO_BLOCK_LUKS_H__ */

173
crypto/block-qcow.c Normal file
View File

@@ -0,0 +1,173 @@
/*
* QEMU Crypto block device encryption QCow/QCow2 AES-CBC format
*
* Copyright (c) 2015-2016 Red Hat, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*/
/*
* Note that the block encryption implemented in this file is broken
* by design. This exists only to allow data to be liberated from
* existing qcow[2] images and should not be used in any new areas.
*/
#include "qemu/osdep.h"
#include "crypto/block-qcow.h"
#include "crypto/secret.h"
#define QCRYPTO_BLOCK_QCOW_SECTOR_SIZE 512
static bool
qcrypto_block_qcow_has_format(const uint8_t *buf G_GNUC_UNUSED,
size_t buf_size G_GNUC_UNUSED)
{
return false;
}
static int
qcrypto_block_qcow_init(QCryptoBlock *block,
const char *keysecret,
Error **errp)
{
char *password;
int ret;
uint8_t keybuf[16];
int len;
memset(keybuf, 0, 16);
password = qcrypto_secret_lookup_as_utf8(keysecret, errp);
if (!password) {
return -1;
}
len = strlen(password);
memcpy(keybuf, password, MIN(len, sizeof(keybuf)));
g_free(password);
block->niv = qcrypto_cipher_get_iv_len(QCRYPTO_CIPHER_ALG_AES_128,
QCRYPTO_CIPHER_MODE_CBC);
block->ivgen = qcrypto_ivgen_new(QCRYPTO_IVGEN_ALG_PLAIN64,
0, 0, NULL, 0, errp);
if (!block->ivgen) {
ret = -ENOTSUP;
goto fail;
}
block->cipher = qcrypto_cipher_new(QCRYPTO_CIPHER_ALG_AES_128,
QCRYPTO_CIPHER_MODE_CBC,
keybuf, G_N_ELEMENTS(keybuf),
errp);
if (!block->cipher) {
ret = -ENOTSUP;
goto fail;
}
block->payload_offset = 0;
return 0;
fail:
qcrypto_cipher_free(block->cipher);
qcrypto_ivgen_free(block->ivgen);
return ret;
}
static int
qcrypto_block_qcow_open(QCryptoBlock *block,
QCryptoBlockOpenOptions *options,
QCryptoBlockReadFunc readfunc G_GNUC_UNUSED,
void *opaque G_GNUC_UNUSED,
unsigned int flags,
Error **errp)
{
if (flags & QCRYPTO_BLOCK_OPEN_NO_IO) {
return 0;
} else {
if (!options->u.qcow.key_secret) {
error_setg(errp,
"Parameter 'key-secret' is required for cipher");
return -1;
}
return qcrypto_block_qcow_init(block,
options->u.qcow.key_secret, errp);
}
}
static int
qcrypto_block_qcow_create(QCryptoBlock *block,
QCryptoBlockCreateOptions *options,
QCryptoBlockInitFunc initfunc G_GNUC_UNUSED,
QCryptoBlockWriteFunc writefunc G_GNUC_UNUSED,
void *opaque G_GNUC_UNUSED,
Error **errp)
{
if (!options->u.qcow.key_secret) {
error_setg(errp, "Parameter 'key-secret' is required for cipher");
return -1;
}
/* QCow2 has no special header, since everything is hardwired */
return qcrypto_block_qcow_init(block, options->u.qcow.key_secret, errp);
}
static void
qcrypto_block_qcow_cleanup(QCryptoBlock *block)
{
}
static int
qcrypto_block_qcow_decrypt(QCryptoBlock *block,
uint64_t startsector,
uint8_t *buf,
size_t len,
Error **errp)
{
return qcrypto_block_decrypt_helper(block->cipher,
block->niv, block->ivgen,
QCRYPTO_BLOCK_QCOW_SECTOR_SIZE,
startsector, buf, len, errp);
}
static int
qcrypto_block_qcow_encrypt(QCryptoBlock *block,
uint64_t startsector,
uint8_t *buf,
size_t len,
Error **errp)
{
return qcrypto_block_encrypt_helper(block->cipher,
block->niv, block->ivgen,
QCRYPTO_BLOCK_QCOW_SECTOR_SIZE,
startsector, buf, len, errp);
}
const QCryptoBlockDriver qcrypto_block_driver_qcow = {
.open = qcrypto_block_qcow_open,
.create = qcrypto_block_qcow_create,
.cleanup = qcrypto_block_qcow_cleanup,
.decrypt = qcrypto_block_qcow_decrypt,
.encrypt = qcrypto_block_qcow_encrypt,
.has_format = qcrypto_block_qcow_has_format,
};

28
crypto/block-qcow.h Normal file
View File

@@ -0,0 +1,28 @@
/*
* QEMU Crypto block device encryption QCow/QCow2 AES-CBC format
*
* Copyright (c) 2015-2016 Red Hat, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*/
#ifndef QCRYPTO_BLOCK_QCOW_H__
#define QCRYPTO_BLOCK_QCOW_H__
#include "crypto/blockpriv.h"
extern const QCryptoBlockDriver qcrypto_block_driver_qcow;
#endif /* QCRYPTO_BLOCK_QCOW_H__ */

260
crypto/block.c Normal file
View File

@@ -0,0 +1,260 @@
/*
* QEMU Crypto block device encryption
*
* Copyright (c) 2015-2016 Red Hat, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*/
#include "qemu/osdep.h"
#include "crypto/blockpriv.h"
#include "crypto/block-qcow.h"
#include "crypto/block-luks.h"
static const QCryptoBlockDriver *qcrypto_block_drivers[] = {
[Q_CRYPTO_BLOCK_FORMAT_QCOW] = &qcrypto_block_driver_qcow,
[Q_CRYPTO_BLOCK_FORMAT_LUKS] = &qcrypto_block_driver_luks,
};
bool qcrypto_block_has_format(QCryptoBlockFormat format,
const uint8_t *buf,
size_t len)
{
const QCryptoBlockDriver *driver;
if (format >= G_N_ELEMENTS(qcrypto_block_drivers) ||
!qcrypto_block_drivers[format]) {
return false;
}
driver = qcrypto_block_drivers[format];
return driver->has_format(buf, len);
}
QCryptoBlock *qcrypto_block_open(QCryptoBlockOpenOptions *options,
QCryptoBlockReadFunc readfunc,
void *opaque,
unsigned int flags,
Error **errp)
{
QCryptoBlock *block = g_new0(QCryptoBlock, 1);
block->format = options->format;
if (options->format >= G_N_ELEMENTS(qcrypto_block_drivers) ||
!qcrypto_block_drivers[options->format]) {
error_setg(errp, "Unsupported block driver %d", options->format);
g_free(block);
return NULL;
}
block->driver = qcrypto_block_drivers[options->format];
if (block->driver->open(block, options,
readfunc, opaque, flags, errp) < 0) {
g_free(block);
return NULL;
}
return block;
}
QCryptoBlock *qcrypto_block_create(QCryptoBlockCreateOptions *options,
QCryptoBlockInitFunc initfunc,
QCryptoBlockWriteFunc writefunc,
void *opaque,
Error **errp)
{
QCryptoBlock *block = g_new0(QCryptoBlock, 1);
block->format = options->format;
if (options->format >= G_N_ELEMENTS(qcrypto_block_drivers) ||
!qcrypto_block_drivers[options->format]) {
error_setg(errp, "Unsupported block driver %d", options->format);
g_free(block);
return NULL;
}
block->driver = qcrypto_block_drivers[options->format];
if (block->driver->create(block, options, initfunc,
writefunc, opaque, errp) < 0) {
g_free(block);
return NULL;
}
return block;
}
int qcrypto_block_decrypt(QCryptoBlock *block,
uint64_t startsector,
uint8_t *buf,
size_t len,
Error **errp)
{
return block->driver->decrypt(block, startsector, buf, len, errp);
}
int qcrypto_block_encrypt(QCryptoBlock *block,
uint64_t startsector,
uint8_t *buf,
size_t len,
Error **errp)
{
return block->driver->encrypt(block, startsector, buf, len, errp);
}
QCryptoCipher *qcrypto_block_get_cipher(QCryptoBlock *block)
{
return block->cipher;
}
QCryptoIVGen *qcrypto_block_get_ivgen(QCryptoBlock *block)
{
return block->ivgen;
}
QCryptoHashAlgorithm qcrypto_block_get_kdf_hash(QCryptoBlock *block)
{
return block->kdfhash;
}
uint64_t qcrypto_block_get_payload_offset(QCryptoBlock *block)
{
return block->payload_offset;
}
void qcrypto_block_free(QCryptoBlock *block)
{
if (!block) {
return;
}
block->driver->cleanup(block);
qcrypto_cipher_free(block->cipher);
qcrypto_ivgen_free(block->ivgen);
g_free(block);
}
int qcrypto_block_decrypt_helper(QCryptoCipher *cipher,
size_t niv,
QCryptoIVGen *ivgen,
int sectorsize,
uint64_t startsector,
uint8_t *buf,
size_t len,
Error **errp)
{
uint8_t *iv;
int ret = -1;
iv = niv ? g_new0(uint8_t, niv) : NULL;
while (len > 0) {
size_t nbytes;
if (niv) {
if (qcrypto_ivgen_calculate(ivgen,
startsector,
iv, niv,
errp) < 0) {
goto cleanup;
}
if (qcrypto_cipher_setiv(cipher,
iv, niv,
errp) < 0) {
goto cleanup;
}
}
nbytes = len > sectorsize ? sectorsize : len;
if (qcrypto_cipher_decrypt(cipher, buf, buf,
nbytes, errp) < 0) {
goto cleanup;
}
startsector++;
buf += nbytes;
len -= nbytes;
}
ret = 0;
cleanup:
g_free(iv);
return ret;
}
int qcrypto_block_encrypt_helper(QCryptoCipher *cipher,
size_t niv,
QCryptoIVGen *ivgen,
int sectorsize,
uint64_t startsector,
uint8_t *buf,
size_t len,
Error **errp)
{
uint8_t *iv;
int ret = -1;
iv = niv ? g_new0(uint8_t, niv) : NULL;
while (len > 0) {
size_t nbytes;
if (niv) {
if (qcrypto_ivgen_calculate(ivgen,
startsector,
iv, niv,
errp) < 0) {
goto cleanup;
}
if (qcrypto_cipher_setiv(cipher,
iv, niv,
errp) < 0) {
goto cleanup;
}
}
nbytes = len > sectorsize ? sectorsize : len;
if (qcrypto_cipher_encrypt(cipher, buf, buf,
nbytes, errp) < 0) {
goto cleanup;
}
startsector++;
buf += nbytes;
len -= nbytes;
}
ret = 0;
cleanup:
g_free(iv);
return ret;
}

92
crypto/blockpriv.h Normal file
View File

@@ -0,0 +1,92 @@
/*
* QEMU Crypto block device encryption
*
* Copyright (c) 2015-2016 Red Hat, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*/
#ifndef QCRYPTO_BLOCK_PRIV_H__
#define QCRYPTO_BLOCK_PRIV_H__
#include "crypto/block.h"
typedef struct QCryptoBlockDriver QCryptoBlockDriver;
struct QCryptoBlock {
QCryptoBlockFormat format;
const QCryptoBlockDriver *driver;
void *opaque;
QCryptoCipher *cipher;
QCryptoIVGen *ivgen;
QCryptoHashAlgorithm kdfhash;
size_t niv;
uint64_t payload_offset; /* In bytes */
};
struct QCryptoBlockDriver {
int (*open)(QCryptoBlock *block,
QCryptoBlockOpenOptions *options,
QCryptoBlockReadFunc readfunc,
void *opaque,
unsigned int flags,
Error **errp);
int (*create)(QCryptoBlock *block,
QCryptoBlockCreateOptions *options,
QCryptoBlockInitFunc initfunc,
QCryptoBlockWriteFunc writefunc,
void *opaque,
Error **errp);
void (*cleanup)(QCryptoBlock *block);
int (*encrypt)(QCryptoBlock *block,
uint64_t startsector,
uint8_t *buf,
size_t len,
Error **errp);
int (*decrypt)(QCryptoBlock *block,
uint64_t startsector,
uint8_t *buf,
size_t len,
Error **errp);
bool (*has_format)(const uint8_t *buf,
size_t buflen);
};
int qcrypto_block_decrypt_helper(QCryptoCipher *cipher,
size_t niv,
QCryptoIVGen *ivgen,
int sectorsize,
uint64_t startsector,
uint8_t *buf,
size_t len,
Error **errp);
int qcrypto_block_encrypt_helper(QCryptoCipher *cipher,
size_t niv,
QCryptoIVGen *ivgen,
int sectorsize,
uint64_t startsector,
uint8_t *buf,
size_t len,
Error **errp);
#endif /* QCRYPTO_BLOCK_PRIV_H__ */

View File

@@ -21,11 +21,17 @@
#include "qemu/osdep.h"
#include "crypto/aes.h"
#include "crypto/desrfb.h"
#include "crypto/xts.h"
typedef struct QCryptoCipherBuiltinAESContext QCryptoCipherBuiltinAESContext;
struct QCryptoCipherBuiltinAESContext {
AES_KEY enc;
AES_KEY dec;
};
typedef struct QCryptoCipherBuiltinAES QCryptoCipherBuiltinAES;
struct QCryptoCipherBuiltinAES {
AES_KEY encrypt_key;
AES_KEY decrypt_key;
QCryptoCipherBuiltinAESContext key;
QCryptoCipherBuiltinAESContext key_tweak;
uint8_t iv[AES_BLOCK_SIZE];
};
typedef struct QCryptoCipherBuiltinDESRFB QCryptoCipherBuiltinDESRFB;
@@ -67,6 +73,82 @@ static void qcrypto_cipher_free_aes(QCryptoCipher *cipher)
}
static void qcrypto_cipher_aes_ecb_encrypt(AES_KEY *key,
const void *in,
void *out,
size_t len)
{
const uint8_t *inptr = in;
uint8_t *outptr = out;
while (len) {
if (len > AES_BLOCK_SIZE) {
AES_encrypt(inptr, outptr, key);
inptr += AES_BLOCK_SIZE;
outptr += AES_BLOCK_SIZE;
len -= AES_BLOCK_SIZE;
} else {
uint8_t tmp1[AES_BLOCK_SIZE], tmp2[AES_BLOCK_SIZE];
memcpy(tmp1, inptr, len);
/* Fill with 0 to avoid valgrind uninitialized reads */
memset(tmp1 + len, 0, sizeof(tmp1) - len);
AES_encrypt(tmp1, tmp2, key);
memcpy(outptr, tmp2, len);
len = 0;
}
}
}
static void qcrypto_cipher_aes_ecb_decrypt(AES_KEY *key,
const void *in,
void *out,
size_t len)
{
const uint8_t *inptr = in;
uint8_t *outptr = out;
while (len) {
if (len > AES_BLOCK_SIZE) {
AES_decrypt(inptr, outptr, key);
inptr += AES_BLOCK_SIZE;
outptr += AES_BLOCK_SIZE;
len -= AES_BLOCK_SIZE;
} else {
uint8_t tmp1[AES_BLOCK_SIZE], tmp2[AES_BLOCK_SIZE];
memcpy(tmp1, inptr, len);
/* Fill with 0 to avoid valgrind uninitialized reads */
memset(tmp1 + len, 0, sizeof(tmp1) - len);
AES_decrypt(tmp1, tmp2, key);
memcpy(outptr, tmp2, len);
len = 0;
}
}
}
static void qcrypto_cipher_aes_xts_encrypt(const void *ctx,
size_t length,
uint8_t *dst,
const uint8_t *src)
{
const QCryptoCipherBuiltinAESContext *aesctx = ctx;
qcrypto_cipher_aes_ecb_encrypt((AES_KEY *)&aesctx->enc,
src, dst, length);
}
static void qcrypto_cipher_aes_xts_decrypt(const void *ctx,
size_t length,
uint8_t *dst,
const uint8_t *src)
{
const QCryptoCipherBuiltinAESContext *aesctx = ctx;
qcrypto_cipher_aes_ecb_decrypt((AES_KEY *)&aesctx->dec,
src, dst, length);
}
static int qcrypto_cipher_encrypt_aes(QCryptoCipher *cipher,
const void *in,
void *out,
@@ -75,29 +157,26 @@ static int qcrypto_cipher_encrypt_aes(QCryptoCipher *cipher,
{
QCryptoCipherBuiltin *ctxt = cipher->opaque;
if (cipher->mode == QCRYPTO_CIPHER_MODE_ECB) {
const uint8_t *inptr = in;
uint8_t *outptr = out;
while (len) {
if (len > AES_BLOCK_SIZE) {
AES_encrypt(inptr, outptr, &ctxt->state.aes.encrypt_key);
inptr += AES_BLOCK_SIZE;
outptr += AES_BLOCK_SIZE;
len -= AES_BLOCK_SIZE;
} else {
uint8_t tmp1[AES_BLOCK_SIZE], tmp2[AES_BLOCK_SIZE];
memcpy(tmp1, inptr, len);
/* Fill with 0 to avoid valgrind uninitialized reads */
memset(tmp1 + len, 0, sizeof(tmp1) - len);
AES_encrypt(tmp1, tmp2, &ctxt->state.aes.encrypt_key);
memcpy(outptr, tmp2, len);
len = 0;
}
}
} else {
switch (cipher->mode) {
case QCRYPTO_CIPHER_MODE_ECB:
qcrypto_cipher_aes_ecb_encrypt(&ctxt->state.aes.key.enc,
in, out, len);
break;
case QCRYPTO_CIPHER_MODE_CBC:
AES_cbc_encrypt(in, out, len,
&ctxt->state.aes.encrypt_key,
&ctxt->state.aes.key.enc,
ctxt->state.aes.iv, 1);
break;
case QCRYPTO_CIPHER_MODE_XTS:
xts_encrypt(&ctxt->state.aes.key,
&ctxt->state.aes.key_tweak,
qcrypto_cipher_aes_xts_encrypt,
qcrypto_cipher_aes_xts_decrypt,
ctxt->state.aes.iv,
len, out, in);
break;
default:
g_assert_not_reached();
}
return 0;
@@ -112,29 +191,26 @@ static int qcrypto_cipher_decrypt_aes(QCryptoCipher *cipher,
{
QCryptoCipherBuiltin *ctxt = cipher->opaque;
if (cipher->mode == QCRYPTO_CIPHER_MODE_ECB) {
const uint8_t *inptr = in;
uint8_t *outptr = out;
while (len) {
if (len > AES_BLOCK_SIZE) {
AES_decrypt(inptr, outptr, &ctxt->state.aes.decrypt_key);
inptr += AES_BLOCK_SIZE;
outptr += AES_BLOCK_SIZE;
len -= AES_BLOCK_SIZE;
} else {
uint8_t tmp1[AES_BLOCK_SIZE], tmp2[AES_BLOCK_SIZE];
memcpy(tmp1, inptr, len);
/* Fill with 0 to avoid valgrind uninitialized reads */
memset(tmp1 + len, 0, sizeof(tmp1) - len);
AES_decrypt(tmp1, tmp2, &ctxt->state.aes.decrypt_key);
memcpy(outptr, tmp2, len);
len = 0;
}
}
} else {
switch (cipher->mode) {
case QCRYPTO_CIPHER_MODE_ECB:
qcrypto_cipher_aes_ecb_decrypt(&ctxt->state.aes.key.dec,
in, out, len);
break;
case QCRYPTO_CIPHER_MODE_CBC:
AES_cbc_encrypt(in, out, len,
&ctxt->state.aes.decrypt_key,
&ctxt->state.aes.key.dec,
ctxt->state.aes.iv, 0);
break;
case QCRYPTO_CIPHER_MODE_XTS:
xts_decrypt(&ctxt->state.aes.key,
&ctxt->state.aes.key_tweak,
qcrypto_cipher_aes_xts_encrypt,
qcrypto_cipher_aes_xts_decrypt,
ctxt->state.aes.iv,
len, out, in);
break;
default:
g_assert_not_reached();
}
return 0;
@@ -166,21 +242,46 @@ static int qcrypto_cipher_init_aes(QCryptoCipher *cipher,
QCryptoCipherBuiltin *ctxt;
if (cipher->mode != QCRYPTO_CIPHER_MODE_CBC &&
cipher->mode != QCRYPTO_CIPHER_MODE_ECB) {
cipher->mode != QCRYPTO_CIPHER_MODE_ECB &&
cipher->mode != QCRYPTO_CIPHER_MODE_XTS) {
error_setg(errp, "Unsupported cipher mode %d", cipher->mode);
return -1;
}
ctxt = g_new0(QCryptoCipherBuiltin, 1);
if (AES_set_encrypt_key(key, nkey * 8, &ctxt->state.aes.encrypt_key) != 0) {
error_setg(errp, "Failed to set encryption key");
goto error;
}
if (cipher->mode == QCRYPTO_CIPHER_MODE_XTS) {
if (AES_set_encrypt_key(key, nkey * 4, &ctxt->state.aes.key.enc) != 0) {
error_setg(errp, "Failed to set encryption key");
goto error;
}
if (AES_set_decrypt_key(key, nkey * 8, &ctxt->state.aes.decrypt_key) != 0) {
error_setg(errp, "Failed to set decryption key");
goto error;
if (AES_set_decrypt_key(key, nkey * 4, &ctxt->state.aes.key.dec) != 0) {
error_setg(errp, "Failed to set decryption key");
goto error;
}
if (AES_set_encrypt_key(key + (nkey / 2), nkey * 4,
&ctxt->state.aes.key_tweak.enc) != 0) {
error_setg(errp, "Failed to set encryption key");
goto error;
}
if (AES_set_decrypt_key(key + (nkey / 2), nkey * 4,
&ctxt->state.aes.key_tweak.dec) != 0) {
error_setg(errp, "Failed to set decryption key");
goto error;
}
} else {
if (AES_set_encrypt_key(key, nkey * 8, &ctxt->state.aes.key.enc) != 0) {
error_setg(errp, "Failed to set encryption key");
goto error;
}
if (AES_set_decrypt_key(key, nkey * 8, &ctxt->state.aes.key.dec) != 0) {
error_setg(errp, "Failed to set decryption key");
goto error;
}
}
ctxt->blocksize = AES_BLOCK_SIZE;
@@ -322,7 +423,7 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg,
cipher->alg = alg;
cipher->mode = mode;
if (!qcrypto_cipher_validate_key_length(alg, nkey, errp)) {
if (!qcrypto_cipher_validate_key_length(alg, mode, nkey, errp)) {
goto error;
}

View File

@@ -19,6 +19,8 @@
*/
#include "qemu/osdep.h"
#include "crypto/xts.h"
#include <gcrypt.h>
@@ -29,6 +31,12 @@ bool qcrypto_cipher_supports(QCryptoCipherAlgorithm alg)
case QCRYPTO_CIPHER_ALG_AES_128:
case QCRYPTO_CIPHER_ALG_AES_192:
case QCRYPTO_CIPHER_ALG_AES_256:
case QCRYPTO_CIPHER_ALG_CAST5_128:
case QCRYPTO_CIPHER_ALG_SERPENT_128:
case QCRYPTO_CIPHER_ALG_SERPENT_192:
case QCRYPTO_CIPHER_ALG_SERPENT_256:
case QCRYPTO_CIPHER_ALG_TWOFISH_128:
case QCRYPTO_CIPHER_ALG_TWOFISH_256:
return true;
default:
return false;
@@ -38,7 +46,9 @@ bool qcrypto_cipher_supports(QCryptoCipherAlgorithm alg)
typedef struct QCryptoCipherGcrypt QCryptoCipherGcrypt;
struct QCryptoCipherGcrypt {
gcry_cipher_hd_t handle;
gcry_cipher_hd_t tweakhandle;
size_t blocksize;
uint8_t *iv;
};
QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg,
@@ -53,6 +63,7 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg,
switch (mode) {
case QCRYPTO_CIPHER_MODE_ECB:
case QCRYPTO_CIPHER_MODE_XTS:
gcrymode = GCRY_CIPHER_MODE_ECB;
break;
case QCRYPTO_CIPHER_MODE_CBC:
@@ -63,7 +74,7 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg,
return NULL;
}
if (!qcrypto_cipher_validate_key_length(alg, nkey, errp)) {
if (!qcrypto_cipher_validate_key_length(alg, mode, nkey, errp)) {
return NULL;
}
@@ -84,6 +95,30 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg,
gcryalg = GCRY_CIPHER_AES256;
break;
case QCRYPTO_CIPHER_ALG_CAST5_128:
gcryalg = GCRY_CIPHER_CAST5;
break;
case QCRYPTO_CIPHER_ALG_SERPENT_128:
gcryalg = GCRY_CIPHER_SERPENT128;
break;
case QCRYPTO_CIPHER_ALG_SERPENT_192:
gcryalg = GCRY_CIPHER_SERPENT192;
break;
case QCRYPTO_CIPHER_ALG_SERPENT_256:
gcryalg = GCRY_CIPHER_SERPENT256;
break;
case QCRYPTO_CIPHER_ALG_TWOFISH_128:
gcryalg = GCRY_CIPHER_TWOFISH128;
break;
case QCRYPTO_CIPHER_ALG_TWOFISH_256:
gcryalg = GCRY_CIPHER_TWOFISH;
break;
default:
error_setg(errp, "Unsupported cipher algorithm %d", alg);
return NULL;
@@ -101,6 +136,14 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg,
gcry_strerror(err));
goto error;
}
if (cipher->mode == QCRYPTO_CIPHER_MODE_XTS) {
err = gcry_cipher_open(&ctx->tweakhandle, gcryalg, gcrymode, 0);
if (err != 0) {
error_setg(errp, "Cannot initialize cipher: %s",
gcry_strerror(err));
goto error;
}
}
if (cipher->alg == QCRYPTO_CIPHER_ALG_DES_RFB) {
/* We're using standard DES cipher from gcrypt, so we need
@@ -112,13 +155,44 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg,
g_free(rfbkey);
ctx->blocksize = 8;
} else {
err = gcry_cipher_setkey(ctx->handle, key, nkey);
ctx->blocksize = 16;
if (cipher->mode == QCRYPTO_CIPHER_MODE_XTS) {
nkey /= 2;
err = gcry_cipher_setkey(ctx->handle, key, nkey);
if (err != 0) {
error_setg(errp, "Cannot set key: %s",
gcry_strerror(err));
goto error;
}
err = gcry_cipher_setkey(ctx->tweakhandle, key + nkey, nkey);
} else {
err = gcry_cipher_setkey(ctx->handle, key, nkey);
}
if (err != 0) {
error_setg(errp, "Cannot set key: %s",
gcry_strerror(err));
goto error;
}
switch (cipher->alg) {
case QCRYPTO_CIPHER_ALG_AES_128:
case QCRYPTO_CIPHER_ALG_AES_192:
case QCRYPTO_CIPHER_ALG_AES_256:
case QCRYPTO_CIPHER_ALG_SERPENT_128:
case QCRYPTO_CIPHER_ALG_SERPENT_192:
case QCRYPTO_CIPHER_ALG_SERPENT_256:
case QCRYPTO_CIPHER_ALG_TWOFISH_128:
case QCRYPTO_CIPHER_ALG_TWOFISH_256:
ctx->blocksize = 16;
break;
case QCRYPTO_CIPHER_ALG_CAST5_128:
ctx->blocksize = 8;
break;
default:
g_assert_not_reached();
}
}
if (err != 0) {
error_setg(errp, "Cannot set key: %s",
gcry_strerror(err));
goto error;
if (cipher->mode == QCRYPTO_CIPHER_MODE_XTS) {
ctx->iv = g_new0(uint8_t, ctx->blocksize);
}
cipher->opaque = ctx;
@@ -126,6 +200,9 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg,
error:
gcry_cipher_close(ctx->handle);
if (cipher->mode == QCRYPTO_CIPHER_MODE_XTS) {
gcry_cipher_close(ctx->tweakhandle);
}
g_free(ctx);
g_free(cipher);
return NULL;
@@ -140,11 +217,35 @@ void qcrypto_cipher_free(QCryptoCipher *cipher)
}
ctx = cipher->opaque;
gcry_cipher_close(ctx->handle);
if (cipher->mode == QCRYPTO_CIPHER_MODE_XTS) {
gcry_cipher_close(ctx->tweakhandle);
}
g_free(ctx->iv);
g_free(ctx);
g_free(cipher);
}
static void qcrypto_gcrypt_xts_encrypt(const void *ctx,
size_t length,
uint8_t *dst,
const uint8_t *src)
{
gcry_error_t err;
err = gcry_cipher_encrypt((gcry_cipher_hd_t)ctx, dst, length, src, length);
g_assert(err == 0);
}
static void qcrypto_gcrypt_xts_decrypt(const void *ctx,
size_t length,
uint8_t *dst,
const uint8_t *src)
{
gcry_error_t err;
err = gcry_cipher_decrypt((gcry_cipher_hd_t)ctx, dst, length, src, length);
g_assert(err == 0);
}
int qcrypto_cipher_encrypt(QCryptoCipher *cipher,
const void *in,
void *out,
@@ -160,13 +261,20 @@ int qcrypto_cipher_encrypt(QCryptoCipher *cipher,
return -1;
}
err = gcry_cipher_encrypt(ctx->handle,
out, len,
in, len);
if (err != 0) {
error_setg(errp, "Cannot encrypt data: %s",
gcry_strerror(err));
return -1;
if (cipher->mode == QCRYPTO_CIPHER_MODE_XTS) {
xts_encrypt(ctx->handle, ctx->tweakhandle,
qcrypto_gcrypt_xts_encrypt,
qcrypto_gcrypt_xts_decrypt,
ctx->iv, len, out, in);
} else {
err = gcry_cipher_encrypt(ctx->handle,
out, len,
in, len);
if (err != 0) {
error_setg(errp, "Cannot encrypt data: %s",
gcry_strerror(err));
return -1;
}
}
return 0;
@@ -188,13 +296,20 @@ int qcrypto_cipher_decrypt(QCryptoCipher *cipher,
return -1;
}
err = gcry_cipher_decrypt(ctx->handle,
out, len,
in, len);
if (err != 0) {
error_setg(errp, "Cannot decrypt data: %s",
gcry_strerror(err));
return -1;
if (cipher->mode == QCRYPTO_CIPHER_MODE_XTS) {
xts_decrypt(ctx->handle, ctx->tweakhandle,
qcrypto_gcrypt_xts_encrypt,
qcrypto_gcrypt_xts_decrypt,
ctx->iv, len, out, in);
} else {
err = gcry_cipher_decrypt(ctx->handle,
out, len,
in, len);
if (err != 0) {
error_setg(errp, "Cannot decrypt data: %s",
gcry_strerror(err));
return -1;
}
}
return 0;
@@ -213,12 +328,16 @@ int qcrypto_cipher_setiv(QCryptoCipher *cipher,
return -1;
}
gcry_cipher_reset(ctx->handle);
err = gcry_cipher_setiv(ctx->handle, iv, niv);
if (err != 0) {
error_setg(errp, "Cannot set IV: %s",
if (ctx->iv) {
memcpy(ctx->iv, iv, niv);
} else {
gcry_cipher_reset(ctx->handle);
err = gcry_cipher_setiv(ctx->handle, iv, niv);
if (err != 0) {
error_setg(errp, "Cannot set IV: %s",
gcry_strerror(err));
return -1;
return -1;
}
}
return 0;

View File

@@ -19,56 +19,174 @@
*/
#include "qemu/osdep.h"
#include "crypto/xts.h"
#include <nettle/nettle-types.h>
#include <nettle/aes.h>
#include <nettle/des.h>
#include <nettle/cbc.h>
#include <nettle/cast128.h>
#include <nettle/serpent.h>
#include <nettle/twofish.h>
typedef void (*QCryptoCipherNettleFuncWrapper)(const void *ctx,
size_t length,
uint8_t *dst,
const uint8_t *src);
#if CONFIG_NETTLE_VERSION_MAJOR < 3
typedef nettle_crypt_func nettle_cipher_func;
typedef nettle_crypt_func * QCryptoCipherNettleFuncNative;
typedef void * cipher_ctx_t;
typedef unsigned cipher_length_t;
#define cast5_set_key cast128_set_key
#else
typedef nettle_cipher_func * QCryptoCipherNettleFuncNative;
typedef const void * cipher_ctx_t;
typedef size_t cipher_length_t;
#endif
static nettle_cipher_func aes_encrypt_wrapper;
static nettle_cipher_func aes_decrypt_wrapper;
static nettle_cipher_func des_encrypt_wrapper;
static nettle_cipher_func des_decrypt_wrapper;
typedef struct QCryptoNettleAES {
struct aes_ctx enc;
struct aes_ctx dec;
} QCryptoNettleAES;
static void aes_encrypt_wrapper(cipher_ctx_t ctx, cipher_length_t length,
uint8_t *dst, const uint8_t *src)
static void aes_encrypt_native(cipher_ctx_t ctx, cipher_length_t length,
uint8_t *dst, const uint8_t *src)
{
aes_encrypt(ctx, length, dst, src);
const QCryptoNettleAES *aesctx = ctx;
aes_encrypt(&aesctx->enc, length, dst, src);
}
static void aes_decrypt_wrapper(cipher_ctx_t ctx, cipher_length_t length,
uint8_t *dst, const uint8_t *src)
static void aes_decrypt_native(cipher_ctx_t ctx, cipher_length_t length,
uint8_t *dst, const uint8_t *src)
{
aes_decrypt(ctx, length, dst, src);
const QCryptoNettleAES *aesctx = ctx;
aes_decrypt(&aesctx->dec, length, dst, src);
}
static void des_encrypt_wrapper(cipher_ctx_t ctx, cipher_length_t length,
static void des_encrypt_native(cipher_ctx_t ctx, cipher_length_t length,
uint8_t *dst, const uint8_t *src)
{
des_encrypt(ctx, length, dst, src);
}
static void des_decrypt_native(cipher_ctx_t ctx, cipher_length_t length,
uint8_t *dst, const uint8_t *src)
{
des_decrypt(ctx, length, dst, src);
}
static void cast128_encrypt_native(cipher_ctx_t ctx, cipher_length_t length,
uint8_t *dst, const uint8_t *src)
{
cast128_encrypt(ctx, length, dst, src);
}
static void cast128_decrypt_native(cipher_ctx_t ctx, cipher_length_t length,
uint8_t *dst, const uint8_t *src)
{
cast128_decrypt(ctx, length, dst, src);
}
static void serpent_encrypt_native(cipher_ctx_t ctx, cipher_length_t length,
uint8_t *dst, const uint8_t *src)
{
serpent_encrypt(ctx, length, dst, src);
}
static void serpent_decrypt_native(cipher_ctx_t ctx, cipher_length_t length,
uint8_t *dst, const uint8_t *src)
{
serpent_decrypt(ctx, length, dst, src);
}
static void twofish_encrypt_native(cipher_ctx_t ctx, cipher_length_t length,
uint8_t *dst, const uint8_t *src)
{
twofish_encrypt(ctx, length, dst, src);
}
static void twofish_decrypt_native(cipher_ctx_t ctx, cipher_length_t length,
uint8_t *dst, const uint8_t *src)
{
twofish_decrypt(ctx, length, dst, src);
}
static void aes_encrypt_wrapper(const void *ctx, size_t length,
uint8_t *dst, const uint8_t *src)
{
const QCryptoNettleAES *aesctx = ctx;
aes_encrypt(&aesctx->enc, length, dst, src);
}
static void aes_decrypt_wrapper(const void *ctx, size_t length,
uint8_t *dst, const uint8_t *src)
{
const QCryptoNettleAES *aesctx = ctx;
aes_decrypt(&aesctx->dec, length, dst, src);
}
static void des_encrypt_wrapper(const void *ctx, size_t length,
uint8_t *dst, const uint8_t *src)
{
des_encrypt(ctx, length, dst, src);
}
static void des_decrypt_wrapper(cipher_ctx_t ctx, cipher_length_t length,
static void des_decrypt_wrapper(const void *ctx, size_t length,
uint8_t *dst, const uint8_t *src)
{
des_decrypt(ctx, length, dst, src);
}
static void cast128_encrypt_wrapper(const void *ctx, size_t length,
uint8_t *dst, const uint8_t *src)
{
cast128_encrypt(ctx, length, dst, src);
}
static void cast128_decrypt_wrapper(const void *ctx, size_t length,
uint8_t *dst, const uint8_t *src)
{
cast128_decrypt(ctx, length, dst, src);
}
static void serpent_encrypt_wrapper(const void *ctx, size_t length,
uint8_t *dst, const uint8_t *src)
{
serpent_encrypt(ctx, length, dst, src);
}
static void serpent_decrypt_wrapper(const void *ctx, size_t length,
uint8_t *dst, const uint8_t *src)
{
serpent_decrypt(ctx, length, dst, src);
}
static void twofish_encrypt_wrapper(const void *ctx, size_t length,
uint8_t *dst, const uint8_t *src)
{
twofish_encrypt(ctx, length, dst, src);
}
static void twofish_decrypt_wrapper(const void *ctx, size_t length,
uint8_t *dst, const uint8_t *src)
{
twofish_decrypt(ctx, length, dst, src);
}
typedef struct QCryptoCipherNettle QCryptoCipherNettle;
struct QCryptoCipherNettle {
void *ctx_encrypt;
void *ctx_decrypt;
nettle_cipher_func *alg_encrypt;
nettle_cipher_func *alg_decrypt;
/* Primary cipher context for all modes */
void *ctx;
/* Second cipher context for XTS mode only */
void *ctx_tweak;
/* Cipher callbacks for both contexts */
QCryptoCipherNettleFuncNative alg_encrypt_native;
QCryptoCipherNettleFuncNative alg_decrypt_native;
QCryptoCipherNettleFuncWrapper alg_encrypt_wrapper;
QCryptoCipherNettleFuncWrapper alg_decrypt_wrapper;
uint8_t *iv;
size_t blocksize;
};
@@ -80,6 +198,13 @@ bool qcrypto_cipher_supports(QCryptoCipherAlgorithm alg)
case QCRYPTO_CIPHER_ALG_AES_128:
case QCRYPTO_CIPHER_ALG_AES_192:
case QCRYPTO_CIPHER_ALG_AES_256:
case QCRYPTO_CIPHER_ALG_CAST5_128:
case QCRYPTO_CIPHER_ALG_SERPENT_128:
case QCRYPTO_CIPHER_ALG_SERPENT_192:
case QCRYPTO_CIPHER_ALG_SERPENT_256:
case QCRYPTO_CIPHER_ALG_TWOFISH_128:
case QCRYPTO_CIPHER_ALG_TWOFISH_192:
case QCRYPTO_CIPHER_ALG_TWOFISH_256:
return true;
default:
return false;
@@ -99,13 +224,14 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg,
switch (mode) {
case QCRYPTO_CIPHER_MODE_ECB:
case QCRYPTO_CIPHER_MODE_CBC:
case QCRYPTO_CIPHER_MODE_XTS:
break;
default:
error_setg(errp, "Unsupported cipher mode %d", mode);
return NULL;
}
if (!qcrypto_cipher_validate_key_length(alg, nkey, errp)) {
if (!qcrypto_cipher_validate_key_length(alg, mode, nkey, errp)) {
return NULL;
}
@@ -117,14 +243,15 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg,
switch (alg) {
case QCRYPTO_CIPHER_ALG_DES_RFB:
ctx->ctx_encrypt = g_new0(struct des_ctx, 1);
ctx->ctx_decrypt = NULL; /* 1 ctx can do both */
ctx->ctx = g_new0(struct des_ctx, 1);
rfbkey = qcrypto_cipher_munge_des_rfb_key(key, nkey);
des_set_key(ctx->ctx_encrypt, rfbkey);
des_set_key(ctx->ctx, rfbkey);
g_free(rfbkey);
ctx->alg_encrypt = des_encrypt_wrapper;
ctx->alg_decrypt = des_decrypt_wrapper;
ctx->alg_encrypt_native = des_encrypt_native;
ctx->alg_decrypt_native = des_decrypt_native;
ctx->alg_encrypt_wrapper = des_encrypt_wrapper;
ctx->alg_decrypt_wrapper = des_decrypt_wrapper;
ctx->blocksize = DES_BLOCK_SIZE;
break;
@@ -132,17 +259,103 @@ QCryptoCipher *qcrypto_cipher_new(QCryptoCipherAlgorithm alg,
case QCRYPTO_CIPHER_ALG_AES_128:
case QCRYPTO_CIPHER_ALG_AES_192:
case QCRYPTO_CIPHER_ALG_AES_256:
ctx->ctx_encrypt = g_new0(struct aes_ctx, 1);
ctx->ctx_decrypt = g_new0(struct aes_ctx, 1);
ctx->ctx = g_new0(QCryptoNettleAES, 1);
aes_set_encrypt_key(ctx->ctx_encrypt, nkey, key);
aes_set_decrypt_key(ctx->ctx_decrypt, nkey, key);
if (mode == QCRYPTO_CIPHER_MODE_XTS) {
ctx->ctx_tweak = g_new0(QCryptoNettleAES, 1);
ctx->alg_encrypt = aes_encrypt_wrapper;
ctx->alg_decrypt = aes_decrypt_wrapper;
nkey /= 2;
aes_set_encrypt_key(&((QCryptoNettleAES *)ctx->ctx)->enc,
nkey, key);
aes_set_decrypt_key(&((QCryptoNettleAES *)ctx->ctx)->dec,
nkey, key);
aes_set_encrypt_key(&((QCryptoNettleAES *)ctx->ctx_tweak)->enc,
nkey, key + nkey);
aes_set_decrypt_key(&((QCryptoNettleAES *)ctx->ctx_tweak)->dec,
nkey, key + nkey);
} else {
aes_set_encrypt_key(&((QCryptoNettleAES *)ctx->ctx)->enc,
nkey, key);
aes_set_decrypt_key(&((QCryptoNettleAES *)ctx->ctx)->dec,
nkey, key);
}
ctx->alg_encrypt_native = aes_encrypt_native;
ctx->alg_decrypt_native = aes_decrypt_native;
ctx->alg_encrypt_wrapper = aes_encrypt_wrapper;
ctx->alg_decrypt_wrapper = aes_decrypt_wrapper;
ctx->blocksize = AES_BLOCK_SIZE;
break;
case QCRYPTO_CIPHER_ALG_CAST5_128:
ctx->ctx = g_new0(struct cast128_ctx, 1);
if (mode == QCRYPTO_CIPHER_MODE_XTS) {
ctx->ctx_tweak = g_new0(struct cast128_ctx, 1);
nkey /= 2;
cast5_set_key(ctx->ctx, nkey, key);
cast5_set_key(ctx->ctx_tweak, nkey, key + nkey);
} else {
cast5_set_key(ctx->ctx, nkey, key);
}
ctx->alg_encrypt_native = cast128_encrypt_native;
ctx->alg_decrypt_native = cast128_decrypt_native;
ctx->alg_encrypt_wrapper = cast128_encrypt_wrapper;
ctx->alg_decrypt_wrapper = cast128_decrypt_wrapper;
ctx->blocksize = CAST128_BLOCK_SIZE;
break;
case QCRYPTO_CIPHER_ALG_SERPENT_128:
case QCRYPTO_CIPHER_ALG_SERPENT_192:
case QCRYPTO_CIPHER_ALG_SERPENT_256:
ctx->ctx = g_new0(struct serpent_ctx, 1);
if (mode == QCRYPTO_CIPHER_MODE_XTS) {
ctx->ctx_tweak = g_new0(struct serpent_ctx, 1);
nkey /= 2;
serpent_set_key(ctx->ctx, nkey, key);
serpent_set_key(ctx->ctx_tweak, nkey, key + nkey);
} else {
serpent_set_key(ctx->ctx, nkey, key);
}
ctx->alg_encrypt_native = serpent_encrypt_native;
ctx->alg_decrypt_native = serpent_decrypt_native;
ctx->alg_encrypt_wrapper = serpent_encrypt_wrapper;
ctx->alg_decrypt_wrapper = serpent_decrypt_wrapper;
ctx->blocksize = SERPENT_BLOCK_SIZE;
break;
case QCRYPTO_CIPHER_ALG_TWOFISH_128:
case QCRYPTO_CIPHER_ALG_TWOFISH_192:
case QCRYPTO_CIPHER_ALG_TWOFISH_256:
ctx->ctx = g_new0(struct twofish_ctx, 1);
if (mode == QCRYPTO_CIPHER_MODE_XTS) {
ctx->ctx_tweak = g_new0(struct twofish_ctx, 1);
nkey /= 2;
twofish_set_key(ctx->ctx, nkey, key);
twofish_set_key(ctx->ctx_tweak, nkey, key + nkey);
} else {
twofish_set_key(ctx->ctx, nkey, key);
}
ctx->alg_encrypt_native = twofish_encrypt_native;
ctx->alg_decrypt_native = twofish_decrypt_native;
ctx->alg_encrypt_wrapper = twofish_encrypt_wrapper;
ctx->alg_decrypt_wrapper = twofish_decrypt_wrapper;
ctx->blocksize = TWOFISH_BLOCK_SIZE;
break;
default:
error_setg(errp, "Unsupported cipher algorithm %d", alg);
goto error;
@@ -170,8 +383,8 @@ void qcrypto_cipher_free(QCryptoCipher *cipher)
ctx = cipher->opaque;
g_free(ctx->iv);
g_free(ctx->ctx_encrypt);
g_free(ctx->ctx_decrypt);
g_free(ctx->ctx);
g_free(ctx->ctx_tweak);
g_free(ctx);
g_free(cipher);
}
@@ -193,14 +406,21 @@ int qcrypto_cipher_encrypt(QCryptoCipher *cipher,
switch (cipher->mode) {
case QCRYPTO_CIPHER_MODE_ECB:
ctx->alg_encrypt(ctx->ctx_encrypt, len, out, in);
ctx->alg_encrypt_wrapper(ctx->ctx, len, out, in);
break;
case QCRYPTO_CIPHER_MODE_CBC:
cbc_encrypt(ctx->ctx_encrypt, ctx->alg_encrypt,
cbc_encrypt(ctx->ctx, ctx->alg_encrypt_native,
ctx->blocksize, ctx->iv,
len, out, in);
break;
case QCRYPTO_CIPHER_MODE_XTS:
xts_encrypt(ctx->ctx, ctx->ctx_tweak,
ctx->alg_encrypt_wrapper, ctx->alg_encrypt_wrapper,
ctx->iv, len, out, in);
break;
default:
error_setg(errp, "Unsupported cipher algorithm %d",
cipher->alg);
@@ -226,15 +446,26 @@ int qcrypto_cipher_decrypt(QCryptoCipher *cipher,
switch (cipher->mode) {
case QCRYPTO_CIPHER_MODE_ECB:
ctx->alg_decrypt(ctx->ctx_decrypt ? ctx->ctx_decrypt : ctx->ctx_encrypt,
len, out, in);
ctx->alg_decrypt_wrapper(ctx->ctx, len, out, in);
break;
case QCRYPTO_CIPHER_MODE_CBC:
cbc_decrypt(ctx->ctx_decrypt ? ctx->ctx_decrypt : ctx->ctx_encrypt,
ctx->alg_decrypt, ctx->blocksize, ctx->iv,
cbc_decrypt(ctx->ctx, ctx->alg_decrypt_native,
ctx->blocksize, ctx->iv,
len, out, in);
break;
case QCRYPTO_CIPHER_MODE_XTS:
if (ctx->blocksize != XTS_BLOCK_SIZE) {
error_setg(errp, "Block size must be %d not %zu",
XTS_BLOCK_SIZE, ctx->blocksize);
return -1;
}
xts_decrypt(ctx->ctx, ctx->ctx_tweak,
ctx->alg_encrypt_wrapper, ctx->alg_decrypt_wrapper,
ctx->iv, len, out, in);
break;
default:
error_setg(errp, "Unsupported cipher algorithm %d",
cipher->alg);

View File

@@ -27,6 +27,13 @@ static size_t alg_key_len[QCRYPTO_CIPHER_ALG__MAX] = {
[QCRYPTO_CIPHER_ALG_AES_192] = 24,
[QCRYPTO_CIPHER_ALG_AES_256] = 32,
[QCRYPTO_CIPHER_ALG_DES_RFB] = 8,
[QCRYPTO_CIPHER_ALG_CAST5_128] = 16,
[QCRYPTO_CIPHER_ALG_SERPENT_128] = 16,
[QCRYPTO_CIPHER_ALG_SERPENT_192] = 24,
[QCRYPTO_CIPHER_ALG_SERPENT_256] = 32,
[QCRYPTO_CIPHER_ALG_TWOFISH_128] = 16,
[QCRYPTO_CIPHER_ALG_TWOFISH_192] = 24,
[QCRYPTO_CIPHER_ALG_TWOFISH_256] = 32,
};
static size_t alg_block_len[QCRYPTO_CIPHER_ALG__MAX] = {
@@ -34,11 +41,19 @@ static size_t alg_block_len[QCRYPTO_CIPHER_ALG__MAX] = {
[QCRYPTO_CIPHER_ALG_AES_192] = 16,
[QCRYPTO_CIPHER_ALG_AES_256] = 16,
[QCRYPTO_CIPHER_ALG_DES_RFB] = 8,
[QCRYPTO_CIPHER_ALG_CAST5_128] = 8,
[QCRYPTO_CIPHER_ALG_SERPENT_128] = 16,
[QCRYPTO_CIPHER_ALG_SERPENT_192] = 16,
[QCRYPTO_CIPHER_ALG_SERPENT_256] = 16,
[QCRYPTO_CIPHER_ALG_TWOFISH_128] = 16,
[QCRYPTO_CIPHER_ALG_TWOFISH_192] = 16,
[QCRYPTO_CIPHER_ALG_TWOFISH_256] = 16,
};
static bool mode_need_iv[QCRYPTO_CIPHER_MODE__MAX] = {
[QCRYPTO_CIPHER_MODE_ECB] = false,
[QCRYPTO_CIPHER_MODE_CBC] = true,
[QCRYPTO_CIPHER_MODE_XTS] = true,
};
@@ -79,6 +94,7 @@ size_t qcrypto_cipher_get_iv_len(QCryptoCipherAlgorithm alg,
static bool
qcrypto_cipher_validate_key_length(QCryptoCipherAlgorithm alg,
QCryptoCipherMode mode,
size_t nkey,
Error **errp)
{
@@ -88,10 +104,27 @@ qcrypto_cipher_validate_key_length(QCryptoCipherAlgorithm alg,
return false;
}
if (alg_key_len[alg] != nkey) {
error_setg(errp, "Cipher key length %zu should be %zu",
nkey, alg_key_len[alg]);
return false;
if (mode == QCRYPTO_CIPHER_MODE_XTS) {
if (alg == QCRYPTO_CIPHER_ALG_DES_RFB) {
error_setg(errp, "XTS mode not compatible with DES-RFB");
return false;
}
if (nkey % 2) {
error_setg(errp, "XTS cipher key length should be a multiple of 2");
return false;
}
if (alg_key_len[alg] != (nkey / 2)) {
error_setg(errp, "Cipher key length %zu should be %zu",
nkey, alg_key_len[alg] * 2);
return false;
}
} else {
if (alg_key_len[alg] != nkey) {
error_setg(errp, "Cipher key length %zu should be %zu",
nkey, alg_key_len[alg]);
return false;
}
}
return true;
}

View File

@@ -24,12 +24,8 @@
#ifdef CONFIG_GNUTLS_HASH
#include <gnutls/gnutls.h>
#include <gnutls/crypto.h>
#endif
static int qcrypto_hash_alg_map[QCRYPTO_HASH_ALG__MAX] = {
[QCRYPTO_HASH_ALG_MD5] = GNUTLS_DIG_MD5,
[QCRYPTO_HASH_ALG_SHA1] = GNUTLS_DIG_SHA1,
[QCRYPTO_HASH_ALG_SHA256] = GNUTLS_DIG_SHA256,
};
static size_t qcrypto_hash_alg_size[QCRYPTO_HASH_ALG__MAX] = {
[QCRYPTO_HASH_ALG_MD5] = 16,
@@ -37,6 +33,22 @@ static size_t qcrypto_hash_alg_size[QCRYPTO_HASH_ALG__MAX] = {
[QCRYPTO_HASH_ALG_SHA256] = 32,
};
size_t qcrypto_hash_digest_len(QCryptoHashAlgorithm alg)
{
if (alg >= G_N_ELEMENTS(qcrypto_hash_alg_size)) {
return 0;
}
return qcrypto_hash_alg_size[alg];
}
#ifdef CONFIG_GNUTLS_HASH
static int qcrypto_hash_alg_map[QCRYPTO_HASH_ALG__MAX] = {
[QCRYPTO_HASH_ALG_MD5] = GNUTLS_DIG_MD5,
[QCRYPTO_HASH_ALG_SHA1] = GNUTLS_DIG_SHA1,
[QCRYPTO_HASH_ALG_SHA256] = GNUTLS_DIG_SHA256,
};
gboolean qcrypto_hash_supports(QCryptoHashAlgorithm alg)
{
if (alg < G_N_ELEMENTS(qcrypto_hash_alg_map)) {
@@ -45,14 +57,6 @@ gboolean qcrypto_hash_supports(QCryptoHashAlgorithm alg)
return false;
}
size_t qcrypto_hash_digest_len(QCryptoHashAlgorithm alg)
{
if (alg >= G_N_ELEMENTS(qcrypto_hash_alg_size)) {
return 0;
}
return qcrypto_hash_alg_size[alg];
}
int qcrypto_hash_bytesv(QCryptoHashAlgorithm alg,
const struct iovec *iov,

118
crypto/ivgen-essiv.c Normal file
View File

@@ -0,0 +1,118 @@
/*
* QEMU Crypto block IV generator - essiv
*
* Copyright (c) 2015-2016 Red Hat, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*/
#include "qemu/osdep.h"
#include "crypto/ivgen-essiv.h"
typedef struct QCryptoIVGenESSIV QCryptoIVGenESSIV;
struct QCryptoIVGenESSIV {
QCryptoCipher *cipher;
};
static int qcrypto_ivgen_essiv_init(QCryptoIVGen *ivgen,
const uint8_t *key, size_t nkey,
Error **errp)
{
uint8_t *salt;
size_t nhash;
size_t nsalt;
QCryptoIVGenESSIV *essiv = g_new0(QCryptoIVGenESSIV, 1);
/* Not necessarily the same as nkey */
nsalt = qcrypto_cipher_get_key_len(ivgen->cipher);
nhash = qcrypto_hash_digest_len(ivgen->hash);
/* Salt must be larger of hash size or key size */
salt = g_new0(uint8_t, MAX(nhash, nsalt));
if (qcrypto_hash_bytes(ivgen->hash, (const gchar *)key, nkey,
&salt, &nhash,
errp) < 0) {
g_free(essiv);
return -1;
}
/* Now potentially truncate salt to match cipher key len */
essiv->cipher = qcrypto_cipher_new(ivgen->cipher,
QCRYPTO_CIPHER_MODE_ECB,
salt, MIN(nhash, nsalt),
errp);
if (!essiv->cipher) {
g_free(essiv);
g_free(salt);
return -1;
}
g_free(salt);
ivgen->private = essiv;
return 0;
}
static int qcrypto_ivgen_essiv_calculate(QCryptoIVGen *ivgen,
uint64_t sector,
uint8_t *iv, size_t niv,
Error **errp)
{
QCryptoIVGenESSIV *essiv = ivgen->private;
size_t ndata = qcrypto_cipher_get_block_len(ivgen->cipher);
uint8_t *data = g_new(uint8_t, ndata);
sector = cpu_to_le64(sector);
memcpy(data, (uint8_t *)&sector, ndata);
if (sizeof(sector) < ndata) {
memset(data + sizeof(sector), 0, ndata - sizeof(sector));
}
if (qcrypto_cipher_encrypt(essiv->cipher,
data,
data,
ndata,
errp) < 0) {
g_free(data);
return -1;
}
if (ndata > niv) {
ndata = niv;
}
memcpy(iv, data, ndata);
if (ndata < niv) {
memset(iv + ndata, 0, niv - ndata);
}
g_free(data);
return 0;
}
static void qcrypto_ivgen_essiv_cleanup(QCryptoIVGen *ivgen)
{
QCryptoIVGenESSIV *essiv = ivgen->private;
qcrypto_cipher_free(essiv->cipher);
g_free(essiv);
}
struct QCryptoIVGenDriver qcrypto_ivgen_essiv = {
.init = qcrypto_ivgen_essiv_init,
.calculate = qcrypto_ivgen_essiv_calculate,
.cleanup = qcrypto_ivgen_essiv_cleanup,
};

28
crypto/ivgen-essiv.h Normal file
View File

@@ -0,0 +1,28 @@
/*
* QEMU Crypto block IV generator - essiv
*
* Copyright (c) 2015-2016 Red Hat, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*/
#include "crypto/ivgenpriv.h"
#ifndef QCRYPTO_IVGEN_ESSIV_H__
#define QCRYPTO_IVGEN_ESSIV_H__
extern struct QCryptoIVGenDriver qcrypto_ivgen_essiv;
#endif /* QCRYPTO_IVGEN_ESSIV_H__ */

59
crypto/ivgen-plain.c Normal file
View File

@@ -0,0 +1,59 @@
/*
* QEMU Crypto block IV generator - plain
*
* Copyright (c) 2015-2016 Red Hat, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*/
#include "qemu/osdep.h"
#include "crypto/ivgen-plain.h"
static int qcrypto_ivgen_plain_init(QCryptoIVGen *ivgen,
const uint8_t *key, size_t nkey,
Error **errp)
{
return 0;
}
static int qcrypto_ivgen_plain_calculate(QCryptoIVGen *ivgen,
uint64_t sector,
uint8_t *iv, size_t niv,
Error **errp)
{
size_t ivprefix;
uint32_t shortsector = cpu_to_le32((sector & 0xffffffff));
ivprefix = sizeof(shortsector);
if (ivprefix > niv) {
ivprefix = niv;
}
memcpy(iv, &shortsector, ivprefix);
if (ivprefix < niv) {
memset(iv + ivprefix, 0, niv - ivprefix);
}
return 0;
}
static void qcrypto_ivgen_plain_cleanup(QCryptoIVGen *ivgen)
{
}
struct QCryptoIVGenDriver qcrypto_ivgen_plain = {
.init = qcrypto_ivgen_plain_init,
.calculate = qcrypto_ivgen_plain_calculate,
.cleanup = qcrypto_ivgen_plain_cleanup,
};

28
crypto/ivgen-plain.h Normal file
View File

@@ -0,0 +1,28 @@
/*
* QEMU Crypto block IV generator - plain
*
* Copyright (c) 2015-2016 Red Hat, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*/
#include "crypto/ivgenpriv.h"
#ifndef QCRYPTO_IVGEN_PLAIN_H__
#define QCRYPTO_IVGEN_PLAIN_H__
extern struct QCryptoIVGenDriver qcrypto_ivgen_plain;
#endif /* QCRYPTO_IVGEN_PLAIN_H__ */

59
crypto/ivgen-plain64.c Normal file
View File

@@ -0,0 +1,59 @@
/*
* QEMU Crypto block IV generator - plain
*
* Copyright (c) 2015-2016 Red Hat, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*/
#include "qemu/osdep.h"
#include "crypto/ivgen-plain.h"
static int qcrypto_ivgen_plain_init(QCryptoIVGen *ivgen,
const uint8_t *key, size_t nkey,
Error **errp)
{
return 0;
}
static int qcrypto_ivgen_plain_calculate(QCryptoIVGen *ivgen,
uint64_t sector,
uint8_t *iv, size_t niv,
Error **errp)
{
size_t ivprefix;
ivprefix = sizeof(sector);
sector = cpu_to_le64(sector);
if (ivprefix > niv) {
ivprefix = niv;
}
memcpy(iv, &sector, ivprefix);
if (ivprefix < niv) {
memset(iv + ivprefix, 0, niv - ivprefix);
}
return 0;
}
static void qcrypto_ivgen_plain_cleanup(QCryptoIVGen *ivgen)
{
}
struct QCryptoIVGenDriver qcrypto_ivgen_plain64 = {
.init = qcrypto_ivgen_plain_init,
.calculate = qcrypto_ivgen_plain_calculate,
.cleanup = qcrypto_ivgen_plain_cleanup,
};

28
crypto/ivgen-plain64.h Normal file
View File

@@ -0,0 +1,28 @@
/*
* QEMU Crypto block IV generator - plain64
*
* Copyright (c) 2015-2016 Red Hat, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*/
#include "crypto/ivgenpriv.h"
#ifndef QCRYPTO_IVGEN_PLAIN64_H__
#define QCRYPTO_IVGEN_PLAIN64_H__
extern struct QCryptoIVGenDriver qcrypto_ivgen_plain64;
#endif /* QCRYPTO_IVGEN_PLAIN64_H__ */

99
crypto/ivgen.c Normal file
View File

@@ -0,0 +1,99 @@
/*
* QEMU Crypto block IV generator
*
* Copyright (c) 2015-2016 Red Hat, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*/
#include "qemu/osdep.h"
#include "crypto/ivgenpriv.h"
#include "crypto/ivgen-plain.h"
#include "crypto/ivgen-plain64.h"
#include "crypto/ivgen-essiv.h"
QCryptoIVGen *qcrypto_ivgen_new(QCryptoIVGenAlgorithm alg,
QCryptoCipherAlgorithm cipheralg,
QCryptoHashAlgorithm hash,
const uint8_t *key, size_t nkey,
Error **errp)
{
QCryptoIVGen *ivgen = g_new0(QCryptoIVGen, 1);
ivgen->algorithm = alg;
ivgen->cipher = cipheralg;
ivgen->hash = hash;
switch (alg) {
case QCRYPTO_IVGEN_ALG_PLAIN:
ivgen->driver = &qcrypto_ivgen_plain;
break;
case QCRYPTO_IVGEN_ALG_PLAIN64:
ivgen->driver = &qcrypto_ivgen_plain64;
break;
case QCRYPTO_IVGEN_ALG_ESSIV:
ivgen->driver = &qcrypto_ivgen_essiv;
break;
default:
error_setg(errp, "Unknown block IV generator algorithm %d", alg);
g_free(ivgen);
return NULL;
}
if (ivgen->driver->init(ivgen, key, nkey, errp) < 0) {
g_free(ivgen);
return NULL;
}
return ivgen;
}
int qcrypto_ivgen_calculate(QCryptoIVGen *ivgen,
uint64_t sector,
uint8_t *iv, size_t niv,
Error **errp)
{
return ivgen->driver->calculate(ivgen, sector, iv, niv, errp);
}
QCryptoIVGenAlgorithm qcrypto_ivgen_get_algorithm(QCryptoIVGen *ivgen)
{
return ivgen->algorithm;
}
QCryptoCipherAlgorithm qcrypto_ivgen_get_cipher(QCryptoIVGen *ivgen)
{
return ivgen->cipher;
}
QCryptoHashAlgorithm qcrypto_ivgen_get_hash(QCryptoIVGen *ivgen)
{
return ivgen->hash;
}
void qcrypto_ivgen_free(QCryptoIVGen *ivgen)
{
if (!ivgen) {
return;
}
ivgen->driver->cleanup(ivgen);
g_free(ivgen);
}

49
crypto/ivgenpriv.h Normal file
View File

@@ -0,0 +1,49 @@
/*
* QEMU Crypto block IV generator
*
* Copyright (c) 2015-2016 Red Hat, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*/
#ifndef QCRYPTO_IVGEN_PRIV_H__
#define QCRYPTO_IVGEN_PRIV_H__
#include "crypto/ivgen.h"
typedef struct QCryptoIVGenDriver QCryptoIVGenDriver;
struct QCryptoIVGenDriver {
int (*init)(QCryptoIVGen *ivgen,
const uint8_t *key, size_t nkey,
Error **errp);
int (*calculate)(QCryptoIVGen *ivgen,
uint64_t sector,
uint8_t *iv, size_t niv,
Error **errp);
void (*cleanup)(QCryptoIVGen *ivgen);
};
struct QCryptoIVGen {
QCryptoIVGenDriver *driver;
void *private;
QCryptoIVGenAlgorithm algorithm;
QCryptoCipherAlgorithm cipher;
QCryptoHashAlgorithm hash;
};
#endif /* QCRYPTO_IVGEN_PRIV_H__ */

Some files were not shown because too many files have changed in this diff Show More