Compare commits

..

979 Commits

Author SHA1 Message Date
Alexander Graf
2222e0a633 input: Add trace event for empty keyboard queue
When driving QEMU from the outside, we have basically no chance to
determine how quickly the guest OS picks up key events, so we usually
have to limit ourselves to very slow keyboard presses to make sure
the guest always has enough chance to pick them up.

This patch adds a trace events when the keyboarde queue is drained.
An external driver can use that as hint that new keys can be pressed.

Signed-off-by: Alexander Graf <agraf@suse.de>
Message-id: 1490883775-94658-1-git-send-email-agraf@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-03 14:20:12 +02:00
Marc-André Lureau
05c6638b20 input: don't queue delay if paused
qemu_input_event_send() discards key event when the guest is paused,
but not the delay.

The delay ends up in the input queue, and qemu_input_event_send_key()
will further fill the queue with upcoming events.

VNC uses qemu_input_event_send_key_delay(), not SPICE, which results
in a different input behaviour on pause: VNC will queue the events
(except the first that is discarded), SPICE will discard all events.

Don't queue delay if paused, and provide same behaviour on SPICE and
VNC clients on resume (and potentially avoid over-allocating the
buffer queue)

Fixes:
https://bugzilla.redhat.com/show_bug.cgi?id=1444326

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20170425130520.31819-1-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-05-03 14:19:40 +02:00
Gerd Hoffmann
fa18f36a46 input: limit kbd queue depth
Apply a limit to the number of items we accept into the keyboard queue.

Impact: Without this limit vnc clients can exhaust host memory by
sending keyboard events faster than qemu feeds them to the guest.

Fixes: CVE-2017-8379
Cc: P J P <ppandit@redhat.com>
Cc: Huawei PSIRT <PSIRT@huawei.com>
Reported-by: jiangxin1@huawei.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170428084237.23960-1-kraxel@redhat.com
2017-05-03 14:18:21 +02:00
Stefan Hajnoczi
e619b14746 Merge remote-tracking branch 'sthibault/tags/samuel-thibault' into staging
slirp updates

# gpg: Signature made Sat 29 Apr 2017 05:45:24 PM BST
# gpg:                using RSA key 0xB0A51BF58C9179C5
# gpg: Good signature from "Samuel Thibault <samuel.thibault@aquilenet.fr>"
# gpg:                 aka "Samuel Thibault <sthibault@debian.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@gnu.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@inria.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@labri.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@ens-lyon.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@u-bordeaux.fr>"
# Primary key fingerprint: 900C B024 B679 31D4 0F82  304B D017 8C76 7D06 9EE6
#      Subkey fingerprint: AEBF 7448 FAB9 453A 4552  390E B0A5 1BF5 8C91 79C5

* sthibault/tags/samuel-thibault:
  slirp: VMStatify remaining except for loop
  slirp: VMStatify socket level
  slirp: Common lhost/fhost union
  slirp: VMStatify sbuf
  slirp: VMState conversion; tcpcb
  slirp: fix pinging the virtual ipv4 DNS server
  slirp: tftp, copy sockaddr_size
  slirp/smb: Replace constant strings by glib string
  slirp: allow host port 0 for hostfwd

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-05-02 15:16:29 +01:00
Dr. David Alan Gilbert
eb5d4f5329 slirp: VMStatify remaining except for loop
This converts the remaining components, except for the top level
loop, to VMState.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-04-29 18:44:16 +02:00
Dr. David Alan Gilbert
14650df402 slirp: VMStatify socket level
Working up the stack, this replaces the slirp_socket_load/save
with VMState definitions.

A place holder for IPv6 support is added as a comment; it needs
testing once the rest of the IPv6 code is there.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-04-29 18:44:16 +02:00
Dr. David Alan Gilbert
7eddf37c63 slirp: Common lhost/fhost union
The socket structure has a pair of unions for lhost and fhost
addresses; the unions are identical so split them out into
a separate union declaration.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-04-29 18:44:16 +02:00
Dr. David Alan Gilbert
2a7cab9e17 slirp: VMStatify sbuf
Convert the sbuf structure to a VMStateDescription.
Note this uses the VMSTATE_WITH_TMP mechanism to calculate
and reload the offsets based on the pointers.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-04-29 18:44:16 +02:00
Dr. David Alan Gilbert
e3ec38ffd6 slirp: VMState conversion; tcpcb
Convert the migration of the struct tcpcb to use a VMStateDescription,
the rest of it will come later.

Mostly mechanical, except for conversion of some 'char' to uint8_t
to ensure portability.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-04-29 18:44:16 +02:00
Samuel Thibault
7d1724976f slirp: fix pinging the virtual ipv4 DNS server
so that people do not think it is not working at least basically.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-04-29 18:29:58 +02:00
Marc-André Lureau
17eb587aeb slirp: tftp, copy sockaddr_size
ASAN detects an "unknown-crash" when running pxe-test:

/ppc64/pxe/spapr-vlan: =================================================================
==7143==ERROR: AddressSanitizer: unknown-crash on address 0x7f6dcd298d30 at pc 0x55e22218830d bp 0x7f6dcd2989e0 sp 0x7f6dcd2989d0
READ of size 128 at 0x7f6dcd298d30 thread T2
    #0 0x55e22218830c in tftp_session_allocate /home/elmarco/src/qq/slirp/tftp.c:73
    #1 0x55e22218a1f8 in tftp_handle_rrq /home/elmarco/src/qq/slirp/tftp.c:289
    #2 0x55e22218b54c in tftp_input /home/elmarco/src/qq/slirp/tftp.c:446
    #3 0x55e2221833fe in udp6_input /home/elmarco/src/qq/slirp/udp6.c:82
    #4 0x55e222137b17 in ip6_input /home/elmarco/src/qq/slirp/ip6_input.c:67

Address 0x7f6dcd298d30 is located in stack of thread T2 at offset 96 in frame
    #0 0x55e222182420 in udp6_input /home/elmarco/src/qq/slirp/udp6.c:13

  This frame has 3 object(s):
    [32, 48) '<unknown>'
    [96, 124) 'lhost' <== Memory access at offset 96 partially overflows this variable
    [160, 200) 'save_ip' <== Memory access at offset 96 partially underflows this variable

The sockaddr_storage pointer is the sockaddr_in6 lhost on the
stack. Copy only the source addr size.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-04-29 18:29:58 +02:00
Dr. David Alan Gilbert
f95cc8b6cc slirp/smb: Replace constant strings by glib string
gcc 7 (on fedora 26) objects to many of the snprintf's
in the smb path and command creation because it can't
figure out that the smb_dir (i.e. the /tmp dir for the configuration)
is known to be short.

Replace all these fixed length buffers by g_str* functions that dynamically
allocate and use g_dir_make_tmp to make the directory.
(It's fairly new glib but we have a compat function for it).

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-04-29 18:29:58 +02:00
Vincent Bernat
0bed71edbc slirp: allow host port 0 for hostfwd
The OS will allocate automatically a free port. This is useful if you
want to be sure to not get any port conflict. You still have to figure
out which port you got, for example with "lsof" (this could be exposed
in the monitor if needed).

Example of use:

     $ qemu-system-x86_64 -net user,hostfwd=127.0.0.1:0-:22 ...

Then, get your port with:

     $ lsof -np 1474 | grep LISTEN
     qemu-syst 31777 bernat 12u IPv4 [...] TCP 127.0.0.1:35145 (LISTEN)

Signed-off-by: Vincent Bernat <vincent@bernat.im>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-04-29 18:29:58 +02:00
Markus Armbruster
38bb54f323 replication: Make --disable-replication compile again
Broken in commit daa33c5.

Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Message-id: 1493298053-17140-1-git-send-email-armbru@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-28 16:50:16 +01:00
Greg Kurz
64a6047d54 configure: fix trace backend list for out-of-tree builds
Since commit "c53eeaf75a04 configure: eliminate Python dependency for
--help", configure --help fails to produce the list of available trace
backends if invoked out-of-tree. It also spits the following error:

grep: scripts/tracetool/backend/*.py: No such file or directory

This patch simply adds the missing $source_path to fix it.

Signed-off-by: Greg Kurz <groug@kaod.org>
Message-id: 149321376763.7874.12797658801011614451.stgit@bahia
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-28 16:49:41 +01:00
Stefan Hajnoczi
7ad691ec98 Merge remote-tracking branch 'mdroth/tags/qga-pull-2017-04-25-v2-tag' into staging
qemu-ga patch queue

* new commands: guest-get-timezone, guest-get-users, guest-get-host-name
* fix hang on w32 when stopping qemu-ga service while fs frozen
* fix missing setting of can-offline in guest-get-vcpus
* make qemu-ga VSS w32 service on-demand rather than on-startup
* fix unecessary errors to EventLog on w32
* improvements to fsfreeze documentation

v2:
 * document 'zone' field of guest-get-timezone as informational-only
   (Daniel, Eric)
 * fix build error for glib < 2.32 (Peter)

# gpg: Signature made Thu 27 Apr 2017 06:43:42 AM BST
# gpg:                using RSA key 0x3353C9CEF108B584
# gpg: Good signature from "Michael Roth <flukshun@gmail.com>"
# gpg:                 aka "Michael Roth <mdroth@utexas.edu>"
# gpg:                 aka "Michael Roth <mdroth@linux.vnet.ibm.com>"
# Primary key fingerprint: CEAC C9E1 5534 EBAB B82D  3FA0 3353 C9CE F108 B584

* mdroth/tags/qga-pull-2017-04-25-v2-tag:
  qga: Add `guest-get-timezone` command
  qga: Add 'guest-get-users' command
  qga: improve fsfreeze documentations
  qga: Add 'guest-get-host-name' command
  qga-win: Fix Event Viewer errors caused by qemu-ga
  qga-win: Fix a bug where qemu-ga service is stuck during stop operation
  qga-win: Enable 'can-offline' field in 'guest-get-vcpus' reply
  qemu-ga: Make QGA VSS provider service run only when needed

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-28 16:12:19 +01:00
Vinzenz Feenstra
53c58e64d0 qga: Add guest-get-timezone command
Adds a new command `guest-get-timezone` reporting the currently
configured timezone on the system. The information on what timezone is
currently is configured is useful in case of Windows VMs where the
offset of the hardware clock is required to have the same offset. This
can be used for management systems like `oVirt` to detect the timezone
difference and warn administrators of the misconfiguration.

Signed-off-by: Vinzenz Feenstra <vfeenstr@redhat.com>
Reviewed-by: Sameeh Jubran <sameeh@daynix.com>
Tested-by: Sameeh Jubran <sameeh@daynix.com>
* moved stub implementation to end of function for consistency
* document that timezone names are for informational use only.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-04-27 00:40:25 -05:00
Vinzenz Feenstra
161a56a906 qga: Add 'guest-get-users' command
A command that will list all currently logged in users, and the time
since when they are logged in.

Examples:

virsh # qemu-agent-command F25 '{ "execute": "guest-get-users" }'
{"return":[{"login-time":1490622289.903835,"user":"root"}]}

virsh # qemu-agent-command Win2k12r2 '{ "execute": "guest-get-users" }'
{"return":[{"login-time":1490351044.670552,"domain":"LADIDA",
"user":"Administrator"}]}

Signed-off-by: Vinzenz Feenstra <vfeenstr@redhat.com>
* make g_hash_table_contains compat func inline to avoid
  unused warnings
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-04-26 23:57:45 -05:00
Marc-André Lureau
1dbfbc17fe qga: improve fsfreeze documentations
Some users find the fsfreeze behaviour confusing. Add some notes about
invalid mount points and Windows usage.

Related to:
https://bugzilla.redhat.com/show_bug.cgi?id=1436976

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Vinzenz Feenstra <vfeenstr@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-04-26 23:56:47 -05:00
Vinzenz Feenstra
0a3d197a71 qga: Add 'guest-get-host-name' command
Retrieving the guest host name is a very useful feature for virtual management
systems. This information can help to have more user friendly VM access
details, instead of an IP there would be the host name. Also the host name
reported can be used to have automated checks for valid SSL certificates.

virsh # qemu-agent-command F25 '{ "execute": "guest-get-host-name" }'
{"return":{"host-name":"F25.lab.evilissimo.net"}}

Signed-off-by: Vinzenz Feenstra <vfeenstr@redhat.com>
* minor whitespace fix-ups
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-04-26 23:56:47 -05:00
Sameeh Jubran
1529605337 qga-win: Fix Event Viewer errors caused by qemu-ga
When the command "guest-fsfreeze-freeze" is executed it causes
the VSS service to log the error below in the Event Viewer. This
error is caused by an issue in the function "CommitSnapshots" in
provider.cpp:

* When VSS_TIMEOUT_MSEC expires the funtion returns E_ABORT. This causes
the error #12293.

|event id|                           error                               |
* 12293  : Volume Shadow Copy Service error: Error calling a routine on a
           Shadow Copy Provider {00000000-0000-0000-0000-000000000000}.
           Routine details CommitSnapshots [hr = 0x80004004, Operation
           aborted.

Signed-off-by: Sameeh Jubran <sameeh@daynix.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-04-26 23:56:47 -05:00
Sameeh Jubran
94d81ae896 qga-win: Fix a bug where qemu-ga service is stuck during stop operation
After triggering a freeze command without any following thaw command,
qemu-ga will not respond to stop operation. This behaviour is wanted on Linux
as there is no time limit for a freeze command and we want to prevent
quitting in the middle of freeze, on the other hand on Windows the time
limit for freeze is 10 seconds, so we should wait for the timeout, thaw
the file system and quit.

Signed-off-by: Sameeh Jubran <sameeh@daynix.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-04-26 23:56:47 -05:00
Sameeh Jubran
54858553de qga-win: Enable 'can-offline' field in 'guest-get-vcpus' reply
The QGA schema states:

@can-offline: Whether offlining the VCPU is possible. This member
               is always filled in by the guest agent when the structure
               is returned, and always ignored on input (hence it can be
               omitted then).

Currently 'can-offline' is missing entirely from the reply. This causes
errors in libvirt which is expecting the reply to be compliant with the
schema docs.

BZ#1438735: https://bugzilla.redhat.com/show_bug.cgi?id=1438735

Signed-off-by: Sameeh Jubran <sameeh@daynix.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-04-26 23:56:47 -05:00
Sameeh Jubran
f342cc93ec qemu-ga: Make QGA VSS provider service run only when needed
Currently the service runs in background on boot even though it is not
needed and once it is running it never stops. The service needs to be
running only during freeze operation and it should be stopped after
executing thaw.

Signed-off-by: Sameeh Jubran <sameeh@daynix.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-04-26 23:56:46 -05:00
Peter Maydell
81b2d5ceb0 Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20170426' into staging
Fix for exit_atomic tcg opcode paths

# gpg: Signature made Wed 26 Apr 2017 18:27:11 BST
# gpg:                using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC  16A4 AD12 70CC 4DD0 279B

* remotes/rth/tags/pull-tcg-20170426:
  tcg: Initialize return value after exit_atomic

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-26 20:50:49 +01:00
Richard Henderson
79b1af9062 tcg: Initialize return value after exit_atomic
Users of tcg_gen_atomic_cmpxchg and do_atomic_op rightfully utilize
the output.  Even though this code is dead, it gets translated, and
without the initialization we encounter a tcg_error.

Reported-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Tested-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2017-04-26 19:26:11 +02:00
Peter Maydell
51b9d495f2 Revert "COLO-compare: Optimize tcp compare trace event"
This reverts commit 0fc8aec7de.

In commit 2dfe5113b1 we split a trace event with a lot of arguments
in two, because the UST trace backend has a limit on the number
of arguments you can have in a single trace event. Unfortunately
we subsequently forgot about this, and in commit 0fc8aec7de
we merged the two trace events again, recreating the "UST backend
doesn't build" bug.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-26 16:19:27 +01:00
Peter Maydell
41c7c7ef29 Merge remote-tracking branch 'remotes/dgilbert/tags/pull-hmp-20170426' into staging
HMP pull, with tcg fix

# gpg: Signature made Wed 26 Apr 2017 14:55:30 BST
# gpg:                using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-hmp-20170426:
  tests: Add a tester for HMP commands
  libqtest: Add a generic function to run a callback function for every machine
  libqtest: Ignore QMP events when parsing the response for HMP commands
  monitor: Check whether TCG is enabled before running the "info jit" code
  hmp: gpa2hva and gpa2hpa hostaddr command

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-26 15:32:20 +01:00
Thomas Huth
78f86a2b7c tests: Add a tester for HMP commands
HMP commands do not get any automatic testing yet, so on certain
QEMU machines, some HMP commands were causing crashes in the past.
Thus we should test HMP commands in our test suite, too, to avoid
that such problems creep in again in the future.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1493097407-20482-1-git-send-email-thuth@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-26 14:42:31 +01:00
Thomas Huth
02ef6e878f libqtest: Add a generic function to run a callback function for every machine
Some tests need to run single tests for every available machine of the
current QEMU binary. To avoid code duplication, let's extract this
code that deals with 'query-machines' into a separate function.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1490860207-8302-3-git-send-email-thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-26 14:42:31 +01:00
Thomas Huth
6bb87be893 libqtest: Ignore QMP events when parsing the response for HMP commands
When running certain HMP commands (like "device_del") via QMP, we
can sometimes get a QMP event in the response first, so that the
"g_assert(ret)" statement in qtest_hmp() triggers and the test
fails. Fix this by ignoring such QMP events while looking for the
real return value from QMP.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1490860207-8302-2-git-send-email-thuth@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
  Added note to qtest_hmp/qtest_hmpv's header description to say
  it discards events
2017-04-26 14:42:31 +01:00
Thomas Huth
b7da97eef7 monitor: Check whether TCG is enabled before running the "info jit" code
The "info jit" command currently aborts on Mac OS X with the message
"qemu_mutex_lock: Invalid argument" when running with "-M accel=qtest".
We should only call into the TCG code here if TCG has really been
enabled and initialized.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1493179907-22516-1-git-send-email-thuth@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-26 14:42:31 +01:00
Paolo Bonzini
e9628441df hmp: gpa2hva and gpa2hpa hostaddr command
These commands are useful when testing machine-check passthrough.
gpa2hva is useful to inject a MADV_HWPOISON madvise from gdb, while
gpa2hpa is useful to inject an error with the mce-inject kernel
module.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1490021158-4469-1-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170420133058.12911-1-pbonzini@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-26 14:42:31 +01:00
Peter Maydell
dcaed66cbe Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.10-20170426' into staging
ppc patch queue 2017-04-26

Here's a respind of my first pull request for qemu-2.10, consisting of
assorted patches which have accumulated while qemu-2.9 stabilized.
Highlights are:
    * Rework / cleanup of the XICS interrupt controller
    * Substantial improvement to the 'powernv' machine type
        - Includes an MMIO XICS version
    * POWER9 support improvements
        - POWER9 guests with KVM
        - Partial support for POWER9 guests with TCG
    * IOMMU and VFIO improvements
    * Assorted minor changes

There are several IPMI patches here that aren't usually in my area of
maintenance, but there isn't a regular maintainer and these patches
are for the benefit of the powernv machine type.

This pull request supersedes my 2017-04-26 pull request.  This new set
fixes a bug in one of the aforementioned IPMI patches which caused
clang sanitizer failures (and may have crashed on some libc / host
versions).

# gpg: Signature made Wed 26 Apr 2017 07:58:10 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.10-20170426: (48 commits)
  MAINTAINERS: Remove myself from e500
  target/ppc: Style fixes
  e500,book3s: mfspr 259: Register mapped/aliased SPRG3 user read
  target/ppc: Flush TLB on write to PIDR
  spapr-cpu-core: Release ICPState object during CPU unrealization
  ppc/pnv: generate an OEM SEL event on shutdown
  ppc/pnv: add initial IPMI sensors for the BMC simulator
  ppc/pnv: populate device tree for IPMI BT devices
  ppc/pnv: populate device tree for serial devices
  ppc/pnv: populate device tree for RTC devices
  ppc/pnv: scan ISA bus to populate device tree
  ppc/pnv: enable only one LPC bus
  ppc/pnv: Add support for POWER8+ LPC Controller
  spapr: remove the 'nr_servers' field from the machine
  target/ppc: Fix size of struct PPCElfPrstatus
  ipmi: introduce an ipmi_bmc_gen_event() API
  ipmi: introduce an ipmi_bmc_sdr_find() API
  ipmi: provide support for FRUs
  ipmi: use a file to load SDRs
  ppc: add IPMI support
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-26 13:17:11 +01:00
Peter Maydell
52e94ea5de Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20170421-v2-tag' into staging
Xen 2017/04/21 + fix

# gpg: Signature made Tue 25 Apr 2017 19:10:37 BST
# gpg:                using RSA key 0x894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"
# gpg:                 aka "Stefano Stabellini <sstabellini@kernel.org>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3  0AEA 894F 8F48 70E1 AE90

* remotes/sstabellini/tags/xen-20170421-v2-tag: (21 commits)
  move xen-mapcache.c to hw/i386/xen/
  move xen-hvm.c to hw/i386/xen/
  move xen-common.c to hw/xen/
  add xen-9p-backend to MAINTAINERS under Xen
  xen/9pfs: build and register Xen 9pfs backend
  xen/9pfs: send responses back to the frontend
  xen/9pfs: implement in/out_iov_from_pdu and vmarshal/vunmarshal
  xen/9pfs: receive requests from the frontend
  xen/9pfs: connect to the frontend
  xen/9pfs: introduce Xen 9pfs backend
  9p: introduce a type for the 9p header
  xen: import ring.h from xen
  configure: use pkg-config for obtaining xen version
  xen: additionally restrict xenforeignmemory operations
  xen: use libxendevice model to restrict operations
  xen: use 5 digit xen versions
  xen: use libxendevicemodel when available
  configure: detect presence of libxendevicemodel
  xen: create wrappers for all other uses of xc_hvm_XXX() functions
  xen: rename xen_modified_memory() to xen_hvm_modified_memory()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-26 10:22:31 +01:00
Scott Wood
df02d2ca8b MAINTAINERS: Remove myself from e500
I recently left Freescale/NXP, and even before that it'd been a few years
since I was actively involved in KVM/QEMU work.

Signed-off-by: Scott Wood <oss@buserror.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
David Gibson
c364946dd5 target/ppc: Style fixes
This makes a small step fixing one of many style problems that exist in
the older ppc code.  This removes spaces between function (or macro) name
and the following '('.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
Bernhard Kaindl
b1c897d587 e500,book3s: mfspr 259: Register mapped/aliased SPRG3 user read
This patch registers mfspr 259 for Book3S and e500 family cores
following this research:

mfspr 259 provides read-only mapped user access to SPRG3(SPR 275) according to:

- PowerISA 2.02, Book III (documents implementation starting with POWER4+ @ p20)
- IBM PowerPC 970MP RISC Microprocessor User's Manual v2.1, page 48
- Amit Singh: "Mac OS X Internals: A Systems Approach" on 970 and 970FX cores:
  He demonstrates mfspr 259 reading TLS data from Mac OS X on G5 on page 588
- NXP documents it in the Core Reference Manuals of: e500, e500mc and e5500
- getcpu() of the 32 & 64-bit Book3S Linux vDSOs use it to read the core number

mfspr 259 does not appear to be implemented in these cores according to:

- 74xx series: MPC7410/MPC7400 and MPC7450 RISC Microprocessor Reference Manuals
- 4xx series:  PPC440 Processor User's Manual, Revision 1.09 by AMCC
- 750 series:  IBM PowerPC 750CL RISC Microprocessor User's Manual
- e200 series: e200z4 Power Architectureâ Core Reference Manual

Implementation: gen_spr_usprg3() is called from init_proc_book3s_common()
(covers the 970 and POWER cores) and init_proc_e500() (covers the e500 family)
to register spr_read_ureg() in the same way which it already provides
the mapped SPR access for SPR_USPRG4-7 in gen_spr_usprgh() for cores
which have the same read-only mapped SPRG register access for SPRG4-7.

Verified using Linux by pinning a thread to a core and checking sched_getcpu()
using qemu-system-ppc64 -M pseries -cpu POWER8 using MTTCG on a x86_64 host.

Signed-off-by: Bernhard Kaindl <bernhard.kaindl@thalesgroup.com>
Reviewed-by: Stefan Resch <stefan.resch@thalesgroup.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
Suraj Jitindar Singh
31b2b0f846 target/ppc: Flush TLB on write to PIDR
The PIDR (process id register) is used to store the id of the currently
running process, which is used to select the process table entry used to
perform address translation. This means that when we write to this register
all the translations in the TLB become outdated as they are for a
previously running process. Thus when this register is written to we need
to invalidate the TLB entries to ensure stale entries aren't used to
to perform translation for the new process, which would result in at best
segfaults or alternatively just random memory being accessed.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[dwg: Fixed compile error for 32-bit targets]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
Bharata B Rao
8f37e54e5b spapr-cpu-core: Release ICPState object during CPU unrealization
Recent commits that re-organized ICPState object missed to destroy
the object when CPU is unrealized. Fix this so that CPU unplug
doesn't abort QEMU.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
Cédric Le Goater
bce0b69159 ppc/pnv: generate an OEM SEL event on shutdown
OpenPOWER systems expect to be notified with such an event before a
shutdown or a reboot. An OEM SEL message is sent with specific
identifiers and a user data containing the request : OFF or REBOOT.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
Cédric Le Goater
aeaef83dab ppc/pnv: add initial IPMI sensors for the BMC simulator
Skiboot, the firmware for the PowerNV platform, expects the BMC to
provide some specific IPMI sensors. These sensors are exposed in the
device tree and their values are updated by the firmware at boot time.

Sensors of interest are :

	"FW Boot Progress"
	"Boot Count"

As such a device is defined on the command line, we can only detect
its presence at reset time.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
Cédric Le Goater
04f6c8b2c0 ppc/pnv: populate device tree for IPMI BT devices
When an ipmi-bt device [1] is defined on the ISA bus, we need to
populate the device tree with the object properties. Such devices are
created with the command line options :

   -device ipmi-bmc-sim,id=bmc0 -device isa-ipmi-bt,bmc=bmc0,irq=10

[1] https://lists.gnu.org/archive/html/qemu-devel/2015-11/msg03168.html

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
Cédric Le Goater
cb228f5a00 ppc/pnv: populate device tree for serial devices
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
Cédric Le Goater
c5ffdcaea5 ppc/pnv: populate device tree for RTC devices
The code could be common to any ISA device but we are missing the IO
length.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:56 +10:00
Cédric Le Goater
e7a3fee340 ppc/pnv: scan ISA bus to populate device tree
This is an empty shell that we will use to include nodes in the device
tree for ISA devices. We expect RTC, UART and IPMI BT devices.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:55 +10:00
Cédric Le Goater
5a7e14a274 ppc/pnv: enable only one LPC bus
The default LPC bus of a multichip system is on chip 0. It's
recognized by the firmware (skiboot) using a "primary" property in the
device tree.

We introduce a pnv_chip_lpc_offset() routine to locate the LPC node of
a chip and set the property directly from the machine level.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:55 +10:00
Benjamin Herrenschmidt
4d1df88b63 ppc/pnv: Add support for POWER8+ LPC Controller
It adds the Naples chip which supports proper LPC interrupts via the
LPC controller rather than via an external CPLD.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: - updated for qemu-2.9
      - ported on latest PowerNV patchset
      - moved the IRQ handler in pnv_lpc.c
      - introduced pnv_lpc_isa_irq_create() to create the ISA IRQs ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:55 +10:00
Cédric Le Goater
71cd4dace9 spapr: remove the 'nr_servers' field from the machine
xics_system_init() does not need 'nr_servers' anymore as it is only
used to define the 'interrupt-controller' node in the device tree. So
let's just compute the value when calling spapr_dt_xics().

This also gives us an opportunity to simplify the xics_system_init()
routine and introduce a specific spapr_ics_create() helper to create
the sPAPR ICS object.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:55 +10:00
Anton Blanchard
b88290cd9e target/ppc: Fix size of struct PPCElfPrstatus
gdb refuses to parse QEMU memory dumps because struct PPCElfPrstatus
is the wrong size. Fix it.

Signed-off-by: Anton Blanchard <anton@samba.org>
Fixes: e62fbc54d4 ("target-ppc: dump-guest-memory support")
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:55 +10:00
Cédric Le Goater
cd60d85ef6 ipmi: introduce an ipmi_bmc_gen_event() API
It will be used to fill the message buffer with custom events expected
by some systems. Typically, an Open PowerNV platform guest is notified
with an OEM SEL message before a shutdown or a reboot.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:55 +10:00
Cédric Le Goater
7fabcdb942 ipmi: introduce an ipmi_bmc_sdr_find() API
This patch exposes a new IPMI routine to query a sdr entry from the
sdr table maintained by the IPMI BMC simulator. The API is very
similar to the internal sdr_find_entry() routine and should be used
the same way to query one or all sdrs.

A typical use would be to loop on the sdrs to build nodes of a device
tree.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:55 +10:00
Cédric Le Goater
540c07d345 ipmi: provide support for FRUs
This patch provides a simple FRU support for the BMC simulator. FRUs
are loaded from a file which name is specified in the object
properties, each entry having a fixed size, also specified in the
properties. If the file is unknown or not accessible for some reason,
a unique entry of 1024 bytes is created as a default. Just enough to
start some simulation.

These commands complies with the IPMI spec : "34. FRU Inventory Device
Commands".

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
[dwg: Folded in subsequent fix to handle NULL filename]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:41:21 +10:00
Cédric Le Goater
8c6fd7f341 ipmi: use a file to load SDRs
The IPMI BMC simulator populates the sdr/sensor tables with a minimal
set of entries (Watchdog). But some qemu platforms might want to use
extra entries for their custom needs.

This patch modifies slighty the initializing routine to take into
account a larger set read from a file. The name of the file to use is
defined through a new 'sdr' property of the simulator device.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
4a44fd26db ppc: add IPMI support
OpenPOWER systems use a BT device to communicate with the BMC.
Provide support for it.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Benjamin Herrenschmidt
0722d05ad8 ppc/pnv: Add OCC model stub with interrupt support
The OCC is an on-chip microcontroller based on a ppc405 core used
for various power management tasks. It comes with a pile of additional
hardware sitting on the PIB (aka XSCOM bus). At this point we don't
emulate it (nor plan to do so). However there is one facility which
is provided by the surrounding hardware that we do need, which is the
interrupt generation facility. OPAL uses it to send itself interrupts
under some circumstances and there are other uses around the corner.

So this implement just enough to support this.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: - updated for qemu-2.9
      - changed the XSCOM interface to fit new model
      - QOMified the model ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
54f59d786c ppc/pnv: Add cut down PSI bridge model and hookup external interrupt
The Processor Service Interface (PSI) Controller is one of the engines
of the "Bridge" unit which connects the different interfaces to the
Power Processor.

This adds just enough of the PSI bridge to handle various on-chip and
the one external interrupt. The rest of PSI has to do with the link to
the IBM FSP service processor which we don't plan to emulate (not used
on OpenPower machines).

The ics_get() and ics_resend() handlers of the XICSFabric interface of
the PowerNV machine are now defined to handle the Interrupt Control
Source of PSI. The InterruptStatsProvider interface is also modified
to dump the new ICS.

Originally from Benjamin Herrenschmidt <benh@kernel.crashing.org>

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
bf5615e77c ppc/pnv: add memory regions for the ICP registers
This provides to a PowerNV chip (POWER8) access to the Interrupt
Management area, which contains the registers of the Interrupt Control
Presenters of each thread. These are used to accept, return, forward
interrupts in the system.

This area is modeled with a per-chip container memory region holding
all the ICP registers. Each thread of a chip is then associated with
its ICP registers using a memory subregion indexed by its PIR number
in the overall region.

The device tree is populated accordingly.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
5509db4aec ppc/pnv: add a helper to calculate MMIO addresses registers
Some controllers (ICP, PSI) have a base register address which is
calculated using the chip id.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
960fbd29e5 ppc/pnv: create the ICP object under PnvCore
Each thread of a core is linked to an ICP. This allocates a PnvICPState
object before the PowerPCCPU object is realized and lets the XICSFabric
do the store under the 'intc' backlink when xics_cpu_setup() is
called.

This modeling removes the need of maintaining an array of ICP objects
under the PowerNV machine and also simplifies the XICSFabric icp_get()
handler.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
47fea43aa3 ppc/pnv: extend the machine with a InterruptStatsProvider interface
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
36fc6f0800 ppc/pnv: extend the machine with a XICSFabric interface
A XICSFabric QOM interface is used by the XICS layer to manipulate the
ICP and ICS objects. Let's define the associated handlers for the
PowerNV machine. All handlers should be defined even if there is no
ICS under the PowerNV machine yet.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
99285aae16 ppc/pnv: add a PnvICPState object
This provides a new ICPState object for the PowerNV machine (POWER8).
Access to the Interrupt Management area is done though a memory
region. It contains the registers of the Interrupt Control Presenters
of each thread which are used to accept, return, forward interrupts in
the system.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
439071a92d ppc/xics: add a realize() handler to ICPStateClass
It will be used by derived classes in PowerNV for customization.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
5bc8d26de2 spapr: allocate the ICPState object from under sPAPRCPUCore
Today, all the ICPs are created before the CPUs, stored in an array
under the sPAPR machine and linked to the CPU when the core threads
are realized. This modeling brings some complexity when a lookup in
the array is required and it can be simplified by allocating the ICPs
when the CPUs are.

This is the purpose of this proposal which introduces a new 'icp_type'
field under the machine and creates the ICP objects of the right type
(KVM or not) before the PowerPCCPU object are.

This change allows more cleanups : the removal of the icps array under
the sPAPR machine and the removal of the xics_get_cpu_index_by_dt_id()
helper.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
06747ba6d4 spapr: move the IRQ server number mapping under the machine
This is the second step to abstract the IRQ 'server' number of the
XICS layer. Now that the prereq cleanups have been done in the
previous patch, we can move down the 'cpu_dt_id' to 'cpu_index'
mapping in the sPAPR machine handler.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Cédric Le Goater
ad5d1add86 ppc/xics: introduce an 'intc' backlink under PowerPCCPU
Today, the ICPState array of the sPAPR machine is indexed with
'cpu_index' of the CPUState. This numbering of CPUs is internal to
QEMU and the guest only knows about what is exposed in the device
tree, that is the 'cpu_dt_id'. This is why sPAPR uses the helper
xics_get_cpu_index_by_dt_id() to do the mapping in a couple of places.

To provide a more generic XICS layer, we need to abstract the IRQ
'server' number and remove any assumption made on its nature. It
should not be used as a 'cpu_index' for lookups like xics_cpu_setup()
and xics_cpu_destroy() do.

To reach that goal, we choose to introduce a generic 'intc' backlink
under PowerPCCPU, and let the machine core init routine do the
ICPState lookup. The resulting object is passed on to xics_cpu_setup()
which does the store under PowerPCCPU. The IRQ 'server' number in XICS
is now generic. sPAPR uses 'cpu_dt_id' and PowerNV will use 'PIR'
number.

This also has the benefit of simplifying the sPAPR hcall routines
which do not need to do any ICPState lookups anymore.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Suraj Jitindar Singh
ccd531b9c9 target/ppc: Add ibm,processor-radix-AP-encodings for TCG
The ibm,processor-radix-AP-encodings device tree property of the cpu node
is used to specify the radix mode supported page sizes of the processor
to the guest os. Contained in the top 3 bits of the msb is the actual
page size (AP) encoding associated with the corresponding radix mode
supported page size. Add this property for a TCG guest, note the TCG code
is capable of translating any format so just add the 4 default page sizes.

The ibm,processor-radix-AP-encodings device tree property is defined as:
One to n cells in ascending order of radix mode supported page sizes
encoded as BE ints (32bit on ppc) in the form:
0bxxxyyyyyyyyyyyyyyyyyyyyyyyyyyyyy
- 0bxxx -> AP encoding
- 0byyyyyyyyyyyyyyyyyyyyyyyyyyyyy -> supported page size encoded as a shift

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:42 +10:00
Alexey Kardashevskiy
c88fa6dd4a spapr_pci: Removed unused include
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Alexey Kardashevskiy
a01f3432dd spapr_pci: Warn when RAM page size is not enabled in IOMMU page mask
If a page size used by QEMU is not enabled in the PHB IOMMU page mask,
in-kernel acceleration of TCE handling won't be enabled and performance
might be slower than expected.

This prints a warning if system page size is not enabled. This should
print a warning if huge pages are enabled but sphb.pgsz still uses
the default value of 4K|64K.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Alexey Kardashevskiy
3dc410ae83 target-ppc/kvm: Enable in-kernel TCE acceleration for multi-tce
This enables in-kernel handling of H_PUT_TCE_INDIRECT and
H_STUFF_TCE hypercalls. The host kernel support is there since v4.6,
in particular d3695aa4f452
("KVM: PPC: Add support for multiple-TCE hcalls").

H_PUT_TCE is already accelerated and does not need any special enablement.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Sam Bobroff
e957f6a9b9 spapr: Workaround for broken radix guests
For a little while around 4.9, Linux kernels that saw the radix bit in
ibm,pa-features would attempt to set up the MMU as if they were a
hypervisor, even if they were a guest, which would cause them to
crash.

Work around this by detecting pre-ISA 3.0 guests by their lack of that
bit in option vector 1, and then removing the radix bit from
ibm,pa-features. Note: This now requires regeneration of that node
after CAS negotiation.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[dwg: Fix style nits]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Sam Bobroff
9fb4541f58 spapr: Enable ISA 3.0 MMU mode selection via CAS
Add the new node, /chosen/ibm,arch-vec-5-platform-support to the
device tree. This allows the guest to determine which modes are
supported by the hypervisor.

Update the option vector processing in h_client_architecture_support()
to handle the new MMU bits. This allows guests to request hash or
radix mode and QEMU to create the guest's HPT at this time if it is
necessary but hasn't yet been done.  QEMU will terminate the guest if
it requests an unavailable mode, as required by the architecture.

Extend the ibm,pa-features node with the new ISA 3.0 values
and set the radix bit if KVM supports radix mode. This probably won't
be used directly by guests to determine the availability of radix mode
(that is indicated by the new node added above) but the architecture
requires that it be set when the hardware supports it.

If QEMU is using KVM, and KVM is capable of running in radix mode,
guests can be run in real-mode without allocating a HPT (because KVM
will use a minimal RPT). So in this case, we avoid creating the HPT
at reset time and later (during CAS) create it if it is necessary.

ISA 3.0 guests will now begin to call h_register_process_table(),
which has been added previously.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[dwg: Strip some unneeded prefix from error messages]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Sam Bobroff
86d5771a5a spapr: move spapr_populate_pa_features()
In the next patch, spapr_fixup_cpu_dt() will need to call
spapr_populate_pa_features() so move it's definition up without making
any other changes.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Suraj Jitindar Singh
b4db54132f target/ppc: Implement H_REGISTER_PROCESS_TABLE H_CALL
The H_REGISTER_PROCESS_TABLE H_CALL is used by a guest to indicate to the
hypervisor where in memory its process table is and how translation should
be performed using this process table.

Provide the implementation of this H_CALL for a guest.

We first check for invalid flags, then parse the flags to determine the
operation, and then check the other parameters for valid values based on
the operation (register new table/deregister table/maintain registration).
The process table is then stored in the appropriate location and registered
with the hypervisor (if running under KVM), and the LPCR_[UPRT/GTSE] bits
are updated as required.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[dwg: Correct missing prototype and uninitialized variable]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Suraj Jitindar Singh
d77a98b015 target/ppc: Add new H-CALL shells for in memory table translation
The use of the new in memory tables introduced in ISAv3.00 for translation,
also referred to as process tables, requires the introduction of 3 new
H-CALLs; H_REGISTER_PROCESS_TABLE, H_CLEAN_SLB, and H_INVALIDATE_PID.

Add shells for each of these and register them as the hypercall handlers.
Currently they all log an unimplemented hypercall and return H_FUNCTION.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[dwg: Fix style nits]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Sam Bobroff
cf1c4cce7c target-ppc: support KVM_CAP_PPC_MMU_RADIX, KVM_CAP_PPC_MMU_HASH_V3
Query and cache the value of two new KVM capabilities that indicate
KVM's support for new radix and hash modes of the MMU.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Sam Bobroff
c64abd1f9c spapr: Add ibm,processor-radix-AP-encodings to the device tree
Use the new ioctl, KVM_PPC_GET_RMMU_INFO, to fetch radix MMU
information from KVM and present the page encodings in the device tree
under ibm,processor-radix-AP-encodings. This provides page size
information to the guest which is necessary for it to use radix mode.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[dwg: Compile fix for 32-bit targets, style nit fix]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Alexey Kardashevskiy
d6ee2a7c85 target-ppc: kvm: make use of KVM_CREATE_SPAPR_TCE_64
KVM_CAP_SPAPR_TCE capability allows creating TCE tables in KVM which
allows having in-kernel acceleration for H_PUT_TCE_xxx hypercalls.
However it only supports 32bit DMA windows at zero bus offset.

There is a new KVM_CAP_SPAPR_TCE_64 capability which supports 64bit
window size, variable page size and bus offset.

This makes use of the new capability. The kernel headers are already
updated as the kernel support went in to v4.6.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Thomas Huth
9d169fb3c8 hw/ppc/pnv: Classify the "PowerNV Chip" devices as CPU devices
The devices that are derived from TYPE_PNV_CHIP currently show up
as "uncategorized" devices in the help text of "-device ?". Since
they obviously are related to the CPU, let's put them into the
CPU category instead.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Cédric Le Goater
147ff8079e ppc/spapr: QOM'ify sPAPRRTCState
Also use an 'sPAPRRTCState' attribute under the sPAPR machine to hold
the RTC object. Overall, these changes remove an unnecessary and
implicit dependency on SysBus.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
David Gibson
3fa14fbe13 pseries: Add pseries-2.10 machine type
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Sam Bobroff
f3d9f303ac target/ppc: Improve accuracy of guest HTM availability on P8s
On Power8 hosts it is currently theoretically possible for QEMU/KVM-HV guests
to receive a ibm,pa-features property indicating that HTM support is available
when it is not.  The situation would occur if the platform firmware of
a Power8 host cleared the HTM bit of the ibm,pa-features property.
QEMU would query KVM for the availability of HTM, which will return no
support, but workaround code in kvm_arch_init_vcpu() would then
re-enable it because KVM_HV is in use and the processor is P8.

This patch adjusts the workaround in kvm_arch_init_vcpu() so that it does not
enable HTM (in the above case) unless the host kernel indicates to the QEMU
process, via the auxiliary vector, that userspace can use HTM (via the HWCAP2
bit KVM_FEATURE2_HTM).

The reason to use the value from the auxiliary vector is that it is
set based only on what the host kernel found in the ibm,pa-features
HTM bit at boot time.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-04-26 12:00:41 +10:00
Anthony Xu
28b99f473b move xen-mapcache.c to hw/i386/xen/
move xen-mapcache.c to hw/i386/xen/

Signed-off -by: Anthony Xu <anthony.xu@intel.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-04-25 11:04:34 -07:00
Anthony Xu
93d43e7e11 move xen-hvm.c to hw/i386/xen/
move xen-hvm.c to hw/i386/xen/

Signed-off -by: Anthony Xu <anthony.xu@intel.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-04-25 11:04:34 -07:00
Anthony Xu
56e2cd2452 move xen-common.c to hw/xen/
move xen-common.c to hw/xen/

Signed-off -by: Anthony Xu <anthony.xu@intel.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-04-25 11:04:34 -07:00
Stefano Stabellini
d6a3f64ad3 add xen-9p-backend to MAINTAINERS under Xen
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
CC: groug@kaod.org
CC: anthony.perard@citrix.com
2017-04-25 11:04:33 -07:00
Stefano Stabellini
e737b6d5c3 xen/9pfs: build and register Xen 9pfs backend
Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
CC: anthony.perard@citrix.com
CC: jgross@suse.com
CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
CC: Greg Kurz <groug@kaod.org>
2017-04-25 11:04:33 -07:00
Stefano Stabellini
4476e09e34 xen/9pfs: send responses back to the frontend
Once a request is completed, xen_9pfs_push_and_notify gets called. In
xen_9pfs_push_and_notify, update the indexes (data has already been
copied to the sg by the common code) and send a notification to the
frontend.

Schedule the bottom-half to check if we already have any other requests
pending.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
CC: anthony.perard@citrix.com
CC: jgross@suse.com
CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
CC: Greg Kurz <groug@kaod.org>
2017-04-25 11:04:33 -07:00
Stefano Stabellini
40a2389207 xen/9pfs: implement in/out_iov_from_pdu and vmarshal/vunmarshal
Implement xen_9pfs_init_in/out_iov_from_pdu and
xen_9pfs_pdu_vmarshal/vunmarshall by creating new sg pointing to the
data on the ring.

This is safe as we only handle one request per ring at any given time.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
CC: anthony.perard@citrix.com
CC: jgross@suse.com
CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
CC: Greg Kurz <groug@kaod.org>
2017-04-25 11:04:33 -07:00
Stefano Stabellini
47b70fb1e4 xen/9pfs: receive requests from the frontend
Upon receiving an event channel notification from the frontend, schedule
the bottom half. From the bottom half, read one request from the ring,
create a pdu and call pdu_submit to handle it.

For now, only handle one request per ring at a time.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
CC: anthony.perard@citrix.com
CC: jgross@suse.com
CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
CC: Greg Kurz <groug@kaod.org>
2017-04-25 11:04:33 -07:00
Stefano Stabellini
f23ef34a5d xen/9pfs: connect to the frontend
Write the limits of the backend to xenstore. Connect to the frontend.
Upon connection, allocate the rings according to the protocol
specification.

Initialize a QEMUBH to schedule work upon receiving an event channel
notification from the frontend.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
CC: anthony.perard@citrix.com
CC: jgross@suse.com
CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
CC: Greg Kurz <groug@kaod.org>
2017-04-25 11:04:33 -07:00
Stefano Stabellini
b37eeb0201 xen/9pfs: introduce Xen 9pfs backend
Introduce the Xen 9pfs backend: add struct XenDevOps to register as a
Xen backend and add struct V9fsTransport to register as v9fs transport.

All functions are empty stubs for now.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
CC: anthony.perard@citrix.com
CC: jgross@suse.com
CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
CC: Greg Kurz <groug@kaod.org>
2017-04-25 11:04:28 -07:00
Peter Maydell
fe491fa85c Merge remote-tracking branch 'remotes/agraf/tags/signed-s390-for-upstream' into staging
Patch queue for s390 - 2017-04-25

Two simple fixes this time around:

  - fix BQL for s390 virtio target
  - Fix SIGP emulation

# gpg: Signature made Tue 25 Apr 2017 12:40:38 BST
# gpg:                using RSA key 0x2B33791E03FEDC60
# gpg: Good signature from "Alexander Graf <agraf@suse.de>"
# gpg:                 aka "Alexander Graf <alex@csgraf.de>"
# Primary key fingerprint: 7F2A 4C87 F94C 5EFE DBC3  75DC 1631 30DA 5B24 530A
#      Subkey fingerprint: 54D3 364F A6C3 60FF AF81  EB53 2B33 791E 03FE DC60

* remotes/agraf/tags/signed-s390-for-upstream:
  s390x/misc_helper.c: wrap s390_virtio_hypercall in BQL
  target-s390x: Mask the SIGP order_code to 8bit.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-25 14:48:55 +01:00
Peter Maydell
b8c7193fe9 Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Tue 25 Apr 2017 12:22:03 BST
# gpg:                using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  COLO-compare: Optimize tcp compare trace event
  COLO-compare: Optimize tcp compare for option field
  slirp: add a fake NC-SI backend
  aspeed: add a FTGMAC100 nic
  net/ftgmac100: add a 'aspeed' property
  net: add FTGMAC100 support
  hw/net: add MII definitions
  colo-compare: Fix old packet check bug.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-25 14:14:17 +01:00
Aurelien Jarno
2cf9953bee s390x/misc_helper.c: wrap s390_virtio_hypercall in BQL
s390_virtio_hypercall can trigger IO events and interrupts, most notably
when using virtio-ccw devices.

Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Fixes: 278f5e98c6 ("s390x/misc_helper.c: wrap IO instructions in BQL")
Signed-off-by: Alexander Graf <agraf@suse.de>
2017-04-25 13:39:43 +02:00
Philipp Kern
601b9a9008 target-s390x: Mask the SIGP order_code to 8bit.
According to "CPU Signaling and Response", "Signal-Processor Orders",
the order field is bit position 56-63. Without this, the Linux
guest kernel is sometimes unable to stop emulation and enters
an infinite loop of "XXX unknown sigp: 0xffffffff00000005".

Signed-off-by: Philipp Kern <phil@philkern.de>
Reviewed-by: Thomas Huth <thuth@tuxfamily.org>
[agraf: add comment according to email]
Signed-off-by: Alexander Graf <agraf@suse.de>
2017-04-25 13:39:43 +02:00
Brendan Shanks
4ba967ad74 ui/cocoa.m: Fix macOS 10.12 deprecation warnings
macOS 10.12 deprecated/replaced many AppKit constants to make naming
more consistent. Use the new constants, and #define them to the
old constants when compiling against a pre-10.12 SDK.

Signed-off-by: Brendan Shanks <brendan@bslabs.net>
Message-id: 20170425062952.99149-1-brendan@bslabs.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-25 12:33:51 +01:00
Zhang Chen
0fc8aec7de COLO-compare: Optimize tcp compare trace event
Optimize two trace events as one, adjust print format make
it easy to read. rename trace_colo_compare_pkt_info_src/dst
to trace_colo_compare_tcp_info.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-04-25 19:17:25 +08:00
Zhang Chen
184d4d4203 COLO-compare: Optimize tcp compare for option field
In this patch we support packet that have tcp options field.
Add tcp options field check, If the packet have options
field we just skip it and compare tcp payload,
Avoid unnecessary checkpoint, optimize performance.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-04-25 19:17:25 +08:00
Cédric Le Goater
47bb83cad4 slirp: add a fake NC-SI backend
NC-SI (Network Controller Sideband Interface) enables a BMC to manage
a set of NICs on a system. This model takes the simplest approach and
reverses the NC-SI packets to pretend a NIC is present and exercise
the Linux driver.

The NCSI header file <ncsi-pkt.h> comes from mainline Linux and was
untabified.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-04-25 19:17:25 +08:00
Cédric Le Goater
ea337c6549 aspeed: add a FTGMAC100 nic
There is a second NIC but we do not use it for the moment. We use the
'aspeed' property to tune the definition of the end of ring buffer bit
for the Aspeed SoCs.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-04-25 19:17:25 +08:00
Cédric Le Goater
1335fe3eb2 net/ftgmac100: add a 'aspeed' property
The Aspeed SoCs have a different definition of the end of the ring
buffer bit. Add a property to specify which set of bits should be used
by the NIC.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-04-25 19:17:25 +08:00
Krzysztof Kozlowski
d77b71c2a4 hw/arm/exynos: Add generic SDHCI devices
Exynos4210 has four SD/MMC controllers supporting:
 - SD Standard Host Specification Version 2.0,
 - MMC Specification Version 4.3,
 - SDIO Card Specification Version 2.0,
 - DMA and ADMA.

Add emulation of SDHCI devices which allows accessing storage through SD
cards. Differences from real hardware:
 - Devices are shipped with eMMC memory, not SD card.
 - The Exynos4210 SDHCI has few more registers, e.g. for
   controlling the clocks, additional status (0x80, 0x84, 0x8c). These
   are not implemented.

Testing on smdkc210 machine with "-drive file=FILE,if=sd,bus=0,index=2".

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Message-id: 20170422190709.8676-1-krzk@kernel.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-25 11:17:49 +01:00
Peter Maydell
f4b5b021c8 Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging
# gpg: Signature made Mon 24 Apr 2017 20:18:05 BST
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  qemu-iotests: _cleanup_qemu must be called on exit
  block/rbd: Add support for reopen()
  block/rbd - update variable names to more apt names
  block: use bdrv_can_set_read_only() during reopen
  block: introduce bdrv_can_set_read_only()
  block: code movement
  block: honor BDRV_O_ALLOW_RDWR when clearing bs->read_only
  block: do not set BDS read_only if copy_on_read enabled
  block: add bdrv_set_read_only() helper function
  qemu-iotests: exclude vxhs from image creation via protocol
  block/vxhs.c: Add qemu-iotests for new block device type "vxhs"
  block/vxhs.c: Add support for a new block device type called "vxhs"

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-25 09:21:54 +01:00
Jeff Cody
ecfa185400 qemu-iotests: _cleanup_qemu must be called on exit
For the tests that use the common.qemu functions for running a QEMU
process, _cleanup_qemu must be called in the exit function.

If it is not, if the qemu process aborts, then not all of the droppings
are cleaned up (e.g. pidfile, fifos).

This updates those tests that did not have a cleanup in qemu-iotests.

(I swapped spaces for tabs in test 102 as well)

Reported-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: d59c2f6ad6c1da8b9b3c7f357c94a7122ccfc55a.1492544096.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
56e7cf8df0 block/rbd: Add support for reopen()
This adds support for reopen in rbd, for changing between r/w and r/o.

Note, that this is only a flag change, but we will block a change from
r/o to r/w if we are using an RBD internal snapshot.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: d4e87539167ec6527d44c97b164eabcccf96e4f3.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
80b61a27c6 block/rbd - update variable names to more apt names
Update 'clientname' to be 'user', which tracks better with both
the QAPI and rados variable naming.

Update 'name' to be 'image_name', as it indicates the rbd image.
Naming it 'image' would have been ideal, but we are using that for
the rados_image_t value returned by rbd_open().

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: b7ec1fb2e1cf36f9b6911631447a5b0422590b7d.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
3d8ce171cb block: use bdrv_can_set_read_only() during reopen
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 00aed7ffdd7be4b9ed9ce1007d50028a72b34ebe.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
45803a0396 block: introduce bdrv_can_set_read_only()
Introduce check function for setting read_only flags.  Will return < 0 on
error, with appropriate Error value set.  Does not alter any flags.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: e2bba34ac3bc76a0c42adc390413f358ae0566e8.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
93ed524e3d block: code movement
Move bdrv_is_read_only() up with its friends.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: 73b2399459760c32506f9407efb9dddb3a2789de.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
d6fcdf06d9 block: honor BDRV_O_ALLOW_RDWR when clearing bs->read_only
The BDRV_O_ALLOW_RDWR flag allows / prohibits the changing of
the BDS 'read_only' state, but there are a few places where it
is ignored.  In the bdrv_set_read_only() helper, make sure to
honor the flag.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: be2e5fb2d285cbece2b6d06bed54a6f56520d251.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
e2b8247a32 block: do not set BDS read_only if copy_on_read enabled
A few block drivers will set the BDS read_only flag from their
.bdrv_open() function.  This means the bs->read_only flag could
be set after we enable copy_on_read, as the BDRV_O_COPY_ON_READ
flag check occurs prior to the call to bdrv->bdrv_open().

This adds an error return to bdrv_set_read_only(), and an error will be
return if we try to set the BDS to read_only while copy_on_read is
enabled.

This patch also changes the behavior of vvfat.  Before, vvfat could
override the drive 'readonly' flag with its own, internal 'rw' flag.

For instance, this -drive parameter would result in a writable image:

"-drive format=vvfat,dir=/tmp/vvfat,rw,if=virtio,readonly=on"

This is not correct.  Now, attempting to use the above -drive parameter
will result in an error (i.e., 'rw' is incompatible with 'readonly=on').

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 0c5b4c1cc2c651471b131f21376dfd5ea24d2196.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
fe5241bfe3 block: add bdrv_set_read_only() helper function
We have a helper wrapper for checking for the BDS read_only flag,
add a helper wrapper to set the read_only flag as well.

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 9b18972d05f5fa2ac16c014f0af98d680553048d.1491597120.git.jcody@redhat.com
2017-04-24 15:09:33 -04:00
Jeff Cody
a98f49f46a qemu-iotests: exclude vxhs from image creation via protocol
The protocol VXHS does not support image creation.  Some tests expect
to be able to create images through the protocol.  Exclude VXHS from
these tests.

Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-04-24 15:09:33 -04:00
Ashish Mittal
ae0c0a3dec block/vxhs.c: Add qemu-iotests for new block device type "vxhs"
These changes use a vxhs test server that is a part of the following
repository:
https://github.com/VeritasHyperScale/libqnio.git

Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: 1491277689-24949-3-git-send-email-Ashish.Mittal@veritas.com
2017-04-24 15:09:33 -04:00
Ashish Mittal
da92c3ff60 block/vxhs.c: Add support for a new block device type called "vxhs"
Source code for the qnio library that this code loads can be downloaded from:
https://github.com/VeritasHyperScale/libqnio.git

Sample command line using JSON syntax:
./x86_64-softmmu/qemu-system-x86_64 -name instance-00000008 -S -vnc 0.0.0.0:0
-k en-us -vga cirrus -device virtio-balloon-pci,id=balloon0,bus=pci.0,addr=0x5
-msg timestamp=on
'json:{"driver":"vxhs","vdisk-id":"c3e9095a-a5ee-4dce-afeb-2a59fb387410",
"server":{"host":"172.172.17.4","port":"9999"}}'

Sample command line using URI syntax:
qemu-img convert -f raw -O raw -n
/var/lib/nova/instances/_base/0c5eacd5ebea5ed914b6a3e7b18f1ce734c386ad
vxhs://192.168.0.1:9999/c6718f6b-0401-441d-a8c3-1f0064d75ee0

Sample command line using TLS credentials (run in secure mode):
./qemu-io --object
tls-creds-x509,id=tls0,dir=/etc/pki/qemu/vxhs,endpoint=client -c 'read
-v 66000 2.5k' 'json:{"server.host": "127.0.0.1", "server.port": "9999",
"vdisk-id": "/test.raw", "driver": "vxhs", "tls-creds":"tls0"}'

[Jeff: Modified trace-events with the correct string formatting]

Signed-off-by: Ashish Mittal <Ashish.Mittal@veritas.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: 1491277689-24949-2-git-send-email-Ashish.Mittal@veritas.com
2017-04-24 15:08:42 -04:00
Peter Maydell
eab1e53cac Merge remote-tracking branch 'remotes/kraxel/tags/pull-vga-20170424-1' into staging
fix display update races, part one.
add xres + yres properties to qxl and virtio.
misc fixes and cleanups.

# gpg: Signature made Mon 24 Apr 2017 13:14:49 BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-vga-20170424-1:
  virtio-gpu: add xres and yres properties
  qxl: add xres and yres properties
  vmsvga: fix vmsvga_update_display
  g364fb: make display updates thread safe
  exynos: make display updates thread safe
  framebuffer: make display updates thread safe
  vga: make display updates thread safe.
  vga: add vga_scanline_invalidated helper
  memory: add support getting and using a dirty bitmap copy.
  bitmap: add bitmap_copy_and_clear_atomic
  virtio-gpu: replace PIXMAN_* by PIXMAN_BE_*
  console: add same displaychangelistener registration pre-condition
  console: add same surface replace pre-condition

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 15:37:30 +01:00
Peter Maydell
4c55b1d0ba Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2017-04-24' into staging
Error reporting patches for 2017-04-24

# gpg: Signature made Mon 24 Apr 2017 08:16:34 BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-error-2017-04-24:
  error: Apply error_propagate_null.cocci again
  qga: Make errp the last parameter of qga_vss_fsfreeze
  migration: Make errp the last parameter of local functions
  scsi: Make errp the last parameter of virtio_scsi_common_realize
  fdc: Make errp the last parameter of fdctrl_connect_drives
  nfs: Make errp the last parameter of nfs_client_open
  block: Make errp the last parameter of commit_active_start
  mirror: Make errp the last parameter of mirror_start_job
  crypto: Make errp the last parameter of functions
  block: Make errp the last parameter of bdrv_img_create
  socket: Make errp the last parameter of vsock_connect_saddr
  socket: Make errp the last parameter of unix_connect_saddr
  socket: Make errp the last parameter of inet_connect_saddr
  socket: Make errp the last parameter of socket_connect
  util/error: Fix leak in error_vprepend()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 14:49:48 +01:00
BALATON Zoltan
9eb2575e6c ppc: Add SM501 device in ppc softmmu targets default configs
This is not used by default on any emulated machine yet but it is
still useful to have it compiled so it can be added from the command
line for clients that can use it (e.g. MorphOS has no driver for any
other emulated video cards but can output via SM501)

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: bf305f36dcde152668cf12438ef983cd09ed2d3f.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
2edd6e4ac5 sm501: Add vmstate descriptor
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 86803c6f40cd678b61b3b1a1429683f60f0aa89a.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
b612a49db2 sm501: Add some more missing registers
This is to allow clients to initialise these without failing as long
as no 2D engine function is called that would use the written value.
Saved values are not used yet (may get used when more of 2D engine is
added sometimes) and clients normally only write to most of these
registers, nothing is known to ever read them but they are documented
as read/write so also implement read for these.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 80adf8e4d084ec6cc30d149f8e8215debb67314a.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
1ae5e6eb42 sm501: Add support for panel layer
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 2029a276362c0c3a14c78acb56baa9466848dd51.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
01d2d584c9 sm501: Misc clean ups
- Rename a variable
- Move variable declarations out of loop to the beginning in draw_hwc_line

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 187c9e4e09d9bc2967b2454b36bb088ceef0b8bc.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
6a2a5aae02 sm501: Fix hardware cursor
Rework HWC handling to simplify it and fix cursor not updating on
screen as needed. Previously cursor was not updated because checking
for changes in a line overrode the update flag set for the cursor but
fixing this is not enough because the cursor should also be updated if
its shape or location changes. Introduce hwc_invalidate() function to
handle that similar to other display controller models.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 6970a5e9868b7246656c1d02038dc5d5fa369507.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
afef2e1d53 sm501: Fix device endianness
We only emulate the sysbus device in its default LE mode and PCI is LE
as well so specify this for registers and framebuffer memory.

Note that though the Linux kernel driver has code which claims to
handle both big and little endian, it is obviously bogus for 16 bit
and cannot be trusted as a source of information on the framebuffer
pixel format. This is our best guess about device behaviour based on
the specs and testing with MorphOS that is known to work on real HW.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 8b9605a569f8bf54074e15903620b18cd9967c89.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
efae27848d sm501: Add emulation of chip connected via PCI
Only the display controller part is created automatically on PCI

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 647d292c6f5abba8b2a614687229949b5dcb864e.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
c795fa8447 sm501: Get rid of base address in draw_hwc_line
Do not use the base address to access data in local memory. This is in
preparation to allow chip connected via PCI where base address depends
on where the BAR is mapped so it will be unknown.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 79dab21bc6ec4d563aabf265c3bab40e2e95aae8.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
ca8a110470 sm501: QOMify
Adding vmstate saving is not in this patch because the state structure
will be changed in further patches, then another patch will add
vmstate descriptor after those changes.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: a32b7fc981a20205f96d530d8e958f12ace1104c.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
70e46ca887 sm501: Add missing arbitration control register
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: d1eaf3b19c40aeb32a343a211f2b56664a67f948.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
e2ee84760e sm501: Use defined constants instead of literal values where available
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 31205c2df623e7b133ef942ff4f5e95fff800a14.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
BALATON Zoltan
64f1603b07 sm501: Fixed code style and a few typos in comments
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 36288b703e7d56822c818567193ff28cdc47377e.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 12:32:12 +01:00
Peter Maydell
d526e5d874 Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-signed' into staging
Update OpenBIOS images

# gpg: Signature made Sat 22 Apr 2017 07:36:01 BST
# gpg:                using RSA key 0x5BC2C56FAE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* remotes/mcayland/tags/qemu-openbios-signed:
  Update OpenBIOS images to 04898e8 built from submodule.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 11:32:02 +01:00
Peter Maydell
cd1ea50895 Merge remote-tracking branch 'remotes/mcayland/tags/qemu-sparc-signed' into staging
qemu-sparc update

# gpg: Signature made Fri 21 Apr 2017 20:09:35 BST
# gpg:                using RSA key 0x5BC2C56FAE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* remotes/mcayland/tags/qemu-sparc-signed:
  tcx: switch to load_image_mr() and remove prom_addr hack
  tcx: use tcx_set_dirty() for accelerated ops
  tcx: remove primitives for non-32-bit surfaces
  tcx: remove TARGET_PAGE_SIZE from tcx24_update_display()
  tcx: remove TARGET_PAGE_SIZE from tcx_update_display()
  tcx: remove page24 and cpage from tcx24_update_display()
  tcx: alter tcx24_reset_dirty() to accept address and length parameters
  tcx: alter tcx24_check_dirty() to accept address and length parameters
  tcx: ensure tcx_set_dirty() also invalidates the 24-bit plane and cplane
  tcx: alter tcx_set_dirty() to accept address and length parameters
  cg3: switch to load_image_mr() and remove prom-addr hack
  cg3: fix up size parameter for memory_region_get_dirty()
  cg3: remove TARGET_PAGE_SIZE rounding on dirty page detection

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-24 09:43:03 +01:00
Gerd Hoffmann
729abb6a92 virtio-gpu: add xres and yres properties
So the default resolution is configurable.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20170421092214.8176-1-kraxel@redhat.com
2017-04-24 10:12:28 +02:00
Gerd Hoffmann
6f663d7be9 qxl: add xres and yres properties
Add properties for the default display resolution, pass
on that information to the guest so the driver can use it.

Also move up qxl_crc32() function so we don't need a
forward declaration.

Additionally guest driver updates are needed so the
guest driver will actually pick this up, which will
probably land in linux kernel 4.12.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421092234.8368-1-kraxel@redhat.com
2017-04-24 10:12:28 +02:00
Gerd Hoffmann
104bd1dc70 vmsvga: fix vmsvga_update_display
Fix standard vga mode check:  Both s->config and s->enabled must be set
to enable vmware command fifo processing.

Drop dirty tracking code from the fifo rendering code path, it isn't
used anyway because vmsvga turns off dirty tracking when leaving
standard vga mode.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-9-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Gerd Hoffmann
7fcf0c24e7 g364fb: make display updates thread safe
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-8-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Gerd Hoffmann
553bcce5ac exynos: make display updates thread safe
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-7-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Gerd Hoffmann
167e9c7982 framebuffer: make display updates thread safe
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-6-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Gerd Hoffmann
fec5e8c92b vga: make display updates thread safe.
The vga code clears the dirty bits *after* reading the framebuffer
memory.  So if the guest framebuffer updates hits the race window
between vga reading the framebuffer and vga clearing the dirty bits
vga will miss that update

Fix it by using the new memory_region_copy_and_clear_dirty()
memory_region_copy_get_dirty() functions.  That way we clear the
dirty bitmap before reading the framebuffer.  Any guest display
updates happening in parallel will be properly tracked in the
dirty bitmap then and the next display refresh will pick them up.

Problem triggers with mttcg only.  Before mttcg was merged tcg
never ran in parallel to vga emulation.  Using kvm will hide the
problem too, due to qemu operating on a userspace copy of the
kernel's dirty bitmap.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-5-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Gerd Hoffmann
f3289f6f0f vga: add vga_scanline_invalidated helper
Add vga_scanline_invalidated helper to check whenever a scanline was
invalidated.  Add a sanity check to fix OOB read access for display
heights larger than 2048.

Only cirrus uses this, for hardware cursor rendering, so having this
work properly for the first 2048 scanlines only shouldn't be a problem
as the cirrus can't handle large resolutions anyway.  Also changing the
invalidated_y_table size would break live migration.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-4-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Gerd Hoffmann
8deaf12ca1 memory: add support getting and using a dirty bitmap copy.
This patch adds support for getting and using a local copy of the dirty
bitmap.

memory_region_snapshot_and_clear_dirty() will create a snapshot of the
dirty bitmap for the specified range, clear the dirty bitmap and return
the copy.  The returned bitmap can be a bit larger than requested, the
range is expanded so the code can copy unsigned longs from the bitmap
and avoid atomic bit update operations.

memory_region_snapshot_get_dirty() will return the dirty status of
pages, pretty much like memory_region_get_dirty(), but using the copy
returned by memory_region_copy_and_clear_dirty().

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-3-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Gerd Hoffmann
d6eb141392 bitmap: add bitmap_copy_and_clear_atomic
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-2-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Laurent Vivier
a27450ec04 virtio-gpu: replace PIXMAN_* by PIXMAN_BE_*
This avoids a "#ifdef HOST_WORDS_BIGENDIAN" and this is the purpose
of PIXMAN_BE_* macros.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Sahid Orentino Ferdjaoui <sahid.ferdjaoui@redhat.com>
Message-id: 20170403114044.15762-1-lvivier@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Marc-André Lureau
e0665c3b5f console: add same displaychangelistener registration pre-condition
Catch an invalid state. Mainly useful for documentation purposes.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20170406120513.638-3-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Marc-André Lureau
6905b93447 console: add same surface replace pre-condition
Catch an invalid state early, before a potential use-after-free. This is
mainly useful for documentation purposes.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20170406120513.638-2-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-24 10:12:28 +02:00
Fam Zheng
021c9d25b3 error: Apply error_propagate_null.cocci again
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-15-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:45 +02:00
Fam Zheng
4462a5307b qga: Make errp the last parameter of qga_vss_fsfreeze
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-14-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:44 +02:00
Fam Zheng
bbfb89e38c migration: Make errp the last parameter of local functions
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-13-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:44 +02:00
Fam Zheng
bf46e67ddb scsi: Make errp the last parameter of virtio_scsi_common_realize
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-12-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:44 +02:00
Fam Zheng
c0ca74f6f3 fdc: Make errp the last parameter of fdctrl_connect_drives
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-11-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:44 +02:00
Fam Zheng
cb8d4bf677 nfs: Make errp the last parameter of nfs_client_open
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-10-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:44 +02:00
Fam Zheng
78bbd910bb block: Make errp the last parameter of commit_active_start
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-9-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:44 +02:00
Fam Zheng
51ccfa2dbf mirror: Make errp the last parameter of mirror_start_job
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-8-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:44 +02:00
Fam Zheng
375092332e crypto: Make errp the last parameter of functions
Move opaque to 2nd instead of the 2nd to last, so that compilers help
check with the conversion.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-7-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Commit message typo corrected]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:13:22 +02:00
Fam Zheng
9217283dc8 block: Make errp the last parameter of bdrv_img_create
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-6-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:12:59 +02:00
Fam Zheng
1a9a7f25a0 socket: Make errp the last parameter of vsock_connect_saddr
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-5-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:12:59 +02:00
Fam Zheng
2bdc6791b9 socket: Make errp the last parameter of unix_connect_saddr
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-4-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:12:59 +02:00
Fam Zheng
6dffc1f670 socket: Make errp the last parameter of inet_connect_saddr
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-3-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:12:59 +02:00
Fam Zheng
226799cec5 socket: Make errp the last parameter of socket_connect
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170421122710.15373-2-famz@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:12:59 +02:00
Max Reitz
536eeea869 util/error: Fix leak in error_vprepend()
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20170413160952.29918-1-mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-24 09:12:59 +02:00
Cédric Le Goater
bd44300d1a net: add FTGMAC100 support
The FTGMAC100 device is an Ethernet controller with DMA function that
can be found on Aspeed SoCs (which include NCSI).

It is fully compliant with IEEE 802.3 specification for 10/100 Mbps
Ethernet and IEEE 802.3z specification for 1000 Mbps Ethernet and
includes Reduced Media Independent Interface (RMII) and Reduced
Gigabit Media Independent Interface (RGMII) interfaces. It adopts an
AHB bus interface and integrates a link list DMA engine with direct
M-Bus accesses for transmitting and receiving packets. It has
independent TX/RX fifos, supports half and full duplex (1000 Mbps mode
only supports full duplex), flow control for full duplex and
backpressure for half duplex.

The FTGMAC100 also implements IP, TCP, UDP checksum offloads and
supports IEEE 802.1Q VLAN tag insertion and removal. It offers
high-priority transmit queue for QoS and CoS applications

This model is backed with a RealTek 8211E PHY which is the chip found
on the AST2500 EVB. It is complete enough to satisfy two different
Linux drivers and a U-Boot driver. Not supported features are :

 - IEEE 802.1Q VLAN
 - High Priority Transmit Queue
 - Wake-On-LAN functions

The code is based on the Coldfire Fast Ethernet Controller model.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-04-24 11:30:03 +08:00
Cédric Le Goater
30adcc8fab hw/net: add MII definitions
This adds comments on the Basic mode control and status registers bit
definitions. It also adds a couple of bits for 1000BASE-T and the
RealTek 8211E PHY for the FTGMAC100 model to use.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-04-24 11:30:03 +08:00
Zhang Chen
d25a7dabf2 colo-compare: Fix old packet check bug.
If colo-compare find one old packet,we can notify colo-frame
do checkpoint, no need continue find more old packet here.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-04-24 11:30:03 +08:00
Mark Cave-Ayland
e0a31457e1 Update OpenBIOS images to 04898e8 built from submodule.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2017-04-22 07:27:20 +01:00
Stefano Stabellini
c9fb47e7d0 9p: introduce a type for the 9p header
Use the new type in virtio-9p-device.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
CC: anthony.perard@citrix.com
CC: jgross@suse.com
CC: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
CC: Greg Kurz <groug@kaod.org>
2017-04-21 12:41:29 -07:00
Stefano Stabellini
f65eadb639 xen: import ring.h from xen
Do not use the ring.h header installed on the system. Instead, import
the header into the QEMU codebase. This avoids problems when QEMU is
built against a Xen version too old to provide all the ring macros.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
CC: anthony.perard@citrix.com
CC: jgross@suse.com
2017-04-21 12:41:29 -07:00
Juergen Gross
c1cdd9d5be configure: use pkg-config for obtaining xen version
Instead of trying to guess the Xen version to use by compiling various
test programs first just ask the system via pkg-config. Only if it
can't return the version fall back to the test program scheme.

If configure is being called with dedicated flags for the Xen libraries
use those instead of the pkg-config output. This will avoid breaking
an in-tree Xen build of an old Xen version while a new Xen version is
installed on the build machine: pkg-config would pick up the installed
Xen config files as the Xen tree wouldn't contain any of them.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-04-21 12:41:23 -07:00
Paul Durrant
14d015b6fc xen: additionally restrict xenforeignmemory operations
Commit f0f272baf3a7 "xen: use libxendevice model to restrict operations"
added a command-line option (-xen-domid-restrict) to limit operations
using the libxendevicemodel API to a specified domid. The commit also
noted that the restriction would be extended to cover operations issued
via other xen libraries by subsequent patches.

My recent Xen patch [1] added a call to the xenforeignmemory API to allow
it to be restricted. This patch now makes use of that new call when the
-xen-domid-restrict option is passed.

[1] http://xenbits.xen.org/gitweb/?p=xen.git;a=commit;h=5823d6eb

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-04-21 12:40:14 -07:00
Paul Durrant
1c599472b0 xen: use libxendevice model to restrict operations
This patch adds a command-line option (-xen-domid-restrict) which will
use the new libxendevicemodel API to restrict devicemodel [1] operations
to the specified domid. (Such operations are not applicable to the xenpv
machine type).

This patch also adds a tracepoint to allow successful enabling of the
restriction to be monitored.

[1] I.e. operations issued by libxendevicemodel. Operation issued by other
    xen libraries (e.g. libxenforeignmemory) are currently still unrestricted
    but this will be rectified by subsequent patches.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-04-21 12:40:14 -07:00
Juergen Gross
f1167ee684 xen: use 5 digit xen versions
Today qemu is using e.g. the value 480 for Xen version 4.8.0. As some
Xen version tests are using ">" relations this scheme will lead to
problems when Xen version 4.10.0 is being reached.

Instead of the 3 digit schem use a 5 digit scheme (e.g. 40800 for
version 4.8.0).

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-04-21 12:40:09 -07:00
Paul Durrant
d655f34e6d xen: use libxendevicemodel when available
This patch modifies the wrapper functions in xen_common.h to use the
new xendevicemodel interface if it is available along with compatibility
code to use the old libxenctrl interface if it is not.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Anthony Perard <anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-04-21 12:39:43 -07:00
Paul Durrant
da8090ccb7 configure: detect presence of libxendevicemodel
This patch adds code in configure to set CONFIG_XEN_CTRL_INTERFACE_VERSION
to a new value of 490 if libxendevicemodel is present in the build
environment.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Anthony Perard <anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-04-21 12:39:36 -07:00
Peter Maydell
32c7e0ab75 Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20170421' into staging
migration/next for 20170421

# gpg: Signature made Fri 21 Apr 2017 11:28:13 BST
# gpg:                using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* remotes/juanquintela/tags/migration/20170421: (65 commits)
  hmp: info migrate_parameters format tunes
  hmp: info migrate_capability format tunes
  migration: rename max_size to threshold_size
  migration: set current_active_state once
  virtio-rng: stop virtqueue while the CPU is stopped
  migration: don't close a file descriptor while it can be in use
  ram: Remove migration_bitmap_extend()
  migration: Disable hotplug/unplug during migration
  qdev: Move qdev_unplug() to qdev-monitor.c
  qdev: Export qdev_hot_removed
  qdev: qdev_hotplug is really a bool
  migration: Remove MigrationState parameter from migration_is_idle()
  ram: Use RAMBitmap type for coherence
  ram: rename last_ram_offset() last_ram_pages()
  ram: Use ramblock and page offset instead of absolute offset
  ram: Change offset field in PageSearchStatus to page
  ram: Remember last_page instead of last_offset
  ram: Use page number instead of an address for the bitmap operations
  ram: reorganize last_sent_block
  ram: ram_discard_range() don't use the mis parameter
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-21 15:59:27 +01:00
Peter Maydell
af7ec403b2 Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Fri 21 Apr 2017 10:52:18 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request:
  simpletrace: document Analyzer method signatures
  trace: Put all trace.o into libqemuutil.a
  configure: eliminate Python dependency for --help

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-21 14:36:45 +01:00
Peter Maydell
09fc586db3 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Fri 21 Apr 2017 10:43:04 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  MAINTAINERS: update my email address
  MAINTAINERS: update Wen's email address
  migration/block: use blk_pwrite_zeroes for each zero cluster
  throttle: make throttle_config(throttle_get_config()) symmetric
  throttle: do not use invalid config in test
  qemu-options: explain disk I/O throttling options

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-21 14:02:10 +01:00
Peter Maydell
b4c963fa82 Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20170421' into staging
The first batch of s390x changes for 2.10:
- the new compat machine
- several cleanups and optimizations
- introspection for css ids

# gpg: Signature made Fri 21 Apr 2017 08:36:25 BST
# gpg:                using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0  18CE DECF 6B93 C6F0 2FAF

* remotes/cohuck/tags/s390x-20170421:
  s390x: Drop useless casts
  s390x: register I/O adapters per ISC during init
  s390x/flic: cache flic in s390_get_flic
  s390x: initialize flic before I/O subsystems
  s390x: use enum for adapter type and standardize its naming
  s390x/css: consolidate the devno property for ccw devices
  s390x/css: provide introspection for virtual subchannel and device busid
  s390x/css: introduce read-only property type for device ids
  s390x/pci: make printf always compile in debug output
  s390x/kvm: make printf always compile in debug output
  s390x: introduce 2.10 compat machine

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-21 12:59:42 +01:00
Peter Maydell
bfec359afb Merge remote-tracking branch 'remotes/armbru/tags/pull-qdev-2017-04-21' into staging
qdev patches for 2017-04-21

# gpg: Signature made Fri 21 Apr 2017 06:37:19 BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-qdev-2017-04-21:
  qdev: remove cannot_destroy_with_object_finalize_yet
  versatile: remove cannot_destroy_with_object_finalize_yet
  ppc: remove cannot_destroy_with_object_finalize_yet
  arm: remove remaining cannot_destroy_with_object_finalize_yet

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-21 11:42:03 +01:00
Peter Xu
2c02468c9b hmp: info migrate_parameters format tunes
Do the same (one per line) to the parameter list.

CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-04-21 12:25:41 +02:00
Peter Xu
d80a0169e3 hmp: info migrate_capability format tunes
Dump the info in a single line is hard to read. Do it one per line.
Also, the first "capabilities:" didn't help much. Let's remove it.

CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-04-21 12:25:41 +02:00
Peter Xu
faec066ab8 migration: rename max_size to threshold_size
In migration codes (especially in migration_thread()), max_size is used
in many place for the threshold value that we will start to do the final
flush and jump to the next stage to dump the whole rest things to
destination. However its name is confusing to first readers. Let's
rename it to "threshold_size" when proper and add a comment for it. No
functional change is made.

CC: Juan Quintela <quintela@redhat.com>
CC: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-04-21 12:25:41 +02:00
Peter Xu
343001f68d migration: set current_active_state once
We set it right above this one. No need to set it twice.

CC: Juan Quintela <quintela@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-04-21 12:25:41 +02:00
Laurent Vivier
a23a6d1839 virtio-rng: stop virtqueue while the CPU is stopped
If we modify the virtio-rng virqueue while the
vmstate is already migrated we can have some
inconsistencies between the virtqueue state and
the memory content.

To avoid this, stop the virtqueue while the CPU
is stopped.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by:  Amit Shah <amit@kernel.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-04-21 12:25:40 +02:00
Laurent Vivier
e8199e4895 migration: don't close a file descriptor while it can be in use
If we close the QEMUFile descriptor in process_incoming_migration_co()
while it has been stopped by an error, the postcopy_ram_listen_thread()
can try to continue to use it. And as the memory has been freed
it is working with an invalid pointer and crashes.

Fix this by releasing the memory after having managed the error
case (which, in fact, calls exit())

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by:  Amit Shah <amit@kernel.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-04-21 12:25:40 +02:00
Juan Quintela
66103a5796 ram: Remove migration_bitmap_extend()
We have disabled memory hotplug, so we don't need to handle
migration_bitamp there.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
2017-04-21 12:25:40 +02:00
Juan Quintela
b06424de62 migration: Disable hotplug/unplug during migration
Until we have reviewed what can/can't be hotplugged during migration,
disable it.  We can enable it later for the things that we know that
work.  For instance, memory hotplug during postcopy doesn't work
currently.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>

--

- Fix typo.  Thanks Thomas.
- Delay migration check after we have checked that we can hotplug that
  device.
- more typos
2017-04-21 12:25:40 +02:00
Juan Quintela
329006799f qdev: Move qdev_unplug() to qdev-monitor.c
It is not used by linux-user, otherwise I need to to create one stub
for migration_is_idle() on following patch.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2017-04-21 12:25:40 +02:00
Juan Quintela
21def24a5a qdev: Export qdev_hot_removed
I need to move qdev_unplug to qdev-monitor in the following patch, and
it needs access to this variable.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
2017-04-21 12:25:40 +02:00
Juan Quintela
9bed84c191 qdev: qdev_hotplug is really a bool
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-04-21 12:25:40 +02:00
Juan Quintela
fab3500526 migration: Remove MigrationState parameter from migration_is_idle()
Only user don't have a MigrationState handly.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:40 +02:00
Juan Quintela
352b0de982 ram: Use RAMBitmap type for coherence
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:40 +02:00
Juan Quintela
b8c4899398 ram: rename last_ram_offset() last_ram_pages()
We always use it as pages anyways.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:40 +02:00
Juan Quintela
f20e286516 ram: Use ramblock and page offset instead of absolute offset
This removes the needto pass also the absolute offset.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:40 +02:00
Juan Quintela
a935e30fbb ram: Change offset field in PageSearchStatus to page
We are moving everything to work on pages, not addresses.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:39 +02:00
Juan Quintela
269ace2951 ram: Remember last_page instead of last_offset
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>

--

Improve comment
Fix typo
2017-04-21 12:25:39 +02:00
Juan Quintela
06b106889a ram: Use page number instead of an address for the bitmap operations
We use an unsigned long for the page number.  Notice that our bitmaps
already got that for the index, so we have that limit.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

--

rename page to page_abs everywhere.
fix trace types for pages
2017-04-21 12:25:39 +02:00
Juan Quintela
2479569466 ram: reorganize last_sent_block
We were setting it far away of when we changed it.  Now everything is
done inside save_page_header.  Once there, reorganize code to pass
RAMState.  We also set CONTINUE flag in a single place.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:39 +02:00
Juan Quintela
aaa2064c2a ram: ram_discard_range() don't use the mis parameter
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:39 +02:00
Juan Quintela
15440dd5a0 ram: Pass RAMBlock to bitmap_sync
We change the meaning of start to be the offset from the beggining of
the block.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:39 +02:00
Chao Fan
030ce1f861 ram: Add page-size to output in 'info migrate'
The number of dirty pages is output in 'pages' in the command
'info migrate', so add page-size to calculate the number of dirty
pages in bytes.

Signed-off-by: Chao Fan <fanc.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-04-21 12:25:39 +02:00
Juan Quintela
20afaed98b ram: Rename qemu_target_page_bits() to qemu_target_page_size()
It was used as a size in all cases except one.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:39 +02:00
Juan Quintela
a0a8aa147a ram: We don't need MigrationState parameter anymore
Remove it from callers and callees.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:39 +02:00
Juan Quintela
5727309d25 migration: Remove MigrationState from migration_in_postcopy
We need to call for the migrate_get_current() in more that half of the
uses, so call that inside.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:39 +02:00
Juan Quintela
6d358d9494 ram: Remove compression_switch and inline its logic
We can calculate its value, so we don't create a variable for it.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>

--

After Peter and Dave review, I dropped the variable and just inlined
the condition.

Fix typo
2017-04-21 12:25:39 +02:00
Juan Quintela
ce25d33781 ram: Move QEMUFile into RAMState
We receive the file from save_live operations and we don't use it
until 3 or 4 levels of calls down.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:39 +02:00
Juan Quintela
204b88b869 ram: Add QEMUFile to RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:38 +02:00
Juan Quintela
96506894a3 ram: Move postcopy_requests into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:38 +02:00
Juan Quintela
47ad861976 ram: Move dirty_pages_rate to RAMState
Treat it like the rest of ram stats counters.  Export its value the
same way.  As an added bonus, no more MigrationState used in
migration_bitmap_sync();

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>

--

Again, dave was the one reviewing it
2017-04-21 12:25:38 +02:00
Juan Quintela
abbf1d7f9b ram: Remove dirty_bytes_rate
It can be recalculated from dirty_pages_rate.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>

--

Dave was the one that reviewed it O:-)
2017-04-21 12:25:38 +02:00
Juan Quintela
42d219d3b0 ram: Create ram_dirty_sync_count()
This is a ram field that was inside MigrationState.  Move it to
RAMState and make it the same that the other ram stats.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
ec481c6c57 ram: Move src_page_req* to RAMState
This are the last postcopy fields still at MigrationState.  Once there
Move MigrationSrcPageRequest to ram.c and remove MigrationState
parameters where appropiate.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
68a098f386 ram: Move last_req_rb to RAMState
It was on MigrationState when it is only used inside ram.c for
postcopy.  Problem is that we need to access it without being able to
pass it RAMState directly.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
9edabd4de6 ram: Remove ram_save_remaining
Just unfold it.  Move ram_bytes_remaining() with the rest of exported
functions.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
072c251157 ram: Use the RAMState bytes_transferred parameter
Somewhere it was passed by reference, just use it from RAMState.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
2f4fde9352 ram: Move bytes_transferred into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
eb859c53dd ram: Move migration_bitmap_rcu into RAMState
Once there, rename the type to be shorter.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
108cfae019 ram: Move migration_bitmap_mutex into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
ceb4d16898 ram: Everything was init to zero, so use memset
And then init only things that are not zero by default.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
0d8ec885ed ram: Move migration_dirty_pages to RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
180f61f75a ram: Move xbzrle_overflows into RAMState
Once there, remove the now unused AccountingInfo struct and var.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
b07016b674 ram: Move xbzrle_cache_miss_rate into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
544c36f188 ram: Move xbzrle_cache_miss into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:37 +02:00
Juan Quintela
f36ada95de ram: Move xbzrle_pages into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>

--

Comment why we need bytes and pages
2017-04-21 12:25:37 +02:00
Juan Quintela
07ed50a2bb ram: Move xbzrle_bytes into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
23b28c3c62 ram: Move iterations into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
29cc3d8a9b ram: Remove norm_mig_bytes_transferred
Its value can be calculated by other exported.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
b4d1c6e722 ram: Move norm_pages to RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
bedf53c14c ram: Remove unused pages_skipped variable
For compatibility, we need to still send a value, but just specify it
and comment the fact.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
5bb1272c38 ram: Remove unused dup_mig_bytes_transferred()
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
f7ccd61b4c ram: Move dup_pages into RAMState
Once there rename it to its actual meaning, zero_pages.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
36040d9cb2 ram: Move iterations_prev into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-04-21 12:25:36 +02:00
Juan Quintela
b5833fde40 ram: Move xbzrle_cache_miss_prev into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-04-21 12:25:36 +02:00
Juan Quintela
68908ed665 ram: Change num_dirty_pages_period type to uint64_t
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
a66cd90c74 ram: Move num_dirty_pages_period into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
c4bdf0cf4b ram: Change byte_xfer_{prev,now} type to uint64_t
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
eac7415958 ram: Move bytes_xfer_prev into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:36 +02:00
Juan Quintela
f664da80fc ram: Move start time into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>

--

Renamed start_time to time_last_bitmap_sync(peterx suggestion)
2017-04-21 12:25:35 +02:00
Juan Quintela
5a98773896 ram: Move bitmap_sync_count into RAMState
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:35 +02:00
Juan Quintela
8d820d6f67 ram: Add dirty_rate_high_cnt to RAMState
We need to add a parameter to several functions to make this work.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:35 +02:00
Juan Quintela
6f37bb8bf3 ram: Create RAMState
We create a struct where to put all the ram state

Start with the following fields:

last_seen_block, last_sent_block, last_offset, last_version and
ram_bulk_stage are globals that are really related together.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>

--

Fix typo and warnings
2017-04-21 12:25:35 +02:00
Juan Quintela
3644915726 ram: Rename block_name to rbname
So all places are consistent on the naming of a block name parameter.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
2017-04-21 12:25:35 +02:00
Juan Quintela
5e58f968f4 ram: Rename flush_page_queue() to migration_page_queue_free()
It reflects better what it does.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
2017-04-21 12:25:35 +02:00
Juan Quintela
3d0684b2ad ram: Update all functions comments
Added doc comments for existing functions comment and rewrite them in
a common style.

Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>

--

Fix Peter Xu comments
Improve postcopy comments as per reviews.
2017-04-21 12:25:35 +02:00
Stefan Hajnoczi
659370f71f simpletrace: document Analyzer method signatures
Users can inherit from the simpletrace.Analyzer class and receive
callbacks when events of interest occur in a trace file.  The method
signature is a little magic because the timestamp and pid arguments are
optional.  Document this.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20170411095654.18383-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-21 10:45:35 +01:00
Xu, Anthony
3d1baccb08 trace: Put all trace.o into libqemuutil.a
Currently all trace.o are linked into qemu-system, qemu-img,
qemu-nbd, qemu-io etc., even the corresponding components
are not included.
Put all trace.o into libqemuutil.a that the linker would only pull in .o
files containing symbols that are actually referenced by the
program.

Signed-off -by: Anthony Xu <anthony.xu@intel.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-21 10:45:35 +01:00
Stefan Hajnoczi
c53eeaf75a configure: eliminate Python dependency for --help
The ./configure script should produce --help output even if Python is
not installed.

Listing trace backends is simple: show the names of all Python modules
in scripts/tracetool/backend/ whose source code contains 'PUBLIC =
True'.

Perform the backend enumeration in shell instead of Python so that we
can move the Python check until after ./configure --help.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20170328134418.3426-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-21 10:45:34 +01:00
Zhang Chen
3ccc0a0163 MAINTAINERS: update my email address
I'm leaving my job at Fujitsu, this email address will stop working
this week. Update it to one that I will have access to later.

Signed-off-by: Xie Changlong <xiecl.fnst@cn.fujitsu.com>
Message-id: 1492758767-19716-1-git-send-email-xiecl.fnst@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-21 10:36:12 +01:00
Changlong Xie
205f8618be MAINTAINERS: update Wen's email address
So he can get CC'ed on future patches and bugs for this feature

Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Message-id: 1492484893-23435-1-git-send-email-xiecl.fnst@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-21 10:36:12 +01:00
Lidong Chen
3928d50bf1 migration/block: use blk_pwrite_zeroes for each zero cluster
BLOCK_SIZE is (1 << 20), qcow2 cluster size is 65536 by default,
this may cause the qcow2 file size to be bigger after migration.
This patch checks each cluster, using blk_pwrite_zeroes for each
zero cluster.

[Initialize cluster_size to BLOCK_SIZE to prevent a gcc uninitialized
variable compiler warning.  In reality we always initialize cluster_size
in a conditional but gcc doesn't know that.
--Stefan]

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Lidong Chen <lidongchen@tencent.com>
Message-id: 1492050868-16200-1-git-send-email-lidongchen@tencent.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-21 10:36:12 +01:00
Stefan Hajnoczi
d72915c60b throttle: make throttle_config(throttle_get_config()) symmetric
Throttling has a weird property that throttle_get_config() does not
always return the same throttling settings that were given with
throttle_config().  In other words, the set and get functions aren't
symmetric.

If .max is 0 then the throttling code assigns a default value of .avg /
10 in throttle_config().  This is an implementation detail of the
throttling algorithm.  When throttle_get_config() is called the .max
value returned should still be 0.

Users are exposed to this quirk via "info block" or "query-block"
monitor commands.  This has caused confusion because it looks like a bug
when an unexpected value is reported.

This patch hides the .max value adjustment in throttle_get_config() and
updates test-throttle.c appropriately.

Reported-by: Nini Gu <ngu@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20170301115026.22621-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-21 10:36:12 +01:00
Stefan Hajnoczi
ab08aec45f throttle: do not use invalid config in test
The (burst) max parameter cannot be smaller than the avg parameter.
There is a test case that uses avg = 56, max = 1 and gets away with it
because no input validation is performed by the test case.

This patch switches to valid test input parameters.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20170301115026.22621-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-21 10:36:12 +01:00
Stefan Hajnoczi
01f9cfab8b qemu-options: explain disk I/O throttling options
The disk I/O throttling options have been listed for a long time but
never explained on the QEMU man page.

Suggested-by: Nini Gu <ngu@redhat.com>
Cc: Alberto Garcia <berto@igalia.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-id: 20170301115026.22621-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-04-21 10:36:12 +01:00
Peter Maydell
7cd37925a1 Merge remote-tracking branch 'remotes/ehabkost/tags/machine-pull-request' into staging
Machine queue for 2.10

# gpg: Signature made Thu 20 Apr 2017 19:44:27 BST
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/machine-pull-request:
  qdev: Constify local variable returned by blk_bs
  qdev: Constify value passed to qdev_prop_set_macaddr
  hostmem: use host_memory_backend_mr_inited() where proper
  hostmem: introduce host_memory_backend_mr_inited()
  hw/core/null-machine: Print error message when using the -kernel parameter
  qdev: Make "hotplugged" property read-only
  intel_iommu: enable remote IOTLB
  intel_iommu: allow dynamic switch of IOMMU region
  intel_iommu: provide its own replay() callback
  intel_iommu: use the correct memory region for device IOTLB notification
  memory: add MemoryRegionIOMMUOps.replay() callback
  memory: introduce memory_region_notify_one()
  memory: provide iommu_replay_all()
  memory: provide IOMMU_NOTIFIER_FOREACH macro
  memory: add section range info for IOMMU notifier

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-21 10:23:56 +01:00
Mark Cave-Ayland
7497638642 tcx: switch to load_image_mr() and remove prom_addr hack
Previous to the existence of load_image_mr(), the only way to load in the
FCode ROM image was to pass in its physical address via qdev properties
and use load_image_targphys().

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 09:02:04 +01:00
Mark Cave-Ayland
973945804d tcx: use tcx_set_dirty() for accelerated ops
Rather than calling memory_region_set_dirty() directly, make sure that we call
tcx_set_dirty() instead. This ensures that the 24-bit plane and cplane are
also invalidated correctly.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 09:02:04 +01:00
Mark Cave-Ayland
ee72bed08c tcx: remove primitives for non-32-bit surfaces
As all surfaces in QEMU are now either shared or 32-bit ARGB regardless of
the guest depth, remove all non-32-bit primitives from tcx_update_display()
and consequence their implementation which are no longer required.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 09:02:04 +01:00
Mark Cave-Ayland
d18e101225 tcx: remove TARGET_PAGE_SIZE from tcx24_update_display()
Now that page alignment is handled by the memory API, there is no need to
duplicate the code 4 times (4 * 1024 == 4096 == TARGET_PAGE_SIZE).

Finally we have now removed all traces of TARGET_PAGE_SIZE.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 09:02:04 +01:00
Mark Cave-Ayland
0a97c6c4f9 tcx: remove TARGET_PAGE_SIZE from tcx_update_display()
Now that page alignment is handled by the memory API, there is no need to
duplicate the code 4 times (4 * 1024 == 4096 == TARGET_PAGE_SIZE).

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 09:02:04 +01:00
Mark Cave-Ayland
66dcabea47 tcx: remove page24 and cpage from tcx24_update_display()
Since all of the tcx_*_dirty() functions now calculate the 24-bit and
cplane offsets themselves from the base address, these variables are no
longer needed.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 09:02:04 +01:00
Mark Cave-Ayland
36180430ac tcx: alter tcx24_reset_dirty() to accept address and length parameters
This can now be used by both the 8-bit and 24-bit display code, so rename
to tcx_check_dirty().

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 09:02:04 +01:00
Mark Cave-Ayland
427ee02bc9 tcx: alter tcx24_check_dirty() to accept address and length parameters
This can now be used by both the 8-bit and 24-bit display code, so rename
to tcx_check_dirty().

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 09:02:04 +01:00
Mark Cave-Ayland
4b865c2809 tcx: ensure tcx_set_dirty() also invalidates the 24-bit plane and cplane
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 09:02:04 +01:00
Mark Cave-Ayland
9800b3c20e tcx: alter tcx_set_dirty() to accept address and length parameters
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 09:02:04 +01:00
Mark Cave-Ayland
8c95e1f20c cg3: switch to load_image_mr() and remove prom-addr hack
Previous to the existence of load_image_mr(), the only way to load in the
FCode ROM image was to pass in its physical address via qdev properties
and use load_image_targphys().

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-04-21 09:01:49 +01:00
Eric Blake
cb55c19a26 s390x: Drop useless casts
An upcoming Coccinelle cleanup script wanted to reformat the casts
present in this file - but on closer look, we don't need the casts
at all because C automatically converts void* to any other pointer.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170405194741.18956-4-eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Fei Li
dde522bbc5 s390x: register I/O adapters per ISC during init
The I/O adapters should exist as soon as the bus/infrastructure
exists, and not only when the guest is actually trying to do something
with them. While the lazy allocation was not wrong, allocating at init
time is cleaner, both for the architecture and the code. Let's adjust
this by having each device type (currently for PCI and virtio-ccw)
register the adapters for each ISC (as now we don't know which ISC the
guest will use) as soon as it initializes.

Use a two-dimensional array io_adapters[type][isc] to store adapters
in ChannelSubSys, so that we can conveniently get the adapter id by
the helper function css_get_adapter_id(type, isc).

Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Fei Li
bc66d6cbca s390x/flic: cache flic in s390_get_flic
s390_get_flic() is called many times to obtain the flic. This wastes a
lot of time as it calls object_resolve_path() every time. Let's cache
S390FLICState by defining it as static.

Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Fei Li
c572d3f313 s390x: initialize flic before I/O subsystems
Let's have a flic before we move on to initialize more specific
subsystems that make use of it.

Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Fei Li
5b00bef270 s390x: use enum for adapter type and standardize its naming
Let's use an enum for io adapter type, and standardize its naming to
CSS_IO_ADAPTER_* by changing S390_PCIPT_ADAPTER to CSS_IO_ADAPTER_PCI.

Signed-off-by: Fei Li <sherrylf@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Dong Jia Shi
2a78ac660f s390x/css: consolidate the devno property for ccw devices
'devno' should rather be a property of the ccw device, instead of a
property of a specific virtio-ccw device. Let's consolidate it.

While we are at here, also rename CcwDevice.bus_id to CcwDevice.devno to
make things clearer.

Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Dong Jia Shi
d8d98db5f0 s390x/css: provide introspection for virtual subchannel and device busid
Expose the busids of the virtual I/O subchannel and the virtual CCW
device to ease debugging. This is needed because:
1. subchannel id are assigned dynamically, and cannot be set from
   outside.
2. device busid could possibly be auto generated.

An example of using HMP to retrieve the property values of a
virtio-balloon-ccw device looks like:

[root@localhost ~]# lscss -d 0.0.0004
Device   Subchan.  DevType CU Type Use  PIM PAM POM  CHPIDs
----------------------------------------------------------------------
0.0.0004 0.0.0003  0000/00 3832/05 yes  80  80  ff   00000000 00000000

(qemu) info qtree
... ...
      dev: virtio-balloon-ccw, id "balloon0"
        devno = "<unset>"
        ioeventfd = true
        max_revision = 2 (0x2)
        dev_id = "fe.0.0004"
        subch_id = "fe.0.0003"
... ...

After migration, if we have the same device that shows up on a
different subchannel, we must re-fill the subch_id of the ccw
device with the new schid, or the subch_id will have an old wrong
schid value. So this also re-fills the subch_id after migration.

While we are at it, also neaten the related error handling a bit.

Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Dong Jia Shi
c35fc6aa18 s390x/css: introduce read-only property type for device ids
Let's introduce a read-only property type that handles device ids of the
CssDevId type used for channel devices for future use. e.g. exposing the
busid of an I/O subchannel that is assigned to a ccw device.

Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Danil Antonov
229913f0ef s390x/pci: make printf always compile in debug output
Wrapped printf calls inside debug macros (DPRINTF) in `if` statement.
This will ensure that printf function will always compile even if debug
output is turned off and, in turn, will prevent bitrot of the format
strings.

Signed-off-by: Danil Antonov <g.danil.anto@gmail.com>
Message-Id: <CA+KKJYBi31Bs7DtVdzZdwG2t+u5+FGiAhQpd3pqJzUX1O8Cprg@mail.gmail.com>
[CH: remove now misleading comments]
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Danil Antonov
08564ecd4d s390x/kvm: make printf always compile in debug output
Wrapped printf calls inside debug macros (DPRINTF) in `if` statement.
This will ensure that printf function will always compile even if debug
output is turned off and, in turn, will prevent bitrot of the format
strings.

Signed-off-by: Danil Antonov <g.danil.anto@gmail.com>
Message-Id: <CA+KKJYAhsuTodm3s2rK65hR=-Xi5+Z7Q+M2nJYZQf2wa44HfOg@mail.gmail.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Cornelia Huck
10890873ca s390x: introduce 2.10 compat machine
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-04-21 09:32:09 +02:00
Mark Cave-Ayland
be4221d993 cg3: fix up size parameter for memory_region_get_dirty()
The code was incorrectly calculating the end address rather than the size of
the required region.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 08:31:30 +01:00
Mark Cave-Ayland
66e2f304a3 cg3: remove TARGET_PAGE_SIZE rounding on dirty page detection
This was an artifact from very early versions of the code from before the
memory API and is no longer needed.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-21 08:31:15 +01:00
Laurent Vivier
08f00df4f4 qdev: remove cannot_destroy_with_object_finalize_yet
As all users have been removed, we can remove
cannot_destroy_with_object_finalize_yet field
from the DeviceClass structure.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170414083717.13641-5-lvivier@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-21 07:18:34 +02:00
Laurent Vivier
d28fca153b versatile: remove cannot_destroy_with_object_finalize_yet
cannot_destroy_with_object_finalize_yet was added by 4c315c2
("qdev: Protect device-list-properties against broken devices")
because "realview_pci" and "versatile_pci" were hanging
during "device-list-properties" cleanup (an infinite loop in
bus_unparent()).

We have this problem because the child is not removed from
the list of the PCI bus children because it has no defined parent:
qdev_set_parent_bus() set the device parent_bus pointer to bus, and
adds the device in the bus children list, but doesn't update the
device parent pointer.

To fix the problem, move all the involved parts to the realize function.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170414083717.13641-4-lvivier@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
[Commit message tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-21 07:18:34 +02:00
Laurent Vivier
40fda982f2 ppc: remove cannot_destroy_with_object_finalize_yet
This removes the assert(kvm_enabled()) from kvmppc_host_cpu_initfn()

This assert can never be triggered as the function is only registered
when KVM is available (see also 4c315c2
"qdev: Protect device-list-properties against broken devices").

So we can remove the cannot_destroy_with_object_finalize_yet from
kvmppc_host_cpu_class_init() without fear and beyond reproach.
(as it has already be done for i386 with 771a13e "i386: Unset
cannot_destroy_with_object_finalize_yet on "host" model" and
e435601 "target-i386: Remove assert(kvm_enabled()) from
host_x86_cpu_initfn()")

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170414083717.13641-3-lvivier@redhat.com>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-21 07:18:23 +02:00
Krzysztof Kozlowski
be9721f400 qdev: Constify local variable returned by blk_bs
Inside qdev_prop_set_drive() the value returned by blk_bs() is passed
only as pointer to const to bdrv_get_node_name() and pointed values is
not modified in other places so this can be made const for code
safeness.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Message-Id: <20170310200550.13313-3-krzk@kernel.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Krzysztof Kozlowski
606fd0e206 qdev: Constify value passed to qdev_prop_set_macaddr
The 'value' argument is not modified so this can be made const for code
safeness.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Message-Id: <20170310200550.13313-2-krzk@kernel.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
6f4c60e47f hostmem: use host_memory_backend_mr_inited() where proper
Use the new interface to boost readability.

Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1489151370-15453-3-git-send-email-peterx@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
4728b57410 hostmem: introduce host_memory_backend_mr_inited()
We were checking this against memory region size of host memory
backend's mr field to see whether the mr has been inited. This is
efficient but less elegant. Let's make a helper for it to avoid
confusions, along with some notes.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1489151370-15453-2-git-send-email-peterx@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Thomas Huth
991db24774 hw/core/null-machine: Print error message when using the -kernel parameter
If the user currently tries to use the -kernel parameter, simply nothing
happens, and the user might get confused that there is nothing loaded
to memory, but also no error message has been issued. Since there is no
real generic way to load a kernel on all CPU types (but on some targets,
the generic loader can be used instead), issue an appropriate error
message here now to avoid the possible confusion.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1488271971-12624-1-git-send-email-thuth@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Eduardo Habkost
36cccb8c57 qdev: Make "hotplugged" property read-only
The "hotplugged" property is user visible, but it was never meant
to be set by the user. There are probably multiple ways to break
or crash device code by overriding the property. For example, we
recently fixed a crash in rtc_set_memory() related to the
property (commit 26ef65beab).

There has been some discussion about making management software
use "hotplugged=on" on migration, to indicate devices that were
hotplugged in the migration source. There were other suggestions
to address this, like including the "hotplugged" field in the
migration stream instead of requiring it to be set explicitly.

Whatever solution we choose in the future, this patch disables
setting "hotplugged" explicitly in the command-line by now,
because the ability to set the property is unused, untested, and
undocumented.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170222192647.19690-1-ehabkost@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
dd4d607e40 intel_iommu: enable remote IOTLB
This patch is based on Aviv Ben-David (<bd.aviv@gmail.com>)'s patch
upstream:

  "IOMMU: enable intel_iommu map and unmap notifiers"
  https://lists.gnu.org/archive/html/qemu-devel/2016-11/msg01453.html

However I removed/fixed some content, and added my own codes.

Instead of translate() every page for iotlb invalidations (which is
slower), we walk the pages when needed and notify in a hook function.

This patch enables vfio devices for VT-d emulation.

And, since we already have vhost DMAR support via device-iotlb, a
natural benefit that this patch brings is that vt-d enabled vhost can
live even without ATS capability now. Though more tests are needed.

Signed-off-by: Aviv Ben-David <bdaviv@cs.technion.ac.il>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-10-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
558e0024a4 intel_iommu: allow dynamic switch of IOMMU region
This is preparation work to finally enabled dynamic switching ON/OFF for
VT-d protection. The old VT-d codes is using static IOMMU address space,
and that won't satisfy vfio-pci device listeners.

Let me explain.

vfio-pci devices depend on the memory region listener and IOMMU replay
mechanism to make sure the device mapping is coherent with the guest
even if there are domain switches. And there are two kinds of domain
switches:

  (1) switch from domain A -> B
  (2) switch from domain A -> no domain (e.g., turn DMAR off)

Case (1) is handled by the context entry invalidation handling by the
VT-d replay logic. What the replay function should do here is to replay
the existing page mappings in domain B.

However for case (2), we don't want to replay any domain mappings - we
just need the default GPA->HPA mappings (the address_space_memory
mapping). And this patch helps on case (2) to build up the mapping
automatically by leveraging the vfio-pci memory listeners.

Another important thing that this patch does is to seperate
IR (Interrupt Remapping) from DMAR (DMA Remapping). IR region should not
depend on the DMAR region (like before this patch). It should be a
standalone region, and it should be able to be activated without
DMAR (which is a common behavior of Linux kernel - by default it enables
IR while disabled DMAR).

Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-9-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
f06a696dc9 intel_iommu: provide its own replay() callback
The default replay() don't work for VT-d since vt-d will have a huge
default memory region which covers address range 0-(2^64-1). This will
normally consumes a lot of time (which looks like a dead loop).

The solution is simple - we don't walk over all the regions. Instead, we
jump over the regions when we found that the page directories are empty.
It'll greatly reduce the time to walk the whole region.

To achieve this, we provided a page walk helper to do that, invoking
corresponding hook function when we found an page we are interested in.
vtd_page_walk_level() is the core logic for the page walking. It's
interface is designed to suite further use case, e.g., to invalidate a
range of addresses.

Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-8-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Jason Wang
10315b9b28 intel_iommu: use the correct memory region for device IOTLB notification
We have a specific memory region for DMAR now, so it's wrong to
trigger the notifier with the root region.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-7-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
faa362e3cc memory: add MemoryRegionIOMMUOps.replay() callback
Originally we have one memory_region_iommu_replay() function, which is
the default behavior to replay the translations of the whole IOMMU
region. However, on some platform like x86, we may want our own replay
logic for IOMMU regions. This patch adds one more hook for IOMMUOps for
the callback, and it'll override the default if set.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-6-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
bd2bfa4c52 memory: introduce memory_region_notify_one()
Generalizing the notify logic in memory_region_notify_iommu() into a
single function. This can be further used in customized replay()
functions for IOMMUs.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-5-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
de472e4a92 memory: provide iommu_replay_all()
This is an "global" version of existing memory_region_iommu_replay() -
we announce the translations to all the registered notifiers, instead of
a specific one.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-4-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
512fa40867 memory: provide IOMMU_NOTIFIER_FOREACH macro
A new macro is provided to iterate all the IOMMU notifiers hooked
under specific IOMMU memory region.

Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: \"Michael S. Tsirkin\" <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-3-git-send-email-peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Xu
698feb5e13 memory: add section range info for IOMMU notifier
In this patch, IOMMUNotifier.{start|end} are introduced to store section
information for a specific notifier. When notification occurs, we not
only check the notification type (MAP|UNMAP), but also check whether the
notified iova range overlaps with the range of specific IOMMU notifier,
and skip those notifiers if not in the listened range.

When removing an region, we need to make sure we removed the correct
VFIOGuestIOMMU by checking the IOMMUNotifier.start address as well.

This patch is solving the problem that vfio-pci devices receive
duplicated UNMAP notification on x86 platform when vIOMMU is there. The
issue is that x86 IOMMU has a (0, 2^64-1) IOMMU region, which is
splitted by the (0xfee00000, 0xfeefffff) IRQ region. AFAIK
this (splitted IOMMU region) is only happening on x86.

This patch also helps vhost to leverage the new interface as well, so
that vhost won't get duplicated cache flushes. In that sense, it's an
slight performance improvement.

Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1491562755-23867-2-git-send-email-peterx@redhat.com>
[ehabkost: included extra vhost_iommu_region_del() change from Peter Xu]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-20 15:22:41 -03:00
Peter Maydell
fa54abb8c2 Drop QEMU_GNUC_PREREQ() checks for gcc older than 4.1
We already require gcc 4.1 or newer (for the atomic
support), so the fallback codepaths for older gcc
versions than that are now dead code and we can
just delete them.

NB: clang reports itself as gcc 4.2 (regardless of
clang version), so clang won't be using the fallbacks
either.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2017-04-20 18:33:33 +01:00
Peter Maydell
da92ada855 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20170420' into staging
target-arm queue:
 * implement M profile exception return properly
 * cadence GEM: fix multiqueue handling bugs
 * pxa2xx.c: QOMify a device
 * arm/kvm: Remove trailing newlines from error_report()
 * stellaris: Don't hw_error() on bad register accesses
 * Add assertion about FSC format for syndrome registers
 * Move excnames[] array into arm_log_exceptions()
 * exynos: minor code cleanups
 * hw/arm/boot: take Linux/arm64 TEXT_OFFSET header field into account
 * Fix APSR writes via M profile MSR

# gpg: Signature made Thu 20 Apr 2017 17:39:35 BST
# gpg:                using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20170420: (24 commits)
  arm: Remove workarounds for old M-profile exception return implementation
  arm: Implement M profile exception return properly
  arm: Track M profile handler mode state in TB flags
  arm: Abstract out "are we singlestepping" test to utility function
  arm: Move condition-failed codepath generation out of if()
  arm: Move gen_set_condexec() and gen_set_pc_im() up in the file
  arm: Factor out "generate right kind of step exception"
  arm: Thumb shift operations should not permit interworking branches
  arm: Don't implement BXJ on M-profile CPUs
  xlnx-zynqmp: Set the Cadence GEM revision
  cadence_gem: Make the revision a property
  cadence_gem: Correct the interupt logic
  cadence_gem: Correct the multi-queue can rx logic
  cadence_gem: Read the correct queue descriptor
  hw/arm: Qomify pxa2xx.c
  arm/kvm: Remove trailing newlines from error_report()
  stellaris: Don't hw_error() on bad register accesses
  target/arm: Add assertion about FSC format for syndrome registers
  arm: Move excnames[] array into arm_log_exceptions()
  target/arm: Add missing entries to excnames[] for log strings
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:41:34 +01:00
Peter Maydell
f4e8e4edda arm: Remove workarounds for old M-profile exception return implementation
Now that we've rewritten M-profile exception return so that the magic
PC values are not visible to other parts of QEMU, we can delete the
special casing of them elsewhere.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1491844419-12485-10-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
3bb8a96f53 arm: Implement M profile exception return properly
On M profile, return from exceptions happen when code in Handler mode
executes one of the following function call return instructions:
 * POP or LDM which loads the PC
 * LDR to PC
 * BX register
and the new PC value is 0xFFxxxxxx.

QEMU tries to implement this by not treating the instruction
specially but then catching the attempt to execute from the magic
address value.  This is not ideal, because:
 * there are guest visible differences from the architecturally
   specified behaviour (for instance jumping to 0xFFxxxxxx via a
   different instruction should not cause an exception return but it
   will in the QEMU implementation)
 * we have to account for it in various places (like refusing to take
   an interrupt if the PC is at a magic value, and making sure that
   the MPU doesn't deny execution at the magic value addresses)

Drop these hacks, and instead implement exception return the way the
architecture specifies -- by having the relevant instructions check
for the magic value and raise the 'do an exception return' QEMU
internal exception immediately.

The effect on the generated code is minor:

 bx lr, old code (and new code for Thread mode):
  TCG:
   mov_i32 tmp5,r14
   movi_i32 tmp6,$0xfffffffffffffffe
   and_i32 pc,tmp5,tmp6
   movi_i32 tmp6,$0x1
   and_i32 tmp5,tmp5,tmp6
   st_i32 tmp5,env,$0x218
   exit_tb $0x0
   set_label $L0
   exit_tb $0x7f2aabd61993
  x86_64 generated code:
   0x7f2aabe87019:  mov    %ebx,%ebp
   0x7f2aabe8701b:  and    $0xfffffffffffffffe,%ebp
   0x7f2aabe8701e:  mov    %ebp,0x3c(%r14)
   0x7f2aabe87022:  and    $0x1,%ebx
   0x7f2aabe87025:  mov    %ebx,0x218(%r14)
   0x7f2aabe8702c:  xor    %eax,%eax
   0x7f2aabe8702e:  jmpq   0x7f2aabe7c016

 bx lr, new code when in Handler mode:
  TCG:
   mov_i32 tmp5,r14
   movi_i32 tmp6,$0xfffffffffffffffe
   and_i32 pc,tmp5,tmp6
   movi_i32 tmp6,$0x1
   and_i32 tmp5,tmp5,tmp6
   st_i32 tmp5,env,$0x218
   movi_i32 tmp5,$0xffffffffff000000
   brcond_i32 pc,tmp5,geu,$L1
   exit_tb $0x0
   set_label $L1
   movi_i32 tmp5,$0x8
   call exception_internal,$0x0,$0,env,tmp5
  x86_64 generated code:
   0x7fe8fa1264e3:  mov    %ebp,%ebx
   0x7fe8fa1264e5:  and    $0xfffffffffffffffe,%ebx
   0x7fe8fa1264e8:  mov    %ebx,0x3c(%r14)
   0x7fe8fa1264ec:  and    $0x1,%ebp
   0x7fe8fa1264ef:  mov    %ebp,0x218(%r14)
   0x7fe8fa1264f6:  cmp    $0xff000000,%ebx
   0x7fe8fa1264fc:  jae    0x7fe8fa126509
   0x7fe8fa126502:  xor    %eax,%eax
   0x7fe8fa126504:  jmpq   0x7fe8fa122016
   0x7fe8fa126509:  mov    %r14,%rdi
   0x7fe8fa12650c:  mov    $0x8,%esi
   0x7fe8fa126511:  mov    $0x56095dbeccf5,%r10
   0x7fe8fa12651b:  callq  *%r10

which is a difference of one cmp/branch-not-taken. This will
be lost in the noise of having to exit generated code and
look up the next TB anyway.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1491844419-12485-9-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
064c379c99 arm: Track M profile handler mode state in TB flags
For M profile exception-return handling we'd like to generate different
code for some instructions depending on whether we are in Handler
mode or Thread mode. This isn't the same as "are we privileged
or user", so we need an extra bit in the TB flags to distinguish.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1491844419-12485-8-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
b636649f5a arm: Abstract out "are we singlestepping" test to utility function
We now test for "are we singlestepping" in several places and
it's not a trivial check because we need to care about both
architectural singlestep and QEMU gdbstub singlestep. We're
also about to add another place that needs to make this check,
so pull the condition out into a function.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1491844419-12485-7-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
f021b2c462 arm: Move condition-failed codepath generation out of if()
Move the code to generate the "condition failed" instruction
codepath out of the if (singlestepping) {} else {}. This
will allow adding support for handling a new is_jmp type
which can't be neatly split into "singlestepping case"
versus "not singlestepping case".

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1491844419-12485-6-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
4d5e8c969a arm: Move gen_set_condexec() and gen_set_pc_im() up in the file
Move the utility routines gen_set_condexec() and gen_set_pc_im()
up in the file, as we will want to use them from a function
placed earlier in the file than their current location.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1491844419-12485-5-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
5425415ebb arm: Factor out "generate right kind of step exception"
We currently have two places that do:
            if (dc->ss_active) {
                gen_step_complete_exception(dc);
            } else {
                gen_exception_internal(EXCP_DEBUG);
            }

Factor this out into its own function, as we're about to add
a third place that needs the same logic.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1491844419-12485-4-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
bedb8a6b09 arm: Thumb shift operations should not permit interworking branches
In Thumb mode, the only instructions which can cause an interworking
branch by writing the PC are BLX, BX, BXJ, LDR, POP and LDM. Unlike
ARM mode, data processing instructions which target the PC do not
cause interworking branches.

When we added support for doing interworking branches on writes to
PC from data processing instructions in commit 21aeb3430c, we
accidentally changed a Thumb instruction to have interworking
branch behaviour for writes to PC. (MOV, MOVS register-shifted
register, encoding T2; this is the standard encoding for
LSL/LSR/ASR/ROR (register).)

For this encoding, behaviour with Rd == R15 is specified as
UNPREDICTABLE, so allowing an interworking branch is within
spec, but it's confusing and differs from our handling of this
class of UNPREDICTABLE for other Thumb ALU operations. Make
it perform a simple (non-interworking) branch like the others.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1491844419-12485-3-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
9d7c59c84d arm: Don't implement BXJ on M-profile CPUs
For M-profile CPUs, the BXJ instruction does not exist at all, and
the encoding should always UNDEF. We were accidentally implementing
it to behave like A-profile BXJ; correct the error.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1491844419-12485-2-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Alistair Francis
20bff21307 xlnx-zynqmp: Set the Cadence GEM revision
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 026dbe01a1d42619eee30ce3f2079741bf04bc73.1491947224.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Alistair Francis
a5517666b2 cadence_gem: Make the revision a property
Expose the Cadence GEM revision as a property.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 541324373cf87b50f8be0439a0cb89f5028b016f.1491947224.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Alistair Francis
596b6f51b7 cadence_gem: Correct the interupt logic
This patch fixes two mistakes in the interrupt logic.

First we only trigger single-queue or multi-queue interrupts if the status
register is set. This logic was already used for non multi-queue interrupts
but it also applies to multi-queue interrupts.

Secondly we need to lower the interrupts if the ISR isn't set. As part
of this we can remove the other interrupt lowering logic and consolidate
it inside gem_update_int_status().

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 438bcc014f8f8a2f8f68f322cb6a53f4c04688c2.1491947224.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Alistair Francis
dacc0566ac cadence_gem: Correct the multi-queue can rx logic
Correct the buffer descriptor busy logic to work correctly when using
multiple queues.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 8a7e8059984e27d46a276a66299d035a0afd280f.1491947224.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Alistair Francis
75b7760212 cadence_gem: Read the correct queue descriptor
Read the correct descriptor instead of hardcoding the first (q=0).

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 988b183dcf951856d8b3379f7e911ec95233bbf4.1491947224.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Suramya Shah
0493a139c9 hw/arm: Qomify pxa2xx.c
Signed-off-by: Suramya Shah <shah.suramya@gmail.com>
Message-id: 20170415180316.2694-1-shah.suramya@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Ishani Chugh
dffc58519e arm/kvm: Remove trailing newlines from error_report()
Signed-off-by: Ishani Chugh <chugh.ishani@research.iiit.ac.in>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1491629987-6826-1-git-send-email-chugh.ishani@research.iiit.ac.in
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Peter Maydell
df3692e04b stellaris: Don't hw_error() on bad register accesses
Current recommended style is to log a guest error on bad register
accesses, not kill the whole system with hw_error().  Change the
hw_error() calls to log as LOG_GUEST_ERROR or LOG_UNIMP or use
g_assert_not_reached() as appropriate.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1491486314-25823-1-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
65ed2ed90d target/arm: Add assertion about FSC format for syndrome registers
In tlb_fill() we construct a syndrome register value from a
fault status register value which is filled in by arm_tlb_fill().
arm_tlb_fill() returns FSR values which might be in the format
used with short-format page descriptors, or the format used
with long-format (LPAE) descriptors. The syndrome register
always uses LPAE-format FSR status codes.

It isn't actually possible to end up delivering a syndrome
register value to the guest for a fault which is reported
with a short-format FSR (that kind of stage 1 fault will only
happen for an AArch32 translation regime which doesn't have
a syndrome register, and can never be redirected to an AArch64
or Hyp exception level). Add an assertion which checks this,
and adjust the code so that we construct a syndrome with
an invalid status code, rather than allowing set bits in
the FSR input to randomly corrupt other fields in the syndrome.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1491486152-24304-1-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
2c4a7cc5af arm: Move excnames[] array into arm_log_exceptions()
The excnames[] array is defined in internals.h because we used
to use it from two different source files for handling logging
of AArch32 and AArch64 exception entry. Refactoring means that
it's now used only in arm_log_exception() in helper.c, so move
the array into that function.

Suggested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1491821097-5647-1-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Peter Maydell
32b81e620e target/arm: Add missing entries to excnames[] for log strings
Recent changes have added new EXCP_ values to ARM but forgot
to update the excnames[] array which is used to provide
human-readable strings when printing information about the
exception for debug logging. Add the missing entries, and
add a comment to the list of #defines to help avoid the mistake
being repeated in future.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1491486340-25988-1-git-send-email-peter.maydell@linaro.org
2017-04-20 17:39:17 +01:00
Krzysztof Kozlowski
885f271056 hw/misc/exynos4210_pmu: Reorder local variables for readability
Short declaration of 'i' was in the middle of declarations with
assignments.  Make it a little bit more readable.  Additionally switch
from "unsigned" to "unsigned int" as this pattern is more widely used.
No functional change.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170313184750.429-4-krzk@kernel.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Krzysztof Kozlowski
75c6d92e4c hw/char/exynos4210_uart: Constify static array and few arguments
The static array exynos4210_uart_regs with register values is not
modified so it can be made const.

Few other functions accept driver or uart state as an argument but they
do not change it and do not cast it so this can be made const for code
safeness.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Message-id: 20170313184750.429-3-krzk@kernel.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Krzysztof Kozlowski
f2ad5140fa hw/arm/exynos: Convert fprintf to qemu_log_mask/error_report
qemu_log_mask() and error_report() are preferred over fprintf() for
logging errors.  Also remove square brackets [] and additional new line
characters in printed messages.

Signed-off-by: Krzysztof Kozlowski <krzk@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170313184750.429-2-krzk@kernel.org
[PMM: wrapped long line]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Ard Biesheuvel
68115ed5fc hw/arm/boot: take Linux/arm64 TEXT_OFFSET header field into account
The arm64 boot protocol stipulates that the kernel must be loaded
TEXT_OFFSET bytes beyond a 2 MB aligned base address, where TEXT_OFFSET
could be any 4 KB multiple between 0 and 2 MB, and whose value can be
found in the header of the Image file.

So after attempts to load the arm64 kernel image as an ELF file or as a
U-Boot image have failed (both of which have their own way of specifying
the load offset), try to determine the TEXT_OFFSET from the image after
loading it but before mapping it as a ROM mapping into the guest address
space.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1489414630-21609-1-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 17:39:17 +01:00
Laurent Vivier
53d6e53189 arm: remove remaining cannot_destroy_with_object_finalize_yet
With commit ce5b1bbf62 ("exec: move cpu_exec_init() calls to
realize functions"), we can now remove all the
remaining cannot_destroy_with_object_finalize_yet as
unsafe references have been moved to cpu_exec_realizefn().
(tested with QOM command provided by commit 4c315c27).

Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170414083717.13641-2-lvivier@redhat.com>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-04-20 17:51:32 +02:00
Peter Maydell
64c8ed97cc Open 2.10 development tree
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 15:42:31 +01:00
Peter Maydell
359c41abe3 Update version for v2.9.0 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-20 15:31:34 +01:00
Peter Maydell
ca55019dac Update version for v2.9.0-rc5 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-18 17:13:50 +01:00
Peter Maydell
d1263f8f18 Merge remote-tracking branch 'remotes/famz/tags/block-pull-request' into staging
# gpg: Signature made Tue 18 Apr 2017 15:58:32 BST
# gpg:                using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021  AD56 CA35 624C 6A91 71C6

* remotes/famz/tags/block-pull-request:
  block: Drain BH in bdrv_drained_begin
  block: Walk bs->children carefully in bdrv_drain_recurse

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-18 16:18:15 +01:00
Fam Zheng
91af091f92 block: Drain BH in bdrv_drained_begin
During block job completion, nothing is preventing
block_job_defer_to_main_loop_bh from being called in a nested
aio_poll(), which is a trouble, such as in this code path:

    qmp_block_commit
      commit_active_start
        bdrv_reopen
          bdrv_reopen_multiple
            bdrv_reopen_prepare
              bdrv_flush
                aio_poll
                  aio_bh_poll
                    aio_bh_call
                      block_job_defer_to_main_loop_bh
                        stream_complete
                          bdrv_reopen

block_job_defer_to_main_loop_bh is the last step of the stream job,
which should have been "paused" by the bdrv_drained_begin/end in
bdrv_reopen_multiple, but it is not done because it's in the form of a
main loop BH.

Similar to why block jobs should be paused between drained_begin and
drained_end, BHs they schedule must be excluded as well.  To achieve
this, this patch forces draining the BH in BDRV_POLL_WHILE.

As a side effect this fixes a hang in block_job_detach_aio_context
during system_reset when a block job is ready:

    #0  0x0000555555aa79f3 in bdrv_drain_recurse
    #1  0x0000555555aa825d in bdrv_drained_begin
    #2  0x0000555555aa8449 in bdrv_drain
    #3  0x0000555555a9c356 in blk_drain
    #4  0x0000555555aa3cfd in mirror_drain
    #5  0x0000555555a66e11 in block_job_detach_aio_context
    #6  0x0000555555a62f4d in bdrv_detach_aio_context
    #7  0x0000555555a63116 in bdrv_set_aio_context
    #8  0x0000555555a9d326 in blk_set_aio_context
    #9  0x00005555557e38da in virtio_blk_data_plane_stop
    #10 0x00005555559f9d5f in virtio_bus_stop_ioeventfd
    #11 0x00005555559fa49b in virtio_bus_stop_ioeventfd
    #12 0x00005555559f6a18 in virtio_pci_stop_ioeventfd
    #13 0x00005555559f6a18 in virtio_pci_reset
    #14 0x00005555559139a9 in qdev_reset_one
    #15 0x0000555555916738 in qbus_walk_children
    #16 0x0000555555913318 in qdev_walk_children
    #17 0x0000555555916738 in qbus_walk_children
    #18 0x00005555559168ca in qemu_devices_reset
    #19 0x000055555581fcbb in pc_machine_reset
    #20 0x00005555558a4d96 in qemu_system_reset
    #21 0x000055555577157a in main_loop_should_exit
    #22 0x000055555577157a in main_loop
    #23 0x000055555577157a in main

The rationale is that the loop in block_job_detach_aio_context cannot
make any progress in pausing/completing the job, because bs->in_flight
is 0, so bdrv_drain doesn't process the block_job_defer_to_main_loop
BH. With this patch, it does.

Reported-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170418143044.12187-3-famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Tested-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-04-18 22:56:28 +08:00
Fam Zheng
178bd438af block: Walk bs->children carefully in bdrv_drain_recurse
The recursive bdrv_drain_recurse may run a block job completion BH that
drops nodes. The coming changes will make that more likely and use-after-free
would happen without this patch

Stash the bs pointer and use bdrv_ref/bdrv_unref in addition to
QLIST_FOREACH_SAFE to prevent such a case from happening.

Since bdrv_unref accesses global state that is not protected by the AioContext
lock, we cannot use bdrv_ref/bdrv_unref unconditionally.  Fortunately the
protection is not needed in IOThread because only main loop can modify a graph
with the AioContext lock held.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170418143044.12187-2-famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Tested-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-04-18 22:56:28 +08:00
Greg Kurz
9c6b899f7a 9pfs: local: set the path of the export root to "."
The local backend was recently converted to using "at*()" syscalls in order
to ensure all accesses happen below the shared directory. This requires that
we only pass relative paths, otherwise the dirfd argument to the "at*()"
syscalls is ignored and the path is treated as an absolute path in the host.
This is actually the case for paths in all fids, with the notable exception
of the root fid, whose path is "/". This causes the following backend ops to
act on the "/" directory of the host instead of the virtfs shared directory
when the export root is involved:
- lstat
- chmod
- chown
- utimensat

ie, chmod /9p_mount_point in the guest will be converted to chmod / in the
host for example. This could cause security issues with a privileged QEMU.

All "*at()" syscalls are being passed an open file descriptor. In the case
of the export root, this file descriptor points to the path in the host that
was passed to -fsdev.

The fix is thus as simple as changing the path of the export root fid to be
"." instead of "/".

This is CVE-2017-7471.

Cc: qemu-stable@nongnu.org
Reported-by: Léo Gaspard <leo@gaspard.io>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-18 14:01:43 +01:00
Peter Maydell
372b3fe0b2 Update version for v2.9.0-rc4 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-11 17:18:03 +01:00
Max Reitz
e3e0003a8f block/io: Comment out permission assertions
In case of block migration, there may be writes to BlockBackends that do
not have the write permission taken. Before this issue is fixed (which
is not going to happen in 2.9), we therefore cannot assert that this is
the case.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20170411145050.31290-1-mreitz@redhat.com
Tested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-11 16:09:31 +01:00
Kevin Wolf
5eceb01adf sheepdog: Fix crash in co_read_response()
This fixes a regression introduced in commit 9d456654.

aio_co_wake() can only be used to reenter a coroutine that was already
previously entered, otherwise co->ctx is uninitialised and we access
garbage. Using it immediately after qemu_coroutine_create() like in
co_read_response() is wrong and causes segfaults.

Replace the call with aio_co_enter(), which gets an explicit AioContext
parameter and works even for new coroutines.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1491919733-21065-1-git-send-email-kwolf@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-11 16:08:29 +01:00
Peter Maydell
f5ac5cfeb6 Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2017-04-11' into staging
Block patches for 2.9.0-rc4

# gpg: Signature made Tue 11 Apr 2017 14:40:07 BST
# gpg:                using RSA key 0xF407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* remotes/maxreitz/tags/pull-block-2017-04-11:
  iscsi: Fix iscsi_create
  throttle: Remove block from group on hot-unplug
  block: pass the right options for BlockDriver.bdrv_open()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-11 14:53:33 +01:00
Fam Zheng
2ec9a782d1 iscsi: Fix iscsi_create
Since d5895fcb (iscsi: Split URL into individual options), creating
qcow2 image on an iscsi LUN fails:

    qemu-img create -f qcow2 iscsi://$SERVER/$IQN/0 1G
    qemu-img: iscsi://$SERVER/$IQN/0: Could not create image: Invalid
        argument

The problem is iscsi_open now expects that transport_name, portal and
target are already parsed into structured options by
iscsi_parse_filename, but it is not called in iscsi_create.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 20170410075451.21329-1-famz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
[mreitz: Dropped now superfluous
         qdict_put(bs_options, "filename", ...)]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-11 15:33:00 +02:00
Eric Blake
1606e4cf8a throttle: Remove block from group on hot-unplug
When a block device that is part of a throttle group is hot-unplugged,
we forgot to remove it from the throttle group. This leaves stale
memory around, and causes an easily reproducible crash:

$ ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio \
-device virtio-scsi-pci,bus=pci.0 -drive \
id=drive_image2,if=none,format=raw,file=file2,bps=512000,iops=100,group=foo \
-device scsi-hd,id=image2,drive=drive_image2 -drive \
id=drive_image3,if=none,format=raw,file=file3,bps=512000,iops=100,group=foo \
-device scsi-hd,id=image3,drive=drive_image3
{'execute':'qmp_capabilities'}
{'execute':'device_del','arguments':{'id':'image3'}}
{'execute':'system_reset'}

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1428810

Suggested-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170406190847.29347-1-eblake@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-11 15:33:00 +02:00
Dong Jia Shi
7a9e51198c block: pass the right options for BlockDriver.bdrv_open()
raw_open() expects the caller always passing in the right actual
@options parameter. But when trying to applying snapshot on a RBD
image, bdrv_snapshot_goto() calls raw_open() (by calling the
bdrv_open callback on the BlockDriver) with a NULL @options, and
that will result in a Segmentation fault.

For the other non-raw format drivers, it also makes sense to passing
in the actual options, althought they don't trigger the problem so
far.

Let's prepare a @options by adding the "file" key-value pair to a
copy of the actual options that were given for the node (i.e.
bs->options), and pass it to the callback.

BlockDriver.bdrv_open() expects bs->file to be NULL and just
overwrites it with the result from bdrv_open_child(). That means we
should actually make sure it's NULL because otherwise the child BDS
will have a reference count that is 1 too high. So we unconditionally
invoke bdrv_unref_child() before calling BlockDriver.bdrv_open(), and
we wrap everything in bdrv_ref()/bdrv_unref() so the BDS isn't
deleted in the meantime.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-id: 20170405091909.36357-2-bjsdjshi@linux.vnet.ibm.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-11 15:33:00 +02:00
Peter Maydell
aa388ddc36 Merge remote-tracking branch 'remotes/famz/tags/block-pull-request' into staging
# gpg: Signature made Tue 11 Apr 2017 13:10:55 BST
# gpg:                using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021  AD56 CA35 624C 6A91 71C6

* remotes/famz/tags/block-pull-request:
  sheepdog: Use bdrv_coroutine_enter before BDRV_POLL_WHILE
  block: Fix bdrv_co_flush early return
  block: Use bdrv_coroutine_enter to start I/O coroutines
  qemu-io-cmds: Use bdrv_coroutine_enter
  blockjob: Use bdrv_coroutine_enter to start coroutine
  block: Introduce bdrv_coroutine_enter
  async: Introduce aio_co_enter
  coroutine: Extract qemu_aio_coroutine_enter
  tests/block-job-txn: Don't start block job before adding to txn
  block: Quiesce old aio context during bdrv_set_aio_context
  block: Make bdrv_parent_drained_begin/end public

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-11 13:27:05 +01:00
Fam Zheng
76296dff97 sheepdog: Use bdrv_coroutine_enter before BDRV_POLL_WHILE
When called from main thread, the coroutine should run in the context of
bs. Use bdrv_coroutine_enter to ensure that.

Signed-off-by: Fam Zheng <famz@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
49ca625913 block: Fix bdrv_co_flush early return
bdrv_inc_in_flight and bdrv_dec_in_flight are mandatory for
BDRV_POLL_WHILE to work, even for the shortcut case where flush is
unnecessary. Move the if block to below bdrv_dec_in_flight, and BTW fix
the variable declaration position.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
e92f0e1910 block: Use bdrv_coroutine_enter to start I/O coroutines
BDRV_POLL_WHILE waits for the started I/O by releasing bs's ctx then polling
the main context, which relies on the yielded coroutine continuing on bs->ctx
before notifying qemu_aio_context with bdrv_wakeup().

Thus, using qemu_coroutine_enter to start I/O is wrong because if the coroutine
is entered from main loop, co->ctx will be qemu_aio_context, as a result of the
"release, poll, acquire" loop of BDRV_POLL_WHILE, race conditions happen when
both main thread and the iothread access the same BDS:

  main loop                                iothread
-----------------------------------------------------------------------
  blockdev_snapshot
    aio_context_acquire(bs->ctx)
                                           virtio_scsi_data_plane_handle_cmd
    bdrv_drained_begin(bs->ctx)
    bdrv_flush(bs)
      bdrv_co_flush(bs)                      aio_context_acquire(bs->ctx).enter
        ...
        qemu_coroutine_yield(co)
      BDRV_POLL_WHILE()
        aio_context_release(bs->ctx)
                                             aio_context_acquire(bs->ctx).return
                                               ...
                                                 aio_co_wake(co)
        aio_poll(qemu_aio_context)               ...
          co_schedule_bh_cb()                    ...
            qemu_coroutine_enter(co)             ...

              /* (A) bdrv_co_flush(bs)           /* (B) I/O on bs */
                      continues... */
                                             aio_context_release(bs->ctx)
        aio_context_acquire(bs->ctx)

Note that in above case, bdrv_drained_begin() doesn't do the "release,
poll, acquire" in BDRV_POLL_WHILE, because bs->in_flight == 0.

Fix this by using bdrv_coroutine_enter and enter coroutine in the right
context.

iotests 109 output is updated because the coroutine reenter flow during
mirror job complete is different (now through co_queue_wakeup, instead
of the unconditional qemu_coroutine_switch before), making the end job
len different.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
324ec3e4f2 qemu-io-cmds: Use bdrv_coroutine_enter
qemu_coroutine_create associates @co to qemu_aio_context but we poll
blk's context below. If the coroutine yields, it may never get resumed
again.

Use bdrv_coroutine_enter to make sure we are starting the I/O on the
right context.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
aef4278c5a blockjob: Use bdrv_coroutine_enter to start coroutine
Resuming and especially starting of the block job coroutine, could be issued in
the main thread.  However the coroutine's "home" ctx should be set to the same
context as job->blk. Use bdrv_coroutine_enter to ensure that.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
052a75721f block: Introduce bdrv_coroutine_enter
Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
8865852e00 async: Introduce aio_co_enter
They start the coroutine on the specified context.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
ba9e75ceef coroutine: Extract qemu_aio_coroutine_enter
It's a variant of qemu_coroutine_enter with an explicit AioContext
parameter.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
d90fce9446 tests/block-job-txn: Don't start block job before adding to txn
Previously, before test_block_job_start returns, the job can already
complete, as a result, the transactional state of other jobs added to
the same txn later cannot be handled correctly.

Move the block_job_start() calls to callers after
block_job_txn_add_job() calls.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
aabf591007 block: Quiesce old aio context during bdrv_set_aio_context
The fact that the bs->aio_context is changing can confuse the dataplane
iothread, because of the now fine granularity aio context lock.
bdrv_drain should rather be a bdrv_drained_begin/end pair, but since
bs->aio_context is changing, we can just use aio_disable_external and
bdrv_parent_drained_begin.

Reported-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Fam Zheng
14e9559f46 block: Make bdrv_parent_drained_begin/end public
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-04-11 20:07:15 +08:00
Peter Maydell
17fa24b79c Merge remote-tracking branch 'remotes/kraxel/tags/pull-fixes-20170411-1' into staging
qxl: bugfixes.

# gpg: Signature made Tue 11 Apr 2017 08:00:00 BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-fixes-20170411-1:
  qxl: add migration blocker to avoid pre-save assert
  qxl: switch display on entering VGA

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-11 10:03:51 +01:00
Gerd Hoffmann
86dbcdd9c7 qxl: add migration blocker to avoid pre-save assert
Cc: 1635339@bugs.launchpad.net
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20170410113131.2585-1-kraxel@redhat.com
2017-04-11 08:38:17 +02:00
Peter Maydell
c322fdd41a Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
Fixes a memory leak.

# gpg: Signature made Mon 10 Apr 2017 13:20:39 BST
# gpg:                using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg:                 aka "Greg Kurz <groug@free.fr>"
# gpg:                 aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg:                 aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg:                 aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894  DBA2 02FC 3AEB 0101 DBC2

* remotes/gkurz/tags/for-upstream:
  9pfs: xattr: fix memory leak in v9fs_list_xattr

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-10 16:08:37 +01:00
Peter Maydell
0a49bfa1ab Merge remote-tracking branch 'remotes/stsquad/tags/pull-mttcg-fixups-for-rc2-100417-1' into staging
Final icount and misc MTTCG fixes for 2.9

Minor differences from:
  Message-Id: <20170405132503.32125-1-alex.bennee@linaro.org>

  - dropped new feature patches
  - last minute typo fix from Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>

# gpg: Signature made Mon 10 Apr 2017 11:38:10 BST
# gpg:                using RSA key 0xFBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>"
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8  DF35 FBD0 DB09 5A9E 2A44

* remotes/stsquad/tags/pull-mttcg-fixups-for-rc2-100417-1:
  replay: assert time only goes forward
  cpus: call cpu_update_icount on read
  cpu-exec: update icount after each TB_EXIT
  cpus: introduce cpu_update_icount helper
  cpus: don't credit executed instructions before they have run
  cpus: move icount preparation out of tcg_exec_cpu
  cpus: check cpu->running in cpu_get_icount_raw()
  cpus: remove icount handling from qemu_tcg_cpu_thread_fn
  target/i386/misc_helper: wrap BQL around another IRQ generator
  cpus: fix wrong define name
  scripts/qemugdb/mtree.py: fix up mtree dump

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-10 15:01:15 +01:00
Peter Maydell
ad04d8cb2f configure: on Windows minimum glib version must be 2.30
In the 2.7 release we stated in the ChangeLog that the
minimum glib version for Windows hosts was 2.30, but we
didn't update configure to enforce this because we were
very close to the release at the point where we noticed
the issue, and it only affected building the test suite.
We then forgot that we needed to do it. Fix the omission.

(The reason for the 2.30 requirement is use of
g_dir_make_tmp() -- our fallback implementation uses
mkdtemp(), which isn't available on Windows.)

Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-id: 1491224655-5776-1-git-send-email-peter.maydell@linaro.org
2017-04-10 12:54:35 +01:00
Alex Bennée
982263ce71 replay: assert time only goes forward
If we find ourselves trying to add an event to the log where time has
gone backwards it is because a vCPU event has occurred and the
main-loop is not yet aware of time moving forward. This should not
happen and if it does its better to fail early than generate a log
that will have weird behaviour.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-10 10:23:38 +01:00
Alex Bennée
1d05906b95 cpus: call cpu_update_icount on read
This ensures each time the vCPU thread reads the icount we update the
master timer_state.qemu_icount field. This way as long as updates are
in BQL protected sections (which they should be) the main-loop can
never come to update the log and find time has gone backwards.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-10 10:23:38 +01:00
Alex Bennée
eda5f7c6a1 cpu-exec: update icount after each TB_EXIT
There is no particular reason we shouldn't update the global system
icount time as we exit each TranslationBlock run. This ensures the
main-loop doesn't have to wait until we exit to the outer loop for
executed instructions to be credited to timer_state.

The prepare_icount_for_run function is slightly tweaked to match the
logic we run in cpu_loop_exec_tb.

Based on Paolo's original suggestion.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-10 10:23:38 +01:00
Alex Bennée
512d3c8071 cpus: introduce cpu_update_icount helper
By holding off updates to timer_state.qemu_icount we can run into
trouble when the non-vCPU thread needs to know the time. This helper
ensures we atomically update timers_state.qemu_icount based on what
has been currently executed.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-10 10:23:38 +01:00
Alex Bennée
e4cd96571f cpus: don't credit executed instructions before they have run
Outside of the vCPU thread icount time will only be tracked against
timers_state.qemu_icount. We no longer credit cycles until they have
completed the run. Inside the vCPU thread we adjust for passage of
time by looking at how many have run so far. This is only valid inside
the vCPU thread while it is running.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-10 10:23:38 +01:00
Alex Bennée
0524838225 cpus: move icount preparation out of tcg_exec_cpu
As icount is only supported for single-threaded execution due to the
requirement for determinism let's remove it from the common
tcg_exec_cpu path.

Also remove the additional fiddling which shouldn't be required as the
icount counters should all be rectified as you enter the loop.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-10 10:23:33 +01:00
Alex Bennée
243c5f77f6 cpus: check cpu->running in cpu_get_icount_raw()
The lifetime of current_cpu is now the lifetime of the vCPU thread.
However get_icount_raw() can apply a fudge factor if called while code
is running to take into account the current executed instruction
count.

To ensure this is always the case we also check cpu->running.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-04-10 10:22:28 +01:00
Alex Bennée
bf51c7206f cpus: remove icount handling from qemu_tcg_cpu_thread_fn
We should never be running in multi-threaded mode with icount enabled.
There is no point calling handle_icount_deadline here so remove it and
assert !use_icount.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-04-10 10:14:50 +01:00
Alex Bennée
b4e79a502f target/i386/misc_helper: wrap BQL around another IRQ generator
Anything that calls into HW emulation must be protected by the BQL.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
2017-04-10 10:14:50 +01:00
Nikunj A Dadhania
8695350357 cpus: fix wrong define name
While the configure script generates TARGET_SUPPORTS_MTTCG define, one
of the define is cpus.c is checking wrong name: TARGET_SUPPORT_MTTCG

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-10 10:14:28 +01:00
Li Qiang
4ffcdef427 9pfs: xattr: fix memory leak in v9fs_list_xattr
Free 'orig_value' in error path.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Greg Kurz <groug@kaod.org>
2017-04-10 09:38:05 +02:00
Alex Bennée
8037fa55ac scripts/qemugdb/mtree.py: fix up mtree dump
Since QEMU has been able to build with native Int128 support this was
broken as it attempts to fish values out of the non-existent
structure. Also the alias print was trying to make a %x out of
gdb.ValueType directly which didn't seem to work.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-04-07 15:24:56 +01:00
Peter Maydell
5daf9b3025 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer fixes for 2.9.0-rc4

# gpg: Signature made Fri 07 Apr 2017 13:44:17 BST
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  mirror: Fix aio context of mirror_top_bs
  block: Assert attached child node has right aio context
  block: Fix unpaired aio_disable_external in external snapshot
  block: Don't check permissions for copy on read
  qemu-img: img_create does not support image-opts, fix docs
  iotests: Add mirror tests for orphaned source
  block/mirror: Fix use-after-free
  commit: Set commit_top_bs->total_sectors
  commit: Set commit_top_bs->aio_context
  block: Ignore guest dev permissions during incoming migration

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-07 15:23:48 +01:00
Fam Zheng
19dd29e8a7 mirror: Fix aio context of mirror_top_bs
It should be moved to the same context as source, before inserting to the
graph.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-07 14:44:06 +02:00
Fam Zheng
bb2614e991 block: Assert attached child node has right aio context
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-07 14:44:06 +02:00
Fam Zheng
c26a5ab713 block: Fix unpaired aio_disable_external in external snapshot
bdrv_replace_child_noperm tries to hand over the quiesce_counter state
from old bs to the new one, but if they are not on the same aio context
this causes unbalance.

Fix this by setting the correct aio context before calling
bdrv_append().

Reported-by: Ed Swierk <eswierk@skyportsystems.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-07 14:44:06 +02:00
Kevin Wolf
1bf03e66fd block: Don't check permissions for copy on read
The assertion is currently failing. We can't require callers to have
write permissions when all they are doing is a read, so comment it out.
Add a FIXME comment in the code so that the check is re-enabled when
copy on read is refactored into its own filter driver.

Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
2017-04-07 14:44:06 +02:00
Jeff Cody
89aa0465b9 qemu-img: img_create does not support image-opts, fix docs
The documentation and help for qemu-img claims that 'qemu-img create'
will take the '--image-opts' argument.  This is not true, so this
patch removes those claims.

Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-07 14:44:06 +02:00
Max Reitz
5694923ad1 iotests: Add mirror tests for orphaned source
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-07 14:44:06 +02:00
Max Reitz
7a25fcd056 block/mirror: Fix use-after-free
If @bs does not have any parents, the only reference to @mirror_top_bs
will be held by the BlockJob object after the bdrv_unref() following
block_job_create(). However, if block_job_create() fails, this reference
will not exist and @mirror_top_bs will have been deleted when we
goto fail.

The issue comes back at all later entries to the fail label: We delete
the BlockJob object before rolling back our changes to the node graph.
This means that we will delete @mirror_top_bs in the process.

All in all, whenever @bs does not have any parents and we go down the
fail path we will dereference @mirror_top_bs after it has been deleted.

Fix this by invoking bdrv_unref() only when block_job_create() was
successful and by bdrv_ref()'ing @mirror_top_bs in the fail path before
deleting the BlockJob object. Finally, bdrv_unref() it at the end of the
fail path after we actually no longer need it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-04-07 14:44:05 +02:00
Kevin Wolf
0d0676a104 commit: Set commit_top_bs->total_sectors
Like in the mirror filter driver, we also need to set the image size for
the commit filter driver. This is less likely to be a problem in
practice than for the mirror because we're not at the active layer here,
but attaching new parents to a node in the middle of the chain is
possible, so the size needs to be correct anyway.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2017-04-07 14:44:05 +02:00
Kevin Wolf
02be4aeb93 commit: Set commit_top_bs->aio_context
The filter driver that is inserted by the commit job needs to use the
same AioContext as its parent and child nodes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2017-04-07 14:44:05 +02:00
Kevin Wolf
d35ff5e6b3 block: Ignore guest dev permissions during incoming migration
Usually guest devices don't like other writers to the same image, so
they use blk_set_perm() to prevent this from happening. In the migration
phase before the VM is actually running, though, they don't have a
problem with writes to the image. On the other hand, storage migration
needs to be able to write to the image in this phase, so the restrictive
blk_set_perm() call of qdev devices breaks it.

This patch flags all BlockBackends with a qdev device as
blk->disable_perm during incoming migration, which means that the
requested permissions are stored in the BlockBackend, but not actually
applied to its root node yet.

Once migration has finished and the VM should be resumed, the
permissions are applied. If they cannot be applied (e.g. because the NBD
server used for block migration hasn't been shut down), resuming the VM
fails.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
2017-04-07 14:44:05 +02:00
Marc-André Lureau
a703d3aef5 qxl: switch display on entering VGA
Since commit cd958edb1f, same size console resize is skipped. This
change broke QXL incoming migration in VGA mode,
qemu_spice_display_switch() is no longer called during qxl_post_load(),
because default message surface is of the same size, and during
displaychangelistener registration, PCIQXLDevice.mode is
QXL_MODE_UNDEFINED. This triggers a later crash on refresh:

==2634== Invalid read of size 4
==3516== at 0x65F3050: pixman_image_get_data (in /usr/lib64/libpixman-1.so.0.34.0)
==3516== by 0x6F0CEB: qemu_spice_create_update (spice-display.c:215)
==3516== by 0x6F1CC7: qemu_spice_display_refresh (spice-display.c:502)
==3516== by 0x58CF77: display_refresh (qxl.c:1948)
==3516== by 0x6E8084: do_safe_dpy_refresh (console.c:1591)
==3516== by 0x6E80D5: dpy_refresh (console.c:1604)
==3516== by 0x6E4508: gui_update (console.c:201)
==3516== by 0x81898E: timerlist_run_timers (qemu-timer.c:536)
==3516== by 0x8189D6: qemu_clock_run_timers (qemu-timer.c:547)
==3516== by 0x818D98: qemu_clock_run_all_timers (qemu-timer.c:662)
==3516== by 0x81952A: main_loop_wait (main-loop.c:514)
==3516== by 0x4ADD29: main_loop (vl.c:1898)

One way to solve this is to explicitely call qemu_spice_display_switch()
on entering VGA mode, which is called during qxl_post_load().

Fixes:
"null pointer access on migration resume of systemrescuecd boot menu with qxl-vga"
https://bugs.launchpad.net/qemu/+bug/1679126
https://bugzilla.redhat.com/show_bug.cgi?id=1438566

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20170406120513.638-4-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-07 12:31:46 +02:00
Peter Maydell
5fe2339e6b Merge remote-tracking branch 'remotes/awilliam/tags/vfio-updates-20170406.0' into staging
VFIO fixes 2017-04-06

 - Extra test for NVIDIA BAR5 quirk to avoid segfault (Alex Williamson)

# gpg: Signature made Thu 06 Apr 2017 23:05:53 BST
# gpg:                using RSA key 0x239B9B6E3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B  8A90 239B 9B6E 3BB0 8B22

* remotes/awilliam/tags/vfio-updates-20170406.0:
  vfio/pci-quirks: Exclude non-ioport BAR from NVIDIA quirk

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-07 10:29:56 +01:00
Alex Williamson
8f419c5b43 vfio/pci-quirks: Exclude non-ioport BAR from NVIDIA quirk
The NVIDIA BAR5 quirk is targeting an ioport BAR.  Some older devices
have a BAR5 which is not ioport and can induce a segfault here.  Test
the BAR type to skip these devices.

Link: https://bugs.launchpad.net/qemu/+bug/1678466
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-04-06 16:03:26 -06:00
Peter Maydell
54d689988c Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* TCO watchdog fix

# gpg: Signature made Wed 05 Apr 2017 16:24:52 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  tco: do not generate an NMI

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-06 09:27:49 +01:00
Paolo Bonzini
8c9f42f3cf tco: do not generate an NMI
This behavior is not indicated in the datasheet and can confuse the OS.
The TCO can trap NMIs from SERR# or IOCHK# and convert them to SMIs; but
any other TCO event is either delivered as an SMI or completely disabled.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-05 17:23:52 +02:00
Peter Maydell
1fde6ee885 Update version for v2.9.0-rc3 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-04 18:36:51 +01:00
Peter Maydell
1413c663c9 Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
Some 9pfs bugs fixes: potential hang at reset, migration blocker leak.

# gpg: Signature made Tue 04 Apr 2017 17:07:55 BST
# gpg:                using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg:                 aka "Greg Kurz <groug@free.fr>"
# gpg:                 aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg:                 aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg:                 aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894  DBA2 02FC 3AEB 0101 DBC2

* remotes/gkurz/tags/for-upstream:
  9pfs: clear migration blocker at session reset
  9pfs: fix multiple flush for same request

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-04 18:00:23 +01:00
Peter Maydell
2c9938c5d7 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pci: fix

A single bugfix for a error handling issue in pci.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Tue 04 Apr 2017 16:33:04 BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  pci: Only unmap bus_master_enabled_region if was added previously

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-04 17:27:32 +01:00
Greg Kurz
6d54af0ea9 9pfs: clear migration blocker at session reset
The migration blocker survives a device reset: if the guest mounts a 9p
share and then gets rebooted with system_reset, it will be unmigratable
until it remounts and umounts the 9p share again.

This happens because the migration blocker is supposed to be cleared when
we put the last reference on the root fid, but virtfs_reset() wrongly calls
free_fid() instead of put_fid().

This patch fixes virtfs_reset() so that it honor the way fids are supposed
to be manipulated: first get a reference and later put it back when you're
done.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Li Qiang <liqiang6-s@360.cn>
2017-04-04 18:06:01 +02:00
Greg Kurz
18adde86dd 9pfs: fix multiple flush for same request
If a client tries to flush the same outstanding request several times, only
the first flush completes. Subsequent ones keep waiting for the request
completion in v9fs_flush() and, therefore, leak a PDU. This will cause QEMU
to hang when draining active PDUs the next time the device is reset.

Let have each flush request wake up the next one if any. The last waiter
frees the cancelled PDU.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-04-04 18:06:01 +02:00
Alexey Kardashevskiy
193982c6f9 pci: Only unmap bus_master_enabled_region if was added previously
Normally pci_init_bus_master() would be called either via
bus->machine_done.notify or directly from do_pci_register_device().

However if a device's realize() failed, pci_init_bus_master() is not
called, and do_pci_unregister_device() fails on
memory_region_del_subregion() as it was not mapped.

This adds a check that subregion was mapped before unmapping it.

Fixes: c53598ed18 ("pci: Add missing drop of bus master AS reference")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: John Snow <jsnow@redhat.com>
2017-04-04 18:32:25 +03:00
Peter Maydell
fa902c8ca0 Merge remote-tracking branch 'remotes/berrange/tags/pull-qio-2017-04-04-1' into staging
Merge qio 2017/04/04 v1

# gpg: Signature made Tue 04 Apr 2017 16:17:56 BST
# gpg:                using RSA key 0xBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF

* remotes/berrange/tags/pull-qio-2017-04-04-1:
  io: fix FD socket handling in DNS lookup
  io: fix incoming client socket initialization

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-04 16:25:30 +01:00
Daniel P. Berrange
b8a68728b6 io: fix FD socket handling in DNS lookup
The qio_dns_resolver_lookup_sync() method is required to be a no-op
for socket kinds that don't require name resolution. Thus the KIND_FD
handling should not return an error.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-04-04 16:17:03 +01:00
Wang guang
0e5d6327f3 io: fix incoming client socket initialization
The channel socket was initialized manually, but forgot to set
QIO_CHANNEL_FEATURE_SHUTDOWN. Thus, the colo_process_incoming_thread
would hang at recvmsg. This patch just call qio_channel_socket_new to
get channel, Which set QIO_CHANNEL_FEATURE_SHUTDOWN already.

Signed-off-by: Wang Guang<wang.guang55@zte.com.cn>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-04-04 16:17:03 +01:00
Peter Maydell
87cc4c6102 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* MemoryRegionCache revert
* glib optimization workaround
* fix "info lapic" segfault on isapc
* fix QIOChannel memory leak

# gpg: Signature made Mon 03 Apr 2017 18:17:00 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  main-loop: Acquire main_context lock around os_host_main_loop_wait.
  exec: revert MemoryRegionCache
  nbd: fix memory leak on socket_connect failed
  ipmi: Fix macro issues
  target-i386: fix "info lapic" segfault on isapc
  iscsi: drop unused IscsiAIOCB.qiov field

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-04 11:40:55 +01:00
Peter Maydell
d9123d09f7 tests/libqtest.c: Delete possible stale unix sockets
Occasionally if a test crashes or is interrupted by the user
at the wrong moment it could leave behind a stale UNIX
socket in /tmp/. This will then cause a subsequent test
run to fail spuriously with
 tests/libqtest.c:70:init_socket: assertion failed (ret != -1): (-1 != -1)
if it happens to reuse the same PID.

Defend against this by deleting any stray stale socket before
trying to open the new ones for this test.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1490963801-27870-1-git-send-email-peter.maydell@linaro.org
2017-04-03 19:05:38 +01:00
Richard W.M. Jones
ecbddbb106 main-loop: Acquire main_context lock around os_host_main_loop_wait.
When running virt-rescue the serial console hangs from time to time.
Virt-rescue runs an ordinary Linux kernel "appliance", but there is
only a single idle process running inside, so the qemu main loop is
largely idle.  With virt-rescue >= 1.37 you may be able to observe the
hang by doing:

  $ virt-rescue -e ^] --scratch
  ><rescue> while true; do ls -l /usr/bin; done

The hang in virt-rescue can be resolved by pressing a key on the
serial console.

Possibly with the same root cause, we also observed hangs during very
early boot of regular Linux VMs with a serial console.  Those hangs
are extremely rare, but you may be able to observe them by running
this command on baremetal for a sufficiently long time:

  $ while libguestfs-test-tool -t 60 >& /tmp/log ; do echo -n . ; done

(Check in /tmp/log that the failure was caused by a hang during early
boot, and not some other reason)

During investigation of this bug, Paolo Bonzini wrote:

> glib is expecting QEMU to use g_main_context_acquire around accesses to
> GMainContext.  However QEMU is not doing that, instead it is taking its
> own mutex.  So we should add g_main_context_acquire and
> g_main_context_release in the two implementations of
> os_host_main_loop_wait; these should undo the effect of Frediano's
> glib patch.

This patch exactly implements Paolo's suggestion in that paragraph.

This fixes the serial console hang in my testing, across 3 different
physical machines (AMD, Intel Core i7 and Intel Xeon), over many hours
of automated testing.  I wasn't able to reproduce the early boot hangs
(but as noted above, these are extremely rare in any case).

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1435432
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Message-Id: <20170331205133.23906-1-rjones@redhat.com>
[Paolo: this is actually a glib bug: recent glib versions are also
expecting g_main_context_acquire around g_poll---but that is not
documented and probably not even intended].
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-03 19:13:12 +02:00
Peter Maydell
b1a419ec79 Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2017-04-03' into staging
Block patches for 2.9-rc3

# gpg: Signature made Mon 03 Apr 2017 16:29:49 BST
# gpg:                using RSA key 0xF407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* remotes/maxreitz/tags/pull-block-2017-04-03:
  block/parallels: Avoid overflows
  iotests: Improve image-clear tests on non-aligned image
  qcow2: Discard unaligned tail when wiping image
  iotests: fix 097 when run with qcow
  qemu-io-cmds: Assert that global and nofile commands don't use ct->perms
  sheepdog: Fix blockdev-add
  nbd: Tidy up blockdev-add interface
  sockets: New helper socket_address_crumple()
  qapi-schema: SocketAddressFlat variants 'vsock' and 'fd'
  gluster: Prepare for SocketAddressFlat extension
  block: Document -drive problematic code and bugs
  io vnc sockets: Clean up SocketAddressKind switches
  char: Fix socket with "type": "vsock" address
  nbd sockets vnc: Mark problematic address family tests TODO
  block: add missed aio_context_acquire into release_drive

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-03 16:43:39 +01:00
Max Reitz
86d1bd7098 block/parallels: Avoid overflows
Change the types of variables in allocate_clusters() to int64_t so we do
not have to worry about potential overflows.

Add an assertion that our accesses to s->bat[] do not result in a buffer
overflow and that the implicit conversion performed when invoking
bat_entry_off() does not result in an integer overflow.

Coverity-id: 1307776
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170331170512.10381-1-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:40 +02:00
Eric Blake
f82c5b17ea iotests: Improve image-clear tests on non-aligned image
Tweak 097 and 176 to operate on an image that is not cluster-aligned,
to give further coverage of clearing out an entire image, including
the recent fix to eliminate the difference between fast path (97) and
slow (176) for qcow2.  Also tested on qcow (97 only, since qcow lacks
snapshots).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170331185356.2479-4-eblake@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:40 +02:00
Eric Blake
0c1bd4692f qcow2: Discard unaligned tail when wiping image
There is a subtle difference between the fast (qcow2v3 with no
extra data) and slow path (qcow2v2 format [aka 0.10], or when a
snapshot is present) of qcow2_make_empty().  The slow path fails
to discard the final (partial) cluster of an unaligned image.

The problem stems from the fact that qcow2_discard_clusters() was
silently ignoring sub-cluster head and tail on unaligned requests.
A quick audit of all callers shows that qcow2_snapshot_create() has
always passed a cluster-aligned request since the call was added
in commit 1ebf561; qcow2_co_pdiscard() has passed a cluster-aligned
request since commit ecdbead taught the block layer about preferred
discard alignment; and qcow2_make_empty() was fixed to pass an
aligned start (but not necessarily end) in commit a3e1505.

Asserting that the start is always aligned also points out that we
now have a dead check: rounding the end offset down can never result
in a value less than the aligned start offset (the check was rendered
dead with commit ecdbead).  Meanwhile, we do not want to round the
end cluster down in the one case of the end offset matching the
(unaligned) file size - that final partial cluster should still be
discarded.

With those fixes in place, the fast and slow paths are back in sync
at discarding an entire image; the next patch will update
qemu-iotests to ensure we don't regress.

Note that bdrv_co_pdiscard ignores ALL partial cluster requests,
including the partial cluster at the end of an image; it can be
argued that the partial cluster at the end should be special-cased
so that a guest issuing discard requests at proper alignments
everywhere else can likewise empty the entire image.  But that
optimization is left for another day.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170331185356.2479-3-eblake@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:40 +02:00
Daniel P. Berrange
07ff948bd1 iotests: fix 097 when run with qcow
The previous commit:

  commit a3e1505dae
  Author: Eric Blake <eblake@redhat.com>
  Date:   Mon Dec 5 09:49:34 2016 -0600

    qcow2: Don't strand clusters near 2G intervals during commit

extended the 097 test case so that it did two passes, once
with an internal snapshot, once without.

qcow (v1) does not support internal snapshots, so this change
broke test 097 when run against qcow.

This splits 097 in two, creating a new 176 that tests the
internal snapshot codepath, effectively putting 097 back
to its content before the above commit.

Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20170221115512.21918-8-berrange@redhat.com>
[eblake: test collisions: s/173/176/g]
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20170331185356.2479-2-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:40 +02:00
Peter Maydell
6aabeb5839 qemu-io-cmds: Assert that global and nofile commands don't use ct->perms
It would be a bug for a command with the CMD_NOFILE_OK or
CMD_FLAG_GLOBAL flags set to also set the ct->perms field,
because the former says "OK for a file not to be open"
but the latter is a check on a file.

Add an assertion in qemuio_add_command() so we can catch that
sort of buggy command definition immediately rather than it
being a bug that only manifests when a particular set of
command line options is used.

(Coverity gets confused about this (CID 1371723) and reports
that we might dereference a NULL blk pointer in this case,
because it can't tell that that code path never happens with
the cmdinfo_t that we have. This commit won't help unconfuse
it, but it does fix the underlying issue.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1490967529-4767-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:40 +02:00
Markus Armbruster
d1c136885b sheepdog: Fix blockdev-add
Commit 831acdc "sheepdog: Implement bdrv_parse_filename()" and commit
d282f34 "sheepdog: Support blockdev-add" have different ideas on how
the QemuOpts parameters for the server address are named.  Fix that.
While there, rename BlockdevOptionsSheepdog member addr to server, for
consistency with BlockdevOptionsSsh, BlockdevOptionsGluster,
BlockdevOptionsNbd.

Commit 831acdc's example becomes

    --drive driver=sheepdog,server.type=inet,server.host=fido,server.port=7000,vdi=dolly

instead of

    --drive driver=sheepdog,host=fido,vdi=dolly

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
Message-id: 1490895797-29094-10-git-send-email-armbru@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster
9445673ea6 nbd: Tidy up blockdev-add interface
SocketAddress is a simple union, and simple unions are awkward: they
have their variant members wrapped in a "data" object on the wire, and
require additional indirections in C.  I intend to limit its use to
existing external interfaces, and convert all internal interfaces to
SocketAddressFlat.

BlockdevOptionsNbd is an external interface using SocketAddress.  We
already use SocketAddressFlat elsewhere in blockdev-add.  Replace it
by SocketAddressFlat while we can (it's new in 2.9) for simplicity and
consistency.  For example,

    { "execute": "blockdev-add",
      "arguments": { "node-name": "foo", "driver": "nbd",
                     "server": { "type": "inet",
		                 "data": { "host": "localhost",
				           "port": "12345" } } } }

becomes

    { "execute": "blockdev-add",
      "arguments": { "node-name": "foo", "driver": "nbd",
                     "server": { "type": "inet",
		                 "host": "localhost", "port": "12345" } } }

Since the internal interfaces still take SocketAddress, this requires
conversion function socket_address_crumple().  It'll go away when I
update the interfaces.

Unfortunately, SocketAddress is also visible in -drive since 2.8:

    -drive if=none,driver=nbd,server.type=inet,server.data.host=127.0.0.1,server.data.port=12345

Nobody should be using it, as it's fairly new and has never been
documented, so adding still more compatibility gunk to keep it working
isn't worth the trouble.  You now have to use

    -drive if=none,driver=nbd,server.type=inet,server.host=127.0.0.1,server.port=12345

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1490895797-29094-9-git-send-email-armbru@redhat.com

[mreitz: Change iotest 147 accordingly]

Because of this interface change, iotest 147 has to be adapted.
Unfortunately, we cannot just flatten all of the addresses because
nbd-server-start still takes a plain SocketAddress. Therefore, we need
both and this is most easily achieved by writing the SocketAddress into
the code and flattening it where necessary.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170330221243.17333-1-mreitz@redhat.com

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster
216411b839 sockets: New helper socket_address_crumple()
SocketAddress is a simple union, and simple unions are awkward: they
have their variant members wrapped in a "data" object on the wire, and
require additional indirections in C.  I intend to limit its use to
existing external interfaces.  New ones should use SocketAddressFlat.
I further intend to convert all internal interfaces to
SocketAddressFlat.  This helper should go away then.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1490895797-29094-8-git-send-email-armbru@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster
8bc0673f6d qapi-schema: SocketAddressFlat variants 'vsock' and 'fd'
Note that the new variants are impossible in qemu_gluster_glfs_init(),
because the gconf->server can only come from qemu_gluster_parse_uri()
or qemu_gluster_parse_json(), and neither can create anything but
'inet' or 'unix'.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1490895797-29094-7-git-send-email-armbru@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster
fce5d5386d gluster: Prepare for SocketAddressFlat extension
qemu_gluster_glfs_init() and qemu_gluster_parse_json() rely on the
fact that SocketAddressFlatType has only two members
SOCKET_ADDRESS_FLAT_TYPE_INET and SOCKET_ADDRESS_FLAT_TYPE_UNIX.
Correct, but won't stay correct.  Make them more robust.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490895797-29094-6-git-send-email-armbru@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster
129c7d1c53 block: Document -drive problematic code and bugs
-blockdev and blockdev_add convert their arguments via QObject to
BlockdevOptions for qmp_blockdev_add(), which converts them back to
QObject, then to a flattened QDict.  The QDict's members are typed
according to the QAPI schema.

-drive converts its argument via QemuOpts to a (flat) QDict.  This
QDict's members are all QString.

Thus, the QType of a flat QDict member depends on whether it comes
from -drive or -blockdev/blockdev_add, except when the QAPI type maps
to QString, which is the case for 'str' and enumeration types.

The block layer core extracts generic configuration from the flat
QDict, and the block driver extracts driver-specific configuration.

Both commonly do so by converting (parts of) the flat QDict to
QemuOpts, which turns all values into strings.  Not exactly elegant,
but correct.

However, A few places access the flat QDict directly:

* Most of them access members that are always QString.  Correct.

* bdrv_open_inherit() accesses a boolean, carefully.  Correct.

* nfs_config() uses a QObject input visitor.  Correct only because the
  visited type contains nothing but QStrings.

* nbd_config() and ssh_config() use a QObject input visitor, and the
  visited types contain non-QStrings: InetSocketAddress members
  @numeric, @to, @ipv4, @ipv6.  -drive works as long as you don't try
  to use them (they're all optional).  @to is ignored anyway.

  Reproducer:
  -drive driver=ssh,server.host=h,server.port=22,server.ipv4,path=p
  -drive driver=nbd,server.type=inet,server.data.host=h,server.data.port=22,server.data.ipv4
  both fail with "Invalid parameter type for 'data.ipv4', expected: boolean"

Add suitable comments to all these places.  Mark the buggy ones FIXME.

"Fortunately", -drive's driver-specific options are entirely
undocumented.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1490895797-29094-5-git-send-email-armbru@redhat.com
[mreitz: Fixed two typos]
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster
a6c76285f2 io vnc sockets: Clean up SocketAddressKind switches
We have quite a few switches over SocketAddressKind.  Some have case
labels for all enumeration values, others rely on a default label.
Some abort when the value isn't a valid SocketAddressKind, others
report an error then.

Unify as follows.  Always provide case labels for all enumeration
values, to clarify intent.  Abort when the value isn't a valid
SocketAddressKind, because the program state is messed up then.

Improve a few error messages while there.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1490895797-29094-4-git-send-email-armbru@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster
d2e49aad72 char: Fix socket with "type": "vsock" address
Watch this:

    $ qemu-system-x86_64 -nodefaults -S -display none -qmp stdio
    {"QMP": {"version": {"qemu": {"micro": 91, "minor": 8, "major": 2}, "package": " (v2.8.0-1195-gf84141e-dirty)"}, "capabilities": []}}
    { "execute": "qmp_capabilities" }
    {"return": {}}
    { "execute": "chardev-add", "arguments": { "id": "chr0", "backend": { "type": "socket", "data": { "addr": { "type": "vsock", "data": { "cid": "CID", "port": "P" }}}}}}
    Aborted (core dumped)

Crashes because SocketAddress_to_str() is blissfully unaware of
SOCKET_ADDRESS_KIND_VSOCK.  Fix that.  Pick the output format to match
socket_parse(), just like the existing formats.

Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1490895797-29094-3-git-send-email-armbru@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Markus Armbruster
ca0b64e5ed nbd sockets vnc: Mark problematic address family tests TODO
Certain features make sense only with certain address families.  For
instance, passing file descriptors requires AF_UNIX.  Testing
SocketAddress's saddr->type == SOCKET_ADDRESS_KIND_UNIX is obvious,
but problematic: it can't recognize AF_UNIX when type ==
SOCKET_ADDRESS_KIND_FD.

Mark such tests of saddr->type TODO.  We may want to check the address
family with getsockname() there.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1490895797-29094-2-git-send-email-armbru@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Denis V. Lunev
9588c5897b block: add missed aio_context_acquire into release_drive
Recently we expirience hang with iothreads enabled with the following
call trace:
Thread 1 (Thread 0x7fa95efebc80 (LWP 177117)):
0  ppoll () from /lib64/libc.so.6
2  qemu_poll_ns () at qemu-timer.c:313
3  aio_poll () at aio-posix.c:457
4  bdrv_flush () at block/io.c:2641
5  bdrv_close () at block.c:2143
6  bdrv_delete () at block.c:2352
7  bdrv_unref () at block.c:3429
8  blk_remove_bs () at block/block-backend.c:427
9  blk_delete () at block/block-backend.c:178
10 blk_unref () at block/block-backend.c:226
11 object_property_del_all () at qom/object.c:399
12 object_finalize () at qom/object.c:461
13 object_unref () at qom/object.c:898
14 object_property_del_child () at qom/object.c:422
15 qmp_marshal_device_del () at qmp-marshal.c:1145
16 handle_qmp_command () at /usr/src/debug/qemu-2.6.0/monitor.c:3929

Technically bdrv_flush() stucks in
    while (rwco.ret == NOT_DONE) {
        aio_poll(aio_context, true);
    }
but rwco.ret is equal to 0 thus we have missed wakeup. Code investigation
reveals that we do not have performed aio_context_acquire() on this call
stack.

This patch adds missed lock.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
Message-id: 1490717566-25516-1-git-send-email-den@openvz.org
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-04-03 17:11:39 +02:00
Gerd Hoffmann
102a3d8478 usb-host: switch to LIBUSB_API_VERSION
libusbx doesn't exist any more, the fork got merged back to libusb.  So
stop using LIBUSBX_API_VERSION and use LIBUSB_API_VERSION instead.  For
backward compatibility alias LIBUSB_API_VERSION to LIBUSBX_API_VERSION
in case we figure LIBUSB_API_VERSION isn't defined.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20170403105238.23262-1-kraxel@redhat.com
2017-04-03 14:41:23 +01:00
Peter Maydell
230f4c6bc5 disas/cris.c: Avoid unintentional sign extension
Commit 001ebaca7b fixed some unintended sign extension issues
spotted by Coverity (CID 1005402, 1005403), but didn't catch
all of them. Fix the rest, so we behave consistently whether
'long' is 32 bit or 64 bit.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1490970671-20560-1-git-send-email-peter.maydell@linaro.org
2017-04-03 14:06:59 +01:00
Peter Maydell
6499fd151d configure: Mark SPARC as supported
Thanks to John Paul Adrian Glaubitz <glaubitz@physik.fu-berlin.de>
and the Debian Project, we now have access to a SPARC Linux
system we can use for build testing. Move SPARC back into
the "supported" list.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1490698718-23762-1-git-send-email-peter.maydell@linaro.org
2017-04-03 12:59:47 +01:00
Peter Maydell
5c32be5baf tcg/sparc: Zero extend address argument to ld/st helpers
The C store helper functions take the address argument as a
target_ulong type; if this is 32 bit but the host is 64 bit
then the SPARC calling convention requires that the caller
must zero extend the value. We weren't doing this, which
meant we could pass values to the caller with high bits set
and QEMU would crash if it was compiled with optimizations.
In particular, the i386 BIOS would not start.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1490871151-29029-3-git-send-email-peter.maydell@linaro.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-04-03 12:59:37 +01:00
Peter Maydell
709a340d67 tcg/sparc: Zero extend data argument to store helpers
The C store helper functions take the data argument as a uint8_t,
uint16_t, etc depending on the store size. The SPARC calling
convention requires that data types smaller than the register
size must be extended by the caller. We weren't doing this,
which meant that if QEMU was compiled with optimizations enabled
we could end up storing incorrect values to guest memory.
(In particular the i386 guest BIOS would crash on startup.)

Add code to the trampolines that call the store helpers to
do the zero extension as required.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1490871151-29029-2-git-send-email-peter.maydell@linaro.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-04-03 12:59:14 +01:00
Paolo Bonzini
90c4fe5fc5 exec: revert MemoryRegionCache
MemoryRegionCache did not know about virtio support for IOMMUs (because the
two features were developed at the same time).  Revert MemoryRegionCache
to "normal" address_space_* operations for 2.9, as it is simpler than
undoing the virtio patches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-03 13:41:53 +02:00
Peter Maydell
f9e46d37bd Merge remote-tracking branch 'remotes/kraxel/tags/pull-fixes-20170403-1' into staging
bugfixes: xhci, input-linux and vnc

# gpg: Signature made Mon 03 Apr 2017 11:25:29 BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-fixes-20170403-1:
  vnc: allow to connect with add_client when -vnc none
  Fix input-linux reading from device
  xhci: flush dequeue pointer to endpoint context

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-03 12:24:25 +01:00
Peter Maydell
9eb5adae26 Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170403' into staging
ppc patch queue 2017-04-03

A single bugfix in this pull request, for an ugly assert() failure, if
the user ignores the information in query-hotpluggable-cpus and tries
to hot add CPUs to pseries with bad parameters.

# gpg: Signature made Mon 03 Apr 2017 11:06:58 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.9-20170403:
  pseries: Enforce homogeneous threads-per-core

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-03 11:15:33 +01:00
Marc-André Lureau
fa03cb7fd2 vnc: allow to connect with add_client when -vnc none
Do not skip VNC initialization, in particular of auth method when vnc is
configured without sockets, since we should still allow connections
through QMP add_client.

Fixes:
https://bugzilla.redhat.com/show_bug.cgi?id=1434551

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20170328160646.21250-1-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-03 11:48:55 +02:00
Javier Celaya
1684907c92 Fix input-linux reading from device
The evdev devices in input-linux.c are read in blocks of one whole
event. If there are not enough bytes available, they are discarded,
instead of being kept for the next read operation. This results in
lost events, of even non-working devices.

This patch keeps track of the number of bytes to be read to fill up
a whole event, and then handle it.

Changes from v1 to v2:
- Fix: Calculate offset on each iteration

Changes from v2 to v3:
- Fix coding style
- Store offset instead of bytes to be read

Signed-off-by: Javier Celaya <jcelaya@gmail.com>
Message-id: 20170327182624.2914-1-jcelaya@gmail.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-04-03 11:44:58 +02:00
Gerd Hoffmann
243afe858b xhci: flush dequeue pointer to endpoint context
When done processing a endpoint ring we must update the dequeue pointer
in the endpoint context in guest memory.  This is needed to make sure
the guest has a correct view of things and also to make live migration
work properly, because xhci post_load restores alot of the state from
xhci data structures in guest memory.

Add xhci_set_ep_state() call to do that.

The recursive calls stopped by commit
ddb603ab6c had the (unintentional) side
effect to hiding this bug.  xhci_set_ep_state() was called before
processing, to set the state to running, which updated the dequeue
pointer too.

Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 20170331102521.29253-1-kraxel@redhat.com
2017-04-03 11:40:57 +02:00
Peter Maydell
6954cdc070 Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging
# gpg: Signature made Sat 01 Apr 2017 02:23:29 BST
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  block/curl: Check protocol prefix
  qapi/curl: Extend and fix blockdev-add schema
  rbd: Fix regression in legacy key/values containing escaped :

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-04-03 10:09:58 +01:00
David Gibson
8149e2992f pseries: Enforce homogeneous threads-per-core
For reasons that may be useful in future, CPU core objects, as used on the
pseries machine type have their own nr-threads property, potentially
allowing cores with different numbers of threads in the same system.

If the user/management uses the values specified in query-hotpluggable-cpus
as they're expected to do, this will never matter in pratice.  But that's
not actually enforced - it's possible to manually specify a core with
a different number of threads from that in -smp.  That will confuse the
platform - most immediately, this can be used to create a CPU thread with
index above max_cpus which leads to an assertion failure in
spapr_cpu_core_realize().

For now, enforce that all cores must have the same, standard, number of
threads.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
2017-04-03 13:46:18 +10:00
yaolujing
cc1e139139 nbd: fix memory leak on socket_connect failed
When TCP connection fails between nbd server and client,
the local var, sioc, memory leak.

This patch fixes the memory leak.

Signed-off-by: yaolujing <yaolujing@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1491005709-29989-1-git-send-email-yaolujing@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-02 21:18:21 +02:00
Corey Minyard
cb9a05a4f1 ipmi: Fix macro issues
Macro parameters should almost always have () around them when used.
llvm reported an error on this.

Remove redundant parenthesis and put parenthesis around the entire
macros with assignments in case they are used in an expression.

Remove some unused macros.

Reported in https://bugs.launchpad.net/bugs/1651167

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1490894892-8055-1-git-send-email-minyard@acm.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-02 21:17:47 +02:00
Tejaswini Poluri
c7f15bc936 target-i386: fix "info lapic" segfault on isapc
Start QEMU with
"qemu-system-x86_64 -nographic -M isapc -serial none-monitor stdio"
and enter "info lapic" at the monitor prompt ⇒
Segmentation fault

Signed-off-by: Tejaswini Poluri <tejaswinipoluri3@gmail.com>
Message-Id: <1490685583-16987-1-git-send-email-tejaswinipoluri3@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-02 21:17:47 +02:00
Stefan Hajnoczi
b9afaba891 iscsi: drop unused IscsiAIOCB.qiov field
The IscsiAIOCB.qiov field has been unused since commit
063c3378a9 ("block/iscsi: introduce
bdrv_co_{readv, writev, flush_to_disk}") back in 2013.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20170327165005.22038-1-stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-04-02 21:17:47 +02:00
Max Reitz
34634ca286 block/curl: Check protocol prefix
If the user has explicitly specified a block driver and thus a protocol,
we have to make sure the URL's protocol prefix matches. Otherwise the
latter will silently override the former which might catch some users by
surprise.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170331120431.1767-3-mreitz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-31 15:53:22 -04:00
Max Reitz
6b9d62db89 qapi/curl: Extend and fix blockdev-add schema
The curl block driver accepts more options than just "filename"; also,
the URL is actually expected to be passed through the "url" option
instead of "filename".

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170331120431.1767-2-mreitz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-31 15:52:58 -04:00
Eric Blake
e98c6961c8 rbd: Fix regression in legacy key/values containing escaped :
Commit c7cacb3 accidentally broke legacy key-value parsing through
pseudo-filename parsing of -drive file=rbd://..., for any key that
contains an escaped ':'.  Such a key is surprisingly common, thanks
to mon_host specifying a 'host:port' string.  The break happens
because passing things from QDict through QemuOpts back to another
QDict requires that we pack our parsed key/value pairs into a string,
and then reparse that string, but the intermediate string that we
created ("key1=value1:key2=value2") lost the \: escaping that was
present in the original, so that we could no longer see which : were
used as separators vs. those used as part of the original input.

Fix it by collecting the key/value pairs through a QList, and
sending that list on a round trip through a JSON QString (as in
'["key1","value1","key2","value2"]') on its way through QemuOpts,
rather than hand-rolling our own string.  Since the string is only
handled internally, this was faster than creating a full-blown
struct of '[{"key1":"value1"},{"key2":"value2"}]', and safer at
guaranteeing order compared to '{"key1":"value1","key2":"value2"}'.

It would be nicer if we didn't have to round-trip through QemuOpts
in the first place, but that's a much bigger task for later.

Reproducer:
./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio \
-drive 'file=rbd:volumes/volume-ea141b5c-cdb3-4765-910d-e7008b209a70'\
':id=compute:key=AQAVkvxXAAAAABAA9ZxWFYdRmV+DSwKr7BKKXg=='\
':auth_supported=cephx\;none:mon_host=192.168.1.2\:6789'\
',format=raw,if=none,id=drive-virtio-disk0,'\
'serial=ea141b5c-cdb3-4765-910d-e7008b209a70,cache=writeback'

Even without an RBD setup, this serves a test of whether we get
the incorrect parser error of:
qemu-system-x86_64: -drive file=rbd:...cache=writeback: conf option 6789 has no value
or the correct behavior of hanging while trying to connect to
the requested mon_host of 192.168.1.2:6789.

Reported-by: Alexandru Avadanii <Alexandru.Avadanii@enea.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170331152730.12514-1-eblake@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-31 13:53:57 -04:00
Peter Maydell
95b31d709b Merge remote-tracking branch 'remotes/awilliam/tags/vfio-updates-20170331.0' into staging
VFIO fixes 2017-03-31

 - We can't disable stolen memory for UPT mode, it breaks Windows
   drivers on Gen9+ IGD (Xiong Zhang)

# gpg: Signature made Fri 31 Mar 2017 17:13:48 BST
# gpg:                using RSA key 0x239B9B6E3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B  8A90 239B 9B6E 3BB0 8B22

* remotes/awilliam/tags/vfio-updates-20170331.0:
  Revert "vfio/pci-quirks.c: Disable stolen memory for igd VFIO"

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-31 18:06:13 +01:00
Xiong Zhang
93587e3af3 Revert "vfio/pci-quirks.c: Disable stolen memory for igd VFIO"
This reverts commit c2b2e158cc.

The original patch intend to prevent linux i915 driver from using
stolen meory. But this patch breaks windows IGD driver loading on
Gen9+, as IGD HW will use stolen memory on Gen9+, once windows IGD
driver see zero size stolen memory, it will unload.
Meanwhile stolen memory will be disabled in 915 when i915 run as
a guest.

Signed-off-by: Xiong Zhang <xiong.y.zhang@intel.com>
[aw: Gen9+ is SkyLake and newer]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2017-03-31 10:04:41 -06:00
Peter Maydell
37c4a85cd2 Merge remote-tracking branch 'remotes/dgilbert/tags/pull-hmp-20170331' into staging
HMP pull (one bugfix)

# gpg: Signature made Fri 31 Mar 2017 11:57:17 BST
# gpg:                using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A  9FA9 0516 331E BC5B FDE7

* remotes/dgilbert/tags/pull-hmp-20170331:
  hmp: fix "dump-quest-memory" segfault

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-31 12:43:27 +01:00
Eric Auger
e7d54416cf hw/intc/arm_gicv3_kvm: Check KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS in reset
KVM_DEV_ARM_VGIC_GRP_CPU_SYSREGS needs to be checked before
attempting to read ICC_CTLR_EL1; otherwise kernel versions not
exposing this kvm device group will be incompatible with qemu 2.9.

Fixes: 07a5628  ("hw/intc/arm_gicv3_kvm: Reset GICv3 cpu interface registers")
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reported-by: Prakash B <bjsprakash.linux@gmail.com>
Tested-by: Alexander Graf <agraf@suse.de>
Message-id: 1490721640-13052-1-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-31 12:41:14 +01:00
Iwona Kotlarska
fd5d23babf hmp: fix "dump-quest-memory" segfault
Running QEMU with "qemu-system-x86_64 -M none -nographic -m 256" and executing
"dump-guest-memory /dev/null 0 8192" results in segfault.
Fix by checking if we have CPU.

Signed-off-by: Iwona Kotlarska <iwona260909@gmail.com>
Message-Id: <20170330050924.22134-1-iwona260909@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
   Fixed up title
2017-03-31 11:53:42 +01:00
Peter Maydell
05a6f451eb Merge remote-tracking branch 'remotes/mdroth/tags/qga-pull-2017-03-30-tag' into staging
qemu-ga patch queue for 2.9

* fix make check failure of guest-get-fsinfo when nested virtual block
  device partitions are mounted in the test environment
* fix static compilation for mingw builds

# gpg: Signature made Fri 31 Mar 2017 04:52:40 BST
# gpg:                using RSA key 0x3353C9CEF108B584
# gpg: Good signature from "Michael Roth <flukshun@gmail.com>"
# gpg:                 aka "Michael Roth <mdroth@utexas.edu>"
# gpg:                 aka "Michael Roth <mdroth@linux.vnet.ibm.com>"
# Primary key fingerprint: CEAC C9E1 5534 EBAB B82D  3FA0 3353 C9CE F108 B584

* remotes/mdroth/tags/qga-pull-2017-03-30-tag:
  qga: Make qemu-ga compile statically for Windows
  qga: don't fail if mount doesn't have slave devices

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-31 11:09:51 +01:00
Peter Maydell
a0ee3797bf Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Fri 31 Mar 2017 01:50:55 BST
# gpg:                using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  e1000: disable debug by default
  virtio-net: avoid call tap_enable when there's only one queue

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-31 10:09:42 +01:00
Sameeh Jubran
4eaf720294 qga: Make qemu-ga compile statically for Windows
Attempting to compile qemu-ga statically as follows for Windows causes
the following error:

Compilation:
    ./configure --disable-docs --target-list=x86_64-softmmu \
    --cross-prefix=x86_64-w64-mingw32- --static \
    --enable-guest-agent-msi --with-vss-sdk=/path/to/VSSSDK72

    make -j8 qemu-ga

Error:
    path/to/qemu/stubs/error-printf.c:7: undefined reference to `__imp_g_test_config_vars'
    collect2: error: ld returned 1 exit status
    Makefile:444: recipe for target 'qemu-ga.exe' failed
    make: *** [qemu-ga.exe] Error 1

This is caused by a bug in the pkg-config file for glib as it doesn't define
GLIB_STATIC_COMPILATION for pkg-config --static.

Signed-off-by: Sameeh Jubran <sameeh@daynix.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-03-30 21:47:25 -05:00
Jason Wang
b4053c6483 e1000: disable debug by default
Disable debug output by default, the information were not needed for
release.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Stefan Hajnoczi <stefanha@gmail.com>
Cc: Leonid Bloch <leonid.bloch@ravellosystems.com>
Cc: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-31 08:48:13 +08:00
Jason Wang
1074b879d1 virtio-net: avoid call tap_enable when there's only one queue
We call tap_enable() even if for multiqueue is not enabled. This is
wrong since it should be used for multiqueue codes to enable a
disabled queue. Fixing this by only calling this when multiqueue is
used.

Fixes: 16dbaf905b ("tap: support enabling or disabling a queue")
Reported-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Tested-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-31 08:48:13 +08:00
Michael Roth
8251a72f8b qga: don't fail if mount doesn't have slave devices
In some cases the slave devices of a virtual block device are tracked
by the parent in the corresponding sysfs node. For instance, if we
have a loop-back mount of the form:

  /dev/loop3p1 on /home/mdroth/mnt type ext4 (rw,relatime,data=ordered)

this will be reflected in sysfs as:

  /sys/devices/virtual/block/loop3/
  ...
  /sys/devices/virtual/block/loop3/slaves
  /sys/devices/virtual/block/loop3/loop3p1

The current code however assumes the mounted virtual block device,
loop3p1 in this case, contains the slaves directory, and reports an
error otherwise. This breaks 'make check' in certain environments.

Fix this by simply skipping attempts to generate disk topology
information in these cases. Since this information is documented
in QAPI as optionally-reported, this should be ok from an API
perspective.

In the future, this can possibly be improved upon by collecting
topology information from the parent in these cases.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-30 14:12:57 -05:00
Peter Maydell
ddc2c3a57e Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
vhost, pc: fixes

More fixes for 2.9. Region caching is still causing
issues around reset, but we seem to be getting there.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Thu 30 Mar 2017 17:14:45 BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  tests/acpi: don't pack a structure
  vhost: generalize iommu memory region

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-30 18:02:33 +01:00
Michael S. Tsirkin
0d876080a3 tests/acpi: don't pack a structure
There's no reason to pack structures where we don't care about size or
padding, this applies to AcpiStdTable in tests/acpi-utils.h.

OTOH bios-tables-test happens to be passing the address of a field in
this  struct to a function that expects a pointer to normally aligned
data which results in a SIGBUS on architectures like SPARC that have
strict alignment requirements.

Fixes: 9e8458c02 ("acpi unit-test: compare DSDT and SSDT tables against expected values")
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-30 19:12:44 +03:00
Jason Wang
375f74f473 vhost: generalize iommu memory region
We assumes the iommu_ops were attached to the root region of address
space. This may not be true for all kinds of IOMMU implementation and
especially after commit 3716d5902d ("pci: introduce a bus master
container"). So fix this by not assuming as->root has iommu_ops,
instead depending on the regions reported by memory listener through:

- register a memory listener to dma_as
- during region_add, if it's a region of IOMMU, register a specific
  IOMMU notifier, and store all notifiers in a list.
- during region_del, compare and delete the IOMMU notifier from the list

This is also a must for making vhost device IOTLB works for all types
of IOMMUs. Note, since we register one notifier during each
.region_add, the IOTLB may be flushed more than one times, this is
suboptimal and could be optimized in the future.

Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Fixes: 3716d5902d ("pci: introduce a bus master container")
Cc: Peter Xu <peterx@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Maxime Coquelin <maxime.coquelin@redhat.com>
2017-03-30 19:09:16 +03:00
Peter Maydell
e839001d5b Merge remote-tracking branch 'remotes/thibault/tags/samuel-thibault' into staging
slirp updates

# gpg: Signature made Tue 28 Mar 2017 23:51:51 BST
# gpg:                using RSA key 0xB0A51BF58C9179C5
# gpg: Good signature from "Samuel Thibault <samuel.thibault@aquilenet.fr>"
# gpg:                 aka "Samuel Thibault <sthibault@debian.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@gnu.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@inria.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@labri.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@ens-lyon.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@u-bordeaux.fr>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 900C B024 B679 31D4 0F82  304B D017 8C76 7D06 9EE6
#      Subkey fingerprint: AEBF 7448 FAB9 453A 4552  390E B0A5 1BF5 8C91 79C5

* remotes/thibault/tags/samuel-thibault:
  slirp: Send RDNSS in RA only if host has an IPv6 DNS server
  slirp: Make RA build more flexible
  slirp: fix compilation errors with DEBUG set

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-30 15:28:19 +01:00
Peter Maydell
a67ec6ee2d Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170329' into staging
ppc patch queue for 2017-03-29

Two more bugfixes of sufficient severity to warrant going into 2.9.

# gpg: Signature made Wed 29 Mar 2017 04:33:19 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.9-20170329:
  spapr: fix memory hot-unplugging
  spapr: fix buffer-overflow

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-30 14:53:03 +01:00
Peter Maydell
e68dd68496 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pci: fixes

More fixes for 2.9.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Wed 29 Mar 2017 00:35:49 BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  virtio: fix vring_align() on 64-bit windows
  pci: Add missing drop of bus master AS reference
  event_notifier: prevent accidental use after close

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-30 13:55:40 +01:00
Peter Maydell
fb59dabd4f configure: Don't claim 'unsupported host OS' when better message available
The change in commit 898be3e041 which made completely
unrecognized OSes cause an error_exit "Unsupported host OS"
has some unfortunate unintended effects:
 * if you run 'configure --help' on an unsupported host OS
   (eg if intending to use it as a build machine for a
   cross compile to a supported host) then the message
   is printed instead of --help
 * if the C compiler doesn't work or is missing (eg if
   you passed an incorrect --cross-prefix by mistake)
   the message is printed instead of the more useful
   'compiler does not exist or does not work' message

Fix this by postponing the error_exit in this situation
until later, when we have already identified the more
useful cases for this.

The long term fix for this would be to move handling
of --help much further up in the configure script,
and make its output not dependent on checks that configure
runs. However for 2.9 this would be too invasive.

Reported-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Tested-by: Stefan Weil <sw@weilnetz.de>
2017-03-30 12:47:03 +01:00
Peter Maydell
b529aec1ee Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging
i386: Fix for "-cpu host,invtsc=on" bug

# gpg: Signature made Tue 28 Mar 2017 20:50:33 BST
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/x86-pull-request:
  i386: Don't override -cpu options on -cpu host/max
  i386: Replace uint32_t* with FeatureWord on feature getter/setter

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-30 09:58:46 +01:00
Laurent Vivier
fe6824d126 spapr: fix memory hot-unplugging
If, once the kernel has booted, we try to remove a memory
hotplugged while the kernel was not started, QEMU crashes on
an assert:

    qemu-system-ppc64: hw/virtio/vhost.c:651:
                       vhost_commit: Assertion `r >= 0' failed.
    ...
    #4  in vhost_commit
    #5  in memory_region_transaction_commit
    #6  in pc_dimm_memory_unplug
    #7  in spapr_memory_unplug
    #8  spapr_machine_device_unplug
    #9  in hotplug_handler_unplug
    #10 in spapr_lmb_release
    #11 in detach
    #12 in set_allocation_state
    #13 in rtas_set_indicator
    ...

If we take a closer look to the guest kernel log, we can see when
we try to unplug the memory:

    pseries-hotplug-mem: Attempting to hot-add 4 LMB(s)

What happens:

    1- The kernel has ignored the memory hotplug event because
       it was not started when it was generated.

    2- When we hot-unplug the memory,
       QEMU starts to remove the memory,
            generates an hot-unplug event,
        and signals the kernel of the incoming new event

    3- as the kernel is started, on the QEMU signal, it reads
       the event list, decodes the hotplug event and tries to
       finish the hotplugging.

    4- QEMU receive the the hotplug notification while it
       is trying to hot-unplug the memory. This moves the memory
       DRC to an invalid state

This patch prevents this by not allowing to set the allocation
state to USABLE while the DRC is awaiting release.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1432382

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-29 11:35:16 +11:00
Marc-André Lureau
24ec2863b1 spapr: fix buffer-overflow
Running postcopy-test with ASAN produces the following error:

QTEST_QEMU_BINARY=ppc64-softmmu/qemu-system-ppc64  tests/postcopy-test
...
=================================================================
==23641==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x7f1556600000 at pc 0x55b8e9d28208 bp 0x7f1555f4d3c0 sp 0x7f1555f4d3b0
READ of size 8 at 0x7f1556600000 thread T6
    #0 0x55b8e9d28207 in htab_save_first_pass /home/elmarco/src/qq/hw/ppc/spapr.c:1528
    #1 0x55b8e9d2939c in htab_save_iterate /home/elmarco/src/qq/hw/ppc/spapr.c:1665
    #2 0x55b8e9beae3a in qemu_savevm_state_iterate /home/elmarco/src/qq/migration/savevm.c:1044
    #3 0x55b8ea677733 in migration_thread /home/elmarco/src/qq/migration/migration.c:1976
    #4 0x7f15845f46c9 in start_thread (/lib64/libpthread.so.0+0x76c9)
    #5 0x7f157d9d0f7e in clone (/lib64/libc.so.6+0x107f7e)

0x7f1556600000 is located 0 bytes to the right of 2097152-byte region [0x7f1556400000,0x7f1556600000)
allocated by thread T0 here:
    #0 0x7f159bb76980 in posix_memalign (/lib64/libasan.so.3+0xc7980)
    #1 0x55b8eab185b2 in qemu_try_memalign /home/elmarco/src/qq/util/oslib-posix.c:106
    #2 0x55b8eab186c8 in qemu_memalign /home/elmarco/src/qq/util/oslib-posix.c:122
    #3 0x55b8e9d268a8 in spapr_reallocate_hpt /home/elmarco/src/qq/hw/ppc/spapr.c:1214
    #4 0x55b8e9d26e04 in ppc_spapr_reset /home/elmarco/src/qq/hw/ppc/spapr.c:1261
    #5 0x55b8ea12e913 in qemu_system_reset /home/elmarco/src/qq/vl.c:1697
    #6 0x55b8ea13fa40 in main /home/elmarco/src/qq/vl.c:4679
    #7 0x7f157d8e9400 in __libc_start_main (/lib64/libc.so.6+0x20400)

Thread T6 created by T0 here:
    #0 0x7f159bae0488 in __interceptor_pthread_create (/lib64/libasan.so.3+0x31488)
    #1 0x55b8eab1d9cb in qemu_thread_create /home/elmarco/src/qq/util/qemu-thread-posix.c:465
    #2 0x55b8ea67874c in migrate_fd_connect /home/elmarco/src/qq/migration/migration.c:2096
    #3 0x55b8ea66cbb0 in migration_channel_connect /home/elmarco/src/qq/migration/migration.c:500
    #4 0x55b8ea678f38 in socket_outgoing_migration /home/elmarco/src/qq/migration/socket.c:87
    #5 0x55b8eaa5a03a in qio_task_complete /home/elmarco/src/qq/io/task.c:142
    #6 0x55b8eaa599cc in gio_task_thread_result /home/elmarco/src/qq/io/task.c:88
    #7 0x7f15823e38e6  (/lib64/libglib-2.0.so.0+0x468e6)
SUMMARY: AddressSanitizer: heap-buffer-overflow /home/elmarco/src/qq/hw/ppc/spapr.c:1528 in htab_save_first_pass

index seems to be wrongly incremented, unless I miss something that
would be worth a comment.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-29 11:35:02 +11:00
Andrew Baumann
b8adbc6578 virtio: fix vring_align() on 64-bit windows
long is 32-bits on 64-bit windows, which caused the top half of the
address to be truncated; this patch changes it to use the
QEMU_ALIGN_UP macro which does not suffer the same problem

Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-29 02:35:24 +03:00
Alexey Kardashevskiy
c53598ed18 pci: Add missing drop of bus master AS reference
The recent introduction of a bus master container added
memory_region_add_subregion() into the PCI device registering path but
missed memory_region_del_subregion() in the unregistering path leaving
a reference to the root memory region of the new container.

This adds missing memory_region_del_subregion().

Fixes: 3716d5902d ("pci: introduce a bus master container")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-29 02:35:23 +03:00
Halil Pasic
aa26292859 event_notifier: prevent accidental use after close
Let's set the handles to the underlying facilities to their extremal
value so no accidental misuse can happen, and to make it obvious that the
notifier is dysfunctional. E.g. if we just close an fd but do not touch
the int holding the fd eventually a read/write could succeed again when
the fd gets reused, and corrupt the file addressed by the fd.

Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-29 02:35:23 +03:00
Samuel Thibault
a2f80fdfc6 slirp: Send RDNSS in RA only if host has an IPv6 DNS server
Previously we would always send an RDNSS option in the RA, making the guest
try to resolve DNS through IPv6, even if the host does not actually have
and IPv6 DNS server available.

This makes the RDNSS option enabled only when an IPv6 DNS server is
available.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-29 00:51:25 +02:00
Samuel Thibault
e42f869b51 slirp: Make RA build more flexible
Do not hardcode the RA size at all, use a pl_size variable which
accounts the accumulated size, and fill rip->ip_pl at the end.

This will allow to make some blocks optional.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-29 00:49:04 +02:00
Laurent Vivier
51149a2ac1 slirp: fix compilation errors with DEBUG set
slirp/slirp.c: In function 'get_dns_addr_resolv_conf':
slirp/slirp.c:202:29: error: initialization discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers]
                 char *res = inet_ntop(af, tmp_addr, s, sizeof(s));
                             ^~~~~~~~~
slirp/slirp.c:204:25: error: assignment discards 'const' qualifier from pointer target type [-Werror=discarded-qualifiers]
                     res = "(string conversion error)";

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2017-03-29 00:49:04 +02:00
Eduardo Habkost
d4a606b38b i386: Don't override -cpu options on -cpu host/max
The existing code for "host" and "max" CPU models overrides every
single feature in the CPU object at realize time, even the ones
that were explicitly enabled or disabled by the user using
"feat=on" or "feat=off", while features set using +feat/-feat are
kept.

This means "-cpu host,+invtsc" works as expected, while
"-cpu host,invtsc=on" doesn't.

This was a known bug, already documented in a comment inside
x86_cpu_expand_features(). What makes this bug worse now is that
libvirt 3.0.0 and newer now use "feat=on|off" instead of
+feat/-feat when it detects a QEMU version that supports it (see
libvirt commit d47db7b16dd5422c7e487c8c8ee5b181a2f9cd66).

Change the feature property getter/setter to set a
env->user_features field, to keep track of features that were
explicitly changed using QOM properties. Then make the
max_features code not override user features when handling "-cpu
host" and "-cpu max".

This will also allow us to remove the plus_features/minus_features
hack in the future, but I plan to do that after 2.9.0 is
released.

Reported-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170327144815.8043-3-ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-03-28 16:41:10 -03:00
Eduardo Habkost
a7b0ffacc1 i386: Replace uint32_t* with FeatureWord on feature getter/setter
Instead of passing a pointer to the feature property getter and
setter functions, pass a FeatureWord enum so they can perform
other actions related to the feature flag.

This will be used to add a new "user_features" field to keep
track of features that were explicitly set by the user.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170327144815.8043-2-ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-03-28 16:40:50 -03:00
Peter Maydell
df90463632 Update version for v2.9.0-rc2 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-28 19:11:16 +01:00
Peter Maydell
a634bbbafc Merge remote-tracking branch 'remotes/armbru/tags/pull-misc-2017-03-28' into staging
Miscellaneous patches for 2017-03-28

# gpg: Signature made Tue 28 Mar 2017 17:51:06 BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-misc-2017-03-28:
  sockets: Fix socket_address_to_string() hostname truncation

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-28 18:37:32 +01:00
Markus Armbruster
44fdc76455 sockets: Fix socket_address_to_string() hostname truncation
We first snprintf() to a fixed buffer, then g_strdup() the result
*boggle*.

Worse, the size of the fixed buffer INET6_ADDRSTRLEN + 5 + 4 is bogus:
the 4 correctly accounts for '[', ']', ':' and '\0', but
INET6_ADDRSTRLEN is not a suitable limit for inet->host, and 5 is not
one for inet->port!  They are for host and port in *numeric* form
(exploiting that INET6_ADDRSTRLEN > INET_ADDRSTRLEN), but inet->host
can also be a hostname, and inet->port can be a service name, to be
resolved with getaddrinfo().

Fortunately, the only user so far is the "socket" network backend's
net_socket_connected(), which uses it to initialize a NetSocketState's
info_str[].  info_str[] has considerable more space: 256 instead of
55.  So the bug's impact appears to be limited to truncated "info
networks" with the "socket" network backend.

The fix is obvious: use g_strdup_printf().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490268208-23368-1-git-send-email-armbru@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-28 18:50:38 +02:00
Peter Maydell
b8dc35b252 Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Tue 28 Mar 2017 15:22:59 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request:
  trace: fix tcg tracing build breakage

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-28 17:20:11 +01:00
Peter Maydell
aba0fb1e2e Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging
# gpg: Signature made Tue 28 Mar 2017 15:02:40 BST
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  rbd: Fix bugs around -drive parameter "server"
  rbd: Revert -blockdev parameter password-secret
  rbd: Revert -blockdev and -drive parameter auth-supported
  rbd: Clean up qemu_rbd_create()'s detour through QemuOpts
  rbd: Clean up runtime_opts, fix -drive to reject filename
  rbd: Don't accept -drive driver=rbd, keyvalue-pairs=...
  rbd: Clean up after the previous commit
  rbd: Don't limit length of parameter values
  rbd: Fix to cleanly reject -drive without pool or image
  rbd: Reject -blockdev server.*.{numeric, to, ipv4, ipv6}

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-28 15:56:05 +01:00
Markus Armbruster
2836284db6 rbd: Fix bugs around -drive parameter "server"
qemu_rbd_open() takes option parameters as a flattened QDict, with
keys of the form server.%d.host, server.%d.port, where %d counts up
from zero.

qemu_rbd_array_opts() extracts these values as follows.  First, it
calls qdict_array_entries() to find the list's length.  For each list
element, it formats the list's key prefix (e.g. "server.0."), then
creates a new QDict holding the options with that key prefix, then
converts that to a QemuOpts, so it can finally get the member values
from there.

If there's one surefire way to make code using QDict more awkward,
it's creating more of them and mixing in QemuOpts for good measure.

The extraction of keys starting with server.%d into another QDict
makes us ignore parameters like server.0.neither-host-nor-port
silently.

The conversion to QemuOpts abuses runtime_opts, as described a few
commits ago.

Rewrite to simply get the values straight from the options QDict.

Fixes -drive not to crash when server.*.* are present, but
server.*.host is absent.

Fixes -drive to reject invalid server.*.*.

Permits cleaning up runtime_opts.  Do that, and fix -drive to reject
bogus parameters host and port instead of silently ignoring them.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-11-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 10:01:21 -04:00
Markus Armbruster
577d8c9a81 rbd: Revert -blockdev parameter password-secret
This reverts a part of commit 8a47e8e.  We're having second thoughts
on the QAPI schema (and thus the external interface), and haven't
reached consensus, yet.  Issues include:

* BlockdevOptionsRbd member @password-secret isn't actually a
  password, it's a key generated by Ceph.

* We're not sure where member @password-secret belongs (see the
  previous commit).

* How @password-secret interacts with settings from a configuration
  file specified with @conf is undocumented.

Let's avoid painting ourselves into a corner now, and revert the
feature for 2.9.

Note that users can still configure an authentication key with a
configuration file.  They probably do that anyway if they use Ceph
outside QEMU as well.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-10-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 10:01:21 -04:00
Markus Armbruster
464444fcc1 rbd: Revert -blockdev and -drive parameter auth-supported
This reverts half of commit 0a55679.  We're having second thoughts on
the QAPI schema (and thus the external interface), and haven't reached
consensus, yet.  Issues include:

* The implementation uses deprecated rados_conf_set() key
  "auth_supported".  No biggie.

* The implementation makes -drive silently ignore invalid parameters
  "auth" and "auth-supported.*.X" where X isn't "auth".  Fixable (in
  fact I'm going to fix similar bugs around parameter server), so
  again no biggie.

* BlockdevOptionsRbd member @password-secret applies only to
  authentication method cephx.  Should it be a variant member of
  RbdAuthMethod?

* BlockdevOptionsRbd member @user could apply to both methods cephx
  and none, but I'm not sure it's actually used with none.  If it
  isn't, should it be a variant member of RbdAuthMethod?

* The client offers a *set* of authentication methods, not a list.
  Should the methods be optional members of BlockdevOptionsRbd instead
  of members of list @auth-supported?  The latter begs the question
  what multiple entries for the same method mean.  Trivial question
  now that RbdAuthMethod contains nothing but @type, but less so when
  RbdAuthMethod acquires other members, such the ones discussed above.

* How BlockdevOptionsRbd member @auth-supported interacts with
  settings from a configuration file specified with @conf is
  undocumented.  I suspect it's untested, too.

Let's avoid painting ourselves into a corner now, and revert the
feature for 2.9.

Note that users can still configure authentication methods with a
configuration file.  They probably do that anyway if they use Ceph
outside QEMU as well.

Further note that this doesn't affect use of key "auth-supported" in
-drive file=rbd:...:key=value.

qemu_rbd_array_opts()'s parameter @type now must be RBD_MON_HOST,
which is silly.  This will be cleaned up shortly.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-9-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 10:01:21 -04:00
Markus Armbruster
078463977a rbd: Clean up qemu_rbd_create()'s detour through QemuOpts
The conversion from QDict to QemuOpts is pointless.  Simply get the
stuff straight from the QDict.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-8-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 10:00:57 -04:00
Markus Armbruster
cbf036b4f3 rbd: Clean up runtime_opts, fix -drive to reject filename
runtime_opts is used for three different purposes:

* qemu_rbd_open() uses it to accept options it recognizes, such as
  "pool" and "image".  Other .bdrv_open() methods do it similarly.

* qemu_rbd_open() accepts additional list-valued options
  auth-supported and server, with the help of qemu_rbd_array_opts().
  The list elements are again dictionaries.  qemu_rbd_array_opts()
  uses runtime_opts to accept their members.  Thus, runtime_opts
  contains recognized sub-sub-options "auth", "host", "port" in
  addition to recognized options.  No other block driver does that.

* qemu_rbd_create() uses it to convert the QDict produced by
  qemu_rbd_parse_filename() to QemuOpts.  No other block driver does
  that.  The keys produced by qemu_rbd_parse_filename() are "pool",
  "image", "snapshot", "conf", "user" and "keyvalue-pairs".
  qemu_rbd_open() accepts these, so no additional ones here.

This is a confusing mess.  Dates back to commit 0f9d252.  First step
to clean it up is documenting runtime_opts.desc[]:

* Reorder entries to match the QAPI schema, like we do in other block
  drivers.

* Document why the schema's "server" and "auth-supported" aren't in
  .desc[].

* Document why "keyvalue-pairs", "host", "port" and "auth" are in
  .desc[], but not the schema.

* Delete "filename", because none of the three users actually uses it.
  This fixes -drive to reject parameter filename instead of silently
  ignoring it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-7-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 09:53:16 -04:00
Markus Armbruster
82f20e8547 rbd: Don't accept -drive driver=rbd, keyvalue-pairs=...
The way we communicate extra key-value pairs from
qemu_rbd_parse_filename() to qemu_rbd_open() exposes option parameter
"keyvalue-pairs" on the command line.  It's not wanted there.  Hack:
rename the parameter to "=keyvalue-pairs" to make it inaccessible.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-6-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 09:53:16 -04:00
Markus Armbruster
8efb339dd4 rbd: Clean up after the previous commit
This code in qemu_rbd_parse_filename()

    found_str = qemu_rbd_next_tok(p, '\0', &p);
    p = found_str;

has no effect.  Drop it, and simplify qemu_rbd_next_tok().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-5-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 09:53:16 -04:00
Markus Armbruster
730b00bbfd rbd: Don't limit length of parameter values
We laboriously enforce that parameter values are between one and some
arbitrary limit in length.  Only RBD_MAX_IMAGE_NAME_SIZE comes from
librbd.h, and I'm not sure it applies.  Where the other limits come
from is unclear.

Drop the length checking.  The limits librbd actually imposes must be
checked by librbd anyway.

There's one minor complication: BDRVRBDState member name is a
fixed-size array.  Depends on the length limit.  Make it a pointer to
a dynamically allocated string.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-4-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 09:53:16 -04:00
Markus Armbruster
f51c363c2b rbd: Fix to cleanly reject -drive without pool or image
qemu_rbd_open() neglects to check pool and image are present.  Missing
image is caught by rbd_open(), but missing pool crashes.  Reproducer:

    $ qemu-system-x86_64 -nodefaults -drive driver=rbd,id=rbd,image=i,...
    terminate called after throwing an instance of 'std::logic_error'
      what():  basic_string::_M_construct null not valid
    Aborted (core dumped)

where ... is a working server.0.{host,port} configuration.

Doesn't affect -drive with file=..., because qemu_rbd_parse_filename()
always sets both pool and image.

Doesn't affect -blockdev, because pool and image are mandatory in the
QAPI schema.

Fix by adding the missing checks.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-3-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 09:53:16 -04:00
Markus Armbruster
eb87203b64 rbd: Reject -blockdev server.*.{numeric, to, ipv4, ipv6}
We use InetSocketAddress in the QAPI schema.  However, the code
doesn't use inet_connect_saddr(), but formats "host" and "port" into a
configuration string for rados_conf_set().  Thus, members "numeric",
"to", "ipv4" and "ipv6" are silently ignored.  Not nice.  Example:

    -blockdev rbd,node-name=nn,pool=p,image=i,server.0.host=h0,server.0.port=12345,server.0.ipv4=off

Factor a suitable InetSocketAddressBase out of InetSocketAddress, and
use that.  "numeric", "to", "ipv4" and "ipv6" are now rejected.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490691368-32099-2-git-send-email-armbru@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-28 09:53:16 -04:00
Peter Maydell
4d2bee82f4 Merge remote-tracking branch 'remotes/armbru/tags/pull-block-2017-03-28' into staging
Block patches for 2017-03-28

# gpg: Signature made Tue 28 Mar 2017 14:41:37 BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-block-2017-03-28:
  block: Declare blockdev-add and blockdev-del supported

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-28 14:48:07 +01:00
Markus Armbruster
79b7a77eda block: Declare blockdev-add and blockdev-del supported
It's been a long journey, but here we are.

The supported blockdev-add is not compatible to its experimental
predecessors; bump all Since: tags to 2.9.

x-blockdev-remove-medium, x-blockdev-insert-medium and
x-blockdev-change need a bit more work, so leave them alone for now.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-28 15:23:23 +02:00
Peter Maydell
0491c22154 Merge remote-tracking branch 'remotes/stsquad/tags/pull-mttcg-fixups-for-rc2-280317-1' into staging
MTTCG regression fixes for rc2

# gpg: Signature made Tue 28 Mar 2017 10:54:38 BST
# gpg:                using RSA key 0xFBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>"
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8  DF35 FBD0 DB09 5A9E 2A44

* remotes/stsquad/tags/pull-mttcg-fixups-for-rc2-280317-1:
  replay/replay.c: bump REPLAY_VERSION
  tcg: Add a new line after incompatibility warning
  ui/console: use exclusive mechanism directly
  ui/console: ensure do_safe_dpy_refresh holds BQL
  bsd-user: align use of mmap_lock to that of linux-user
  user-exec: handle synchronous signals from QEMU gracefully

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-28 12:34:23 +01:00
Peter Maydell
142b9ca51d Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Tue 28 Mar 2017 11:07:02 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  parallels: wrong call to bdrv_truncate

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-28 11:10:36 +01:00
Stefan Hajnoczi
7609ffb919 trace: fix tcg tracing build breakage
Commit 0ab8ed18a6 ("trace: switch to
modular code generation for sub-directories") forgot to convert "tcg"
trace events to the modular code generation approach where each
sub-directory has its own trace-events file.

This patch fixes compilation for "tcg" trace events.  Currently they are
only used in the root ./trace-events file.

"tcg" trace events can only be used in the root ./trace-events file for
the time being.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170327131718.18268-1-stefanha@redhat.com
Suggested-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-28 11:07:46 +01:00
Denis V. Lunev
dc62da88b5 parallels: wrong call to bdrv_truncate
Parallels driver should not call bdrv_truncate if the image was opened
in the read-only mode. Without the patch
    qemu-img check harddisk.hds
asserts with
    bdrv_truncate: Assertion `child->perm & BLK_PERM_RESIZE' failed.

Parameters used on the write path are not needed if the image is opened
in the read-only mode.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reported-by: Edgar Kaziahmedov <edos@virtuozzo.mipt.ru>
Message-id: 1490625488-7980-1-git-send-email-den@openvz.org
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-28 11:06:00 +01:00
Alex Bennée
5b12c163c8 replay/replay.c: bump REPLAY_VERSION
A previous commit (3d4d16f4) added support for audio record/playback.
However this breaks the logfile ABI due to the re-ordering of the
ReplayEvents enum. The REPLAY_VERSION check is meant to prevent you
from using old log files in newer QEMUs but this is currently broken.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-28 10:52:50 +01:00
Pranith Kumar
8cfef89271 tcg: Add a new line after incompatibility warning
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-28 10:52:50 +01:00
Alex Bennée
0096109052 ui/console: use exclusive mechanism directly
The previous commit (8bb93c6f99) using async_safe_run_on_cpu() doesn't
work on graphics sub-system which restrict which threads can do GUI
updates. Rather the special casing MacOS we just directly call the
helper and move all the exclusive handling into do_dafe_dpy_refresh().

The unfortunate bouncing of the BQL is to ensure there is no deadlock
as vCPUs waiting on the BQL are kicked into their quiescent state.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-03-28 10:52:45 +01:00
Alex Bennée
8539093919 ui/console: ensure do_safe_dpy_refresh holds BQL
I missed the fact that when an exclusive work item runs it drops the
BQL to ensure all no vCPUs are stuck waiting for it, hence causing a
deadlock. However the actual helper needs to take the BQL especially
as we'll be messing with device emulation bits during the update which
all assume BQL is held.

We make a minor cpu_reloading_memory_map which must try and unlock the
RCU if we are actually outside the running context.

Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
2017-03-28 10:52:24 +01:00
Alex Bennée
95992b674c bsd-user: align use of mmap_lock to that of linux-user
The introduction of stricter mmap_lock checking in translate-all broke
the BSD user build. The working mmap_lock functions were hidden behind
CONFIG_USE_NPTL which is never defined. This patch brings them inline
with linux-user.

Despite the disapearence of the comment "We aren't threadsafe to start
with..." this doesn't make bsd-user so. It will still need the rest of
the fixes that have been done in linux-user ported over.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-28 10:50:40 +01:00
Alex Bennée
02bed6bd5f user-exec: handle synchronous signals from QEMU gracefully
When "tcg: enable thread-per-vCPU" (commit 3725794) was merged the
lifetime of current_cpu was changed. Previously a broken linux-user
call might abort() which can eventually escalate into a SIGSEGV which
would then crash qemu as it attempted to deref a NULL current_cpu.
After commit 3725794 it would attempt to fixup state and re-start the
run-loop and much hilarity (i.e. a looping lockup) would ensue from
jumping into a stale jmp_env.

As we can actually tell if we are in the run-loop from looking at the
cpu->running flag we should catch this badness first and abort()
cleanly rather than try to soldier on. There is a theoretical race
between the flag being set and sigsetjmp refreshing the jump buffer
but we can try really hard to not introduce crashes into that code.

[LV: setgroups03 fails on powerpc LTP]
Reported-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-28 10:50:35 +01:00
Peter Maydell
8c9ee217f0 Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
This series fixes potential memory/fd leaks in 9pfs and a crash when
running tests/virtio-9p-test on SPARC hosts.

# gpg: Signature made Tue 28 Mar 2017 09:44:05 BST
# gpg:                using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg:                 aka "Greg Kurz <groug@free.fr>"
# gpg:                 aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg:                 aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg:                 aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894  DBA2 02FC 3AEB 0101 DBC2

* remotes/gkurz/tags/for-upstream:
  tests/virtio-9p-test: Don't call le*_to_cpus on fields of packed struct
  9pfs: fix file descriptor leak

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-28 09:48:23 +01:00
Peter Maydell
34ef723ce3 tests/virtio-9p-test: Don't call le*_to_cpus on fields of packed struct
For a packed struct like 'P9Hdr' the fields within it may not be
aligned as much as the natural alignment for their types.  This means
it is not valid to pass the address of such a field to a function
like le32_to_cpus() which operate on uint32_t* and assume alignment.
Doing this results in a SIGBUS on hosts like SPARC which have strict
alignment requirements.

Use ldl_le_p() instead, which is specified to correctly handle
unaligned pointers.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
2017-03-27 21:15:31 +02:00
Li Qiang
d63fb193e7 9pfs: fix file descriptor leak
The v9fs_create() and v9fs_lcreate() functions are used to create a file
on the backend and to associate it to a fid. The fid shouldn't be already
in-use, otherwise both functions may silently leak a file descriptor or
allocated memory. The current code doesn't check that.

This patch ensures that the fid isn't already associated to anything
before using it.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
(reworded the changelog, Greg Kurz)
Signed-off-by: Greg Kurz <groug@kaod.org>
2017-03-27 21:13:19 +02:00
Peter Maydell
eb06c9e2d3 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* MTTCG fix for win32
* virtio-scsi assertion failure
* mem-prealloc coverity fix
* x86 migration revert which requires more thought
* x86 instruction limit (avoids >2 page translation blocks)
* nbd dead code cleanup
* small memory.c logic fix

# gpg: Signature made Mon 27 Mar 2017 17:03:04 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  scsi-generic: Fill in opt_xfer_len in INQUIRY reply if it is zero
  Revert "apic: save apic_delivered flag"
  nbd: drop unused NBDClientSession.is_unix field
  win32: replace custom mutex and condition variable with native primitives
  mem-prealloc: fix sysconf(_SC_NPROCESSORS_ONLN) failure case.
  tcg/i386: Check the size of instruction being translated
  virtio-scsi: Fix acquire/release in dataplane handlers
  virtio-scsi: Make virtio_scsi_acquire/release public
  clear pending status before calling memory commit

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-27 17:34:50 +01:00
Peter Maydell
9366f53d50 Merge remote-tracking branch 'remotes/maxreitz/tags/pull-block-2017-03-27' into staging
Block patches for 2.9-rc2.

# gpg: Signature made Mon 27 Mar 2017 16:47:54 BST
# gpg:                using RSA key 0xF407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* remotes/maxreitz/tags/pull-block-2017-03-27:
  block/file-posix.c: Fix unused variable warning on OpenBSD
  file-posix: Make bdrv_flush() failure permanent without O_DIRECT
  nbd-client: fix handling of hungup connections
  qemu-img: print short help on getopt failure
  qemu-img: fix switch indentation in img_amend()
  qemu-img: show help for invalid global options

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-27 16:56:31 +01:00
Peter Maydell
700f9ce0f9 block/file-posix.c: Fix unused variable warning on OpenBSD
On OpenBSD none of the ioctls probe_logical_blocksize() tries
exist, so the variable sector_size is unused. Refactor the
code to avoid this (and reduce the duplicated code).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490279788-12995-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-03-27 17:28:34 +02:00
Peter Maydell
a001631cb8 Merge remote-tracking branch 'remotes/kraxel/tags/pull-fixes-20170327-1' into staging
fixes for 2.9: vga, egl, cirrus, virtio-input.

# gpg: Signature made Mon 27 Mar 2017 14:19:45 BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-fixes-20170327-1:
  vnc: fix reverse mode
  ui/egl-helpers: fix egl 1.5 display init
  cirrus: fix PUTPIXEL macro
  virtio-input: fix eventq batching
  virtio-input: free event queue when finalizing

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-27 16:15:29 +01:00
Fam Zheng
bed58b4443 scsi-generic: Fill in opt_xfer_len in INQUIRY reply if it is zero
When opt_xfer_len is zero, Linux ignores max_xfer_len erroneously.

While that obviously should be fixed, we do older guests a favor to
always filling in a value.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170327142625.1249-1-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-27 17:02:07 +02:00
Kevin Wolf
e5bcf967fb file-posix: Make bdrv_flush() failure permanent without O_DIRECT
Success for bdrv_flush() means that all previously written data is safe
on disk. For fdatasync(), the best semantics we can hope for on Linux
(without O_DIRECT) is that all data that was written since the last call
was successfully written back. Therefore, and because we can't redo all
writes after a flush failure, we have to give up after a single
fdatasync() failure. After this failure, we would never be able to make
the promise that a successful bdrv_flush() makes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 20170322210005.16533-1-kwolf@redhat.com
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-03-27 16:53:42 +02:00
Paolo Bonzini
a12a712a7d nbd-client: fix handling of hungup connections
After the switch to reading replies in a coroutine, nothing is
reentering pending receive coroutines if the connection hangs.
Move nbd_recv_coroutines_enter_all to the reply read coroutine,
which is the place where hangups are detected.  nbd_teardown_connection
can simply wait for the reply read coroutine to detect the hangup
and clean up after itself.

This wouldn't be enough though because nbd_receive_reply returns 0
(rather than -EPIPE or similar) when reading from a hung connection.
Fix the return value check in nbd_read_reply_entry.

This fixes qemu-iotests 083.

Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170314111157.14464-1-pbonzini@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-03-27 16:50:36 +02:00
Stefan Hajnoczi
c919297379 qemu-img: print short help on getopt failure
Printing the full help output obscures the error message for an invalid
command-line option or missing argument.

Before this patch:

  $ ./qemu-img --foo
  ...pages of output...

After this patch:

  $ ./qemu-img --foo
  qemu-img: unrecognized option '--foo'
  Try 'qemu-img --help' for more information

This patch adds the getopt ':' character so that it can distinguish
between missing arguments and unrecognized options.  This helps provide
more detailed error messages.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170317104541.28979-4-stefanha@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-03-27 16:50:36 +02:00
Stefan Hajnoczi
f7077624b0 qemu-img: fix switch indentation in img_amend()
QEMU coding style indents 'case' to the same level as the 'switch'
statement:

  switch (foo) {
  case 1:

Fix this coding style violation so checkpatch.pl doesn't complain about
the next patch.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170317104541.28979-3-stefanha@redhat.com
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-03-27 16:50:36 +02:00
Stefan Hajnoczi
4581c16fce qemu-img: show help for invalid global options
The qemu-img sub-command executes regardless of invalid global options:

  $ qemu-img --foo info test.img
  qemu-img: unrecognized option '--foo'
  image: test.img
  ...

The unrecognized option warning may be missed by the user.  This can
hide incorrect command-lines in scripts and confuse users.

This patch prints the help information and terminates instead of
executing the sub-command.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170317104541.28979-2-stefanha@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-03-27 16:50:36 +02:00
Paolo Bonzini
5354edd286 Revert "apic: save apic_delivered flag"
This reverts commit 07bfa35477.
The global variable is only read as part of a

            apic_reset_irq_delivered();
            qemu_irq_raise(s->irq);
            if (!apic_get_irq_delivered()) {

sequence, so the value never matters at migration time.

Reported-by: Dr. David Alan Gilbert <dglibert@redhat.com>
Cc: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-27 14:41:01 +02:00
Stefan Hajnoczi
e4548bb640 nbd: drop unused NBDClientSession.is_unix field
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20170327123223.1199-1-stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-27 14:41:01 +02:00
Andrey Shedel
12f8def0e0 win32: replace custom mutex and condition variable with native primitives
The multithreaded TCG implementation exposed deadlocks in the win32
condition variables: as implemented, qemu_cond_broadcast waited on
receivers, whereas the pthreads API it was intended to emulate does
not. This was causing a deadlock because broadcast was called while
holding the IO lock, as well as all possible waiters blocked on the
same lock.

This patch replaces all the custom synchronisation code for mutexes
and condition variables with native Windows primitives (SRWlocks and
condition variables) with the same semantics as their POSIX
equivalents. To enable that, it requires a Windows Vista or newer host
OS.

Signed-off-by: Andrey Shedel <ashedel@microsoft.com>
[AB: edited commit message]
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-Id: <20170324220141.10104-1-Andrew.Baumann@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-27 14:41:01 +02:00
Gerd Hoffmann
e5766eb404 vnc: fix reverse mode
vnc server in reverse mode (qemu -vnc localhost:$nr,reverse) interprets
$nr as display number (i.e. with 5900 offset) in recent qemu versions.
Historical and documented behavior is interpreting $nr as port number
though. So we should bring code and documentation in line.

Given that default listening port for viewers is 5500 the 5900 offset is
pretty inconvinient, because it is simply impossible to connect to port
5500.  So, lets fix the code not the docs.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1489480018-11443-1-git-send-email-kraxel@redhat.com
2017-03-27 12:16:02 +02:00
Gerd Hoffmann
8bce03e393 ui/egl-helpers: fix egl 1.5 display init
Unfortunaly switching to getPlatformDisplayEXT isn't as easy as
implemented by 0ea1523fb6.  See the
longish comment for the complete story.

Cc: Frediano Ziglio <fziglio@redhat.com>
Suggested-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1489997042-1824-1-git-send-email-kraxel@redhat.com
2017-03-27 12:14:45 +02:00
Gerd Hoffmann
db6cd4c855 cirrus: fix PUTPIXEL macro
Should be "c" not "col".  The macro is used with "col" as third parameter
everywhere, so this tyops doesn't break something.

Fixes: 026aeffcb4
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 1490168303-24588-1-git-send-email-kraxel@redhat.com
2017-03-27 12:14:45 +02:00
Ladi Prosek
57094547df virtio-input: fix eventq batching
virtio_input_send buffers input events until it sees a SYNC. Then it
either sends or drops the entire batch, depending on whether eventq
has enough space available. The case to avoid here is partial sends
where only part of the batch would get to the guest.

Using virtqueue_get_avail_bytes to check the state of eventq was not
correct. The queue may have a smaller number of larger buffers
available so bytes may be enough but the batch would still not be
possible to send, leading to the "Huh?  No vq elem available" error.

Instead of checking available bytes, this patch optimistically pops
buffers from the queue and puts them back in case it runs out of
space and the batch needs to be dropped.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Message-id: 1490365490-4854-3-git-send-email-lprosek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-03-27 12:14:45 +02:00
Ladi Prosek
0f5a15e40a virtio-input: free event queue when finalizing
VirtIOInput.queue was never freed. This commit adds an explicit
g_free to virtio_input_finalize and switches the allocation
function from realloc to g_realloc in virtio_input_send.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Message-id: 1490365490-4854-2-git-send-email-lprosek@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-03-27 12:14:44 +02:00
Peter Maydell
ea2afcf5b6 Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Fri 24 Mar 2017 14:08:41 GMT
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request:
  trace: Avoid abuse of amdvi_mmio_read
  trace: Fix incorrect megasas trace parameters
  trace: Fix backwards mirror_yield parameters

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-24 14:14:18 +00:00
Christian Borntraeger
7150d34a1d boot-serial-test: use -no-shutdown
a qemu with an empty s390 guest will exit very quickly. This races
against the testsuite reading from the console pipe leading to
intermittent test suite failures. Using -no-shutdown will keep
the guest running.

Fixes: 864111f422 (vl: exit qemu on guest panic if -no-shutdown is not set)
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1490361570-288658-1-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-24 13:39:50 +00:00
Jitendra Kolhe
dfd0dcc717 mem-prealloc: fix sysconf(_SC_NPROCESSORS_ONLN) failure case.
This was spotted by Coverity, in case where sysconf(_SC_NPROCESSORS_ONLN)
fails and returns -1. This results in memset_num_threads getting set to -1.
Which we then pass to g_new0().
The patch replaces MAX_MEM_PREALLOC_THREAD_COUNT macro with a function call
get_memset_num_threads() to handle sysconf() failure gracefully. In case
sysconf() fails, we fall back to single threaded.

(Spotted by Coverity, CID 1372465.)

Signed-off-by: Jitendra Kolhe <jitendra.kolhe@hpe.com>
Message-Id: <1490079006-32495-1-git-send-email-jitendra.kolhe@hpe.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-24 11:50:11 +01:00
Pranith Kumar
30663fd26c tcg/i386: Check the size of instruction being translated
This fixes the bug: 'user-to-root privesc inside VM via bad translation
caching' reported by Jann Horn here:
https://bugs.chromium.org/p/project-zero/issues/detail?id=1122

Reviewed-by: Richard Henderson <rth@twiddle.net>
CC: Peter Maydell <peter.maydell@linaro.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20170323175851.14342-1-bobby.prani@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-24 11:49:38 +01:00
Fam Zheng
7140778605 virtio-scsi: Fix acquire/release in dataplane handlers
After the AioContext lock push down, there is a race between
virtio_scsi_dataplane_start and those "assert(s->ctx &&
s->dataplane_started)", because the latter doesn't isn't wrapped in
aio_context_acquire.

Reproducer is simply booting a Fedora guest with an empty
virtio-scsi-dataplane controller:

    qemu-system-x86_64 \
      -drive if=none,id=root,format=raw,file=Fedora-Cloud-Base-25-1.3.x86_64.raw \
      -device virtio-scsi \
      -device scsi-disk,drive=root,bootindex=1 \
      -object iothread,id=io \
      -device virtio-scsi-pci,iothread=io \
      -net user,hostfwd=tcp::10022-:22 -net nic,model=virtio -m 2048 \
      --enable-kvm

Fix this by moving acquire/release pairs from virtio_scsi_handle_*_vq to
their callers - and wrap the broken assertions in.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170317061447.16243-3-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-24 11:49:03 +01:00
Fam Zheng
3d69f82161 virtio-scsi: Make virtio_scsi_acquire/release public
They will be used in virtio-scsi-dataplane.c as well, so move them to
header.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170317061447.16243-2-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-24 11:48:58 +01:00
Xu, Anthony
ade9c1aac5 clear pending status before calling memory commit
clear pending status before calling memory commit.
Otherwise when memory_region_finalize is called,
memory_region_transaction_depth is 0 and
memory_region_update_pending is true.
That's wrong.

Signed-off -by: Anthony Xu <anthony.xu@intel.com>

Message-Id: <4712D8F4B26E034E80552F30A67BE0B1A2E3D5@ORSMSX112.amr.corp.intel.com>

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-24 11:48:48 +01:00
Peter Maydell
bd517b4337 disas/microblaze: Remove unused REG_PC define
The REG_PC define in disas/microblaze.c clashes with a define in
the Linux SPARC system headers:

/home/pm215/qemu/disas/microblaze.c:162:0: error: "REG_PC" redefined [-Werror]
 #define REG_PC  32 /* PC */

In file included from /usr/include/signal.h:326:0,
                 from /home/pm215/qemu/include/qemu/osdep.h:86,
                 from /home/pm215/qemu/disas/microblaze.c:36:
/usr/include/sparc64-linux-gnu/sys/ucontext.h:96:0: note: this is the location of the previous definition
 #define REG_PC  (1)

Since the code doesn't actually use the REG_PC define
anywhere, the simplest fix is just to remove it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1490272961-1128-1-git-send-email-peter.maydell@linaro.org
2017-03-24 10:12:23 +00:00
Eric Blake
0d3ef78829 trace: Avoid abuse of amdvi_mmio_read
hw/i386/trace-events has an amdvi_mmio_read trace that is used for
both normal reads (listing the register name, address, size, and
offset) and for an error case (abusing the register name to show
an error message, the address to show the maximum value supported,
then shoehorning address and size into the size and offset
parameters).  The change from a wide address to a narrower size
parameter could truncate a (rather-large) bogus read attempt, so
it's better to create a separate dedicated trace with correct types,
rather than abusing the trace mechanism.  Broken since its
introduction in commit d29a09c.

[Change trace event argument type from hwaddr to uint64_t since
user-defined types should not be used for trace events.  This fixes a
build failure with LTTng UST.
--Stefan]

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-24 09:21:42 +00:00
Eric Blake
d17e744848 trace: Fix incorrect megasas trace parameters
hw/scsi/trace-events lists cmd as the first parameter for both
megasas_iovec_overflow and megasas_iovec_underflow, but the caller
was mistakenly passing cmd->iov_size twice instead of the command
index.  Also, trace_megasas_abort_invalid is called with parameters
in the wrong order.  Broken since its introduction in commit
e8f943c3.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-24 09:21:42 +00:00
Eric Blake
67adf4b398 trace: Fix backwards mirror_yield parameters
block/trace-events lists the parameters for mirror_yield
consistently with other mirror events (cnt just after s, like in
mirror_before_sleep; in_flight last, like in mirror_yield_in_flight).
But the callers were passing parameters in the wrong order, leading
to poor trace messages, including type truncation when there are
more than 4G dirty sectors involved.  Broken since its introduction
in commit bd48bde.

While touching this, ensure that all callers use the same type
(uint64_t) for cnt, as a later patch will enable the compiler to do
stricter type-checking.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-24 09:21:42 +00:00
Eric Blake
0832970119 qom: Fix regression with 'qom-type'
Commit 9a6d1ac assumed that 'qom-type' could be removed from QemuOpts
with no ill effects.  However, this command line proves otherwise:

$ ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio \
  -object rng-random,filename=/dev/urandom,id=rng0 \
  -device virtio-rng-pci,rng=rng0
qemu-system-x86_64: -object rng-random,filename=/dev/urandom,id=rng0: Parameter 'qom-type' is missing

Fix the regression by restoring qom-type in opts after its temporary
removal that was needed for the duration of user_creatable_add_opts().

Reported-by: Richard W. M. Jones <rjones@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Message-id: 20170323160315.19696-1-eblake@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-23 17:59:40 +00:00
Peter Maydell
c50126aac1 configure: Fix cut-n-paste errors in OS deprecation warning
Fix some cut-and-paste errors in the OS deprecation warning
pointed out by Thomas Huth.

Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1490119729-26206-1-git-send-email-peter.maydell@linaro.org
2017-03-23 17:57:49 +00:00
Peter Maydell
032e95af53 Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170323' into staging
ppc patch queue for 2017-03-23

Just a single bugfix in this batch.  It's not strictly in ppc code,
though it's for the pseries machine's benefit.  Eduardo suggested it
go through my tree however.

# gpg: Signature made Thu 23 Mar 2017 10:09:17 GMT
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.9-20170323:
  numa,spapr: align default numa node memory size to 256MB

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-23 15:21:28 +00:00
Peter Maydell
21c84c91f7 Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20170323' into staging
Fix linux-user vs. cpu models.

# gpg: Signature made Thu 23 Mar 2017 09:56:13 GMT
# gpg:                using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0  18CE DECF 6B93 C6F0 2FAF

* remotes/cohuck/tags/s390x-20170323:
  target/s390x: Fix broken user mode

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-23 14:51:10 +00:00
Peter Maydell
b79fbb2d70 Merge remote-tracking branch 'remotes/gonglei/tags/cryptodev-next-20170323' into staging
cryptodev fixes

# gpg: Signature made Thu 23 Mar 2017 09:22:44 GMT
# gpg:                using RSA key 0x2ED7FDE9063C864D
# gpg: Good signature from "Gonglei <arei.gonglei@huawei.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 3EF1 8E53 3459 E6D1 963A  3C05 2ED7 FDE9 063C 864D

* remotes/gonglei/tags/cryptodev-next-20170323:
  cryptodev: fix asserting single queue
  cryptodev: setiv only when really need

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-23 13:43:32 +00:00
Peter Maydell
d81d857f44 Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2017-03-22-v3' into staging
QAPI patches for 2017-03-22

# gpg: Signature made Wed 22 Mar 2017 18:25:15 GMT
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-qapi-2017-03-22-v3:
  qapi: Fix QemuOpts visitor regression on unvisited input
  qom: Avoid unvisited 'id'/'qom-type' in user_creatable_add_opts
  tests: Expose regression in QemuOpts visitor
  test-qobject-input-visitor: Cover visit_type_uint64()
  Revert "hostmem: fix QEMU crash by 'info memdev'"
  qapi: Fix string input visitor regression for empty lists
  qapi2texi: Fix translation of *strong* and _emphasized_
  tests/qapi-schema: Systematic positive doc comment tests
  tests/qapi-schema: Make test-qapi.py print docs again
  qapi: Drop unused QAPIDoc member optional
  qapi2texi: Fix to actually fail when 'doc-required' is false
  qapi: Drop excessive Make dependencies on qapi2texi.py
  MAINTAINERS: Add myself for files I touched recently
  keyval: Document issues with 'any' and alternate types
  test-keyval: Cover alternate and 'any' type
  keyval: Improve some comments
  test-keyval: Tweaks to improve list coverage

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-23 12:31:52 +00:00
Peter Maydell
e6ebebc204 Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging
# gpg: Signature made Wed 22 Mar 2017 17:28:56 GMT
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  blockjob: add devops to blockjob backends
  block-backend: add drained_begin / drained_end ops
  blockjob: add block_job_start_shim
  blockjob: avoid recursive AioContext locking

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-23 11:39:53 +00:00
Peter Maydell
2077cabcac Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pc: fixes

virtio and misc fixes for 2.9.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Wed 22 Mar 2017 16:29:50 GMT
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  hw/acpi/vmgenid: prevent more than one vmgenid device
  hw/acpi/vmgenid: prevent device realization on pre-2.5 machine types
  virtio: always use handle_aio_output if registered
  virtio: Fix error handling in virtio_bus_device_plugged

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-23 11:04:56 +00:00
Peter Maydell
ad3c6418c2 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Wed 22 Mar 2017 12:54:29 GMT
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  parallels: fix default options parsing

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-23 09:56:54 +00:00
Stefan Weil
a352aa62a7 target/s390x: Fix broken user mode
Returning NULL from get_max_cpu_model results in a SIGSEGV runtime error.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170130131517.8092-1-sw@weilnetz.de>
Cc: qemu-stable@nongnu.org
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-03-23 10:49:13 +01:00
Halil Pasic
b7bad50ae8 cryptodev: fix asserting single queue
We already check for queues == 1 in cryptodev_builtin_init and when that
is not true raise an error. But before that error is reported the
assertion in cryptodev_builtin_cleanup kicks in (because object is being
finalized and freed).

Let's remove assert(queues == 1) form cryptodev_builtin_cleanup as it
does only harm and no good.

Reported-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
2017-03-23 17:22:01 +08:00
Longpeng(Mike)
50d19cf368 cryptodev: setiv only when really need
ECB mode cipher doesn't need IV, if we setiv for it then qemu
crypto API would report "Expected IV size 0 not **", so we should
setiv only when the cipher algos really need.

Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
2017-03-23 17:22:01 +08:00
Paul Durrant
8f25e75441 xen: create wrappers for all other uses of xc_hvm_XXX() functions
This patch creates inline wrapper functions in xen_common.h for all open
coded calls to xc_hvm_XXX() functions outside of xen_common.h so that use
of xen_xc can be made implicit. This again is in preparation for the move
to using libxendevicemodel.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Anthony Perard <anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-03-22 11:47:39 -07:00
Paul Durrant
5100afb5f5 xen: rename xen_modified_memory() to xen_hvm_modified_memory()
This patch is a purely cosmetic change that avoids a name collision in
a subsequent patch.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Anthony Perard <anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-03-22 11:47:39 -07:00
Paul Durrant
260cabed71 xen: make use of xen_xc implicit in xen_common.h inlines
Doing this will make the transition to using the new libxendevicemodel
interface less intrusive on the callers of these functions, since using
the new library will require a change of handle.

NOTE: The patch also moves the 'externs' for xen_xc and xen_fmem from
      xen_backend.h to xen_common.h, and the declarations from
      xen_backend.c to xen-common.c, which is where they belong.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Anthony Perard <anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
2017-03-22 11:47:39 -07:00
Eric Blake
21f88d021d qapi: Fix QemuOpts visitor regression on unvisited input
An off-by-one in commit 15c2f669e meant that we were failing to
check for unparsed input in all QemuOpts visitors.  Recent testsuite
additions show that fixing the obvious bug with bogus fields will
also fix the case of an incomplete list visit; update the tests to
match the new behavior.

Simple testcase:

./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio -numa node,size=1g

failed to diagnose that 'size' is not a valid argument to -numa, and
now once again reports:

qemu-system-x86_64: -numa node,size=1g: Invalid parameter 'size'

See also https://bugzilla.redhat.com/show_bug.cgi?id=1434666

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170322144525.18964-4-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-03-22 19:24:34 +01:00
Eric Blake
9a6d1acb3e qom: Avoid unvisited 'id'/'qom-type' in user_creatable_add_opts
A regression in commit 15c2f669e caused us to silently ignore
excess input to the QemuOpts visitor.  Later, commit ea4641
accidentally abused that situation, by removing "qom-type" and
"id" from the corresponding QDict but leaving them defined in
the QemuOpts, when using the pair of containers to create a
user-defined object. Note that since we are already traversing
two separate items (a QDict and a QemuOpts), we are already
able to flag bogus arguments, as in:

$ ./x86_64-softmmu/qemu-system-x86_64 -nodefaults -nographic -qmp stdio -object memory-backend-ram,id=mem1,size=4k,bogus=huh
qemu-system-x86_64: -object memory-backend-ram,id=mem1,size=4k,bogus=huh: Property '.bogus' not found

So the only real concern is that when we re-enable strict checking
in the QemuOpts visitor, we do not want to start flagging the two
leftover keys as unvisited.  Rearrange the code to clean out the
QemuOpts listing in advance, rather than removing items from the
QDict.  Since "qom-type" is usually an automatic implicit default,
we don't have to restore it (this does mean that once instantiated,
QemuOpts is not necessarily an accurate representation of the
original command line - but this is not the first place to do that);
however "id" has to be put back (requiring us to cast away a const).

[As a side note, hmp_object_add() turns a QDict into a QemuOpts,
then calls user_creatable_add_opts() which converts QemuOpts into
a new QDict. There are probably a lot of wasteful conversions like
this, but cleaning them up is a much bigger task than the immediate
regression fix.]

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170322144525.18964-3-eblake@redhat.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-03-22 19:24:34 +01:00
John Snow
600ac6a0ef blockjob: add devops to blockjob backends
This lets us hook into drained_begin and drained_end requests from the
backend level, which is particularly useful for making sure that all
jobs associated with a particular node (whether the source or the target)
receive a drain request.

Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170316212351.13797-4-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-22 13:26:27 -04:00
John Snow
f4d9cc88ee block-backend: add drained_begin / drained_end ops
Allow block backends to forward drain requests to their devices/users.
The initial intended purpose for this patch is to allow BBs to forward
requests along to BlockJobs, which will want to pause if their associated
BB has entered a drained region.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170316212351.13797-3-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-22 13:26:27 -04:00
John Snow
e3796a245a blockjob: add block_job_start_shim
The purpose of this shim is to allow us to pause pre-started jobs.
The purpose of *that* is to allow us to buffer a pause request that
will be able to take effect before the job ever does any work, allowing
us to create jobs during a quiescent state (under which they will be
automatically paused), then resuming the jobs after the critical section
in any order, either:

(1) -block_job_start
    -block_job_resume (via e.g. drained_end)

(2) -block_job_resume (via e.g. drained_end)
    -block_job_start

The problem that requires a startup wrapper is the idea that a job must
start in the busy=true state only its first time-- all subsequent entries
require busy to be false, and the toggling of this state is otherwise
handled during existing pause and yield points.

The wrapper simply allows us to mandate that a job can "start," set busy
to true, then immediately pause only if necessary. We could avoid
requiring a wrapper, but all jobs would need to do it, so it's been
factored out here.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20170316212351.13797-2-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-22 13:26:27 -04:00
Paolo Bonzini
d79df2a2ce blockjob: avoid recursive AioContext locking
Streaming or any other block job hangs when performed on a block device
that has a non-default iothread.  This happens because the AioContext
is acquired twice by block_job_defer_to_main_loop_bh and then released
only once by BDRV_POLL_WHILE.  (Insert rants on recursive mutexes, which
unfortunately are a temporary but necessary evil for iothreads at the
moment).

Luckily, the reason for the double acquisition is simple; the function
acquires the AioContext for both the job iothread and the BDS iothread,
in case the BDS iothread was changed while the job was running.  It
is therefore enough to skip the second acquisition when the two
AioContexts are one and the same.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1490118490-5597-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-22 13:26:27 -04:00
Laszlo Ersek
f92063028a hw/acpi/vmgenid: prevent more than one vmgenid device
A system with multiple VMGENID devices is undefined in the VMGENID spec by
omission.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Ben Warren <ben@skyportsystems.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2017-03-22 18:29:27 +02:00
Laszlo Ersek
f2a1ae45d8 hw/acpi/vmgenid: prevent device realization on pre-2.5 machine types
The WRITE_POINTER linker/loader command that underlies VMGENID depends on
commit baf2d5bfba ("fw-cfg: support writeable blobs", 2017-01-12), which
in turn depends on fw_cfg DMA.

DMA for fw_cfg is enabled in 2.5+ machine types only (see commit
e6915b5f3a, "fw_cfg: unbreak migration compatibility for 2.4 and earlier
machines", 2016-02-18).

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Ben Warren <ben@skyportsystems.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Ben Warren <ben@skyportsystems.com <mailto:ben@skyportsystems.com>>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2017-03-22 18:27:35 +02:00
Paolo Bonzini
e49a661840 virtio: always use handle_aio_output if registered
Commit ad07cd6 ("virtio-scsi: always use dataplane path if ioeventfd is
active", 2016-10-30) and 9ffe337 ("virtio-blk: always use dataplane
path if ioeventfd is active", 2016-10-30) broke the virtio 1.0
indirect access registers.

The indirect access registers bypass the ioeventfd, so that virtio-blk
and virtio-scsi now repeatedly try to initialize dataplane instead of
triggering the guest->host EventNotifier.  Detect the situation by
checking vq->handle_aio_output; if it is not NULL, trigger the
EventNotifier, which is how the device expects to get notifications
and in fact the only thread-safe manner to deliver them.

Fixes: ad07cd6
Fixes: 9ffe337
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-22 17:56:00 +02:00
Eric Blake
76861f6bef tests: Expose regression in QemuOpts visitor
Commit 15c2f669e broke the ability of the QemuOpts visitor to
flag extra input parameters, but the regression went unnoticed
because of missing testsuite coverage.  Add a test to cover this;
take the approach already used in 9cb8ef3 of adding a test that
passes (to avoid breaking bisection) but marks with BUG the
behavior that we don't like, so that the actual impact of the
fix in a later patch is easier to see.

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Message-Id: <20170322144525.18964-2-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-03-22 16:55:54 +01:00
Fam Zheng
a77690c41d virtio: Fix error handling in virtio_bus_device_plugged
For one thing we shouldn't continue if an error happened, for the other
two steps failing can cause an abort() in error_setg because we reuse
the same errp blindly.

Add error handling checks to fix both issues.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-22 17:54:32 +02:00
Laurent Vivier
55641213fc numa,spapr: align default numa node memory size to 256MB
Since commit 224245b ("spapr: Add LMB DR connectors"), NUMA node
memory size must be aligned to 256MB (SPAPR_MEMORY_BLOCK_SIZE).

But when "-numa" option is provided without "mem" parameter,
the memory is equally divided between nodes, but 8MB aligned.
This can be not valid for pseries.

In that case we can have:
$ ./ppc64-softmmu/qemu-system-ppc64 -m 4G -numa node -numa node -numa node
qemu-system-ppc64: Node 0 memory size 0x55000000 is not aligned to 256 MiB

With this patch, we have:
(qemu) info numa
3 nodes
node 0 cpus: 0
node 0 size: 1280 MB
node 1 cpus:
node 1 size: 1280 MB
node 2 cpus:
node 2 size: 1536 MB

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-22 11:32:42 +11:00
Markus Armbruster
4bc0c94da4 test-qobject-input-visitor: Cover visit_type_uint64()
The new test demonstrates known bugs: integers between INT64_MAX+1 and
UINT64_MAX rejected, and integers between INT64_MIN and -1 are
accepted modulo 2^64.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490118290-6133-1-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-21 20:01:39 +01:00
Peter Maydell
55a19ad8b2 Update version for v2.9.0-rc1 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-21 17:13:29 +00:00
Peter Maydell
898be3e041 configure: Warn about deprecated hosts
We plan to drop support in a future QEMU release for host OSes
and host architectures for which we have no test machine where
we can build and run tests. For the 2.9 release, make configure
print a warning if it is run on such a host, so that the user
has some warning of the plans and can volunteer to help us
maintain the port if they need it to continue to function.

This commit flags up as deprecated the CPU architectures:
 * ia64
 * sparc
 * anything which we don't have a TCG port for
   (and which was presumably using TCI)
and the OSes:
 * GNU/kFreeBSD
 * DragonFly BSD
 * NetBSD
 * OpenBSD
 * Solaris
 * AIX
 * Haiku

It also makes entirely unrecognized host OS strings be
rejected rather than treated as if they were Linux (which
likely never worked).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1490106717-9542-1-git-send-email-peter.maydell@linaro.org
2017-03-21 15:46:22 +00:00
Peter Maydell
41a56822e3 Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
This pull request fixes a potential QEMU hang in 9pfs and two issues
reported by Coverity.

# gpg: Signature made Tue 21 Mar 2017 09:57:58 GMT
# gpg:                using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg:                 aka "Greg Kurz <groug@free.fr>"
# gpg:                 aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg:                 aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg:                 aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894  DBA2 02FC 3AEB 0101 DBC2

* remotes/gkurz/tags/for-upstream:
  9pfs: proxy: assert if unmarshal fails
  9pfs: don't try to flush self and avoid QEMU hang on reset

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-21 14:32:51 +00:00
Gerd Hoffmann
cc720a5dc4 add opengl_cflags to QEMU_CFLAGS
... and drop OPENGL_CFLAGS from Makefiles.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1490079888-29029-1-git-send-email-kraxel@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-21 10:25:01 +00:00
Edgar Kaziahmedov
ff5bbe56c6 parallels: fix default options parsing
parallels block driver is completely broken since commit
    commit 75cdcd1553
    Author: Markus Armbruster <armbru@redhat.com>
    Date:   Tue Feb 21 21:14:08 2017 +0100
    option: Fix checking of sizes for overflow and trailing crap
Right now even simple
    qemu-io -c "read 512 64k" 1.hds
ends up with
    Unexpected error in parse_option_size() at util/qemu-option.c:188:
    Parameter 'prealloc-size' expects a non-negative number below 2^64
    Aborted (core dumped)
The cure is simple - we should use 'M' as a suffix in default option value
instead of 'MiB'.

Signed-off-by: Edgar Kaziahmedov <edos@virtuozzo.mipt.ru>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Message-id: 1490002022-22653-1-git-send-email-den@openvz.org
CC: Markus Armbruster <armbru@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-21 10:02:36 +00:00
Markus Armbruster
658ae5a7b9 Revert "hostmem: fix QEMU crash by 'info memdev'"
This reverts commit 1454d33f05.

The string input visitor regression fixed in the previous commit made
visit_type_uint16List() fail on empty input.  query_memdev() calls it
via object_property_get_uint16List().  Because it doesn't expect it to
fail, it passes &error_abort, and duly crashes.

Commit 1454d33 "fixes" this crash by making
host_memory_backend_get_host_nodes() return a list containing just
MAX_NODES instead of the empty list.  Papers over the regression, and
leads to bogus "info memdev" output, as shown below; revert.

I suspect that if we had bisected the crash back then, we would have
found and fixed the actual bug instead of papering over it.

To reproduce, run HMP command "info memdev" with

    $ qemu-system-x86_64 --nodefaults -S -display none -monitor stdio -object memory-backend-ram,id=mem1,size=4k

With this commit, "info memdev" prints

    memory backend: mem1
      size:  4096
      merge: true
      dump: true
      prealloc: false
      policy: default
      host nodes:

exactly like before commit 74f24cb.

Between commit 1454d33 and this commit, it prints

    memory backend: mem1
      size:  4096
      merge: true
      dump: true
      prealloc: false
      policy: default
      host nodes: 128

The last line is bogus.

Between commit 74f24cb and 1454d33, it crashes like this:

    Unexpected error in parse_str() at /work/armbru/tmp/qemu/qapi/string-input-visitor.c:126:
    Parameter 'null' expects an int64 value or range
    Aborted (core dumped)

Cc: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490026424-11330-3-git-send-email-armbru@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-21 10:43:28 +01:00
Markus Armbruster
d2788227c6 qapi: Fix string input visitor regression for empty lists
Visiting a list when input is the empty string should result in an
empty list, not an error.  Noticed when commit 3d089ce belatedly added
tests, but simply accepted as weird then.  It's actually a regression:
broken in commit 74f24cb, v2.7.0.  Fix it, and throw in another test
case for empty string.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490026424-11330-2-git-send-email-armbru@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-21 10:43:01 +01:00
Markus Armbruster
c32617a194 qapi2texi: Fix translation of *strong* and _emphasized_
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490015515-25851-7-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-03-21 10:42:58 +01:00
Markus Armbruster
80d1f2e4a5 tests/qapi-schema: Systematic positive doc comment tests
We have a number of negative tests, but we don't have systematic
positive coverage.  Fix that.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490015515-25851-6-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-03-21 10:42:55 +01:00
Markus Armbruster
818c331833 tests/qapi-schema: Make test-qapi.py print docs again
test-qapi.py used to print the internal representation of doc comments
(commit 3313b61).  This went away when we dropped the doc comments in
positive tests (commit 87c16dc).  Bring it back, because I'm going to
add real positive doc comment tests.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490015515-25851-5-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-03-21 10:42:52 +01:00
Markus Armbruster
32b8a2ad61 qapi: Drop unused QAPIDoc member optional
Unused since commit aa964b7 "qapi2texi: Convert to QAPISchemaVisitor"

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490015515-25851-4-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-03-21 10:42:49 +01:00
Markus Armbruster
e8ba07ea9a qapi2texi: Fix to actually fail when 'doc-required' is false
Messed up in commit bc52d03.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490015515-25851-3-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-03-21 10:42:27 +01:00
Markus Armbruster
4afeeb57a1 qapi: Drop excessive Make dependencies on qapi2texi.py
When qapi2texi.py changes, we regenerate everything QAPI.  Screwed up
in commit 56e8bdd.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490015515-25851-2-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2017-03-21 10:42:15 +01:00
Markus Armbruster
e94630d3ad MAINTAINERS: Add myself for files I touched recently
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490014548-15083-6-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-21 10:42:12 +01:00
Markus Armbruster
0ee9ae7c8c keyval: Document issues with 'any' and alternate types
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490014548-15083-5-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-21 10:42:09 +01:00
Markus Armbruster
599c156bac test-keyval: Cover alternate and 'any' type
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490014548-15083-4-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-21 10:42:06 +01:00
Markus Armbruster
fae425d74f keyval: Improve some comments
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490014548-15083-3-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-21 10:41:54 +01:00
Markus Armbruster
b2cd5b925c test-keyval: Tweaks to improve list coverage
We have a negative test case for a list index with leading zero.  Add
positive ones.

Tweak the test case for list index greater or equal the number of
elements: test "equal" instead of "greater" to guard against
off-by-one mistakes.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1490014548-15083-2-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-21 10:41:43 +01:00
Greg Kurz
262169abe7 9pfs: proxy: assert if unmarshal fails
Replies from the virtfs proxy are made up of a fixed-size header (8 bytes)
and a payload of variable size (maximum 64kb). When receiving a reply,
the proxy backend first reads the whole header and then unmarshals it.
If the header is okay, it then does the same operation with the payload.

Since the proxy backend uses a pre-allocated buffer which has enough room
for a header and the maximum payload size, marshalling should never fail
with fixed size arguments. Any error here is likely to result from a more
serious corruption in QEMU and we'd better dump core right away.

This patch adds error checks where they are missing and converts the
associated error paths into assertions.

This should also address Coverity's complaints CID 1348519 and CID 1348520,
about not always checking the return value of proxy_unmarshal().

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-21 09:12:47 +01:00
Greg Kurz
d5f2af7b95 9pfs: don't try to flush self and avoid QEMU hang on reset
According to the 9P spec [*], when a client wants to cancel a pending I/O
request identified by a given tag (uint16), it must send a Tflush message
and wait for the server to respond with a Rflush message before reusing this
tag for another I/O. The server may still send a completion message for the
I/O if it wasn't actually cancelled but the Rflush message must arrive after
that.

QEMU hence waits for the flushed PDU to complete before sending the Rflush
message back to the client.

If a client sends 'Tflush tag oldtag' and tag == oldtag, QEMU will then
allocate a PDU identified by tag, find it in the PDU list and wait for
this same PDU to complete... i.e. wait for a completion that will never
happen. This causes a tag and ring slot leak in the guest, and a PDU
leak in QEMU, all of them limited by the maximal number of PDUs (128).
But, worse, this causes QEMU to hang on device reset since v9fs_reset()
wants to drain all pending I/O.

This insane behavior is likely to denote a bug in the client, and it would
deserve an Rerror message to be sent back. Unfortunately, the protocol
allows it and requires all flush requests to suceed (only a Tflush response
is expected).

The only option is to detect when we have to handle a self-referencing
flush request and report success to the client right away.

[*] http://man.cat-v.org/plan_9/5/flush

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Greg Kurz <groug@kaod.org>
2017-03-21 09:12:47 +01:00
Peter Maydell
940a8ce075 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
fixes for 2.9-rc1, plus removal of -mno-cygwin references

# gpg: Signature made Mon 20 Mar 2017 11:25:07 GMT
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  hax: fix breakage in locking
  configure: remove Cygwin
  xen: do not build backends for targets that do not support xen
  qemu-ga: obey LISTEN_PID when using systemd socket activation

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-20 16:34:26 +00:00
Gerd Hoffmann
373967b2ed audio: catch missing sdl support
sdl is probed before audio, so we can simply look at $sdl so see
whenever we have support or not.  Throw an error in case sdl audio
is requested without sdl being available.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1490000743-3615-1-git-send-email-kraxel@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-20 16:01:51 +00:00
Paolo Bonzini
c8645752ce configure: remove Cygwin
The Cygwin target is really compiling for native Win32 with -mno-cygwin.
Except, GCC 4.7.0 has finally removed the long deprecated -mno-cygwin
option, and that happened about five years ago.

Let it rest in peace.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-id: 20170317160811.28370-1-pbonzini@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-20 15:23:05 +00:00
Peter Maydell
e8b974f1ed Merge remote-tracking branch 'remotes/yongbok/tags/mips-20170320' into staging
MIPS patches 2017-03-20

Changes:
* Fix clang warnings
* Fix delay slot detection in gen_msa_branch()
* Fix rc4030 interval timer
* Fix rc4030 to tranlate memory accesses only when they occur
* Fix 4c4030 a mixed declarations and code warning
* Update MAINTAINERS file

# gpg: Signature made Mon 20 Mar 2017 12:46:01 GMT
# gpg:                using RSA key 0x2238EB86D5F797C2
# gpg: Good signature from "Yongbok Kim <yongbok.kim@imgtec.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 8600 4CF5 3415 A5D9 4CFA  2B5C 2238 EB86 D5F7 97C2

* remotes/yongbok/tags/mips-20170320:
  MAINTAINERS: update for MIPS devices
  dma/rc4030: fix a mixed declarations and code warning
  dma/rc4030: translate memory accesses only when they occur
  dma: rc4030: limit interval timer reload value
  target/mips: fix delay slot detection in gen_msa_branch()
  target-mips: replace few LOG_DISAS() with trace points
  target-mips: replace break by goto cp0_unimplemented
  target-mips: log bad coprocessor0 register accesses with LOG_UNIMP
  target-mips: remove old & unuseful comments
  target-mips: fix compiler warnings (clang 5)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-20 13:53:14 +00:00
Peter Maydell
32f70d7659 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20170320' into staging
target-arm queue:
 * fix MSR/MRS decoding for M profile CPUs

# gpg: Signature made Mon 20 Mar 2017 12:53:26 GMT
# gpg:                using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20170320:
  arm: Fix APSR writes via M profile MSR
  arm: Enforce should-be-1 bits in MRS decoding
  arm: Don't decode MRS(banked) or MSR(banked) for M profile
  arm: HVC and SMC encodings don't exist for M profile

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-20 12:56:42 +00:00
Peter Maydell
b28b3377d7 arm: Fix APSR writes via M profile MSR
Our implementation of writes to the APSR for M-profile via the MSR
instruction was badly broken.

First and worst, we had the sense wrong on the test of bit 2 of the
SYSm field -- this is supposed to request an APSR write if bit 2 is 0
but we were doing it if bit 2 was 1.  This bug was introduced in
commit 58117c9bb4, so hasn't been in a QEMU release.

Secondly, the choice of exactly which parts of APSR should be written
is defined by bits in the 'mask' field.  We were not passing these
through from instruction decode, making it impossible to check them
in the helper.

Pass the mask bits through from the instruction decode to the helper
function and process them appropriately; fix the wrong sense of the
SYSm bit 2 check.

Invalid mask values and invalid combinations of mask and register
number are UNPREDICTABLE; we choose to treat them as if the mask
values were valid.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1487616072-9226-5-git-send-email-peter.maydell@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2017-03-20 12:41:44 +00:00
Peter Maydell
3d54026fb0 arm: Enforce should-be-1 bits in MRS decoding
The MRS instruction requires that bits [19..16] are all 1s, and for
A/R profile also that bits [7..0] are all 0s.  At this point in the
decode tree we have checked all of the rest of the instruction but
were allowing these to be any value.  If these bits are not set then
the result is architecturally UNPREDICTABLE, but choosing to UNDEF is
more helpful to the user and avoids unexpected odd behaviour if the
encodings are used for some purpose in future architecture versions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1487616072-9226-4-git-send-email-peter.maydell@linaro.org
2017-03-20 12:41:44 +00:00
Peter Maydell
43ac657423 arm: Don't decode MRS(banked) or MSR(banked) for M profile
M profile doesn't have the MSR(banked) and MRS(banked) instructions
and uses the encodings for different kinds of M-profile MRS/MSR.
Guard the relevant bits of the decode logic to make sure we don't
accidentally fall into them by accident on M-profile.

(The bit being checked for this (bit 5) is part of the SYSm field on
M-profile, but since no currently allocated system registers have
encodings with bit 5 of SYSm set, this hasn't been a problem in
practice.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1487616072-9226-3-git-send-email-peter.maydell@linaro.org
2017-03-20 12:41:44 +00:00
Peter Maydell
001b3cab51 arm: HVC and SMC encodings don't exist for M profile
M profile doesn't have the HVC or SMC encodings, so make them always
UNDEF rather than generating calls to helper functions that assume
A/R profile.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1487616072-9226-2-git-send-email-peter.maydell@linaro.org
2017-03-20 12:41:44 +00:00
Vincent Palatin
b3d3a426da hax: fix breakage in locking
use qemu_mutex_lock_iothread consistently in qemu_hax_cpu_thread_fn() as
done in other _thread_fn functions, instead of grabbing directly the
BQL. This way we ensure that iothread_locked is properly set.

On v2.9.0-rc0, QEMU was dying in an assertion in the mutex code when
running with '--enable-hax' either on OSX or Windows. This bug was triggered
since the code modification for multithreading added new usages of
qemu_mutex_iothread_locked.
This fixes the breakage on both platforms, I can now run again a full
Chromium OS image with HAX kernel acceleration.

Signed-off-by: Vincent Palatin <vpalatin@chromium.org>
Message-Id: <20170320101549.150076-1-vpalatin@chromium.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-20 12:24:43 +01:00
Yongbok Kim
659f42d8c3 MAINTAINERS: update for MIPS devices
Add myself to MIPSSIM and new entry for Fulong 2E.
Add an entry for Boston machine (Paul Burton).

cc: Paul Burton <paul.burton@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2017-03-20 11:20:46 +00:00
Yongbok Kim
1b393b310f dma/rc4030: fix a mixed declarations and code warning
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
2017-03-20 11:20:35 +00:00
Hervé Poussineau
c627e7526a dma/rc4030: translate memory accesses only when they occur
This simplifies the code a lot, and this fixes big memory leaks
introduced in a3d586f704

Windows NT is now able to boot without using gigabytes of ram on the host.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
2017-03-20 11:20:26 +00:00
Prasad J Pandit
c0a3172fa6 dma: rc4030: limit interval timer reload value
The JAZZ RC4030 chipset emulator has a periodic timer and
associated interval reload register. The reload value is used
as divider when computing timer's next tick value. If reload
value is large, it could lead to divide by zero error. Limit
the interval reload value to avoid it.

Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
2017-03-20 11:19:55 +00:00
Yongbok Kim
075a1fe788 target/mips: fix delay slot detection in gen_msa_branch()
It is unnecessary to test R6 from delay/forbidden slot check
in gen_msa_branch().

https://bugs.launchpad.net/qemu/+bug/1663287

Reported-by: Brian Campbell <bacam@z273.org.uk>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
2017-03-20 11:19:14 +00:00
Philippe Mathieu-Daudé
b44a7fb14e target-mips: replace few LOG_DISAS() with trace points
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
2017-03-20 11:06:32 +00:00
Philippe Mathieu-Daudé
3570d7f667 target-mips: replace break by goto cp0_unimplemented
this fixes many warnings like:

target/mips/translate.c:6253:13: warning: Value stored to 'rn' is never read
            rn = "invalid sel";
            ^    ~~~~~~~~~~~~~

Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
2017-03-20 11:06:32 +00:00
Philippe Mathieu-Daudé
965447eecb target-mips: log bad coprocessor0 register accesses with LOG_UNIMP
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
2017-03-20 11:06:32 +00:00
Philippe Mathieu-Daudé
989f2aa9af target-mips: remove old & unuseful comments
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
2017-03-20 11:06:32 +00:00
Philippe Mathieu-Daudé
def74c0cf0 target-mips: fix compiler warnings (clang 5)
static code analyzer complain:

target/mips/helper.c:453:5: warning: Function call argument is an uninitialized value
    qemu_log_mask(CPU_LOG_MMU,
    ^~~~~~~~~~~~~~~~~~~~~~~~~~

'physical' and 'prot' are uninitialized if 'ret' is not TLBRET_MATCH.

Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
2017-03-20 11:06:32 +00:00
Peter Maydell
00e7c07b06 Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20170320' into staging
One bugfix for device plug/unplug and migration in the
channel subsystem code.

# gpg: Signature made Mon 20 Mar 2017 08:45:59 GMT
# gpg:                using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0  18CE DECF 6B93 C6F0 2FAF

* remotes/cohuck/tags/s390x-20170320:
  s390x/css: reassign subchannel if schid is changed after migration

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-20 10:51:30 +00:00
Peter Maydell
bedf13ecab Merge remote-tracking branch 'remotes/kraxel/tags/pull-fixes-20170320-1' into staging
fixes for 2.9: vnc, cirrus, tcg display updates.

# gpg: Signature made Mon 20 Mar 2017 08:52:34 GMT
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-fixes-20170320-1:
  vnc: fix a qio-channel leak
  cirrus: fix off-by-one in cirrus_bitblt_rop_bkwd_transp_*_16
  ui/console: ensure graphic updates don't race with TCG vCPUs

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-20 10:05:45 +00:00
Dong Jia Shi
3c788ebc6f s390x/css: reassign subchannel if schid is changed after migration
The subchannel is a means to access a device. While the device number is
assigned by the administrator, the subchannel number is assigned by
the channel subsystem in an ascending order on cold and hot plug.
When doing unplug and replug operations, the same device may end up on
a different subchannel; for example

- We start with a device fe.1.2222, which ends up at subchannel
  fe.1.0000.
- Now we detach the device, attach a device fe.1.3333 (which would get
  the now-free subchannel fe.1.0000), re-attach fe.1.2222 (which ends
  up at subchannel fe.1.0001) and detach fe.1.3333.
- We now have the same device (fe.1.2222) available to the guest; it
  just shows up on a different subchannel.

In such a case, the subchannel numbers are different from what a
QEMU would create during cold plug when parsing the command line.

As this would cause a guest visible change on migration, we do restore
the source system's value of the subchannel number on load.

So we are now fine from the guest perspective. From the host
perspective this will cause an inconsistent state in our internal data
structures, though.

For example, the subchannel 0 might not be at array position 0. This will
lead to problems when we continue doing hot (un/re) plug operations.

Let's fix this by cleaning up our internal data structures.

Reported-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-03-20 09:22:57 +01:00
Marc-André Lureau
7bc4f0846f vnc: fix a qio-channel leak
Spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170317092802.17973-1-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-03-20 09:07:34 +01:00
Paolo Bonzini
732a802076 configure: remove Cygwin
The Cygwin target is really compiling for native Win32 with -mno-cygwin.
Except, GCC 4.7.0 has finally removed the long deprecated -mno-cygwin
option, and that happened about five years ago.

Let it rest in peace.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-19 11:12:12 +01:00
Stefano Stabellini
6b827cca9a xen: do not build backends for targets that do not support xen
Change Makefile.objs to use CONFIG_XEN instead of CONFIG_XEN_BACKEND, so
that the Xen backends are only built for targets that support Xen.

Set CONFIG_XEN in the toplevel Makefile to ensure that files that are
built only once pick up Xen support properly.

Signed-off-by: Stefano Stabellini <stefano@aporeto.com>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
CC: pbonzini@redhat.com
CC: peter.maydell@linaro.org
CC: rth@twiddle.net
CC: stefanha@redhat.com
Message-Id: <1489694518-16978-1-git-send-email-sstabellini@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-19 11:12:12 +01:00
Paolo Bonzini
53fabd4b86 qemu-ga: obey LISTEN_PID when using systemd socket activation
qemu-ga's socket activation support was not obeying the LISTEN_PID
environment variable, which avoids that a process uses a socket-activation
file descriptor meant for its parent.

Mess can for example ensue if a process forks a children before consuming
the socket-activation file descriptor and therefore setting O_CLOEXEC
on it.

Luckily, qemu-nbd also got socket activation code, and its copy does
support LISTEN_PID.  Some extra fixups are needed to ensure that the
code can be used for both, but that's what this patch does.  The
main change is to replace get_listen_fds's "consume" argument with
the FIRST_SOCKET_ACTIVATION_FD macro from the qemu-nbd code.

Cc: "Richard W.M. Jones" <rjones@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-19 11:12:12 +01:00
Marek Vasut
ebedf0f9cd nios2: iic: Convert CPU prop to qom link
Add a const qom link between the CPU and the IIC instead
of passing the CPU link through a qom property.

Signed-off-by: Marek Vasut <marex@denx.de>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 20170317210627.23532-1-marex@denx.de
Cc: Alexander Graf <agraf@suse.de>
Cc: Chris Wulff <crwulff@gmail.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Jeff Da Silva <jdasilva@altera.com>
Cc: Ley Foon Tan <lftan@altera.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Sandra Loosemore <sandra@codesourcery.com>
Cc: Yves Vandervennet <yvanderv@altera.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-18 18:22:54 +00:00
Peter Maydell
96dd9c89c1 Merge remote-tracking branch 'remotes/xtensa/tags/20170317-xtensa' into staging
target/xtensa fixes for 2.9:

- fix build failure when FDT support is not enabled;
- correctly pass command line arguments to semihosting guests.

# gpg: Signature made Fri 17 Mar 2017 18:14:01 GMT
# gpg:                using RSA key 0x51F9CC91F83FA044
# gpg: Good signature from "Max Filippov <filippov@cadence.com>"
# gpg:                 aka "Max Filippov <max.filippov@cogentembedded.com>"
# gpg:                 aka "Max Filippov <jcmvbkbc@gmail.com>"
# Primary key fingerprint: 2B67 854B 98E5 327D CDEB  17D8 51F9 CC91 F83F A044

* remotes/xtensa/tags/20170317-xtensa:
  target/xtensa: fix semihosting argc/argv implementation
  target/xtensa: xtfpga: load DTB only when FDT support is enabled

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-18 17:24:49 +00:00
Paolo Bonzini
6b3cca76dd oslib-posix: fix compilation on OpenBSD
si_band is not found in OpenBSD.   It is marked as obsolescent in
POSIX, so we can delete it without any remorse.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170317152214.6148-1-pbonzini@redhat.com
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-17 18:27:49 +00:00
Paolo Bonzini
eb048026aa curl: fix compilation on OpenBSD
EPROTO is not found in OpenBSD.   We usually use EIO when no better
errno is available, do that here too.

Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170317152412.8472-1-pbonzini@redhat.com
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-17 18:27:14 +00:00
Peter Maydell
31d8922836 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer fixes for 2.9.0-rc1

# gpg: Signature made Fri 17 Mar 2017 12:06:04 GMT
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  block: quiesce AioContext when detaching from it
  thread-pool: add missing qemu_bh_cancel in completion function
  block: Propagate error in bdrv_open_backing_file
  blockdev: fix bitmap clear undo
  block: Always call bdrv_child_check_perm first
  file-posix: Don't leak fd in hdev_get_max_segments
  replication: clarify permissions
  file-posix: clean up max_segments buffer termination

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-17 13:26:10 +00:00
Kevin Wolf
11f0f5e553 Merge remote-tracking branch 'mreitz/tags/pull-block-2017-03-17' into queue-block
Block patches for 2.9-rc1

# gpg: Signature made Fri Mar 17 12:59:20 2017 CET
# gpg:                using RSA key 0xF407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40

* mreitz/tags/pull-block-2017-03-17:
  block: quiesce AioContext when detaching from it

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-17 13:03:44 +01:00
Paolo Bonzini
c2b6428d38 block: quiesce AioContext when detaching from it
While it is true that bdrv_set_aio_context only works on a single
BlockDriverState subtree (see commit message for 53ec73e, "block: Use
bdrv_drain to replace uncessary bdrv_drain_all", 2015-07-07), it works
at the AioContext level rather than the BlockDriverState level.

Therefore, it is also necessary to trigger pending bottom halves too,
even if no requests are pending.

For NBD this ensures that the aio_co_schedule of a previous call to
nbd_attach_aio_context is completed before detaching from the old
AioContext; it fixes qemu-iotest 094.  Another similar bug happens
when the VM is stopped and the virtio-blk dataplane irqfd is torn down.
In this case it's possible that guest I/O gets stuck if notify_guest_bh
was scheduled but doesn't run.

Calling aio_poll from another AioContext is safe if non-blocking; races
such as the one mentioned in the commit message for c9d1a56 ("block:
only call aio_poll on the current thread's AioContext", 2016-10-28)
are a concern for blocking calls.

I considered other options, including:

- moving the bs->wakeup mechanism to AioContext, and letting the caller
check.  This might work for virtio which has a clear place to wakeup
(notify_place_bh) and check the condition (virtio_blk_data_plane_stop).
For aio_co_schedule I couldn't find a clear place to check the condition.

- adding a dummy oneshot bottom half and waiting for it to trigger.
This has the complication that bottom half list is LIFO for historical
reasons.  There were performance issues caused by bottom half ordering
in the past, so I decided against it for 2.9.

Fixes: 9972354856
Reported-by: Max Reitz <mreitz@redhat.com>
Reported-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Tested-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20170314111157.14464-2-pbonzini@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2017-03-17 12:58:42 +01:00
Peter Lieven
b7a745dc33 thread-pool: add missing qemu_bh_cancel in completion function
commit 3c80ca15 fixed a deadlock scenarion with nested aio_poll invocations.

However, the rescheduling of the completion BH introcuded unnecessary spinning
in the main-loop. On very fast file backends this can even lead to the
"WARNING: I/O thread spun for 1000 iterations" message popping up.

Callgrind reports about 3-4% less instructions with this patch running
qemu-img bench on a ramdisk based VMDK file.

Fixes: 3c80ca158c
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-17 12:54:21 +01:00
Fam Zheng
8cd1a3e470 block: Propagate error in bdrv_open_backing_file
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-17 12:54:06 +01:00
John Snow
184dd9c49b blockdev: fix bitmap clear undo
Only undo the action if we actually prepared the action.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-17 12:54:06 +01:00
Fam Zheng
c1cef67251 block: Always call bdrv_child_check_perm first
bdrv_child_set_perm alone is not very usable because the caller must
call bdrv_child_check_perm first. This is already encapsulated
conveniently in bdrv_child_try_set_perm, so remove the other prototypes
from the header and fix the one wrong caller, block/mirror.c.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-17 12:54:06 +01:00
Fam Zheng
fed414df9d file-posix: Don't leak fd in hdev_get_max_segments
This fixes a leaked fd introduced in commit 9103f1ce.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-17 12:54:06 +01:00
Changlong Xie
37a9051cc7 replication: clarify permissions
Even if hidden_disk, secondary_disk are backing files, they all need
write permissions in replication scenario. Otherwise we will encouter
below exceptions on secondary side during adding nbd server:

{'execute': 'nbd-server-add', 'arguments': {'device': 'colo-disk', 'writable': true } }
{"error": {"class": "GenericError", "desc": "Conflicts with use by hidden-qcow2-driver as 'backing', which does not allow 'write' on sec-qcow2-driver-for-nbd"}}

CC: Zhang Hailiang <zhang.zhanghailiang@huawei.com>
CC: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
CC: Wen Congyang <wencongyang2@huawei.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-17 12:54:06 +01:00
Stefan Hajnoczi
6958349085 file-posix: clean up max_segments buffer termination
The following pattern is unsafe:

  char buf[32];
  ret = read(fd, buf, sizeof(buf));
  ...
  buf[ret] = 0;

If read(2) returns 32 then a byte beyond the end of the buffer is
zeroed.

In practice this buffer overflow does not occur because the sysfs
max_segments file only contains an unsigned short + '\n'.  The string is
always shorter than 32 bytes.

Regardless, avoid this pattern because static analysis tools might
complain and it could lead to real buffer overflows if copy-pasted
elsewhere in the codebase.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-17 12:54:06 +01:00
Gerd Hoffmann
f019722cbb cirrus: fix off-by-one in cirrus_bitblt_rop_bkwd_transp_*_16
The switch from pointers to addresses (commit
026aeffcb4 and
ffaf857778) added
a off-by-one bug to 16bit backward blits.  Fix.

Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 1489735296-19047-1-git-send-email-kraxel@redhat.com
2017-03-17 10:23:44 +01:00
Alex Bennée
8bb93c6f99 ui/console: ensure graphic updates don't race with TCG vCPUs
Commit 8d04fb55..

  tcg: drop global lock during TCG code execution

..broke the assumption that updates to the GUI couldn't happen at the
same time as TCG vCPUs where running. As a result the TCG vCPU could
still be updating a directly mapped frame-buffer while the display
side was updating. This would cause artefacts to appear when the
update code assumed that memory block hadn't changed.

The simplest solution is to ensure the two things can't happen at the
same time like the old BQL locking scheme. Here we use the solution
introduced for MTTCG and schedule the update as async_safe_work when
we know no vCPUs can be running.

Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20170315144825.3108-1-alex.bennee@linaro.org
Cc: BALATON Zoltan <balaton@eik.bme.hu>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>

[ kraxel: updated comment clarifying the display adapters are buggy
          and this is a temporary workaround ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-03-17 10:17:21 +01:00
Peter Maydell
272d7dee59 Merge remote-tracking branch 'remotes/kraxel/tags/pull-cirrus-20170316-1' into staging
cirrus: blitter fixes.

# gpg: Signature made Thu 16 Mar 2017 09:05:22 GMT
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-cirrus-20170316-1:
  cirrus: stop passing around src pointers in the blitter
  cirrus: stop passing around dst pointers in the blitter
  cirrus: fix cirrus_invalidate_region
  cirrus: add option to disable blitter
  cirrus: switch to 4 MB video memory by default
  cirrus/vnc: zap bitblit support from console code.
  fix :cirrus_vga fix OOB read case qemu Segmentation fault

# Conflicts:
#	include/hw/compat.h

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-16 16:40:44 +00:00
Peter Maydell
c5e737e5fb Merge remote-tracking branch 'remotes/juanquintela/tags/migration/20170316' into staging
migration/next for 20170316

# gpg: Signature made Thu 16 Mar 2017 08:21:51 GMT
# gpg:                using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg:                 aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03  4B82 F487 EF18 5872 D723

* remotes/juanquintela/tags/migration/20170316:
  postcopy: Check for shared memory
  RAMBlocks: qemu_ram_is_shared
  vmstate: fix failed iotests case 68 and 91
  migration/block: Avoid invoking blk_drain too frequently
  migration: use "" as the default for tls-creds/hostname
  Change the method to calculate dirty-pages-rate

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-16 15:32:08 +00:00
Peter Maydell
094a9a7cd6 Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
Pull request

Tracing makefile fixes for QEMU 2.9.

# gpg: Signature made Thu 16 Mar 2017 06:56:10 GMT
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request:
  trace: ensure $(tracetool-y) is defined in top level makefile
  makefile: generate trace-events-all upfront
  makefile: merge GENERATED_HEADERS & GENERATED_SOURCES variables

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-16 14:23:10 +00:00
Peter Maydell
699f6c6fd4 dtc: Revert unintentional submodule downgrade from commit c2cabb3422
Commit c2cabb3422 inadvertently downgraded the 'dtc' submodule,
undoing the increments added in earlier commits. Revert this,
returning the submodule state to where we should be.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-16 14:11:15 +00:00
Peter Maydell
3c2758c286 Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2017-03-16' into staging
QAPI patches for 2017-03-16

# gpg: Signature made Thu 16 Mar 2017 06:18:38 GMT
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-qapi-2017-03-16: (49 commits)
  qapi: Fix a misleading parser error message
  qapi: Make pylint a bit happier
  qapi: Drop unused .check_clash() parameter schema
  qapi: union_types is a list used like a dict, make it one
  qapi: struct_types is a list used like a dict, make it one
  qapi: enum_types is a list used like a dict, make it one
  qapi: Factor add_name() calls out of the meta conditional
  qapi: Simplify what gets stored in enum_types
  qapi: Drop unused variable events
  qapi: Eliminate check_docs() and drop QAPIDoc.expr
  qapi: Fix detection of bogus member documentation
  tests/qapi-schema: Improve coverage of bogus member docs
  tests/qapi-schema: Rename doc-bad-args to doc-bad-command-arg
  qapi: Move empty doc section checking to doc parser
  qapi: Improve error message on @NAME: in free-form doc
  qapi: Move detection of doc / expression name mismatch
  qapi: Fix detection of doc / expression mismatch
  tests/qapi-schema: Improve doc / expression mismatch coverage
  qapi2texi: Use category "Object" for all object types
  qapi2texi: Generate descriptions for simple union tags
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-16 12:05:03 +00:00
Peter Maydell
3716fba3f5 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pci: fixes

More fixes missed in the previous pull request.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Thu 16 Mar 2017 02:29:49 GMT
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  virtio-serial-bus: Delete timer from list before free it
  hw/virtio: fix Power Management Control Register for PCI Express virtio devices
  hw/virtio: fix Link Control Register for PCI Express virtio devices
  hw/virtio: fix error enabling flags in Device Control register
  hw/pcie: fix Extended Configuration Space for devices with no Extended Capabilities

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-16 11:05:47 +00:00
Peter Maydell
7c75638028 Merge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into staging
# gpg: Signature made Thu 16 Mar 2017 00:52:41 GMT
# gpg:                using RSA key 0x7DEF8106AAFC390E
# gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>"
# Primary key fingerprint: FAEB 9711 A12C F475 812F  18F2 88A9 064D 1835 61EB
#      Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76  CBD0 7DEF 8106 AAFC 390E

* remotes/jnsnow/tags/ide-pull-request:
  ide: ahci: call cleanup function in ahci unit
  ide: core: add cleanup function
  ide: qdev: register ide bus unrealize function

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-16 10:04:41 +00:00
Dr. David Alan Gilbert
8679638b0e postcopy: Check for shared memory
Postcopy doesn't support migration of RAM shared with another process
yet (we've got a bunch of things to understand).
Check for the case and don't allow postcopy to be enabled.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-03-16 09:02:26 +01:00
Dr. David Alan Gilbert
463a4ac23b RAMBlocks: qemu_ram_is_shared
Provide a helper to say whether a RAMBlock was created as a
shared mapping.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-03-16 09:00:58 +01:00
QingFeng Hao
e1e686c1fa vmstate: fix failed iotests case 68 and 91
This problem affects s390x only if we are running without KVM.
Basically, S390CPU.irqstate is unused if we do not use KVM,
and thus no buffer is allocated.
This causes size=0, first_elem=NULL and n_elems=1 in
vmstate_load_state and vmstate_save_state. And the assert fails.
With this fix we can go back to the old behavior and support
VMS_VBUFFER with size 0 and nullptr.

Signed-off-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-03-16 08:59:52 +01:00
Lidong Chen
1cf6aa74b3 migration/block: Avoid invoking blk_drain too frequently
Increase bmds->cur_dirty after submit io, so reduce the frequency
involve into blk_drain, and improve the performance obviously
when block migration.

The performance test result of this patch:

During the block dirty save phase, this patch improve guest os IOPS
from 4.0K to 9.5K. and improve the migration speed from
505856 rsec/s to 855756 rsec/s.

Signed-off-by: Lidong Chen <jemmy858585@gmail.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-03-16 08:58:27 +01:00
Gerd Hoffmann
ffaf857778 cirrus: stop passing around src pointers in the blitter
Does basically the same as "cirrus: stop passing around dst pointers in
the blitter", just for the src pointer instead of the dst pointer.

For the src we have to care about cputovideo blits though and fetch the
data from s->cirrus_bltbuf instead of vga memory.  The cirrus_src*()
helper functions handle that.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1489584487-3489-1-git-send-email-kraxel@redhat.com
2017-03-16 08:58:16 +01:00
Gerd Hoffmann
026aeffcb4 cirrus: stop passing around dst pointers in the blitter
Instead pass around the address (aka offset into vga memory).  Calculate
the pointer in the rop_* functions, after applying the mask to the
address, to make sure the address stays within the valid range.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1489574872-8679-1-git-send-email-kraxel@redhat.com
2017-03-16 08:58:15 +01:00
Gerd Hoffmann
e048dac616 cirrus: fix cirrus_invalidate_region
off_cur_end is exclusive, so off_cur_end == cirrus_addr_mask is valid.
Fix calculation to make sure to allow that, otherwise the assert added
by commit f153b563f8 can trigger for valid
blits.

Test case: boot windows nt 4.0

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1489579606-26020-1-git-send-email-kraxel@redhat.com
2017-03-16 08:58:15 +01:00
Gerd Hoffmann
827bd51726 cirrus: add option to disable blitter
Ok, we have this beast in the cirrus code which is not used at all by
modern guests, except when you try to find security holes in qemu.  So,
add an option to disable blitter altogether.  Guests released within
the last ten years should not show any rendering issues if you turn off
blitter support.

There are no known bugs in the cirrus blitter code.  But in the past we
hoped a few times already that we've finally nailed the last issue.  So
having some easy way to mitigate in case yet another blitter issue shows
up certainly makes me sleep a bit better at night.

For completeness:  The by far better way to mitigate is to switch away
from cirrus and use stdvga instead.  Or something more modern like
virtio-vga in case your guest has support for it.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1489494540-15745-1-git-send-email-kraxel@redhat.com
2017-03-16 08:58:15 +01:00
Gerd Hoffmann
73c148130b cirrus: switch to 4 MB video memory by default
Quoting cirrus source code:
   Follow real hardware, cirrus card emulated has 4 MB video memory.
   Also accept 8 MB/16 MB for backward compatibility.

So just use 4MB by default.  We decided to leave that at 8MB by default
a while ago, for live migration compatibility reasons.  But we have
compat properties to handle that, so that isn't a compeling reason.

This also removes some sanity check inconsistencies in the cirrus code.
Some places check against the allocated video memory, some places check
against the 4MB physical hardware has.  Guest code can trigger asserts
because of that.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1489494514-15606-1-git-send-email-kraxel@redhat.com
2017-03-16 08:58:15 +01:00
Gerd Hoffmann
50628d3479 cirrus/vnc: zap bitblit support from console code.
There is a special code path (dpy_gfx_copy) to allow graphic emulation
notify user interface code about bitblit operations carryed out by
guests.  It is supported by cirrus and vnc server.  The intended purpose
is to optimize display scrolls and just send over the scroll op instead
of a full display update.

This is rarely used these days though because modern guests simply don't
use the cirrus blitter any more.  Any linux guest using the cirrus drm
driver doesn't.  Any windows guest newer than winxp doesn't ship with a
cirrus driver any more and thus uses the cirrus as simple framebuffer.

So this code tends to bitrot and bugs can go unnoticed for a long time.
See for example commit "3e10c3e vnc: fix qemu crash because of SIGSEGV"
which fixes a bug lingering in the code for almost a year, added by
commit "c7628bf vnc: only alloc server surface with clients connected".

Also the vnc server will throttle the frame rate in case it figures the
network can't keep up (send buffers are full).  This doesn't work with
dpy_gfx_copy, for any copy operation sent to the vnc client we have to
send all outstanding updates beforehand, otherwise the vnc client might
run the client side blit on outdated data and thereby corrupt the
display.  So this dpy_gfx_copy "optimization" might even make things
worse on slow network links.

Lets kill it once for all.

Oh, and one more reason: Turns out (after writing the patch) we have a
security bug in that code path ...

Fixes: CVE-2016-9603
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1489494419-14340-1-git-send-email-kraxel@redhat.com
2017-03-16 08:58:15 +01:00
hangaohuai
215902d7b6 fix :cirrus_vga fix OOB read case qemu Segmentation fault
check the validity of parameters in cirrus_bitblt_rop_fwd_transp_xxx
and cirrus_bitblt_rop_fwd_xxx to avoid the OOB read which causes qemu Segmentation fault.

After the fix, we will touch the assert in
cirrus_invalidate_region:
assert(off_cur_end >= off_cur);

Signed-off-by: fangying <fangying1@huawei.com>
Signed-off-by: hangaohuai <hangaohuai@huawei.com>
Message-id: 20170314063919.16200-1-hangaohuai@huawei.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2017-03-16 08:58:15 +01:00
Daniel P. Berrange
4af245dc3e migration: use "" as the default for tls-creds/hostname
The tls-creds parameter has a default value of NULL indicating
that TLS should not be used. Setting it to non-NULL enables
use of TLS. Once tls-creds are set to a non-NULL value via the
monitor, it isn't possible to set them back to NULL again, due
to current implementation limitations. The empty string is not
a valid QObject identifier, so this switches to use "" as the
default, indicating that TLS will not be used

The tls-hostname parameter has a default value of NULL indicating
the the hostname from the migrate connection URI should be used.
Again, once tls-hostname is set non-NULL, to override the default
hostname for x509 cert validation, it isn't possible to reset it
back to NULL via the monitor. The empty string is not a valid
hostname, so this switches to use "" as the default, indicating
that the migrate URI hostname should be used.

Using "" as the default for both, also means that the monitor
commands "info migrate_parameters" / "query-migrate-parameters"
will report existance of tls-creds/tls-parameters even when set
to their default values.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>

Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-03-16 08:57:08 +01:00
Chao Fan
1ffb5dfd35 Change the method to calculate dirty-pages-rate
In function cpu_physical_memory_sync_dirty_bitmap, file
include/exec/ram_addr.h:

if (src[idx][offset]) {
    unsigned long bits = atomic_xchg(&src[idx][offset], 0);
    unsigned long new_dirty;
    new_dirty = ~dest[k];
    dest[k] |= bits;
    new_dirty &= bits;
    num_dirty += ctpopl(new_dirty);
}

After these codes executed, only the pages not dirtied in bitmap(dest),
but dirtied in dirty_memory[DIRTY_MEMORY_MIGRATION] will be calculated.
For example:
When ram_list.dirty_memory[DIRTY_MEMORY_MIGRATION] = 0b00001111,
and atomic_rcu_read(&migration_bitmap_rcu)->bmap = 0b00000011,
the new_dirty will be 0b00001100, and this function will return 2 but not
4 which is expected.
the dirty pages in dirty_memory[DIRTY_MEMORY_MIGRATION] are all new,
so these should be calculated also.

Signed-off-by: Chao Fan <fanc.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
2017-03-16 08:55:56 +01:00
Peter Maydell
5b467b9081 Merge remote-tracking branch 'remotes/mcayland/tags/qemu-openbios-signed' into staging
Update OpenBIOS images

# gpg: Signature made Wed 15 Mar 2017 21:01:53 GMT
# gpg:                using RSA key 0x5BC2C56FAE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* remotes/mcayland/tags/qemu-openbios-signed:
  Update OpenBIOS images to f233c3f built from submodule.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-16 07:47:12 +00:00
Markus Armbruster
012b126de2 qapi: Fix a misleading parser error message
When choking on a token where an expression is expected, we report
'Expected "{", "[" or string'.  Close, but no cigar.  Fix it to
Expected '"{", "[", string, boolean or "null"'.

Missed in commit e53188a.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-48-git-send-email-armbru@redhat.com>
2017-03-16 07:13:04 +01:00
Markus Armbruster
c261394978 qapi: Make pylint a bit happier
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-47-git-send-email-armbru@redhat.com>
2017-03-16 07:13:04 +01:00
Markus Armbruster
6bbfb12de6 qapi: Drop unused .check_clash() parameter schema
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-46-git-send-email-armbru@redhat.com>
2017-03-16 07:13:04 +01:00
Markus Armbruster
768562ded0 qapi: union_types is a list used like a dict, make it one
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-45-git-send-email-armbru@redhat.com>
2017-03-16 07:13:04 +01:00
Markus Armbruster
ed285bf821 qapi: struct_types is a list used like a dict, make it one
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-44-git-send-email-armbru@redhat.com>
2017-03-16 07:13:04 +01:00
Markus Armbruster
5f018446fe qapi: enum_types is a list used like a dict, make it one
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-43-git-send-email-armbru@redhat.com>
2017-03-16 07:13:04 +01:00
Markus Armbruster
6f05345f8f qapi: Factor add_name() calls out of the meta conditional
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-42-git-send-email-armbru@redhat.com>
2017-03-16 07:13:04 +01:00
Markus Armbruster
eda43c6844 qapi: Simplify what gets stored in enum_types
Don't invent a new dictionary structure just for enum_types, simply
store the defining expression, like we do for struct_types and
union_types.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-41-git-send-email-armbru@redhat.com>
2017-03-16 07:13:04 +01:00
Markus Armbruster
062e856b15 qapi: Drop unused variable events
Missed in commit e98859a

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-40-git-send-email-armbru@redhat.com>
2017-03-16 07:13:04 +01:00
Markus Armbruster
a9f396b028 qapi: Eliminate check_docs() and drop QAPIDoc.expr
Move what's left in check_docs() to check_expr().  Delegate the actual
checking to new QAPIDoc.check_expr().

QAPIDoc.expr is now unused; drop it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-39-git-send-email-armbru@redhat.com>
2017-03-16 07:13:04 +01:00
Markus Armbruster
816a57cd6e qapi: Fix detection of bogus member documentation
check_definition_doc() checks for member documentation without a
matching member.  It laboriously second-guesses what members
QAPISchema._def_exprs() will create.  That's a stupid game.

Move the check into QAPISchema.check(), where the members are known.
Delegate the actual checking to new QAPIDoc.check().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-38-git-send-email-armbru@redhat.com>
2017-03-16 07:13:04 +01:00
Markus Armbruster
f641d06ad6 tests/qapi-schema: Improve coverage of bogus member docs
New test doc-bad-union-member.json shows we can fail to reject
documentation for nonexistent members.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-37-git-send-email-armbru@redhat.com>
2017-03-16 07:13:03 +01:00
Markus Armbruster
bdc001caaa tests/qapi-schema: Rename doc-bad-args to doc-bad-command-arg
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-36-git-send-email-armbru@redhat.com>
2017-03-16 07:13:03 +01:00
Markus Armbruster
4ea7148e89 qapi: Move empty doc section checking to doc parser
Results in a more precise error location, but the real reason is
emptying out check_docs() step by step.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-35-git-send-email-armbru@redhat.com>
2017-03-16 07:13:03 +01:00
Markus Armbruster
2d433236df qapi: Improve error message on @NAME: in free-form doc
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-34-git-send-email-armbru@redhat.com>
2017-03-16 07:13:03 +01:00
Markus Armbruster
7947016d1c qapi: Move detection of doc / expression name mismatch
Move the check whether the doc matches the expression name from
check_definition_doc() to check_exprs().  This changes the error
location from the comment to the expression.  Makes sense as the
message talks about the expression: "Definition of '%s' follows
documentation for '%s'".  It's also a step towards getting rid of
check_docs().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-33-git-send-email-armbru@redhat.com>
2017-03-16 07:13:03 +01:00
Markus Armbruster
e7823a2adf qapi: Fix detection of doc / expression mismatch
This fixes the errors uncovered by the previous commit.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-32-git-send-email-armbru@redhat.com>
2017-03-16 07:13:03 +01:00
Markus Armbruster
2028be8eea tests/qapi-schema: Improve doc / expression mismatch coverage
New tests doc-before-include.json and doc-before-pragma.json show we
fail to reject a misplaced expression comment.

New test doc-no-symbol.json shows a bad error message.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1489582656-31133-31-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-16 07:13:03 +01:00
Markus Armbruster
75b50196d9 qapi2texi: Use category "Object" for all object types
At the protocol level, the distinction between struct, flat union and
simple union is meaningless, they are all JSON objects.  Document them
that way.

Example change (qemu-qmp-ref.txt):

- -- Simple Union: InputEvent
+ -- Object: InputEvent

      Input event union.

This also fixes the completely broken headings for flat and simple
unions in qemu-qmp-ref.7 and qemu-ga-ref.7, by sidestepping a bug in
texi2pod.pl.  For instance, it mistranslates "@deftp {Simple Union}
InputEvent" to "B<Union> (Simple)", but translates "@deftp Object
InputEvent" to "B<SocketAddress> (Object)".

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-30-git-send-email-armbru@redhat.com>
2017-03-16 07:13:03 +01:00
Markus Armbruster
c19eaa64df qapi2texi: Generate descriptions for simple union tags
Simple union tags carry no type information, because their type is
implicit.  Their description should make up for it, but many have
none.  Generate one automatically then.

Example change (qemu-qmp-ref.txt):

  -- Simple Union: ImageInfoSpecific

      A discriminated record of image format specific information
      structures.

      Members:
      'type'
-          Not documented
+          One of "qcow2", "vmdk", "luks"
      'data: ImageInfoSpecificQCow2' when 'type' is "qcow2"
      'data: ImageInfoSpecificVmdk' when 'type' is "vmdk"
      'data: QCryptoBlockInfoLUKS' when 'type' is "luks"

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-29-git-send-email-armbru@redhat.com>
2017-03-16 07:13:03 +01:00
Markus Armbruster
5169cd8767 qapi2texi: Generate documentation for variant members
A flat union's branch brings in the members of another type.  Generate
a suitable reference to that type.

Example change (qemu-qmp-ref.txt):

  -- Flat Union: QCryptoBlockOpenOptions

      The options that are available for all encryption formats when
      opening an existing volume

      Members:
      The members of 'QCryptoBlockOptionsBase'
+     The members of 'QCryptoBlockOptionsQCow' when 'format' is "qcow"
+     The members of 'QCryptoBlockOptionsLUKS' when 'format' is "luks"

      Since: 2.6

A simple union's branch adds a member 'data' of some other type.
Generate documentation for that member.

Example change (qemu-qmp-ref.txt):

  -- Simple Union: SocketAddress

      Captures the address of a socket, which could also be a named file
      descriptor

      Members:
      'type'
	   Not documented
+     'data: InetSocketAddress' when 'type' is "inet"
+     'data: UnixSocketAddress' when 'type' is "unix"
+     'data: VsockSocketAddress' when 'type' is "vsock"
+     'data: String' when 'type' is "fd"

      Since: 1.3

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-28-git-send-email-armbru@redhat.com>
2017-03-16 07:13:03 +01:00
Markus Armbruster
88f63467c5 qapi2texi: Generate reference to base type members
The generated documentation doesn't mention object type members
inherited from a base type.  Fix that.

Example change (qemu-qmp-ref.txt):

  -- Struct: VncServerInfo

      The network connection information for server

      Members:
      'auth' (optional)
	   authentication method used for the plain (non-websocket) VNC
	   server
+     The members of 'VncBasicInfo'

      Since: 2.1

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-27-git-send-email-armbru@redhat.com>
2017-03-16 07:13:03 +01:00
Markus Armbruster
691e03133e qapi2texi: Include member type in generated documentation
The recent merge of docs/qmp-commands.txt and docs/qmp-events.txt into
the schema lost type information.  Fix this documentation regression.

Example change (qemu-qmp-ref.txt):

  -- Struct: InputKeyEvent

      Keyboard input event.

      Members:
-     'button'
+     'button: InputButton'
           Which button this event is for.
-     'down'
+     'down: boolean'
           True for key-down and false for key-up events.

      Since: 2.0

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-26-git-send-email-armbru@redhat.com>
2017-03-16 07:13:03 +01:00
Markus Armbruster
c2dd311cb7 qapi2texi: Implement boxed argument documentation
This replaces manual references like "For the arguments, see the
documentation of ..." by a generated reference "Arguments: the members
of ...".

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-25-git-send-email-armbru@redhat.com>
2017-03-16 07:13:03 +01:00
Markus Armbruster
2c99f5fdc8 qapi2texi: Don't hide undocumented members and arguments
Show undocumented object, alternate type members and command, event
arguments exactly like undocumented enumeration type values.

Example change (qemu-qmp-ref.txt):

  -- Command: query-rocker

      Return rocker switch information.

+     Arguments:
+     'name'
+          Not documented
+
      Returns: 'Rocker' information

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-24-git-send-email-armbru@redhat.com>
2017-03-16 07:13:03 +01:00
Markus Armbruster
5da19f14ff qapi2texi: Explain enum value undocumentedness more clearly
Instead of not saying anything when we have no documentation, say "Not
documented".

Example change (qemu-qmp-ref.txt):

  -- Enum: GuestPanicAction

      An enumeration of the actions taken when guest OS panic is detected

      Values:
      'pause'
           system pauses
      'poweroff'
+          Not documented

      Since: 2.1 (poweroff since 2.8)

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-23-git-send-email-armbru@redhat.com>
2017-03-16 07:13:03 +01:00
Markus Armbruster
2a1183ce93 qapi2texi: Present the table of members more clearly
The table of members follows the main descriptive text immediately.
Makes it hard to see what it is about.  Start a new paragraph, and
lead with a line "Members:" for object and alternate types, "Values:"
for enumeration types, and "Arguments:" for commands and events.

Example change (qemu-qmp-ref.txt):

  -- Command: set_link

      Sets the link status of a virtual network adapter.
+
+     Arguments:
      'name'
           the device name of the virtual network adapter
      'up'
           true to set the link status to be up

      Returns: Nothing on success If 'name' is not a valid network
      device, DeviceNotFound

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-22-git-send-email-armbru@redhat.com>
2017-03-16 07:13:03 +01:00
Markus Armbruster
71d918a1b1 qapi2texi: Plainer enum value and member name formatting
Use @code{%s} instead of @code{'%s'}.  Impact, using @id as example:

* Texinfo
  -@item @code{'id'}
  +@item @code{id}

* HTML
  -<dt><code>'id'</code></dt>
  +<dt><code>id</code></dt>

* POD (for manual pages):
  -=item C<'id'>
  +=item C<id>

* Formatted manual pages:
  -'id'
  +"id"

* Plain text:
  -     ''id''
  +     'id'

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-21-git-send-email-armbru@redhat.com>
2017-03-16 07:13:02 +01:00
Markus Armbruster
ef801a9bb1 qapi: Prefer single-quoted strings more consistently
PEP 8 advises:

    In Python, single-quoted strings and double-quoted strings are the
    same.  This PEP does not make a recommendation for this.  Pick a
    rule and stick to it.  When a string contains single or double
    quote characters, however, use the other one to avoid backslashes
    in the string.  It improves readability.

The QAPI generators succeed at picking a rule, but fail at sticking to
it.  Convert a bunch of double-quoted strings to single-quoted ones.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-20-git-send-email-armbru@redhat.com>
2017-03-16 07:13:02 +01:00
Markus Armbruster
0fe675af77 qapi: Use raw strings for regular expressions consistently
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-19-git-send-email-armbru@redhat.com>
2017-03-16 07:13:02 +01:00
Markus Armbruster
1d8bda128d qapi: The #optional tag is redundant, drop
We traditionally mark optional members #optional in the doc comment.
Before commit 3313b61, this was entirely manual.

Commit 3313b61 added some automation because its qapi2texi.py relied
on #optional to determine whether a member is optional.  This is no
longer the case since the previous commit: the only thing qapi2texi.py
still does with #optional is stripping it out.  We still reject bogus
qapi-schema.json and six places for qga/qapi-schema.json.

Thus, you can't actually rely on #optional to see whether something is
optional.  Yet we still make people add it manually.  That's just
busy-work.

Drop the code to check, fix up and strip out #optional, along with all
instances of #optional.  To keep it out, add code to reject it, to be
dropped again once the dust settles.

No change to generated documentation.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-18-git-send-email-armbru@redhat.com>
2017-03-16 07:13:02 +01:00
Markus Armbruster
aa964b7fdc qapi2texi: Convert to QAPISchemaVisitor
qapi2texi works with schema expression trees.  Such a tight coupling
to schema language syntax is not a good idea.  Convert it to the visitor
interface the other generators use.

No change to generated documentation.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-17-git-send-email-armbru@redhat.com>
2017-03-16 07:13:02 +01:00
Markus Armbruster
860e877861 qapi: Conjure up QAPIDoc.ArgSection for undocumented members
qapi2texi.py already conjures up ArgSections for undocumented
enumeration values, in texi_enum.  Drop that, and conjure them up for
all kinds of "arguments" (enumeration values, object and alternate
type members) in qapi.py instead.

Take care to keep generated documentation exactly the same for now.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-16-git-send-email-armbru@redhat.com>
2017-03-16 07:13:02 +01:00
Markus Armbruster
069fb5b250 qapi: Prepare for requiring more complete documentation
We currently neglect to check all enumeration values, common members
of object types and members of alternate types are documented.
Unsurprisingly, many aren't.

Add the necessary plumbing to find undocumented ones, except for
variant members of object types.  Don't enforce anything just yet, but
connect each QAPIDoc.ArgSection to its QAPISchemaMember.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-15-git-send-email-armbru@redhat.com>
2017-03-16 07:13:02 +01:00
Markus Armbruster
4636211e4d qapi: Fix QAPISchemaEnumType.is_implicit() for 'QType'
Missed in commit 7264f5c.  Harmless, because nothing checks whether an
enumeration type is implicit so far.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-14-git-send-email-armbru@redhat.com>
2017-03-16 07:13:02 +01:00
Markus Armbruster
8c0aa61318 qapi/rocker: Fix up doc comment notes on optional members
Talking about #optional like this

    # Note: fields are marked #optional to indicate that they may or may
    # not appear ...

doesn't work so well in generated documentation, because the #optional
tag is not visible there.  Replace by

    # Note: optional members may or may not appear ...

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1489582656-31133-13-git-send-email-armbru@redhat.com>
2017-03-16 07:13:02 +01:00
Markus Armbruster
b116fd8e30 qapi: Avoid unwanted blank lines in QAPIDoc
We silently fix missing #optional tags for QAPIDoc by appending a line
"#optional" to the section's .content.  However, this interferes with
.__repr__ stripping trailing blank lines from .content.

Use new ArgSection instance variable .optional instead, and leave
.content alone.

To permit testing .optional in texi_body(), clean up texi_enum()'s
hack to add empty documentation for undocumented enum values: add an
ArgSection instead of ''.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1489582656-31133-12-git-send-email-armbru@redhat.com>
2017-03-16 07:13:02 +01:00
Markus Armbruster
42bebcc129 qapi2texi: Fix up output around #optional
We use tag #optional to mark optional members, like this:

    # @name: #optional The name of the guest

texi_body() strips #optional, but not whitespace around it.  For the
above, we get in qemu-qmp-qapi.texi

    @item @code{'name'} (optional)
     The name of the guest
    @end table

The extra space can lead to artifacts in output, e.g in
qemu-qmp-ref.7.pod

    =item C<'name'> (optional)

     The name of the guest

and then in qemu-qmp-ref.7

    .IX Item "name (optional)"
    .Vb 1
    \& The name of the guest
    .Ve

instead of intended plain

    .IX Item "name (optional)"
    The name of the guest

Get rid of these artifacts by removing whitespace around #optional
along with it.

This turns three minus signs in qapi-schema.json into markup, because
they're now at the beginning of the line.  Drop them, they're unwanted
there.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1489582656-31133-11-git-send-email-armbru@redhat.com>
2017-03-16 07:13:02 +01:00
Markus Armbruster
4815374513 qapi: Fix to reject empty union base gracefully
Common Python pitfall: 'assert base_members' fires on [] in addition
to None.  Correct to 'assert base_members is not None'.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-10-git-send-email-armbru@redhat.com>
2017-03-16 07:13:02 +01:00
Markus Armbruster
707fb2d381 tests/qapi-schema: Cover empty union base
The new test case shows off qapi.py choking on an empty union base.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-9-git-send-email-armbru@redhat.com>
2017-03-16 07:13:02 +01:00
Markus Armbruster
bd7f974796 qapi: Clean up build of generated documentation
Rename intermediate qemu-qapi.texi to qemu-qmp-qapi.texi to match its
user qemu-qmp-ref.texi, just like qemu-ga-qapi.texi matches
qemu-ga-ref.texi.

Build the intermediate .texi next to the sources and the final output
in docs/ instead of dumping them into the build root.

Fix version.texi dependencies so that only the targets that actually
need it depend on it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-8-git-send-email-armbru@redhat.com>
2017-03-16 07:13:02 +01:00
Markus Armbruster
2cfbae3c42 qapi: Have each QAPI schema declare its name rule violations
qapi.py has a hardcoded white-list of type names that may violate the
rule on use of upper and lower case.  Add a new pragma directive
'name-case-whitelist', and use it to replace the hard-coded
white-list.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1489582656-31133-7-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-16 07:13:02 +01:00
Markus Armbruster
1554a8fae9 qapi: Have each QAPI schema declare its returns white-list
qapi.py has a hardcoded white-list of command names that may violate
the rules on permitted return types.  Add a new pragma directive
'returns-whitelist', and use it to replace the hard-coded white-list.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1489582656-31133-6-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-16 07:13:02 +01:00
Markus Armbruster
700dc9f503 docs/qapi-code-gen.txt: Drop confusing reference to 'gen'
Section "Commands" qualifies its rules on permitted argument and
return types "with one exception noted below when 'gen' is used".  The
note went away in commit 2d21291.  Clean up the dangling references.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1489582656-31133-5-git-send-email-armbru@redhat.com>
2017-03-16 07:13:01 +01:00
Markus Armbruster
87c16dceca qapi: Back out doc comments added just to please qapi.py
This reverts commit 3313b61's changes to tests/qapi-schema/, except
for tests/qapi-schema/doc-*.

We could keep some of these doc comments to serve as positive test
cases.  However, they don't actually add to what we get from doc
comment use in actual schemas, as we we don't test output matches
expectations, and don't systematically cover doc comment features.
Proper positive test coverage would be nice.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1489582656-31133-4-git-send-email-armbru@redhat.com>
2017-03-16 07:13:01 +01:00
Markus Armbruster
bc52d03ff5 qapi: Make doc comments optional where we don't need them
Since we added the documentation generator in commit 3313b61, doc
comments are mandatory.  That's a very good idea for a schema that
needs to be documented, but has proven to be annoying for testing.

Make doc comments optional again, but add a new directive

    { 'pragma': { 'doc-required': true } }

to let a QAPI schema require them.

Add test cases for the new pragma directive.  While there, plug a
minor hole in includ directive test coverage.

Require documentation in the schemas we actually want documented:
qapi-schema.json and qga/qapi-schema.json.

We could probably make qapi2texi.py cope with incomplete
documentation, but for now, simply make it refuse to run unless the
schema has 'doc-required': true.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1489582656-31133-3-git-send-email-armbru@redhat.com>
[qapi-code-gen.txt wording tweaked]
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-16 07:13:01 +01:00
Markus Armbruster
e04dea8872 qapi: Factor QAPISchemaParser._include() out of .__init__()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1489582656-31133-2-git-send-email-armbru@redhat.com>
2017-03-16 07:13:01 +01:00
Daniel P. Berrange
f880cd6b6f qmp: allow setting properties to empty string in qmp-shell
The qmp-shell property parser currently rejects attempts to
set string properties to the empty string eg

  (QEMU) migrate-set-parameters  tls-hostname=
  Error while parsing command line: Expected a key=value pair, got 'tls-hostname='
command format: <command-name>  [arg-name1=arg1] ... [arg-nameN=argN]

This is caused by checking the wrong condition after splitting
the parameter on '='. The "partition" method will return "" for
the separator field, if the seperator was not present, so that
is the correct thing to check for malformed syntax.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20170302122429.7737-1-berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-03-16 07:13:01 +01:00
Marc-André Lureau
597494abde qapi2texi: change texi formatters
STRUCT_FMT is generic enough, rename it to TYPE_FMT, use it for unions.

Rename COMMAND_FMT to MSG_FMT, since it applies to both commands and
events.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170125130308.16104-2-marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-03-16 07:12:54 +01:00
Daniel P. Berrange
8755b4afbd trace: ensure $(tracetool-y) is defined in top level makefile
The build rules for trace files have a dependancy on $(tracetool-y).
This variable populated in the trace/Makefile.objs file and thus its
definition gets pulled into the top level makefile. This happens too
late in the process though, so by the time $(tracetool-y) is defined,
make has already evaluated $(tracetool-y) in the dependancies and
found it to be empty. The result is that when the tracetool source
is changed, the generated files are not rebuilt. The solution is to
define the variable in the top level makefile too

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Message-id: 20170315123421.28815-1-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-16 11:51:26 +08:00
Daniel P. Berrange
4175304e59 makefile: generate trace-events-all upfront
Files should not be created in the build dir during the
'make install' phase. List 'trace-events-all' as a
generated file so that it gets created upfront during
build.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170228122901.24520-3-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-16 11:51:15 +08:00
Daniel P. Berrange
4f04f13c2a makefile: merge GENERATED_HEADERS & GENERATED_SOURCES variables
The only functional difference between the GENERATED_HEADERS
and GENERATED_SOURCES variables is that 'Makefile' has a
dependancy on GENERATED_HEADERS, causing generated header files
to be created immediatey at the start of the build process.
There is no reason why this early creation should be restricted
to the .h files, and not include .c files too. Merge both of
the variables into a single GENERATED_FILES variable to make
it clear it is for any type of generated file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 20170228122901.24520-2-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-16 11:51:15 +08:00
Li Qiang
d68f0f778e ide: ahci: call cleanup function in ahci unit
This can avoid memory leak when hotunplug the ahci device.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 1488449293-80280-4-git-send-email-liqiang6-s@360.cn
Signed-off-by: John Snow <jsnow@redhat.com>
2017-03-15 20:50:14 -04:00
Li Qiang
c9f086418a ide: core: add cleanup function
As the pci ahci can be hotplug and unplug, in the ahci unrealize
function it should free all the resource once allocated in the
realized function. This patch add ide_exit to free the resource.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 1488449293-80280-3-git-send-email-liqiang6-s@360.cn
Signed-off-by: John Snow <jsnow@redhat.com>
2017-03-15 20:50:14 -04:00
Li Qiang
44a109c1b3 ide: qdev: register ide bus unrealize function
we have an idebus unrealize function, but it was being
registered as the unrealize function for the IDE Device,
so it was not getting invoked on device teardown because
nothing is "unrealizing" the IDE devices themselves.

Suggested-by: John Snow <jsnow@redhat.com>
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1488449293-80280-2-git-send-email-liqiang6-s@360.cn
Signed-off-by: John Snow <jsnow@redhat.com>
2017-03-15 20:50:14 -04:00
zhanghailiang
bdf4c4ec53 virtio-serial-bus: Delete timer from list before free it
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Amit Shah <amit@kernel.org>
2017-03-16 01:46:42 +02:00
Marcel Apfelbaum
27ce0f3afc hw/virtio: fix Power Management Control Register for PCI Express virtio devices
Make Power Management State flag writable to conform
with the PCI Express spec.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-16 01:46:41 +02:00
Marcel Apfelbaum
d584f1b9ca hw/virtio: fix Link Control Register for PCI Express virtio devices
Make several Link Control Register flags writable to conform
with the PCI Express spec.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-16 01:46:41 +02:00
Marcel Apfelbaum
c2cabb3422 hw/virtio: fix error enabling flags in Device Control register
When the virtio devices are PCI Express, make error-enabling flags
writable to respect the PCIe spec.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-16 01:46:40 +02:00
Marcel Apfelbaum
f03d8ea330 hw/pcie: fix Extended Configuration Space for devices with no Extended Capabilities
Absence of any Extended Capabilities is required to be
indicated by an Extended Capability header with a Capability ID of
0000h, a Capability Version of 0h, and a Next Capability Offset of 000h.

Instead of inserting a 'NULL' capability is simpler to mark the start
of the Extended Configuration Space as read-only to achieve the same
behaviour.

Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-16 01:46:40 +02:00
Mark Cave-Ayland
111308e789 Update OpenBIOS images to f233c3f built from submodule.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2017-03-15 19:42:08 +00:00
Peter Maydell
1883ff34b5 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pc: fixes

Some fixes to fallback from using virtio caching,
pls a minor vm gen id fix.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Wed 15 Mar 2017 17:59:25 GMT
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  virtio-pci: reset modern vq meta data
  Revert "virtio: unbreak virtio-pci with IOMMU after caching ring translations"
  pci: introduce a bus master container
  virtio: validate address space cache during init
  virtio: destroy region cache during reset
  virtio: guard against NULL pfn
  Bugfix: Handle error if VM Generation ID device not present

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-15 18:44:05 +00:00
Jason Wang
60a8d80234 virtio-pci: reset modern vq meta data
We don't reset proxy->vqs[].{num|desc[]|avail[]|used[]}. This means if
a driver enable the vq without setting vq address after reset. The old
addresses were leaked. Fixing this by resetting modern vq meta data
during device reset.

Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-15 19:59:18 +02:00
Jason Wang
f0edf23978 Revert "virtio: unbreak virtio-pci with IOMMU after caching ring translations"
This reverts commit
96a8821d21. Previous patch is a better
solution which does not require a strict order between virtio and IOMMU.

CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-15 19:59:00 +02:00
Peter Maydell
de71fb96bf Merge remote-tracking branch 'remotes/armbru/tags/pull-misc-2017-03-15' into staging
Miscellaneous patches for 2017-03-15

# gpg: Signature made Wed 15 Mar 2017 13:12:35 GMT
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-misc-2017-03-15:
  coverity-model: model address_space_read/write
  tests: Use error_free_or_abort() where appropriate

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-15 17:54:41 +00:00
Jason Wang
3716d5902d pci: introduce a bus master container
96a8821d21 ("virtio: unbreak virtio-pci with IOMMU after caching ring
translations") tries to make IOMMU works with virtio memory region
cache, but it requires IOMMU to be created before any virtio
devices. This is sub optimal, fixing this by introduce a bus master
container to make sure address space can be initialized during device
registering, and then we can safely set alias and make
bus_master_enable_region as its subregion during bus master
initialization.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-15 19:37:19 +02:00
Jason Wang
e45da65322 virtio: validate address space cache during init
We don't check the return value of address_space_cache_init(), this
may lead buggy driver use incorrect region caches. Instead of
triggering an assert, catch and warn this early in
virtio_init_region_cache().

Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-15 19:37:19 +02:00
Jason Wang
e0e2d64409 virtio: destroy region cache during reset
We don't destroy region cache during reset which can make the maps
of previous driver leaked to a buggy or malicious driver that don't
set vring address before starting to use the device. Fix this by
destroy the region cache during reset and validate it before trying to
see them.

Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-15 19:37:19 +02:00
Jason Wang
168e4af3c1 virtio: guard against NULL pfn
To avoid access stale memory region cache after reset, this patch
check the existence of virtqueue pfn for all exported virtqueue access
helpers before trying to use them.

Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-15 19:37:19 +02:00
Ben Warren
72d9196f1e Bugfix: Handle error if VM Generation ID device not present
This was crashing due to NULL-pointer dereference

QMP Test case:
==============

(QEMU) query-vm-generation-id
{"error": {"class": "GenericError", "desc": "VM Generation ID device not
found"}}

HMP Test case:
==============
virsh # qemu-monitor-command --hmp 3 info vm-generation-id
VM Generation ID device not found

Signed-off-by: Ben Warren <ben@skyportsystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-15 19:37:19 +02:00
Peter Maydell
7584bf5e6f Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Wed 15 Mar 2017 05:05:04 GMT
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  os: don't corrupt pre-existing memory-backend data with prealloc

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-15 14:19:59 +00:00
Peter Maydell
926e368388 Merge remote-tracking branch 'remotes/ehabkost/tags/machine-pull-request' into staging
Fix global property and -cpu handling bug

This bug fix was supposed to be applied just after 2.8.0 was
released, but it slipped through the cracks. Sending it now for
the next -rc.

# gpg: Signature made Tue 14 Mar 2017 20:04:50 GMT
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/machine-pull-request:
  machine: Convert abstract typename on compat_props to subclass names

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-15 13:07:08 +00:00
Paolo Bonzini
4d0e72396b coverity-model: model address_space_read/write
Commit eb7eeb8 ("memory: split address_space_read and
address_space_write", 2015-12-17) made address_space_rw
dispatch to one of address_space_read or address_space_write,
rather than vice versa.

For callers of address_space_read and address_space_write this
causes false positive defects when Coverity sees a length-8 write in
address_space_read and a length-4 (e.g. int*) buffer to read into.
As long as the size of the buffer is okay, this is a false positive.

Reflect the code change into the model.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20170315081641.20588-1-pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2017-03-15 13:59:16 +01:00
Markus Armbruster
157db293eb tests: Use error_free_or_abort() where appropriate
Done with this Coccinelle semantic patch:

    @@
    expression E;
    @@
    -    g_assert(E);
    -    error_free(E);
    +    error_free_or_abort(&E);

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1487362554-5688-1-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-15 08:52:09 +01:00
Daniel P. Berrange
9dc44aa582 os: don't corrupt pre-existing memory-backend data with prealloc
When using a memory-backend object with prealloc turned on, QEMU
will memset() the first byte in every memory page to zero. While
this might have been acceptable for memory backends associated
with RAM, this corrupts application data for NVDIMMs.

Instead of setting every page to zero, read the current byte
value and then just write that same value back, so we are not
corrupting the original data. Directly write the value instead
of memset()ing it, since there's no benefit to memset for a
single byte write.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Andrea Arcangeli <aarcange@redhat.com>
Message-id: 20170303113255.28262-1-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2017-03-15 11:55:41 +08:00
Eduardo Habkost
0bcba41fe3 machine: Convert abstract typename on compat_props to subclass names
Original problem description by Greg Kurz:

> Since commit "9a4c0e220d8a hw/virtio-pci: fix virtio
> behaviour", passing -device virtio-blk-pci.disable-modern=off
> has no effect on 2.6 machine types because the internal
> virtio-pci.disable-modern=on compat property always prevail.

The same bug also affects other abstract type names mentioned on
compat_props by machine-types: apic-common, i386-cpu, pci-device,
powerpc64-cpu, s390-skeys, spapr-pci-host-bridge, usb-device,
virtio-pci, x86_64-cpu.

The right fix for this problem is to make sure compat_props and
-global options are always applied in the order they are
registered, instead of reordering them based on the type
hierarchy. But changing the ordering rules of -global is risky
and might break existing configurations, so we shouldn't do that
on a stable branch.

This is a temporary hack that will work around the bug when
registering compat_props properties: if we find an abstract class
on compat_props, register properties for all its non-abstract
subtypes instead. This will make sure -global won't be overridden
by compat_props, while keeping the existing ordering rules on
-global options.

Note that there's one case that won't be fixed by this hack:
"-global spapr-pci-vfio-host-bridge.<option>=<value>" won't be
able to override compat_props, because spapr-pci-host-bridge is
not an abstract class.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1481575745-26120-1-git-send-email-ehabkost@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-03-14 16:53:44 -03:00
Peter Maydell
d84f714eaf Update version for v2.9.0-rc0 release
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-14 19:18:23 +00:00
Peter Maydell
64c358a33a Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* "x" monitor command fix for KVM (Christian)
* MemoryRegion name documentation (David)
* mem-prealloc optimization (Jitendra)
* -icount/MTTCG fixes (me)
* "info mtree" niceness (Peter)
* NBD drop_sync buffer overflow (Vladimir/Eric)
* small cleanups and bugfixes (Li, Lin, Suramya, Thomas)
* fix for "-device kvmclock" w/TCG (Eduardo)
* debug output before crashing on KVM_{GET,SET}_MSRS (Eduardo)

# gpg: Signature made Tue 14 Mar 2017 13:42:05 GMT
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream:
  nbd/client: fix drop_sync [CVE-2017-2630]
  memory: info mtree check mr range overflow
  icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread
  main-loop: remove now unnecessary optimization
  cpus: define QEMUTimerListNotifyCB for QEMU system emulation
  qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h
  qemu-timer: fix off-by-one
  target/nios2: take BQL around interrupt check
  scsi: mptsas: fix the wrong reading size in fetch request
  util: Removed unneeded header from path.c
  configure: add the missing help output for optional features
  scripts/dump-guest-memory.py: fix int128_get64 on recent gcc
  kvmclock: Don't crash QEMU if KVM is disabled
  kvm: Print MSR information if KVM_{GET,SET}_MSRS failed
  exec: add cpu_synchronize_state to cpu_memory_rw_debug
  mem-prealloc: reduce large guest start-up and migration time.
  docs: Add a note about mixing bootindex with "-boot order"
  memory_region: Fix name comments

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-14 16:52:17 +00:00
Peter Maydell
5e2fb7c598 hw/misc/imx6_src: Don't crash trying to reset missing CPUs
Commit 4881658a4b introduced a call to arm_get_cpu_by_id(),
and Coverity noticed that we weren't checking that it didn't
return NULL (CID 1371652).

Normally this won't happen (because all 4 CPUs are expected
to exist), but it's possible the user requested fewer CPUs
on the command line. Handle this possibility by silently
doing nothing, which is the same behaviour as before commit
4881658a4b and also how we handle the other CPU operations
(since we ignore the INVALID_PARAM returns from arm_set_cpu_on()
and friends).

There is a slight behavioural difference to the pre-4881658a4b
situation: the "reset this core" bit will remain set rather
than not being permitted to be set. The imx6 datasheet is
unclear about the behaviour in this odd corner case, so we
opt for the simpler code rather than complicated logic to
maintain identical behaviour.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1488542374-1256-1-git-send-email-peter.maydell@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
2017-03-14 16:13:22 +00:00
Programmingkid
5f26fcfb98 ui/cocoa.m: add toast file support
Add the ability for the user to use .toast files with QEMU. This format works
just like ISO files.

Signed-off-by: John Arbuckle <programmingkidx@gmail.com>
Message-id: 0C9DA454-E3DC-4291-806E-9A96557DE833@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-14 15:09:56 +00:00
Peter Maydell
486fc7a837 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20170314' into staging
target-arm queue:
 * arm-powerctl: Fix psci info return values
 * implement armv8 PMUSERENR (user-mode enable bits)

# gpg: Signature made Tue 14 Mar 2017 11:31:11 GMT
# gpg:                using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20170314:
  target/arm/arm-powerctl: Fix psci info return values
  target/arm: implement armv8 PMUSERENR (user-mode enable bits)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-14 14:11:38 +00:00
Vladimir Sementsov-Ogievskiy
2563c9c6b8 nbd/client: fix drop_sync [CVE-2017-2630]
Comparison symbol is misused. It may lead to memory corruption.
Introduced in commit 7d3123e.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170203154757.36140-6-vsementsov@virtuozzo.com>
[eblake: add CVE details, update conditional]
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170307151627.27212-1-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 14:41:19 +01:00
Peter Xu
b31f841262 memory: info mtree check mr range overflow
The address of memory regions might overflow when something wrong
happened, like reported in:

https://lists.gnu.org/archive/html/qemu-devel/2017-03/msg02043.html

For easier debugging, let's try to detect it.

Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1489496187-624-1-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:57:52 +01:00
Paolo Bonzini
6b8f0187a4 icount: process QEMU_CLOCK_VIRTUAL timers in vCPU thread
icount has become much slower after tcg_cpu_exec has stopped
using the BQL.  There is also a latent bug that is masked by
the slowness.

The slowness happens because every occurrence of a QEMU_CLOCK_VIRTUAL
timer now has to wake up the I/O thread and wait for it.  The rendez-vous
is mediated by the BQL QemuMutex:

- handle_icount_deadline wakes up the I/O thread with BQL taken
- the I/O thread wakes up and waits on the BQL
- the VCPU thread releases the BQL a little later
- the I/O thread raises an interrupt, which calls qemu_cpu_kick
- the VCPU thread notices the interrupt, takes the BQL to
  process it and waits on it

All this back and forth is extremely expensive, causing a 6 to 8-fold
slowdown when icount is turned on.

One may think that the issue is that the VCPU thread is too dependent
on the BQL, but then the latent bug comes in.  I first tried removing
the BQL completely from the x86 cpu_exec, only to see everything break.
The only way to fix it (and make everything slow again) was to add a dummy
BQL lock/unlock pair.

This is because in -icount mode you really have to process the events
before the CPU restarts executing the next instruction.  Therefore, this
series moves the processing of QEMU_CLOCK_VIRTUAL timers straight in
the vCPU thread when running in icount mode.

The required changes include:

- make the timer notification callback wake up TCG's single vCPU thread
  when run from another thread.  By using async_run_on_cpu, the callback
  can override all_cpu_threads_idle() when the CPU is halted.

- move handle_icount_deadline after qemu_tcg_wait_io_event, so that
  the timer notification callback is invoked after the dummy work item
  wakes up the vCPU thread

- make handle_icount_deadline run the timers instead of just waking the
  I/O thread.

- stop processing the timers in the main loop

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:51:34 +01:00
Paolo Bonzini
e330c118f2 main-loop: remove now unnecessary optimization
This optimization is not necessary anymore, because the vCPU now drops
the I/O thread lock even with TCG.  Drop it to simplify the code and
avoid the "I/O thread spun for 1000 iterations" warning.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:29:21 +01:00
Paolo Bonzini
3f53bc61a4 cpus: define QEMUTimerListNotifyCB for QEMU system emulation
There is no change for now, because the callback just invokes
qemu_notify_event.

Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:28:29 +01:00
Paolo Bonzini
d2528bdc19 qemu-timer: do not include sysemu/cpus.h from util/qemu-timer.h
This dependency is the wrong way, and we will need util/qemu-timer.h from
sysemu/cpus.h in the next patch.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:28:18 +01:00
Paolo Bonzini
33bef0b994 qemu-timer: fix off-by-one
If the first timer is exactly at the current value of the clock, the
deadline is met and the timer should fire.  This fixes itself on the next
iteration of the loop without icount; with icount, however, execution
of instructions will stop exactly at the deadline and won't proceed.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:26:42 +01:00
Paolo Bonzini
c0d24e7f70 target/nios2: take BQL around interrupt check
The interrupt controller does not have its own locking.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:26:37 +01:00
Li Qiang
b01a2d07c9 scsi: mptsas: fix the wrong reading size in fetch request
When fetching request, it should read sizeof(*hdr), not the
pointer hdr.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-Id: <1489488980-130668-1-git-send-email-liqiang6-s@360.cn>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:26:37 +01:00
Suramya Shah
bd5d983fa8 util: Removed unneeded header from path.c
Signed-off-by: Suramya Shah <shah.suramya@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170310163948.7567-1-shah.suramya@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:26:37 +01:00
c12d66aac1 configure: add the missing help output for optional features
Signed-off-by: Lin Ma <lma@suse.com>
Message-Id: <20170310101405.26974-1-lma@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:26:36 +01:00
Marc-André Lureau
9b4b157ef6 scripts/dump-guest-memory.py: fix int128_get64 on recent gcc
The Int128 is no longer a struct, reaching a python exception:
Python Exception <class 'gdb.error'> Attempt to extract a component of a value that is not a (null).:

Replace struct access with a cast to uint64[] instead.

Fixes:
https://bugzilla.redhat.com/show_bug.cgi?id=1427466

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170310112819.16760-1-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:26:36 +01:00
Eduardo Habkost
ca2edcd35c kvmclock: Don't crash QEMU if KVM is disabled
Most machines don't allow sysbus devices like "kvmclock" to be
created from the command-line, but some of them do (the ones with
has_dynamic_sysbus=true). In those cases, it's possible to
manually create a kvmclock device without KVM being enabled,
making QEMU crash:

  $ qemu-system-x86_64 -machine q35,accel=tcg -device kvmclock
  Segmentation fault (core dumped)

This changes kvmclock's realize method to return an error if KVM
is disabled, to ensure it won't crash QEMU.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170309185046.17555-1-ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:26:36 +01:00
Eduardo Habkost
c70b11d160 kvm: Print MSR information if KVM_{GET,SET}_MSRS failed
When a KVM_{GET,SET}_MSRS ioctl() fails, it is difficult to find
out which MSR caused the problem. Print an error message for
debugging, before we trigger the (ret == cpu->kvm_msr_buf->nmsrs)
assert.

Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170309194634.28457-1-ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:26:36 +01:00
Christian Borntraeger
79ca7a1b89 exec: add cpu_synchronize_state to cpu_memory_rw_debug
I sometimes got "Cannot access memory" when using the x command
on the monitor. Turns out that the cpu env did contain stale data
(e.g. wrong control register content for page table origin).
We must synchronize the state of the CPU before walking the page
tables. A similar issues happens for a remote gdb, so lets
do the cpu_synchronize_state in cpu_memory_rw_debug.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <1488896348-13560-1-git-send-email-borntraeger@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:26:36 +01:00
Jitendra Kolhe
1e356fc14b mem-prealloc: reduce large guest start-up and migration time.
Using "-mem-prealloc" option for a large guest leads to higher guest
start-up and migration time. This is because with "-mem-prealloc" option
qemu tries to map every guest page (create address translations), and
make sure the pages are available during runtime. virsh/libvirt by
default, seems to use "-mem-prealloc" option in case the guest is
configured to use huge pages. The patch tries to map all guest pages
simultaneously by spawning multiple threads. Currently limiting the
change to QEMU library functions on POSIX compliant host only, as we are
not sure if the problem exists on win32. Below are some stats with
"-mem-prealloc" option for guest configured to use huge pages.

------------------------------------------------------------------------
Idle Guest      | Start-up time | Migration time
------------------------------------------------------------------------
Guest stats with 2M HugePage usage - single threaded (existing code)
------------------------------------------------------------------------
64 Core - 4TB   | 54m11.796s    | 75m43.843s
64 Core - 1TB   | 8m56.576s     | 14m29.049s
64 Core - 256GB | 2m11.245s     | 3m26.598s
------------------------------------------------------------------------
Guest stats with 2M HugePage usage - map guest pages using 8 threads
------------------------------------------------------------------------
64 Core - 4TB   | 5m1.027s      | 34m10.565s
64 Core - 1TB   | 1m10.366s     | 8m28.188s
64 Core - 256GB | 0m19.040s     | 2m10.148s
-----------------------------------------------------------------------
Guest stats with 2M HugePage usage - map guest pages using 16 threads
-----------------------------------------------------------------------
64 Core - 4TB   | 1m58.970s     | 31m43.400s
64 Core - 1TB   | 0m39.885s     | 7m55.289s
64 Core - 256GB | 0m11.960s     | 2m0.135s
-----------------------------------------------------------------------

Changed in v2:
 - modify number of memset threads spawned to min(smp_cpus, 16).
 - removed 64GB memory restriction for spawning memset threads.

Changed in v3:
 - limit number of threads spawned based on
   min(sysconf(_SC_NPROCESSORS_ONLN), 16, smp_cpus)
 - implement memset thread specific siglongjmp in SIGBUS signal_handler.

Changed in v4
 - remove sigsetjmp/siglongjmp and SIGBUS unblock/block for main thread
   as main thread no longer touches any pages.
 - simplify code my returning memset_thread_failed status from
   touch_all_pages.

Signed-off-by: Jitendra Kolhe <jitendra.kolhe@hpe.com>
Message-Id: <1487907103-32350-1-git-send-email-jitendra.kolhe@hpe.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:26:36 +01:00
Thomas Huth
c0d9f7d0bc docs: Add a note about mixing bootindex with "-boot order"
Occasionally the users try to mix the bootindex properties with the
"-boot order" parameter - and this likely does not give the expected
results. So let's add a proper statement that these two concepts
should not be used together.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1488303601-23741-1-git-send-email-thuth@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:26:36 +01:00
Dr. David Alan Gilbert
e8f5fe2de1 memory_region: Fix name comments
The 'name' parameter to memory_region_init_* had been marked as debug
only, however vmstate_region_ram uses it as a parameter to
qemu_ram_set_idstr to set RAMBlock names and these form part of the
migration stream.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170309152708.30635-1-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-14 13:26:36 +01:00
Andrew Jones
d5affb0d86 target/arm/arm-powerctl: Fix psci info return values
The power state spec section 5.1.5 AFFINITY_INFO defines the
affinity info return values as

  0 ON
  1 OFF
  2 ON_PENDING

I grepped QEMU for power_state to ensure that no assumptions
of OFF=0 were being made.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Message-id: 20170303123232.4967-1-drjones@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-14 11:28:54 +00:00
Andrew Baumann
6ecd0b6ba0 target/arm: implement armv8 PMUSERENR (user-mode enable bits)
In armv8, this register implements more than a single bit, with
fine-grained enables for read access to event counters, cycles
counters, and write access to the software increment. This change
implements those checks using custom access functions for the relevant
registers.

Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 20170228215801.10472-2-Andrew.Baumann@microsoft.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: move a couple of access functions to be only compiled
 ifndef CONFIG_USER_ONLY to avoid compiler warnings]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-14 11:28:54 +00:00
Peter Maydell
591bce29b1 Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Tue 14 Mar 2017 07:55:01 GMT
# gpg:                using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  hw/net: implement MIB counters in mcf_fec driver
  COLO-compare: Fix trace_event print bug
  e1000e: correctly tear down MSI-X memory regions

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-14 11:15:00 +00:00
Peter Maydell
94b5d57d2f Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170314' into staging
ppc patch queue for 2017-03-14

This set has a handful og bugfixes to go into qemu-2.9.  This includes
an update to the dtc/libfdt submodule which will fix the build errors
seen on some distributions.

# gpg: Signature made Tue 14 Mar 2017 04:00:41 GMT
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.9-20170314:
  dtc: Update submodule to avoid build errors
  pseries: Don't expose PCIe extended config space on older machine types
  target/ppc: fix cpu_ov setting for 32-bit
  target/ppc: Fix wrong number of UAMR register

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-14 10:13:19 +00:00
Christopher Covington
4d04351f4c build: include sys/sysmacros.h for major() and minor()
The definition of the major() and minor() macros are moving within glibc to
<sys/sysmacros.h>. Include this header when it is available to avoid the
following sorts of build-stopping messages:

qga/commands-posix.c: In function ‘dev_major_minor’:
qga/commands-posix.c:656:13: error: In the GNU C Library, "major" is defined
 by <sys/sysmacros.h>. For historical compatibility, it is
 currently defined by <sys/types.h> as well, but we plan to
 remove this soon. To use "major", include <sys/sysmacros.h>
 directly. If you did not intend to use a system-defined macro
 "major", you should undefine it after including <sys/types.h>. [-Werror]
         *devmajor = major(st.st_rdev);
             ^~~~~~~~~~~~~~~~~~~~~~~~~~

qga/commands-posix.c:657:13: error: In the GNU C Library, "minor" is defined
 by <sys/sysmacros.h>. For historical compatibility, it is
 currently defined by <sys/types.h> as well, but we plan to
 remove this soon. To use "minor", include <sys/sysmacros.h>
 directly. If you did not intend to use a system-defined macro
 "minor", you should undefine it after including <sys/types.h>. [-Werror]
         *devminor = minor(st.st_rdev);
             ^~~~~~~~~~~~~~~~~~~~~~~~~~

The additional include allows the build to complete on Fedora 26 (Rawhide)
with glibc version 2.24.90.

Signed-off-by: Christopher Covington <cov@codeaurora.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-14 10:08:22 +00:00
Greg Ungerer
adb560f7fc hw/net: implement MIB counters in mcf_fec driver
The FEC ethernet hardware module used on ColdFire SoC parts contains a
block of RAM used to maintain hardware counters. This block is accessible
via the usual FEC register address space. There is currently no support
for this in the QEMU mcf_fec driver.

Add support for storing a MIB RAM block, and provide register level
access to it. Also implement a basic set of stats collection functions
to populate MIB data fields.

This support tested running a Linux target and using the net-tools
"ethtool -S" option. As of linux-4.9 the kernels FEC driver makes
accesses to the MIB counters during its initialization (which it never
did before), and so this version of Linux will now fail with the QEMU
error:

    qemu: hardware error: mcf_fec_read: Bad address 0x200

This MIB counter support fixes this problem.

Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-14 15:39:55 +08:00
Zhang Chen
e630b2bf7c COLO-compare: Fix trace_event print bug
Because of inet_ntoa() return a statically allocated buffer,
subsequent calls will overwrite, So we fix this bug.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-14 15:39:55 +08:00
Paolo Bonzini
7ec7ae4b97 e1000e: correctly tear down MSI-X memory regions
MSI-X has been disabled by the time the e1000e device is unrealized, hence
msix_uninit is never called.  This causes the object to be leaked, which
shows up as a RAMBlock with empty name when attempting migration.

Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-14 15:39:55 +08:00
David Gibson
28df75d8d1 dtc: Update submodule to avoid build errors
The currently included version of the dtc/libfdt submodule has some build
errors on certain distributions (including RHEL7).  This is due to some
poorly named macros in libfdt.h; they're designed for use with the sparse
static checker, but use reserved names which conflict with some symbols in
the standard headers.

That's been corrected in upstream dtc, this updates the qemu submodule to
bring the fix to qemu.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-14 12:24:29 +11:00
David Gibson
82516263ce pseries: Don't expose PCIe extended config space on older machine types
bb9986452 "spapr_pci: Advertise access to PCIe extended config space"
allowed guests to access the extended config space of PCI Express devices
via the PAPR interfaces, even though the paravirtualized bus mostly acts
like plain PCI.

However, that patch enabled access unconditionally, including for existing
machine types, which is an unwise change in behaviour.  This patch limits
the change to pseries-2.9 (and later) machine types.

Suggested-by: Andrea Bolognani <abologna@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-14 11:54:17 +11:00
Nikunj A Dadhania
38a61d3487 target/ppc: fix cpu_ov setting for 32-bit
A bug was introduced in following commit:

    dc0ad84 target/ppc: update overflow flags for add/sub

As for 32-bit ppc target extracting bit 63 for overflow is not correct.
Made it dependent on TARGET_LOG_BITS. This had broken booting MacOS
9.2.1 image

Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2017-03-14 11:27:23 +11:00
Thomas Huth
f244115cbd target/ppc: Fix wrong number of UAMR register
The SPR UAMR has the number 13, and not 12. (Fortunately it seems like
Linux is not using this register yet - only the privileged version with
number 29 ... that's why nobody noticed this problem yet)

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-14 11:12:10 +11:00
Peter Maydell
5bac3c39c8 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer fixes for 2.9.0-rc1

# gpg: Signature made Mon 13 Mar 2017 11:53:16 GMT
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream:
  commit: Implement .bdrv_refresh_filename
  mirror: Implement .bdrv_refresh_filename
  block: Refresh filename after changing backing file
  commit: Implement bdrv_commit_top.bdrv_co_get_block_status
  block: Request block status from *file for BDRV_BLOCK_RAW
  block: Remove check_new_perm from bdrv_replace_child()
  migration: Document handling of bdrv_is_allocated() errors
  vvfat: React to bdrv_is_allocated() errors
  backup: React to bdrv_is_allocated() errors
  block: Drop unmaintained 'archipelago' driver
  file-posix: Consider max_segments for BlockLimits.max_transfer
  backup: allow target without .bdrv_get_info

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-13 15:08:01 +00:00
Peter Maydell
f962709c69 Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging
x86: Haswell TSX blacklist fix for 2.9

# gpg: Signature made Fri 10 Mar 2017 18:45:08 GMT
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/x86-pull-request:
  i386: Change stepping of Haswell to non-blacklisted value
  i386/kvm: Blacklist TSX on known broken hosts
  i386: host_vendor_fms() helper function

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-13 13:16:35 +00:00
Kevin Wolf
dcbf37ce41 commit: Implement .bdrv_refresh_filename
We want query-block to return the right filename, even if a commit job
put a bdrv_commit_top on top of the actual image format driver. Let
bdrv_commit_top.bdrv_refresh_filename get the filename from its backing
file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-13 12:49:33 +01:00
Kevin Wolf
fd4a6493bb mirror: Implement .bdrv_refresh_filename
We want query-block to return the right filename, even if a mirror job
put a bdrv_mirror_top on top of the actual image format driver. Let
bdrv_mirror_top.bdrv_refresh_filename get the filename from its backing
file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-13 12:49:33 +01:00
Kevin Wolf
9e7e940c3d block: Refresh filename after changing backing file
In bdrv_open_inherit(), the filename is refreshed after opening the
backing file, but we neglected to do the same when the backing file
changes later.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-13 12:49:33 +01:00
Kevin Wolf
9196565866 commit: Implement bdrv_commit_top.bdrv_co_get_block_status
In some cases, bdrv_co_get_block_status() is called recursively for the
whole backing chain. The automatically inserted bdrv_commit_top filter
driver must not stop the recursion, so implement a callback that simply
forwards the request to bs->backing.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-13 12:49:33 +01:00
Kevin Wolf
b64aa44195 block: Request block status from *file for BDRV_BLOCK_RAW
This fixes bdrv_co_get_block_status() for the bdrv_mirror_top block
driver, which must fall through to bs->backing instead of bs->file.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-13 12:49:33 +01:00
Kevin Wolf
466787fbca block: Remove check_new_perm from bdrv_replace_child()
All callers pass false now, so the parameter can go away again.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-13 12:49:33 +01:00
Eric Blake
7d66b1fbd2 migration: Document handling of bdrv_is_allocated() errors
Migration is the only code left in the tree that does not react
to bdrv_is_allocated() failures.  But as there is no useful way
to react to the failure, and we are merely skipping unallocated
sectors on success, just document that our choice of handling
is intended.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-13 12:49:33 +01:00
Eric Blake
6f712ee080 vvfat: React to bdrv_is_allocated() errors
If bdrv_is_allocated() fails, we should react to that failure.
For 2 of the 3 callers, reporting the error was easy.  But in
cluster_was_modified() and its lone caller
get_cluster_count_for_direntry(), it's rather invasive to update
the logic to pass the error back; so there, I went with merely
documenting the issue by changing the return type to bool (in
all likelihood, treating the cluster as modified will then
trigger a read which will also fail, and eventually get to an
error - but given the appalling number of abort() calls in this
code, I'm not making it any worse).

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-13 12:49:33 +01:00
Eric Blake
666a9543fa backup: React to bdrv_is_allocated() errors
If bdrv_is_allocated() fails, we should immediately do the backup
error action, rather than attempting backup_do_cow() (although
that will likely fail too).

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-13 12:49:33 +01:00
Eric Blake
e32ccbc6e9 block: Drop unmaintained 'archipelago' driver
The driver has failed to build since commit da34e65, in qemu 2.6,
due to a missing include of qapi/error.h for error_setg().
Since no one has complained in three releases, it is easier to
remove the dead code than to keep it around, especially since it
is not being built by default and therefore prone to bitrot.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-13 12:49:33 +01:00
Fam Zheng
9103f1ceb4 file-posix: Consider max_segments for BlockLimits.max_transfer
BlockLimits.max_transfer can be too high without this fix, guest will
encounter I/O error or even get paused with werror=stop or rerror=stop. The
cause is explained below.

Linux has a separate limit, /sys/block/.../queue/max_segments, which in
the worst case can be more restrictive than the BLKSECTGET which we
already consider (note that they are two different things). So, the
failure scenario before this patch is:

1) host device has max_sectors_kb = 4096 and max_segments = 64;
2) guest learns max_sectors_kb limit from QEMU, but doesn't know
   max_segments;
3) guest issues e.g. a 512KB request thinking it's okay, but actually
   it's not, because it will be passed through to host device as an
   SG_IO req that has niov > 64;
4) host kernel doesn't like the segmenting of the request, and returns
   -EINVAL;

This patch checks the max_segments sysfs entry for the host device and
calculates a "conservative" bytes limit using the page size, which is
then merged into the existing max_transfer limit. Guest will discover
this from the usual virtual block device interfaces. (In the case of
scsi-generic, it will be done in the INQUIRY reply interception in
device model.)

The other possibility is to actually propagate it as a separate limit,
but it's not better. On the one hand, there is a big complication: the
limit is per-LUN in QEMU PoV (because we can attach LUNs from different
host HBAs to the same virtio-scsi bus), but the channel to communicate
it in a per-LUN manner is missing down the stack; on the other hand,
two limits versus one doesn't change much about the valid size of I/O
(because guest has no control over host segmenting).

Also, the idea to fall back to bounce buffering in QEMU, upon -EINVAL,
was explored. Unfortunately there is no neat way to ensure the bounce
buffer is less segmented (in terms of DMA addr) than the guest buffer.

Practically, this bug is not very common. It is only reported on a
Emulex (lpfc), so it's okay to get it fixed in the easier way.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-13 12:49:33 +01:00
Vladimir Sementsov-Ogievskiy
a410a7f1af backup: allow target without .bdrv_get_info
Currently backup to nbd target is broken, as nbd doesn't have
.bdrv_get_info realization.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-13 12:49:33 +01:00
Peter Maydell
b1616fe0e2 Merge remote-tracking branch 'remotes/famz/tags/docker-pull-request' into staging
# gpg: Signature made Fri 10 Mar 2017 07:15:38 GMT
# gpg:                using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021  AD56 CA35 624C 6A91 71C6

* remotes/famz/tags/docker-pull-request:
  docker/dockerfiles/debian-s390-cross: include clang
  tests/docker: support proxy / corporate firewall

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-13 11:26:36 +00:00
Max Filippov
f289bb091e target/xtensa: fix semihosting argc/argv implementation
So far xtensa provides fixed dummy argc/argv for the corresponding
semihosting calls. Now that there are semihosting_get_argc and
semihosting_get_arg, use them to pass actual command line arguments
to guest.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2017-03-11 14:59:03 -08:00
Max Filippov
0e80359e62 target/xtensa: xtfpga: load DTB only when FDT support is enabled
xtensa linux can use DTB but does not require it, so FDT support is not
a requirement for target/xtensa. Don't try to load DTB when FDT support
is not configured.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2017-03-11 14:59:03 -08:00
Eduardo Habkost
ec56a4a7b0 i386: Change stepping of Haswell to non-blacklisted value
glibc blacklists TSX on Haswell CPUs with model==60 and
stepping < 4. To make the Haswell CPU model more useful, make
those guests actually use TSX by changing CPU stepping to 4.

References:
* glibc commit 2702856bf45c82cf8e69f2064f5aa15c0ceb6359
  https://sourceware.org/git/?p=glibc.git;a=commit;h=2702856bf45c82cf8e69f2064f5aa15c0ceb6359

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170309181212.18864-4-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-03-10 15:01:09 -03:00
Eduardo Habkost
40e80ee411 i386/kvm: Blacklist TSX on known broken hosts
Some Intel CPUs are known to have a broken TSX implementation. A
microcode update from Intel disabled TSX on those CPUs, but
GET_SUPPORTED_CPUID might be reporting it as supported if the
hosts were not updated yet.

Manually fixup the GET_SUPPORTED_CPUID data to ensure we will
never enable TSX when running on those hosts.

Reference:
* glibc commit 2702856bf45c82cf8e69f2064f5aa15c0ceb6359:
  https://sourceware.org/git/?p=glibc.git;a=commit;h=2702856bf45c82cf8e69f2064f5aa15c0ceb6359

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170309181212.18864-3-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-03-10 15:01:08 -03:00
Eduardo Habkost
20271d4840 i386: host_vendor_fms() helper function
Helper function for code that needs to check the host CPU
vendor/family/model/stepping values.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170309181212.18864-2-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-03-10 15:01:08 -03:00
Alex Bennée
8ba1e5f72b docker/dockerfiles/debian-s390-cross: include clang
It's a silly little limitation on Shippable that is looks for clang
in the container even though we won't use it. The arm/aarch64 cross
builds inherit this from debian.docker but as we needed to use
debian-testing for this we add it here. We also collapse the update
step into one RUN line to remove and intermediate layer of the docker
build.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20170306112848.659-1-alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-03-10 15:05:22 +08:00
Peter Maydell
95b0eca46e Merge remote-tracking branch 'remotes/stsquad/tags/pull-mttcg-fixups-090317-1' into staging
Fix-ups for MTTCG regressions for 2.9

This is the same as v3 posted a few days ago except with a few extra
Reviewed-by tags added.

# gpg: Signature made Thu 09 Mar 2017 10:45:18 GMT
# gpg:                using RSA key 0xFBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>"
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8  DF35 FBD0 DB09 5A9E 2A44

* remotes/stsquad/tags/pull-mttcg-fixups-090317-1:
  hw/intc/arm_gic: modernise the DPRINTF
  target/arm/helper: make it clear the EC field is also in hex
  target-i386: defer VMEXIT to do_interrupt
  target/mips: hold BQL for timer interrupts
  translate-all: exit cpu_restore_state early if translating
  target/xtensa: hold BQL for interrupt processing
  s390x/misc_helper.c: wrap IO instructions in BQL
  sparc/sparc64: grab BQL before calling cpu_check_irqs
  cpus.c: add additional error_report when !TARGET_SUPPORT_MTTCG
  target/i386/cpu.h: declare TCG_GUEST_DEFAULT_MO
  vl/cpus: be smarter with icount and MTTCG

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-09 18:53:55 +00:00
Peter Maydell
dd4d257821 Merge remote-tracking branch 'remotes/kraxel/tags/pull-fixes-20170309-1' into staging
2.9 bugfixes for ohci and qxl

# gpg: Signature made Thu 09 Mar 2017 09:09:44 GMT
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-fixes-20170309-1:
  qxl: clear guest_cursor on QXL_CURSOR_HIDE
  ohci: relax link check

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-09 13:16:05 +00:00
Alex Bennée
68bf93ce9d hw/intc/arm_gic: modernise the DPRINTF
While I was debugging the icount issues I realised a bunch of the
messages look quite similar. I've fixed this by including __func__ in
the debug print. At the same time I move the a modern if (GATE) style
printf which ensures the compiler can check for format string errors
even if the code gets optimised away in the non-DEBUG_GIC case.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-09 10:41:49 +00:00
Alex Bennée
6568da459b target/arm/helper: make it clear the EC field is also in hex
..just like the rest of the displayed ESR register. Otherwise people
might scratch their heads if a not obviously hex number is displayed
for the EC field.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-09 10:41:48 +00:00
Paolo Bonzini
10cde894b6 target-i386: defer VMEXIT to do_interrupt
Paths through the softmmu code during code generation now need to be audited
to check for double locking of tb_lock.  In particular, VMEXIT can take tb_lock
through cpu_vmexit -> cpu_x86_update_cr4 -> tlb_flush.

To avoid this, split VMEXIT delivery in two parts, similar to what is done with
exceptions.  cpu_vmexit only records the VMEXIT exit code and information, and
cc->do_interrupt can then deliver it when it is safe to take the lock.

Reported-by: Alexander Boettcher <alexander.boettcher@genode-labs.com>
Suggested-by: Richard Henderson <rth@twiddle.net>
Tested-by: Alexander Boettcher <alexander.boettcher@genode-labs.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-03-09 10:41:48 +00:00
Yongbok Kim
d394698d73 target/mips: hold BQL for timer interrupts
Hold BQL when accessing timer which can cause interrupts

Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-03-09 10:41:48 +00:00
Alex Bennée
d8b2239bcd translate-all: exit cpu_restore_state early if translating
The translation code uses cpu_ld*_code which can trigger a tlb_fill
which if it fails will erroneously attempts a fault resolution. This
never works during translation as the TB being generated hasn't been
added yet. The target should have checked retaddr before calling
cpu_restore_state but for those that have yet to be fixed we do it
here to avoid a recursive tb_lock() under MTTCG's new locking regime.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-03-09 10:41:43 +00:00
Alex Bennée
47e2088797 target/xtensa: hold BQL for interrupt processing
Make sure we have the BQL held when processing interrupts.

Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
2017-03-09 10:41:43 +00:00
Alex Bennée
278f5e98c6 s390x/misc_helper.c: wrap IO instructions in BQL
Helpers that can trigger IO events (including interrupts) need to be
protected by the BQL. I've updated all the helpers that call into an
ioinst_handle_* functions.

Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-09 10:41:43 +00:00
Alex Bennée
5ee5993001 sparc/sparc64: grab BQL before calling cpu_check_irqs
IRQ modification is part of device emulation and should be done while
the BQL is held to prevent races when MTTCG is enabled. This adds
assertions in the hw emulation layer and wraps the calls from helpers
in the BQL.

Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2017-03-09 10:41:38 +00:00
Alex Bennée
c34c762015 cpus.c: add additional error_report when !TARGET_SUPPORT_MTTCG
While we may fail the memory ordering check later that can be
confusing. So in cases where TARGET_SUPPORT_MTTCG has yet to be
defined we should say so specifically.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-09 10:38:16 +00:00
Alex Bennée
72c1701f62 target/i386/cpu.h: declare TCG_GUEST_DEFAULT_MO
This suppresses the incorrect warning when forcing MTTCG for x86
guests on x86 hosts. A future patch will still warn when
TARGET_SUPPORT_MTTCG hasn't been defined for the guest (which is still
pending for x86).

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
2017-03-09 10:38:02 +00:00
Alex Bennée
83fd9629a3 vl/cpus: be smarter with icount and MTTCG
The sense of the test was inverted. Make it simple, if icount is
enabled then we disabled MTTCG by default. If the user tries to force
MTTCG upon us then we tell them "no".

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
2017-03-09 10:38:02 +00:00
Gerd Hoffmann
dbb5fb8d35 qxl: clear guest_cursor on QXL_CURSOR_HIDE
Make sure we don't leave guest_cursor pointing into nowhere.  This might
lead to (rare) live migration failures, due to target trying to restore
the cursor from the stale pointer.

Bugzilla: https://bugzilla.redhat.com/show_bug.cgi?id=1421788
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1488789111-27340-1-git-send-email-kraxel@redhat.com
2017-03-09 09:47:26 +01:00
Gerd Hoffmann
ab6b1105a2 ohci: relax link check
The strict td link limit added by commit "95ed569 usb: ohci: limit the
number of link eds" causes problems with macos guests.  Lets raise the
limit.

Reported-by: Programmingkid <programmingkidx@gmail.com>
Reported-by: Howard Spoelstra <hsp.cat7@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: John Arbuckle <programmingkidx@gmail.com>
Message-id: 1488876018-31576-1-git-send-email-kraxel@redhat.com
2017-03-09 09:46:13 +01:00
Peter Maydell
b64842dee4 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer fixes for 2.9.0-rc0

# gpg: Signature made Tue 07 Mar 2017 14:59:18 GMT
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream: (27 commits)
  commit: Don't use error_abort in commit_start
  block: Don't use error_abort in blk_new_open
  sheepdog: Support blockdev-add
  qapi-schema: Rename SocketAddressFlat's variant tcp to inet
  qapi-schema: Rename GlusterServer to SocketAddressFlat
  gluster: Plug memory leaks in qemu_gluster_parse_json()
  gluster: Don't duplicate qapi-util.c's qapi_enum_parse()
  gluster: Drop assumptions on SocketTransport names
  sheepdog: Implement bdrv_parse_filename()
  sheepdog: Use SocketAddress and socket_connect()
  sheepdog: Report errors in pseudo-filename more usefully
  sheepdog: Don't truncate long VDI name in _open(), _create()
  sheepdog: Fix snapshot ID parsing in _open(), _create, _goto()
  sheepdog: Mark sd_snapshot_delete() lossage FIXME
  sheepdog: Fix error handling sd_create()
  sheepdog: Fix error handling in sd_snapshot_delete()
  sheepdog: Defuse time bomb in sd_open() error handling
  block: Fix error handling in bdrv_replace_in_backing_chain()
  block: Handle permission errors in change_parent_backing_link()
  block: Ignore multiple children in bdrv_check_update_perm()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-08 09:47:52 +00:00
Peter Maydell
87467097f8 Merge remote-tracking branch 'remotes/armbru/tags/pull-block-2017-02-28-v4' into staging
block: Command line option -blockdev

# gpg: Signature made Tue 07 Mar 2017 15:07:59 GMT
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-block-2017-02-28-v4: (24 commits)
  keyval: Support lists
  docs/qapi-code-gen.txt: Clarify naming rules
  qapi: Improve how keyval input visitor reports unexpected dicts
  block: Initial implementation of -blockdev
  qapi: New qobject_input_visitor_new_str() for convenience
  keyval: Restrict key components to valid QAPI names
  qapi: New parse_qapi_name()
  test-qapi-util: New, covering qapi/qapi-util.c
  monitor: Assert qmp_schema_json[] is sane
  test-visitor-serialization: Pass &error_abort to qobject_from_json()
  check-qjson: Test errors from qobject_from_json()
  block: More detailed syntax error reporting for JSON filenames
  qobject: Propagate parse errors through qobject_from_json()
  test-qobject-input-visitor: Abort earlier on bad test input
  qjson: Abort earlier on qobject_from_jsonf() misuse
  libqtest: Fix qmp() & friends to abort on JSON parse errors
  qobject: Propagate parse errors through qobject_from_jsonv()
  qapi: Factor out common qobject_input_get_keyval()
  qapi: Factor out common part of qobject input visitor creation
  test-keyval: Cover use with qobject input visitor
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-07 17:06:48 +00:00
Markus Armbruster
0b2c1beea4 keyval: Support lists
Additionally permit non-negative integers as key components.  A
dictionary's keys must either be all integers or none.  If all keys
are integers, convert the dictionary to a list.  The set of keys must
be [0,N].

Examples:

* list.1=goner,list.0=null,list.1=eins,list.2=zwei
  is equivalent to JSON [ "null", "eins", "zwei" ]

* a.b.c=1,a.b.0=2
  is inconsistent: a.b.c clashes with a.b.0

* list.0=null,list.2=eins,list.2=zwei
  has a hole: list.1 is missing

Similar design flaw as for objects: there is no way to denote an empty
list.  While interpreting "key absent" as empty list seems natural
(removing a list member from the input string works when there are
multiple ones, so why not when there's just one), it doesn't work:
"key absent" already means "optional list absent", which isn't the
same as "empty list present".

Update the keyval object visitor to use this a.0 syntax in error
messages rather than the usual a[0].

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1488317230-26248-25-git-send-email-armbru@redhat.com>
[Off-by-one fix squashed in, as per Kevin's review]
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 16:07:48 +01:00
Markus Armbruster
79f7598164 docs/qapi-code-gen.txt: Clarify naming rules
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1488317230-26248-24-git-send-email-armbru@redhat.com>
2017-03-07 16:07:48 +01:00
Markus Armbruster
31478f26ab qapi: Improve how keyval input visitor reports unexpected dicts
Incorrect option

    -blockdev node-name=foo,driver=file,filename=foo.img,aio.unmap=on

is rejected with "Invalid parameter type for 'aio', expected: string".
To make sense of this, you almost have to translate it into the
equivalent QMP command

    { "execute": "blockdev-add", "arguments": { "node-name": "foo", "driver": "file", "filename": "foo.img", "aio": { "unmap": true } } }

Improve the error message to "Parameters 'aio.*' are unexpected".
Take care not to confuse the case "unexpected nested parameters"
(i.e. the object is a QDict or QList) with the case "non-string scalar
parameter".  The latter is a misuse of the visitor, and should perhaps
be an assertion.  Note that test-qobject-input-visitor exercises this
misuse in test_visitor_in_int_keyval(), test_visitor_in_bool_keyval()
and test_visitor_in_number_keyval().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1488317230-26248-23-git-send-email-armbru@redhat.com>
2017-03-07 16:07:47 +01:00
Markus Armbruster
42e5f39378 block: Initial implementation of -blockdev
The new command line option -blockdev works like QMP command
blockdev-add.

The option argument may be given in JSON syntax, exactly as in QMP.
Example usage:

    -blockdev '{"node-name": "foo", "driver": "raw", "file": {"driver": "file", "filename": "foo.img"} }'

The JSON argument doesn't exactly blend into the existing option
syntax, so the traditional KEY=VALUE,... syntax is also supported,
using dotted keys to do the nesting:

    -blockdev node-name=foo,driver=raw,file.driver=file,file.filename=foo.img

This does not yet support lists, but that will be addressed shortly.

Note that calling qmp_blockdev_add() (say via qmp_marshal_block_add())
right away would crash.  We need to stash the configuration for later
instead.  This is crudely done, and bypasses QemuOpts, even though
storing configuration is what QemuOpts is for.  Need to revamp option
infrastructure to support QAPI types like BlockdevOptions.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488317230-26248-22-git-send-email-armbru@redhat.com>
2017-03-07 16:07:47 +01:00
Markus Armbruster
9d1eab4b95 qapi: New qobject_input_visitor_new_str() for convenience
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1488317230-26248-21-git-send-email-armbru@redhat.com>
2017-03-07 16:07:47 +01:00
Markus Armbruster
f740048323 keyval: Restrict key components to valid QAPI names
Until now, key components are separated by '.'.  This leaves little
room for evolving the syntax, and is incompatible with the __RFQDN_
prefix convention for downstream extensions.

Since key components will be commonly used as QAPI member names by the
QObject input visitor, we can just as well borrow the QAPI naming
rules here: letters, digits, hyphen and period starting with a letter,
with an optional __RFQDN_ prefix for downstream extensions.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1488317230-26248-20-git-send-email-armbru@redhat.com>
2017-03-07 16:07:47 +01:00
Markus Armbruster
069b64e3fe qapi: New parse_qapi_name()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488317230-26248-19-git-send-email-armbru@redhat.com>
2017-03-07 16:07:47 +01:00
Markus Armbruster
6c873d1149 test-qapi-util: New, covering qapi/qapi-util.c
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1488317230-26248-18-git-send-email-armbru@redhat.com>
2017-03-07 16:07:47 +01:00
Markus Armbruster
b0f36b23f1 monitor: Assert qmp_schema_json[] is sane
qmp_query_qmp_schema() parses qmp_schema_json[] with
qobject_from_json().  This must not fail, so pass &error_abort.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1488317230-26248-17-git-send-email-armbru@redhat.com>
2017-03-07 16:07:47 +01:00
Markus Armbruster
02146d27c3 test-visitor-serialization: Pass &error_abort to qobject_from_json()
qmp_deserialize() calls qobject_from_json() ignoring errors.  It
passes the result to qobject_input_visitor_new(), which asserts it's
not null.  Therefore, we can just as well pass &error_abort to
qobject_from_json().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1488317230-26248-16-git-send-email-armbru@redhat.com>
2017-03-07 16:07:47 +01:00
Markus Armbruster
aec4b054ea check-qjson: Test errors from qobject_from_json()
Pass &error_abort with known-good input.  Else pass &err and check
what comes back.  This demonstrates that the parser fails silently for
many errors.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1488317230-26248-15-git-send-email-armbru@redhat.com>
2017-03-07 16:07:47 +01:00
Markus Armbruster
5577fff738 block: More detailed syntax error reporting for JSON filenames
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1488317230-26248-14-git-send-email-armbru@redhat.com>
2017-03-07 16:07:47 +01:00
Markus Armbruster
57348c2f18 qobject: Propagate parse errors through qobject_from_json()
The next few commits will put the errors to use where appropriate.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488317230-26248-13-git-send-email-armbru@redhat.com>
2017-03-07 16:07:47 +01:00
Markus Armbruster
bff17e84a9 test-qobject-input-visitor: Abort earlier on bad test input
visitor_input_test_init_internal() parses test input with
qobject_from_jsonv(), and asserts it succeeds.  Pass &error_abort for
good measure.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1488317230-26248-12-git-send-email-armbru@redhat.com>
2017-03-07 16:07:47 +01:00
Markus Armbruster
ea5ef5c80b qjson: Abort earlier on qobject_from_jsonf() misuse
Ignoring errors first, then asserting success is suboptimal.  Pass
&error_abort instead, so we abort earlier, and hopefully get more
useful clues on what's wrong.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1488317230-26248-11-git-send-email-armbru@redhat.com>
2017-03-07 16:07:47 +01:00
Markus Armbruster
53f991520e libqtest: Fix qmp() & friends to abort on JSON parse errors
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1488317230-26248-10-git-send-email-armbru@redhat.com>
2017-03-07 16:07:47 +01:00
Markus Armbruster
99dbfd1db1 qobject: Propagate parse errors through qobject_from_jsonv()
The next few commits will put the errors to use where appropriate.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1488317230-26248-9-git-send-email-armbru@redhat.com>
2017-03-07 16:07:47 +01:00
Markus Armbruster
e3934b4297 qapi: Factor out common qobject_input_get_keyval()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488317230-26248-8-git-send-email-armbru@redhat.com>
2017-03-07 16:07:47 +01:00
Markus Armbruster
abe81bc21a qapi: Factor out common part of qobject input visitor creation
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1488317230-26248-7-git-send-email-armbru@redhat.com>
2017-03-07 16:07:46 +01:00
Markus Armbruster
9e3943f883 test-keyval: Cover use with qobject input visitor
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1488317230-26248-6-git-send-email-armbru@redhat.com>
2017-03-07 16:07:46 +01:00
Daniel P. Berrange
cbd8acf38f qapi: qobject input visitor variant for use with keyval_parse()
Currently the QObjectInputVisitor assumes that all scalar values are
directly represented as the final types declared by the thing being
visited. i.e. it assumes an 'int' is using QInt, and a 'bool' is using
QBool, etc.  This is good when QObjectInputVisitor is fed a QObject
that came from a JSON document on the QMP monitor, as it will strictly
validate correctness.

To allow QObjectInputVisitor to be reused for visiting a QObject
originating from keyval_parse(), an alternative mode is needed where
all the scalars types are represented as QString and converted on the
fly to the final desired type.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1475246744-29302-8-git-send-email-berrange@redhat.com>

Rebased, conflicts resolved, commit message updated to refer to
keyval_parse().  autocast replaced by keyval in identifiers,
noautocast replaced by fail in tests.

Fix qobject_input_type_uint64_keyval() not to reject '-', for QemuOpts
compatibility: replace parse_uint_full() by open-coded
parse_option_number().  The next commit will add suitable tests.
Leave out the fancy ERANGE error reporting for now, but add a TODO
comment.  Add it qobject_input_type_int64_keyval() and
qobject_input_type_number_keyval(), too.

Open code parse_option_bool() and parse_option_size() so we have to
call qobject_input_get_name() only when actually needed.  Again, leave
out ERANGE error reporting for now.

QAPI/QMP downstream extension prefixes __RFQDN_ don't work, because
keyval_parse() splits them at '.'.  This will be addressed later in
the series.

qobject_input_type_int64_keyval(), qobject_input_type_uint64_keyval(),
qobject_input_type_number_keyval() tweaked for style.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1488317230-26248-5-git-send-email-armbru@redhat.com>
2017-03-07 16:07:46 +01:00
Markus Armbruster
d454dbe0ee keyval: New keyval_parse()
keyval_parse() parses KEY=VALUE,... into a QDict.  Works like
qemu_opts_parse(), except:

* Returns a QDict instead of a QemuOpts (d'oh).

* Supports nesting, unlike QemuOpts: a KEY is split into key
  fragments at '.' (dotted key convention; the block layer does
  something similar on top of QemuOpts).  The key fragments are QDict
  keys, and the last one's value is updated to VALUE.

* Each key fragment may be up to 127 bytes long.  qemu_opts_parse()
  limits the entire key to 127 bytes.

* Overlong key fragments are rejected.  qemu_opts_parse() silently
  truncates them.

* Empty key fragments are rejected.  qemu_opts_parse() happily
  accepts empty keys.

* It does not store the returned value.  qemu_opts_parse() stores it
  in the QemuOptsList.

* It does not treat parameter "id" specially.  qemu_opts_parse()
  ignores all but the first "id", and fails when its value isn't
  id_wellformed(), or duplicate (a QemuOpts with the same ID is
  already stored).  It also screws up when a value contains ",id=".

* Implied value is not supported.  qemu_opts_parse() desugars "foo" to
  "foo=on", and "nofoo" to "foo=off".

* An implied key's value can't be empty, and can't contain ','.

I intend to grow this into a saner replacement for QemuOpts.  It'll
take time, though.

Note: keyval_parse() provides no way to do lists, and its key syntax
is incompatible with the __RFQDN_ prefix convention for downstream
extensions, because it blindly splits at '.', even in __RFQDN_.  Both
issues will be addressed later in the series.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1488317230-26248-4-git-send-email-armbru@redhat.com>
2017-03-07 16:07:46 +01:00
Markus Armbruster
112c944655 tests: Fix gcov-files-test-qemu-opts-y, gcov-files-test-logging-y
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1488317230-26248-3-git-send-email-armbru@redhat.com>
2017-03-07 16:07:46 +01:00
Markus Armbruster
0e2052b260 test-qemu-opts: Cover qemu_opts_parse() of "no"
qemu_opts_parse() interprets "no" as negated empty key.  Consistent
with its acceptance of empty keys elsewhere, whatever that's worth.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1488317230-26248-2-git-send-email-armbru@redhat.com>
2017-03-07 16:07:46 +01:00
Peter Maydell
43c227f9dd disas/arm: Avoid unintended sign extension
When assembling 'given' from the instruction bytes, C's integer
promotion rules mean we may promote an unsigned char to a signed
integer before shifting it, and then sign extend to a 64-bit long,
which can set the high bits of the long.  The code doesn't in fact
care about the high bits if the long is 64 bits, but this is
surprising, so don't do it.

(Spotted by Coverity, CID 1005404.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1488556233-31246-7-git-send-email-peter.maydell@linaro.org
2017-03-07 14:33:51 +00:00
Peter Maydell
001ebaca7b disas/cris: Avoid unintended sign extension
In the cris disassembler we were using 'unsigned long' to calculate
addresses which are supposed to be 32 bits.  This meant that we might
accidentally sign extend or calculate a value that was outside the 32
bit range of the guest CPU.  Use 'uint32_t' instead so we give the
right answers on 64-bit hosts.

(Spotted by Coverity, CID 1005402, 1005403.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1488556233-31246-6-git-send-email-peter.maydell@linaro.org
2017-03-07 14:33:51 +00:00
Peter Maydell
1d153a3388 disas/microblaze: Avoid unintended sign extension
In read_insn_microblaze() we assemble 4 bytes into an 'unsigned
long'.  If 'unsigned long' is 64 bits and the high byte has its top
bit set, then C's implicit conversion from 'unsigned char' to 'int'
for the shift will result in an unintended sign extension which sets
the top 32 bits in 'inst'.  Add casts to prevent this.  (Spotted by
Coverity, CID 1005401.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1488556233-31246-5-git-send-email-peter.maydell@linaro.org
2017-03-07 14:33:51 +00:00
Peter Maydell
2e3883d03d disas/m68k: Avoid unintended sign extension in get_field()
In get_field(), we take an 'unsigned char' value and shift it left,
which implicitly promotes it to 'signed int', before ORing it into an
'unsigned long' type.  If 'unsigned long' is 64 bits then this will
result in a sign extension and the top 32 bits of the result will be
1s.  Add explicit casts to unsigned long before shifting to prevent
this.

(Spotted by Coverity, CID 715697.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-id: 1488556233-31246-4-git-send-email-peter.maydell@linaro.org
2017-03-07 14:33:51 +00:00
Peter Maydell
3f168b5d35 disas/i386: Avoid NULL pointer dereference in error case
In a code path where we hit an internal disassembler error, execution
would subsequently attempt to dereference a NULL pointer.  This
should never happen, but avoid the crash.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1488556233-31246-3-git-send-email-peter.maydell@linaro.org
2017-03-07 14:33:51 +00:00
Peter Maydell
6815a8a00a disas/hppa: Remove dead code
Coverity complains (CID 1302705) that the "fr0" part of the ?: in
fput_fp_reg_r() is dead.  This looks like cut-n-paste error from
fput_fp_reg(); delete the dead code.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1488556233-31246-2-git-send-email-peter.maydell@linaro.org
2017-03-07 14:33:51 +00:00
Fam Zheng
b69f00dde4 commit: Don't use error_abort in commit_start
bdrv_set_backing_hd failure needn't be abort. Since we already have
error parameter, use it.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:29 +01:00
Fam Zheng
50bfbe93b2 block: Don't use error_abort in blk_new_open
We have an errp and bdrv_root_attach_child can fail permission check,
error_abort is not the best choice here.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:29 +01:00
Markus Armbruster
d282f34e82 sheepdog: Support blockdev-add
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:29 +01:00
Markus Armbruster
c5f1ae3ae7 qapi-schema: Rename SocketAddressFlat's variant tcp to inet
QAPI type SocketAddressFlat differs from SocketAddress pointlessly:
the discriminator value for variant InetSocketAddress is 'tcp' instead
of 'inet'.  Rename.

The type is so far only used by the Gluster block drivers.  Take care
to keep 'tcp' working in things like -drive's file.server.0.type=tcp.
The "gluster+tcp" URI scheme in pseudo-filenames stays the same.
blockdev-add changes, but it has changed incompatibly since 2.8
already.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:29 +01:00
Markus Armbruster
2b733709d7 qapi-schema: Rename GlusterServer to SocketAddressFlat
As its documentation says, it's not specific to Gluster.  Rename it,
as I'm going to use it for something else.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:29 +01:00
Markus Armbruster
85a82e852d gluster: Plug memory leaks in qemu_gluster_parse_json()
To reproduce, run

    $ valgrind qemu-system-x86_64 --nodefaults -S --drive driver=gluster,volume=testvol,path=/a/b/c,server.0.type=xxx

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:29 +01:00
Markus Armbruster
e152ef7a98 gluster: Don't duplicate qapi-util.c's qapi_enum_parse()
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:29 +01:00
Markus Armbruster
fc29458dee gluster: Drop assumptions on SocketTransport names
qemu_gluster_glfs_init() passes the names of QAPI enumeration type
SocketTransport to glfs_set_volfile_server().  Works, because they
were chosen to match.  But the coupling is artificial.  Use the
appropriate literal strings instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster
831acdc95e sheepdog: Implement bdrv_parse_filename()
This permits configuration with driver-specific options in addition to
pseudo-filename parsed as URI.  For instance,

    --drive driver=sheepdog,host=fido,vdi=dolly

instead of

    --drive driver=sheepdog,file=sheepdog://fido/dolly

It's also a first step towards supporting blockdev-add.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster
8ecc2f9eab sheepdog: Use SocketAddress and socket_connect()
sd_parse_uri() builds a string from host and port parts for
inet_connect().  inet_connect() parses it into host, port and options.
Whether this gets exactly the same host, port and no options for all
inputs is not obvious.

Cut out the string middleman and build a SocketAddress for
socket_connect() instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster
36bcac16fd sheepdog: Report errors in pseudo-filename more usefully
Errors in the pseudo-filename are all reported with the same laconic
"Can't parse filename" message.

Add real error reporting, such as:

    $ qemu-system-x86_64 --drive driver=sheepdog,filename=sheepdog:///
    qemu-system-x86_64: --drive driver=sheepdog,filename=sheepdog:///: missing file path in URI
    $ qemu-system-x86_64 --drive driver=sheepdog,filename=sheepgod:///vdi
    qemu-system-x86_64: --drive driver=sheepdog,filename=sheepgod:///vdi: URI scheme must be 'sheepdog', 'sheepdog+tcp', or 'sheepdog+unix'
    $ qemu-system-x86_64 --drive driver=sheepdog,filename=sheepdog+unix:///vdi?socke=sheepdog.sock
    qemu-system-x86_64: --drive driver=sheepdog,filename=sheepdog+unix:///vdi?socke=sheepdog.sock: unexpected query parameters

The code to translate legacy syntax to URI fails to escape URI
meta-characters.  The new error messages are misleading then.  Replace
them by the old "Can't parse filename" message.  "Internal error"
would be more honest.  Anyway, no worse than before.  Also add a FIXME
comment.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster
daa0b0d4b1 sheepdog: Don't truncate long VDI name in _open(), _create()
sd_parse_uri() truncates long VDI names silently.  Reject them
instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster
89e2a31d33 sheepdog: Fix snapshot ID parsing in _open(), _create, _goto()
sd_parse_uri() and sd_snapshot_goto() screw up error checking after
strtoul(), and truncate long tag names silently.  Fix by replacing
those parts by new sd_parse_snapid_or_tag(), which checks more
carefully.

sd_snapshot_delete() also parses snapshot IDs, but is currently too
broken for me to touch.  Mark TODO.

Two calls of strtol() without error checking remain in
parse_redundancy().  Mark them FIXME.

More silent truncation of configuration strings remains elsewhere.
Not marked.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster
a0dc0e2bfe sheepdog: Mark sd_snapshot_delete() lossage FIXME
sd_snapshot_delete() should delete the snapshot whose ID matches
@snapshot_id and whose name matches @name.  But that's not what it
does.  If @snapshot_id is a valid ID, it deletes the snapshot with
that ID, else it deletes the snapshot with that name.  It doesn't use
@name at all.  Add suitable FIXME comments, so someone who actually
knows Sheepdog can fix it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster
48d7c4af06 sheepdog: Fix error handling sd_create()
As a bdrv_create() method, sd_create() must set an error and return
negative errno on failure.  It prints the error instead of setting it
when connect_to_sdog() fails.  Fix that.

While there, return the value of connect_to_sdog() like we do
elsewhere, instead of -EIO.  No functional change, as
connect_to_sdog() returns no other error code.

Many more suspicious uses of error_report() and error_report_err()
remain in other functions.  Left for another day.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster
e25cad6921 sheepdog: Fix error handling in sd_snapshot_delete()
As a bdrv_snapshot_delete() method, sd_snapshot_delete() must set an
error and return negative errno on failure.  It sometimes returns -1,
and sometimes neglects to set an error.  It also prints error messages
with error_report().  Fix all that.

Moreover, its handling of an attempt to delete a nonexistent snapshot
is wrong: it error_report()s and succeeds.  Fix it to set an error and
return -ENOENT instead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Markus Armbruster
cbc488ee2a sheepdog: Defuse time bomb in sd_open() error handling
When qemu_opts_absorb_qdict() fails, sd_open() closes stdin, because
sd->fd is still zero.  Fortunately, qemu_opts_absorb_qdict() can't
fail, because:

1. it only fails when qemu_opt_parse() fails, and
2. the only member of runtime_opts.desc[] is a QEMU_OPT_STRING, and
3. qemu_opt_parse() can't fail for QEMU_OPT_STRING.

Defuse this ticking time bomb by jumping behind the file descriptor
cleanup on error.

Also do that for the error paths where sd->fd is still -1.  The file
descriptor cleanup happens to do nothing then, but let's not rely on
that here.

While there, rename label out to err, because it's on the error path,
not the normal path out of the function.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2017-03-07 14:53:28 +01:00
Kevin Wolf
5fe31c25cc block: Fix error handling in bdrv_replace_in_backing_chain()
When adding an Error parameter, bdrv_replace_in_backing_chain() would
become nothing more than a wrapper around change_parent_backing_link().
So make the latter public, renamed as bdrv_replace_node(), and remove
bdrv_replace_in_backing_chain().

Most of the callers just remove a node from the graph that they just
inserted, so they can use &error_abort, but completion of a mirror job
with 'replaces' set can actually fail.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07 14:53:28 +01:00
Kevin Wolf
234ac1a902 block: Handle permission errors in change_parent_backing_link()
Instead of just trying to change parents by parent over to reference @to
instead of @from, and abort()ing whenever the permissions don't allow
this, do proper permission checking beforehand and pass any error to the
callers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07 14:53:28 +01:00
Kevin Wolf
46181129ea block: Ignore multiple children in bdrv_check_update_perm()
change_parent_backing_link() will need to update multiple BdrvChild
objects at once. Checking permissions reference by reference doesn't
work because permissions need to be consistent only with all parents
moved to the new child.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07 14:53:28 +01:00
Kevin Wolf
8ee039951d block: Factor out bdrv_replace_child_noperm()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07 14:53:28 +01:00
Kevin Wolf
d0ac038025 block: Factor out should_update_child()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-07 14:53:28 +01:00
Kevin Wolf
067acf28d1 block: Fix blockdev-snapshot error handling
For blockdev-snapshot, external_snapshot_prepare() accepts an arbitrary
node reference at first and only checks later whether it already has a
backing file. Between those places, other errors can occur.

Therefore checking in external_snapshot_abort() whether state->new_bs
has a backing file is not sufficient to tell whether bdrv_append() was
already completed or not. Trying to undo the bdrv_append() when it
wasn't even executed is wrong.

Introduce a new boolean flag in the state to fix this.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07 14:53:28 +01:00
Kevin Wolf
88f9d1b3d2 mirror: Fix error path for dirty bitmap creation
mirror_top_bs must be removed from the graph again when creating the
dirty bitmap fails.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07 14:53:28 +01:00
Kevin Wolf
0bf74767ff mirror: Fix permissions for removing mirror_top_bs
mirror_top_bs takes write permissions on its backing file, which can
make it impossible to attach that backing file node to another parent.
However, this is exactly what needs to be done in order to remove
mirror_top_bs from the backing chain. So give up the write permission
first.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07 14:53:28 +01:00
Kevin Wolf
7d9fcb391c mirror: Fix permission problem with 'replaces'
The 'replaces' option of drive-mirror can be used to mirror a Quorum
node to a new image and then let the target image replace one of the
Quorum children. In order for this graph modification to succeed, the
mirror job needs to lift its restrictions on the target node first
before actually replacing the child.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-07 14:53:28 +01:00
Kevin Wolf
b247767aac commit: Fix error handling
Apparently some kind of mismerge happened in commit 8dfba279, which
broke the error handling without any real reason by removing the
assignment of the return value to ret in a blk_insert_bs() call.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-07 14:53:28 +01:00
Philippe Mathieu-Daudé
06cc355171 tests/docker: support proxy / corporate firewall
if ftp_proxy/http_proxy/https_proxy standard environment variables available,
pass them to the docker daemon to build images.
this is required when building behind corporate proxy/firewall, but also help
when using local cache server (ie: apt/yum).

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20170306205520.32311-1-f4bug@amsat.org>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2017-03-07 18:20:40 +08:00
Peter Maydell
ff79d5e939 Merge remote-tracking branch 'remotes/xtensa/tags/20170306-xtensa' into staging
target/xtensa updates:

- instantiate local memories in xtensa sim machine;
- add two missing include files to xtensa core importing script.

# gpg: Signature made Mon 06 Mar 2017 22:32:45 GMT
# gpg:                using RSA key 0x51F9CC91F83FA044
# gpg: Good signature from "Max Filippov <filippov@cadence.com>"
# gpg:                 aka "Max Filippov <max.filippov@cogentembedded.com>"
# gpg:                 aka "Max Filippov <jcmvbkbc@gmail.com>"
# Primary key fingerprint: 2B67 854B 98E5 327D CDEB  17D8 51F9 CC91 F83F A044

* remotes/xtensa/tags/20170306-xtensa:
  target/xtensa: add two missing headers to core import script
  target/xtensa: sim: instantiate local memories

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-07 09:57:14 +00:00
Peter Maydell
d6780c8221 Merge remote-tracking branch 'remotes/gkurz/tags/fixes-for-2.9' into staging
Fixes issues that got merged with the latest pull request:
- missing O_NOFOLLOW flag for CVE-2016-960
- build break with older glibc that don't have O_PATH and AT_EMPTY_PATH
- various bugs reported by Coverity

# gpg: Signature made Mon 06 Mar 2017 17:51:29 GMT
# gpg:                using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg:                 aka "Greg Kurz <groug@free.fr>"
# gpg:                 aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg:                 aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg:                 aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894  DBA2 02FC 3AEB 0101 DBC2

* remotes/gkurz/tags/fixes-for-2.9:
  9pfs: fix vulnerability in openat_dir() and local_unlinkat_common()
  9pfs: fix O_PATH build break with older glibc versions
  9pfs: don't use AT_EMPTY_PATH in local_set_cred_passthrough()
  9pfs: fail local_statfs() earlier
  9pfs: fix fd leak in local_opendir()
  9pfs: fix bogus fd check in local_remove()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-07 09:09:53 +00:00
Peter Maydell
7dc3bc7a04 Merge remote-tracking branch 'remotes/mdroth/tags/qga-pull-2017-03-06-tag' into staging
qemu-ga patch queue for 2.9

* fix fsfreeze for filesystems mounted in multiple locations
* fix test failure when running in a chroot
* support for socket-based activation

# gpg: Signature made Mon 06 Mar 2017 07:54:17 GMT
# gpg:                using RSA key 0x3353C9CEF108B584
# gpg: Good signature from "Michael Roth <flukshun@gmail.com>"
# gpg:                 aka "Michael Roth <mdroth@utexas.edu>"
# gpg:                 aka "Michael Roth <mdroth@linux.vnet.ibm.com>"
# Primary key fingerprint: CEAC C9E1 5534 EBAB B82D  3FA0 3353 C9CE F108 B584

* remotes/mdroth/tags/qga-pull-2017-03-06-tag:
  tests: check path to avoid a failing qga/get-vcpus test
  qga: ignore EBUSY when freezing a filesystem
  qga: add systemd socket activation support

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-07 07:32:28 +00:00
Greg Kurz
b003fc0d8a 9pfs: fix vulnerability in openat_dir() and local_unlinkat_common()
We should pass O_NOFOLLOW otherwise openat() will follow symlinks and make
QEMU vulnerable.

While here, we also fix local_unlinkat_common() to use openat_dir() for
the same reasons (it was a leftover in the original patchset actually).

This fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-06 17:34:01 +01:00
Greg Kurz
918112c02a 9pfs: fix O_PATH build break with older glibc versions
When O_PATH is used with O_DIRECTORY, it only acts as an optimization: the
openat() syscall simply finds the name in the VFS, and doesn't trigger the
underlying filesystem.

On systems that don't define O_PATH, because they have glibc version 2.13
or older for example, we can safely omit it. We don't want to deactivate
O_PATH globally though, in case it is used without O_DIRECTORY. The is done
with a dedicated macro.

Systems without O_PATH may thus fail to resolve names that involve
unreadable directories, compared to newer systems succeeding, but such
corner case failure is our only option on those older systems to avoid
the security hole of chasing symlinks inappropriately.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
(added last paragraph to changelog as suggested by Eric Blake)
Signed-off-by: Greg Kurz <groug@kaod.org>
2017-03-06 17:34:01 +01:00
Greg Kurz
b314f6a077 9pfs: don't use AT_EMPTY_PATH in local_set_cred_passthrough()
The name argument can never be an empty string, and dirfd always point to
the containing directory of the file name. AT_EMPTY_PATH is hence useless
here. Also it breaks build with glibc version 2.13 and older.

It is actually an oversight of a previous tentative patch to implement this
function. We can safely drop it.

Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Greg Kurz <groug@kaod.org>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-06 17:34:01 +01:00
Greg Kurz
23da0145cc 9pfs: fail local_statfs() earlier
If we cannot open the given path, we can return right away instead of
passing -1 to fstatfs() and close(). This will make Coverity happy.

(Coverity issue CID1371729)

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Daniel P. berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-06 17:34:01 +01:00
Greg Kurz
faab207f11 9pfs: fix fd leak in local_opendir()
Coverity issue CID1371731

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-06 17:34:01 +01:00
Greg Kurz
b7361d46e7 9pfs: fix bogus fd check in local_remove()
This was spotted by Coverity as a fd leak. This is certainly true, but also
local_remove() would always return without doing anything, unless the fd is
zero, which is very unlikely.

(Coverity issue CID1371732)

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-06 17:34:01 +01:00
Peter Maydell
eba44e9339 Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Mon 06 Mar 2017 04:15:17 GMT
# gpg:                using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  net/filter-mirror: Follow CODING_STYLE
  COLO-compare: Fix icmp and udp compare different packet always dump bug
  COLO-compare: Optimize compare_common and compare_tcp
  COLO-compare: Rename compare function and remove duplicate codes
  filter-rewriter: skip net_checksum_calculate() while offset = 0
  net/colo: fix memory double free error
  vmxnet3: VMStatify rx/tx q_descr and int_state
  vmxnet3: Convert ring values to uint32_t's
  net/colo-compare: Fix memory free error
  colo-compare: Fix removing fds been watched incorrectly in finalization
  char: remove the right fd been watched in qemu_chr_fe_set_handlers()
  colo-compare: kick compare thread to exit after some cleanup in finalization
  colo-compare: use g_timeout_source_new() to process the stale packets
  NetRxPkt: Remove code duplication in net_rx_pkt_pull_data()
  NetRxPkt: Account buffer with ETH header in IOV length
  NetRxPkt: Do not try to pull more data than present
  NetRxPkt: Fix memory corruption on VLAN header stripping
  eth: Extend vlan stripping functions
  net: Remove useless local var pkt

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-06 15:13:23 +00:00
Peter Maydell
56b51708e9 Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170306' into staging
ppc patch queue for 2017-03-06

Looks like my previous batch wasn't quite the last before hard freeze.
This has a handful of bugfixes to go in.  They're all genuine
bugfixes, though not regressions in some cases.

# gpg: Signature made Mon 06 Mar 2017 04:07:48 GMT
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.9-20170306:
  target/ppc: use helper for excp handling
  target/ppc: fmadd: add macro for updating flags
  target/ppc: fmadd check for excp independently
  spapr: ensure that all threads within core are on the same NUMA node
  ppc/xics: register reset handlers for the ICP and ICS objects

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-06 13:06:30 +00:00
Peter Maydell
fbddc2e560 Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2017-02-28' into staging
QAPI patches for 2017-02-28

# gpg: Signature made Sun 05 Mar 2017 08:21:51 GMT
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-qapi-2017-02-28: (27 commits)
  qapi: Improve qobject visitor documentation
  qapi: Fix object input visit beyond end of list
  tests: Cover input visit beyond end of list
  qapi: Make input visitors detect unvisited list tails
  test-qobject-input-visitor: Cover missing nested struct member
  tests: Cover partial input visit of list
  test-string-input-visitor: Improve list coverage
  test-string-input-visitor: Tear down existing test automatically
  tests-qobject-input-strict: Merge into test-qobject-input-visitor
  qapi: Drop unused non-strict qobject input visitor
  test-qobject-input-visitor: Use strict visitor
  qom: Make object_property_set_qobject()'s input visitor strict
  qapi: Make string input and opts visitor require non-null input
  qapi: Drop string input visitor method optional()
  qapi: Improve qobject input visitor error reporting
  qapi: Make QObject input visitor set *list reliably
  qapi: Clean up after commit 3d344c2
  qapi: Improve a QObject input visitor error message
  qmp: Eliminate silly QERR_QMP_* macros
  qmp: Drop duplicated QMP command object checks
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-06 10:18:33 +00:00
Bruce Rogers
ec72c0e271 tests: check path to avoid a failing qga/get-vcpus test
The qga/get-vcpus test fails in a simple chroot environment, as
used in an openSUSE Build Service local build, so first check
that the sysfs based path exists in order to avoid calling this
test in an environment where it won't work right.

Signed-off-by: Bruce Rogers <brogers@suse.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-03-06 00:54:19 -06:00
Peter Lieven
ce2eb6c4a0 qga: ignore EBUSY when freezing a filesystem
the current implementation fails if we try to freeze an
already frozen filesystem. This can happen if a filesystem
is mounted more than once (e.g. with a bind mount).

Suggested-by: Christian Theune <ct@flyingcircus.io>
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-03-06 00:54:18 -06:00
Stefan Hajnoczi
26de229657 qga: add systemd socket activation support
AF_UNIX and AF_VSOCK listen sockets can be passed in by systemd on
startup.  This allows systemd to manage the listen socket until the
first client connects and between restarts.  Advantages of socket
activation are that parallel startup of network services becomes
possible and that unused daemons do not consume memory.

The key to achieving this is the LISTEN_FDS environment variable, which
is a stable ABI as shown here:
https://www.freedesktop.org/wiki/Software/systemd/InterfacePortabilityAndStabilityChart/

We could link against libsystemd and use sd_listen_fds(3) but it's easy
to implement the tiny LISTEN_FDS ABI so that qemu-ga does not depend on
libsystemd.  Some systems may not have systemd installed and wish to
avoid the dependency.  Other init systems or socket activation servers
may implement the same ABI without systemd involvement.

Test as follows:

  $ cat ~/.config/systemd/user/qga.service
  [Unit]
  Description=qga

  [Service]
  WorkingDirectory=/tmp
  ExecStart=/path/to/qemu-ga --logfile=/tmp/qga.log --pidfile=/tmp/qga.pid --statedir=/tmp

  $ cat ~/.config/systemd/user/qga.socket
  [Socket]
  ListenStream=/tmp/qga.sock

  [Install]
  WantedBy=default.target

  $ systemctl --user daemon-reload
  $ systemctl --user start qga.socket
  $ nc -U /tmp/qga.sock

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2017-03-06 00:54:18 -06:00
Zhang Chen
f0aabd5c4a net/filter-mirror: Follow CODING_STYLE
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-06 11:46:02 +08:00
Zhang Chen
1723a7f7cf COLO-compare: Fix icmp and udp compare different packet always dump bug
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-06 11:46:02 +08:00
Zhang Chen
6efeb3286d COLO-compare: Optimize compare_common and compare_tcp
Add offset args for colo_packet_compare_common, optimize
colo_packet_compare_icmp() and colo_packet_compare_udp()
just compare the IP payload. Before compare all tcp packet,
we compare tcp checksum firstly, this function can get
better performance.

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-06 11:46:02 +08:00
Zhang Chen
2ad7ca4c81 COLO-compare: Rename compare function and remove duplicate codes
Rename colo_packet_compare() to colo_packet_compare_common() that
make tcp_compare udp_compare icmp_compare reuse this function.
Remove minimum packet size check in icmp_compare, because we have
check this in parse_packet_early().

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-06 11:46:02 +08:00
zhanghailiang
db0a762e4b filter-rewriter: skip net_checksum_calculate() while offset = 0
While the offset of packets's sequence for primary side and
secondary side is zero, it is unnecessary to call net_checksum_calculate()
to recalculate the checksume value of packets.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-06 11:46:02 +08:00
zhanghailiang
0e79668e1f net/colo: fix memory double free error
The 'primary_list' and 'secondary_list' members of struct Connection
is not allocated through dynamically g_queue_new(), but we free it by using
g_queue_free(), which will lead to a double-free bug.

Reviewed-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-06 11:46:02 +08:00
Dr. David Alan Gilbert
a11f5cb005 vmxnet3: VMStatify rx/tx q_descr and int_state
Fairly simple mechanical conversion of all fields.

TODO!!!!
The problem is vmxnet3-ring size/cell_size/next are declared as size_t
but written as 32bit.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-06 11:46:02 +08:00
Dr. David Alan Gilbert
5504bba1fb vmxnet3: Convert ring values to uint32_t's
The index's in the Vmxnet3Ring were migrated as 32bit ints
yet are declared as size_t's.  They appear to be derived
from 32bit values loaded from guest memory, so actually
store them as that.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-06 11:46:02 +08:00
Zhang Chen
727c2d764f net/colo-compare: Fix memory free error
We use g_queue_init() to init s->conn_list, so we should use g_queue_clear()
to instead of g_queue_free().

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Reviewed-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-06 11:46:02 +08:00
zhanghailiang
b43decb015 colo-compare: Fix removing fds been watched incorrectly in finalization
We will catch the bellow error report while try to delete compare object
by qmp command:
chardev/char-io.c:91: io_watch_poll_finalize: Assertion `iwp->src == ((void *)0)' failed.

This is caused by failing to remove the right fd been watched while
call qemu_chr_fe_set_handlers();

Fix it by pass the worker_context parameter to qemu_chr_fe_set_handlers().

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-06 11:46:02 +08:00
zhanghailiang
8487ce45f8 char: remove the right fd been watched in qemu_chr_fe_set_handlers()
We can call qemu_chr_fe_set_handlers() to add/remove fd been watched
in 'context' which can be either default main context or other explicit
context. But the original logic is not correct, we didn't remove
the right fd because we call g_main_context_find_source_by_id(NULL, tag)
which always try to find the Gsource from default context.

Fix it by passing the right context to g_main_context_find_source_by_id().

Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-06 11:46:02 +08:00
zhanghailiang
dfd917a9c2 colo-compare: kick compare thread to exit after some cleanup in finalization
We should call g_main_loop_quit() to notify colo compare thread to
exit, Or it will run in g_main_loop_run() forever.

Besides, the finalizing process can't happen in context of colo thread,
it is reasonable to remove the 'if (qemu_thread_is_self(&s->thread))'
branch.

Before compare thead exits, some cleanup works need to be
done,  All unhandled packets need to be released and connection_track_table
needs to be freed, or there will be memory leak.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-06 11:46:02 +08:00
zhanghailiang
66d2a2423e colo-compare: use g_timeout_source_new() to process the stale packets
Instead of using qemu timer to process the stale packets,
We re-use the colo compare thread to process these packets
by creating a new timeout coroutine.

Besides, since we process all the same vNIC's net connection/packets
in one thread, it is safe to remove the timer_check_lock.

Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-06 11:46:02 +08:00
Dmitry Fleytman
002d394fd4 NetRxPkt: Remove code duplication in net_rx_pkt_pull_data()
This is a refactoring commit that does not change behavior.

Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-06 11:46:02 +08:00
Dmitry Fleytman
c5d083c561 NetRxPkt: Account buffer with ETH header in IOV length
In case of VLAN stripping ETH header is stored in a
separate chunk and length of IOV should take this into
account.

This patch fixes checksum validation for RX packets
with VLAN header.

Devices affected by this problem: e1000e and vmxnet3.

Cc: qemu-stable@nongnu.org
Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-06 11:46:02 +08:00
Dmitry Fleytman
d5e772146d NetRxPkt: Do not try to pull more data than present
In case of VLAN stripping, ETH header put into a
separate buffer, therefore amont of data copied
from original IOV should be smaller.

Cc: qemu-stable@nongnu.org
Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-06 11:46:02 +08:00
Dmitry Fleytman
df8bf7a7fe NetRxPkt: Fix memory corruption on VLAN header stripping
This patch fixed a problem that was introduced in commit eb700029.

When net_rx_pkt_attach_iovec() calls eth_strip_vlan()
this can result in pkt->ehdr_buf being overflowed, because
ehdr_buf is only sizeof(struct eth_header) bytes large
but eth_strip_vlan() can write
sizeof(struct eth_header) + sizeof(struct vlan_header)
bytes into it.

Devices affected by this problem: vmxnet3.

Cc: qemu-stable@nongnu.org
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-06 11:46:02 +08:00
Dmitry Fleytman
566342c312 eth: Extend vlan stripping functions
Make VLAN stripping functions return number of bytes
copied to given Ethernet header buffer.

This information should be used to re-compose
packet IOV after VLAN stripping.

Cc: qemu-stable@nongnu.org
Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-06 11:46:02 +08:00
Fam Zheng
290e6e113b net: Remove useless local var pkt
This has been pointless since commit 605d52e62, which was a
search-and-replace, overlooked the redundancy.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2017-03-06 11:46:02 +08:00
Nikunj A Dadhania
182fe2cf19 target/ppc: use helper for excp handling
Use the helper routine float[32,64]_maddsub_update_excp() in VSX_MADD
macro.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-06 13:17:28 +11:00
Nikunj A Dadhania
3e5b26cf57 target/ppc: fmadd: add macro for updating flags
Adds FPU_MADDSUB_UPDATE macro, this will be used for other routines
having float32/16

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-06 13:17:28 +11:00
Nikunj A Dadhania
806c9d71ab target/ppc: fmadd check for excp independently
Current order of checking does not confirm with the spec
(ISA 3.0: MultiplyAddDP page-469). Change the order and make them
independent of each other.

For example: a = infinity, b = zero, c = SNaN, this should set both
VXIMZ and VXNAN

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-06 13:17:28 +11:00
Igor Mammedov
17b7c39e27 spapr: ensure that all threads within core are on the same NUMA node
Threads within a core shouldn't be on different
NUMA nodes, so if user has misconfgured command
line, fail QEMU at start up to force user fix it.

For now use the first thread on the core as source
of core's node-id. Later when cpu-numa refactoring
lands  it will be switched to core's node-id from
possible_cpus[].

This prevents the same problems as commit 20bb648d
"spapr: Fix default NUMA node allocation for threads",
but for the case of manually configured NUMA node
mappings, instead of just the default case.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-06 10:32:53 +11:00
Cédric Le Goater
7ea6e06717 ppc/xics: register reset handlers for the ICP and ICS objects
The recent changes on the XICS layer removed the XICSState object to
let the sPAPR machine handle the ICP and ICS directly. The reset of
these objects was previously handled by XICSState, which was a SysBus
device, and to keep the same behavior, the ICP and ICS were assigned
to SysbBus.

But that broke the 'info qtree' command in the monitor. 'qtree'
performs a loop on the children of a bus to print their properties and
SysBus devices are expected to be found under SysBus, which is not the
case anymore.

The fix for this problem is to register reset handlers for the ICP and
ICS objects and stop using SysBus for such devices.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-06 10:07:38 +11:00
Markus Armbruster
aa3a982e67 qapi: Improve qobject visitor documentation
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-29-git-send-email-armbru@redhat.com>
2017-03-05 09:14:20 +01:00
Markus Armbruster
1f41a645b6 qapi: Fix object input visit beyond end of list
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-28-git-send-email-armbru@redhat.com>
2017-03-05 09:14:20 +01:00
Markus Armbruster
a9416dc62c tests: Cover input visit beyond end of list
When you try to visit beyond the end of a list, the qobject input
visitor crashes, and the string visitor screws returns garbage.  The
generated list visits never go beyond the list end, but manual visits
could.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-27-git-send-email-armbru@redhat.com>
2017-03-05 09:14:20 +01:00
Markus Armbruster
a4a1c70dc7 qapi: Make input visitors detect unvisited list tails
Fix the design flaw demonstrated in the previous commit: new method
check_list() lets input visitors report that unvisited input remains
for a list, exactly like check_struct() lets them report that
unvisited input remains for a struct or union.

Implement the method for the qobject input visitor (straightforward),
and the string input visitor (less so, due to the magic list syntax
there).  The opts visitor's list magic is even more impenetrable, and
all I can do there today is a stub with a FIXME comment.  No worse
than before.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1488544368-30622-26-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-05 09:14:20 +01:00
Markus Armbruster
86ca0dbe04 test-qobject-input-visitor: Cover missing nested struct member
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1488544368-30622-25-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-05 09:14:20 +01:00
Markus Armbruster
9cb8ef3668 tests: Cover partial input visit of list
Demonstrates a design flaw: there is no way to for input visitors to
report that a list visit didn't visit the complete input list.  The
generated list visits always do, but manual visits needn't.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-24-git-send-email-armbru@redhat.com>
2017-03-05 09:14:20 +01:00
Markus Armbruster
3d089cea0d test-string-input-visitor: Improve list coverage
Lists with elements above INT64_MAX don't work (known bug).  Empty
lists don't work (weird).

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-23-git-send-email-armbru@redhat.com>
2017-03-05 09:14:19 +01:00
Markus Armbruster
0f721d168d test-string-input-visitor: Tear down existing test automatically
Call visitor_input_teardown() from visitor_input_test_init(), so you
don't have to call it from the actual tests.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-22-git-send-email-armbru@redhat.com>
2017-03-05 09:14:19 +01:00
Markus Armbruster
77c47de23f tests-qobject-input-strict: Merge into test-qobject-input-visitor
Much of test-qobject-input-strict.c duplicates
test-qobject-input-strict.c, but with less assertions on expected
output:

* test_validate_struct() duplicates test_visitor_in_struct()

* test_validate_struct_nested() duplicates
  test_visitor_in_struct_nested()

* test_validate_list() duplicates the first half of
  test_visitor_in_list()

* test_validate_union_native_list() duplicates
  test_visitor_in_native_list_int()

* test_validate_union_flat() duplicates test_visitor_in_union_flat()

* test_validate_alternate() duplicates the first part of
  test_visitor_in_alternate()

Merge the remaining test cases into test-qobject-input-visitor.c, and
drop the now redundant test-qobject-input-strict.c.

Test case "/visitor/input-strict/fail/list" isn't really about lists,
it's about a bad struct nested in a list.  Rename accordingly.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-21-git-send-email-armbru@redhat.com>
2017-03-05 09:14:19 +01:00
Markus Armbruster
048abb7b20 qapi: Drop unused non-strict qobject input visitor
The split between tests/test-qobject-input-visitor.c and
tests/test-qobject-input-strict.c now makes less sense than ever.  The
next commit will take care of that.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-20-git-send-email-armbru@redhat.com>
2017-03-05 09:14:19 +01:00
Markus Armbruster
ec95f6148c test-qobject-input-visitor: Use strict visitor
The qobject input visitor comes in a strict and a non-strict variant.
This test is the non-strict variant's last user.  Turns out it relies
on non-strict only in test_visitor_in_null(), and just out of
laziness.  We don't actually test the non-strict behavior.

Clean up test_visitor_in_null(), and switch to the strict variant.
The next commit will drop the non-strict variant.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-19-git-send-email-armbru@redhat.com>
2017-03-05 09:14:19 +01:00
Markus Armbruster
05601ed2de qom: Make object_property_set_qobject()'s input visitor strict
Commit 240f64b made all qobject input visitors created outside tests
strict, except for the one in object_property_set_qobject().  That one
was left behind only because Eric couldn't spare the time to figure
out whether making it strict would break anything, with a TODO
comment.  Time to resolve it.

Strict makes a difference only for otherwise successful visits of QAPI
structs or unions.  Let's examine what the callers of
object_property_set_qobject() visit:

* object_property_set_str(), object_property_set_bool(),
  object_property_set_int() visit a QString, QBool, QInt,
  respectively.  Strictness can't matter.

* qmp_qom_set visits its @value argument.  Comes straight from QMP and
  can be anything ('any' in the QAPI schema).  Strictness matters when
  the property's set() method visits a struct or union QAPI type.

  No such methods exist, thus switching to strict can't break
  anything.

  If we acquire such methods in the future, we'll *want* the visitor
  to be strict, so that unexpected members get rejected as they should
  be.

Switch to strict.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-18-git-send-email-armbru@redhat.com>
2017-03-05 09:14:19 +01:00
Markus Armbruster
f332e830e3 qapi: Make string input and opts visitor require non-null input
The string input visitor tries to cope with null input.  Null input
isn't used anywhere, and isn't covered by tests.  Unsurprisingly, it
doesn't fully work: start_list() crashes because it passes the input
via parse_str() to strtoll() unchecked.

Make string_input_visitor_new() assert its argument isn't null, and
drop the code trying to deal with null input.

The opts visitor crashes when you try to actually visit something with
null input.  Make opts_visitor_new() assert its argument isn't null,
mostly for clarity.

qobject_input_visitor_new() already asserts its argument isn't null.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-17-git-send-email-armbru@redhat.com>
2017-03-05 09:14:19 +01:00
Markus Armbruster
a8aec6de2a qapi: Drop string input visitor method optional()
visit_optional() is to be called only between visit_start_struct() and
visit_end_struct().  Visitors that don't support struct visits,
i.e. don't implement start_struct(), end_struct(), have no use for it.
Clarify documentation.

The string input visitor doesn't support struct visits.  Its
parse_optional() is therefore useless.  Drop it.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-16-git-send-email-armbru@redhat.com>
2017-03-05 09:14:19 +01:00
Markus Armbruster
a9fc37f6bc qapi: Improve qobject input visitor error reporting
Error messages refer to nodes of the QObject being visited by name.
Trouble is the names are sometimes less than helpful:

* The name of the root QObject is whatever @name argument got passed
  to the visitor, except NULL gets mapped to "null".  We commonly pass
  NULL.  Not good.

  Avoiding errors "at the root" mitigates.  For instance,
  visit_start_struct() can only fail when the visited object is not a
  dictionary, and we commonly ensure it is beforehand.

* The name of a QDict's member is the member key.  Good enough only
  when this happens to be unique.

* The name of a QList's member is "null".  Not good.

Improve error messages by referring to nodes by path instead, as
follows:

* The path of the root QObject is whatever @name argument got passed
  to the visitor, except NULL gets mapped to "<anonymous>".

* The path of a root QDict's member is the member key.

* The path of a root QList's member is "[%u]", where %u is the list
  index, starting at zero.

* The path of a non-root QDict's member is the path of the QDict
  concatenated with "." and the member key.

* The path of a non-root QList's member is the path of the QList
  concatenated with "[%u]", where %u is the list index.

For example, the incorrect QMP command

    { "execute": "blockdev-add", "arguments": { "node-name": "foo", "driver": "raw", "file": {"driver": "file" } } }

now fails with

    {"error": {"class": "GenericError", "desc": "Parameter 'file.filename' is missing"}}

instead of

    {"error": {"class": "GenericError", "desc": "Parameter 'filename' is missing"}}

and

    { "execute": "input-send-event", "arguments": { "device": "bar", "events": [ [] ] } }

now fails with

    {"error": {"class": "GenericError", "desc": "Invalid parameter type for 'events[0]', expected: object"}}

instead of

    {"error": {"class": "GenericError", "desc": "Invalid parameter type for 'null', expected: QDict"}}

Aside: calling the thing "parameter" is suboptimal for QMP, because
the root object is "arguments" there.

The qobject output visitor doesn't have this problem because it should
not fail.  Same for dealloc and clone visitors.

The string visitors don't have this problem because they visit just
one value, whose name needs to be passed to the visitor as @name.  The
string output visitor shouldn't fail anyway.

The options visitor uses QemuOpts names.  Their name space is flat, so
the use of QDict member keys as names is fine.  NULL names used with
roots and lists could conceivably result in bad error messages.  Left
for another day.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-15-git-send-email-armbru@redhat.com>
2017-03-05 09:14:19 +01:00
Markus Armbruster
58561c2766 qapi: Make QObject input visitor set *list reliably
qobject_input_start_struct() sets *list, except when it fails because
qobject_input_get_object() fails, i.e. the input object doesn't exist.

All the other input visitor start_struct(), start_list(),
start_alternate() always set *obj / *list.

Change qobject_input_start_struct() to match.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-14-git-send-email-armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
2017-03-05 09:14:19 +01:00
Markus Armbruster
b8874fbfd3 qapi: Clean up after commit 3d344c2
Drop unused QIV_STACK_SIZE and unused qobject_input_start_struct()
parameter errp.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-13-git-send-email-armbru@redhat.com>
2017-03-05 09:14:19 +01:00
Markus Armbruster
910f738b85 qapi: Improve a QObject input visitor error message
The QObject input visitor has three error message formats:

* Parameter '%s' is missing
* "Invalid parameter type for '%s', expected: %s"
* "QMP input object member '%s' is unexpected"

The '%s' are member names (or "null", but I'll fix that later).

The last error message calls the thing "QMP input object member"
instead of "parameter".  Misleading when the visitor is used on
QObjects that don't come from QMP.  Change it to "Parameter '%s' is
unexpected".

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-12-git-send-email-armbru@redhat.com>
2017-03-05 09:14:19 +01:00
Markus Armbruster
99fb0c53c0 qmp: Eliminate silly QERR_QMP_* macros
The QERR_ macros are leftovers from the days of "rich" error objects.

QERR_QMP_BAD_INPUT_OBJECT, QERR_QMP_BAD_INPUT_OBJECT_MEMBER,
QERR_QMP_EXTRA_MEMBER are used in just one place now, except for one
use that has crept into qobject-input-visitor.c.

Drop these macros, to make the (bad) error messages more visible.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-10-git-send-email-armbru@redhat.com>
2017-03-05 09:14:19 +01:00
Markus Armbruster
104fc30279 qmp: Drop duplicated QMP command object checks
qmp_check_input_obj() duplicates qmp_dispatch_check_obj(), except the
latter screws up an error message.  handle_qmp_command() runs first
the former, then the latter via qmp_dispatch(), masking the screwup.

qemu-ga also masks the screwup, because it also duplicates checks,
just differently.

qmp_check_input_obj() exists because handle_qmp_command() needs to
examine the command before dispatching it.  The previous commit got
rid of this need, except for a tracepoint, and a bit of "id" code that
relies on qdict not being null.

Fix up the error message in qmp_dispatch_check_obj(), drop
qmp_check_input_obj() and the tracepoint.  Protect the "id" code with
a conditional.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-9-git-send-email-armbru@redhat.com>
2017-03-05 09:14:19 +01:00
Markus Armbruster
635db18f68 qmp: Clean up how we enforce capability negotiation
To enforce capability negotiation before normal operation,
handle_qmp_command() inspects every command before it's handed off to
qmp_dispatch().  This is a bit of a layering violation, and results in
duplicated code.

Before capability negotiation (!cur_mon->in_command_mode), we fail
commands other than "qmp_capabilities".  This is what enforces
capability negotiation.

Afterwards, we fail command "qmp_capabilities".

Clean this up as follows.

The obvious place to fail a command is the command itself, so move the
"afterwards" check to qmp_qmp_capabilities().

We do the "before" check in every other command, but that would be
bothersome.  Instead, start with an alternate list of commands that
contains only "qmp_capabilities".  Switch to the full list in
qmp_qmp_capabilities().

Additionally, replace the generic human-readable error message for
CommandNotFound by one that reminds the user to run qmp_capabilities.
Without that, we'd regress commit 2d5a834.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1488544368-30622-8-git-send-email-armbru@redhat.com>
[Mirco-optimization squashed in, commit message typo fixed]
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-05 09:14:11 +01:00
Markus Armbruster
9b0c9a6349 qapi-introspect: Mangle --prefix argument properly for C
qapi-introspect.py --prefix hasn't been used so far, but fix it anyway.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1488544368-30622-7-git-send-email-armbru@redhat.com>
[Commit message improved]
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-05 09:12:29 +01:00
Markus Armbruster
1527badb95 qapi: Support multiple command registries per program
The command registry encapsulates a single command list.  Give the
functions using it a parameter instead.  Define suitable command lists
in monitor, guest agent and test-qmp-commands.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1488544368-30622-6-git-send-email-armbru@redhat.com>
[Debugging turds buried]
Reviewed-by: Eric Blake <eblake@redhat.com>
2017-03-05 09:12:25 +01:00
Markus Armbruster
0587568780 qmp: Dumb down how we run QMP command registration
The way we get QMP commands registered is high tech:

* qapi-commands.py generates qmp_init_marshal() that does the actual work

* it also generates the magic to register it as a MODULE_INIT_QAPI
  function, so it runs when someone calls
  module_call_init(MODULE_INIT_QAPI)

* main() calls module_call_init()

QEMU needs to register a few non-qapified commands.  Same high tech
works: monitor.c has its own qmp_init_marshal() along with the magic
to make it run in module_call_init(MODULE_INIT_QAPI).

QEMU also needs to unregister commands that are not wanted in this
build's configuration (commit 5032a16).  Simple enough:
qmp_unregister_commands_hack().  The difficulty is to make it run
after the generated qmp_init_marshal().  We can't simply run it in
monitor.c's qmp_init_marshal(), because the order in which the
registered functions run is indeterminate.  So qmp_init_marshal()
registers qmp_unregister_commands_hack() separately.  Since
registering *appends* to the list of registered functions, this will
make it run after all the functions that have been registered already.

I suspect it takes a long and expensive computer science education to
not find this silly.

Dumb it down as follows:

* Drop MODULE_INIT_QAPI entirely

* Give the generated qmp_init_marshal() external linkage.

* Call it instead of module_call_init(MODULE_INIT_QAPI)

* Except in QEMU proper, call new monitor_init_qmp_commands() that in
  turn calls the generated qmp_init_marshal(), registers the
  additional commands and unregisters the unwanted ones.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-5-git-send-email-armbru@redhat.com>
2017-03-05 09:02:10 +01:00
Markus Armbruster
f66e7ac88c qmp-test: New, covering basic QMP protocol
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-4-git-send-email-armbru@redhat.com>
2017-03-05 09:02:10 +01:00
Markus Armbruster
13420ef837 libqtest: Work around a "QMP wants a newline" bug
The next commit is going to add a test that calls qmp("null").
Curiously, this hangs.  Here's why.

qmp_fd_sendv() doesn't send newlines.  Not even when @fmt contains
some.  At first glance, the QMP parser seems to be fine with that.
However, it turns out that it fails to react to input until it sees
either a newline, an object or an array.  To reproduce, feed to a QMP
monitor like this:

    $ echo -n 'null' | socat UNIX:/work/armbru/images/test-qmp STDIO
    {"QMP": {"version": {"qemu": {"micro": 50, "minor": 8, "major": 2}, "package": " (v2.8.0-1195-gf84141e-dirty)"}, "capabilities": []}}

No output after the greeting.

Add a newline:

    $ echo 'null' | socat UNIX:/work/armbru/images/test-qmp STDIO
    {"QMP": {"version": {"qemu": {"micro": 50, "minor": 8, "major": 2}, "package": " (v2.8.0-1195-gf84141e-dirty)"}, "capabilities": []}}
    {"error": {"class": "GenericError", "desc": "Expected 'object' in QMP input"}}

Correct output for input 'null'.

Add an object instead:

    $ echo -n 'null { "execute": "qmp_capabilities" }' | socat UNIX:qmp-socket STDIO
    {"QMP": {"version": {"qemu": {"micro": 50, "minor": 8, "major": 2}, "package": " (v2.8.0-1195-gf84141e-dirty)"}, "capabilities": []}}
    {"error": {"class": "GenericError", "desc": "Expected 'object' in QMP input"}}
    {"return": {}}

Also correct output.

Work around this QMP bug by having qmp_fd_sendv() append a newline.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-3-git-send-email-armbru@redhat.com>
2017-03-05 09:02:10 +01:00
Markus Armbruster
74d8c9d99d qga: Fix crash on non-dictionary QMP argument
The value of key 'arguments' must be a JSON object.  qemu-ga neglects
to check, and crashes.  To reproduce, send

    { 'execute': 'guest-sync', 'arguments': [] }

to qemu-ga.

do_qmp_dispatch() uses qdict_get_qdict() to get the arguments.  When
not a JSON object, this gets a null pointer, which flows through the
generated marshalling function to qobject_input_visitor_new(), where
it fails the assertion.  qmp_dispatch_check_obj() needs to catch this
error.

QEMU isn't affected, because it runs qmp_check_input_obj() first,
which basically duplicates qmp_dispatch_check_obj()'s checks, plus the
missing one.

Fix by copying the missing one from qmp_check_input_obj() to
qmp_dispatch_check_obj().

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1488544368-30622-2-git-send-email-armbru@redhat.com>
2017-03-05 09:02:10 +01:00
Peter Maydell
17783ac828 Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.9-20170303' into staging
ppc patch queuye for 2017-03-03

This will probably be my last pull request before the hard freeze.  It
has some new work, but that has all been posted in draft before the
soft freeze, so I think it's reasonable to include in qemu-2.9.

This batch has:
    * A substantial amount of POWER9 work
        * Implements the legacy (hash) MMU for POWER9
	* Some more preliminaries for implementing the POWER9 radix
          MMU
	* POWER9 has_work
	* Basic POWER9 compatibility mode handling
	* Removal of some premature tests
    * Some cleanups and fixes to the existing MMU code to make the
      POWER9 work simpler
    * A bugfix for TCG multiply adds on power
    * Allow pseries guests to access PCIe extended config space

This also includes a code-motion not strictly in ppc code - moving
getrampagesize() from ppc code to exec.c.  This will make some future
VFIO improvements easier, Paolo said it was ok to merge via my tree.

# gpg: Signature made Fri 03 Mar 2017 03:20:36 GMT
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.9-20170303:
  target/ppc: rewrite f[n]m[add,sub] using float64_muladd
  spapr: Small cleanup of PPC MMU enums
  spapr_pci: Advertise access to PCIe extended config space
  target/ppc: Rework hash mmu page fault code and add defines for clarity
  target/ppc: Move no-execute and guarded page checking into new function
  target/ppc: Add execute permission checking to access authority check
  target/ppc: Add Instruction Authority Mask Register Check
  hw/ppc/spapr: Add POWER9 to pseries cpu models
  target/ppc/POWER9: Add cpu_has_work function for POWER9
  target/ppc/POWER9: Add POWER9 pa-features definition
  target/ppc/POWER9: Add POWER9 mmu fault handler
  target/ppc: Don't gen an SDR1 on POWER9 and rework register creation
  target/ppc: Add patb_entry to sPAPRMachineState
  target/ppc/POWER9: Add POWERPC_MMU_V3 bit
  powernv: Don't test POWER9 CPU yet
  exec, kvm, target-ppc: Move getrampagesize() to common code
  target/ppc: Add POWER9/ISAv3.00 to compat_table

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-04 16:31:14 +00:00
Paolo Bonzini
eeb61d4f82 ppc: avoid typedef redefinitions
These cause compilation failures on CentOS 6 or other operating
systems with older GCCs.

Cc: David Gibson <dgibson@redhat.com>
Cc: qemu-ppc@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1488558530-21016-3-git-send-email-pbonzini@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-04 15:14:34 +00:00
Paolo Bonzini
4ae4b609ee nios2: avoid anonymous unions in designated initializers.
These cause compilation failures on CentOS 6 or other operating
systems with older GCCs.

Cc: Richard Henderson <rth@twiddle.net>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-04 14:05:48 +00:00
Paolo Bonzini
eff235eb2b hppa: avoid anonymous unions in designated initializers.
These cause compilation failures on CentOS 6 or other operating
systems with older GCCs.

Cc: Richard Henderson <rth@twiddle.net>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1488558530-21016-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-04 12:52:01 +00:00
Peter Maydell
5febe7671f Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* kernel header update (requested by David and Vijay)
* GuestPanicInformation fixups (Anton)
* record/replay icount fixes (Pavel)
* cpu-exec cleanup, unification of icount_decr with tcg_exit_req (me)
* KVM_CAP_IMMEDIATE_EXIT support (me)
* vmxcap update (me)
* iscsi locking fix (me)
* VFIO ram device fix (Yongji)
* scsi-hd vs. default CD-ROM (Hervé)
* SMI migration fix (Dave)
* spice-char segfault (Li Qiang)
* improved "info mtree -f" (me)

# gpg: Signature made Fri 03 Mar 2017 15:43:04 GMT
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (21 commits)
  iscsi: fix missing unlock
  memory: show region offset and ROM/RAM type in "info mtree -f"
  x86: Work around SMI migration breakages
  spice-char: fix segfault in char_spice_finalize
  vl: disable default cdrom when using explicitely scsi-hd
  memory: Introduce DEVICE_HOST_ENDIAN for ram device
  qmp-events: fix GUEST_PANICKED description formatting
  qapi: flatten GuestPanicInformation union
  vmxcap: update for September 2016 SDM
  vmxcap: port to Python 3
  KVM: use KVM_CAP_IMMEDIATE_EXIT
  kvm: use atomic_read/atomic_set to access cpu->exit_request
  KVM: move SIG_IPI handling to kvm-all.c
  KVM: do not use sigtimedwait to catch SIGBUS
  KVM: remove kvm_arch_on_sigbus
  cpus: reorganize signal handling code
  KVM: x86: cleanup SIGBUS handlers
  cpus: remove ugly cast on sigbus_handler
  cpu-exec: remove unnecessary check of cpu->exit_request
  replay: check icount in cpu exec loop
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-03 16:41:09 +00:00
Paolo Bonzini
f6eb0b319e iscsi: fix missing unlock
Reported by Coverity.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:41:20 +01:00
Paolo Bonzini
377a07aa0d memory: show region offset and ROM/RAM type in "info mtree -f"
"info mtree -f" output is currently hard to use for large RAM regions, because
there is no hint as to what part of the region is being mapped.  Add the offset
if it is nonzero.

Secondly, FlatView has a readonly field, that can override the MemoryRegion
in the presence of aliases.  Take it into account.

Together, with this patch this:

address-space (flat view): KVM-SMRAM
  0000000000000000-00000000000bffff (prio 0, ram): pc.ram
  00000000000c0000-00000000000c9fff (prio 0, ram): pc.ram
  00000000000ca000-00000000000ccfff (prio 0, ram): pc.ram
  00000000000cd000-00000000000ebfff (prio 0, ram): pc.ram
  00000000000ec000-00000000000effff (prio 0, ram): pc.ram
  00000000000f0000-00000000000fffff (prio 0, ram): pc.ram
  0000000000100000-00000000bfffffff (prio 0, ram): pc.ram
  00000000fd000000-00000000fdffffff (prio 1, ram): vga.vram
  00000000febc0000-00000000febdffff (prio 1, i/o): e1000-mmio
  00000000febf0400-00000000febf041f (prio 0, i/o): vga ioports remapped
  00000000febf0500-00000000febf0515 (prio 0, i/o): bochs dispi interface
  00000000febf0600-00000000febf0607 (prio 0, i/o): qemu extended regs
  00000000fec00000-00000000fec00fff (prio 0, i/o): kvm-ioapic
  00000000fed00000-00000000fed003ff (prio 0, i/o): hpet
  00000000fee00000-00000000feefffff (prio 4096, i/o): kvm-apic-msi
  00000000fffc0000-00000000ffffffff (prio 0, rom): pc.bios
  0000000100000000-000000013fffffff (prio 0, ram): pc.ram

becomes this:

address-space (flat view): KVM-SMRAM
  0000000000000000-00000000000bffff (prio 0, ram): pc.ram
  00000000000c0000-00000000000c9fff (prio 0, rom): pc.ram @00000000000c0000
  00000000000ca000-00000000000ccfff (prio 0, ram): pc.ram @00000000000ca000
  00000000000cd000-00000000000ebfff (prio 0, rom): pc.ram @00000000000cd000
  00000000000ec000-00000000000effff (prio 0, ram): pc.ram @00000000000ec000
  00000000000f0000-00000000000fffff (prio 0, rom): pc.ram @00000000000f0000
  0000000000100000-00000000bfffffff (prio 0, ram): pc.ram @0000000000100000
  00000000fd000000-00000000fdffffff (prio 1, ram): vga.vram
  00000000febc0000-00000000febdffff (prio 1, i/o): e1000-mmio
  00000000febf0400-00000000febf041f (prio 0, i/o): vga ioports remapped
  00000000febf0500-00000000febf0515 (prio 0, i/o): bochs dispi interface
  00000000febf0600-00000000febf0607 (prio 0, i/o): qemu extended regs
  00000000fec00000-00000000fec00fff (prio 0, i/o): kvm-ioapic
  00000000fed00000-00000000fed003ff (prio 0, i/o): hpet
  00000000fee00000-00000000feefffff (prio 4096, i/o): kvm-apic-msi
  00000000fffc0000-00000000ffffffff (prio 0, rom): pc.bios
  0000000100000000-000000013fffffff (prio 0, ram): pc.ram @00000000c0000000

This should make it easier to understand what's going on.

Cc: Peter Xu <peterx@redhat.com>
Cc: "William Tambe" <tambewilliam@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:03 +01:00
Dr. David Alan Gilbert
fc3a1fd74f x86: Work around SMI migration breakages
Migration from a 2.3.0 qemu results in a reboot on the receiving QEMU
due to a disagreement about SM (System management) interrupts.

2.3.0 didn't have much SMI support, but it did set CPU_INTERRUPT_SMI
and this gets into the migration stream, but on 2.3.0 it
never got delivered.

~2.4.0 SMI interrupt support was added but was broken - so
that when a 2.3.0 stream was received it cleared the CPU_INTERRUPT_SMI
but never actually caused an interrupt.

The SMI delivery was recently fixed by 68c6efe07a, but the
effect now is that an incoming 2.3.0 stream takes the interrupt it
had flagged but it's bios can't actually handle it(I think
partly due to the original interrupt not being taken during boot?).
The consequence is a triple(?) fault and a reboot.

Tested from:
  2.3.1 -M 2.3.0
  2.7.0 -M 2.3.0
  2.8.0 -M 2.3.0
  2.8.0 -M 2.8.0

This corresponds to RH bugzilla entry 1420679.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170223133441.16010-1-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:03 +01:00
Li Qiang
f20e6f8cd4 spice-char: fix segfault in char_spice_finalize
In 'qemu_chr_open_spice_vmc' if the 'psubtype' is NULL, it will
call 'char_spice_finalize'. But as the SpiceChardev is not inserted
in the 'spice_chars' list, the 'QLIST_REMOVE' will cause a segfault.
Add a detect to avoid it.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-Id: <1487665107-88004-1-git-send-email-liqiang6-s@360.cn>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Li Qiang <liq3ea@gmail.com>
2017-03-03 16:40:03 +01:00
Hervé Poussineau
f6f99b4808 vl: disable default cdrom when using explicitely scsi-hd
In commit af6bf1328e (May 2011),
ide-hd, ide-cd and scsi-cd have been added to disable default cdrom,
"or else you can't put one on secondary master without -nodefaults".

Make it the same for scsi-hd, so you can put one on scsi-id 2 without
using -nodefaults.
scsi-hd has probably been forgotten, as it has been added in the
preceding commit (b443ae6713).

Affected users are the ones using a machine with SCSI devices and start QEMU
with -device scsi-hd but without -device scsi-cd or -cdrom
In that case, the default cdrom device will disappear instead of being empty.

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Message-Id: <1487623279-29930-1-git-send-email-hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:03 +01:00
Yongji Xie
c99a29e702 memory: Introduce DEVICE_HOST_ENDIAN for ram device
At the moment ram device's memory regions are DEVICE_NATIVE_ENDIAN. It's
incorrect. This memory region is backed by a MMIO area in host, so the
uint64_t data that MemoryRegionOps read from/write to this area should be
host-endian rather than target-endian. Hence, current code does not work
when target and host endianness are different which is the most common case
on PPC64. To fix it, this introduces DEVICE_HOST_ENDIAN for the ram device.

This has been tested on PPC64 BE/LE host/guest in all possible combinations
including TCG.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Message-Id: <1488171164-28319-1-git-send-email-xyjxie@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:03 +01:00
Anton Nefedov
11953be792 qmp-events: fix GUEST_PANICKED description formatting
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Eric Blake <eblake@redhat.com>
Message-Id: <1487614915-18710-4-git-send-email-den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:03 +01:00
Anton Nefedov
e8ed97a647 qapi: flatten GuestPanicInformation union
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Eric Blake <eblake@redhat.com>
Message-Id: <1487614915-18710-3-git-send-email-den@openvz.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:03 +01:00
Paolo Bonzini
025533f6ee vmxcap: update for September 2016 SDM
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:02 +01:00
Paolo Bonzini
c3e31eaa21 vmxcap: port to Python 3
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:02 +01:00
Paolo Bonzini
cf0f7cf903 KVM: use KVM_CAP_IMMEDIATE_EXIT
The purpose of the KVM_SET_SIGNAL_MASK API is to let userspace "kick"
a VCPU out of KVM_RUN through a POSIX signal.  A signal is attached
to a dummy signal handler; by blocking the signal outside KVM_RUN and
unblocking it inside, this possible race is closed:

          VCPU thread                     service thread
   --------------------------------------------------------------
        check flag
                                          set flag
                                          raise signal
        (signal handler does nothing)
        KVM_RUN

However, one issue with KVM_SET_SIGNAL_MASK is that it has to take
tsk->sighand->siglock on every KVM_RUN.  This lock is often on a
remote NUMA node, because it is on the node of a thread's creator.
Taking this lock can be very expensive if there are many userspace
exits (as is the case for SMP Windows VMs without Hyper-V reference
time counter).

KVM_CAP_IMMEDIATE_EXIT provides an alternative, where the flag is
placed directly in kvm_run so that KVM can see it:

          VCPU thread                     service thread
   --------------------------------------------------------------
                                          raise signal
        signal handler
          set run->immediate_exit
        KVM_RUN
          check run->immediate_exit

The previous patches changed QEMU so that the only blocked signal is
SIG_IPI, so we can now stop using KVM_SET_SIGNAL_MASK and sigtimedwait
if KVM_CAP_IMMEDIATE_EXIT is available.

On a 14-VCPU guest, an "inl" operation goes down from 30k to 6k on
an unlocked (no BQL) MemoryRegion, or from 30k to 15k if the BQL
is involved.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:02 +01:00
Paolo Bonzini
c5c6679d37 kvm: use atomic_read/atomic_set to access cpu->exit_request
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:02 +01:00
Paolo Bonzini
18268b6016 KVM: move SIG_IPI handling to kvm-all.c
This lets us remove a bunch of CONFIG_LINUX defines.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:02 +01:00
Paolo Bonzini
2ae41db262 KVM: do not use sigtimedwait to catch SIGBUS
Call kvm_on_sigbus_vcpu asynchronously from the VCPU thread.
Information for the SIGBUS can be stored in thread-local variables
and processed later in kvm_cpu_exec.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:02 +01:00
Paolo Bonzini
4d39892cca KVM: remove kvm_arch_on_sigbus
Build it on kvm_arch_on_sigbus_vcpu instead.  They do the same
for "action optional" SIGBUSes, and the main thread should never get
"action required" SIGBUSes because it blocks the signal.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:02 +01:00
Paolo Bonzini
a16fc07ebd cpus: reorganize signal handling code
Move the KVM "eat signals" code under CONFIG_LINUX, in preparation
for moving it to kvm-all.c; reraise non-MCE SIGBUS immediately,
without passing it to KVM.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:02 +01:00
Paolo Bonzini
20e0ff59a9 KVM: x86: cleanup SIGBUS handlers
This patch should have no semantic change.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:02 +01:00
Paolo Bonzini
d98d407234 cpus: remove ugly cast on sigbus_handler
The cast is there because sigbus_handler is invoked via sigfd_handler.
But it feels just wrong to use struct qemu_signalfd_siginfo in the
prototype of a function that is passed to sigaction.

Instead, do a simple-minded conversion of qemu_signalfd_siginfo to
siginfo_t.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:40:02 +01:00
Paolo Bonzini
30f3dda24b Merge branch 'icount-update' into HEAD
Merge the original development branch due to breakage caused by the
MTTCG merge.

Conflicts:
	cpu-exec.c
	translate-common.c

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-03-03 16:39:18 +01:00
Peter Maydell
5b10b94bd5 Merge remote-tracking branch 'remotes/ehabkost/tags/numa-pull-request' into staging
NUMA documentation update

# gpg: Signature made Fri 03 Mar 2017 13:11:25 GMT
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/numa-pull-request:
  qemu-options: Rewrite -numa documentation

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-03 14:59:45 +00:00
Peter Maydell
9a17d32721 Merge remote-tracking branch 'remotes/dgibson/tags/submodule-update-20170303' into staging
submodule updates (SLOF & dtc) 2017-03-03

This set of patches updates the SLOF and dtc submodules for qemu-2.9.

The SLOF update could have gone in my ppc pull request earlier today,
but I forgot it.  It should be safe to apply in either order with that
set though.

The dtc (and libfdt) update brings us up to dtc 1.4.3 which includes
some things that will be useful in future.

# gpg: Signature made Fri 03 Mar 2017 06:29:31 GMT
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg:                 aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/submodule-update-20170303:
  Update dtc submodule to v1.4.3
  pseries: Update SLOF firmware image

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-03 14:04:27 +00:00
Eduardo Habkost
4b9a5dd762 qemu-options: Rewrite -numa documentation
Rewrite the -numa documentation to clarify what exactly it does.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170123180632.28942-3-ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2017-03-03 10:08:03 -03:00
Peter Maydell
1ec2dca691 Merge remote-tracking branch 'remotes/berrange/tags/pull-qio-2017-02-27-2' into staging
Merge qio 2017/02/27 v2

# gpg: Signature made Thu 02 Mar 2017 16:09:27 GMT
# gpg:                using RSA key 0xBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF

* remotes/berrange/tags/pull-qio-2017-02-27-2:
  io: fully parse & validate HTTP headers for websocket protocol handshake
  tests: fix leaks in test-io-channel-command
  io: fix decoding when multiple websockets frames arrive at once

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-03 12:53:33 +00:00
Peter Maydell
508e038a5d dtc: Revert unintentional submodule downgrade from commit 077dd74239
Commit 077dd74239 inadvertently downgraded the 'dtc' submodule,
undoing the increment added in commit 6e85fce022. Revert this,
returning the submodule state to where we should be.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-03 12:48:42 +00:00
Peter Maydell
9a81b792cc Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
virtio, pc: fixes, features

virtio support for region caches broke a bunch of stuff - fixing most of
it though it's not ideal.  Still pondering the right way to fix it.
New: VM gen ID and hotplug for PXB.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Thu 02 Mar 2017 06:19:17 GMT
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  hw/pxb-pcie: fix PCI Express hotplug support
  tests/acpi: update DSDT after last patch
  acpi: simplify _OSC
  virtio: unbreak virtio-pci with IOMMU after caching ring translations
  virtio: add missing region cache init in virtio_load()
  virtio: invalidate memory in vring_set_avail_event()
  virtio: guard vring access when setting notification
  virtio: check for vring setup in virtio_queue_empty
  MAINTAINERS: Add VM Generation ID entries
  tests: Move reusable ACPI code into a utility file
  qmp/hmp: add query-vm-generation-id and 'info vm-generation-id' commands
  ACPI: Add Virtual Machine Generation ID support
  ACPI: Add vmgenid blob storage to the build tables
  docs: VM Generation ID device description
  linker-loader: Add new 'write pointer' command

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-03 10:09:03 +00:00
Nikunj A Dadhania
992d7e976c target/ppc: rewrite f[n]m[add,sub] using float64_muladd
Use the softfloat api for fused multiply-add.
Introduce routine to set the FPSCR flags VXNAN, VXIMZ nad VMISI.

Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03 11:38:33 +11:00
Sam Bobroff
ec975e839c spapr: Small cleanup of PPC MMU enums
The PPC MMU types are sometimes treated as if they were a bit field
and sometime as if they were an enum which causes maintenance
problems: flipping bits in the MMU type (which is done on both the 1TB
segment and 64K segment bits) currently produces new MMU type
values that are not handled in every "switch" on it, sometimes causing
an abort().

This patch provides some macros that can be used to filter out the
"bit field-like" bits so that the remainder of the value can be
switched on, like an enum. This allows removal of all of the
"degraded" types from the list and should ease maintenance.

Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03 11:30:59 +11:00
David Gibson
bb99864528 spapr_pci: Advertise access to PCIe extended config space
The (paravirtual) PCI host bridge on the 'pseries' machine in most
regards acts like a regular PCI bus, rather than a PCIe bus.  Despite
this, though, it does allow access to the PCIe extended config space.

We already implemented the RTAS methods to allow this access.. but
forgot to put the markers into the device tree so that guest's know it
is there.  This adds them in.

With this, a pseries guest is able to view extended config space on
(for example an e1000e device.  This should be enough to allow guests
to use at least some PCIe devices.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03 11:30:59 +11:00
Suraj Jitindar Singh
da82c73a95 target/ppc: Rework hash mmu page fault code and add defines for clarity
The hash mmu page fault handling code is responsible for generating ISIs
and DSIs when access permissions cause an access to fail. Part of this
involves setting the srr1 or dsisr registers to indicate what causes the
access to fail. Add defines for the bit fields of these registers and
rework the code to use these new defines in order to improve readability
and code clarity.

While we're here, update what is logged when an access fails to include
information as to what caused to access to fail for debug purposes.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[dwg: Moved constants to cpu.h since they're not MMUv3 specific]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03 11:30:59 +11:00
Suraj Jitindar Singh
07a68f9907 target/ppc: Move no-execute and guarded page checking into new function
A pte entry has bit fields which can be used to make a page no-execute or
guarded, if either of these bits are set then an instruction access to this
page will fail. Currently these bits are checked with the pp_prot function
however the ISA specifies that the access authority controlled by the
key-pp value pair should only be checked on an instruction access after
the no-execute and guard bits have already been verified to permit the
access.

Move the no-execute and guard bit checking into a new separate function.
Note that we can remove the check for the no-execute bit in the slb entry
since this check was already performed above when we obtained the slb
entry.

In the event that the no-execute or guard bits are set, an ISI should be
generated with the SRR1_NOEXEC_GUARD (0x10000000) bit set in srr1. Add a
define for this for clarity.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[dwg: Move constants to cpu.h since they're not MMUv3 specific]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03 11:30:59 +11:00
Suraj Jitindar Singh
347a5c73ba target/ppc: Add execute permission checking to access authority check
Basic storage protection defines various access authority permissions
based on a slb storage key and pte pp value pair. This access authority
defines read, write and execute permissions however currently we only
use this to control read and write permissions and ignore the execute
control.

Fix the code to allow execute permissions based on the key-pp value pair.
Execute is allowed under the same conditions which enable reads.
(i.e. read permission -> execute permission)

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03 11:30:59 +11:00
Suraj Jitindar Singh
a6152b52bc target/ppc: Add Instruction Authority Mask Register Check
The instruction authority mask register (IAMR) can be used to restrict
permissions for instruction fetch accesses on a per key basis for each
of 32 different key values. Access permissions are derived based on the
specific key value stored in the relevant page table entry.

The IAMR was introduced in, and is present in processors since, POWER8
(ISA v2.07). Thus introduce a function to check access permissions based
on the pte key value and the contents of the IAMR when handling a page
fault to ensure sufficient access permissions for an instruction fetch.

A hash pte contains a key value in bits 2:3|52:54 of the second double word
of the pte, this key value gives an index into the IAMR which contains 32
2-bit access masks. If the least significant bit of the 2-bit access mask
corresponding to the given key value is set (IAMR[key] & 0x1 == 0x1) then
the instruction fetch is not permitted and an ISI is generated accordingly.
While we're here, add defines for the srr1 bits to be set for the ISI for
clarity.

e.g.

pte:
dw0 [XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX]
dw1 [XX01XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX010XXXXXXXXX]
       ^^                                                ^^^
key = 01010 (0x0a)

IAMR: [XXXXXXXXXXXXXXXXXXXX01XXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXXX]
                           ^^
Access mask = 0b01

Test access mask: 0b01 & 0x1 == 0x1

Least significant bit of the access mask is set, thus the instruction fetch
is not permitted. We should generate an instruction storage interrupt (ISI)
with bit 42 of SRR1 set to indicate access precluded by virtual page class
key protection.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[dwg: Move new constants to cpu.h, since they're not MMUv3 specific]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03 11:30:59 +11:00
Suraj Jitindar Singh
24d8e5655f hw/ppc/spapr: Add POWER9 to pseries cpu models
Add POWER9 cpu to list of spapr core models which allows it to be specified
as the cpu model for a pseries guest (e.g. -machine pseries -cpu POWER9).

This now allows a POWER9 cpu to boot to userspace in tcg emulation for a
pseries machine with a legacy kernel.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03 11:30:59 +11:00
Suraj Jitindar Singh
6f46dcb3e5 target/ppc/POWER9: Add cpu_has_work function for POWER9
The cpu has work function is used to mask interrupts used to determine
if there is work for the cpu based on the LPCR. Add a function to do this
for POWER9 and add it to the POWER9 cpu definition. This is similar to that
for POWER8 except using the LPCR bits as defined for POWER9.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03 11:30:59 +11:00
Suraj Jitindar Singh
4975c098c9 target/ppc/POWER9: Add POWER9 pa-features definition
Add a pa-features definition which includes all of the new fields which
have been added, note we don't claim support for any of these new features
at this stage.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Balbir Singh <bsingharora@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03 11:30:59 +11:00
Suraj Jitindar Singh
b2899495e3 target/ppc/POWER9: Add POWER9 mmu fault handler
Add a new mmu fault handler for the POWER9 cpu and add it as the handler
for the POWER9 cpu definition.

This handler checks if the guest is radix or hash based on the value in the
partition table entry and calls the correct fault handler accordingly.

The hash fault handling code has also been updated to check if the
partition is using segment tables.

Currently only legacy hash (no segment tables) is supported.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03 11:30:59 +11:00
Suraj Jitindar Singh
4f4f28ffc1 target/ppc: Don't gen an SDR1 on POWER9 and rework register creation
POWER9 doesn't have a storage description register 1 (SDR1) which is used
to store the base and size of the hash table. Thus we don't need to
generate this register on the POWER9 cpu model. While we're here, the
register generation code for 970, POWER5+, POWER<7/8/9> in general is a
mess where we call a generic function from a model specific function which
then attempts to call model specific functions, so rework this for
readability.

We update ppc_cpu_dump_state so that "info registers" will only display
the value of sdr1 if the register has been generated.

As mentioned above the register generation for the pcc->init_proc
function for 970, POWER5+, POWER7, POWER8 and POWER9 has been reworked
for improved clarity. Instead of calling init_proc_book3s_64 which then
attempts to generate the correct registers through a mess of if statements,
we remove this function and instead call the appropriate register
generation functions directly. This follows the register generation model
used for earlier cpu models (pre-970) whereby cpu specific registers are
generated directly in the init_proc function and makes it easier to
add/remove specific registers for new cpu models.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03 11:30:59 +11:00
Suraj Jitindar Singh
9861bb3efd target/ppc: Add patb_entry to sPAPRMachineState
ISA v3.00 adds the idea of a partition table which is used to store the
address translation details for all partitions on the system. The partition
table consists of double word entries indexed by partition id where the second
double word contains the location of the process table in guest memory. The
process table is registered by the guest via a h-call.

We need somewhere to store the address of the process table so we add an entry
to the sPAPRMachineState struct called patb_entry to represent the second
doubleword of a single partition table entry corresponding to the current
guest. We need to store this value so we know if the guest is using radix or
hash translation and the location of the corresponding process table in guest
memory. Since we only have a single guest per qemu instance, we only need one
entry.

Since the partition table is technically a hypervisor resource we require that
access to it is abstracted by the virtual hypervisor through the get_patbe()
call. Currently the value of the entry is never set (and thus
defaults to 0 indicating hash), but it will be required to both implement
POWER9 kvm support and tcg radix support.

We also add this field to be migrated as part of the sPAPRMachineState as we
will need it on the receiving side as the guest will never tell us this
information again and we need it to perform translation.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03 11:30:59 +11:00
David Gibson
0922f1e487 target/ppc/POWER9: Add POWERPC_MMU_V3 bit
For easier handling of future processors using the POWER9 or something
close to it, add a new bit in the MMU model.  This was originally from a
revised version of 86cf1e9 "target/ppc/POWER9: Add ISAv3.00 MMU definition"
but the older version of the patch was already merged.  This makes the
change on top of the original version.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03 11:30:59 +11:00
David Gibson
eaa477ca4e powernv: Don't test POWER9 CPU yet
A couple of tests for the work-in-progress 'powernv' machine type attempt
to test on POWER9 CPUs.  However the POWER9 CPU support is incomplete and
this doesn't really work.  In particular the firmware image we have
currently assumes the presence of the SDR1 register, which no longer exists
on POWER9.  We only got away with this so far, because of a different bug
which added SDR1 to POWER9 even though it shouldn't be there.

For now, remove POWER9 testing of powernv, POWER8 testing will do for now
until the POWER9 support is more complete.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03 11:30:59 +11:00
Alexey Kardashevskiy
9c60766887 exec, kvm, target-ppc: Move getrampagesize() to common code
getrampagesize() returns the largest supported page size and mainly
used to know if huge pages are enabled.

However is implemented in target-ppc/kvm.c and not available
in TCG or other architectures.

This renames and moves gethugepagesize() to mmap-alloc.c where
fd-based analog of it is already implemented. This renames and moves
getrampagesize() to exec.c as it seems to be the common place for
helpers like this.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03 11:30:59 +11:00
Suraj Jitindar Singh
9b44c836dc target/ppc: Add POWER9/ISAv3.00 to compat_table
compat_table contains the list of logical pvr compat modes which a cpu can
operate in. It is a list of struct CompatInfo which contains the given pvr
value for a compat mode, the pcr bits which should be set to operate in
that compat mode, the pcr level which must be present in pcr_supported for
a processor to support that compat mode and the max threads possible in
that compat mode.

Add an entry for the POWER9/ISAv3.00 logical pvr which represents a
processor running with support for logical pvr 0x0f000005. A processor
running in this mode should have PCR_COMPAT_3_00 set in the pcr (if
available in pcr_mask) and should have PCR_COMPAT_3_00 in pcr_supported
to indicate that it is capable of running in this compat mode.

Also add PCR_COMPAT_3_00 to the bits which must be set for all previous
compat modes. Since no processor models contain this bit yet in pcr_mask
it will never be set, but this ensures we don't forget to in the future.

Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2017-03-03 11:30:59 +11:00
Peter Maydell
54639aed97 Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging
# gpg: Signature made Thu 02 Mar 2017 03:42:59 GMT
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  block/rbd: add support for 'mon_host', 'auth_supported' via QAPI
  block/rbd: add blockdev-add support
  block/rbd: parse all options via bdrv_parse_filename
  block/rbd: add all the currently supported runtime_opts
  block/rbd: don't copy strings in qemu_rbd_next_tok()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2017-03-02 23:20:37 +00:00
Marcel Apfelbaum
077dd74239 hw/pxb-pcie: fix PCI Express hotplug support
Add the missing osc method for pxb-pcie devices as APCI spec recommends,
see 6.2.9.1 OSC Implementation Example for PCI Host Bridge Devices, ACPI 3.0a:

    It is recommended that a machine with multiple host bridge devices
    should report the same capabilities for all host bridges, and also
    negotiate control of the features described in the Control Field in
    the same way for all host bridges.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-02 07:31:26 +02:00
Michael S. Tsirkin
5cb206b58d tests/acpi: update DSDT after last patch
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-02 07:31:26 +02:00
Michael S. Tsirkin
b3c782db20 acpi: simplify _OSC
Our _OSC method has a bunch of unused code loading data
into external CTRL and SUPP fields which are then never
used. Drop this in favor of a single local variable.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2017-03-02 07:29:10 +02:00
Jason Wang
96a8821d21 virtio: unbreak virtio-pci with IOMMU after caching ring translations
Commit c611c76417 ("virtio: add MemoryListener to cache ring
translations") registers a memory listener to dma_as. This may not
work when IOMMU is enabled: dma_as(bus_master_as) were initialized in
pcibus_machine_done() after virtio_realize(). This will cause a
segfault. Fixing this by using pci_device_iommu_address_space()
instead to make sure address space were initialized at this time.

With this fix, IOMMU device were required to be initialized before any
virtio-pci devices.

Fixes: c611c76417 ("virtio: add MemoryListener to cache ring translations")
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-02 07:14:28 +02:00
Stefan Hajnoczi
874adf45db virtio: add missing region cache init in virtio_load()
Commit 97cd965c07 ("virtio: use
VRingMemoryRegionCaches for avail and used rings") switched to a memory
region cache to avoid repeated map/unmap operations.

The virtio_load() process is a little tricky because vring addresses are
serialized in two separate places.  VIRTIO 1.0 devices serialize desc
and then a subsection with used and avail.  Legacy devices only
serialize desc.

Live migration of VIRTIO 1.0 devices fails on the destination host with:

  VQ 0 size 0x80 < last_avail_idx 0x12f8 - used_idx 0x0
  Failed to load virtio-blk:virtio
  error while loading state for instance 0x0 of device '0000:00:04.0/virtio-blk'

This happens because the memory region cache is only initialized after
desc is loaded and not after the used and avail subsection is loaded.
If the guest chose memory addresses that don't match the legacy ring
layout then the wrong guest memory location is accessed.

Wait until all ring addresses are known before trying to initialize the
region cache.  Also clarify the incomplete comment about VIRTIO-1 ring
address subsection.

Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
2017-03-02 07:14:28 +02:00
Stefan Hajnoczi
3cdf847329 virtio: invalidate memory in vring_set_avail_event()
Remember to invalidate the avail event field so the memory pages are
marked dirty.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
2017-03-02 07:14:27 +02:00
Cornelia Huck
34c6bf22a8 virtio: guard vring access when setting notification
Switching to vring caches exposed an existing bug in
virtio_queue_set_notification(): We can't access vring structures
if they have not been set up yet. This may happen, for example,
for virtio-blk devices with multiple queues: The code will try to
switch notifiers for every queue, but the guest may have only set up
a subset of them.

Fix this by guarding access to the vring memory by checking for
vring.desc. The first aio poll will iron out any remaining
inconsistencies for later-configured queues (buggy legacy drivers).

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-02 07:14:27 +02:00
Paolo Bonzini
dd3dd4ba7b virtio: check for vring setup in virtio_queue_empty
If the vring has not been set up, there is nothing in the virtqueue.
virtio_queue_host_notifier_aio_poll calls virtio_queue_empty even in
this case; we have to filter it out just like virtio_queue_notify_aio_vq.

Reported-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2017-03-02 07:14:27 +02:00
Ben Warren
42697d88de MAINTAINERS: Add VM Generation ID entries
Signed-off-by: Ben Warren <ben@skyportsystems.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2017-03-02 07:14:27 +02:00
Ben Warren
3248f1b4e0 tests: Move reusable ACPI code into a utility file
Also usable by upcoming VM Generation ID tests

Signed-off-by: Ben Warren <ben@skyportsystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
2017-03-02 07:14:27 +02:00
Igor Mammedov
39164c136c qmp/hmp: add query-vm-generation-id and 'info vm-generation-id' commands
Add commands to query Virtual Machine Generation ID counter.

QMP command example:
    { "execute": "query-vm-generation-id" }

HMP command example:
    info vm-generation-id

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Ben Warren <ben@skyportsystems.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-02 07:14:27 +02:00
Ben Warren
d03637bcfb ACPI: Add Virtual Machine Generation ID support
This implements the VM Generation ID feature by passing a 128-bit
GUID to the guest via a fw_cfg blob.
Any time the GUID changes, an ACPI notify event is sent to the guest

The user interface is a simple device with one parameter:
 - guid (string, must be "auto" or in UUID format
   xxxxxxxx-xxxx-xxxx-xxxx-xxxxxxxxxxxx)

Signed-off-by: Ben Warren <ben@skyportsystems.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-02 07:14:27 +02:00
Ben Warren
c7809e6cd7 ACPI: Add vmgenid blob storage to the build tables
This allows them to be centrally initialized and destroyed

The "AcpiBuildTables.vmgenid" array will be used to construct the
"etc/vmgenid_guid" fw_cfg blob.

Its contents will be linked into fw_cfg after being built on the
pc_machine_done() -> acpi_setup() -> acpi_build() call path, and dropped
without use on the subsequent, guest triggered, acpi_build_update() ->
acpi_build() call path.

Signed-off-by: Ben Warren <ben@skyportsystems.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-02 07:14:26 +02:00
Ben Warren
20f5d14dc9 docs: VM Generation ID device description
This patch is based off an earlier version by
Gal Hammer (ghammer@redhat.com)

Requirements section, ASCII diagrams and overall help
provided by Laszlo Ersek (lersek@redhat.com)

Signed-off-by: Gal Hammer <ghammer@redhat.com>
Signed-off-by: Ben Warren <ben@skyportsystems.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-02 07:14:26 +02:00
Ben Warren
489886d118 linker-loader: Add new 'write pointer' command
This is similar to the existing 'add pointer' functionality, but instead
of instructing the guest (BIOS or UEFI) to patch memory, it instructs
the guest to write the pointer back to QEMU via a writeable fw_cfg file.

Signed-off-by: Ben Warren <ben@skyportsystems.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2017-03-02 07:14:26 +02:00
Jeff Cody
0a55679b4a block/rbd: add support for 'mon_host', 'auth_supported' via QAPI
This adds support for three additional options that may be specified
by QAPI in blockdev-add:

    server: host, port
    auth method: either 'cephx' or 'none'

The "server" and "auth-supported" QAPI parameters are arrays.  To conform
with the rados API, the array items are join as a single string with a ';'
character as a delimiter when setting the configuration values.

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-03-01 22:39:25 -05:00
Jeff Cody
8a47e8eb59 block/rbd: add blockdev-add support
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-28 11:32:40 -05:00
Jeff Cody
c7cacb3e7a block/rbd: parse all options via bdrv_parse_filename
Get rid of qemu_rbd_parsename in favor of bdrv_parse_filename.
This simplifies a lot of the parsing as well, as we can treat everything
a bit simpler since nonexistent options are simply NULL pointers instead
of empty strings.

An important item to note:

Ceph has many extra option values that can be specified as key/value
pairs.  This was handled previously in the driver by extracting the
values that the QEMU driver cared about, and then blindly passing all
extra options to rbd after splitting them into key/value pairs, and
cleaning up any special character escaping.

The practice is continued in this patch; there is an option
"keyvalue-pairs" that is populated with all the key/value pairs that the
QEMU driver does not care about.  These key/value pairs will override
any settings in the 'conf' configuration file, just as they did before.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-28 11:32:31 -05:00
Jeff Cody
0f9d252de4 block/rbd: add all the currently supported runtime_opts
This adds all the currently supported runtime opts, which
are the options as parsed from the filename.  All of these
options are explicitly checked for during during runtime,
with an exception to the "keyvalue-pairs" option.

This option contains all the key/value pairs that the QEMU rbd
driver merely unescapes, and passes along blindly to rados.  This
option is a "legacy" option, and will not be exposed in the QAPI
or available for introspection.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-28 11:32:25 -05:00
Jeff Cody
7830f90998 block/rbd: don't copy strings in qemu_rbd_next_tok()
This patch is prep work for parsing options for .bdrv_parse_filename,
and using QDict options.

The function qemu_rbd_next_tok() searched for various key/value pairs,
and copied them into buffers.  This will soon be an unnecessary extra
step, so we will now return found strings by reference only, and
offload the responsibility for safely handling/coping these strings to
the caller.

This also cleans up error handling some, as the callers now rely on
the Error object to determine if there is a parse error.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2017-02-28 11:32:06 -05:00
Daniel P. Berrange
07e95cd529 io: fully parse & validate HTTP headers for websocket protocol handshake
The current websockets protocol handshake code is very relaxed, just
doing crude string searching across the HTTP header data. This causes
it to both reject valid connections and fail to reject invalid
connections. For example, according to the RFC 6455 it:

 - MUST reject any method other than "GET"
 - MUST reject any HTTP version less than "HTTP/1.1"
 - MUST reject Connection header without "Upgrade" listed
 - MUST reject Upgrade header which is not 'websocket'
 - MUST reject missing Host header
 - MUST treat HTTP header names as case insensitive

To do all this validation correctly requires that we fully parse the
HTTP headers, populating a data structure containing the header
fields.

After this change, we also reject any path other than '/'

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-02-28 11:51:16 +00:00
Marc-André Lureau
e3d90312b3 tests: fix leaks in test-io-channel-command
No need for strdup, fix leaks when socat is missing.

Spotted by ASAN.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-02-28 10:36:32 +00:00
Daniel P. Berrange
cd892a2efc io: fix decoding when multiple websockets frames arrive at once
The qio_channel_websock_read_wire() method will read upto 4096
bytes off the socket and then decode the websockets header and
payload. The code was only decoding a single websockets frame,
even if the buffered data contained multiple frames. This meant
that decoding of subsequent frames was delayed until further
input arrived on the socket. This backlog of delayed frames
gets worse & worse over time.

Symptom was that when connecting to the VNC server via the
built-in websockets server, mouse/keyboard interaction would
start out fine, but slowly get more & more delayed until it
was unusable.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2017-02-28 10:36:31 +00:00
Paolo Bonzini
55ac0a9bf4 cpu-exec: remove unnecessary check of cpu->exit_request
The cpu->exit_request check in cpu_loop_exec_tb is unnecessary,
because cpu->tcg_exit_req is always set after cpu->exit_request.
So let the TB exit and we will pick up the exit request later
in cpu_handle_interrupt.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-24 15:51:19 +01:00
Pavel Dovgalyuk
cfb2d02be9 replay: check icount in cpu exec loop
This patch adds check to break cpu loop when icount expires without
setting the TB_EXIT_ICOUNT_EXPIRED flag. It happens when there is no
available translated blocks and all instructions were executed.
In icount replay mode unnecessary tb_find will be called (which may
cause an exception) and execution will be non-deterministic.
Because cpu_loop_exec_tb cannot longjmp anymore, we can remove
the anticipated call to align_clocks in cpu_loop_exec_tb, as
well as the SyncClocks *sc argument.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <002801d2810f$18809c20$4981d460$@ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <dovgaluk@ispras.ru>
2017-02-24 15:51:19 +01:00
Max Filippov
cb3825b9af target/xtensa: add two missing headers to core import script
Include qemu/osdep.h and qemu-common.h at the beginning of imported
xtensa core source file.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2017-02-23 10:50:56 -08:00
Max Filippov
b68755c142 target/xtensa: sim: instantiate local memories
Xtensa core may have a number of RAM and ROM areas configured. Record
their size and location from the core configuration overlay and
instantiate them as RAM regions in the SIM machine.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2017-02-23 10:30:41 -08:00
Paolo Bonzini
1aab16c28a cpu-exec: unify icount_decr and tcg_exit_req
The icount interrupt flag and tcg_exit_req serve almost the same
purpose, let's make them completely the same.

The former TB_EXIT_REQUESTED and TB_EXIT_ICOUNT_EXPIRED cases are
unified, since we can distinguish them from the value of the
interrupt flag.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2017-02-22 14:56:34 +01:00
872 changed files with 24601 additions and 11935 deletions

10
.gitignore vendored
View File

@@ -99,15 +99,15 @@
/pc-bios/optionrom/kvmvapic.img
/pc-bios/s390-ccw/s390-ccw.elf
/pc-bios/s390-ccw/s390-ccw.img
/docs/qemu-ga-qapi.texi
/docs/qemu-ga-ref.html
/docs/qemu-ga-ref.info*
/docs/qemu-ga-ref.txt
/docs/qemu-qmp-qapi.texi
/docs/qemu-qmp-ref.html
/docs/qemu-qmp-ref.info*
/docs/qemu-qmp-ref.txt
docs/qemu-ga-ref.info*
docs/qemu-qmp-ref.info*
/qemu-ga-qapi.texi
/qemu-qapi.texi
/version.texi
/docs/version.texi
*.tps
.stgit-*
cscope.*

View File

@@ -327,6 +327,7 @@ L: xen-devel@lists.xenproject.org
S: Supported
F: xen-*
F: */xen*
F: hw/9pfs/xen-9p-backend.c
F: hw/char/xen_console.c
F: hw/display/xenfb.c
F: hw/net/xen_nic.c
@@ -600,15 +601,28 @@ S: Maintained
F: hw/mips/mips_malta.c
Mipssim
L: qemu-devel@nongnu.org
S: Orphan
M: Yongbok Kim <yongbok.kim@imgtec.com>
S: Odd Fixes
F: hw/mips/mips_mipssim.c
F: hw/net/mipsnet.c
R4000
M: Aurelien Jarno <aurelien@aurel32.net>
S: Maintained
F: hw/mips/mips_r4k.c
Fulong 2E
M: Yongbok Kim <yongbok.kim@imgtec.com>
S: Odd Fixes
F: hw/mips/mips_fulong2e.c
Boston
M: Paul Burton <paul.burton@imgtec.com>
S: Maintained
F: hw/core/loader-fit.c
F: hw/mips/boston.c
F: hw/pci-host/xilinx-pcie.c
OpenRISC Machines
-----------------
or1k-sim
@@ -632,7 +646,6 @@ F: hw/ppc/ppc440_bamboo.c
e500
M: Alexander Graf <agraf@suse.de>
M: Scott Wood <scottwood@freescale.com>
L: qemu-ppc@nongnu.org
S: Supported
F: hw/ppc/e500.[hc]
@@ -643,7 +656,6 @@ F: pc-bios/u-boot.e500
mpc8544ds
M: Alexander Graf <agraf@suse.de>
M: Scott Wood <scottwood@freescale.com>
L: qemu-ppc@nongnu.org
S: Supported
F: hw/ppc/mpc8544ds.c
@@ -908,6 +920,8 @@ F: hw/acpi/*
F: hw/smbios/*
F: hw/i386/acpi-build.[hc]
F: hw/arm/virt-acpi-build.c
F: tests/bios-tables-test.c
F: tests/acpi-utils.[hc]
ppc4xx
M: Alexander Graf <agraf@suse.de>
@@ -918,7 +932,6 @@ F: include/hw/ppc/ppc4xx.h
ppce500
M: Alexander Graf <agraf@suse.de>
M: Scott Wood <scottwood@freescale.com>
L: qemu-ppc@nongnu.org
S: Supported
F: hw/ppc/e500*
@@ -1122,6 +1135,15 @@ F: hw/nvram/chrp_nvram.c
F: include/hw/nvram/chrp_nvram.h
F: tests/prom-env-test.c
VM Generation ID
M: Ben Warren <ben@skyportsystems.com>
S: Maintained
F: hw/acpi/vmgenid.c
F: include/hw/acpi/vmgenid.h
F: docs/specs/vmgenid.txt
F: tests/vmgenid-test.c
F: stubs/vmgenid.c
Subsystems
----------
Audio
@@ -1207,6 +1229,15 @@ M: Samuel Thibault <samuel.thibault@ens-lyon.org>
S: Maintained
F: backends/baum.c
Command line option argument parsing
M: Markus Armbruster <armbru@redhat.com>
S: Supported
F: include/qemu/option.h
F: tests/test-keyval.c
F: tests/test-qemu-opts.c
F: util/keyval.c
F: util/qemu-option.c
Coverity model
M: Markus Armbruster <armbru@redhat.com>
S: Supported
@@ -1341,7 +1372,9 @@ X: include/qapi/qmp/
F: include/qapi/qmp/dispatch.h
F: tests/qapi-schema/
F: tests/test-*-visitor.c
F: tests/test-qapi-*.c
F: tests/test-qmp-*.c
F: tests/test-visitor-serialization.c
F: scripts/qapi*
F: docs/qapi*
T: git git://repo.or.cz/qemu/armbru.git qapi-next
@@ -1393,6 +1426,7 @@ F: qmp.c
F: monitor.c
F: docs/*qmp-*
F: scripts/qmp/
F: tests/qmp-test.c
T: git git://repo.or.cz/qemu/armbru.git qapi-next
Register API
@@ -1662,14 +1696,6 @@ S: Supported
F: block/ssh.c
T: git git://github.com/codyprime/qemu-kvm-jtc.git block
ARCHIPELAGO
M: Chrysostomos Nanakos <chris@include.gr>
M: Jeff Cody <jcody@redhat.com>
L: qemu-block@nongnu.org
S: Maintained
F: block/archipelago.c
T: git git://github.com/codyprime/qemu-kvm-jtc.git block
CURL
M: Jeff Cody <jcody@redhat.com>
L: qemu-block@nongnu.org
@@ -1789,8 +1815,8 @@ S: Supported
F: tests/image-fuzzer/
Replication
M: Wen Congyang <wency@cn.fujitsu.com>
M: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
M: Wen Congyang <wencongyang2@huawei.com>
M: Xie Changlong <xiechanglong.d@gmail.com>
S: Supported
F: replication*
F: block/replication.c

View File

@@ -26,6 +26,7 @@ endif
CONFIG_SOFTMMU := $(if $(filter %-softmmu,$(TARGET_DIRS)),y)
CONFIG_USER_ONLY := $(if $(filter %-user,$(TARGET_DIRS)),y)
CONFIG_XEN := $(CONFIG_XEN_BACKEND)
CONFIG_ALL=y
-include config-all-devices.mak
-include config-all-disas.mak
@@ -50,24 +51,24 @@ endif
include $(SRC_PATH)/rules.mak
GENERATED_HEADERS = qemu-version.h config-host.h qemu-options.def
GENERATED_HEADERS += qmp-commands.h qapi-types.h qapi-visit.h qapi-event.h
GENERATED_SOURCES += qmp-marshal.c qapi-types.c qapi-visit.c qapi-event.c
GENERATED_HEADERS += qmp-introspect.h
GENERATED_SOURCES += qmp-introspect.c
GENERATED_FILES = qemu-version.h config-host.h qemu-options.def
GENERATED_FILES += qmp-commands.h qapi-types.h qapi-visit.h qapi-event.h
GENERATED_FILES += qmp-marshal.c qapi-types.c qapi-visit.c qapi-event.c
GENERATED_FILES += qmp-introspect.h
GENERATED_FILES += qmp-introspect.c
GENERATED_HEADERS += trace/generated-tcg-tracers.h
GENERATED_FILES += trace/generated-tcg-tracers.h
GENERATED_HEADERS += trace/generated-helpers-wrappers.h
GENERATED_HEADERS += trace/generated-helpers.h
GENERATED_SOURCES += trace/generated-helpers.c
GENERATED_FILES += trace/generated-helpers-wrappers.h
GENERATED_FILES += trace/generated-helpers.h
GENERATED_FILES += trace/generated-helpers.c
ifdef CONFIG_TRACE_UST
GENERATED_HEADERS += trace-ust-all.h
GENERATED_SOURCES += trace-ust-all.c
GENERATED_FILES += trace-ust-all.h
GENERATED_FILES += trace-ust-all.c
endif
GENERATED_HEADERS += module_block.h
GENERATED_FILES += module_block.h
TRACE_HEADERS = trace-root.h $(trace-events-subdirs:%=%/trace.h)
TRACE_SOURCES = trace-root.c $(trace-events-subdirs:%=%/trace.c)
@@ -80,11 +81,15 @@ ifdef CONFIG_TRACE_UST
TRACE_HEADERS += trace-ust-root.h $(trace-events-subdirs:%=%/trace-ust.h)
endif
GENERATED_HEADERS += $(TRACE_HEADERS)
GENERATED_SOURCES += $(TRACE_SOURCES)
GENERATED_FILES += $(TRACE_HEADERS)
GENERATED_FILES += $(TRACE_SOURCES)
GENERATED_FILES += $(BUILD_DIR)/trace-events-all
trace-group-name = $(shell dirname $1 | sed -e 's/[^a-zA-Z0-9]/_/g')
tracetool-y = $(SRC_PATH)/scripts/tracetool.py
tracetool-y += $(shell find $(SRC_PATH)/scripts/tracetool -name "*.py")
%/trace.h: %/trace.h-timestamp
@cmp $< $@ >/dev/null 2>&1 || cp $< $@
%/trace.h-timestamp: $(SRC_PATH)/%/trace-events $(tracetool-y)
@@ -341,7 +346,7 @@ dtc/%:
mkdir -p $@
$(SUBDIR_RULES): libqemuutil.a libqemustub.a $(common-obj-y) $(chardev-obj-y) \
$(qom-obj-y) $(crypto-aes-obj-$(CONFIG_USER_ONLY)) $(trace-obj-y)
$(qom-obj-y) $(crypto-aes-obj-$(CONFIG_USER_ONLY))
ROMSUBDIR_RULES=$(patsubst %,romsubdir-%, $(ROMS))
# Only keep -O and -g cflags
@@ -361,11 +366,11 @@ Makefile: $(version-obj-y)
# Build libraries
libqemustub.a: $(stub-obj-y)
libqemuutil.a: $(util-obj-y)
libqemuutil.a: $(util-obj-y) $(trace-obj-y)
######################################################################
COMMON_LDADDS = $(trace-obj-y) libqemuutil.a libqemustub.a
COMMON_LDADDS = libqemuutil.a libqemustub.a
qemu-img.o: qemu-img-cmds.h
@@ -387,7 +392,6 @@ qemu-ga$(EXESUF): QEMU_CFLAGS += -I qga/qapi-generated
gen-out-type = $(subst .,-,$(suffix $@))
qapi-py = $(SRC_PATH)/scripts/qapi.py $(SRC_PATH)/scripts/ordereddict.py
qapi-py += $(SRC_PATH)/scripts/qapi2texi.py
qga/qapi-generated/qga-qapi-types.c qga/qapi-generated/qga-qapi-types.h :\
$(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
@@ -485,11 +489,10 @@ clean:
rm -f fsdev/*.pod
rm -f qemu-img-cmds.h
rm -f ui/shader/*-vert.h ui/shader/*-frag.h
@# May not be present in GENERATED_HEADERS
@# May not be present in GENERATED_FILES
rm -f trace/generated-tracers-dtrace.dtrace*
rm -f trace/generated-tracers-dtrace.h*
rm -f $(foreach f,$(GENERATED_HEADERS),$(f) $(f)-timestamp)
rm -f $(foreach f,$(GENERATED_SOURCES),$(f) $(f)-timestamp)
rm -f $(foreach f,$(GENERATED_FILES),$(f) $(f)-timestamp)
rm -rf qapi-generated
rm -rf qga/qapi-generated
for d in $(ALL_SUBDIRS); do \
@@ -516,7 +519,7 @@ distclean: clean
rm -f qemu-doc.vr qemu-doc.txt
rm -f config.log
rm -f linux-headers/asm
rm -f qemu-ga-qapi.texi qemu-qapi.texi version.texi
rm -f docs/qemu-ga-qapi.texi docs/qemu-qmp-qapi.texi docs/version.texi
rm -f docs/qemu-qmp-ref.7 docs/qemu-ga-ref.7
rm -f docs/qemu-qmp-ref.txt docs/qemu-ga-ref.txt
rm -f docs/qemu-qmp-ref.pdf docs/qemu-ga-ref.pdf
@@ -593,8 +596,7 @@ endif
endif
install: all $(if $(BUILD_DOCS),install-doc) $(BUILD_DIR)/trace-events-all \
install-datadir install-localstatedir
install: all $(if $(BUILD_DOCS),install-doc) install-datadir install-localstatedir
ifneq ($(TOOLS),)
$(call install-prog,$(subst qemu-ga,qemu-ga$(EXESUF),$(TOOLS)),$(DESTDIR)$(bindir))
endif
@@ -663,25 +665,28 @@ ui/console-gl.o: $(SRC_PATH)/ui/console-gl.c \
# documentation
MAKEINFO=makeinfo
MAKEINFOFLAGS=--no-split --number-sections
MAKEINFOFLAGS=--no-split --number-sections -I docs
TEXIFLAG=$(if $(V),,--quiet)
version.texi: $(SRC_PATH)/VERSION
docs/version.texi: $(SRC_PATH)/VERSION
$(call quiet-command,echo "@set VERSION $(VERSION)" > $@,"GEN","$@")
%.html: %.texi version.texi
%.html: %.texi
$(call quiet-command,LC_ALL=C $(MAKEINFO) $(MAKEINFOFLAGS) --no-headers \
--html $< -o $@,"GEN","$@")
%.info: %.texi version.texi
%.info: %.texi
$(call quiet-command,$(MAKEINFO) $(MAKEINFOFLAGS) $< -o $@,"GEN","$@")
%.txt: %.texi version.texi
%.txt: %.texi
$(call quiet-command,LC_ALL=C $(MAKEINFO) $(MAKEINFOFLAGS) --no-headers \
--plaintext $< -o $@,"GEN","$@")
%.pdf: %.texi version.texi
$(call quiet-command,texi2pdf $(TEXIFLAG) -I $(SRC_PATH) -I . $< -o $@,"GEN","$@")
%.pdf: %.texi
$(call quiet-command,texi2pdf $(TEXIFLAG) -I $(SRC_PATH) -I docs $< -o $@,"GEN","$@")
docs/qemu-ga-ref.html docs/qemu-ga-ref.info docs/qemu-ga-ref.txt docs/qemu-ga-ref.pdf docs/qemu-ga-ref.7.pod: docs/version.texi
docs/qemu-qmp-ref.html docs/qemu-qmp-ref.info docs/qemu-qmp-ref.txt docs/qemu-qmp-ref.pdf docs/qemu-qmp-ref.pod: docs/version.texi
qemu-options.texi: $(SRC_PATH)/qemu-options.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@,"GEN","$@")
@@ -695,10 +700,12 @@ qemu-monitor-info.texi: $(SRC_PATH)/hmp-commands-info.hx $(SRC_PATH)/scripts/hxt
qemu-img-cmds.texi: $(SRC_PATH)/qemu-img-cmds.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@,"GEN","$@")
qemu-qapi.texi: $(qapi-modules) $(qapi-py)
docs/qemu-qmp-qapi.texi docs/qemu-ga-qapi.texi: $(SRC_PATH)/scripts/qapi2texi.py $(qapi-py)
docs/qemu-qmp-qapi.texi: $(qapi-modules)
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi2texi.py $< > $@,"GEN","$@")
qemu-ga-qapi.texi: $(SRC_PATH)/qga/qapi-schema.json $(qapi-py)
docs/qemu-ga-qapi.texi: $(SRC_PATH)/qga/qapi-schema.json
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi2texi.py $< > $@,"GEN","$@")
qemu.1: qemu-doc.texi qemu-options.texi qemu-monitor.texi qemu-monitor-info.texi
@@ -719,10 +726,10 @@ qemu-doc.html qemu-doc.info qemu-doc.pdf qemu-doc.txt: \
qemu-monitor-info.texi
docs/qemu-ga-ref.dvi docs/qemu-ga-ref.html docs/qemu-ga-ref.info docs/qemu-ga-ref.pdf docs/qemu-ga-ref.txt docs/qemu-ga-ref.7: \
docs/qemu-ga-ref.texi qemu-ga-qapi.texi
docs/qemu-ga-ref.texi docs/qemu-ga-qapi.texi
docs/qemu-qmp-ref.dvi docs/qemu-qmp-ref.html docs/qemu-qmp-ref.info docs/qemu-qmp-ref.pdf docs/qemu-qmp-ref.txt docs/qemu-qmp-ref.7: \
docs/qemu-qmp-ref.texi qemu-qapi.texi
docs/qemu-qmp-ref.texi docs/qemu-qmp-qapi.texi
ifdef CONFIG_WIN32
@@ -784,7 +791,7 @@ endif # CONFIG_WIN
# Add a dependency on the generated files, so that they are always
# rebuilt before other object files
ifneq ($(filter-out $(UNCHECKED_GOALS),$(MAKECMDGOALS)),$(if $(MAKECMDGOALS),,fail))
Makefile: $(GENERATED_HEADERS)
Makefile: $(GENERATED_FILES)
endif
.SECONDARY: $(TRACE_HEADERS) $(TRACE_HEADERS:%=%-timestamp) \

View File

@@ -157,6 +157,7 @@ trace-events-subdirs += audio
trace-events-subdirs += net
trace-events-subdirs += target/arm
trace-events-subdirs += target/i386
trace-events-subdirs += target/mips
trace-events-subdirs += target/sparc
trace-events-subdirs += target/s390x
trace-events-subdirs += target/ppc

View File

@@ -149,12 +149,6 @@ obj-y += dump.o
obj-y += migration/ram.o migration/savevm.o
LIBS := $(libs_softmmu) $(LIBS)
# xen support
obj-$(CONFIG_XEN) += xen-common.o
obj-$(CONFIG_XEN_I386) += xen-hvm.o xen-mapcache.o
obj-$(call lnot,$(CONFIG_XEN)) += xen-common-stub.o
obj-$(call lnot,$(CONFIG_XEN_I386)) += xen-hvm-stub.o
# Hardware support
ifeq ($(TARGET_NAME), sparc64)
obj-y += hw/sparc64/
@@ -162,7 +156,7 @@ else
obj-y += hw/$(TARGET_BASE_ARCH)/
endif
GENERATED_HEADERS += hmp-commands.h hmp-commands-info.h
GENERATED_FILES += hmp-commands.h hmp-commands-info.h
endif # CONFIG_SOFTMMU
@@ -188,8 +182,7 @@ dummy := $(call unnest-vars,.., \
qom-obj-y \
io-obj-y \
common-obj-y \
common-obj-m \
trace-obj-y)
common-obj-m)
target-obj-y := $(target-obj-y-save)
all-obj-y += $(common-obj-y)
all-obj-y += $(target-obj-y)
@@ -201,7 +194,7 @@ all-obj-$(CONFIG_SOFTMMU) += $(io-obj-y)
$(QEMU_PROG_BUILD): config-devices.mak
COMMON_LDADDS = $(trace-obj-y) ../libqemuutil.a ../libqemustub.a
COMMON_LDADDS = ../libqemuutil.a ../libqemustub.a
# build either PROG or PROGW
$(QEMU_PROG_BUILD): $(all-obj-y) $(COMMON_LDADDS)
@@ -238,5 +231,5 @@ ifdef CONFIG_TRACE_SYSTEMTAP
$(INSTALL_DATA) $(QEMU_PROG)-simpletrace.stp "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset/$(QEMU_PROG)-simpletrace.stp"
endif
GENERATED_HEADERS += config-target.h
Makefile: $(GENERATED_HEADERS)
GENERATED_FILES += config-target.h
Makefile: $(GENERATED_FILES)

View File

@@ -1 +1 @@
2.8.50
2.9.50

View File

@@ -320,10 +320,12 @@ static int cryptodev_builtin_sym_operation(
sess = builtin->sessions[op_info->session_id];
ret = qcrypto_cipher_setiv(sess->cipher, op_info->iv,
op_info->iv_len, errp);
if (ret < 0) {
return -VIRTIO_CRYPTO_ERR;
if (op_info->iv_len > 0) {
ret = qcrypto_cipher_setiv(sess->cipher, op_info->iv,
op_info->iv_len, errp);
if (ret < 0) {
return -VIRTIO_CRYPTO_ERR;
}
}
if (sess->direction == VIRTIO_CRYPTO_OP_ENCRYPT) {
@@ -359,8 +361,6 @@ static void cryptodev_builtin_cleanup(
}
}
assert(queues == 1);
for (i = 0; i < queues; i++) {
cc = backend->conf.peers.ccs[i];
if (cc) {

View File

@@ -51,7 +51,7 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
#ifndef CONFIG_LINUX
error_setg(errp, "-mem-path not supported on this host");
#else
if (!memory_region_size(&backend->mr)) {
if (!host_memory_backend_mr_inited(backend)) {
gchar *path;
backend->force_prealloc = mem_prealloc;
path = object_get_canonical_path(OBJECT(backend));
@@ -76,7 +76,7 @@ static void set_mem_path(Object *o, const char *str, Error **errp)
HostMemoryBackend *backend = MEMORY_BACKEND(o);
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
if (memory_region_size(&backend->mr)) {
if (host_memory_backend_mr_inited(backend)) {
error_setg(errp, "cannot change property value");
return;
}
@@ -96,7 +96,7 @@ static void file_memory_backend_set_share(Object *o, bool value, Error **errp)
HostMemoryBackend *backend = MEMORY_BACKEND(o);
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
if (memory_region_size(&backend->mr)) {
if (host_memory_backend_mr_inited(backend)) {
error_setg(errp, "cannot change property value");
return;
}

View File

@@ -45,7 +45,7 @@ host_memory_backend_set_size(Object *obj, Visitor *v, const char *name,
Error *local_err = NULL;
uint64_t value;
if (memory_region_size(&backend->mr)) {
if (host_memory_backend_mr_inited(backend)) {
error_setg(&local_err, "cannot change property value");
goto out;
}
@@ -64,14 +64,6 @@ out:
error_propagate(errp, local_err);
}
static uint16List **host_memory_append_node(uint16List **node,
unsigned long value)
{
*node = g_malloc0(sizeof(**node));
(*node)->value = value;
return &(*node)->next;
}
static void
host_memory_backend_get_host_nodes(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
@@ -82,23 +74,25 @@ host_memory_backend_get_host_nodes(Object *obj, Visitor *v, const char *name,
unsigned long value;
value = find_first_bit(backend->host_nodes, MAX_NODES);
node = host_memory_append_node(node, value);
if (value == MAX_NODES) {
goto out;
return;
}
*node = g_malloc0(sizeof(**node));
(*node)->value = value;
node = &(*node)->next;
do {
value = find_next_bit(backend->host_nodes, MAX_NODES, value + 1);
if (value == MAX_NODES) {
break;
}
node = host_memory_append_node(node, value);
*node = g_malloc0(sizeof(**node));
(*node)->value = value;
node = &(*node)->next;
} while (true);
out:
visit_type_uint16List(v, name, &host_nodes, errp);
}
@@ -152,7 +146,7 @@ static void host_memory_backend_set_merge(Object *obj, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (!memory_region_size(&backend->mr)) {
if (!host_memory_backend_mr_inited(backend)) {
backend->merge = value;
return;
}
@@ -178,7 +172,7 @@ static void host_memory_backend_set_dump(Object *obj, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (!memory_region_size(&backend->mr)) {
if (!host_memory_backend_mr_inited(backend)) {
backend->dump = value;
return;
}
@@ -214,7 +208,7 @@ static void host_memory_backend_set_prealloc(Object *obj, bool value,
}
}
if (!memory_region_size(&backend->mr)) {
if (!host_memory_backend_mr_inited(backend)) {
backend->prealloc = value;
return;
}
@@ -224,7 +218,7 @@ static void host_memory_backend_set_prealloc(Object *obj, bool value,
void *ptr = memory_region_get_ram_ptr(&backend->mr);
uint64_t sz = memory_region_size(&backend->mr);
os_mem_prealloc(fd, ptr, sz, &local_err);
os_mem_prealloc(fd, ptr, sz, smp_cpus, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
@@ -243,10 +237,19 @@ static void host_memory_backend_init(Object *obj)
backend->prealloc = mem_prealloc;
}
bool host_memory_backend_mr_inited(HostMemoryBackend *backend)
{
/*
* NOTE: We forbid zero-length memory backend, so here zero means
* "we haven't inited the backend memory region yet".
*/
return memory_region_size(&backend->mr) != 0;
}
MemoryRegion *
host_memory_backend_get_memory(HostMemoryBackend *backend, Error **errp)
{
return memory_region_size(&backend->mr) ? &backend->mr : NULL;
return host_memory_backend_mr_inited(backend) ? &backend->mr : NULL;
}
void host_memory_backend_set_mapped(HostMemoryBackend *backend, bool mapped)
@@ -328,7 +331,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
*/
if (backend->prealloc) {
os_mem_prealloc(memory_region_get_fd(&backend->mr), ptr, sz,
&local_err);
smp_cpus, &local_err);
if (local_err) {
goto out;
}

351
block.c
View File

@@ -192,6 +192,43 @@ void path_combine(char *dest, int dest_size,
}
}
bool bdrv_is_read_only(BlockDriverState *bs)
{
return bs->read_only;
}
int bdrv_can_set_read_only(BlockDriverState *bs, bool read_only, Error **errp)
{
/* Do not set read_only if copy_on_read is enabled */
if (bs->copy_on_read && read_only) {
error_setg(errp, "Can't set node '%s' to r/o with copy-on-read enabled",
bdrv_get_device_or_node_name(bs));
return -EINVAL;
}
/* Do not clear read_only if it is prohibited */
if (!read_only && !(bs->open_flags & BDRV_O_ALLOW_RDWR)) {
error_setg(errp, "Node '%s' is read only",
bdrv_get_device_or_node_name(bs));
return -EPERM;
}
return 0;
}
int bdrv_set_read_only(BlockDriverState *bs, bool read_only, Error **errp)
{
int ret = 0;
ret = bdrv_can_set_read_only(bs, read_only, errp);
if (ret < 0) {
return ret;
}
bs->read_only = read_only;
return 0;
}
void bdrv_get_full_backing_filename_from_filename(const char *backed,
const char *backing,
char *dest, size_t sz,
@@ -1157,6 +1194,13 @@ static int bdrv_open_common(BlockDriverState *bs, BlockBackend *file,
if (file != NULL) {
filename = blk_bs(file)->filename;
} else {
/*
* Caution: while qdict_get_try_str() is fine, getting
* non-string types would require more care. When @options
* come from -blockdev or blockdev_add, its members are typed
* according to the QAPI schema, but when they come from
* -drive, they're all QString.
*/
filename = qdict_get_try_str(options, "filename");
}
@@ -1262,9 +1306,14 @@ static QDict *parse_json_filename(const char *filename, Error **errp)
ret = strstart(filename, "json:", &filename);
assert(ret);
options_obj = qobject_from_json(filename);
options_obj = qobject_from_json(filename, errp);
if (!options_obj) {
error_setg(errp, "Could not parse the JSON options");
/* Work around qobject_from_json() lossage TODO fix that */
if (errp && !*errp) {
error_setg(errp, "Could not parse the JSON options");
return NULL;
}
error_prepend(errp, "Could not parse the JSON options: ");
return NULL;
}
@@ -1319,6 +1368,13 @@ static int bdrv_fill_options(QDict **options, const char *filename,
BlockDriver *drv = NULL;
Error *local_err = NULL;
/*
* Caution: while qdict_get_try_str() is fine, getting non-string
* types would require more care. When @options come from
* -blockdev or blockdev_add, its members are typed according to
* the QAPI schema, but when they come from -drive, they're all
* QString.
*/
drvname = qdict_get_try_str(*options, "driver");
if (drvname) {
drv = bdrv_find_format(drvname);
@@ -1353,6 +1409,7 @@ static int bdrv_fill_options(QDict **options, const char *filename,
}
/* Find the right block driver */
/* See cautionary note on accessing @options above */
filename = qdict_get_try_str(*options, "filename");
if (!drvname && protocol) {
@@ -1388,6 +1445,11 @@ static int bdrv_fill_options(QDict **options, const char *filename,
return 0;
}
static int bdrv_child_check_perm(BdrvChild *c, uint64_t perm, uint64_t shared,
GSList *ignore_children, Error **errp);
static void bdrv_child_abort_perm_update(BdrvChild *c);
static void bdrv_child_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared);
/*
* Check whether permissions on this node can be changed in a way that
* @cumulative_perms and @cumulative_shared_perms are the new cumulative
@@ -1398,7 +1460,8 @@ static int bdrv_fill_options(QDict **options, const char *filename,
* or bdrv_abort_perm_update().
*/
static int bdrv_check_perm(BlockDriverState *bs, uint64_t cumulative_perms,
uint64_t cumulative_shared_perms, Error **errp)
uint64_t cumulative_shared_perms,
GSList *ignore_children, Error **errp)
{
BlockDriver *drv = bs->drv;
BdrvChild *c;
@@ -1434,7 +1497,8 @@ static int bdrv_check_perm(BlockDriverState *bs, uint64_t cumulative_perms,
drv->bdrv_child_perm(bs, c, c->role,
cumulative_perms, cumulative_shared_perms,
&cur_perm, &cur_shared);
ret = bdrv_child_check_perm(c, cur_perm, cur_shared, errp);
ret = bdrv_child_check_perm(c, cur_perm, cur_shared, ignore_children,
errp);
if (ret < 0) {
return ret;
}
@@ -1554,15 +1618,15 @@ static char *bdrv_perm_names(uint64_t perm)
/*
* Checks whether a new reference to @bs can be added if the new user requires
* @new_used_perm/@new_shared_perm as its permissions. If @ignore_child is set,
* this old reference is ignored in the calculations; this allows checking
* permission updates for an existing reference.
* @new_used_perm/@new_shared_perm as its permissions. If @ignore_children is
* set, the BdrvChild objects in this list are ignored in the calculations;
* this allows checking permission updates for an existing reference.
*
* Needs to be followed by a call to either bdrv_set_perm() or
* bdrv_abort_perm_update(). */
static int bdrv_check_update_perm(BlockDriverState *bs, uint64_t new_used_perm,
uint64_t new_shared_perm,
BdrvChild *ignore_child, Error **errp)
GSList *ignore_children, Error **errp)
{
BdrvChild *c;
uint64_t cumulative_perms = new_used_perm;
@@ -1572,7 +1636,7 @@ static int bdrv_check_update_perm(BlockDriverState *bs, uint64_t new_used_perm,
assert(new_shared_perm & BLK_PERM_WRITE_UNCHANGED);
QLIST_FOREACH(c, &bs->parents, next_parent) {
if (c == ignore_child) {
if (g_slist_find(ignore_children, c)) {
continue;
}
@@ -1602,18 +1666,25 @@ static int bdrv_check_update_perm(BlockDriverState *bs, uint64_t new_used_perm,
cumulative_shared_perms &= c->shared_perm;
}
return bdrv_check_perm(bs, cumulative_perms, cumulative_shared_perms, errp);
return bdrv_check_perm(bs, cumulative_perms, cumulative_shared_perms,
ignore_children, errp);
}
/* Needs to be followed by a call to either bdrv_child_set_perm() or
* bdrv_child_abort_perm_update(). */
int bdrv_child_check_perm(BdrvChild *c, uint64_t perm, uint64_t shared,
Error **errp)
static int bdrv_child_check_perm(BdrvChild *c, uint64_t perm, uint64_t shared,
GSList *ignore_children, Error **errp)
{
return bdrv_check_update_perm(c->bs, perm, shared, c, errp);
int ret;
ignore_children = g_slist_prepend(g_slist_copy(ignore_children), c);
ret = bdrv_check_update_perm(c->bs, perm, shared, ignore_children, errp);
g_slist_free(ignore_children);
return ret;
}
void bdrv_child_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared)
static void bdrv_child_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared)
{
uint64_t cumulative_perms, cumulative_shared_perms;
@@ -1625,7 +1696,7 @@ void bdrv_child_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared)
bdrv_set_perm(c->bs, cumulative_perms, cumulative_shared_perms);
}
void bdrv_child_abort_perm_update(BdrvChild *c)
static void bdrv_child_abort_perm_update(BdrvChild *c)
{
bdrv_abort_perm_update(c->bs);
}
@@ -1635,7 +1706,7 @@ int bdrv_child_try_set_perm(BdrvChild *c, uint64_t perm, uint64_t shared,
{
int ret;
ret = bdrv_child_check_perm(c, perm, shared, errp);
ret = bdrv_child_check_perm(c, perm, shared, NULL, errp);
if (ret < 0) {
bdrv_child_abort_perm_update(c);
return ret;
@@ -1713,12 +1784,14 @@ void bdrv_format_default_perms(BlockDriverState *bs, BdrvChild *c,
*nshared = shared;
}
static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs,
bool check_new_perm)
static void bdrv_replace_child_noperm(BdrvChild *child,
BlockDriverState *new_bs)
{
BlockDriverState *old_bs = child->bs;
uint64_t perm, shared_perm;
if (old_bs && new_bs) {
assert(bdrv_get_aio_context(old_bs) == bdrv_get_aio_context(new_bs));
}
if (old_bs) {
if (old_bs->quiesce_counter && child->role->drained_end) {
child->role->drained_end(child);
@@ -1727,13 +1800,6 @@ static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs,
child->role->detach(child);
}
QLIST_REMOVE(child, next_parent);
/* Update permissions for old node. This is guaranteed to succeed
* because we're just taking a parent away, so we're loosening
* restrictions. */
bdrv_get_cumulative_perm(old_bs, &perm, &shared_perm);
bdrv_check_perm(old_bs, perm, shared_perm, &error_abort);
bdrv_set_perm(old_bs, perm, shared_perm);
}
child->bs = new_bs;
@@ -1744,18 +1810,45 @@ static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs,
child->role->drained_begin(child);
}
bdrv_get_cumulative_perm(new_bs, &perm, &shared_perm);
if (check_new_perm) {
bdrv_check_perm(new_bs, perm, shared_perm, &error_abort);
}
bdrv_set_perm(new_bs, perm, shared_perm);
if (child->role->attach) {
child->role->attach(child);
}
}
}
/*
* Updates @child to change its reference to point to @new_bs, including
* checking and applying the necessary permisson updates both to the old node
* and to @new_bs.
*
* NULL is passed as @new_bs for removing the reference before freeing @child.
*
* If @new_bs is not NULL, bdrv_check_perm() must be called beforehand, as this
* function uses bdrv_set_perm() to update the permissions according to the new
* reference that @new_bs gets.
*/
static void bdrv_replace_child(BdrvChild *child, BlockDriverState *new_bs)
{
BlockDriverState *old_bs = child->bs;
uint64_t perm, shared_perm;
if (old_bs) {
/* Update permissions for old node. This is guaranteed to succeed
* because we're just taking a parent away, so we're loosening
* restrictions. */
bdrv_get_cumulative_perm(old_bs, &perm, &shared_perm);
bdrv_check_perm(old_bs, perm, shared_perm, NULL, &error_abort);
bdrv_set_perm(old_bs, perm, shared_perm);
}
bdrv_replace_child_noperm(child, new_bs);
if (new_bs) {
bdrv_get_cumulative_perm(new_bs, &perm, &shared_perm);
bdrv_set_perm(new_bs, perm, shared_perm);
}
}
BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
const char *child_name,
const BdrvChildRole *child_role,
@@ -1782,7 +1875,7 @@ BdrvChild *bdrv_root_attach_child(BlockDriverState *child_bs,
};
/* This performs the matching bdrv_set_perm() for the above check. */
bdrv_replace_child(child, child_bs, false);
bdrv_replace_child(child, child_bs);
return child;
}
@@ -1799,6 +1892,7 @@ BdrvChild *bdrv_attach_child(BlockDriverState *parent_bs,
bdrv_get_cumulative_perm(parent_bs, &perm, &shared_perm);
assert(parent_bs->drv);
assert(bdrv_get_aio_context(parent_bs) == bdrv_get_aio_context(child_bs));
parent_bs->drv->bdrv_child_perm(parent_bs, NULL, child_role,
perm, shared_perm, &perm, &shared_perm);
@@ -1819,7 +1913,7 @@ static void bdrv_detach_child(BdrvChild *child)
child->next.le_prev = NULL;
}
bdrv_replace_child(child, NULL, false);
bdrv_replace_child(child, NULL);
g_free(child->name);
g_free(child);
@@ -1905,6 +1999,8 @@ void bdrv_set_backing_hd(BlockDriverState *bs, BlockDriverState *backing_hd,
bdrv_unref(backing_hd);
}
bdrv_refresh_filename(bs);
out:
bdrv_refresh_limits(bs, NULL);
}
@@ -1947,6 +2043,13 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *parent_options,
qdict_extract_subqdict(parent_options, &options, bdref_key_dot);
g_free(bdref_key_dot);
/*
* Caution: while qdict_get_try_str() is fine, getting non-string
* types would require more care. When @parent_options come from
* -blockdev or blockdev_add, its members are typed according to
* the QAPI schema, but when they come from -drive, they're all
* QString.
*/
reference = qdict_get_try_str(parent_options, bdref_key);
if (reference || qdict_haskey(options, "file.filename")) {
backing_filename[0] = '\0';
@@ -1990,6 +2093,7 @@ int bdrv_open_backing_file(BlockDriverState *bs, QDict *parent_options,
bdrv_set_backing_hd(bs, backing_hd, &local_err);
bdrv_unref(backing_hd);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto free_exit;
}
@@ -2018,6 +2122,13 @@ bdrv_open_child_bs(const char *filename, QDict *options, const char *bdref_key,
qdict_extract_subqdict(options, &image_options, bdref_key_dot);
g_free(bdref_key_dot);
/*
* Caution: while qdict_get_try_str() is fine, getting non-string
* types would require more care. When @options come from
* -blockdev or blockdev_add, its members are typed according to
* the QAPI schema, but when they come from -drive, they're all
* QString.
*/
reference = qdict_get_try_str(options, bdref_key);
if (!filename && !reference && !qdict_size(image_options)) {
if (!allow_none) {
@@ -2233,9 +2344,13 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
goto fail;
}
/* Set the BDRV_O_RDWR and BDRV_O_ALLOW_RDWR flags.
* FIXME: we're parsing the QDict to avoid having to create a
* QemuOpts just for this, but neither option is optimal. */
/*
* Set the BDRV_O_RDWR and BDRV_O_ALLOW_RDWR flags.
* Caution: getting a boolean member of @options requires care.
* When @options come from -blockdev or blockdev_add, members are
* typed according to the QAPI schema, but when they come from
* -drive, they're all QString.
*/
if (g_strcmp0(qdict_get_try_str(options, BDRV_OPT_READ_ONLY), "on") &&
!qdict_get_try_bool(options, BDRV_OPT_READ_ONLY, false)) {
flags |= (BDRV_O_RDWR | BDRV_O_ALLOW_RDWR);
@@ -2257,6 +2372,7 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
options = qdict_clone_shallow(options);
/* Find the right image format driver */
/* See cautionary note on accessing @options above */
drvname = qdict_get_try_str(options, "driver");
if (drvname) {
drv = bdrv_find_format(drvname);
@@ -2268,6 +2384,7 @@ static BlockDriverState *bdrv_open_inherit(const char *filename,
assert(drvname || !(flags & BDRV_O_PROTOCOL));
/* See cautionary note on accessing @options above */
backing = qdict_get_try_str(options, "backing");
if (backing && *backing == '\0') {
flags |= BDRV_O_NO_BACKING;
@@ -2672,6 +2789,7 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,
BlockDriver *drv;
QemuOpts *opts;
const char *value;
bool read_only;
assert(reopen_state != NULL);
assert(reopen_state->bs->drv != NULL);
@@ -2700,12 +2818,13 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,
qdict_put(reopen_state->options, "driver", qstring_from_str(value));
}
/* if we are to stay read-only, do not allow permission change
* to r/w */
if (!(reopen_state->bs->open_flags & BDRV_O_ALLOW_RDWR) &&
reopen_state->flags & BDRV_O_RDWR) {
error_setg(errp, "Node '%s' is read only",
bdrv_get_device_or_node_name(reopen_state->bs));
/* If we are to stay read-only, do not allow permission change
* to r/w. Attempting to set to r/w may fail if either BDRV_O_ALLOW_RDWR is
* not set, or if the BDS still has copy_on_read enabled */
read_only = !(reopen_state->flags & BDRV_O_RDWR);
ret = bdrv_can_set_read_only(reopen_state->bs, read_only, &local_err);
if (local_err) {
error_propagate(errp, local_err);
goto error;
}
@@ -2746,6 +2865,13 @@ int bdrv_reopen_prepare(BDRVReopenState *reopen_state, BlockReopenQueue *queue,
do {
QString *new_obj = qobject_to_qstring(entry->value);
const char *new = qstring_get_str(new_obj);
/*
* Caution: while qdict_get_try_str() is fine, getting
* non-string types would require more care. When
* bs->options come from -blockdev or blockdev_add, its
* members are typed according to the QAPI schema, but
* when they come from -drive, they're all QString.
*/
const char *old = qdict_get_try_str(reopen_state->bs->options,
entry->key);
@@ -2886,35 +3012,82 @@ void bdrv_close_all(void)
assert(QTAILQ_EMPTY(&all_bdrv_states));
}
static void change_parent_backing_link(BlockDriverState *from,
BlockDriverState *to)
static bool should_update_child(BdrvChild *c, BlockDriverState *to)
{
BdrvChild *c, *next, *to_c;
BdrvChild *to_c;
if (c->role->stay_at_node) {
return false;
}
if (c->role == &child_backing) {
/* If @from is a backing file of @to, ignore the child to avoid
* creating a loop. We only want to change the pointer of other
* parents. */
QLIST_FOREACH(to_c, &to->children, next) {
if (to_c == c) {
break;
}
}
if (to_c) {
return false;
}
}
return true;
}
void bdrv_replace_node(BlockDriverState *from, BlockDriverState *to,
Error **errp)
{
BdrvChild *c, *next;
GSList *list = NULL, *p;
uint64_t old_perm, old_shared;
uint64_t perm = 0, shared = BLK_PERM_ALL;
int ret;
assert(!atomic_read(&from->in_flight));
assert(!atomic_read(&to->in_flight));
/* Make sure that @from doesn't go away until we have successfully attached
* all of its parents to @to. */
bdrv_ref(from);
/* Put all parents into @list and calculate their cumulative permissions */
QLIST_FOREACH_SAFE(c, &from->parents, next_parent, next) {
if (c->role->stay_at_node) {
if (!should_update_child(c, to)) {
continue;
}
if (c->role == &child_backing) {
/* If @from is a backing file of @to, ignore the child to avoid
* creating a loop. We only want to change the pointer of other
* parents. */
QLIST_FOREACH(to_c, &to->children, next) {
if (to_c == c) {
break;
}
}
if (to_c) {
continue;
}
}
list = g_slist_prepend(list, c);
perm |= c->perm;
shared &= c->shared_perm;
}
/* Check whether the required permissions can be granted on @to, ignoring
* all BdrvChild in @list so that they can't block themselves. */
ret = bdrv_check_update_perm(to, perm, shared, list, errp);
if (ret < 0) {
bdrv_abort_perm_update(to);
goto out;
}
/* Now actually perform the change. We performed the permission check for
* all elements of @list at once, so set the permissions all at once at the
* very end. */
for (p = list; p != NULL; p = p->next) {
c = p->data;
bdrv_ref(to);
/* FIXME Are we sure that bdrv_replace_child() can't run into
* &error_abort because of permissions? */
bdrv_replace_child(c, to, true);
bdrv_replace_child_noperm(c, to);
bdrv_unref(from);
}
bdrv_get_cumulative_perm(to, &old_perm, &old_shared);
bdrv_set_perm(to, old_perm | perm, old_shared | shared);
out:
g_slist_free(list);
bdrv_unref(from);
}
/*
@@ -2938,16 +3111,18 @@ void bdrv_append(BlockDriverState *bs_new, BlockDriverState *bs_top,
{
Error *local_err = NULL;
assert(!atomic_read(&bs_top->in_flight));
assert(!atomic_read(&bs_new->in_flight));
bdrv_set_backing_hd(bs_new, bs_top, &local_err);
if (local_err) {
error_propagate(errp, local_err);
goto out;
}
change_parent_backing_link(bs_top, bs_new);
bdrv_replace_node(bs_top, bs_new, &local_err);
if (local_err) {
error_propagate(errp, local_err);
bdrv_set_backing_hd(bs_new, NULL, &error_abort);
goto out;
}
/* bs_new is now referenced by its new parents, we don't need the
* additional reference any more. */
@@ -2955,18 +3130,6 @@ out:
bdrv_unref(bs_new);
}
void bdrv_replace_in_backing_chain(BlockDriverState *old, BlockDriverState *new)
{
assert(!bdrv_requests_pending(old));
assert(!bdrv_requests_pending(new));
bdrv_ref(old);
change_parent_backing_link(old, new);
bdrv_unref(old);
}
static void bdrv_delete(BlockDriverState *bs)
{
assert(!bs->job);
@@ -3150,7 +3313,11 @@ int bdrv_truncate(BdrvChild *child, int64_t offset)
BlockDriver *drv = bs->drv;
int ret;
assert(child->perm & BLK_PERM_RESIZE);
/* FIXME: Some format block drivers use this function instead of implicitly
* growing their file by writing beyond its end.
* See bdrv_aligned_pwritev() for an explanation why we currently
* cannot assert this permission in that case. */
// assert(child->perm & BLK_PERM_RESIZE);
if (!drv)
return -ENOMEDIUM;
@@ -3227,11 +3394,6 @@ void bdrv_get_geometry(BlockDriverState *bs, uint64_t *nb_sectors_ptr)
*nb_sectors_ptr = nb_sectors < 0 ? 0 : nb_sectors;
}
bool bdrv_is_read_only(BlockDriverState *bs)
{
return bs->read_only;
}
bool bdrv_is_sg(BlockDriverState *bs)
{
return bs->sg;
@@ -4033,8 +4195,8 @@ bool bdrv_op_blocker_is_empty(BlockDriverState *bs)
void bdrv_img_create(const char *filename, const char *fmt,
const char *base_filename, const char *base_fmt,
char *options, uint64_t img_size, int flags,
Error **errp, bool quiet)
char *options, uint64_t img_size, int flags, bool quiet,
Error **errp)
{
QemuOptsList *create_opts = NULL;
QemuOpts *opts = NULL;
@@ -4200,6 +4362,11 @@ AioContext *bdrv_get_aio_context(BlockDriverState *bs)
return bs->aio_context;
}
void bdrv_coroutine_enter(BlockDriverState *bs, Coroutine *co)
{
aio_co_enter(bdrv_get_aio_context(bs), co);
}
static void bdrv_do_remove_aio_context_notifier(BdrvAioNotifier *ban)
{
QLIST_REMOVE(ban, list);
@@ -4272,8 +4439,16 @@ void bdrv_attach_aio_context(BlockDriverState *bs,
void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
{
AioContext *ctx = bdrv_get_aio_context(bs);
aio_disable_external(ctx);
bdrv_parent_drained_begin(bs);
bdrv_drain(bs); /* ensure there are no in-flight requests */
while (aio_poll(ctx, false)) {
/* wait for all bottom halves to execute */
}
bdrv_detach_aio_context(bs);
/* This function executes in the old AioContext so acquire the new one in
@@ -4281,6 +4456,8 @@ void bdrv_set_aio_context(BlockDriverState *bs, AioContext *new_context)
*/
aio_context_acquire(new_context);
bdrv_attach_aio_context(bs, new_context);
bdrv_parent_drained_end(bs);
aio_enable_external(ctx);
aio_context_release(new_context);
}

View File

@@ -19,7 +19,7 @@ block-obj-$(CONFIG_LIBNFS) += nfs.o
block-obj-$(CONFIG_CURL) += curl.o
block-obj-$(CONFIG_RBD) += rbd.o
block-obj-$(CONFIG_GLUSTERFS) += gluster.o
block-obj-$(CONFIG_ARCHIPELAGO) += archipelago.o
block-obj-$(CONFIG_VXHS) += vxhs.o
block-obj-$(CONFIG_LIBSSH2) += ssh.o
block-obj-y += accounting.o dirty-bitmap.o
block-obj-y += write-threshold.o
@@ -39,9 +39,9 @@ rbd.o-cflags := $(RBD_CFLAGS)
rbd.o-libs := $(RBD_LIBS)
gluster.o-cflags := $(GLUSTERFS_CFLAGS)
gluster.o-libs := $(GLUSTERFS_LIBS)
vxhs.o-libs := $(VXHS_LIBS)
ssh.o-cflags := $(LIBSSH2_CFLAGS)
ssh.o-libs := $(LIBSSH2_LIBS)
archipelago.o-libs := $(ARCHIPELAGO_LIBS)
block-obj-$(if $(CONFIG_BZIP2),m,n) += dmg-bz2.o
dmg-bz2.o-libs := $(BZIP2_LIBS)
qcow.o-libs := -lz

File diff suppressed because it is too large Load Diff

View File

@@ -24,6 +24,7 @@
#include "qemu/cutils.h"
#include "sysemu/block-backend.h"
#include "qemu/bitmap.h"
#include "qemu/error-report.h"
#define BACKUP_CLUSTER_SIZE_DEFAULT (1 << 16)
#define SLICE_TIME 100000000ULL /* ns */
@@ -467,13 +468,14 @@ static void coroutine_fn backup_run(void *opaque)
/* Both FULL and TOP SYNC_MODE's require copying.. */
for (; start < end; start++) {
bool error_is_read;
int alloced = 0;
if (yield_and_check(job)) {
break;
}
if (job->sync_mode == MIRROR_SYNC_MODE_TOP) {
int i, n;
int alloced = 0;
/* Check to see if these blocks are already in the
* backing file. */
@@ -491,7 +493,7 @@ static void coroutine_fn backup_run(void *opaque)
sectors_per_cluster - i, &n);
i += n;
if (alloced == 1 || n == 0) {
if (alloced || n == 0) {
break;
}
}
@@ -503,8 +505,13 @@ static void coroutine_fn backup_run(void *opaque)
}
}
/* FULL sync mode we copy the whole drive. */
ret = backup_do_cow(job, start * sectors_per_cluster,
sectors_per_cluster, &error_is_read, false);
if (alloced < 0) {
ret = alloced;
} else {
ret = backup_do_cow(job, start * sectors_per_cluster,
sectors_per_cluster, &error_is_read,
false);
}
if (ret < 0) {
/* Depending on error action, fail now or retry cluster */
BlockErrorAction action =
@@ -648,7 +655,16 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
* backup cluster size is smaller than the target cluster size. Even for
* targets with a backing file, try to avoid COW if possible. */
ret = bdrv_get_info(target, &bdi);
if (ret < 0 && !target->backing) {
if (ret == -ENOTSUP && !target->backing) {
/* Cluster size is not defined */
error_report("WARNING: The target block device doesn't provide "
"information about the block size and it doesn't have a "
"backing file. The default block size of %u bytes is "
"used. If the actual block size of the target exceeds "
"this default, the backup may be unusable",
BACKUP_CLUSTER_SIZE_DEFAULT);
job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT;
} else if (ret < 0 && !target->backing) {
error_setg_errno(errp, -ret,
"Couldn't determine the cluster size of the target image, "
"which has no backing file");

View File

@@ -61,10 +61,13 @@ struct BlockBackend {
uint64_t perm;
uint64_t shared_perm;
bool disable_perm;
bool allow_write_beyond_eof;
NotifierList remove_bs_notifiers, insert_bs_notifiers;
int quiesce_counter;
};
typedef struct BlockBackendAIOCB {
@@ -213,7 +216,12 @@ BlockBackend *blk_new_open(const char *filename, const char *reference,
}
blk->root = bdrv_root_attach_child(bs, "root", &child_root,
perm, BLK_PERM_ALL, blk, &error_abort);
perm, BLK_PERM_ALL, blk, errp);
if (!blk->root) {
bdrv_unref(bs);
blk_unref(blk);
return NULL;
}
return blk;
}
@@ -223,6 +231,9 @@ static void blk_delete(BlockBackend *blk)
assert(!blk->refcnt);
assert(!blk->name);
assert(!blk->dev);
if (blk->public.throttle_state) {
blk_io_limits_disable(blk);
}
if (blk->root) {
blk_remove_bs(blk);
}
@@ -571,7 +582,7 @@ int blk_set_perm(BlockBackend *blk, uint64_t perm, uint64_t shared_perm,
{
int ret;
if (blk->root) {
if (blk->root && !blk->disable_perm) {
ret = bdrv_child_try_set_perm(blk->root, perm, shared_perm, errp);
if (ret < 0) {
return ret;
@@ -590,15 +601,52 @@ void blk_get_perm(BlockBackend *blk, uint64_t *perm, uint64_t *shared_perm)
*shared_perm = blk->shared_perm;
}
/*
* Notifies the user of all BlockBackends that migration has completed. qdev
* devices can tighten their permissions in response (specifically revoke
* shared write permissions that we needed for storage migration).
*
* If an error is returned, the VM cannot be allowed to be resumed.
*/
void blk_resume_after_migration(Error **errp)
{
BlockBackend *blk;
Error *local_err = NULL;
for (blk = blk_all_next(NULL); blk; blk = blk_all_next(blk)) {
if (!blk->disable_perm) {
continue;
}
blk->disable_perm = false;
blk_set_perm(blk, blk->perm, blk->shared_perm, &local_err);
if (local_err) {
error_propagate(errp, local_err);
blk->disable_perm = true;
return;
}
}
}
static int blk_do_attach_dev(BlockBackend *blk, void *dev)
{
if (blk->dev) {
return -EBUSY;
}
/* While migration is still incoming, we don't need to apply the
* permissions of guest device BlockBackends. We might still have a block
* job or NBD server writing to the image for storage migration. */
if (runstate_check(RUN_STATE_INMIGRATE)) {
blk->disable_perm = true;
}
blk_ref(blk);
blk->dev = dev;
blk->legacy_dev = false;
blk_iostatus_reset(blk);
return 0;
}
@@ -694,12 +742,17 @@ void blk_set_dev_ops(BlockBackend *blk, const BlockDevOps *ops,
void *opaque)
{
/* All drivers that use blk_set_dev_ops() are qdevified and we want to keep
* it that way, so we can assume blk->dev is a DeviceState if blk->dev_ops
* is set. */
* it that way, so we can assume blk->dev, if present, is a DeviceState if
* blk->dev_ops is set. Non-device users may use dev_ops without device. */
assert(!blk->legacy_dev);
blk->dev_ops = ops;
blk->dev_opaque = opaque;
/* Are we currently quiesced? Should we enforce this right now? */
if (blk->quiesce_counter && ops->drained_begin) {
ops->drained_begin(opaque);
}
}
/*
@@ -995,7 +1048,7 @@ static int blk_prw(BlockBackend *blk, int64_t offset, uint8_t *buf,
co_entry(&rwco);
} else {
Coroutine *co = qemu_coroutine_create(co_entry, &rwco);
qemu_coroutine_enter(co);
bdrv_coroutine_enter(blk_bs(blk), co);
BDRV_POLL_WHILE(blk_bs(blk), rwco.ret == NOT_DONE);
}
@@ -1102,7 +1155,7 @@ static BlockAIOCB *blk_aio_prwv(BlockBackend *blk, int64_t offset, int bytes,
acb->has_returned = false;
co = qemu_coroutine_create(co_entry, acb);
qemu_coroutine_enter(co);
bdrv_coroutine_enter(blk_bs(blk), co);
acb->has_returned = true;
if (acb->rwco.ret != NOT_DONE) {
@@ -1865,6 +1918,12 @@ static void blk_root_drained_begin(BdrvChild *child)
{
BlockBackend *blk = child->opaque;
if (++blk->quiesce_counter == 1) {
if (blk->dev_ops && blk->dev_ops->drained_begin) {
blk->dev_ops->drained_begin(blk->dev_opaque);
}
}
/* Note that blk->root may not be accessible here yet if we are just
* attaching to a BlockDriverState that is drained. Use child instead. */
@@ -1876,7 +1935,14 @@ static void blk_root_drained_begin(BdrvChild *child)
static void blk_root_drained_end(BdrvChild *child)
{
BlockBackend *blk = child->opaque;
assert(blk->quiesce_counter);
assert(blk->public.io_limits_disabled);
--blk->public.io_limits_disabled;
if (--blk->quiesce_counter == 0) {
if (blk->dev_ops && blk->dev_ops->drained_end) {
blk->dev_ops->drained_end(blk->dev_opaque);
}
}
}

View File

@@ -110,7 +110,10 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
return -EINVAL;
}
bs->read_only = true; /* no write support yet */
ret = bdrv_set_read_only(bs, true, errp); /* no write support yet */
if (ret < 0) {
return ret;
}
ret = bdrv_pread(bs->file, 0, &bochs, sizeof(bochs));
if (ret < 0) {

View File

@@ -72,7 +72,10 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
return -EINVAL;
}
bs->read_only = true;
ret = bdrv_set_read_only(bs, true, errp);
if (ret < 0) {
return ret;
}
/* read header */
ret = bdrv_pread(bs->file, 128, &s->block_size, 4);

View File

@@ -13,6 +13,7 @@
*/
#include "qemu/osdep.h"
#include "qemu/cutils.h"
#include "trace.h"
#include "block/block_int.h"
#include "block/blockjob_int.h"
@@ -232,6 +233,23 @@ static int coroutine_fn bdrv_commit_top_preadv(BlockDriverState *bs,
return bdrv_co_preadv(bs->backing, offset, bytes, qiov, flags);
}
static int64_t coroutine_fn bdrv_commit_top_get_block_status(
BlockDriverState *bs, int64_t sector_num, int nb_sectors, int *pnum,
BlockDriverState **file)
{
*pnum = nb_sectors;
*file = bs->backing->bs;
return BDRV_BLOCK_RAW | BDRV_BLOCK_OFFSET_VALID | BDRV_BLOCK_DATA |
(sector_num << BDRV_SECTOR_BITS);
}
static void bdrv_commit_top_refresh_filename(BlockDriverState *bs, QDict *opts)
{
bdrv_refresh_filename(bs->backing->bs);
pstrcpy(bs->exact_filename, sizeof(bs->exact_filename),
bs->backing->bs->filename);
}
static void bdrv_commit_top_close(BlockDriverState *bs)
{
}
@@ -248,10 +266,12 @@ static void bdrv_commit_top_child_perm(BlockDriverState *bs, BdrvChild *c,
/* Dummy node that provides consistent read to its users without requiring it
* from its backing file and that allows writes on the backing file chain. */
static BlockDriver bdrv_commit_top = {
.format_name = "commit_top",
.bdrv_co_preadv = bdrv_commit_top_preadv,
.bdrv_close = bdrv_commit_top_close,
.bdrv_child_perm = bdrv_commit_top_child_perm,
.format_name = "commit_top",
.bdrv_co_preadv = bdrv_commit_top_preadv,
.bdrv_co_get_block_status = bdrv_commit_top_get_block_status,
.bdrv_refresh_filename = bdrv_commit_top_refresh_filename,
.bdrv_close = bdrv_commit_top_close,
.bdrv_child_perm = bdrv_commit_top_child_perm,
};
void commit_start(const char *job_id, BlockDriverState *bs,
@@ -315,9 +335,23 @@ void commit_start(const char *job_id, BlockDriverState *bs,
if (commit_top_bs == NULL) {
goto fail;
}
commit_top_bs->total_sectors = top->total_sectors;
bdrv_set_aio_context(commit_top_bs, bdrv_get_aio_context(top));
bdrv_set_backing_hd(commit_top_bs, top, &error_abort);
bdrv_set_backing_hd(overlay_bs, commit_top_bs, &error_abort);
bdrv_set_backing_hd(commit_top_bs, top, &local_err);
if (local_err) {
bdrv_unref(commit_top_bs);
commit_top_bs = NULL;
error_propagate(errp, local_err);
goto fail;
}
bdrv_set_backing_hd(overlay_bs, commit_top_bs, &local_err);
if (local_err) {
bdrv_unref(commit_top_bs);
commit_top_bs = NULL;
error_propagate(errp, local_err);
goto fail;
}
s->commit_top_bs = commit_top_bs;
bdrv_unref(commit_top_bs);
@@ -364,7 +398,7 @@ void commit_start(const char *job_id, BlockDriverState *bs,
/* Required permissions are already taken with block_job_add_bdrv() */
s->top = blk_new(0, BLK_PERM_ALL);
blk_insert_bs(s->top, top, errp);
ret = blk_insert_bs(s->top, top, errp);
if (ret < 0) {
goto fail;
}
@@ -450,6 +484,7 @@ int bdrv_commit(BlockDriverState *bs)
error_report_err(local_err);
goto ro_cleanup;
}
bdrv_set_aio_context(commit_top_bs, bdrv_get_aio_context(backing_file_bs));
bdrv_set_backing_hd(commit_top_bs, backing_file_bs, &error_abort);
bdrv_set_backing_hd(bs, commit_top_bs, &error_abort);

View File

@@ -56,11 +56,11 @@ static int block_crypto_probe_generic(QCryptoBlockFormat format,
static ssize_t block_crypto_read_func(QCryptoBlock *block,
void *opaque,
size_t offset,
uint8_t *buf,
size_t buflen,
Error **errp,
void *opaque)
Error **errp)
{
BlockDriverState *bs = opaque;
ssize_t ret;
@@ -83,11 +83,11 @@ struct BlockCryptoCreateData {
static ssize_t block_crypto_write_func(QCryptoBlock *block,
void *opaque,
size_t offset,
const uint8_t *buf,
size_t buflen,
Error **errp,
void *opaque)
Error **errp)
{
struct BlockCryptoCreateData *data = opaque;
ssize_t ret;
@@ -102,9 +102,9 @@ static ssize_t block_crypto_write_func(QCryptoBlock *block,
static ssize_t block_crypto_init_func(QCryptoBlock *block,
void *opaque,
size_t headerlen,
Error **errp,
void *opaque)
Error **errp)
{
struct BlockCryptoCreateData *data = opaque;
int ret;

View File

@@ -377,7 +377,7 @@ static void curl_multi_check_completion(BDRVCURLState *s)
}
qemu_mutex_unlock(&s->mutex);
acb->common.cb(acb->common.opaque, -EPROTO);
acb->common.cb(acb->common.opaque, -EIO);
qemu_mutex_lock(&s->mutex);
qemu_aio_unref(acb);
state->acb[i] = NULL;
@@ -659,6 +659,7 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
const char *cookie;
double d;
const char *secretid;
const char *protocol_delimiter;
static int inited = 0;
@@ -700,6 +701,15 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
goto out_noclean;
}
if (!strstart(file, bs->drv->protocol_name, &protocol_delimiter) ||
!strstart(protocol_delimiter, "://", NULL))
{
error_setg(errp, "%s curl driver cannot handle the URL '%s' (does not "
"start with '%s://')", bs->drv->protocol_name, file,
bs->drv->protocol_name);
goto out_noclean;
}
s->username = g_strdup(qemu_opt_get(opts, CURL_BLOCK_OPT_USERNAME));
secretid = qemu_opt_get(opts, CURL_BLOCK_OPT_PASSWORD_SECRET);

View File

@@ -419,8 +419,12 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
return -EINVAL;
}
ret = bdrv_set_read_only(bs, true, errp);
if (ret < 0) {
return ret;
}
block_module_load_one("dmg-bz2");
bs->read_only = true;
s->n_chunks = 0;
s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;

View File

@@ -144,6 +144,7 @@ typedef struct BDRVRawState {
bool has_write_zeroes:1;
bool discard_zeroes:1;
bool use_linux_aio:1;
bool page_cache_inconsistent:1;
bool has_fallocate;
bool needs_alignment;
} BDRVRawState;
@@ -219,28 +220,28 @@ static int probe_logical_blocksize(int fd, unsigned int *sector_size_p)
{
unsigned int sector_size;
bool success = false;
int i;
errno = ENOTSUP;
/* Try a few ioctls to get the right size */
static const unsigned long ioctl_list[] = {
#ifdef BLKSSZGET
if (ioctl(fd, BLKSSZGET, &sector_size) >= 0) {
*sector_size_p = sector_size;
success = true;
}
BLKSSZGET,
#endif
#ifdef DKIOCGETBLOCKSIZE
if (ioctl(fd, DKIOCGETBLOCKSIZE, &sector_size) >= 0) {
*sector_size_p = sector_size;
success = true;
}
DKIOCGETBLOCKSIZE,
#endif
#ifdef DIOCGSECTORSIZE
if (ioctl(fd, DIOCGSECTORSIZE, &sector_size) >= 0) {
*sector_size_p = sector_size;
success = true;
}
DIOCGSECTORSIZE,
#endif
};
/* Try a few ioctls to get the right size */
for (i = 0; i < (int)ARRAY_SIZE(ioctl_list); i++) {
if (ioctl(fd, ioctl_list[i], &sector_size) >= 0) {
*sector_size_p = sector_size;
success = true;
}
}
return success ? 0 : -errno;
}
@@ -668,6 +669,51 @@ static int hdev_get_max_transfer_length(BlockDriverState *bs, int fd)
#endif
}
static int hdev_get_max_segments(const struct stat *st)
{
#ifdef CONFIG_LINUX
char buf[32];
const char *end;
char *sysfspath;
int ret;
int fd = -1;
long max_segments;
sysfspath = g_strdup_printf("/sys/dev/block/%u:%u/queue/max_segments",
major(st->st_rdev), minor(st->st_rdev));
fd = open(sysfspath, O_RDONLY);
if (fd == -1) {
ret = -errno;
goto out;
}
do {
ret = read(fd, buf, sizeof(buf) - 1);
} while (ret == -1 && errno == EINTR);
if (ret < 0) {
ret = -errno;
goto out;
} else if (ret == 0) {
ret = -EIO;
goto out;
}
buf[ret] = 0;
/* The file is ended with '\n', pass 'end' to accept that. */
ret = qemu_strtol(buf, &end, 10, &max_segments);
if (ret == 0 && end && *end == '\n') {
ret = max_segments;
}
out:
if (fd != -1) {
close(fd);
}
g_free(sysfspath);
return ret;
#else
return -ENOTSUP;
#endif
}
static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
{
BDRVRawState *s = bs->opaque;
@@ -679,6 +725,11 @@ static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
if (ret > 0 && ret <= BDRV_REQUEST_MAX_BYTES) {
bs->bl.max_transfer = pow2floor(ret);
}
ret = hdev_get_max_segments(&st);
if (ret > 0) {
bs->bl.max_transfer = MIN(bs->bl.max_transfer,
ret * getpagesize());
}
}
}
@@ -774,10 +825,31 @@ static ssize_t handle_aiocb_ioctl(RawPosixAIOData *aiocb)
static ssize_t handle_aiocb_flush(RawPosixAIOData *aiocb)
{
BDRVRawState *s = aiocb->bs->opaque;
int ret;
if (s->page_cache_inconsistent) {
return -EIO;
}
ret = qemu_fdatasync(aiocb->aio_fildes);
if (ret == -1) {
/* There is no clear definition of the semantics of a failing fsync(),
* so we may have to assume the worst. The sad truth is that this
* assumption is correct for Linux. Some pages are now probably marked
* clean in the page cache even though they are inconsistent with the
* on-disk contents. The next fdatasync() call would succeed, but no
* further writeback attempt will be made. We can't get back to a state
* in which we know what is on disk (we would have to rewrite
* everything that was touched since the last fdatasync() at least), so
* make bdrv_flush() fail permanently. Given that the behaviour isn't
* really defined, I have little hope that other OSes are doing better.
*
* Obviously, this doesn't affect O_DIRECT, which bypasses the page
* cache. */
if ((s->open_flags & O_DIRECT) == 0) {
s->page_cache_inconsistent = true;
}
return -errno;
}
return 0;
@@ -2121,6 +2193,12 @@ static int hdev_open(BlockDriverState *bs, QDict *options, int flags,
int ret;
#if defined(__APPLE__) && defined(__MACH__)
/*
* Caution: while qdict_get_str() is fine, getting non-string types
* would require more care. When @options come from -blockdev or
* blockdev_add, its members are typed according to the QAPI
* schema, but when they come from -drive, they're all QString.
*/
const char *filename = qdict_get_str(options, "filename");
char bsd_path[MAXPATHLEN] = "";
bool error_occurred = false;

View File

@@ -12,6 +12,7 @@
#include "block/block_int.h"
#include "qapi/error.h"
#include "qapi/qmp/qerror.h"
#include "qapi/util.h"
#include "qemu/uri.h"
#include "qemu/error-report.h"
#include "qemu/cutils.h"
@@ -151,7 +152,7 @@ static QemuOptsList runtime_type_opts = {
{
.name = GLUSTER_OPT_TYPE,
.type = QEMU_OPT_STRING,
.help = "tcp|unix",
.help = "inet|unix",
},
{ /* end of list */ }
},
@@ -170,14 +171,14 @@ static QemuOptsList runtime_unix_opts = {
},
};
static QemuOptsList runtime_tcp_opts = {
.name = "gluster_tcp",
.head = QTAILQ_HEAD_INITIALIZER(runtime_tcp_opts.head),
static QemuOptsList runtime_inet_opts = {
.name = "gluster_inet",
.head = QTAILQ_HEAD_INITIALIZER(runtime_inet_opts.head),
.desc = {
{
.name = GLUSTER_OPT_TYPE,
.type = QEMU_OPT_STRING,
.help = "tcp|unix",
.help = "inet|unix",
},
{
.name = GLUSTER_OPT_HOST,
@@ -320,7 +321,7 @@ static int parse_volume_options(BlockdevOptionsGluster *gconf, char *path)
static int qemu_gluster_parse_uri(BlockdevOptionsGluster *gconf,
const char *filename)
{
GlusterServer *gsconf;
SocketAddressFlat *gsconf;
URI *uri;
QueryParams *qp = NULL;
bool is_unix = false;
@@ -331,19 +332,19 @@ static int qemu_gluster_parse_uri(BlockdevOptionsGluster *gconf,
return -EINVAL;
}
gconf->server = g_new0(GlusterServerList, 1);
gconf->server->value = gsconf = g_new0(GlusterServer, 1);
gconf->server = g_new0(SocketAddressFlatList, 1);
gconf->server->value = gsconf = g_new0(SocketAddressFlat, 1);
/* transport */
if (!uri->scheme || !strcmp(uri->scheme, "gluster")) {
gsconf->type = GLUSTER_TRANSPORT_TCP;
gsconf->type = SOCKET_ADDRESS_FLAT_TYPE_INET;
} else if (!strcmp(uri->scheme, "gluster+tcp")) {
gsconf->type = GLUSTER_TRANSPORT_TCP;
gsconf->type = SOCKET_ADDRESS_FLAT_TYPE_INET;
} else if (!strcmp(uri->scheme, "gluster+unix")) {
gsconf->type = GLUSTER_TRANSPORT_UNIX;
gsconf->type = SOCKET_ADDRESS_FLAT_TYPE_UNIX;
is_unix = true;
} else if (!strcmp(uri->scheme, "gluster+rdma")) {
gsconf->type = GLUSTER_TRANSPORT_TCP;
gsconf->type = SOCKET_ADDRESS_FLAT_TYPE_INET;
error_report("Warning: rdma feature is not supported, falling "
"back to tcp");
} else {
@@ -373,11 +374,11 @@ static int qemu_gluster_parse_uri(BlockdevOptionsGluster *gconf,
}
gsconf->u.q_unix.path = g_strdup(qp->p[0].value);
} else {
gsconf->u.tcp.host = g_strdup(uri->server ? uri->server : "localhost");
gsconf->u.inet.host = g_strdup(uri->server ? uri->server : "localhost");
if (uri->port) {
gsconf->u.tcp.port = g_strdup_printf("%d", uri->port);
gsconf->u.inet.port = g_strdup_printf("%d", uri->port);
} else {
gsconf->u.tcp.port = g_strdup_printf("%d", GLUSTER_DEFAULT_PORT);
gsconf->u.inet.port = g_strdup_printf("%d", GLUSTER_DEFAULT_PORT);
}
}
@@ -395,7 +396,7 @@ static struct glfs *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
struct glfs *glfs;
int ret;
int old_errno;
GlusterServerList *server;
SocketAddressFlatList *server;
unsigned long long port;
glfs = glfs_find_preopened(gconf->volume);
@@ -411,22 +412,27 @@ static struct glfs *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
glfs_set_preopened(gconf->volume, glfs);
for (server = gconf->server; server; server = server->next) {
if (server->value->type == GLUSTER_TRANSPORT_UNIX) {
ret = glfs_set_volfile_server(glfs,
GlusterTransport_lookup[server->value->type],
switch (server->value->type) {
case SOCKET_ADDRESS_FLAT_TYPE_UNIX:
ret = glfs_set_volfile_server(glfs, "unix",
server->value->u.q_unix.path, 0);
} else {
if (parse_uint_full(server->value->u.tcp.port, &port, 10) < 0 ||
break;
case SOCKET_ADDRESS_FLAT_TYPE_INET:
if (parse_uint_full(server->value->u.inet.port, &port, 10) < 0 ||
port > 65535) {
error_setg(errp, "'%s' is not a valid port number",
server->value->u.tcp.port);
server->value->u.inet.port);
errno = EINVAL;
goto out;
}
ret = glfs_set_volfile_server(glfs,
GlusterTransport_lookup[server->value->type],
server->value->u.tcp.host,
ret = glfs_set_volfile_server(glfs, "tcp",
server->value->u.inet.host,
(int)port);
break;
case SOCKET_ADDRESS_FLAT_TYPE_VSOCK:
case SOCKET_ADDRESS_FLAT_TYPE_FD:
default:
abort();
}
if (ret < 0) {
@@ -444,13 +450,13 @@ static struct glfs *qemu_gluster_glfs_init(BlockdevOptionsGluster *gconf,
error_setg(errp, "Gluster connection for volume %s, path %s failed"
" to connect", gconf->volume, gconf->path);
for (server = gconf->server; server; server = server->next) {
if (server->value->type == GLUSTER_TRANSPORT_UNIX) {
if (server->value->type == SOCKET_ADDRESS_FLAT_TYPE_UNIX) {
error_append_hint(errp, "hint: failed on socket %s ",
server->value->u.q_unix.path);
} else {
error_append_hint(errp, "hint: failed on host %s and port %s ",
server->value->u.tcp.host,
server->value->u.tcp.port);
server->value->u.inet.host,
server->value->u.inet.port);
}
}
@@ -474,23 +480,6 @@ out:
return NULL;
}
static int qapi_enum_parse(const char *opt)
{
int i;
if (!opt) {
return GLUSTER_TRANSPORT__MAX;
}
for (i = 0; i < GLUSTER_TRANSPORT__MAX; i++) {
if (!strcmp(opt, GlusterTransport_lookup[i])) {
return i;
}
}
return i;
}
/*
* Convert the json formatted command line into qapi.
*/
@@ -498,14 +487,14 @@ static int qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
QDict *options, Error **errp)
{
QemuOpts *opts;
GlusterServer *gsconf;
GlusterServerList *curr = NULL;
SocketAddressFlat *gsconf = NULL;
SocketAddressFlatList *curr = NULL;
QDict *backing_options = NULL;
Error *local_err = NULL;
char *str = NULL;
const char *ptr;
size_t num_servers;
int i;
int i, type;
/* create opts info from runtime_json_opts list */
opts = qemu_opts_create(&runtime_json_opts, NULL, 0, &error_abort);
@@ -547,25 +536,32 @@ static int qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
}
ptr = qemu_opt_get(opts, GLUSTER_OPT_TYPE);
gsconf = g_new0(GlusterServer, 1);
gsconf->type = qapi_enum_parse(ptr);
if (!ptr) {
error_setg(&local_err, QERR_MISSING_PARAMETER, GLUSTER_OPT_TYPE);
error_append_hint(&local_err, GERR_INDEX_HINT, i);
goto out;
}
if (gsconf->type == GLUSTER_TRANSPORT__MAX) {
error_setg(&local_err, QERR_INVALID_PARAMETER_VALUE,
GLUSTER_OPT_TYPE, "tcp or unix");
gsconf = g_new0(SocketAddressFlat, 1);
if (!strcmp(ptr, "tcp")) {
ptr = "inet"; /* accept legacy "tcp" */
}
type = qapi_enum_parse(SocketAddressFlatType_lookup, ptr,
SOCKET_ADDRESS_FLAT_TYPE__MAX, -1, NULL);
if (type != SOCKET_ADDRESS_FLAT_TYPE_INET
&& type != SOCKET_ADDRESS_FLAT_TYPE_UNIX) {
error_setg(&local_err,
"Parameter '%s' may be 'inet' or 'unix'",
GLUSTER_OPT_TYPE);
error_append_hint(&local_err, GERR_INDEX_HINT, i);
goto out;
}
gsconf->type = type;
qemu_opts_del(opts);
if (gsconf->type == GLUSTER_TRANSPORT_TCP) {
/* create opts info from runtime_tcp_opts list */
opts = qemu_opts_create(&runtime_tcp_opts, NULL, 0, &error_abort);
if (gsconf->type == SOCKET_ADDRESS_FLAT_TYPE_INET) {
/* create opts info from runtime_inet_opts list */
opts = qemu_opts_create(&runtime_inet_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, backing_options, &local_err);
if (local_err) {
goto out;
@@ -578,7 +574,7 @@ static int qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
error_append_hint(&local_err, GERR_INDEX_HINT, i);
goto out;
}
gsconf->u.tcp.host = g_strdup(ptr);
gsconf->u.inet.host = g_strdup(ptr);
ptr = qemu_opt_get(opts, GLUSTER_OPT_PORT);
if (!ptr) {
error_setg(&local_err, QERR_MISSING_PARAMETER,
@@ -586,28 +582,28 @@ static int qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
error_append_hint(&local_err, GERR_INDEX_HINT, i);
goto out;
}
gsconf->u.tcp.port = g_strdup(ptr);
gsconf->u.inet.port = g_strdup(ptr);
/* defend for unsupported fields in InetSocketAddress,
* i.e. @ipv4, @ipv6 and @to
*/
ptr = qemu_opt_get(opts, GLUSTER_OPT_TO);
if (ptr) {
gsconf->u.tcp.has_to = true;
gsconf->u.inet.has_to = true;
}
ptr = qemu_opt_get(opts, GLUSTER_OPT_IPV4);
if (ptr) {
gsconf->u.tcp.has_ipv4 = true;
gsconf->u.inet.has_ipv4 = true;
}
ptr = qemu_opt_get(opts, GLUSTER_OPT_IPV6);
if (ptr) {
gsconf->u.tcp.has_ipv6 = true;
gsconf->u.inet.has_ipv6 = true;
}
if (gsconf->u.tcp.has_to) {
if (gsconf->u.inet.has_to) {
error_setg(&local_err, "Parameter 'to' not supported");
goto out;
}
if (gsconf->u.tcp.has_ipv4 || gsconf->u.tcp.has_ipv6) {
if (gsconf->u.inet.has_ipv4 || gsconf->u.inet.has_ipv6) {
error_setg(&local_err, "Parameters 'ipv4/ipv6' not supported");
goto out;
}
@@ -632,16 +628,18 @@ static int qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
}
if (gconf->server == NULL) {
gconf->server = g_new0(GlusterServerList, 1);
gconf->server = g_new0(SocketAddressFlatList, 1);
gconf->server->value = gsconf;
curr = gconf->server;
} else {
curr->next = g_new0(GlusterServerList, 1);
curr->next = g_new0(SocketAddressFlatList, 1);
curr->next->value = gsconf;
curr = curr->next;
}
gsconf = NULL;
qdict_del(backing_options, str);
QDECREF(backing_options);
backing_options = NULL;
g_free(str);
str = NULL;
}
@@ -650,11 +648,10 @@ static int qemu_gluster_parse_json(BlockdevOptionsGluster *gconf,
out:
error_propagate(errp, local_err);
qapi_free_SocketAddressFlat(gsconf);
qemu_opts_del(opts);
if (str) {
qdict_del(backing_options, str);
g_free(str);
}
g_free(str);
QDECREF(backing_options);
errno = EINVAL;
return -errno;
}
@@ -683,7 +680,7 @@ static struct glfs *qemu_gluster_init(BlockdevOptionsGluster *gconf,
"file.volume=testvol,file.path=/path/a.qcow2"
"[,file.debug=9]"
"[,file.logfile=/path/filename.log],"
"file.server.0.type=tcp,"
"file.server.0.type=inet,"
"file.server.0.host=1.2.3.4,"
"file.server.0.port=24007,"
"file.server.1.transport=unix,"

View File

@@ -44,7 +44,7 @@ static void coroutine_fn bdrv_co_do_rw(void *opaque);
static int coroutine_fn bdrv_co_do_pwrite_zeroes(BlockDriverState *bs,
int64_t offset, int count, BdrvRequestFlags flags);
static void bdrv_parent_drained_begin(BlockDriverState *bs)
void bdrv_parent_drained_begin(BlockDriverState *bs)
{
BdrvChild *c;
@@ -55,7 +55,7 @@ static void bdrv_parent_drained_begin(BlockDriverState *bs)
}
}
static void bdrv_parent_drained_end(BlockDriverState *bs)
void bdrv_parent_drained_end(BlockDriverState *bs)
{
BdrvChild *c;
@@ -158,7 +158,7 @@ bool bdrv_requests_pending(BlockDriverState *bs)
static bool bdrv_drain_recurse(BlockDriverState *bs)
{
BdrvChild *child;
BdrvChild *child, *tmp;
bool waited;
waited = BDRV_POLL_WHILE(bs, atomic_read(&bs->in_flight) > 0);
@@ -167,8 +167,25 @@ static bool bdrv_drain_recurse(BlockDriverState *bs)
bs->drv->bdrv_drain(bs);
}
QLIST_FOREACH(child, &bs->children, next) {
waited |= bdrv_drain_recurse(child->bs);
QLIST_FOREACH_SAFE(child, &bs->children, next, tmp) {
BlockDriverState *bs = child->bs;
bool in_main_loop =
qemu_get_current_aio_context() == qemu_get_aio_context();
assert(bs->refcnt > 0);
if (in_main_loop) {
/* In case the recursive bdrv_drain_recurse processes a
* block_job_defer_to_main_loop BH and modifies the graph,
* let's hold a reference to bs until we are done.
*
* IOThread doesn't have such a BH, and it is not safe to call
* bdrv_unref without BQL, so skip doing it there.
*/
bdrv_ref(bs);
}
waited |= bdrv_drain_recurse(bs);
if (in_main_loop) {
bdrv_unref(bs);
}
}
return waited;
@@ -616,7 +633,7 @@ static int bdrv_prwv_co(BdrvChild *child, int64_t offset,
bdrv_rw_co_entry(&rwco);
} else {
co = qemu_coroutine_create(bdrv_rw_co_entry, &rwco);
qemu_coroutine_enter(co);
bdrv_coroutine_enter(child->bs, co);
BDRV_POLL_WHILE(child->bs, rwco.ret == NOT_DONE);
}
return rwco.ret;
@@ -945,7 +962,14 @@ static int coroutine_fn bdrv_co_do_copy_on_readv(BdrvChild *child,
size_t skip_bytes;
int ret;
assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE));
/* FIXME We cannot require callers to have write permissions when all they
* are doing is a read request. If we did things right, write permissions
* would be obtained anyway, but internally by the copy-on-read code. As
* long as it is implemented here rather than in a separat filter driver,
* the copy-on-read code doesn't have its own BdrvChild, however, for which
* it could request permissions. Therefore we have to bypass the permission
* system for the moment. */
// assert(child->perm & (BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE));
/* Cover entire cluster so no additional backing file I/O is required when
* allocating cluster in the image file.
@@ -1338,8 +1362,16 @@ static int coroutine_fn bdrv_aligned_pwritev(BdrvChild *child,
assert(!waited || !req->serialising);
assert(req->overlap_offset <= offset);
assert(offset + bytes <= req->overlap_offset + req->overlap_bytes);
assert(child->perm & BLK_PERM_WRITE);
assert(end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE);
/* FIXME: Block migration uses the BlockBackend of the guest device at a
* point when it has not yet taken write permissions. This will be
* fixed by a future patch, but for now we have to bypass this
* assertion for block migration to work. */
// assert(child->perm & BLK_PERM_WRITE);
/* FIXME: Because of the above, we also cannot guarantee that all format
* BDS take the BLK_PERM_RESIZE permission on their file BDS, since
* they are not obligated to do so if they do not have any parent
* that has taken the permission to write to them. */
// assert(end_sector <= bs->total_sectors || child->perm & BLK_PERM_RESIZE);
ret = notifier_with_return_list_notify(&bs->before_write_notifiers, req);
@@ -1760,7 +1792,7 @@ static int64_t coroutine_fn bdrv_co_get_block_status(BlockDriverState *bs,
if (ret & BDRV_BLOCK_RAW) {
assert(ret & BDRV_BLOCK_OFFSET_VALID);
ret = bdrv_get_block_status(bs->file->bs, ret >> BDRV_SECTOR_BITS,
ret = bdrv_get_block_status(*file, ret >> BDRV_SECTOR_BITS,
*pnum, pnum, file);
goto out;
}
@@ -1873,7 +1905,7 @@ int64_t bdrv_get_block_status_above(BlockDriverState *bs,
} else {
co = qemu_coroutine_create(bdrv_get_block_status_above_co_entry,
&data);
qemu_coroutine_enter(co);
bdrv_coroutine_enter(bs, co);
BDRV_POLL_WHILE(bs, !data.done);
}
return data.ret;
@@ -1999,7 +2031,7 @@ bdrv_rw_vmstate(BlockDriverState *bs, QEMUIOVector *qiov, int64_t pos,
};
Coroutine *co = qemu_coroutine_create(bdrv_co_rw_vmstate_entry, &data);
qemu_coroutine_enter(co);
bdrv_coroutine_enter(bs, co);
while (data.ret == -EINPROGRESS) {
aio_poll(bdrv_get_aio_context(bs), true);
}
@@ -2216,7 +2248,7 @@ static BlockAIOCB *bdrv_co_aio_prw_vector(BdrvChild *child,
acb->is_write = is_write;
co = qemu_coroutine_create(bdrv_co_do_rw, acb);
qemu_coroutine_enter(co);
bdrv_coroutine_enter(child->bs, co);
bdrv_co_maybe_schedule_bh(acb);
return &acb->common;
@@ -2247,7 +2279,7 @@ BlockAIOCB *bdrv_aio_flush(BlockDriverState *bs,
acb->req.error = -EINPROGRESS;
co = qemu_coroutine_create(bdrv_aio_flush_co_entry, acb);
qemu_coroutine_enter(co);
bdrv_coroutine_enter(bs, co);
bdrv_co_maybe_schedule_bh(acb);
return &acb->common;
@@ -2271,16 +2303,17 @@ static void coroutine_fn bdrv_flush_co_entry(void *opaque)
int coroutine_fn bdrv_co_flush(BlockDriverState *bs)
{
int ret;
if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs) ||
bdrv_is_sg(bs)) {
return 0;
}
int current_gen;
int ret = 0;
bdrv_inc_in_flight(bs);
int current_gen = bs->write_gen;
if (!bs || !bdrv_is_inserted(bs) || bdrv_is_read_only(bs) ||
bdrv_is_sg(bs)) {
goto early_exit;
}
current_gen = bs->write_gen;
/* Wait until any previous flushes are completed */
while (bs->active_flush_req) {
@@ -2363,6 +2396,7 @@ out:
/* Return value is ignored - it's ok if wait queue is empty */
qemu_co_queue_next(&bs->flush_queue);
early_exit:
bdrv_dec_in_flight(bs);
return ret;
}
@@ -2380,7 +2414,7 @@ int bdrv_flush(BlockDriverState *bs)
bdrv_flush_co_entry(&flush_co);
} else {
co = qemu_coroutine_create(bdrv_flush_co_entry, &flush_co);
qemu_coroutine_enter(co);
bdrv_coroutine_enter(bs, co);
BDRV_POLL_WHILE(bs, flush_co.ret == NOT_DONE);
}
@@ -2527,7 +2561,7 @@ int bdrv_pdiscard(BlockDriverState *bs, int64_t offset, int count)
bdrv_pdiscard_co_entry(&rwco);
} else {
co = qemu_coroutine_create(bdrv_pdiscard_co_entry, &rwco);
qemu_coroutine_enter(co);
bdrv_coroutine_enter(bs, co);
BDRV_POLL_WHILE(bs, rwco.ret == NOT_DONE);
}

View File

@@ -103,7 +103,6 @@ typedef struct IscsiTask {
typedef struct IscsiAIOCB {
BlockAIOCB common;
QEMUIOVector *qiov;
QEMUBH *bh;
IscsiLun *iscsilun;
struct scsi_task *task;
@@ -637,6 +636,7 @@ retry:
}
#endif
if (iTask.task == NULL) {
qemu_mutex_unlock(&iscsilun->mutex);
return -ENOMEM;
}
#if LIBISCSI_API_VERSION < (20160603)
@@ -864,6 +864,7 @@ retry:
}
#endif
if (iTask.task == NULL) {
qemu_mutex_unlock(&iscsilun->mutex);
return -ENOMEM;
}
#if LIBISCSI_API_VERSION < (20160603)
@@ -904,6 +905,7 @@ static int coroutine_fn iscsi_co_flush(BlockDriverState *bs)
retry:
if (iscsi_synchronizecache10_task(iscsilun->iscsi, iscsilun->lun, 0, 0, 0,
0, iscsi_co_generic_cb, &iTask) == NULL) {
qemu_mutex_unlock(&iscsilun->mutex);
return -ENOMEM;
}
@@ -1237,6 +1239,7 @@ retry:
0, 0, iscsi_co_generic_cb, &iTask);
}
if (iTask.task == NULL) {
qemu_mutex_unlock(&iscsilun->mutex);
return -ENOMEM;
}
@@ -2089,6 +2092,7 @@ static int iscsi_create(const char *filename, QemuOpts *opts, Error **errp)
BlockDriverState *bs;
IscsiLun *iscsilun = NULL;
QDict *bs_options;
Error *local_err = NULL;
bs = bdrv_new();
@@ -2099,8 +2103,13 @@ static int iscsi_create(const char *filename, QemuOpts *opts, Error **errp)
iscsilun = bs->opaque;
bs_options = qdict_new();
qdict_put(bs_options, "filename", qstring_from_str(filename));
ret = iscsi_open(bs, bs_options, 0, NULL);
iscsi_parse_filename(filename, bs_options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
} else {
ret = iscsi_open(bs, bs_options, 0, NULL);
}
QDECREF(bs_options);
if (ret != 0) {

View File

@@ -12,6 +12,7 @@
*/
#include "qemu/osdep.h"
#include "qemu/cutils.h"
#include "trace.h"
#include "block/blockjob_int.h"
#include "block/block_int.h"
@@ -509,6 +510,13 @@ static void mirror_exit(BlockJob *job, void *opaque)
* block_job_completed(). */
bdrv_ref(src);
bdrv_ref(mirror_top_bs);
bdrv_ref(target_bs);
/* Remove target parent that still uses BLK_PERM_WRITE/RESIZE before
* inserting target_bs at s->to_replace, where we might not be able to get
* these permissions. */
blk_unref(s->target);
s->target = NULL;
/* We don't access the source any more. Dropping any WRITE/RESIZE is
* required before it could become a backing file of target_bs. */
@@ -543,8 +551,12 @@ static void mirror_exit(BlockJob *job, void *opaque)
/* The mirror job has no requests in flight any more, but we need to
* drain potential other users of the BDS before changing the graph. */
bdrv_drained_begin(target_bs);
bdrv_replace_in_backing_chain(to_replace, target_bs);
bdrv_replace_node(to_replace, target_bs, &local_err);
bdrv_drained_end(target_bs);
if (local_err) {
error_report_err(local_err);
data->ret = -EPERM;
}
}
if (s->to_replace) {
bdrv_op_unblock_all(s->to_replace, s->replace_blocker);
@@ -555,19 +567,20 @@ static void mirror_exit(BlockJob *job, void *opaque)
aio_context_release(replace_aio_context);
}
g_free(s->replaces);
blk_unref(s->target);
s->target = NULL;
bdrv_unref(target_bs);
/* Remove the mirror filter driver from the graph. Before this, get rid of
* the blockers on the intermediate nodes so that the resulting state is
* valid. */
* valid. Also give up permissions on mirror_top_bs->backing, which might
* block the removal. */
block_job_remove_all_bdrv(job);
bdrv_replace_in_backing_chain(mirror_top_bs, backing_bs(mirror_top_bs));
bdrv_child_try_set_perm(mirror_top_bs->backing, 0, BLK_PERM_ALL,
&error_abort);
bdrv_replace_node(mirror_top_bs, backing_bs(mirror_top_bs), &error_abort);
/* We just changed the BDS the job BB refers to (with either or both of the
* bdrv_replace_in_backing_chain() calls), so switch the BB back so the
* cleanup does the right thing. We don't need any permissions any more
* now. */
* bdrv_replace_node() calls), so switch the BB back so the cleanup does
* the right thing. We don't need any permissions any more now. */
blk_remove_bs(job->blk);
blk_set_perm(job->blk, 0, BLK_PERM_ALL, &error_abort);
blk_insert_bs(job->blk, mirror_top_bs, &error_abort);
@@ -621,7 +634,8 @@ static int coroutine_fn mirror_dirty_init(MirrorBlockJob *s)
}
if (s->in_flight >= MAX_IN_FLIGHT) {
trace_mirror_yield(s, s->in_flight, s->buf_free_count, -1);
trace_mirror_yield(s, UINT64_MAX, s->buf_free_count,
s->in_flight);
mirror_wait_for_io(s);
continue;
}
@@ -796,7 +810,7 @@ static void coroutine_fn mirror_run(void *opaque)
s->common.iostatus == BLOCK_DEVICE_IO_STATUS_OK) {
if (s->in_flight >= MAX_IN_FLIGHT || s->buf_free_count == 0 ||
(cnt == 0 && s->in_flight > 0)) {
trace_mirror_yield(s, s->in_flight, s->buf_free_count, cnt);
trace_mirror_yield(s, cnt, s->buf_free_count, s->in_flight);
mirror_wait_for_io(s);
continue;
} else if (cnt != 0) {
@@ -1049,6 +1063,13 @@ static int coroutine_fn bdrv_mirror_top_pdiscard(BlockDriverState *bs,
return bdrv_co_pdiscard(bs->backing->bs, offset, count);
}
static void bdrv_mirror_top_refresh_filename(BlockDriverState *bs, QDict *opts)
{
bdrv_refresh_filename(bs->backing->bs);
pstrcpy(bs->exact_filename, sizeof(bs->exact_filename),
bs->backing->bs->filename);
}
static void bdrv_mirror_top_close(BlockDriverState *bs)
{
}
@@ -1077,6 +1098,7 @@ static BlockDriver bdrv_mirror_top = {
.bdrv_co_pdiscard = bdrv_mirror_top_pdiscard,
.bdrv_co_flush = bdrv_mirror_top_flush,
.bdrv_co_get_block_status = bdrv_mirror_top_get_block_status,
.bdrv_refresh_filename = bdrv_mirror_top_refresh_filename,
.bdrv_close = bdrv_mirror_top_close,
.bdrv_child_perm = bdrv_mirror_top_child_perm,
};
@@ -1090,10 +1112,11 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
BlockdevOnError on_target_error,
bool unmap,
BlockCompletionFunc *cb,
void *opaque, Error **errp,
void *opaque,
const BlockJobDriver *driver,
bool is_none_mode, BlockDriverState *base,
bool auto_complete, const char *filter_node_name)
bool auto_complete, const char *filter_node_name,
Error **errp)
{
MirrorBlockJob *s;
BlockDriverState *mirror_top_bs;
@@ -1126,9 +1149,10 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
return;
}
mirror_top_bs->total_sectors = bs->total_sectors;
bdrv_set_aio_context(mirror_top_bs, bdrv_get_aio_context(bs));
/* bdrv_append takes ownership of the mirror_top_bs reference, need to keep
* it alive until block_job_create() even if bs has no parent. */
* it alive until block_job_create() succeeds even if bs has no parent. */
bdrv_ref(mirror_top_bs);
bdrv_drained_begin(bs);
bdrv_append(mirror_top_bs, bs, &local_err);
@@ -1146,10 +1170,12 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE_UNCHANGED |
BLK_PERM_WRITE | BLK_PERM_GRAPH_MOD, speed,
creation_flags, cb, opaque, errp);
bdrv_unref(mirror_top_bs);
if (!s) {
goto fail;
}
/* The block job now has a reference to this node */
bdrv_unref(mirror_top_bs);
s->source = bs;
s->mirror_top_bs = mirror_top_bs;
@@ -1189,10 +1215,7 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
s->dirty_bitmap = bdrv_create_dirty_bitmap(bs, granularity, NULL, errp);
if (!s->dirty_bitmap) {
g_free(s->replaces);
blk_unref(s->target);
block_job_unref(&s->common);
return;
goto fail;
}
/* Required permissions are already taken with blk_new() */
@@ -1223,12 +1246,20 @@ static void mirror_start_job(const char *job_id, BlockDriverState *bs,
fail:
if (s) {
/* Make sure this BDS does not go away until we have completed the graph
* changes below */
bdrv_ref(mirror_top_bs);
g_free(s->replaces);
blk_unref(s->target);
block_job_unref(&s->common);
}
bdrv_replace_in_backing_chain(mirror_top_bs, backing_bs(mirror_top_bs));
bdrv_child_try_set_perm(mirror_top_bs->backing, 0, BLK_PERM_ALL,
&error_abort);
bdrv_replace_node(mirror_top_bs, backing_bs(mirror_top_bs), &error_abort);
bdrv_unref(mirror_top_bs);
}
void mirror_start(const char *job_id, BlockDriverState *bs,
@@ -1250,17 +1281,17 @@ void mirror_start(const char *job_id, BlockDriverState *bs,
base = mode == MIRROR_SYNC_MODE_TOP ? backing_bs(bs) : NULL;
mirror_start_job(job_id, bs, BLOCK_JOB_DEFAULT, target, replaces,
speed, granularity, buf_size, backing_mode,
on_source_error, on_target_error, unmap, NULL, NULL, errp,
on_source_error, on_target_error, unmap, NULL, NULL,
&mirror_job_driver, is_none_mode, base, false,
filter_node_name);
filter_node_name, errp);
}
void commit_active_start(const char *job_id, BlockDriverState *bs,
BlockDriverState *base, int creation_flags,
int64_t speed, BlockdevOnError on_error,
const char *filter_node_name,
BlockCompletionFunc *cb, void *opaque, Error **errp,
bool auto_complete)
BlockCompletionFunc *cb, void *opaque,
bool auto_complete, Error **errp)
{
int orig_base_flags;
Error *local_err = NULL;
@@ -1273,9 +1304,9 @@ void commit_active_start(const char *job_id, BlockDriverState *bs,
mirror_start_job(job_id, bs, creation_flags, base, NULL, speed, 0, 0,
MIRROR_LEAVE_BACKING_CHAIN,
on_error, on_error, true, cb, opaque, &local_err,
on_error, on_error, true, cb, opaque,
&commit_active_job_driver, false, base, auto_complete,
filter_node_name);
filter_node_name, &local_err);
if (local_err) {
error_propagate(errp, local_err);
goto error_restore_flags;

View File

@@ -33,17 +33,15 @@
#define HANDLE_TO_INDEX(bs, handle) ((handle) ^ ((uint64_t)(intptr_t)bs))
#define INDEX_TO_HANDLE(bs, index) ((index) ^ ((uint64_t)(intptr_t)bs))
static void nbd_recv_coroutines_enter_all(BlockDriverState *bs)
static void nbd_recv_coroutines_enter_all(NBDClientSession *s)
{
NBDClientSession *s = nbd_get_client_session(bs);
int i;
for (i = 0; i < MAX_NBD_REQUESTS; i++) {
if (s->recv_coroutine[i]) {
qemu_coroutine_enter(s->recv_coroutine[i]);
aio_co_wake(s->recv_coroutine[i]);
}
}
BDRV_POLL_WHILE(bs, s->read_reply_co);
}
static void nbd_teardown_connection(BlockDriverState *bs)
@@ -58,7 +56,7 @@ static void nbd_teardown_connection(BlockDriverState *bs)
qio_channel_shutdown(client->ioc,
QIO_CHANNEL_SHUTDOWN_BOTH,
NULL);
nbd_recv_coroutines_enter_all(bs);
BDRV_POLL_WHILE(bs, client->read_reply_co);
nbd_client_detach_aio_context(bs);
object_unref(OBJECT(client->sioc));
@@ -76,7 +74,7 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
for (;;) {
assert(s->reply.handle == 0);
ret = nbd_receive_reply(s->ioc, &s->reply);
if (ret < 0) {
if (ret <= 0) {
break;
}
@@ -103,6 +101,8 @@ static coroutine_fn void nbd_read_reply_entry(void *opaque)
aio_co_wake(s->recv_coroutine[i]);
qemu_coroutine_yield();
}
nbd_recv_coroutines_enter_all(s);
s->read_reply_co = NULL;
}

View File

@@ -30,8 +30,6 @@ typedef struct NBDClientSession {
Coroutine *recv_coroutine[MAX_NBD_REQUESTS];
NBDReply reply;
bool is_unix;
} NBDClientSession;
NBDClientSession *nbd_get_client_session(BlockDriverState *bs);

View File

@@ -47,7 +47,7 @@ typedef struct BDRVNBDState {
NBDClientSession client;
/* For nbd_refresh_filename() */
SocketAddress *saddr;
SocketAddressFlat *saddr;
char *export, *tlscredsid;
} BDRVNBDState;
@@ -95,7 +95,7 @@ static int nbd_parse_uri(const char *filename, QDict *options)
goto out;
}
qdict_put(options, "server.type", qstring_from_str("unix"));
qdict_put(options, "server.data.path",
qdict_put(options, "server.path",
qstring_from_str(qp->p[0].value));
} else {
QString *host;
@@ -116,10 +116,10 @@ static int nbd_parse_uri(const char *filename, QDict *options)
}
qdict_put(options, "server.type", qstring_from_str("inet"));
qdict_put(options, "server.data.host", host);
qdict_put(options, "server.host", host);
port_str = g_strdup_printf("%d", uri->port ?: NBD_DEFAULT_PORT);
qdict_put(options, "server.data.port", qstring_from_str(port_str));
qdict_put(options, "server.port", qstring_from_str(port_str));
g_free(port_str);
}
@@ -197,7 +197,7 @@ static void nbd_parse_filename(const char *filename, QDict *options,
/* are we a UNIX or TCP socket? */
if (strstart(host_spec, "unix:", &unixpath)) {
qdict_put(options, "server.type", qstring_from_str("unix"));
qdict_put(options, "server.data.path", qstring_from_str(unixpath));
qdict_put(options, "server.path", qstring_from_str(unixpath));
} else {
InetSocketAddress *addr = NULL;
@@ -207,8 +207,8 @@ static void nbd_parse_filename(const char *filename, QDict *options,
}
qdict_put(options, "server.type", qstring_from_str("inet"));
qdict_put(options, "server.data.host", qstring_from_str(addr->host));
qdict_put(options, "server.data.port", qstring_from_str(addr->port));
qdict_put(options, "server.host", qstring_from_str(addr->host));
qdict_put(options, "server.port", qstring_from_str(addr->port));
qapi_free_InetSocketAddress(addr);
}
@@ -248,20 +248,21 @@ static bool nbd_process_legacy_socket_options(QDict *output_options,
}
qdict_put(output_options, "server.type", qstring_from_str("unix"));
qdict_put(output_options, "server.data.path", qstring_from_str(path));
qdict_put(output_options, "server.path", qstring_from_str(path));
} else if (host) {
qdict_put(output_options, "server.type", qstring_from_str("inet"));
qdict_put(output_options, "server.data.host", qstring_from_str(host));
qdict_put(output_options, "server.data.port",
qdict_put(output_options, "server.host", qstring_from_str(host));
qdict_put(output_options, "server.port",
qstring_from_str(port ?: stringify(NBD_DEFAULT_PORT)));
}
return true;
}
static SocketAddress *nbd_config(BDRVNBDState *s, QDict *options, Error **errp)
static SocketAddressFlat *nbd_config(BDRVNBDState *s, QDict *options,
Error **errp)
{
SocketAddress *saddr = NULL;
SocketAddressFlat *saddr = NULL;
QDict *addr = NULL;
QObject *crumpled_addr = NULL;
Visitor *iv = NULL;
@@ -278,15 +279,21 @@ static SocketAddress *nbd_config(BDRVNBDState *s, QDict *options, Error **errp)
goto done;
}
iv = qobject_input_visitor_new(crumpled_addr, true);
visit_type_SocketAddress(iv, NULL, &saddr, &local_err);
/*
* FIXME .numeric, .to, .ipv4 or .ipv6 don't work with -drive
* server.type=inet. .to doesn't matter, it's ignored anyway.
* That's because when @options come from -blockdev or
* blockdev_add, members are typed according to the QAPI schema,
* but when they come from -drive, they're all QString. The
* visitor expects the former.
*/
iv = qobject_input_visitor_new(crumpled_addr);
visit_type_SocketAddressFlat(iv, NULL, &saddr, &local_err);
if (local_err) {
error_propagate(errp, local_err);
goto done;
}
s->client.is_unix = saddr->type == SOCKET_ADDRESS_KIND_UNIX;
done:
QDECREF(addr);
qobject_decref(crumpled_addr);
@@ -300,9 +307,10 @@ NBDClientSession *nbd_get_client_session(BlockDriverState *bs)
return &s->client;
}
static QIOChannelSocket *nbd_establish_connection(SocketAddress *saddr,
static QIOChannelSocket *nbd_establish_connection(SocketAddressFlat *saddr_flat,
Error **errp)
{
SocketAddress *saddr = socket_address_crumple(saddr_flat);
QIOChannelSocket *sioc;
Error *local_err = NULL;
@@ -312,7 +320,9 @@ static QIOChannelSocket *nbd_establish_connection(SocketAddress *saddr,
qio_channel_socket_connect_sync(sioc,
saddr,
&local_err);
qapi_free_SocketAddress(saddr);
if (local_err) {
object_unref(OBJECT(sioc));
error_propagate(errp, local_err);
return NULL;
}
@@ -403,7 +413,7 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
goto error;
}
/* Translate @host, @port, and @path to a SocketAddress */
/* Translate @host, @port, and @path to a SocketAddressFlat */
if (!nbd_process_legacy_socket_options(options, opts, errp)) {
goto error;
}
@@ -423,11 +433,12 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
goto error;
}
if (s->saddr->type != SOCKET_ADDRESS_KIND_INET) {
/* TODO SOCKET_ADDRESS_KIND_FD where fd has AF_INET or AF_INET6 */
if (s->saddr->type != SOCKET_ADDRESS_FLAT_TYPE_INET) {
error_setg(errp, "TLS only supported over IP sockets");
goto error;
}
hostname = s->saddr->u.inet.data->host;
hostname = s->saddr->u.inet.host;
}
/* establish TCP connection, return error if it fails
@@ -450,7 +461,7 @@ static int nbd_open(BlockDriverState *bs, QDict *options, int flags,
object_unref(OBJECT(tlscreds));
}
if (ret < 0) {
qapi_free_SocketAddress(s->saddr);
qapi_free_SocketAddressFlat(s->saddr);
g_free(s->export);
g_free(s->tlscredsid);
}
@@ -476,7 +487,7 @@ static void nbd_close(BlockDriverState *bs)
nbd_client_close(bs);
qapi_free_SocketAddress(s->saddr);
qapi_free_SocketAddressFlat(s->saddr);
g_free(s->export);
g_free(s->tlscredsid);
}
@@ -507,15 +518,15 @@ static void nbd_refresh_filename(BlockDriverState *bs, QDict *options)
Visitor *ov;
const char *host = NULL, *port = NULL, *path = NULL;
if (s->saddr->type == SOCKET_ADDRESS_KIND_INET) {
const InetSocketAddress *inet = s->saddr->u.inet.data;
if (s->saddr->type == SOCKET_ADDRESS_FLAT_TYPE_INET) {
const InetSocketAddress *inet = &s->saddr->u.inet;
if (!inet->has_ipv4 && !inet->has_ipv6 && !inet->has_to) {
host = inet->host;
port = inet->port;
}
} else if (s->saddr->type == SOCKET_ADDRESS_KIND_UNIX) {
path = s->saddr->u.q_unix.data->path;
}
} else if (s->saddr->type == SOCKET_ADDRESS_FLAT_TYPE_UNIX) {
path = s->saddr->u.q_unix.path;
} /* else can't represent as pseudo-filename */
qdict_put(opts, "driver", qstring_from_str("nbd"));
@@ -534,7 +545,7 @@ static void nbd_refresh_filename(BlockDriverState *bs, QDict *options)
}
ov = qobject_output_visitor_new(&saddr_qdict);
visit_type_SocketAddress(ov, NULL, &s->saddr, &error_abort);
visit_type_SocketAddressFlat(ov, NULL, &s->saddr, &error_abort);
visit_complete(ov, &saddr_qdict);
visit_free(ov);
qdict_put_obj(opts, "server", saddr_qdict);

View File

@@ -474,7 +474,14 @@ static NFSServer *nfs_config(QDict *options, Error **errp)
goto out;
}
iv = qobject_input_visitor_new(crumpled_addr, true);
/*
* Caution: this works only because all scalar members of
* NFSServer are QString in @crumpled_addr. The visitor expects
* @crumpled_addr to be typed according to the QAPI schema. It
* is when @options come from -blockdev or blockdev_add. But when
* they come from -drive, they're all QString.
*/
iv = qobject_input_visitor_new(crumpled_addr);
visit_type_NFSServer(iv, NULL, &server, &local_error);
if (local_error) {
error_propagate(errp, local_error);
@@ -490,7 +497,7 @@ out:
static int64_t nfs_client_open(NFSClient *client, QDict *options,
int flags, Error **errp, int open_flags)
int flags, int open_flags, Error **errp)
{
int ret = -EINVAL;
QemuOpts *opts = NULL;
@@ -656,7 +663,7 @@ static int nfs_file_open(BlockDriverState *bs, QDict *options, int flags,
ret = nfs_client_open(client, options,
(flags & BDRV_O_RDWR) ? O_RDWR : O_RDONLY,
errp, bs->open_flags);
bs->open_flags, errp);
if (ret < 0) {
return ret;
}
@@ -698,7 +705,7 @@ static int nfs_file_create(const char *url, QemuOpts *opts, Error **errp)
goto out;
}
ret = nfs_client_open(client, options, O_CREAT, errp, 0);
ret = nfs_client_open(client, options, O_CREAT, 0, errp);
if (ret < 0) {
goto out;
}

View File

@@ -114,7 +114,7 @@ static QemuOptsList parallels_runtime_opts = {
.name = PARALLELS_OPT_PREALLOC_SIZE,
.type = QEMU_OPT_SIZE,
.help = "Preallocation size on image expansion",
.def_value_str = "128MiB",
.def_value_str = "128M",
},
{
.name = PARALLELS_OPT_PREALLOC_MODE,
@@ -192,8 +192,7 @@ static int64_t allocate_clusters(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, int *pnum)
{
BDRVParallelsState *s = bs->opaque;
uint32_t idx, to_allocate, i;
int64_t pos, space;
int64_t pos, space, idx, to_allocate, i;
pos = block_status(s, sector_num, nb_sectors, pnum);
if (pos > 0) {
@@ -201,11 +200,19 @@ static int64_t allocate_clusters(BlockDriverState *bs, int64_t sector_num,
}
idx = sector_num / s->tracks;
if (idx >= s->bat_size) {
return -EINVAL;
}
to_allocate = DIV_ROUND_UP(sector_num + *pnum, s->tracks) - idx;
/* This function is called only by parallels_co_writev(), which will never
* pass a sector_num at or beyond the end of the image (because the block
* layer never passes such a sector_num to that function). Therefore, idx
* is always below s->bat_size.
* block_status() will limit *pnum so that sector_num + *pnum will not
* exceed the image end. Therefore, idx + to_allocate cannot exceed
* s->bat_size.
* Note that s->bat_size is an unsigned int, therefore idx + to_allocate
* will always fit into a uint32_t. */
assert(idx < s->bat_size && idx + to_allocate <= s->bat_size);
space = to_allocate * s->tracks;
if (s->data_end + space > bdrv_getlength(bs->file->bs) >> BDRV_SECTOR_BITS) {
int ret;
@@ -687,7 +694,8 @@ static int parallels_open(BlockDriverState *bs, QDict *options, int flags,
if (local_err != NULL) {
goto fail_options;
}
if (!bdrv_has_zero_init(bs->file->bs) ||
if (!(flags & BDRV_O_RESIZE) || !bdrv_has_zero_init(bs->file->bs) ||
bdrv_truncate(bs->file, bdrv_getlength(bs->file->bs)) != 0) {
s->prealloc_mode = PRL_PREALLOC_MODE_FALLOCATE;
}

View File

@@ -1519,12 +1519,10 @@ int qcow2_discard_clusters(BlockDriverState *bs, uint64_t offset,
end_offset = offset + (nb_sectors << BDRV_SECTOR_BITS);
/* Round start up and end down */
offset = align_offset(offset, s->cluster_size);
end_offset = start_of_cluster(s, end_offset);
if (offset > end_offset) {
return 0;
/* The caller must cluster-align start; round end down except at EOF */
assert(QEMU_IS_ALIGNED(offset, s->cluster_size));
if (end_offset != bs->total_sectors * BDRV_SECTOR_SIZE) {
end_offset = start_of_cluster(s, end_offset);
}
nb_clusters = size_to_clusters(s, end_offset - offset);

View File

@@ -13,13 +13,14 @@
#include "qemu/osdep.h"
#include <rbd/librbd.h>
#include "qapi/error.h"
#include "qemu/error-report.h"
#include "block/block_int.h"
#include "crypto/secret.h"
#include "qemu/cutils.h"
#include <rbd/librbd.h>
#include "qapi/qmp/qstring.h"
#include "qapi/qmp/qjson.h"
/*
* When specifying the image filename use:
@@ -55,11 +56,6 @@
#define OBJ_MAX_SIZE (1UL << OBJ_DEFAULT_OBJ_ORDER)
#define RBD_MAX_CONF_NAME_SIZE 128
#define RBD_MAX_CONF_VAL_SIZE 512
#define RBD_MAX_CONF_SIZE 1024
#define RBD_MAX_POOL_NAME_SIZE 128
#define RBD_MAX_SNAP_NAME_SIZE 128
#define RBD_MAX_SNAPS 100
/* The LIBRBD_SUPPORTS_IOVEC is defined in librbd.h */
@@ -98,46 +94,29 @@ typedef struct BDRVRBDState {
rados_t cluster;
rados_ioctx_t io_ctx;
rbd_image_t image;
char name[RBD_MAX_IMAGE_NAME_SIZE];
char *image_name;
char *snap;
} BDRVRBDState;
static int qemu_rbd_next_tok(char *dst, int dst_len,
char *src, char delim,
const char *name,
char **p, Error **errp)
static char *qemu_rbd_next_tok(char *src, char delim, char **p)
{
int l;
char *end;
*p = NULL;
if (delim != '\0') {
for (end = src; *end; ++end) {
if (*end == delim) {
break;
}
if (*end == '\\' && end[1] != '\0') {
end++;
}
}
for (end = src; *end; ++end) {
if (*end == delim) {
*p = end + 1;
*end = '\0';
break;
}
if (*end == '\\' && end[1] != '\0') {
end++;
}
}
l = strlen(src);
if (l >= dst_len) {
error_setg(errp, "%s too long", name);
return -EINVAL;
} else if (l == 0) {
error_setg(errp, "%s too short", name);
return -EINVAL;
if (*end == delim) {
*p = end + 1;
*end = '\0';
}
pstrcpy(dst, dst_len, src);
return 0;
return src;
}
static void qemu_rbd_unescape(char *src)
@@ -153,87 +132,92 @@ static void qemu_rbd_unescape(char *src)
*p = '\0';
}
static int qemu_rbd_parsename(const char *filename,
char *pool, int pool_len,
char *snap, int snap_len,
char *name, int name_len,
char *conf, int conf_len,
Error **errp)
static void qemu_rbd_parse_filename(const char *filename, QDict *options,
Error **errp)
{
const char *start;
char *p, *buf;
int ret;
QList *keypairs = NULL;
char *found_str;
if (!strstart(filename, "rbd:", &start)) {
error_setg(errp, "File name must start with 'rbd:'");
return -EINVAL;
return;
}
buf = g_strdup(start);
p = buf;
*snap = '\0';
*conf = '\0';
ret = qemu_rbd_next_tok(pool, pool_len, p,
'/', "pool name", &p, errp);
if (ret < 0 || !p) {
ret = -EINVAL;
found_str = qemu_rbd_next_tok(p, '/', &p);
if (!p) {
error_setg(errp, "Pool name is required");
goto done;
}
qemu_rbd_unescape(pool);
qemu_rbd_unescape(found_str);
qdict_put(options, "pool", qstring_from_str(found_str));
if (strchr(p, '@')) {
ret = qemu_rbd_next_tok(name, name_len, p,
'@', "object name", &p, errp);
if (ret < 0) {
goto done;
}
ret = qemu_rbd_next_tok(snap, snap_len, p,
':', "snap name", &p, errp);
qemu_rbd_unescape(snap);
found_str = qemu_rbd_next_tok(p, '@', &p);
qemu_rbd_unescape(found_str);
qdict_put(options, "image", qstring_from_str(found_str));
found_str = qemu_rbd_next_tok(p, ':', &p);
qemu_rbd_unescape(found_str);
qdict_put(options, "snapshot", qstring_from_str(found_str));
} else {
ret = qemu_rbd_next_tok(name, name_len, p,
':', "object name", &p, errp);
found_str = qemu_rbd_next_tok(p, ':', &p);
qemu_rbd_unescape(found_str);
qdict_put(options, "image", qstring_from_str(found_str));
}
qemu_rbd_unescape(name);
if (ret < 0 || !p) {
if (!p) {
goto done;
}
ret = qemu_rbd_next_tok(conf, conf_len, p,
'\0', "configuration", &p, errp);
/* The following are essentially all key/value pairs, and we treat
* 'id' and 'conf' a bit special. Key/value pairs may be in any order. */
while (p) {
char *name, *value;
name = qemu_rbd_next_tok(p, '=', &p);
if (!p) {
error_setg(errp, "conf option %s has no value", name);
break;
}
qemu_rbd_unescape(name);
value = qemu_rbd_next_tok(p, ':', &p);
qemu_rbd_unescape(value);
if (!strcmp(name, "conf")) {
qdict_put(options, "conf", qstring_from_str(value));
} else if (!strcmp(name, "id")) {
qdict_put(options, "user" , qstring_from_str(value));
} else {
/*
* We pass these internally to qemu_rbd_set_keypairs(), so
* we can get away with the simpler list of [ "key1",
* "value1", "key2", "value2" ] rather than a raw dict
* { "key1": "value1", "key2": "value2" } where we can't
* guarantee order, or even a more correct but complex
* [ { "key1": "value1" }, { "key2": "value2" } ]
*/
if (!keypairs) {
keypairs = qlist_new();
}
qlist_append(keypairs, qstring_from_str(name));
qlist_append(keypairs, qstring_from_str(value));
}
}
if (keypairs) {
qdict_put(options, "=keyvalue-pairs",
qobject_to_json(QOBJECT(keypairs)));
}
done:
g_free(buf);
return ret;
}
static char *qemu_rbd_parse_clientname(const char *conf, char *clientname)
{
const char *p = conf;
while (*p) {
int len;
const char *end = strchr(p, ':');
if (end) {
len = end - p;
} else {
len = strlen(p);
}
if (strncmp(p, "id=", 3) == 0) {
len -= 3;
strncpy(clientname, p + 3, len);
clientname[len] = '\0';
return clientname;
}
if (end == NULL) {
break;
}
p = end + 1;
}
return NULL;
QDECREF(keypairs);
return;
}
@@ -256,64 +240,41 @@ static int qemu_rbd_set_auth(rados_t cluster, const char *secretid,
return 0;
}
static int qemu_rbd_set_conf(rados_t cluster, const char *conf,
bool only_read_conf_file,
Error **errp)
static int qemu_rbd_set_keypairs(rados_t cluster, const char *keypairs_json,
Error **errp)
{
char *p, *buf;
char name[RBD_MAX_CONF_NAME_SIZE];
char value[RBD_MAX_CONF_VAL_SIZE];
QList *keypairs;
QString *name;
QString *value;
const char *key;
size_t remaining;
int ret = 0;
buf = g_strdup(conf);
p = buf;
if (!keypairs_json) {
return ret;
}
keypairs = qobject_to_qlist(qobject_from_json(keypairs_json,
&error_abort));
remaining = qlist_size(keypairs) / 2;
assert(remaining);
while (p) {
ret = qemu_rbd_next_tok(name, sizeof(name), p,
'=', "conf option name", &p, errp);
while (remaining--) {
name = qobject_to_qstring(qlist_pop(keypairs));
value = qobject_to_qstring(qlist_pop(keypairs));
assert(name && value);
key = qstring_get_str(name);
ret = rados_conf_set(cluster, key, qstring_get_str(value));
QDECREF(name);
QDECREF(value);
if (ret < 0) {
break;
}
qemu_rbd_unescape(name);
if (!p) {
error_setg(errp, "conf option %s has no value", name);
error_setg_errno(errp, -ret, "invalid conf option %s", key);
ret = -EINVAL;
break;
}
ret = qemu_rbd_next_tok(value, sizeof(value), p,
':', "conf option value", &p, errp);
if (ret < 0) {
break;
}
qemu_rbd_unescape(value);
if (strcmp(name, "conf") == 0) {
/* read the conf file alone, so it doesn't override more
specific settings for a particular device */
if (only_read_conf_file) {
ret = rados_conf_read_file(cluster, value);
if (ret < 0) {
error_setg_errno(errp, -ret, "error reading conf file %s",
value);
break;
}
}
} else if (strcmp(name, "id") == 0) {
/* ignore, this is parsed by qemu_rbd_parse_clientname() */
} else if (!only_read_conf_file) {
ret = rados_conf_set(cluster, name, value);
if (ret < 0) {
error_setg_errno(errp, -ret, "invalid conf option %s", name);
ret = -EINVAL;
break;
}
}
}
g_free(buf);
QDECREF(keypairs);
return ret;
}
@@ -328,33 +289,76 @@ static void qemu_rbd_memset(RADOSCB *rcb, int64_t offs)
}
}
static QemuOptsList runtime_opts = {
.name = "rbd",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "pool",
.type = QEMU_OPT_STRING,
.help = "Rados pool name",
},
{
.name = "image",
.type = QEMU_OPT_STRING,
.help = "Image name in the pool",
},
{
.name = "conf",
.type = QEMU_OPT_STRING,
.help = "Rados config file location",
},
{
.name = "snapshot",
.type = QEMU_OPT_STRING,
.help = "Ceph snapshot name",
},
{
/* maps to 'id' in rados_create() */
.name = "user",
.type = QEMU_OPT_STRING,
.help = "Rados id name",
},
/*
* server.* extracted manually, see qemu_rbd_mon_host()
*/
{
.name = "password-secret",
.type = QEMU_OPT_STRING,
.help = "ID of secret providing the password",
},
/*
* Keys for qemu_rbd_parse_filename(), not in the QAPI schema
*/
{
/*
* HACK: name starts with '=' so that qemu_opts_parse()
* can't set it
*/
.name = "=keyvalue-pairs",
.type = QEMU_OPT_STRING,
.help = "Legacy rados key/value option parameters",
},
{ /* end of list */ }
},
};
static int qemu_rbd_create(const char *filename, QemuOpts *opts, Error **errp)
{
Error *local_err = NULL;
int64_t bytes = 0;
int64_t objsize;
int obj_order = 0;
char pool[RBD_MAX_POOL_NAME_SIZE];
char name[RBD_MAX_IMAGE_NAME_SIZE];
char snap_buf[RBD_MAX_SNAP_NAME_SIZE];
char conf[RBD_MAX_CONF_SIZE];
char clientname_buf[RBD_MAX_CONF_SIZE];
char *clientname;
const char *pool, *image_name, *conf, *user, *keypairs;
const char *secretid;
rados_t cluster;
rados_ioctx_t io_ctx;
int ret;
QDict *options = NULL;
int ret = 0;
secretid = qemu_opt_get(opts, "password-secret");
if (qemu_rbd_parsename(filename, pool, sizeof(pool),
snap_buf, sizeof(snap_buf),
name, sizeof(name),
conf, sizeof(conf), &local_err) < 0) {
error_propagate(errp, local_err);
return -EINVAL;
}
/* Read out options */
bytes = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
@@ -362,35 +366,53 @@ static int qemu_rbd_create(const char *filename, QemuOpts *opts, Error **errp)
if (objsize) {
if ((objsize - 1) & objsize) { /* not a power of 2? */
error_setg(errp, "obj size needs to be power of 2");
return -EINVAL;
ret = -EINVAL;
goto exit;
}
if (objsize < 4096) {
error_setg(errp, "obj size too small");
return -EINVAL;
ret = -EINVAL;
goto exit;
}
obj_order = ctz32(objsize);
}
clientname = qemu_rbd_parse_clientname(conf, clientname_buf);
ret = rados_create(&cluster, clientname);
if (ret < 0) {
error_setg_errno(errp, -ret, "error initializing");
return ret;
options = qdict_new();
qemu_rbd_parse_filename(filename, options, &local_err);
if (local_err) {
ret = -EINVAL;
error_propagate(errp, local_err);
goto exit;
}
if (strstr(conf, "conf=") == NULL) {
/* try default location, but ignore failure */
rados_conf_read_file(cluster, NULL);
} else if (conf[0] != '\0' &&
qemu_rbd_set_conf(cluster, conf, true, &local_err) < 0) {
error_propagate(errp, local_err);
/*
* Caution: while qdict_get_try_str() is fine, getting non-string
* types would require more care. When @options come from -blockdev
* or blockdev_add, its members are typed according to the QAPI
* schema, but when they come from -drive, they're all QString.
*/
pool = qdict_get_try_str(options, "pool");
conf = qdict_get_try_str(options, "conf");
user = qdict_get_try_str(options, "user");
image_name = qdict_get_try_str(options, "image");
keypairs = qdict_get_try_str(options, "=keyvalue-pairs");
ret = rados_create(&cluster, user);
if (ret < 0) {
error_setg_errno(errp, -ret, "error initializing");
goto exit;
}
/* try default location when conf=NULL, but ignore failure */
ret = rados_conf_read_file(cluster, conf);
if (conf && ret < 0) {
error_setg_errno(errp, -ret, "error reading conf file %s", conf);
ret = -EIO;
goto shutdown;
}
if (conf[0] != '\0' &&
qemu_rbd_set_conf(cluster, conf, false, &local_err) < 0) {
error_propagate(errp, local_err);
ret = qemu_rbd_set_keypairs(cluster, keypairs, errp);
if (ret < 0) {
ret = -EIO;
goto shutdown;
}
@@ -412,7 +434,7 @@ static int qemu_rbd_create(const char *filename, QemuOpts *opts, Error **errp)
goto shutdown;
}
ret = rbd_create(io_ctx, name, bytes, &obj_order);
ret = rbd_create(io_ctx, image_name, bytes, &obj_order);
if (ret < 0) {
error_setg_errno(errp, -ret, "error rbd create");
}
@@ -421,6 +443,9 @@ static int qemu_rbd_create(const char *filename, QemuOpts *opts, Error **errp)
shutdown:
rados_shutdown(cluster);
exit:
QDECREF(options);
return ret;
}
@@ -471,83 +496,110 @@ static void qemu_rbd_complete_aio(RADOSCB *rcb)
qemu_aio_unref(acb);
}
/* TODO Convert to fine grained options */
static QemuOptsList runtime_opts = {
.name = "rbd",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "filename",
.type = QEMU_OPT_STRING,
.help = "Specification of the rbd image",
},
{
.name = "password-secret",
.type = QEMU_OPT_STRING,
.help = "ID of secret providing the password",
},
{ /* end of list */ }
},
};
static char *qemu_rbd_mon_host(QDict *options, Error **errp)
{
const char **vals = g_new(const char *, qdict_size(options) + 1);
char keybuf[32];
const char *host, *port;
char *rados_str;
int i;
for (i = 0;; i++) {
sprintf(keybuf, "server.%d.host", i);
host = qdict_get_try_str(options, keybuf);
qdict_del(options, keybuf);
sprintf(keybuf, "server.%d.port", i);
port = qdict_get_try_str(options, keybuf);
qdict_del(options, keybuf);
if (!host && !port) {
break;
}
if (!host) {
error_setg(errp, "Parameter server.%d.host is missing", i);
rados_str = NULL;
goto out;
}
if (strchr(host, ':')) {
vals[i] = port ? g_strdup_printf("[%s]:%s", host, port)
: g_strdup_printf("[%s]", host);
} else {
vals[i] = port ? g_strdup_printf("%s:%s", host, port)
: g_strdup(host);
}
}
vals[i] = NULL;
rados_str = i ? g_strjoinv(";", (char **)vals) : NULL;
out:
g_strfreev((char **)vals);
return rados_str;
}
static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVRBDState *s = bs->opaque;
char pool[RBD_MAX_POOL_NAME_SIZE];
char snap_buf[RBD_MAX_SNAP_NAME_SIZE];
char conf[RBD_MAX_CONF_SIZE];
char clientname_buf[RBD_MAX_CONF_SIZE];
char *clientname;
const char *pool, *snap, *conf, *user, *image_name, *keypairs;
const char *secretid;
QemuOpts *opts;
Error *local_err = NULL;
const char *filename;
char *mon_host = NULL;
int r;
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
qemu_opts_del(opts);
return -EINVAL;
}
filename = qemu_opt_get(opts, "filename");
secretid = qemu_opt_get(opts, "password-secret");
if (qemu_rbd_parsename(filename, pool, sizeof(pool),
snap_buf, sizeof(snap_buf),
s->name, sizeof(s->name),
conf, sizeof(conf), errp) < 0) {
r = -EINVAL;
goto failed_opts;
}
clientname = qemu_rbd_parse_clientname(conf, clientname_buf);
r = rados_create(&s->cluster, clientname);
mon_host = qemu_rbd_mon_host(options, &local_err);
if (local_err) {
error_propagate(errp, local_err);
r = -EINVAL;
goto failed_opts;
}
secretid = qemu_opt_get(opts, "password-secret");
pool = qemu_opt_get(opts, "pool");
conf = qemu_opt_get(opts, "conf");
snap = qemu_opt_get(opts, "snapshot");
user = qemu_opt_get(opts, "user");
image_name = qemu_opt_get(opts, "image");
keypairs = qemu_opt_get(opts, "=keyvalue-pairs");
if (!pool || !image_name) {
error_setg(errp, "Parameters 'pool' and 'image' are required");
r = -EINVAL;
goto failed_opts;
}
r = rados_create(&s->cluster, user);
if (r < 0) {
error_setg_errno(errp, -r, "error initializing");
goto failed_opts;
}
s->snap = NULL;
if (snap_buf[0] != '\0') {
s->snap = g_strdup(snap_buf);
s->snap = g_strdup(snap);
s->image_name = g_strdup(image_name);
/* try default location when conf=NULL, but ignore failure */
r = rados_conf_read_file(s->cluster, conf);
if (conf && r < 0) {
error_setg_errno(errp, -r, "error reading conf file %s", conf);
goto failed_shutdown;
}
if (strstr(conf, "conf=") == NULL) {
/* try default location, but ignore failure */
rados_conf_read_file(s->cluster, NULL);
} else if (conf[0] != '\0') {
r = qemu_rbd_set_conf(s->cluster, conf, true, errp);
if (r < 0) {
goto failed_shutdown;
}
r = qemu_rbd_set_keypairs(s->cluster, keypairs, errp);
if (r < 0) {
goto failed_shutdown;
}
if (conf[0] != '\0') {
r = qemu_rbd_set_conf(s->cluster, conf, false, errp);
if (mon_host) {
r = rados_conf_set(s->cluster, "mon_host", mon_host);
if (r < 0) {
goto failed_shutdown;
}
@@ -583,13 +635,23 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
goto failed_shutdown;
}
r = rbd_open(s->io_ctx, s->name, &s->image, s->snap);
/* rbd_open is always r/w */
r = rbd_open(s->io_ctx, s->image_name, &s->image, s->snap);
if (r < 0) {
error_setg_errno(errp, -r, "error reading header from %s", s->name);
error_setg_errno(errp, -r, "error reading header from %s",
s->image_name);
goto failed_open;
}
bs->read_only = (s->snap != NULL);
/* If we are using an rbd snapshot, we must be r/o, otherwise
* leave as-is */
if (s->snap != NULL) {
r = bdrv_set_read_only(bs, true, &local_err);
if (r < 0) {
error_propagate(errp, local_err);
goto failed_open;
}
}
qemu_opts_del(opts);
return 0;
@@ -599,11 +661,33 @@ failed_open:
failed_shutdown:
rados_shutdown(s->cluster);
g_free(s->snap);
g_free(s->image_name);
failed_opts:
qemu_opts_del(opts);
g_free(mon_host);
return r;
}
/* Since RBD is currently always opened R/W via the API,
* we just need to check if we are using a snapshot or not, in
* order to determine if we will allow it to be R/W */
static int qemu_rbd_reopen_prepare(BDRVReopenState *state,
BlockReopenQueue *queue, Error **errp)
{
BDRVRBDState *s = state->bs->opaque;
int ret = 0;
if (s->snap && state->flags & BDRV_O_RDWR) {
error_setg(errp,
"Cannot change node '%s' to r/w when using RBD snapshot",
bdrv_get_device_or_node_name(state->bs));
ret = -EINVAL;
}
return ret;
}
static void qemu_rbd_close(BlockDriverState *bs)
{
BDRVRBDState *s = bs->opaque;
@@ -611,6 +695,7 @@ static void qemu_rbd_close(BlockDriverState *bs)
rbd_close(s->image);
rados_ioctx_destroy(s->io_ctx);
g_free(s->snap);
g_free(s->image_name);
rados_shutdown(s->cluster);
}
@@ -1004,18 +1089,19 @@ static QemuOptsList qemu_rbd_create_opts = {
};
static BlockDriver bdrv_rbd = {
.format_name = "rbd",
.instance_size = sizeof(BDRVRBDState),
.bdrv_needs_filename = true,
.bdrv_file_open = qemu_rbd_open,
.bdrv_close = qemu_rbd_close,
.bdrv_create = qemu_rbd_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_get_info = qemu_rbd_getinfo,
.create_opts = &qemu_rbd_create_opts,
.bdrv_getlength = qemu_rbd_getlength,
.bdrv_truncate = qemu_rbd_truncate,
.protocol_name = "rbd",
.format_name = "rbd",
.instance_size = sizeof(BDRVRBDState),
.bdrv_parse_filename = qemu_rbd_parse_filename,
.bdrv_file_open = qemu_rbd_open,
.bdrv_close = qemu_rbd_close,
.bdrv_reopen_prepare = qemu_rbd_reopen_prepare,
.bdrv_create = qemu_rbd_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_get_info = qemu_rbd_getinfo,
.create_opts = &qemu_rbd_create_opts,
.bdrv_getlength = qemu_rbd_getlength,
.bdrv_truncate = qemu_rbd_truncate,
.protocol_name = "rbd",
.bdrv_aio_readv = qemu_rbd_aio_readv,
.bdrv_aio_writev = qemu_rbd_aio_writev,

View File

@@ -155,6 +155,18 @@ static void replication_close(BlockDriverState *bs)
replication_remove(s->rs);
}
static void replication_child_perm(BlockDriverState *bs, BdrvChild *c,
const BdrvChildRole *role,
uint64_t perm, uint64_t shared,
uint64_t *nperm, uint64_t *nshared)
{
*nperm = *nshared = BLK_PERM_CONSISTENT_READ \
| BLK_PERM_WRITE \
| BLK_PERM_WRITE_UNCHANGED;
return;
}
static int64_t replication_getlength(BlockDriverState *bs)
{
return bdrv_getlength(bs->file->bs);
@@ -644,7 +656,7 @@ static void replication_stop(ReplicationState *rs, bool failover, Error **errp)
s->replication_state = BLOCK_REPLICATION_FAILOVER;
commit_active_start(NULL, s->active_disk->bs, s->secondary_disk->bs,
BLOCK_JOB_INTERNAL, 0, BLOCKDEV_ON_ERROR_REPORT,
NULL, replication_done, bs, errp, true);
NULL, replication_done, bs, true, errp);
break;
default:
aio_context_release(aio_context);
@@ -660,7 +672,7 @@ BlockDriver bdrv_replication = {
.bdrv_open = replication_open,
.bdrv_close = replication_close,
.bdrv_child_perm = bdrv_filter_default_perms,
.bdrv_child_perm = replication_child_perm,
.bdrv_getlength = replication_getlength,
.bdrv_co_readv = replication_co_readv,

View File

@@ -13,7 +13,11 @@
*/
#include "qemu/osdep.h"
#include "qapi-visit.h"
#include "qapi/error.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qint.h"
#include "qapi/qobject-input-visitor.h"
#include "qemu/uri.h"
#include "qemu/error-report.h"
#include "qemu/sockets.h"
@@ -374,8 +378,7 @@ struct BDRVSheepdogState {
uint32_t cache_flags;
bool discard_supported;
char *host_spec;
bool is_unix;
SocketAddress *addr;
int fd;
CoMutex lock;
@@ -527,21 +530,77 @@ static void sd_aio_setup(SheepdogAIOCB *acb, BDRVSheepdogState *s,
QLIST_INSERT_HEAD(&s->inflight_aiocb_head, acb, aiocb_siblings);
}
static SocketAddress *sd_socket_address(const char *path,
const char *host, const char *port)
{
SocketAddress *addr = g_new0(SocketAddress, 1);
if (path) {
addr->type = SOCKET_ADDRESS_KIND_UNIX;
addr->u.q_unix.data = g_new0(UnixSocketAddress, 1);
addr->u.q_unix.data->path = g_strdup(path);
} else {
addr->type = SOCKET_ADDRESS_KIND_INET;
addr->u.inet.data = g_new0(InetSocketAddress, 1);
addr->u.inet.data->host = g_strdup(host ?: SD_DEFAULT_ADDR);
addr->u.inet.data->port = g_strdup(port ?: stringify(SD_DEFAULT_PORT));
}
return addr;
}
static SocketAddress *sd_server_config(QDict *options, Error **errp)
{
QDict *server = NULL;
QObject *crumpled_server = NULL;
Visitor *iv = NULL;
SocketAddressFlat *saddr_flat = NULL;
SocketAddress *saddr = NULL;
Error *local_err = NULL;
qdict_extract_subqdict(options, &server, "server.");
crumpled_server = qdict_crumple(server, errp);
if (!crumpled_server) {
goto done;
}
/*
* FIXME .numeric, .to, .ipv4 or .ipv6 don't work with -drive
* server.type=inet. .to doesn't matter, it's ignored anyway.
* That's because when @options come from -blockdev or
* blockdev_add, members are typed according to the QAPI schema,
* but when they come from -drive, they're all QString. The
* visitor expects the former.
*/
iv = qobject_input_visitor_new(crumpled_server);
visit_type_SocketAddressFlat(iv, NULL, &saddr_flat, &local_err);
if (local_err) {
error_propagate(errp, local_err);
goto done;
}
saddr = socket_address_crumple(saddr_flat);
done:
qapi_free_SocketAddressFlat(saddr_flat);
visit_free(iv);
qobject_decref(crumpled_server);
QDECREF(server);
return saddr;
}
/* Return -EIO in case of error, file descriptor on success */
static int connect_to_sdog(BDRVSheepdogState *s, Error **errp)
{
int fd;
if (s->is_unix) {
fd = unix_connect(s->host_spec, errp);
} else {
fd = inet_connect(s->host_spec, errp);
fd = socket_connect(s->addr, NULL, NULL, errp);
if (fd >= 0) {
int ret = socket_set_nodelay(fd);
if (ret < 0) {
error_report("%s", strerror(errno));
}
if (s->addr->type == SOCKET_ADDRESS_KIND_INET && fd >= 0) {
int ret = socket_set_nodelay(fd);
if (ret < 0) {
error_report("%s", strerror(errno));
}
}
@@ -677,7 +736,7 @@ static int do_req(int sockfd, BlockDriverState *bs, SheepdogReq *hdr,
} else {
co = qemu_coroutine_create(do_co_req, &srco);
if (bs) {
qemu_coroutine_enter(co);
bdrv_coroutine_enter(bs, co);
BDRV_POLL_WHILE(bs, !srco.finished);
} else {
qemu_coroutine_enter(co);
@@ -820,8 +879,7 @@ static void coroutine_fn aio_read_response(void *opaque)
case AIOCB_DISCARD_OBJ:
switch (rsp.result) {
case SD_RES_INVALID_PARMS:
error_report("sheep(%s) doesn't support discard command",
s->host_spec);
error_report("server doesn't support discard command");
rsp.result = SD_RES_SUCCESS;
s->discard_supported = false;
break;
@@ -884,7 +942,7 @@ static void co_read_response(void *opaque)
s->co_recv = qemu_coroutine_create(aio_read_response, opaque);
}
aio_co_wake(s->co_recv);
aio_co_enter(s->aio_context, s->co_recv);
}
static void co_write_request(void *opaque)
@@ -914,71 +972,155 @@ static int get_sheep_fd(BDRVSheepdogState *s, Error **errp)
return fd;
}
static int sd_parse_uri(BDRVSheepdogState *s, const char *filename,
char *vdi, uint32_t *snapid, char *tag)
/*
* Parse numeric snapshot ID in @str
* If @str can't be parsed as number, return false.
* Else, if the number is zero or too large, set *@snapid to zero and
* return true.
* Else, set *@snapid to the number and return true.
*/
static bool sd_parse_snapid(const char *str, uint32_t *snapid)
{
URI *uri;
QueryParams *qp = NULL;
int ret = 0;
unsigned long ul;
int ret;
uri = uri_parse(filename);
ret = qemu_strtoul(str, NULL, 10, &ul);
if (ret == -ERANGE) {
ul = ret = 0;
}
if (ret) {
return false;
}
if (ul > UINT32_MAX) {
ul = 0;
}
*snapid = ul;
return true;
}
static bool sd_parse_snapid_or_tag(const char *str,
uint32_t *snapid, char tag[])
{
if (!sd_parse_snapid(str, snapid)) {
*snapid = 0;
if (g_strlcpy(tag, str, SD_MAX_VDI_TAG_LEN) >= SD_MAX_VDI_TAG_LEN) {
return false;
}
} else if (!*snapid) {
return false;
} else {
tag[0] = 0;
}
return true;
}
typedef struct {
const char *path; /* non-null iff transport is tcp */
const char *host; /* valid when transport is tcp */
int port; /* valid when transport is tcp */
char vdi[SD_MAX_VDI_LEN];
char tag[SD_MAX_VDI_TAG_LEN];
uint32_t snap_id;
/* Remainder is only for sd_config_done() */
URI *uri;
QueryParams *qp;
} SheepdogConfig;
static void sd_config_done(SheepdogConfig *cfg)
{
if (cfg->qp) {
query_params_free(cfg->qp);
}
uri_free(cfg->uri);
}
static void sd_parse_uri(SheepdogConfig *cfg, const char *filename,
Error **errp)
{
Error *err = NULL;
QueryParams *qp = NULL;
bool is_unix;
URI *uri;
memset(cfg, 0, sizeof(*cfg));
cfg->uri = uri = uri_parse(filename);
if (!uri) {
return -EINVAL;
error_setg(&err, "invalid URI");
goto out;
}
/* transport */
if (!strcmp(uri->scheme, "sheepdog")) {
s->is_unix = false;
is_unix = false;
} else if (!strcmp(uri->scheme, "sheepdog+tcp")) {
s->is_unix = false;
is_unix = false;
} else if (!strcmp(uri->scheme, "sheepdog+unix")) {
s->is_unix = true;
is_unix = true;
} else {
ret = -EINVAL;
error_setg(&err, "URI scheme must be 'sheepdog', 'sheepdog+tcp',"
" or 'sheepdog+unix'");
goto out;
}
if (uri->path == NULL || !strcmp(uri->path, "/")) {
ret = -EINVAL;
error_setg(&err, "missing file path in URI");
goto out;
}
pstrcpy(vdi, SD_MAX_VDI_LEN, uri->path + 1);
qp = query_params_parse(uri->query);
if (qp->n > 1 || (s->is_unix && !qp->n) || (!s->is_unix && qp->n)) {
ret = -EINVAL;
if (g_strlcpy(cfg->vdi, uri->path + 1, SD_MAX_VDI_LEN)
>= SD_MAX_VDI_LEN) {
error_setg(&err, "VDI name is too long");
goto out;
}
if (s->is_unix) {
cfg->qp = qp = query_params_parse(uri->query);
if (is_unix) {
/* sheepdog+unix:///vdiname?socket=path */
if (uri->server || uri->port || strcmp(qp->p[0].name, "socket")) {
ret = -EINVAL;
if (uri->server || uri->port) {
error_setg(&err, "URI scheme %s doesn't accept a server address",
uri->scheme);
goto out;
}
s->host_spec = g_strdup(qp->p[0].value);
if (!qp->n) {
error_setg(&err,
"URI scheme %s requires query parameter 'socket'",
uri->scheme);
goto out;
}
if (qp->n != 1 || strcmp(qp->p[0].name, "socket")) {
error_setg(&err, "unexpected query parameters");
goto out;
}
cfg->path = qp->p[0].value;
} else {
/* sheepdog[+tcp]://[host:port]/vdiname */
s->host_spec = g_strdup_printf("%s:%d", uri->server ?: SD_DEFAULT_ADDR,
uri->port ?: SD_DEFAULT_PORT);
if (qp->n) {
error_setg(&err, "unexpected query parameters");
goto out;
}
cfg->host = uri->server;
cfg->port = uri->port;
}
/* snapshot tag */
if (uri->fragment) {
*snapid = strtoul(uri->fragment, NULL, 10);
if (*snapid == 0) {
pstrcpy(tag, SD_MAX_VDI_TAG_LEN, uri->fragment);
if (!sd_parse_snapid_or_tag(uri->fragment,
&cfg->snap_id, cfg->tag)) {
error_setg(&err, "'%s' is not a valid snapshot ID",
uri->fragment);
goto out;
}
} else {
*snapid = CURRENT_VDI_ID; /* search current vdi */
cfg->snap_id = CURRENT_VDI_ID; /* search current vdi */
}
out:
if (qp) {
query_params_free(qp);
if (err) {
error_propagate(errp, err);
sd_config_done(cfg);
}
uri_free(uri);
return ret;
}
/*
@@ -998,12 +1140,13 @@ out:
* You can run VMs outside the Sheepdog cluster by specifying
* `hostname' and `port' (experimental).
*/
static int parse_vdiname(BDRVSheepdogState *s, const char *filename,
char *vdi, uint32_t *snapid, char *tag)
static void parse_vdiname(SheepdogConfig *cfg, const char *filename,
Error **errp)
{
Error *err = NULL;
char *p, *q, *uri;
const char *host_spec, *vdi_spec;
int nr_sep, ret;
int nr_sep;
strstart(filename, "sheepdog:", &filename);
p = q = g_strdup(filename);
@@ -1038,12 +1181,60 @@ static int parse_vdiname(BDRVSheepdogState *s, const char *filename,
uri = g_strdup_printf("sheepdog://%s/%s", host_spec, vdi_spec);
ret = sd_parse_uri(s, uri, vdi, snapid, tag);
/*
* FIXME We to escape URI meta-characters, e.g. "x?y=z"
* produces "sheepdog://x?y=z". Because of that ...
*/
sd_parse_uri(cfg, uri, &err);
if (err) {
/*
* ... this can fail, but the error message is misleading.
* Replace it by the traditional useless one until the
* escaping is fixed.
*/
error_free(err);
error_setg(errp, "Can't parse filename");
}
g_free(q);
g_free(uri);
}
return ret;
static void sd_parse_filename(const char *filename, QDict *options,
Error **errp)
{
Error *err = NULL;
SheepdogConfig cfg;
char buf[32];
if (strstr(filename, "://")) {
sd_parse_uri(&cfg, filename, &err);
} else {
parse_vdiname(&cfg, filename, &err);
}
if (err) {
error_propagate(errp, err);
return;
}
if (cfg.path) {
qdict_set_default_str(options, "server.path", cfg.path);
qdict_set_default_str(options, "server.type", "unix");
} else {
qdict_set_default_str(options, "server.type", "inet");
qdict_set_default_str(options, "server.host",
cfg.host ?: SD_DEFAULT_ADDR);
snprintf(buf, sizeof(buf), "%d", cfg.port ?: SD_DEFAULT_PORT);
qdict_set_default_str(options, "server.port", buf);
}
qdict_set_default_str(options, "vdi", cfg.vdi);
qdict_set_default_str(options, "tag", cfg.tag);
if (cfg.snap_id) {
snprintf(buf, sizeof(buf), "%d", cfg.snap_id);
qdict_set_default_str(options, "snap-id", buf);
}
sd_config_done(&cfg);
}
static int find_vdi_name(BDRVSheepdogState *s, const char *filename,
@@ -1357,15 +1548,21 @@ static void sd_attach_aio_context(BlockDriverState *bs,
co_read_response, NULL, NULL, s);
}
/* TODO Convert to fine grained options */
static QemuOptsList runtime_opts = {
.name = "sheepdog",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "filename",
.name = "vdi",
.type = QEMU_OPT_STRING,
},
{
.name = "snap-id",
.type = QEMU_OPT_NUMBER,
},
{
.name = "tag",
.type = QEMU_OPT_STRING,
.help = "URL to the sheepdog image",
},
{ /* end of list */ }
},
@@ -1377,12 +1574,11 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags,
int ret, fd;
uint32_t vid = 0;
BDRVSheepdogState *s = bs->opaque;
char vdi[SD_MAX_VDI_LEN], tag[SD_MAX_VDI_TAG_LEN];
uint32_t snapid;
const char *vdi, *snap_id_str, *tag;
uint64_t snap_id;
char *buf = NULL;
QemuOpts *opts;
Error *local_err = NULL;
const char *filename;
s->bs = bs;
s->aio_context = bdrv_get_aio_context(bs);
@@ -1392,37 +1588,63 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags,
if (local_err) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto out;
goto err_no_fd;
}
filename = qemu_opt_get(opts, "filename");
s->addr = sd_server_config(options, errp);
if (!s->addr) {
ret = -EINVAL;
goto err_no_fd;
}
vdi = qemu_opt_get(opts, "vdi");
snap_id_str = qemu_opt_get(opts, "snap-id");
snap_id = qemu_opt_get_number(opts, "snap-id", CURRENT_VDI_ID);
tag = qemu_opt_get(opts, "tag");
if (!vdi) {
error_setg(errp, "parameter 'vdi' is missing");
ret = -EINVAL;
goto err_no_fd;
}
if (strlen(vdi) >= SD_MAX_VDI_LEN) {
error_setg(errp, "value of parameter 'vdi' is too long");
ret = -EINVAL;
goto err_no_fd;
}
if (snap_id > UINT32_MAX) {
snap_id = 0;
}
if (snap_id_str && !snap_id) {
error_setg(errp, "'snap-id=%s' is not a valid snapshot ID",
snap_id_str);
ret = -EINVAL;
goto err_no_fd;
}
if (!tag) {
tag = "";
}
if (tag && strlen(tag) >= SD_MAX_VDI_TAG_LEN) {
error_setg(errp, "value of parameter 'tag' is too long");
ret = -EINVAL;
goto err_no_fd;
}
QLIST_INIT(&s->inflight_aio_head);
QLIST_INIT(&s->failed_aio_head);
QLIST_INIT(&s->inflight_aiocb_head);
s->fd = -1;
memset(vdi, 0, sizeof(vdi));
memset(tag, 0, sizeof(tag));
if (strstr(filename, "://")) {
ret = sd_parse_uri(s, filename, vdi, &snapid, tag);
} else {
ret = parse_vdiname(s, filename, vdi, &snapid, tag);
}
if (ret < 0) {
error_setg(errp, "Can't parse filename");
goto out;
}
s->fd = get_sheep_fd(s, errp);
if (s->fd < 0) {
ret = s->fd;
goto out;
goto err_no_fd;
}
ret = find_vdi_name(s, vdi, snapid, tag, &vid, true, errp);
ret = find_vdi_name(s, vdi, (uint32_t)snap_id, tag, &vid, true, errp);
if (ret) {
goto out;
goto err;
}
/*
@@ -1435,7 +1657,7 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags,
}
s->discard_supported = true;
if (snapid || tag[0] != '\0') {
if (snap_id || tag[0]) {
DPRINTF("%" PRIx32 " snapshot inode was open.\n", vid);
s->is_snapshot = true;
}
@@ -1443,7 +1665,7 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags,
fd = connect_to_sdog(s, errp);
if (fd < 0) {
ret = fd;
goto out;
goto err;
}
buf = g_malloc(SD_INODE_SIZE);
@@ -1454,7 +1676,7 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags,
if (ret) {
error_setg(errp, "Can't read snapshot inode");
goto out;
goto err;
}
memcpy(&s->inode, buf, sizeof(s->inode));
@@ -1466,12 +1688,12 @@ static int sd_open(BlockDriverState *bs, QDict *options, int flags,
qemu_opts_del(opts);
g_free(buf);
return 0;
out:
err:
aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd,
false, NULL, NULL, NULL, NULL);
if (s->fd >= 0) {
closesocket(s->fd);
}
closesocket(s->fd);
err_no_fd:
qemu_opts_del(opts);
g_free(buf);
return ret;
@@ -1685,6 +1907,7 @@ static int parse_redundancy(BDRVSheepdogState *s, const char *opt)
}
copy = strtol(n1, NULL, 10);
/* FIXME fix error checking by switching to qemu_strtol() */
if (copy > SD_MAX_COPIES || copy < 1) {
return -EINVAL;
}
@@ -1699,6 +1922,7 @@ static int parse_redundancy(BDRVSheepdogState *s, const char *opt)
}
parity = strtol(n2, NULL, 10);
/* FIXME fix error checking by switching to qemu_strtol() */
if (parity >= SD_EC_MAX_STRIP || parity < 1) {
return -EINVAL;
}
@@ -1737,29 +1961,34 @@ static int parse_block_size_shift(BDRVSheepdogState *s, QemuOpts *opt)
static int sd_create(const char *filename, QemuOpts *opts,
Error **errp)
{
Error *err = NULL;
int ret = 0;
uint32_t vid = 0;
char *backing_file = NULL;
char *buf = NULL;
BDRVSheepdogState *s;
char tag[SD_MAX_VDI_TAG_LEN];
uint32_t snapid;
SheepdogConfig cfg;
uint64_t max_vdi_size;
bool prealloc = false;
s = g_new0(BDRVSheepdogState, 1);
memset(tag, 0, sizeof(tag));
if (strstr(filename, "://")) {
ret = sd_parse_uri(s, filename, s->name, &snapid, tag);
sd_parse_uri(&cfg, filename, &err);
} else {
ret = parse_vdiname(s, filename, s->name, &snapid, tag);
parse_vdiname(&cfg, filename, &err);
}
if (ret < 0) {
error_setg(errp, "Can't parse filename");
if (err) {
error_propagate(errp, err);
goto out;
}
buf = cfg.port ? g_strdup_printf("%d", cfg.port) : NULL;
s->addr = sd_socket_address(cfg.path, cfg.host, buf);
g_free(buf);
strcpy(s->name, cfg.vdi);
sd_config_done(&cfg);
s->inode.vdi_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
@@ -1829,14 +2058,12 @@ static int sd_create(const char *filename, QemuOpts *opts,
if (s->inode.block_size_shift == 0) {
SheepdogVdiReq hdr;
SheepdogClusterRsp *rsp = (SheepdogClusterRsp *)&hdr;
Error *local_err = NULL;
int fd;
unsigned int wlen = 0, rlen = 0;
fd = connect_to_sdog(s, &local_err);
fd = connect_to_sdog(s, errp);
if (fd < 0) {
error_report_err(local_err);
ret = -EIO;
ret = fd;
goto out;
}
@@ -1922,7 +2149,7 @@ static void sd_close(BlockDriverState *bs)
aio_set_fd_handler(bdrv_get_aio_context(bs), s->fd,
false, NULL, NULL, NULL, NULL);
closesocket(s->fd);
g_free(s->host_spec);
qapi_free_SocketAddress(s->addr);
}
static int64_t sd_getlength(BlockDriverState *bs)
@@ -2367,19 +2594,16 @@ static int sd_snapshot_goto(BlockDriverState *bs, const char *snapshot_id)
BDRVSheepdogState *old_s;
char tag[SD_MAX_VDI_TAG_LEN];
uint32_t snapid = 0;
int ret = 0;
int ret;
if (!sd_parse_snapid_or_tag(snapshot_id, &snapid, tag)) {
return -EINVAL;
}
old_s = g_new(BDRVSheepdogState, 1);
memcpy(old_s, s, sizeof(BDRVSheepdogState));
snapid = strtoul(snapshot_id, NULL, 10);
if (snapid) {
tag[0] = 0;
} else {
pstrcpy(tag, sizeof(tag), snapshot_id);
}
ret = reload_inode(s, snapid, tag);
if (ret) {
goto out;
@@ -2405,18 +2629,15 @@ out:
#define NR_BATCHED_DISCARD 128
static bool remove_objects(BDRVSheepdogState *s)
static int remove_objects(BDRVSheepdogState *s, Error **errp)
{
int fd, i = 0, nr_objs = 0;
Error *local_err = NULL;
int ret = 0;
bool result = true;
int ret;
SheepdogInode *inode = &s->inode;
fd = connect_to_sdog(s, &local_err);
fd = connect_to_sdog(s, errp);
if (fd < 0) {
error_report_err(local_err);
return false;
return fd;
}
nr_objs = count_data_objs(inode);
@@ -2446,15 +2667,15 @@ static bool remove_objects(BDRVSheepdogState *s)
data_vdi_id[start_idx]),
false, s->cache_flags);
if (ret < 0) {
error_report("failed to discard snapshot inode.");
result = false;
error_setg(errp, "Failed to discard snapshot inode");
goto out;
}
}
ret = 0;
out:
closesocket(fd);
return result;
return ret;
}
static int sd_snapshot_delete(BlockDriverState *bs,
@@ -2462,9 +2683,12 @@ static int sd_snapshot_delete(BlockDriverState *bs,
const char *name,
Error **errp)
{
/*
* FIXME should delete the snapshot matching both @snapshot_id and
* @name, but @name not used here
*/
unsigned long snap_id = 0;
char snap_tag[SD_MAX_VDI_TAG_LEN];
Error *local_err = NULL;
int fd, ret;
char buf[SD_MAX_VDI_LEN + SD_MAX_VDI_TAG_LEN];
BDRVSheepdogState *s = bs->opaque;
@@ -2477,15 +2701,22 @@ static int sd_snapshot_delete(BlockDriverState *bs,
};
SheepdogVdiRsp *rsp = (SheepdogVdiRsp *)&hdr;
if (!remove_objects(s)) {
return -1;
ret = remove_objects(s, errp);
if (ret) {
return ret;
}
memset(buf, 0, sizeof(buf));
memset(snap_tag, 0, sizeof(snap_tag));
pstrcpy(buf, SD_MAX_VDI_LEN, s->name);
/* TODO Use sd_parse_snapid() once this mess is cleaned up */
ret = qemu_strtoul(snapshot_id, NULL, 10, &snap_id);
if (ret || snap_id > UINT32_MAX) {
/*
* FIXME Since qemu_strtoul() returns -EINVAL when
* @snapshot_id is null, @snapshot_id is mandatory. Correct
* would be to require at least one of @snapshot_id and @name.
*/
error_setg(errp, "Invalid snapshot ID: %s",
snapshot_id ? snapshot_id : "<null>");
return -EINVAL;
@@ -2494,40 +2725,42 @@ static int sd_snapshot_delete(BlockDriverState *bs,
if (snap_id) {
hdr.snapid = (uint32_t) snap_id;
} else {
/* FIXME I suspect we should use @name here */
/* FIXME don't truncate silently */
pstrcpy(snap_tag, sizeof(snap_tag), snapshot_id);
pstrcpy(buf + SD_MAX_VDI_LEN, SD_MAX_VDI_TAG_LEN, snap_tag);
}
ret = find_vdi_name(s, s->name, snap_id, snap_tag, &vid, true,
&local_err);
ret = find_vdi_name(s, s->name, snap_id, snap_tag, &vid, true, errp);
if (ret) {
return ret;
}
fd = connect_to_sdog(s, &local_err);
fd = connect_to_sdog(s, errp);
if (fd < 0) {
error_report_err(local_err);
return -1;
return fd;
}
ret = do_req(fd, s->bs, (SheepdogReq *)&hdr,
buf, &wlen, &rlen);
closesocket(fd);
if (ret) {
error_setg_errno(errp, -ret, "Couldn't send request to server");
return ret;
}
switch (rsp->result) {
case SD_RES_NO_VDI:
error_report("%s was already deleted", s->name);
error_setg(errp, "Can't find the snapshot");
return -ENOENT;
case SD_RES_SUCCESS:
break;
default:
error_report("%s, %s", sd_strerror(rsp->result), s->name);
return -1;
error_setg(errp, "%s", sd_strerror(rsp->result));
return -EIO;
}
return ret;
return 0;
}
static int sd_snapshot_list(BlockDriverState *bs, QEMUSnapshotInfo **psn_tab)
@@ -2832,7 +3065,7 @@ static BlockDriver bdrv_sheepdog = {
.format_name = "sheepdog",
.protocol_name = "sheepdog",
.instance_size = sizeof(BDRVSheepdogState),
.bdrv_needs_filename = true,
.bdrv_parse_filename = sd_parse_filename,
.bdrv_file_open = sd_open,
.bdrv_reopen_prepare = sd_reopen_prepare,
.bdrv_reopen_commit = sd_reopen_commit,
@@ -2868,7 +3101,7 @@ static BlockDriver bdrv_sheepdog_tcp = {
.format_name = "sheepdog",
.protocol_name = "sheepdog+tcp",
.instance_size = sizeof(BDRVSheepdogState),
.bdrv_needs_filename = true,
.bdrv_parse_filename = sd_parse_filename,
.bdrv_file_open = sd_open,
.bdrv_reopen_prepare = sd_reopen_prepare,
.bdrv_reopen_commit = sd_reopen_commit,
@@ -2904,7 +3137,7 @@ static BlockDriver bdrv_sheepdog_unix = {
.format_name = "sheepdog",
.protocol_name = "sheepdog+unix",
.instance_size = sizeof(BDRVSheepdogState),
.bdrv_needs_filename = true,
.bdrv_parse_filename = sd_parse_filename,
.bdrv_file_open = sd_open,
.bdrv_reopen_prepare = sd_reopen_prepare,
.bdrv_reopen_commit = sd_reopen_commit,

View File

@@ -27,6 +27,7 @@
#include "block/block_int.h"
#include "qapi/error.h"
#include "qapi/qmp/qerror.h"
#include "qapi/qmp/qstring.h"
QemuOptsList internal_snapshot_opts = {
.name = "snapshot",
@@ -189,14 +190,33 @@ int bdrv_snapshot_goto(BlockDriverState *bs,
}
if (bs->file) {
BlockDriverState *file;
QDict *options = qdict_clone_shallow(bs->options);
QDict *file_options;
file = bs->file->bs;
/* Prevent it from getting deleted when detached from bs */
bdrv_ref(file);
qdict_extract_subqdict(options, &file_options, "file.");
QDECREF(file_options);
qdict_put(options, "file", qstring_from_str(bdrv_get_node_name(file)));
drv->bdrv_close(bs);
ret = bdrv_snapshot_goto(bs->file->bs, snapshot_id);
open_ret = drv->bdrv_open(bs, NULL, bs->open_flags, NULL);
bdrv_unref_child(bs, bs->file);
bs->file = NULL;
ret = bdrv_snapshot_goto(file, snapshot_id);
open_ret = drv->bdrv_open(bs, options, bs->open_flags, NULL);
QDECREF(options);
if (open_ret < 0) {
bdrv_unref(bs->file->bs);
bdrv_unref(file);
bs->drv = NULL;
return open_ret;
}
assert(bs->file->bs == file);
bdrv_unref(file);
return ret;
}

View File

@@ -601,7 +601,15 @@ static InetSocketAddress *ssh_config(QDict *options, Error **errp)
goto out;
}
iv = qobject_input_visitor_new(crumpled_addr, true);
/*
* FIXME .numeric, .to, .ipv4 or .ipv6 don't work with -drive.
* .to doesn't matter, it's ignored anyway.
* That's because when @options come from -blockdev or
* blockdev_add, members are typed according to the QAPI schema,
* but when they come from -drive, they're all QString. The
* visitor expects the former.
*/
iv = qobject_input_visitor_new(crumpled_addr);
visit_type_InetSocketAddress(iv, NULL, &inet, &local_error);
if (local_error) {
error_propagate(errp, local_error);
@@ -673,7 +681,7 @@ static int connect_to_ssh(BDRVSSHState *s, QDict *options,
}
/* Open the socket and connect. */
s->sock = inet_connect_saddr(s->inet, errp, NULL, NULL);
s->sock = inet_connect_saddr(s->inet, NULL, NULL, errp);
if (s->sock < 0) {
ret = -EIO;
goto err;

View File

@@ -110,3 +110,20 @@ qed_aio_write_data(void *s, void *acb, int ret, uint64_t offset, size_t len) "s
qed_aio_write_prefill(void *s, void *acb, uint64_t start, size_t len, uint64_t offset) "s %p acb %p start %"PRIu64" len %zu offset %"PRIu64
qed_aio_write_postfill(void *s, void *acb, uint64_t start, size_t len, uint64_t offset) "s %p acb %p start %"PRIu64" len %zu offset %"PRIu64
qed_aio_write_main(void *s, void *acb, int ret, uint64_t offset, size_t len) "s %p acb %p ret %d offset %"PRIu64" len %zu"
# block/vxhs.c
vxhs_iio_callback(int error) "ctx is NULL: error %d"
vxhs_iio_callback_chnfail(int err, int error) "QNIO channel failed, no i/o %d, %d"
vxhs_iio_callback_unknwn(int opcode, int err) "unexpected opcode %d, errno %d"
vxhs_aio_rw_invalid(int req) "Invalid I/O request iodir %d"
vxhs_aio_rw_ioerr(char *guid, int iodir, uint64_t size, uint64_t off, void *acb, int ret, int err) "IO ERROR (vDisk %s) FOR : Read/Write = %d size = %"PRIu64" offset = %"PRIu64" ACB = %p. Error = %d, errno = %d"
vxhs_get_vdisk_stat_err(char *guid, int ret, int err) "vDisk (%s) stat ioctl failed, ret = %d, errno = %d"
vxhs_get_vdisk_stat(char *vdisk_guid, uint64_t vdisk_size) "vDisk %s stat ioctl returned size %"PRIu64
vxhs_complete_aio(void *acb, uint64_t ret) "aio failed acb %p ret %"PRIu64
vxhs_parse_uri_filename(const char *filename) "URI passed via bdrv_parse_filename %s"
vxhs_open_vdiskid(const char *vdisk_id) "Opening vdisk-id %s"
vxhs_open_hostinfo(char *of_vsa_addr, int port) "Adding host %s:%d to BDRVVXHSState"
vxhs_open_iio_open(const char *host) "Failed to connect to storage agent on host %s"
vxhs_parse_uri_hostinfo(char *host, int port) "Host: IP %s, Port %d"
vxhs_close(char *vdisk_guid) "Closing vdisk %s"
vxhs_get_creds(const char *cacert, const char *client_key, const char *client_cert) "cacert %s, client_key %s, client_cert %s"

View File

@@ -1156,8 +1156,6 @@ static int vvfat_open(BlockDriverState *bs, QDict *options, int flags,
s->current_cluster=0xffffffff;
/* read only is the default for safety */
bs->read_only = true;
s->qcow = NULL;
s->qcow_filename = NULL;
s->fat2 = NULL;
@@ -1169,11 +1167,24 @@ static int vvfat_open(BlockDriverState *bs, QDict *options, int flags,
s->sector_count = cyls * heads * secs - (s->first_sectors_number - 1);
if (qemu_opt_get_bool(opts, "rw", false)) {
ret = enable_write_target(bs, errp);
if (ret < 0) {
if (!bdrv_is_read_only(bs)) {
ret = enable_write_target(bs, errp);
if (ret < 0) {
goto fail;
}
} else {
ret = -EPERM;
error_setg(errp,
"Unable to set VVFAT to 'rw' when drive is read-only");
goto fail;
}
} else {
/* read only is the default for safety */
ret = bdrv_set_read_only(bs, true, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
goto fail;
}
bs->read_only = false;
}
bs->total_sectors = cyls * heads * secs;
@@ -1394,7 +1405,13 @@ static int vvfat_read(BlockDriverState *bs, int64_t sector_num,
return -1;
if (s->qcow) {
int n;
if (bdrv_is_allocated(s->qcow->bs, sector_num, nb_sectors-i, &n)) {
int ret;
ret = bdrv_is_allocated(s->qcow->bs, sector_num,
nb_sectors - i, &n);
if (ret < 0) {
return ret;
}
if (ret) {
DLOG(fprintf(stderr, "sectors %d+%d allocated\n",
(int)sector_num, n));
if (bdrv_read(s->qcow, sector_num, buf + i * 0x200, n)) {
@@ -1668,7 +1685,8 @@ static inline uint32_t modified_fat_get(BDRVVVFATState* s,
}
}
static inline int cluster_was_modified(BDRVVVFATState* s, uint32_t cluster_num)
static inline bool cluster_was_modified(BDRVVVFATState *s,
uint32_t cluster_num)
{
int was_modified = 0;
int i, dummy;
@@ -1683,7 +1701,13 @@ static inline int cluster_was_modified(BDRVVVFATState* s, uint32_t cluster_num)
1, &dummy);
}
return was_modified;
/*
* Note that this treats failures to learn allocation status the
* same as if an allocation has occurred. It's as safe as
* anything else, given that a failure to learn allocation status
* will probably result in more failures.
*/
return !!was_modified;
}
static const char* get_basename(const char* path)
@@ -1833,6 +1857,9 @@ static uint32_t get_cluster_count_for_direntry(BDRVVVFATState* s,
int res;
res = bdrv_is_allocated(s->qcow->bs, offset + i, 1, &dummy);
if (res < 0) {
return -1;
}
if (!res) {
res = vvfat_read(s->bs, offset, s->cluster_buffer, 1);
if (res) {

575
block/vxhs.c Normal file
View File

@@ -0,0 +1,575 @@
/*
* QEMU Block driver for Veritas HyperScale (VxHS)
*
* Copyright (c) 2017 Veritas Technologies LLC.
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*
*/
#include "qemu/osdep.h"
#include <qnio/qnio_api.h>
#include <sys/param.h>
#include "block/block_int.h"
#include "qapi/qmp/qerror.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qstring.h"
#include "trace.h"
#include "qemu/uri.h"
#include "qapi/error.h"
#include "qemu/uuid.h"
#include "crypto/tlscredsx509.h"
#define VXHS_OPT_FILENAME "filename"
#define VXHS_OPT_VDISK_ID "vdisk-id"
#define VXHS_OPT_SERVER "server"
#define VXHS_OPT_HOST "host"
#define VXHS_OPT_PORT "port"
/* Only accessed under QEMU global mutex */
static uint32_t vxhs_ref;
typedef enum {
VDISK_AIO_READ,
VDISK_AIO_WRITE,
} VDISKAIOCmd;
/*
* HyperScale AIO callbacks structure
*/
typedef struct VXHSAIOCB {
BlockAIOCB common;
int err;
} VXHSAIOCB;
typedef struct VXHSvDiskHostsInfo {
void *dev_handle; /* Device handle */
char *host; /* Host name or IP */
int port; /* Host's port number */
} VXHSvDiskHostsInfo;
/*
* Structure per vDisk maintained for state
*/
typedef struct BDRVVXHSState {
VXHSvDiskHostsInfo vdisk_hostinfo; /* Per host info */
char *vdisk_guid;
char *tlscredsid; /* tlscredsid */
} BDRVVXHSState;
static void vxhs_complete_aio_bh(void *opaque)
{
VXHSAIOCB *acb = opaque;
BlockCompletionFunc *cb = acb->common.cb;
void *cb_opaque = acb->common.opaque;
int ret = 0;
if (acb->err != 0) {
trace_vxhs_complete_aio(acb, acb->err);
ret = (-EIO);
}
qemu_aio_unref(acb);
cb(cb_opaque, ret);
}
/*
* Called from a libqnio thread
*/
static void vxhs_iio_callback(void *ctx, uint32_t opcode, uint32_t error)
{
VXHSAIOCB *acb = NULL;
switch (opcode) {
case IRP_READ_REQUEST:
case IRP_WRITE_REQUEST:
/*
* ctx is VXHSAIOCB*
* ctx is NULL if error is QNIOERROR_CHANNEL_HUP
*/
if (ctx) {
acb = ctx;
} else {
trace_vxhs_iio_callback(error);
goto out;
}
if (error) {
if (!acb->err) {
acb->err = error;
}
trace_vxhs_iio_callback(error);
}
aio_bh_schedule_oneshot(bdrv_get_aio_context(acb->common.bs),
vxhs_complete_aio_bh, acb);
break;
default:
if (error == QNIOERROR_HUP) {
/*
* Channel failed, spontaneous notification,
* not in response to I/O
*/
trace_vxhs_iio_callback_chnfail(error, errno);
} else {
trace_vxhs_iio_callback_unknwn(opcode, error);
}
break;
}
out:
return;
}
static QemuOptsList runtime_opts = {
.name = "vxhs",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = VXHS_OPT_FILENAME,
.type = QEMU_OPT_STRING,
.help = "URI to the Veritas HyperScale image",
},
{
.name = VXHS_OPT_VDISK_ID,
.type = QEMU_OPT_STRING,
.help = "UUID of the VxHS vdisk",
},
{
.name = "tls-creds",
.type = QEMU_OPT_STRING,
.help = "ID of the TLS/SSL credentials to use",
},
{ /* end of list */ }
},
};
static QemuOptsList runtime_tcp_opts = {
.name = "vxhs_tcp",
.head = QTAILQ_HEAD_INITIALIZER(runtime_tcp_opts.head),
.desc = {
{
.name = VXHS_OPT_HOST,
.type = QEMU_OPT_STRING,
.help = "host address (ipv4 addresses)",
},
{
.name = VXHS_OPT_PORT,
.type = QEMU_OPT_NUMBER,
.help = "port number on which VxHSD is listening (default 9999)",
.def_value_str = "9999"
},
{ /* end of list */ }
},
};
/*
* Parse incoming URI and populate *options with the host
* and device information
*/
static int vxhs_parse_uri(const char *filename, QDict *options)
{
URI *uri = NULL;
char *port;
int ret = 0;
trace_vxhs_parse_uri_filename(filename);
uri = uri_parse(filename);
if (!uri || !uri->server || !uri->path) {
uri_free(uri);
return -EINVAL;
}
qdict_put(options, VXHS_OPT_SERVER".host", qstring_from_str(uri->server));
if (uri->port) {
port = g_strdup_printf("%d", uri->port);
qdict_put(options, VXHS_OPT_SERVER".port", qstring_from_str(port));
g_free(port);
}
qdict_put(options, "vdisk-id", qstring_from_str(uri->path));
trace_vxhs_parse_uri_hostinfo(uri->server, uri->port);
uri_free(uri);
return ret;
}
static void vxhs_parse_filename(const char *filename, QDict *options,
Error **errp)
{
if (qdict_haskey(options, "vdisk-id") || qdict_haskey(options, "server")) {
error_setg(errp, "vdisk-id/server and a file name may not be specified "
"at the same time");
return;
}
if (strstr(filename, "://")) {
int ret = vxhs_parse_uri(filename, options);
if (ret < 0) {
error_setg(errp, "Invalid URI. URI should be of the form "
" vxhs://<host_ip>:<port>/<vdisk-id>");
}
}
}
static int vxhs_init_and_ref(void)
{
if (vxhs_ref++ == 0) {
if (iio_init(QNIO_VERSION, vxhs_iio_callback)) {
return -ENODEV;
}
}
return 0;
}
static void vxhs_unref(void)
{
if (--vxhs_ref == 0) {
iio_fini();
}
}
static void vxhs_get_tls_creds(const char *id, char **cacert,
char **key, char **cert, Error **errp)
{
Object *obj;
QCryptoTLSCreds *creds;
QCryptoTLSCredsX509 *creds_x509;
obj = object_resolve_path_component(
object_get_objects_root(), id);
if (!obj) {
error_setg(errp, "No TLS credentials with id '%s'",
id);
return;
}
creds_x509 = (QCryptoTLSCredsX509 *)
object_dynamic_cast(obj, TYPE_QCRYPTO_TLS_CREDS_X509);
if (!creds_x509) {
error_setg(errp, "Object with id '%s' is not TLS credentials",
id);
return;
}
creds = &creds_x509->parent_obj;
if (creds->endpoint != QCRYPTO_TLS_CREDS_ENDPOINT_CLIENT) {
error_setg(errp,
"Expecting TLS credentials with a client endpoint");
return;
}
/*
* Get the cacert, client_cert and client_key file names.
*/
if (!creds->dir) {
error_setg(errp, "TLS object missing 'dir' property value");
return;
}
*cacert = g_strdup_printf("%s/%s", creds->dir,
QCRYPTO_TLS_CREDS_X509_CA_CERT);
*cert = g_strdup_printf("%s/%s", creds->dir,
QCRYPTO_TLS_CREDS_X509_CLIENT_CERT);
*key = g_strdup_printf("%s/%s", creds->dir,
QCRYPTO_TLS_CREDS_X509_CLIENT_KEY);
}
static int vxhs_open(BlockDriverState *bs, QDict *options,
int bdrv_flags, Error **errp)
{
BDRVVXHSState *s = bs->opaque;
void *dev_handlep;
QDict *backing_options = NULL;
QemuOpts *opts = NULL;
QemuOpts *tcp_opts = NULL;
char *of_vsa_addr = NULL;
Error *local_err = NULL;
const char *vdisk_id_opt;
const char *server_host_opt;
int ret = 0;
char *cacert = NULL;
char *client_key = NULL;
char *client_cert = NULL;
ret = vxhs_init_and_ref();
if (ret < 0) {
ret = -EINVAL;
goto out;
}
/* Create opts info from runtime_opts and runtime_tcp_opts list */
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
tcp_opts = qemu_opts_create(&runtime_tcp_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
ret = -EINVAL;
goto out;
}
/* vdisk-id is the disk UUID */
vdisk_id_opt = qemu_opt_get(opts, VXHS_OPT_VDISK_ID);
if (!vdisk_id_opt) {
error_setg(&local_err, QERR_MISSING_PARAMETER, VXHS_OPT_VDISK_ID);
ret = -EINVAL;
goto out;
}
/* vdisk-id may contain a leading '/' */
if (strlen(vdisk_id_opt) > UUID_FMT_LEN + 1) {
error_setg(&local_err, "vdisk-id cannot be more than %d characters",
UUID_FMT_LEN);
ret = -EINVAL;
goto out;
}
s->vdisk_guid = g_strdup(vdisk_id_opt);
trace_vxhs_open_vdiskid(vdisk_id_opt);
/* get the 'server.' arguments */
qdict_extract_subqdict(options, &backing_options, VXHS_OPT_SERVER".");
qemu_opts_absorb_qdict(tcp_opts, backing_options, &local_err);
if (local_err != NULL) {
ret = -EINVAL;
goto out;
}
server_host_opt = qemu_opt_get(tcp_opts, VXHS_OPT_HOST);
if (!server_host_opt) {
error_setg(&local_err, QERR_MISSING_PARAMETER,
VXHS_OPT_SERVER"."VXHS_OPT_HOST);
ret = -EINVAL;
goto out;
}
if (strlen(server_host_opt) > MAXHOSTNAMELEN) {
error_setg(&local_err, "server.host cannot be more than %d characters",
MAXHOSTNAMELEN);
ret = -EINVAL;
goto out;
}
/* check if we got tls-creds via the --object argument */
s->tlscredsid = g_strdup(qemu_opt_get(opts, "tls-creds"));
if (s->tlscredsid) {
vxhs_get_tls_creds(s->tlscredsid, &cacert, &client_key,
&client_cert, &local_err);
if (local_err != NULL) {
ret = -EINVAL;
goto out;
}
trace_vxhs_get_creds(cacert, client_key, client_cert);
}
s->vdisk_hostinfo.host = g_strdup(server_host_opt);
s->vdisk_hostinfo.port = g_ascii_strtoll(qemu_opt_get(tcp_opts,
VXHS_OPT_PORT),
NULL, 0);
trace_vxhs_open_hostinfo(s->vdisk_hostinfo.host,
s->vdisk_hostinfo.port);
of_vsa_addr = g_strdup_printf("of://%s:%d",
s->vdisk_hostinfo.host,
s->vdisk_hostinfo.port);
/*
* Open qnio channel to storage agent if not opened before
*/
dev_handlep = iio_open(of_vsa_addr, s->vdisk_guid, 0,
cacert, client_key, client_cert);
if (dev_handlep == NULL) {
trace_vxhs_open_iio_open(of_vsa_addr);
ret = -ENODEV;
goto out;
}
s->vdisk_hostinfo.dev_handle = dev_handlep;
out:
g_free(of_vsa_addr);
QDECREF(backing_options);
qemu_opts_del(tcp_opts);
qemu_opts_del(opts);
g_free(cacert);
g_free(client_key);
g_free(client_cert);
if (ret < 0) {
vxhs_unref();
error_propagate(errp, local_err);
g_free(s->vdisk_hostinfo.host);
g_free(s->vdisk_guid);
g_free(s->tlscredsid);
s->vdisk_guid = NULL;
}
return ret;
}
static const AIOCBInfo vxhs_aiocb_info = {
.aiocb_size = sizeof(VXHSAIOCB)
};
/*
* This allocates QEMU-VXHS callback for each IO
* and is passed to QNIO. When QNIO completes the work,
* it will be passed back through the callback.
*/
static BlockAIOCB *vxhs_aio_rw(BlockDriverState *bs, int64_t sector_num,
QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque,
VDISKAIOCmd iodir)
{
VXHSAIOCB *acb = NULL;
BDRVVXHSState *s = bs->opaque;
size_t size;
uint64_t offset;
int iio_flags = 0;
int ret = 0;
void *dev_handle = s->vdisk_hostinfo.dev_handle;
offset = sector_num * BDRV_SECTOR_SIZE;
size = nb_sectors * BDRV_SECTOR_SIZE;
acb = qemu_aio_get(&vxhs_aiocb_info, bs, cb, opaque);
/*
* Initialize VXHSAIOCB.
*/
acb->err = 0;
iio_flags = IIO_FLAG_ASYNC;
switch (iodir) {
case VDISK_AIO_WRITE:
ret = iio_writev(dev_handle, acb, qiov->iov, qiov->niov,
offset, (uint64_t)size, iio_flags);
break;
case VDISK_AIO_READ:
ret = iio_readv(dev_handle, acb, qiov->iov, qiov->niov,
offset, (uint64_t)size, iio_flags);
break;
default:
trace_vxhs_aio_rw_invalid(iodir);
goto errout;
}
if (ret != 0) {
trace_vxhs_aio_rw_ioerr(s->vdisk_guid, iodir, size, offset,
acb, ret, errno);
goto errout;
}
return &acb->common;
errout:
qemu_aio_unref(acb);
return NULL;
}
static BlockAIOCB *vxhs_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov,
int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
return vxhs_aio_rw(bs, sector_num, qiov, nb_sectors, cb,
opaque, VDISK_AIO_READ);
}
static BlockAIOCB *vxhs_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov,
int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
return vxhs_aio_rw(bs, sector_num, qiov, nb_sectors,
cb, opaque, VDISK_AIO_WRITE);
}
static void vxhs_close(BlockDriverState *bs)
{
BDRVVXHSState *s = bs->opaque;
trace_vxhs_close(s->vdisk_guid);
g_free(s->vdisk_guid);
s->vdisk_guid = NULL;
/*
* Close vDisk device
*/
if (s->vdisk_hostinfo.dev_handle) {
iio_close(s->vdisk_hostinfo.dev_handle);
s->vdisk_hostinfo.dev_handle = NULL;
}
vxhs_unref();
/*
* Free the dynamically allocated host string etc
*/
g_free(s->vdisk_hostinfo.host);
g_free(s->tlscredsid);
s->tlscredsid = NULL;
s->vdisk_hostinfo.host = NULL;
s->vdisk_hostinfo.port = 0;
}
static int64_t vxhs_get_vdisk_stat(BDRVVXHSState *s)
{
int64_t vdisk_size = -1;
int ret = 0;
void *dev_handle = s->vdisk_hostinfo.dev_handle;
ret = iio_ioctl(dev_handle, IOR_VDISK_STAT, &vdisk_size, 0);
if (ret < 0) {
trace_vxhs_get_vdisk_stat_err(s->vdisk_guid, ret, errno);
return -EIO;
}
trace_vxhs_get_vdisk_stat(s->vdisk_guid, vdisk_size);
return vdisk_size;
}
/*
* Returns the size of vDisk in bytes. This is required
* by QEMU block upper block layer so that it is visible
* to guest.
*/
static int64_t vxhs_getlength(BlockDriverState *bs)
{
BDRVVXHSState *s = bs->opaque;
int64_t vdisk_size;
vdisk_size = vxhs_get_vdisk_stat(s);
if (vdisk_size < 0) {
return -EIO;
}
return vdisk_size;
}
static BlockDriver bdrv_vxhs = {
.format_name = "vxhs",
.protocol_name = "vxhs",
.instance_size = sizeof(BDRVVXHSState),
.bdrv_file_open = vxhs_open,
.bdrv_parse_filename = vxhs_parse_filename,
.bdrv_close = vxhs_close,
.bdrv_getlength = vxhs_getlength,
.bdrv_aio_readv = vxhs_aio_readv,
.bdrv_aio_writev = vxhs_aio_writev,
};
static void bdrv_vxhs_init(void)
{
bdrv_register(&bdrv_vxhs);
}
block_init(bdrv_vxhs_init);

View File

@@ -124,6 +124,7 @@ void qmp_nbd_server_start(SocketAddress *addr,
goto error;
}
/* TODO SOCKET_ADDRESS_KIND_FD where fd has AF_INET or AF_INET6 */
if (addr->type != SOCKET_ADDRESS_KIND_INET) {
error_setg(errp, "TLS is only supported with IPv4/IPv6");
goto error;

View File

@@ -1614,6 +1614,7 @@ typedef struct ExternalSnapshotState {
BlockDriverState *old_bs;
BlockDriverState *new_bs;
AioContext *aio_context;
bool overlay_appended;
} ExternalSnapshotState;
static void external_snapshot_prepare(BlkActionState *common,
@@ -1727,7 +1728,7 @@ static void external_snapshot_prepare(BlkActionState *common,
bdrv_img_create(new_image_file, format,
state->old_bs->filename,
state->old_bs->drv->format_name,
NULL, size, flags, &local_err, false);
NULL, size, flags, false, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
@@ -1771,6 +1772,8 @@ static void external_snapshot_prepare(BlkActionState *common,
return;
}
bdrv_set_aio_context(state->new_bs, state->aio_context);
/* This removes our old bs and adds the new bs. This is an operation that
* can fail, so we need to do it in .prepare; undoing it for abort is
* always possible. */
@@ -1780,6 +1783,7 @@ static void external_snapshot_prepare(BlkActionState *common,
error_propagate(errp, local_err);
return;
}
state->overlay_appended = true;
}
static void external_snapshot_commit(BlkActionState *common)
@@ -1787,8 +1791,6 @@ static void external_snapshot_commit(BlkActionState *common)
ExternalSnapshotState *state =
DO_UPCAST(ExternalSnapshotState, common, common);
bdrv_set_aio_context(state->new_bs, state->aio_context);
/* We don't need (or want) to use the transactional
* bdrv_reopen_multiple() across all the entries at once, because we
* don't want to abort all of them if one of them fails the reopen */
@@ -1803,8 +1805,8 @@ static void external_snapshot_abort(BlkActionState *common)
ExternalSnapshotState *state =
DO_UPCAST(ExternalSnapshotState, common, common);
if (state->new_bs) {
if (state->new_bs->backing) {
bdrv_replace_in_backing_chain(state->new_bs, state->old_bs);
if (state->overlay_appended) {
bdrv_replace_node(state->new_bs, state->old_bs, &error_abort);
}
}
}
@@ -2045,7 +2047,9 @@ static void block_dirty_bitmap_clear_abort(BlkActionState *common)
BlockDirtyBitmapState *state = DO_UPCAST(BlockDirtyBitmapState,
common, common);
bdrv_undo_clear_dirty_bitmap(state->bitmap, state->backup);
if (state->backup) {
bdrv_undo_clear_dirty_bitmap(state->bitmap, state->backup);
}
}
static void block_dirty_bitmap_clear_commit(BlkActionState *common)
@@ -2831,7 +2835,7 @@ void hmp_drive_del(Monitor *mon, const QDict *qdict)
bs = bdrv_find_node(id);
if (bs) {
qmp_x_blockdev_del(id, &local_err);
qmp_blockdev_del(id, &local_err);
if (local_err) {
error_report_err(local_err);
}
@@ -3138,7 +3142,7 @@ void qmp_block_commit(bool has_job_id, const char *job_id, const char *device,
}
commit_active_start(has_job_id ? job_id : NULL, bs, base_bs,
BLOCK_JOB_DEFAULT, speed, on_error,
filter_node_name, NULL, NULL, &local_err, false);
filter_node_name, NULL, NULL, false, &local_err);
} else {
BlockDriverState *overlay_bs = bdrv_find_overlay(bs, top_bs);
if (bdrv_op_is_blocked(overlay_bs, BLOCK_OP_TYPE_COMMIT_TARGET, errp)) {
@@ -3233,10 +3237,10 @@ static BlockJob *do_drive_backup(DriveBackup *backup, BlockJobTxn *txn,
if (source) {
bdrv_img_create(backup->target, backup->format, source->filename,
source->drv->format_name, NULL,
size, flags, &local_err, false);
size, flags, false, &local_err);
} else {
bdrv_img_create(backup->target, backup->format, NULL, NULL, NULL,
size, flags, &local_err, false);
size, flags, false, &local_err);
}
}
@@ -3527,7 +3531,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
/* create new image w/o backing file */
assert(format);
bdrv_img_create(arg->target, format,
NULL, NULL, NULL, size, flags, &local_err, false);
NULL, NULL, NULL, size, flags, false, &local_err);
} else {
switch (arg->mode) {
case NEW_IMAGE_MODE_EXISTING:
@@ -3537,7 +3541,7 @@ void qmp_drive_mirror(DriveMirror *arg, Error **errp)
bdrv_img_create(arg->target, format,
source->filename,
source->drv->format_name,
NULL, size, flags, &local_err, false);
NULL, size, flags, false, &local_err);
break;
default:
abort();
@@ -3896,7 +3900,7 @@ fail:
visit_free(v);
}
void qmp_x_blockdev_del(const char *node_name, Error **errp)
void qmp_blockdev_del(const char *node_name, Error **errp)
{
AioContext *aio_context;
BlockDriverState *bs;

View File

@@ -68,6 +68,23 @@ static const BdrvChildRole child_job = {
.stay_at_node = true,
};
static void block_job_drained_begin(void *opaque)
{
BlockJob *job = opaque;
block_job_pause(job);
}
static void block_job_drained_end(void *opaque)
{
BlockJob *job = opaque;
block_job_resume(job);
}
static const BlockDevOps block_job_dev_ops = {
.drained_begin = block_job_drained_begin,
.drained_end = block_job_drained_end,
};
BlockJob *block_job_next(BlockJob *job)
{
if (!job) {
@@ -205,11 +222,6 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
}
job = g_malloc0(driver->instance_size);
error_setg(&job->blocker, "block device is in use by block job: %s",
BlockJobType_lookup[driver->job_type]);
block_job_add_bdrv(job, "main node", bs, 0, BLK_PERM_ALL, &error_abort);
bdrv_op_unblock(bs, BLOCK_OP_TYPE_DATAPLANE, job->blocker);
job->driver = driver;
job->id = g_strdup(job_id);
job->blk = blk;
@@ -219,8 +231,15 @@ void *block_job_create(const char *job_id, const BlockJobDriver *driver,
job->paused = true;
job->pause_count = 1;
job->refcnt = 1;
error_setg(&job->blocker, "block device is in use by block job: %s",
BlockJobType_lookup[driver->job_type]);
block_job_add_bdrv(job, "main node", bs, 0, BLK_PERM_ALL, &error_abort);
bs->job = job;
blk_set_dev_ops(blk, &block_job_dev_ops, job);
bdrv_op_unblock(bs, BLOCK_OP_TYPE_DATAPLANE, job->blocker);
QLIST_INSERT_HEAD(&block_jobs, job, job_list);
blk_add_aio_context_notifier(blk, block_job_attached_aio_context,
@@ -250,16 +269,28 @@ static bool block_job_started(BlockJob *job)
return job->co;
}
/**
* All jobs must allow a pause point before entering their job proper. This
* ensures that jobs can be paused prior to being started, then resumed later.
*/
static void coroutine_fn block_job_co_entry(void *opaque)
{
BlockJob *job = opaque;
assert(job && job->driver && job->driver->start);
block_job_pause_point(job);
job->driver->start(job);
}
void block_job_start(BlockJob *job)
{
assert(job && !block_job_started(job) && job->paused &&
!job->busy && job->driver->start);
job->co = qemu_coroutine_create(job->driver->start, job);
if (--job->pause_count == 0) {
job->paused = false;
job->busy = true;
qemu_coroutine_enter(job->co);
}
job->driver && job->driver->start);
job->co = qemu_coroutine_create(block_job_co_entry, job);
job->pause_count--;
job->busy = true;
job->paused = false;
bdrv_coroutine_enter(blk_bs(job->blk), job->co);
}
void block_job_ref(BlockJob *job)
@@ -501,7 +532,7 @@ void block_job_user_resume(BlockJob *job)
void block_job_enter(BlockJob *job)
{
if (job->co && !job->busy) {
qemu_coroutine_enter(job->co);
bdrv_coroutine_enter(blk_bs(job->blk), job->co);
}
}
@@ -755,12 +786,16 @@ static void block_job_defer_to_main_loop_bh(void *opaque)
/* Fetch BDS AioContext again, in case it has changed */
aio_context = blk_get_aio_context(data->job->blk);
aio_context_acquire(aio_context);
if (aio_context != data->aio_context) {
aio_context_acquire(aio_context);
}
data->job->deferred_to_main_loop = false;
data->fn(data->job, data->opaque);
aio_context_release(aio_context);
if (aio_context != data->aio_context) {
aio_context_release(aio_context);
}
aio_context_release(data->aio_context);

View File

@@ -24,8 +24,7 @@
//#define DEBUG_MMAP
#if defined(CONFIG_USE_NPTL)
pthread_mutex_t mmap_mutex;
static pthread_mutex_t mmap_mutex = PTHREAD_MUTEX_INITIALIZER;
static int __thread mmap_lock_count;
void mmap_lock(void)
@@ -62,16 +61,6 @@ void mmap_fork_end(int child)
else
pthread_mutex_unlock(&mmap_mutex);
}
#else
/* We aren't threadsafe to start with, so no need to worry about locking. */
void mmap_lock(void)
{
}
void mmap_unlock(void)
{
}
#endif
/* NOTE: all the constants are the HOST ones, but addresses are target. */
int target_mprotect(abi_ulong start, abi_ulong len, int prot)
@@ -199,12 +188,7 @@ static int mmap_frag(abi_ulong real_start,
return 0;
}
#if defined(__CYGWIN__)
/* Cygwin doesn't have a whole lot of address space. */
static abi_ulong mmap_next_start = 0x18000000;
#else
static abi_ulong mmap_next_start = 0x40000000;
#endif
unsigned long last_brk;

View File

@@ -209,10 +209,8 @@ abi_long target_mremap(abi_ulong old_addr, abi_ulong old_size,
abi_ulong new_addr);
int target_msync(abi_ulong start, abi_ulong len, int flags);
extern unsigned long last_brk;
#if defined(CONFIG_USE_NPTL)
void mmap_fork_start(void);
void mmap_fork_end(int child);
#endif
/* main.c */
extern unsigned long x86_stack_size;

View File

@@ -58,7 +58,7 @@ static gboolean fd_chr_read(QIOChannel *chan, GIOCondition cond, void *opaque)
ret = qio_channel_read(
chan, (gchar *)buf, len, NULL);
if (ret == 0) {
remove_fd_in_watch(chr);
remove_fd_in_watch(chr, NULL);
qemu_chr_be_event(chr, CHR_EVENT_CLOSED);
return FALSE;
}
@@ -89,7 +89,7 @@ static void fd_chr_update_read_handler(Chardev *chr,
{
FDChardev *s = FD_CHARDEV(chr);
remove_fd_in_watch(chr);
remove_fd_in_watch(chr, NULL);
if (s->ioc_in) {
chr->fd_in_tag = io_add_watch_poll(chr, s->ioc_in,
fd_chr_read_poll,
@@ -103,7 +103,7 @@ static void char_fd_finalize(Object *obj)
Chardev *chr = CHARDEV(obj);
FDChardev *s = FD_CHARDEV(obj);
remove_fd_in_watch(chr);
remove_fd_in_watch(chr, NULL);
if (s->ioc_in) {
object_unref(OBJECT(s->ioc_in));
}

View File

@@ -127,14 +127,14 @@ guint io_add_watch_poll(Chardev *chr,
return tag;
}
static void io_remove_watch_poll(guint tag)
static void io_remove_watch_poll(guint tag, GMainContext *context)
{
GSource *source;
IOWatchPoll *iwp;
g_return_if_fail(tag > 0);
source = g_main_context_find_source_by_id(NULL, tag);
source = g_main_context_find_source_by_id(context, tag);
g_return_if_fail(source != NULL);
iwp = io_watch_poll_from_source(source);
@@ -146,10 +146,10 @@ static void io_remove_watch_poll(guint tag)
g_source_destroy(&iwp->parent);
}
void remove_fd_in_watch(Chardev *chr)
void remove_fd_in_watch(Chardev *chr, GMainContext *context)
{
if (chr->fd_in_tag) {
io_remove_watch_poll(chr->fd_in_tag);
io_remove_watch_poll(chr->fd_in_tag, context);
chr->fd_in_tag = 0;
}
}

View File

@@ -36,7 +36,7 @@ guint io_add_watch_poll(Chardev *chr,
gpointer user_data,
GMainContext *context);
void remove_fd_in_watch(Chardev *chr);
void remove_fd_in_watch(Chardev *chr, GMainContext *context);
int io_channel_send(QIOChannel *ioc, const void *buf, size_t len);

View File

@@ -199,7 +199,7 @@ static void pty_chr_state(Chardev *chr, int connected)
g_source_remove(s->open_tag);
s->open_tag = 0;
}
remove_fd_in_watch(chr);
remove_fd_in_watch(chr, NULL);
s->connected = 0;
/* (re-)connect poll interval for idle guests: once per second.
* We check more frequently in case the guests sends data to

View File

@@ -47,7 +47,6 @@ typedef struct {
int max_size;
int do_telnetopt;
int do_nodelay;
int is_unix;
int *read_msgfds;
size_t read_msgfds_num;
int *write_msgfds;
@@ -328,7 +327,7 @@ static void tcp_chr_free_connection(Chardev *chr)
}
tcp_set_msgfds(chr, NULL, 0);
remove_fd_in_watch(chr);
remove_fd_in_watch(chr, NULL);
object_unref(OBJECT(s->sioc));
s->sioc = NULL;
object_unref(OBJECT(s->ioc));
@@ -358,6 +357,10 @@ static char *SocketAddress_to_str(const char *prefix, SocketAddress *addr,
return g_strdup_printf("%sfd:%s%s", prefix, addr->u.fd.data->str,
is_listen ? ",server" : "");
break;
case SOCKET_ADDRESS_KIND_VSOCK:
return g_strdup_printf("%svsock:%s:%s", prefix,
addr->u.vsock.data->cid,
addr->u.vsock.data->port);
default:
abort();
}
@@ -498,7 +501,7 @@ static void tcp_chr_update_read_handler(Chardev *chr,
return;
}
remove_fd_in_watch(chr);
remove_fd_in_watch(chr, NULL);
if (s->ioc) {
chr->fd_in_tag = io_add_watch_poll(chr, s->ioc,
tcp_chr_read_poll,
@@ -825,7 +828,6 @@ static void qmp_chardev_open_socket(Chardev *chr,
int64_t reconnect = sock->has_reconnect ? sock->reconnect : 0;
QIOChannelSocket *sioc = NULL;
s->is_unix = addr->type == SOCKET_ADDRESS_KIND_UNIX;
s->is_listen = is_listen;
s->is_telnet = is_telnet;
s->do_nodelay = do_nodelay;
@@ -865,7 +867,8 @@ static void qmp_chardev_open_socket(Chardev *chr,
s->addr = QAPI_CLONE(SocketAddress, sock->addr);
qemu_chr_set_feature(chr, QEMU_CHAR_FEATURE_RECONNECTABLE);
if (s->is_unix) {
/* TODO SOCKET_ADDRESS_FD where fd has AF_UNIX */
if (addr->type == SOCKET_ADDRESS_KIND_UNIX) {
qemu_chr_set_feature(chr, QEMU_CHAR_FEATURE_FD_PASS);
}

View File

@@ -81,7 +81,7 @@ static gboolean udp_chr_read(QIOChannel *chan, GIOCondition cond, void *opaque)
ret = qio_channel_read(
s->ioc, (char *)s->buf, sizeof(s->buf), NULL);
if (ret <= 0) {
remove_fd_in_watch(chr);
remove_fd_in_watch(chr, NULL);
return FALSE;
}
s->bufcnt = ret;
@@ -101,7 +101,7 @@ static void udp_chr_update_read_handler(Chardev *chr,
{
UdpChardev *s = UDP_CHARDEV(chr);
remove_fd_in_watch(chr);
remove_fd_in_watch(chr, NULL);
if (s->ioc) {
chr->fd_in_tag = io_add_watch_poll(chr, s->ioc,
udp_chr_read_poll,
@@ -115,7 +115,7 @@ static void char_udp_finalize(Object *obj)
Chardev *chr = CHARDEV(obj);
UdpChardev *s = UDP_CHARDEV(obj);
remove_fd_in_watch(chr);
remove_fd_in_watch(chr, NULL);
if (s->ioc) {
object_unref(OBJECT(s->ioc));
}

View File

@@ -560,7 +560,7 @@ void qemu_chr_fe_set_handlers(CharBackend *b,
cc = CHARDEV_GET_CLASS(s);
if (!opaque && !fd_can_read && !fd_read && !fd_event) {
fe_open = 0;
remove_fd_in_watch(s);
remove_fd_in_watch(s, context);
} else {
fe_open = 1;
}

402
configure vendored
View File

@@ -301,7 +301,6 @@ glusterfs=""
glusterfs_xlator_opt="no"
glusterfs_discard="no"
glusterfs_zerofill="no"
archipelago="no"
gtk=""
gtkabi=""
gtk_gl="no"
@@ -321,6 +320,11 @@ numa=""
tcmalloc="no"
jemalloc="no"
replication="yes"
vxhs=""
supported_cpu="no"
supported_os="no"
bogus_os="no"
# parse CC options first
for opt do
@@ -518,26 +522,36 @@ ARCH=
# Normalise host CPU name and set ARCH.
# Note that this case should only have supported host CPUs, not guests.
case "$cpu" in
ia64|ppc|ppc64|s390|s390x|sparc64|x32)
ppc|ppc64|s390|s390x|sparc64|x32)
cpu="$cpu"
supported_cpu="yes"
;;
ia64)
cpu="$cpu"
;;
i386|i486|i586|i686|i86pc|BePC)
cpu="i386"
supported_cpu="yes"
;;
x86_64|amd64)
cpu="x86_64"
supported_cpu="yes"
;;
armv*b|armv*l|arm)
cpu="arm"
supported_cpu="yes"
;;
aarch64)
cpu="aarch64"
supported_cpu="yes"
;;
mips*)
cpu="mips"
supported_cpu="yes"
;;
sparc|sun4[cdmuv])
cpu="sparc"
supported_cpu="yes"
;;
*)
# This will result in either an error or falling back to TCI later
@@ -554,12 +568,6 @@ fi
HOST_VARIANT_DIR=""
case $targetos in
CYGWIN*)
mingw32="yes"
QEMU_CFLAGS="-mno-cygwin $QEMU_CFLAGS"
audio_possible_drivers="sdl"
audio_drv_list="sdl"
;;
MINGW32*)
mingw32="yes"
hax="yes"
@@ -569,6 +577,7 @@ MINGW32*)
else
audio_drv_list=""
fi
supported_os="yes"
;;
GNU/kFreeBSD)
bsd="yes"
@@ -586,6 +595,7 @@ FreeBSD)
libs_qga="-lutil $libs_qga"
netmap="" # enable netmap autodetect
HOST_VARIANT_DIR="freebsd"
supported_os="yes"
;;
DragonFly)
bsd="yes"
@@ -627,6 +637,7 @@ Darwin)
# won't work when we're compiling with gcc as a C compiler.
QEMU_CFLAGS="-DOS_OBJECT_USE_OBJC=0 $QEMU_CFLAGS"
HOST_VARIANT_DIR="darwin"
supported_os="yes"
;;
SunOS)
solaris="yes"
@@ -673,7 +684,7 @@ Haiku)
QEMU_CFLAGS="-DB_USE_POSITIVE_POSIX_ERRORS $QEMU_CFLAGS"
LIBS="-lposix_error_mapper -lnetwork $LIBS"
;;
*)
Linux)
audio_drv_list="oss"
audio_possible_drivers="oss alsa sdl pa"
linux="yes"
@@ -683,6 +694,13 @@ Haiku)
vhost_scsi="yes"
vhost_vsock="yes"
QEMU_INCLUDES="-I\$(SRC_PATH)/linux-headers -I$(pwd)/linux-headers $QEMU_INCLUDES"
supported_os="yes"
;;
*)
# This is a fatal error, but don't report it yet, because we
# might be going to just print the --help text, or it might
# be the result of a missing compiler.
bogus_os="yes"
;;
esac
@@ -725,7 +743,7 @@ if test "$mingw32" = "yes" ; then
sysconfdir="\${prefix}"
local_statedir=
confsuffix=""
libs_qga="-lws2_32 -lwinmm -lpowrprof -liphlpapi -lnetapi32 $libs_qga"
libs_qga="-lws2_32 -lwinmm -lpowrprof -lwtsapi32 -liphlpapi -lnetapi32 $libs_qga"
fi
werror=""
@@ -1101,10 +1119,6 @@ for opt do
;;
--enable-glusterfs) glusterfs="yes"
;;
--disable-archipelago) archipelago="no"
;;
--enable-archipelago) archipelago="yes"
;;
--disable-virtio-blk-data-plane|--enable-virtio-blk-data-plane)
echo "$0: $opt is obsolete, virtio-blk data-plane is always on" >&2
;;
@@ -1170,6 +1184,10 @@ for opt do
;;
--enable-replication) replication="yes"
;;
--disable-vxhs) vxhs="no"
;;
--enable-vxhs) vxhs="yes"
;;
*)
echo "ERROR: unknown option $opt"
echo "Try '$0 --help' for more information"
@@ -1178,21 +1196,6 @@ for opt do
esac
done
if ! has $python; then
error_exit "Python not found. Use --python=/path/to/python"
fi
# Note that if the Python conditional here evaluates True we will exit
# with status 1 which is a shell 'false' value.
if ! $python -c 'import sys; sys.exit(sys.version_info < (2,6) or sys.version_info >= (3,))'; then
error_exit "Cannot use '$python', Python 2.6 or later is required." \
"Note that Python 3 or later is not yet supported." \
"Use --python=/path/to/python to specify a supported Python."
fi
# Suppress writing compiled files
python="$python -B"
case "$cpu" in
ppc)
CPU_CFLAGS="-m32"
@@ -1267,6 +1270,9 @@ for config in $mak_wilds; do
default_target_list="${default_target_list} $(basename "$config" .mak)"
done
# Enumerate public trace backends for --help output
trace_backend_list=$(echo $(grep -le '^PUBLIC = True$' "$source_path"/scripts/tracetool/backend/*.py | sed -e 's/^.*\/\(.*\)\.py$/\1/'))
if test x"$show_help" = x"yes" ; then
cat << EOF
@@ -1320,7 +1326,7 @@ Advanced options (experts only):
set block driver read-only whitelist
(affects only QEMU, not qemu-img)
--enable-trace-backends=B Set trace backend
Available backends: $($python $source_path/scripts/tracetool.py --list-backends)
Available backends: $trace_backend_list
--with-trace-file=NAME Full PATH,NAME of file to store traces
Default:trace-<pid>
--disable-slirp disable SLIRP userspace network connectivity
@@ -1335,6 +1341,12 @@ Advanced options (experts only):
--with-vss-sdk=SDK-path enable Windows VSS support in QEMU Guest Agent
--with-win-sdk=SDK-path path to Windows Platform SDK (to build VSS .tlb)
--tls-priority default TLS protocol/cipher priority string
--enable-gprof QEMU profiling with gprof
--enable-profiler profiler support
--enable-xen-pv-domain-build
xen pv domain builder
--enable-debug-stack-usage
track the maximum stack usage of stacks created by qemu_alloc_stack
Optional features, enabled with --enable-FEATURE and
disabled with --disable-FEATURE, default is enabled if available:
@@ -1396,19 +1408,40 @@ disabled with --disable-FEATURE, default is enabled if available:
seccomp seccomp support
coroutine-pool coroutine freelist (better performance)
glusterfs GlusterFS backend
archipelago Archipelago backend
tpm TPM support
libssh2 ssh block device support
numa libnuma support
tcmalloc tcmalloc support
jemalloc jemalloc support
replication replication support
vhost-vsock virtio sockets device support
opengl opengl support
virglrenderer virgl rendering support
xfsctl xfsctl support
qom-cast-debug cast debugging support
tools build qemu-io, qemu-nbd and qemu-image tools
vxhs Veritas HyperScale vDisk backend support
NOTE: The object files are built at the place where configure is launched
EOF
exit 0
fi
if ! has $python; then
error_exit "Python not found. Use --python=/path/to/python"
fi
# Note that if the Python conditional here evaluates True we will exit
# with status 1 which is a shell 'false' value.
if ! $python -c 'import sys; sys.exit(sys.version_info < (2,6) or sys.version_info >= (3,))'; then
error_exit "Cannot use '$python', Python 2.6 or later is required." \
"Note that Python 3 or later is not yet supported." \
"Use --python=/path/to/python to specify a supported Python."
fi
# Suppress writing compiled files
python="$python -B"
# Now we have handled --enable-tcg-interpreter and know we're not just
# printing the help message, bail out if the host CPU isn't supported.
if test "$ARCH" = "unknown"; then
@@ -1441,6 +1474,14 @@ if ! compile_prog ; then
error_exit "\"$cc\" cannot build an executable (is your linker broken?)"
fi
if test "$bogus_os" = "yes"; then
# Now that we know that we're not printing the help and that
# the compiler works (so the results of the check_defines we used
# to identify the OS are reliable), if we didn't recognize the
# host OS we should stop now.
error_exit "Unrecognized host OS $targetos"
fi
# Check that the C++ compiler exists and works with the C compiler
if has $cxx; then
cat > $TMPC <<EOF
@@ -1956,30 +1997,65 @@ fi
# xen probe
if test "$xen" != "no" ; then
xen_libs="-lxenstore -lxenctrl -lxenguest"
xen_stable_libs="-lxenforeignmemory -lxengnttab -lxenevtchn"
# Check whether Xen library path is specified via --extra-ldflags to avoid
# overriding this setting with pkg-config output. If not, try pkg-config
# to obtain all needed flags.
# First we test whether Xen headers and libraries are available.
# If no, we are done and there is no Xen support.
# If yes, more tests are run to detect the Xen version.
if ! echo $EXTRA_LDFLAGS | grep tools/libxc > /dev/null && \
$pkg_config --exists xencontrol ; then
xen_ctrl_version="$(printf '%d%02d%02d' \
$($pkg_config --modversion xencontrol | sed 's/\./ /g') )"
xen=yes
xen_pc="xencontrol xenstore xenguest xenforeignmemory xengnttab"
xen_pc="$xen_pc xenevtchn xendevicemodel"
QEMU_CFLAGS="$QEMU_CFLAGS $($pkg_config --cflags $xen_pc)"
libs_softmmu="$($pkg_config --libs $xen_pc) $libs_softmmu"
LDFLAGS="$($pkg_config --libs $xen_pc) $LDFLAGS"
else
# Xen (any)
cat > $TMPC <<EOF
xen_libs="-lxenstore -lxenctrl -lxenguest"
xen_stable_libs="-lxencall -lxenforeignmemory -lxengnttab -lxenevtchn"
# First we test whether Xen headers and libraries are available.
# If no, we are done and there is no Xen support.
# If yes, more tests are run to detect the Xen version.
# Xen (any)
cat > $TMPC <<EOF
#include <xenctrl.h>
int main(void) {
return 0;
}
EOF
if ! compile_prog "" "$xen_libs" ; then
# Xen not found
if test "$xen" = "yes" ; then
feature_not_found "xen" "Install xen devel"
fi
xen=no
if ! compile_prog "" "$xen_libs" ; then
# Xen not found
if test "$xen" = "yes" ; then
feature_not_found "xen" "Install xen devel"
fi
xen=no
# Xen unstable
elif
cat > $TMPC <<EOF &&
# Xen unstable
elif
cat > $TMPC <<EOF &&
#undef XC_WANT_COMPAT_DEVICEMODEL_API
#define __XEN_TOOLS__
#include <xendevicemodel.h>
int main(void) {
xendevicemodel_handle *xd;
xd = xendevicemodel_open(0, 0);
xendevicemodel_close(xd);
return 0;
}
EOF
compile_prog "" "$xen_libs -lxendevicemodel $xen_stable_libs"
then
xen_stable_libs="-lxendevicemodel $xen_stable_libs"
xen_ctrl_version=40900
xen=yes
elif
cat > $TMPC <<EOF &&
/*
* If we have stable libs the we don't want the libxc compat
* layers, regardless of what CFLAGS we may have been given.
@@ -2029,12 +2105,12 @@ int main(void) {
return 0;
}
EOF
compile_prog "" "$xen_libs $xen_stable_libs"
then
xen_ctrl_version=480
xen=yes
elif
cat > $TMPC <<EOF &&
compile_prog "" "$xen_libs $xen_stable_libs"
then
xen_ctrl_version=40800
xen=yes
elif
cat > $TMPC <<EOF &&
/*
* If we have stable libs the we don't want the libxc compat
* layers, regardless of what CFLAGS we may have been given.
@@ -2080,12 +2156,12 @@ int main(void) {
return 0;
}
EOF
compile_prog "" "$xen_libs $xen_stable_libs"
then
xen_ctrl_version=471
xen=yes
elif
cat > $TMPC <<EOF &&
compile_prog "" "$xen_libs $xen_stable_libs"
then
xen_ctrl_version=40701
xen=yes
elif
cat > $TMPC <<EOF &&
#include <xenctrl.h>
#include <stdint.h>
int main(void) {
@@ -2095,14 +2171,14 @@ int main(void) {
return 0;
}
EOF
compile_prog "" "$xen_libs"
then
xen_ctrl_version=470
xen=yes
compile_prog "" "$xen_libs"
then
xen_ctrl_version=40700
xen=yes
# Xen 4.6
elif
cat > $TMPC <<EOF &&
# Xen 4.6
elif
cat > $TMPC <<EOF &&
#include <xenctrl.h>
#include <xenstore.h>
#include <stdint.h>
@@ -2123,14 +2199,14 @@ int main(void) {
return 0;
}
EOF
compile_prog "" "$xen_libs"
then
xen_ctrl_version=460
xen=yes
compile_prog "" "$xen_libs"
then
xen_ctrl_version=40600
xen=yes
# Xen 4.5
elif
cat > $TMPC <<EOF &&
# Xen 4.5
elif
cat > $TMPC <<EOF &&
#include <xenctrl.h>
#include <xenstore.h>
#include <stdint.h>
@@ -2150,13 +2226,13 @@ int main(void) {
return 0;
}
EOF
compile_prog "" "$xen_libs"
then
xen_ctrl_version=450
xen=yes
compile_prog "" "$xen_libs"
then
xen_ctrl_version=40500
xen=yes
elif
cat > $TMPC <<EOF &&
elif
cat > $TMPC <<EOF &&
#include <xenctrl.h>
#include <xenstore.h>
#include <stdint.h>
@@ -2175,24 +2251,25 @@ int main(void) {
return 0;
}
EOF
compile_prog "" "$xen_libs"
then
xen_ctrl_version=420
xen=yes
compile_prog "" "$xen_libs"
then
xen_ctrl_version=40200
xen=yes
else
if test "$xen" = "yes" ; then
feature_not_found "xen (unsupported version)" \
"Install a supported xen (xen 4.2 or newer)"
else
if test "$xen" = "yes" ; then
feature_not_found "xen (unsupported version)" \
"Install a supported xen (xen 4.2 or newer)"
fi
xen=no
fi
xen=no
fi
if test "$xen" = yes; then
if test $xen_ctrl_version -ge 471 ; then
libs_softmmu="$xen_stable_libs $libs_softmmu"
if test "$xen" = yes; then
if test $xen_ctrl_version -ge 40701 ; then
libs_softmmu="$xen_stable_libs $libs_softmmu"
fi
libs_softmmu="$xen_libs $libs_softmmu"
fi
libs_softmmu="$xen_libs $libs_softmmu"
fi
fi
@@ -2888,6 +2965,12 @@ for drv in $audio_drv_list; do
audio_pt_int="yes"
;;
sdl)
if test "$sdl" = "no"; then
error_exit "sdl not found or disabled, can not use sdl audio driver"
fi
;;
coreaudio)
libs_softmmu="-framework CoreAudio $libs_softmmu"
;;
@@ -2901,8 +2984,8 @@ for drv in $audio_drv_list; do
libs_softmmu="$oss_lib $libs_softmmu"
;;
sdl|wav)
# XXX: Probes for CoreAudio, DirectSound, SDL(?)
wav)
# XXX: Probes for CoreAudio, DirectSound
;;
*)
@@ -3035,12 +3118,23 @@ fi
##########################################
# glib support probe
glib_req_ver=2.22
if test "$mingw32" = yes; then
glib_req_ver=2.30
else
glib_req_ver=2.22
fi
glib_modules=gthread-2.0
if test "$modules" = yes; then
glib_modules="$glib_modules gmodule-2.0"
fi
# This workaround is required due to a bug in pkg-config file for glib as it
# doesn't define GLIB_STATIC_COMPILATION for pkg-config --static
if test "$static" = yes -a "$mingw32" = yes; then
QEMU_CFLAGS="-DGLIB_STATIC_COMPILATION $QEMU_CFLAGS"
fi
for i in $glib_modules; do
if $pkg_config --atleast-version=$glib_req_ver $i; then
glib_cflags=$($pkg_config --cflags $i)
@@ -3443,6 +3537,7 @@ if test "$opengl" != "no" ; then
if test "$gtk" = "yes" && $pkg_config --exists "$gtkpackage >= 3.16"; then
gtk_gl="yes"
fi
QEMU_CFLAGS="$QEMU_CFLAGS $opengl_cflags"
else
if test "$opengl" = "yes" ; then
feature_not_found "opengl" "Please install opengl (mesa) devel pkgs: $opengl_pkgs"
@@ -3466,37 +3561,6 @@ EOF
fi
fi
##########################################
# archipelago probe
if test "$archipelago" != "no" ; then
cat > $TMPC <<EOF
#include <stdio.h>
#include <xseg/xseg.h>
#include <xseg/protocol.h>
int main(void) {
xseg_initialize();
return 0;
}
EOF
archipelago_libs=-lxseg
if compile_prog "" "$archipelago_libs"; then
archipelago="yes"
libs_tools="$archipelago_libs $libs_tools"
libs_softmmu="$archipelago_libs $libs_softmmu"
echo "WARNING: Please check the licenses of QEMU and libxseg carefully."
echo "GPLv3 versions of libxseg may not be compatible with QEMU's"
echo "license and therefore prevent redistribution."
echo
echo "To disable Archipelago, use --disable-archipelago"
else
if test "$archipelago" = "yes" ; then
feature_not_found "Archipelago backend support" "Install libxseg devel"
fi
archipelago="no"
fi
fi
##########################################
# glusterfs probe
@@ -4747,6 +4811,47 @@ if test "$modules" = "yes" && test "$LD_REL_FLAGS" = ""; then
feature_not_found "modules" "Cannot find how to build relocatable objects"
fi
##########################################
# check for sysmacros.h
have_sysmacros=no
cat > $TMPC << EOF
#include <sys/sysmacros.h>
int main(void) {
return makedev(0, 0);
}
EOF
if compile_prog "" "" ; then
have_sysmacros=yes
fi
##########################################
# Veritas HyperScale block driver VxHS
# Check if libvxhs is installed
if test "$vxhs" != "no" ; then
cat > $TMPC <<EOF
#include <stdint.h>
#include <qnio/qnio_api.h>
void *vxhs_callback;
int main(void) {
iio_init(QNIO_VERSION, vxhs_callback);
return 0;
}
EOF
vxhs_libs="-lvxhs -lssl"
if compile_prog "" "$vxhs_libs" ; then
vxhs=yes
else
if test "$vxhs" = "yes" ; then
feature_not_found "vxhs block device" "Install libvxhs See github"
fi
vxhs=no
fi
fi
##########################################
# End of CC checks
# After here, no more $cc or $ld runs
@@ -5099,7 +5204,6 @@ echo "coroutine backend $coroutine"
echo "coroutine pool $coroutine_pool"
echo "debug stack usage $debug_stack_usage"
echo "GlusterFS support $glusterfs"
echo "Archipelago support $archipelago"
echo "gcov $gcov_tool"
echo "gcov enabled $gcov"
echo "TPM support $tpm"
@@ -5114,11 +5218,38 @@ echo "tcmalloc support $tcmalloc"
echo "jemalloc support $jemalloc"
echo "avx2 optimization $avx2_opt"
echo "replication support $replication"
echo "VxHS block device $vxhs"
if test "$sdl_too_old" = "yes"; then
echo "-> Your SDL version is too old - please upgrade to have SDL support"
fi
if test "$supported_cpu" = "no"; then
echo
echo "WARNING: SUPPORT FOR THIS HOST CPU WILL GO AWAY IN FUTURE RELEASES!"
echo
echo "CPU host architecture $cpu support is not currently maintained."
echo "The QEMU project intends to remove support for this host CPU in"
echo "a future release if nobody volunteers to maintain it and to"
echo "provide a build host for our continuous integration setup."
echo "configure has succeeded and you can continue to build, but"
echo "if you care about QEMU on this platform you should contact"
echo "us upstream at qemu-devel@nongnu.org."
fi
if test "$supported_os" = "no"; then
echo
echo "WARNING: SUPPORT FOR THIS HOST OS WILL GO AWAY IN FUTURE RELEASES!"
echo
echo "Host OS $targetos support is not currently maintained."
echo "The QEMU project intends to remove support for this host OS in"
echo "a future release if nobody volunteers to maintain it and to"
echo "provide a build host for our continuous integration setup."
echo "configure has succeeded and you can continue to build, but"
echo "if you care about QEMU on this platform you should contact"
echo "us upstream at qemu-devel@nongnu.org."
fi
config_host_mak="config-host.mak"
echo "# Automatically generated by configure - do not modify" >config-all-disas.mak
@@ -5515,7 +5646,6 @@ fi
if test "$opengl" = "yes" ; then
echo "CONFIG_OPENGL=y" >> $config_host_mak
echo "OPENGL_CFLAGS=$opengl_cflags" >> $config_host_mak
echo "OPENGL_LIBS=$opengl_libs" >> $config_host_mak
if test "$opengl_dmabuf" = "yes" ; then
echo "CONFIG_OPENGL_DMABUF=y" >> $config_host_mak
@@ -5640,11 +5770,6 @@ if test "$glusterfs_zerofill" = "yes" ; then
echo "CONFIG_GLUSTERFS_ZEROFILL=y" >> $config_host_mak
fi
if test "$archipelago" = "yes" ; then
echo "CONFIG_ARCHIPELAGO=m" >> $config_host_mak
echo "ARCHIPELAGO_LIBS=$archipelago_libs" >> $config_host_mak
fi
if test "$libssh2" = "yes" ; then
echo "CONFIG_LIBSSH2=m" >> $config_host_mak
echo "LIBSSH2_CFLAGS=$libssh2_cflags" >> $config_host_mak
@@ -5719,6 +5844,10 @@ if test "$have_af_vsock" = "yes" ; then
echo "CONFIG_AF_VSOCK=y" >> $config_host_mak
fi
if test "$have_sysmacros" = "yes" ; then
echo "CONFIG_SYSMACROS=y" >> $config_host_mak
fi
# Hold two types of flag:
# CONFIG_THREAD_SETNAME_BYTHREAD - we've got a way of setting the name on
# a thread we have a handle to
@@ -5729,6 +5858,11 @@ if test "$pthread_setname_np" = "yes" ; then
echo "CONFIG_PTHREAD_SETNAME_NP=y" >> $config_host_mak
fi
if test "$vxhs" = "yes" ; then
echo "CONFIG_VXHS=y" >> $config_host_mak
echo "VXHS_LIBS=$vxhs_libs" >> $config_host_mak
fi
if test "$tcg_interpreter" = "yes"; then
QEMU_INCLUDES="-I\$(SRC_PATH)/tcg/tci $QEMU_INCLUDES"
elif test "$ARCH" = "sparc64" ; then

View File

@@ -35,7 +35,7 @@ void cpu_loop_exit_noexc(CPUState *cpu)
#if defined(CONFIG_SOFTMMU)
void cpu_reloading_memory_map(void)
{
if (qemu_in_vcpu_thread()) {
if (qemu_in_vcpu_thread() && current_cpu->running) {
/* The guest can in theory prolong the RCU critical section as long
* as it feels like. The major problem with this is that because it
* can do multiple reconfigurations of the memory map within the

View File

@@ -33,6 +33,7 @@
#if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
#include "hw/i386/apic.h"
#endif
#include "sysemu/cpus.h"
#include "sysemu/replay.h"
/* -icount align implementation. */
@@ -186,12 +187,6 @@ static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, TranslationBlock *itb)
cc->set_pc(cpu, last_tb->pc);
}
}
if (tb_exit == TB_EXIT_REQUESTED) {
/* We were asked to stop executing TBs (probably a pending
* interrupt. We've now stopped, so clear the flag.
*/
atomic_set(&cpu->tcg_exit_req, 0);
}
return ret;
}
@@ -560,8 +555,9 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
qemu_mutex_unlock_iothread();
}
if (unlikely(atomic_read(&cpu->exit_request) || replay_has_interrupt())) {
/* Finally, check if we need to exit to the main loop. */
if (unlikely(atomic_read(&cpu->exit_request)
|| (use_icount && cpu->icount_decr.u16.low + cpu->icount_extra == 0))) {
atomic_set(&cpu->exit_request, 0);
cpu->exception_index = EXCP_INTERRUPT;
return true;
@@ -571,62 +567,54 @@ static inline bool cpu_handle_interrupt(CPUState *cpu,
}
static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
TranslationBlock **last_tb, int *tb_exit,
SyncClocks *sc)
TranslationBlock **last_tb, int *tb_exit)
{
uintptr_t ret;
if (unlikely(atomic_read(&cpu->exit_request))) {
return;
}
int32_t insns_left;
trace_exec_tb(tb, tb->pc);
ret = cpu_tb_exec(cpu, tb);
tb = (TranslationBlock *)(ret & ~TB_EXIT_MASK);
*tb_exit = ret & TB_EXIT_MASK;
switch (*tb_exit) {
case TB_EXIT_REQUESTED:
if (*tb_exit != TB_EXIT_REQUESTED) {
*last_tb = tb;
return;
}
*last_tb = NULL;
insns_left = atomic_read(&cpu->icount_decr.u32);
atomic_set(&cpu->icount_decr.u16.high, 0);
if (insns_left < 0) {
/* Something asked us to stop executing chained TBs; just
* continue round the main loop. Whatever requested the exit
* will also have set something else (eg interrupt_request)
* which we will handle next time around the loop. But we
* need to ensure the tcg_exit_req read in generated code
* comes before the next read of cpu->exit_request or
* cpu->interrupt_request.
* will also have set something else (eg exit_request or
* interrupt_request) which we will handle next time around
* the loop. But we need to ensure the zeroing of icount_decr
* comes before the next read of cpu->exit_request
* or cpu->interrupt_request.
*/
smp_mb();
*last_tb = NULL;
break;
case TB_EXIT_ICOUNT_EXPIRED:
{
/* Instruction counter expired. */
#ifdef CONFIG_USER_ONLY
abort();
#else
int insns_left = cpu->icount_decr.u32;
*last_tb = NULL;
if (cpu->icount_extra && insns_left >= 0) {
/* Refill decrementer and continue execution. */
cpu->icount_extra += insns_left;
insns_left = MIN(0xffff, cpu->icount_extra);
cpu->icount_extra -= insns_left;
cpu->icount_decr.u16.low = insns_left;
} else {
if (insns_left > 0) {
/* Execute remaining instructions. */
cpu_exec_nocache(cpu, insns_left, tb, false);
align_clocks(sc, cpu);
}
cpu->exception_index = EXCP_INTERRUPT;
cpu_loop_exit(cpu);
return;
}
/* Instruction counter expired. */
assert(use_icount);
#ifndef CONFIG_USER_ONLY
/* Ensure global icount has gone forward */
cpu_update_icount(cpu);
/* Refill decrementer and continue execution. */
insns_left = MIN(0xffff, cpu->icount_budget);
cpu->icount_decr.u16.low = insns_left;
cpu->icount_extra = cpu->icount_budget - insns_left;
if (!cpu->icount_extra) {
/* Execute any remaining instructions, then let the main loop
* handle the next event.
*/
if (insns_left > 0) {
cpu_exec_nocache(cpu, insns_left, tb, false);
}
break;
}
#endif
}
default:
*last_tb = tb;
break;
}
}
/* main execution loop */
@@ -635,7 +623,7 @@ int cpu_exec(CPUState *cpu)
{
CPUClass *cc = CPU_GET_CLASS(cpu);
int ret;
SyncClocks sc;
SyncClocks sc = { 0 };
/* replay_interrupt may need current_cpu */
current_cpu = cpu;
@@ -683,7 +671,7 @@ int cpu_exec(CPUState *cpu)
while (!cpu_handle_interrupt(cpu, &last_tb)) {
TranslationBlock *tb = tb_find(cpu, last_tb, tb_exit);
cpu_loop_exec_tb(cpu, tb, &last_tb, &tb_exit, &sc);
cpu_loop_exec_tb(cpu, tb, &last_tb, &tb_exit);
/* Try to align the host and virtual clocks
if the guest is in advance */
align_clocks(&sc, cpu);

258
cpus.c
View File

@@ -51,10 +51,6 @@
#include "hw/nmi.h"
#include "sysemu/replay.h"
#ifndef _WIN32
#include "qemu/compatfd.h"
#endif
#ifdef CONFIG_LINUX
#include <sys/prctl.h>
@@ -185,10 +181,7 @@ static bool check_tcg_memory_orders_compatible(void)
static bool default_mttcg_enabled(void)
{
QemuOpts *icount_opts = qemu_find_opts_singleton("icount");
const char *rr = qemu_opt_get(icount_opts, "rr");
if (rr || TCG_OVERSIZED_GUEST) {
if (use_icount || TCG_OVERSIZED_GUEST) {
return false;
} else {
#ifdef TARGET_SUPPORTS_MTTCG
@@ -206,11 +199,17 @@ void qemu_tcg_configure(QemuOpts *opts, Error **errp)
if (strcmp(t, "multi") == 0) {
if (TCG_OVERSIZED_GUEST) {
error_setg(errp, "No MTTCG when guest word size > hosts");
} else if (use_icount) {
error_setg(errp, "No MTTCG when icount is enabled");
} else {
#ifndef TARGET_SUPPORTS_MTTCG
error_report("Guest not yet converted to MTTCG - "
"you may get unexpected results");
#endif
if (!check_tcg_memory_orders_compatible()) {
error_report("Guest expects a stronger memory ordering "
"than the host provides");
error_printf("This may cause strange/hard to debug errors");
error_printf("This may cause strange/hard to debug errors\n");
}
mttcg_enabled = true;
}
@@ -224,20 +223,51 @@ void qemu_tcg_configure(QemuOpts *opts, Error **errp)
}
}
/* The current number of executed instructions is based on what we
* originally budgeted minus the current state of the decrementing
* icount counters in extra/u16.low.
*/
static int64_t cpu_get_icount_executed(CPUState *cpu)
{
return cpu->icount_budget - (cpu->icount_decr.u16.low + cpu->icount_extra);
}
/*
* Update the global shared timer_state.qemu_icount to take into
* account executed instructions. This is done by the TCG vCPU
* thread so the main-loop can see time has moved forward.
*/
void cpu_update_icount(CPUState *cpu)
{
int64_t executed = cpu_get_icount_executed(cpu);
cpu->icount_budget -= executed;
#ifdef CONFIG_ATOMIC64
atomic_set__nocheck(&timers_state.qemu_icount,
atomic_read__nocheck(&timers_state.qemu_icount) +
executed);
#else /* FIXME: we need 64bit atomics to do this safely */
timers_state.qemu_icount += executed;
#endif
}
int64_t cpu_get_icount_raw(void)
{
int64_t icount;
CPUState *cpu = current_cpu;
icount = timers_state.qemu_icount;
if (cpu) {
if (cpu && cpu->running) {
if (!cpu->can_do_io) {
fprintf(stderr, "Bad icount read\n");
exit(1);
}
icount -= (cpu->icount_decr.u16.low + cpu->icount_extra);
/* Take into account what has run */
cpu_update_icount(cpu);
}
return icount;
#ifdef CONFIG_ATOMIC64
return atomic_read__nocheck(&timers_state.qemu_icount);
#else /* FIXME: we need 64bit atomics to do this safely */
return timers_state.qemu_icount;
#endif
}
/* Return the virtual CPU time, based on the instruction counter. */
@@ -801,6 +831,27 @@ static void qemu_cpu_kick_rr_cpu(void)
} while (cpu != atomic_mb_read(&tcg_current_rr_cpu));
}
static void do_nothing(CPUState *cpu, run_on_cpu_data unused)
{
}
void qemu_timer_notify_cb(void *opaque, QEMUClockType type)
{
if (!use_icount || type != QEMU_CLOCK_VIRTUAL) {
qemu_notify_event();
return;
}
if (!qemu_in_vcpu_thread() && first_cpu) {
/* qemu_cpu_kick is not enough to kick a halted CPU out of
* qemu_tcg_wait_io_event. async_run_on_cpu, instead,
* causes cpu_thread_is_idle to return false. This way,
* handle_icount_deadline can run.
*/
async_run_on_cpu(first_cpu, do_nothing, RUN_ON_CPU_NULL);
}
}
static void kick_tcg_thread(void *opaque)
{
timer_mod(tcg_kick_vcpu_timer, qemu_tcg_next_kick());
@@ -924,13 +975,23 @@ static void sigbus_reraise(void)
abort();
}
static void sigbus_handler(int n, struct qemu_signalfd_siginfo *siginfo,
void *ctx)
static void sigbus_handler(int n, siginfo_t *siginfo, void *ctx)
{
if (kvm_on_sigbus(siginfo->ssi_code,
(void *)(intptr_t)siginfo->ssi_addr)) {
if (siginfo->si_code != BUS_MCEERR_AO && siginfo->si_code != BUS_MCEERR_AR) {
sigbus_reraise();
}
if (current_cpu) {
/* Called asynchronously in VCPU thread. */
if (kvm_on_sigbus_vcpu(current_cpu, siginfo->si_code, siginfo->si_addr)) {
sigbus_reraise();
}
} else {
/* Called synchronously (via signalfd) in main thread. */
if (kvm_on_sigbus(siginfo->si_code, siginfo->si_addr)) {
sigbus_reraise();
}
}
}
static void qemu_init_sigbus(void)
@@ -939,92 +1000,17 @@ static void qemu_init_sigbus(void)
memset(&action, 0, sizeof(action));
action.sa_flags = SA_SIGINFO;
action.sa_sigaction = (void (*)(int, siginfo_t*, void*))sigbus_handler;
action.sa_sigaction = sigbus_handler;
sigaction(SIGBUS, &action, NULL);
prctl(PR_MCE_KILL, PR_MCE_KILL_SET, PR_MCE_KILL_EARLY, 0, 0);
}
static void qemu_kvm_eat_signals(CPUState *cpu)
{
struct timespec ts = { 0, 0 };
siginfo_t siginfo;
sigset_t waitset;
sigset_t chkset;
int r;
sigemptyset(&waitset);
sigaddset(&waitset, SIG_IPI);
sigaddset(&waitset, SIGBUS);
do {
r = sigtimedwait(&waitset, &siginfo, &ts);
if (r == -1 && !(errno == EAGAIN || errno == EINTR)) {
perror("sigtimedwait");
exit(1);
}
switch (r) {
case SIGBUS:
if (kvm_on_sigbus_vcpu(cpu, siginfo.si_code, siginfo.si_addr)) {
sigbus_reraise();
}
break;
default:
break;
}
r = sigpending(&chkset);
if (r == -1) {
perror("sigpending");
exit(1);
}
} while (sigismember(&chkset, SIG_IPI) || sigismember(&chkset, SIGBUS));
}
#else /* !CONFIG_LINUX */
static void qemu_init_sigbus(void)
{
}
static void qemu_kvm_eat_signals(CPUState *cpu)
{
}
#endif /* !CONFIG_LINUX */
#ifndef _WIN32
static void dummy_signal(int sig)
{
}
static void qemu_kvm_init_cpu_signals(CPUState *cpu)
{
int r;
sigset_t set;
struct sigaction sigact;
memset(&sigact, 0, sizeof(sigact));
sigact.sa_handler = dummy_signal;
sigaction(SIG_IPI, &sigact, NULL);
pthread_sigmask(SIG_BLOCK, NULL, &set);
sigdelset(&set, SIG_IPI);
sigdelset(&set, SIGBUS);
r = kvm_set_signal_mask(cpu, &set);
if (r) {
fprintf(stderr, "kvm_set_signal_mask: %s\n", strerror(-r));
exit(1);
}
}
#else /* _WIN32 */
static void qemu_kvm_init_cpu_signals(CPUState *cpu)
{
abort();
}
#endif /* _WIN32 */
static QemuMutex qemu_global_mutex;
static QemuThread io_thread;
@@ -1099,7 +1085,6 @@ static void qemu_kvm_wait_io_event(CPUState *cpu)
qemu_cond_wait(cpu->halt_cond, &qemu_global_mutex);
}
qemu_kvm_eat_signals(cpu);
qemu_wait_io_event_common(cpu);
}
@@ -1122,7 +1107,7 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
exit(1);
}
qemu_kvm_init_cpu_signals(cpu);
kvm_init_cpu_signals(cpu);
/* signal CPU creation */
cpu->created = true;
@@ -1212,16 +1197,54 @@ static int64_t tcg_get_icount_limit(void)
static void handle_icount_deadline(void)
{
assert(qemu_in_vcpu_thread());
if (use_icount) {
int64_t deadline =
qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
if (deadline == 0) {
/* Wake up other AioContexts. */
qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL);
}
}
}
static void prepare_icount_for_run(CPUState *cpu)
{
if (use_icount) {
int insns_left;
/* These should always be cleared by process_icount_data after
* each vCPU execution. However u16.high can be raised
* asynchronously by cpu_exit/cpu_interrupt/tcg_handle_interrupt
*/
g_assert(cpu->icount_decr.u16.low == 0);
g_assert(cpu->icount_extra == 0);
cpu->icount_budget = tcg_get_icount_limit();
insns_left = MIN(0xffff, cpu->icount_budget);
cpu->icount_decr.u16.low = insns_left;
cpu->icount_extra = cpu->icount_budget - insns_left;
}
}
static void process_icount_data(CPUState *cpu)
{
if (use_icount) {
/* Account for executed instructions */
cpu_update_icount(cpu);
/* Reset the counters */
cpu->icount_decr.u16.low = 0;
cpu->icount_extra = 0;
cpu->icount_budget = 0;
replay_account_executed_instructions();
}
}
static int tcg_cpu_exec(CPUState *cpu)
{
int ret;
@@ -1232,20 +1255,6 @@ static int tcg_cpu_exec(CPUState *cpu)
#ifdef CONFIG_PROFILER
ti = profile_getclock();
#endif
if (use_icount) {
int64_t count;
int decr;
timers_state.qemu_icount -= (cpu->icount_decr.u16.low
+ cpu->icount_extra);
cpu->icount_decr.u16.low = 0;
cpu->icount_extra = 0;
count = tcg_get_icount_limit();
timers_state.qemu_icount += count;
decr = (count > 0xffff) ? 0xffff : count;
count -= decr;
cpu->icount_decr.u16.low = decr;
cpu->icount_extra = count;
}
qemu_mutex_unlock_iothread();
cpu_exec_start(cpu);
ret = cpu_exec(cpu);
@@ -1254,15 +1263,6 @@ static int tcg_cpu_exec(CPUState *cpu)
#ifdef CONFIG_PROFILER
tcg_time += profile_getclock() - ti;
#endif
if (use_icount) {
/* Fold pending instructions back into the
instruction counter, and clear the interrupt flag. */
timers_state.qemu_icount -= (cpu->icount_decr.u16.low
+ cpu->icount_extra);
cpu->icount_decr.u32 = 0;
cpu->icount_extra = 0;
replay_account_executed_instructions();
}
return ret;
}
@@ -1330,6 +1330,11 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
/* Account partial waits to QEMU_CLOCK_VIRTUAL. */
qemu_account_warp_timer();
/* Run the timers here. This is much more efficient than
* waking up the I/O thread and waiting for completion.
*/
handle_icount_deadline();
if (!cpu) {
cpu = first_cpu;
}
@@ -1344,7 +1349,13 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
if (cpu_can_run(cpu)) {
int r;
prepare_icount_for_run(cpu);
r = tcg_cpu_exec(cpu);
process_icount_data(cpu);
if (r == EXCP_DEBUG) {
cpu_handle_guest_debug(cpu);
break;
@@ -1371,8 +1382,6 @@ static void *qemu_tcg_rr_cpu_thread_fn(void *arg)
atomic_mb_set(&cpu->exit_request, 0);
}
handle_icount_deadline();
qemu_tcg_wait_io_event(cpu ? cpu : QTAILQ_FIRST(&cpus));
deal_with_unplugged_cpus();
}
@@ -1384,8 +1393,9 @@ static void *qemu_hax_cpu_thread_fn(void *arg)
{
CPUState *cpu = arg;
int r;
qemu_mutex_lock_iothread();
qemu_thread_get_self(cpu->thread);
qemu_mutex_lock(&qemu_global_mutex);
cpu->thread_id = qemu_get_thread_id();
cpu->created = true;
@@ -1431,6 +1441,8 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
{
CPUState *cpu = arg;
g_assert(!use_icount);
rcu_register_thread();
qemu_mutex_lock_iothread();
@@ -1473,8 +1485,6 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
}
}
handle_icount_deadline();
atomic_mb_set(&cpu->exit_request, 0);
qemu_tcg_wait_io_event(cpu);
}

View File

@@ -473,10 +473,10 @@ qcrypto_block_luks_load_key(QCryptoBlock *block,
* then encrypted.
*/
rv = readfunc(block,
opaque,
slot->key_offset * QCRYPTO_BLOCK_LUKS_SECTOR_SIZE,
splitkey, splitkeylen,
errp,
opaque);
errp);
if (rv < 0) {
goto cleanup;
}
@@ -676,11 +676,10 @@ qcrypto_block_luks_open(QCryptoBlock *block,
/* Read the entire LUKS header, minus the key material from
* the underlying device */
rv = readfunc(block, 0,
rv = readfunc(block, opaque, 0,
(uint8_t *)&luks->header,
sizeof(luks->header),
errp,
opaque);
errp);
if (rv < 0) {
ret = rv;
goto fail;
@@ -1246,7 +1245,7 @@ qcrypto_block_luks_create(QCryptoBlock *block,
QCRYPTO_BLOCK_LUKS_SECTOR_SIZE;
/* Reserve header space to match payload offset */
initfunc(block, block->payload_offset, &local_err, opaque);
initfunc(block, opaque, block->payload_offset, &local_err);
if (local_err) {
error_propagate(errp, local_err);
goto error;
@@ -1268,11 +1267,10 @@ qcrypto_block_luks_create(QCryptoBlock *block,
/* Write out the partition header and key slot headers */
writefunc(block, 0,
writefunc(block, opaque, 0,
(const uint8_t *)&luks->header,
sizeof(luks->header),
&local_err,
opaque);
&local_err);
/* Delay checking local_err until we've byte-swapped */
@@ -1297,12 +1295,11 @@ qcrypto_block_luks_create(QCryptoBlock *block,
/* Write out the master key material, starting at the
* sector immediately following the partition header. */
if (writefunc(block,
if (writefunc(block, opaque,
luks->header.key_slots[0].key_offset *
QCRYPTO_BLOCK_LUKS_SECTOR_SIZE,
splitkey, splitkeylen,
errp,
opaque) != splitkeylen) {
errp) != splitkeylen) {
goto error;
}

View File

@@ -29,6 +29,7 @@ CONFIG_LAN9118=y
CONFIG_SMC91C111=y
CONFIG_ALLWINNER_EMAC=y
CONFIG_IMX_FEC=y
CONFIG_FTGMAC100=y
CONFIG_DS1338=y
CONFIG_PFLASH_CFI01=y
CONFIG_PFLASH_CFI02=y

View File

@@ -39,7 +39,6 @@ CONFIG_TPM_TIS=$(CONFIG_TPM)
CONFIG_MC146818RTC=y
CONFIG_PCI_PIIX=y
CONFIG_WDT_IB700=y
CONFIG_XEN_I386=$(CONFIG_XEN)
CONFIG_ISA_DEBUG=y
CONFIG_ISA_TESTDEV=y
CONFIG_VMPORT=y
@@ -59,3 +58,4 @@ CONFIG_I82801B11=y
CONFIG_SMBIOS=y
CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
CONFIG_PXB=y
CONFIG_ACPI_VMGENID=y

View File

@@ -45,6 +45,7 @@ CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
CONFIG_PLATFORM_BUS=y
CONFIG_ETSEC=y
CONFIG_LIBDECNUMBER=y
CONFIG_SM501=y
# For PReP
CONFIG_SERIAL_ISA=y
CONFIG_MC146818RTC=y

View File

@@ -6,6 +6,10 @@ include usb.mak
CONFIG_VIRTIO_VGA=y
CONFIG_ESCC=y
CONFIG_M48T59=y
CONFIG_IPMI=y
CONFIG_IPMI_LOCAL=y
CONFIG_IPMI_EXTERN=y
CONFIG_ISA_IPMI_BT=y
CONFIG_SERIAL=y
CONFIG_PARALLEL=y
CONFIG_I8254=y
@@ -47,6 +51,7 @@ CONFIG_OPENPIC_KVM=$(and $(CONFIG_E500),$(CONFIG_KVM))
CONFIG_PLATFORM_BUS=y
CONFIG_ETSEC=y
CONFIG_LIBDECNUMBER=y
CONFIG_SM501=y
# For pSeries
CONFIG_XICS=$(CONFIG_PSERIES)
CONFIG_XICS_SPAPR=$(CONFIG_PSERIES)

View File

@@ -15,3 +15,4 @@ CONFIG_I8259=y
CONFIG_XILINX=y
CONFIG_XILINX_ETHLITE=y
CONFIG_LIBDECNUMBER=y
CONFIG_SM501=y

View File

@@ -39,7 +39,6 @@ CONFIG_TPM_TIS=$(CONFIG_TPM)
CONFIG_MC146818RTC=y
CONFIG_PCI_PIIX=y
CONFIG_WDT_IB700=y
CONFIG_XEN_I386=$(CONFIG_XEN)
CONFIG_ISA_DEBUG=y
CONFIG_ISA_TESTDEV=y
CONFIG_VMPORT=y
@@ -59,3 +58,4 @@ CONFIG_I82801B11=y
CONFIG_SMBIOS=y
CONFIG_HYPERV_TESTDEV=$(CONFIG_KVM)
CONFIG_PXB=y
CONFIG_ACPI_VMGENID=y

View File

@@ -3901,9 +3901,9 @@ print_insn_arm (bfd_vma pc, struct disassemble_info *info)
status = info->read_memory_func (pc, (bfd_byte *)b, 4, info);
if (little)
given = (b[0]) | (b[1] << 8) | (b[2] << 16) | (b[3] << 24);
given = (b[0]) | (b[1] << 8) | (b[2] << 16) | ((unsigned)b[3] << 24);
else
given = (b[3]) | (b[2] << 8) | (b[1] << 16) | (b[0] << 24);
given = (b[3]) | (b[2] << 8) | (b[1] << 16) | ((unsigned)b[0] << 24);
}
else
{

View File

@@ -2009,7 +2009,7 @@ print_with_operands (const struct cris_opcode *opcodep,
case 'n':
{
/* Like N but pc-relative to the start of the insn. */
unsigned long number
uint32_t number
= (buffer[2] + buffer[3] * 256 + buffer[4] * 65536
+ buffer[5] * 0x1000000 + addr);
@@ -2048,7 +2048,7 @@ print_with_operands (const struct cris_opcode *opcodep,
{
/* We're looking at [pc+], i.e. we need to output an immediate
number, where the size can depend on different things. */
long number;
int32_t number;
int signedp
= ((*cs == 'z' && (insn & 0x20))
|| opcodep->match == BDAP_QUICK_OPCODE);
@@ -2201,7 +2201,7 @@ print_with_operands (const struct cris_opcode *opcodep,
{
/* It's [pc+]. This cannot possibly be anything
but an address. */
unsigned long number
uint32_t number
= prefix_buffer[2] + prefix_buffer[3] * 256
+ prefix_buffer[4] * 65536
+ prefix_buffer[5] * 0x1000000;
@@ -2290,7 +2290,7 @@ print_with_operands (const struct cris_opcode *opcodep,
if ((prefix_insn & 0x400) && (prefix_insn & 15) == 15)
{
long number;
int32_t number;
unsigned int nbytes;
/* It's a value. Get its size. */

View File

@@ -1788,8 +1788,7 @@ fput_fp_reg_r (unsigned reg, disassemble_info *info)
if (reg < 4)
(*info->fprintf_func) (info->stream, "fpe%d", reg * 2 + 1);
else
(*info->fprintf_func) (info->stream, "%sR",
reg ? fp_reg_names[reg] : "fr0");
(*info->fprintf_func) (info->stream, "%sR", fp_reg_names[reg]);
}
static void

View File

@@ -4043,7 +4043,7 @@ print_insn (bfd_vma pc, disassemble_info *info)
}
}
if (putop (dp->name, sizeflag) == 0)
if (dp->name != NULL && putop (dp->name, sizeflag) == 0)
{
for (i = 0; i < MAX_OPERANDS; ++i)
{

View File

@@ -4685,10 +4685,11 @@ get_field (const unsigned char *data, enum floatformat_byteorders order,
/* This is the last byte; zero out the bits which are not part of
this field. */
result |=
(*(data + cur_byte) & ((1 << (len - cur_bitshift)) - 1))
(unsigned long)(*(data + cur_byte)
& ((1 << (len - cur_bitshift)) - 1))
<< cur_bitshift;
else
result |= *(data + cur_byte) << cur_bitshift;
result |= (unsigned long)*(data + cur_byte) << cur_bitshift;
cur_bitshift += FLOATFORMAT_CHAR_BIT;
if (order == floatformat_little)
++cur_byte;

View File

@@ -159,7 +159,7 @@ enum microblaze_instr_type {
#define MIN_PVR_REGNUM 0
#define MAX_PVR_REGNUM 15
#define REG_PC 32 /* PC */
/* 32 is REG_PC */
#define REG_MSR 33 /* machine status reg */
#define REG_EAR 35 /* Exception reg */
#define REG_ESR 37 /* Exception reg */
@@ -748,9 +748,11 @@ read_insn_microblaze (bfd_vma memaddr,
}
if (info->endian == BFD_ENDIAN_BIG)
inst = (ibytes[0] << 24) | (ibytes[1] << 16) | (ibytes[2] << 8) | ibytes[3];
inst = ((unsigned)ibytes[0] << 24) | (ibytes[1] << 16)
| (ibytes[2] << 8) | ibytes[3];
else if (info->endian == BFD_ENDIAN_LITTLE)
inst = (ibytes[3] << 24) | (ibytes[2] << 16) | (ibytes[1] << 8) | ibytes[0];
inst = ((unsigned)ibytes[3] << 24) | (ibytes[2] << 16)
| (ibytes[1] << 8) | ibytes[0];
else
abort ();

View File

@@ -41,3 +41,12 @@ has three bootable devices target1, target3, target5 connected to it,
the option ROM will have a boot method for each of them, but it is not
possible to map from boot method back to a specific target. This is a
shortcoming of the PC BIOS boot specification.
== Mixing bootindex and boot order parameters ==
Note that it does not make sense to use the bootindex property together
with the "-boot order=..." (or "-boot once=...") parameter. The guest
firmware implementations normally either support the one or the other,
but not both parameters at the same time. Mixing them will result in
undefined behavior, and thus the guest firmware will likely not boot
from the expected devices.

View File

@@ -117,10 +117,13 @@ Example:
==== Expression documentation ====
Each expression that isn't an include directive must be preceded by a
Each expression that isn't an include directive may be preceded by a
documentation block. Such blocks are called expression documentation
blocks.
When documentation is required (see pragma 'doc-required'), expression
documentation blocks are mandatory.
The documentation block consists of a first line naming the
expression, an optional overview, a description of each argument (for
commands and events) or member (for structs, unions and alternates),
@@ -128,10 +131,8 @@ and optional tagged sections.
FIXME: the parser accepts these things in almost any order.
Optional arguments / members are tagged with the phrase '#optional',
often with their default value; and extensions added after the
expression was first released are also given a '(since x.y.z)'
comment.
Extensions added after the expression was first released carry a
'(since x.y.z)' comment.
A tagged section starts with one of the following words:
"Note:"/"Notes:", "Since:", "Example"/"Examples", "Returns:", "TODO:".
@@ -147,10 +148,10 @@ For example:
#
# Statistics of a virtual block device or a block backing device.
#
# @device: #optional If the stats are for a virtual block device, the name
# @device: If the stats are for a virtual block device, the name
# corresponding to the virtual block device.
#
# @node-name: #optional The node name of the device. (since 2.3)
# @node-name: The node name of the device. (since 2.3)
#
# ... more members ...
#
@@ -165,7 +166,7 @@ For example:
#
# Query the @BlockStats for all virtual block devices.
#
# @query-nodes: #optional If true, the command will query all the
# @query-nodes: If true, the command will query all the
# block nodes ... explain, explain ... (since 2.3)
#
# Returns: A list of @BlockStats for each virtual block devices.
@@ -204,45 +205,53 @@ once. It is permissible for the schema to contain additional types
not used by any commands or events in the Client JSON Protocol, for
the side effect of generated C code used internally.
There are seven top-level expressions recognized by the parser:
'include', 'command', 'struct', 'enum', 'union', 'alternate', and
'event'. There are several groups of types: simple types (a number of
built-in types, such as 'int' and 'str'; as well as enumerations),
complex types (structs and two flavors of unions), and alternate types
(a choice between other types). The 'command' and 'event' expressions
can refer to existing types by name, or list an anonymous type as a
dictionary. Listing a type name inside an array refers to a
single-dimension array of that type; multi-dimension arrays are not
directly supported (although an array of a complex struct that
contains an array member is possible).
There are eight top-level expressions recognized by the parser:
'include', 'pragma', 'command', 'struct', 'enum', 'union',
'alternate', and 'event'. There are several groups of types: simple
types (a number of built-in types, such as 'int' and 'str'; as well as
enumerations), complex types (structs and two flavors of unions), and
alternate types (a choice between other types). The 'command' and
'event' expressions can refer to existing types by name, or list an
anonymous type as a dictionary. Listing a type name inside an array
refers to a single-dimension array of that type; multi-dimension
arrays are not directly supported (although an array of a complex
struct that contains an array member is possible).
All names must begin with a letter, and contain only ASCII letters,
digits, hyphen, and underscore. There are two exceptions: enum values
may start with a digit, and names that are downstream extensions (see
section Downstream extensions) start with underscore.
Names beginning with 'q_' are reserved for the generator, which uses
them for munging QMP names that resemble C keywords or other
problematic strings. For example, a member named "default" in qapi
becomes "q_default" in the generated C code.
Types, commands, and events share a common namespace. Therefore,
generally speaking, type definitions should always use CamelCase for
user-defined type names, while built-in types are lowercase. Type
definitions should not end in 'Kind', as this namespace is used for
creating implicit C enums for visiting union types, or in 'List', as
this namespace is used for creating array types. Command names,
and member names within a type, should be all lower case with words
separated by a hyphen. However, some existing older commands and
complex types use underscore; when extending such expressions,
consistency is preferred over blindly avoiding underscore. Event
names should be ALL_CAPS with words separated by underscore. Member
names cannot start with 'has-' or 'has_', as this is reserved for
tracking optional members.
user-defined type names, while built-in types are lowercase.
Type names ending with 'Kind' or 'List' are reserved for the
generator, which uses them for implicit union enums and array types,
respectively.
Command names, and member names within a type, should be all lower
case with words separated by a hyphen. However, some existing older
commands and complex types use underscore; when extending such
expressions, consistency is preferred over blindly avoiding
underscore.
Event names should be ALL_CAPS with words separated by underscore.
Member names starting with 'has-' or 'has_' are reserved for the
generator, which uses them for tracking optional members.
Any name (command, event, type, member, or enum value) beginning with
"x-" is marked experimental, and may be withdrawn or changed
incompatibly in a future release. All names must begin with a letter,
and contain only ASCII letters, digits, dash, and underscore. There
are two exceptions: enum values may start with a digit, and any
extensions added by downstream vendors should start with a prefix
matching "__RFQDN_" (for the reverse-fully-qualified-domain-name of
the vendor), even if the rest of the name uses dash (example:
__com.redhat_drive-mirror). Names beginning with 'q_' are reserved
for the generator: QMP names that resemble C keywords or other
problematic strings will be munged in C to use this prefix. For
example, a member named "default" in qapi becomes "q_default" in the
generated C code.
incompatibly in a future release.
Pragma 'name-case-whitelist' lets you violate the rules on use of
upper and lower case. Use for new code is strongly discouraged.
In the rest of this document, usage lines are given for each
expression type, with literal strings written in lower case and
@@ -277,7 +286,7 @@ The following types are predefined, and map to C as follows:
QType QType JSON string matching enum QType values
=== Includes ===
=== Include directives ===
Usage: { 'include': STRING }
@@ -297,6 +306,26 @@ an outer file. The parser may be made stricter in the future to
prevent incomplete include files.
=== Pragma directives ===
Usage: { 'pragma': DICT }
The pragma directive lets you control optional generator behavior.
The dictionary's entries are pragma names and values.
Pragma's scope is currently the complete schema. Setting the same
pragma to different values in parts of the schema doesn't work.
Pragma 'doc-required' takes a boolean value. If true, documentation
is required. Default is false.
Pragma 'returns-whitelist' takes a list of command names that may
violate the rules on permitted return types. Default is none.
Pragma 'name-case-whitelist' takes a list of names that may violate
rules on use of upper- vs. lower-case letters. Default is none.
=== Struct types ===
Usage: { 'struct': STRING, 'data': DICT, '*base': STRUCT-NAME }
@@ -536,22 +565,18 @@ The 'data' argument maps to the "arguments" dictionary passed in as
part of a Client JSON Protocol command. The 'data' member is optional
and defaults to {} (an empty dictionary). If present, it must be the
string name of a complex type, or a dictionary that declares an
anonymous type with the same semantics as a 'struct' expression, with
one exception noted below when 'gen' is used.
anonymous type with the same semantics as a 'struct' expression.
The 'returns' member describes what will appear in the "return" member
of a Client JSON Protocol reply on successful completion of a command.
The member is optional from the command declaration; if absent, the
"return" member will be an empty dictionary. If 'returns' is present,
it must be the string name of a complex or built-in type, a
one-element array containing the name of a complex or built-in type,
with one exception noted below when 'gen' is used. Although it is
permitted to have the 'returns' member name a built-in type or an
array of built-in types, any command that does this cannot be extended
to return additional information in the future; thus, new commands
should strongly consider returning a dictionary-based type or an array
of dictionaries, even if the dictionary only contains one member at the
present.
one-element array containing the name of a complex or built-in type.
To return anything else, you have to list the command in pragma
'returns-whitelist'. If you do this, the command cannot be extended
to return additional information in the future. Use of
'returns-whitelist' for new commands is strongly discouraged.
All commands in Client JSON Protocol use a dictionary to report
failure, with no way to specify that in QAPI. Where the error return
@@ -643,6 +668,18 @@ any non-empty complex type (struct, union, or alternate), and a
pointer to that QAPI type is passed as a single argument.
=== Downstream extensions ===
QAPI schema names that are externally visible, say in the Client JSON
Protocol, need to be managed with care. Names starting with a
downstream prefix of the form __RFQDN_ are reserved for the downstream
who controls the valid, reverse fully qualified domain name RFQDN.
RFQDN may only contain ASCII letters, digits, hyphen and period.
Example: Red Hat, Inc. controls redhat.com, and may therefore add a
downstream command __com.redhat_drive-mirror.
== Client JSON Protocol introspection ==
Clients of a Client JSON Protocol commonly need to figure out what
@@ -1138,7 +1175,7 @@ Example:
Visitor *v;
UserDefOneList *arg1 = NULL;
v = qobject_input_visitor_new(QOBJECT(args), true);
v = qobject_input_visitor_new(QOBJECT(args));
visit_start_struct(v, NULL, NULL, 0, &err);
if (err) {
goto out;

View File

@@ -65,7 +65,7 @@ along with this manual. If not, see http://www.gnu.org/licenses/.
@c for texi2pod:
@c man begin DESCRIPTION
@include qemu-qapi.texi
@include qemu-qmp-qapi.texi
@c man end

245
docs/specs/vmgenid.txt Normal file
View File

@@ -0,0 +1,245 @@
VIRTUAL MACHINE GENERATION ID
=============================
Copyright (C) 2016 Red Hat, Inc.
Copyright (C) 2017 Skyport Systems, Inc.
This work is licensed under the terms of the GNU GPL, version 2 or later.
See the COPYING file in the top-level directory.
===
The VM generation ID (vmgenid) device is an emulated device which
exposes a 128-bit, cryptographically random, integer value identifier,
referred to as a Globally Unique Identifier, or GUID.
This allows management applications (e.g. libvirt) to notify the guest
operating system when the virtual machine is executed with a different
configuration (e.g. snapshot execution or creation from a template). The
guest operating system notices the change, and is then able to react as
appropriate by marking its copies of distributed databases as dirty,
re-initializing its random number generator etc.
Requirements
------------
These requirements are extracted from the "How to implement virtual machine
generation ID support in a virtualization platform" section of the
specification, dated August 1, 2012.
The document may be found on the web at:
http://go.microsoft.com/fwlink/?LinkId=260709
R1a. The generation ID shall live in an 8-byte aligned buffer.
R1b. The buffer holding the generation ID shall be in guest RAM, ROM, or device
MMIO range.
R1c. The buffer holding the generation ID shall be kept separate from areas
used by the operating system.
R1d. The buffer shall not be covered by an AddressRangeMemory or
AddressRangeACPI entry in the E820 or UEFI memory map.
R1e. The generation ID shall not live in a page frame that could be mapped with
caching disabled. (In other words, regardless of whether the generation ID
lives in RAM, ROM or MMIO, it shall only be mapped as cacheable.)
R2 to R5. [These AML requirements are isolated well enough in the Microsoft
specification for us to simply refer to them here.]
R6. The hypervisor shall expose a _HID (hardware identifier) object in the
VMGenId device's scope that is unique to the hypervisor vendor.
QEMU Implementation
-------------------
The above-mentioned specification does not dictate which ACPI descriptor table
will contain the VM Generation ID device. Other implementations (Hyper-V and
Xen) put it in the main descriptor table (Differentiated System Description
Table or DSDT). For ease of debugging and implementation, we have decided to
put it in its own Secondary System Description Table, or SSDT.
The following is a dump of the contents from a running system:
# iasl -p ./SSDT -d /sys/firmware/acpi/tables/SSDT
Intel ACPI Component Architecture
ASL+ Optimizing Compiler version 20150717-64
Copyright (c) 2000 - 2015 Intel Corporation
Reading ACPI table from file /sys/firmware/acpi/tables/SSDT - Length
00000198 (0x0000C6)
ACPI: SSDT 0x0000000000000000 0000C6 (v01 BOCHS VMGENID 00000001 BXPC
00000001)
Acpi table [SSDT] successfully installed and loaded
Pass 1 parse of [SSDT]
Pass 2 parse of [SSDT]
Parsing Deferred Opcodes (Methods/Buffers/Packages/Regions)
Parsing completed
Disassembly completed
ASL Output: ./SSDT.dsl - 1631 bytes
# cat SSDT.dsl
/*
* Intel ACPI Component Architecture
* AML/ASL+ Disassembler version 20150717-64
* Copyright (c) 2000 - 2015 Intel Corporation
*
* Disassembling to symbolic ASL+ operators
*
* Disassembly of /sys/firmware/acpi/tables/SSDT, Sun Feb 5 00:19:37 2017
*
* Original Table Header:
* Signature "SSDT"
* Length 0x000000CA (202)
* Revision 0x01
* Checksum 0x4B
* OEM ID "BOCHS "
* OEM Table ID "VMGENID"
* OEM Revision 0x00000001 (1)
* Compiler ID "BXPC"
* Compiler Version 0x00000001 (1)
*/
DefinitionBlock ("/sys/firmware/acpi/tables/SSDT.aml", "SSDT", 1, "BOCHS ",
"VMGENID", 0x00000001)
{
Name (VGIA, 0x07FFF000)
Scope (\_SB)
{
Device (VGEN)
{
Name (_HID, "QEMUVGID") // _HID: Hardware ID
Name (_CID, "VM_Gen_Counter") // _CID: Compatible ID
Name (_DDN, "VM_Gen_Counter") // _DDN: DOS Device Name
Method (_STA, 0, NotSerialized) // _STA: Status
{
Local0 = 0x0F
If ((VGIA == Zero))
{
Local0 = Zero
}
Return (Local0)
}
Method (ADDR, 0, NotSerialized)
{
Local0 = Package (0x02) {}
Index (Local0, Zero) = (VGIA + 0x28)
Index (Local0, One) = Zero
Return (Local0)
}
}
}
Method (\_GPE._E05, 0, NotSerialized) // _Exx: Edge-Triggered GPE
{
Notify (\_SB.VGEN, 0x80) // Status Change
}
}
Design Details:
---------------
Requirements R1a through R1e dictate that the memory holding the
VM Generation ID must be allocated and owned by the guest firmware,
in this case BIOS or UEFI. However, to be useful, QEMU must be able to
change the contents of the memory at runtime, specifically when starting a
backed-up or snapshotted image. In order to do this, QEMU must know the
address that has been allocated.
The mechanism chosen for this memory sharing is writeable fw_cfg blobs.
These are data object that are visible to both QEMU and guests, and are
addressable as sequential files.
More information about fw_cfg can be found in "docs/specs/fw_cfg.txt"
Two fw_cfg blobs are used in this case:
/etc/vmgenid_guid - contains the actual VM Generation ID GUID
- read-only to the guest
/etc/vmgenid_addr - contains the address of the downloaded vmgenid blob
- writeable by the guest
QEMU sends the following commands to the guest at startup:
1. Allocate memory for vmgenid_guid fw_cfg blob.
2. Write the address of vmgenid_guid into the SSDT (VGIA ACPI variable as
shown above in the iasl dump). Note that this change is not propagated
back to QEMU.
3. Write the address of vmgenid_guid back to QEMU's copy of vmgenid_addr
via the fw_cfg DMA interface.
After step 3, QEMU is able to update the contents of vmgenid_guid at will.
Since BIOS or UEFI does not necessarily run when we wish to change the GUID,
the value of VGIA is persisted via the VMState mechanism.
As spelled out in the specification, any change to the GUID executes an
ACPI notification. The exact handler to use is not specified, so the vmgenid
device uses the first unused one: \_GPE._E05.
Endian-ness Considerations:
---------------------------
Although not specified in Microsoft's document, it is assumed that the
device is expected to use little-endian format.
All GUID passed in via command line or monitor are treated as big-endian.
GUID values displayed via monitor are shown in big-endian format.
GUID Storage Format:
--------------------
In order to implement an OVMF "SDT Header Probe Suppressor", the contents of
the vmgenid_guid fw_cfg blob are not simply a 128-bit GUID. There is also
significant padding in order to align and fill a memory page, as shown in the
following diagram:
+----------------------------------+
| SSDT with OEM Table ID = VMGENID |
+----------------------------------+
| ... | TOP OF PAGE
| VGIA dword object ---------------|-----> +---------------------------+
| ... | | fw-allocated array for |
| _STA method referring to VGIA | | "etc/vmgenid_guid" |
| ... | +---------------------------+
| ADDR method referring to VGIA | | 0: OVMF SDT Header probe |
| ... | | suppressor |
+----------------------------------+ | 36: padding for 8-byte |
| alignment |
| 40: GUID |
| 56: padding to page size |
+---------------------------+
END OF PAGE
Device Usage:
-------------
The device has one property, which may be only be set using the command line:
guid - sets the value of the GUID. A special value "auto" instructs
QEMU to generate a new random GUID.
For example:
QEMU -device vmgenid,guid="324e6eaf-d1d1-4bf6-bf41-b9bb6c91fb87"
QEMU -device vmgenid,guid=auto
The property may be queried via QMP/HMP:
(QEMU) query-vm-generation-id
{"return": {"guid": "324e6eaf-d1d1-4bf6-bf41-b9bb6c91fb87"}}
Setting of this parameter is intentionally left out from the QMP/HMP
interfaces. There are no known use cases for changing the GUID once QEMU is
running, and adding this capability would greatly increase the complexity.

View File

@@ -405,6 +405,9 @@ information. If used together with the "tcg" property, it adds a second
"TCGv_env" argument that must point to the per-target global TCG register that
points to the vCPU when guest code is executed (usually the "cpu_env" variable).
The "tcg" and "vcpu" properties are currently only honored in the root
./trace-events file.
The following example events:
foo(uint32_t a) "a=%x"

View File

@@ -252,7 +252,7 @@ here goes "hello-world"'s new entry for the qapi-schema.json file:
#
# Print a client provided string to the standard output stream.
#
# @message: #optional string to be printed
# @message: string to be printed
#
# Returns: Nothing on success.
#
@@ -358,7 +358,7 @@ The best way to return that data is to create a new QAPI type, as shown below:
#
# @clock-name: The alarm clock method's name.
#
# @next-deadline: #optional The time (in nanoseconds) the next alarm will fire.
# @next-deadline: The time (in nanoseconds) the next alarm will fire.
#
# Since: 1.0
##

2
dtc

Submodule dtc updated: fa8bc7f928...558cd81bdd

241
exec.c
View File

@@ -42,6 +42,8 @@
#include "exec/memory.h"
#include "exec/ioport.h"
#include "sysemu/dma.h"
#include "sysemu/numa.h"
#include "sysemu/hw_accel.h"
#include "exec/address-spaces.h"
#include "sysemu/xen-mapcache.h"
#include "trace-root.h"
@@ -221,6 +223,12 @@ struct CPUAddressSpace {
MemoryListener tcg_as_listener;
};
struct DirtyBitmapSnapshot {
ram_addr_t start;
ram_addr_t end;
unsigned long dirty[];
};
#endif
#if !defined(CONFIG_USER_ONLY)
@@ -1059,6 +1067,75 @@ bool cpu_physical_memory_test_and_clear_dirty(ram_addr_t start,
return dirty;
}
DirtyBitmapSnapshot *cpu_physical_memory_snapshot_and_clear_dirty
(ram_addr_t start, ram_addr_t length, unsigned client)
{
DirtyMemoryBlocks *blocks;
unsigned long align = 1UL << (TARGET_PAGE_BITS + BITS_PER_LEVEL);
ram_addr_t first = QEMU_ALIGN_DOWN(start, align);
ram_addr_t last = QEMU_ALIGN_UP(start + length, align);
DirtyBitmapSnapshot *snap;
unsigned long page, end, dest;
snap = g_malloc0(sizeof(*snap) +
((last - first) >> (TARGET_PAGE_BITS + 3)));
snap->start = first;
snap->end = last;
page = first >> TARGET_PAGE_BITS;
end = last >> TARGET_PAGE_BITS;
dest = 0;
rcu_read_lock();
blocks = atomic_rcu_read(&ram_list.dirty_memory[client]);
while (page < end) {
unsigned long idx = page / DIRTY_MEMORY_BLOCK_SIZE;
unsigned long offset = page % DIRTY_MEMORY_BLOCK_SIZE;
unsigned long num = MIN(end - page, DIRTY_MEMORY_BLOCK_SIZE - offset);
assert(QEMU_IS_ALIGNED(offset, (1 << BITS_PER_LEVEL)));
assert(QEMU_IS_ALIGNED(num, (1 << BITS_PER_LEVEL)));
offset >>= BITS_PER_LEVEL;
bitmap_copy_and_clear_atomic(snap->dirty + dest,
blocks->blocks[idx] + offset,
num);
page += num;
dest += num >> BITS_PER_LEVEL;
}
rcu_read_unlock();
if (tcg_enabled()) {
tlb_reset_dirty_range_all(start, length);
}
return snap;
}
bool cpu_physical_memory_snapshot_get_dirty(DirtyBitmapSnapshot *snap,
ram_addr_t start,
ram_addr_t length)
{
unsigned long page, end;
assert(start >= snap->start);
assert(start + length <= snap->end);
end = TARGET_PAGE_ALIGN(start + length - snap->start) >> TARGET_PAGE_BITS;
page = (start - snap->start) >> TARGET_PAGE_BITS;
while (page < end) {
if (test_bit(page, snap->dirty)) {
return true;
}
page++;
}
return false;
}
/* Called from RCU critical section */
hwaddr memory_region_section_get_iotlb(CPUState *cpu,
MemoryRegionSection *section,
@@ -1256,6 +1333,87 @@ void qemu_mutex_unlock_ramlist(void)
qemu_mutex_unlock(&ram_list.mutex);
}
#ifdef __linux__
/*
* FIXME TOCTTOU: this iterates over memory backends' mem-path, which
* may or may not name the same files / on the same filesystem now as
* when we actually open and map them. Iterate over the file
* descriptors instead, and use qemu_fd_getpagesize().
*/
static int find_max_supported_pagesize(Object *obj, void *opaque)
{
char *mem_path;
long *hpsize_min = opaque;
if (object_dynamic_cast(obj, TYPE_MEMORY_BACKEND)) {
mem_path = object_property_get_str(obj, "mem-path", NULL);
if (mem_path) {
long hpsize = qemu_mempath_getpagesize(mem_path);
if (hpsize < *hpsize_min) {
*hpsize_min = hpsize;
}
} else {
*hpsize_min = getpagesize();
}
}
return 0;
}
long qemu_getrampagesize(void)
{
long hpsize = LONG_MAX;
long mainrampagesize;
Object *memdev_root;
if (mem_path) {
mainrampagesize = qemu_mempath_getpagesize(mem_path);
} else {
mainrampagesize = getpagesize();
}
/* it's possible we have memory-backend objects with
* hugepage-backed RAM. these may get mapped into system
* address space via -numa parameters or memory hotplug
* hooks. we want to take these into account, but we
* also want to make sure these supported hugepage
* sizes are applicable across the entire range of memory
* we may boot from, so we take the min across all
* backends, and assume normal pages in cases where a
* backend isn't backed by hugepages.
*/
memdev_root = object_resolve_path("/objects", NULL);
if (memdev_root) {
object_child_foreach(memdev_root, find_max_supported_pagesize, &hpsize);
}
if (hpsize == LONG_MAX) {
/* No additional memory regions found ==> Report main RAM page size */
return mainrampagesize;
}
/* If NUMA is disabled or the NUMA nodes are not backed with a
* memory-backend, then there is at least one node using "normal" RAM,
* so if its page size is smaller we have got to report that size instead.
*/
if (hpsize > mainrampagesize &&
(nb_numa_nodes == 0 || numa_info[0].node_memdev == NULL)) {
static bool warned;
if (!warned) {
error_report("Huge page support disabled (n/a for main memory).");
warned = true;
}
return mainrampagesize;
}
return hpsize;
}
#else
long qemu_getrampagesize(void)
{
return getpagesize();
}
#endif
#ifdef __linux__
static int64_t get_file_size(int fd)
{
@@ -1385,7 +1543,7 @@ static void *file_ram_alloc(RAMBlock *block,
}
if (mem_prealloc) {
os_mem_prealloc(fd, area, memory, errp);
os_mem_prealloc(fd, area, memory, smp_cpus, errp);
if (errp && *errp) {
goto error;
}
@@ -1445,7 +1603,7 @@ static ram_addr_t find_ram_offset(ram_addr_t size)
return offset;
}
ram_addr_t last_ram_offset(void)
unsigned long last_ram_page(void)
{
RAMBlock *block;
ram_addr_t last = 0;
@@ -1455,7 +1613,7 @@ ram_addr_t last_ram_offset(void)
last = MAX(last, block->offset + block->max_length);
}
rcu_read_unlock();
return last;
return last >> TARGET_PAGE_BITS;
}
static void qemu_ram_setup_dump(void *addr, ram_addr_t size)
@@ -1478,6 +1636,11 @@ const char *qemu_ram_get_idstr(RAMBlock *rb)
return rb->idstr;
}
bool qemu_ram_is_shared(RAMBlock *rb)
{
return rb->flags & RAM_SHARED;
}
/* Called with iothread lock held. */
void qemu_ram_set_idstr(RAMBlock *new_block, const char *name, DeviceState *dev)
{
@@ -1639,7 +1802,7 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
ram_addr_t old_ram_size, new_ram_size;
Error *err = NULL;
old_ram_size = last_ram_offset() >> TARGET_PAGE_BITS;
old_ram_size = last_ram_page();
qemu_mutex_lock_ramlist();
new_block->offset = find_ram_offset(new_block->max_length);
@@ -1670,7 +1833,6 @@ static void ram_block_add(RAMBlock *new_block, Error **errp)
new_ram_size = MAX(old_ram_size,
(new_block->offset + new_block->max_length) >> TARGET_PAGE_BITS);
if (new_ram_size > old_ram_size) {
migration_bitmap_extend(old_ram_size, new_ram_size);
dirty_memory_extend(old_ram_size, new_ram_size);
}
/* Keep the list sorted from biggest to smallest block. Unlike QTAILQ,
@@ -3148,75 +3310,33 @@ int64_t address_space_cache_init(MemoryRegionCache *cache,
hwaddr len,
bool is_write)
{
hwaddr l, xlat;
MemoryRegion *mr;
void *ptr;
assert(len > 0);
l = len;
mr = address_space_translate(as, addr, &xlat, &l, is_write);
if (!memory_access_is_direct(mr, is_write)) {
return -EINVAL;
}
l = address_space_extend_translation(as, addr, len, mr, xlat, l, is_write);
ptr = qemu_ram_ptr_length(mr->ram_block, xlat, &l);
cache->xlat = xlat;
cache->is_write = is_write;
cache->mr = mr;
cache->ptr = ptr;
cache->len = l;
memory_region_ref(cache->mr);
return l;
cache->len = len;
cache->as = as;
cache->xlat = addr;
return len;
}
void address_space_cache_invalidate(MemoryRegionCache *cache,
hwaddr addr,
hwaddr access_len)
{
assert(cache->is_write);
invalidate_and_set_dirty(cache->mr, addr + cache->xlat, access_len);
}
void address_space_cache_destroy(MemoryRegionCache *cache)
{
if (!cache->mr) {
return;
}
if (xen_enabled()) {
xen_invalidate_map_cache_entry(cache->ptr);
}
memory_region_unref(cache->mr);
cache->mr = NULL;
}
/* Called from RCU critical section. This function has the same
* semantics as address_space_translate, but it only works on a
* predefined range of a MemoryRegion that was mapped with
* address_space_cache_init.
*/
static inline MemoryRegion *address_space_translate_cached(
MemoryRegionCache *cache, hwaddr addr, hwaddr *xlat,
hwaddr *plen, bool is_write)
{
assert(addr < cache->len && *plen <= cache->len - addr);
*xlat = addr + cache->xlat;
return cache->mr;
cache->as = NULL;
}
#define ARG1_DECL MemoryRegionCache *cache
#define ARG1 cache
#define SUFFIX _cached
#define TRANSLATE(...) address_space_translate_cached(cache, __VA_ARGS__)
#define TRANSLATE(addr, ...) \
address_space_translate(cache->as, cache->xlat + (addr), __VA_ARGS__)
#define IS_DIRECT(mr, is_write) true
#define MAP_RAM(mr, ofs) (cache->ptr + (ofs - cache->xlat))
#define INVALIDATE(mr, ofs, len) ((void)0)
#define RCU_READ_LOCK() ((void)0)
#define RCU_READ_UNLOCK() ((void)0)
#define MAP_RAM(mr, ofs) qemu_map_ram_ptr((mr)->ram_block, ofs)
#define INVALIDATE(mr, ofs, len) invalidate_and_set_dirty(mr, ofs, len)
#define RCU_READ_LOCK() rcu_read_lock()
#define RCU_READ_UNLOCK() rcu_read_unlock()
#include "memory_ldst.inc.c"
/* virtual memory access for debug (includes writing to ROM) */
@@ -3227,6 +3347,7 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,
hwaddr phys_addr;
target_ulong page;
cpu_synchronize_state(cpu);
while (len > 0) {
int asidx;
MemTxAttrs attrs;
@@ -3260,9 +3381,9 @@ int cpu_memory_rw_debug(CPUState *cpu, target_ulong addr,
* Allows code that needs to deal with migration bitmaps etc to still be built
* target independent.
*/
size_t qemu_target_page_bits(void)
size_t qemu_target_page_size(void)
{
return TARGET_PAGE_BITS;
return TARGET_PAGE_SIZE;
}
#endif

View File

@@ -801,6 +801,20 @@ STEXI
Show information about hotpluggable CPUs
ETEXI
STEXI
@item info vm-generation-id
@findex vm-generation-id
Show Virtual Machine Generation ID
ETEXI
{
.name = "vm-generation-id",
.args_type = "",
.params = "",
.help = "Show Virtual Machine Generation ID",
.cmd = hmp_info_vm_generation_id,
},
STEXI
@end table
ETEXI

View File

@@ -523,6 +523,38 @@ Dump 80 16 bit values at the start of the video memory.
0x000b8090: 0x0720 0x0720 0x0720 0x0720 0x0720 0x0720 0x0720 0x0720
@end smallexample
@end itemize
ETEXI
{
.name = "gpa2hva",
.args_type = "addr:l",
.params = "addr",
.help = "print the host virtual address corresponding to a guest physical address",
.cmd = hmp_gpa2hva,
},
STEXI
@item gpa2hva @var{addr}
@findex gpa2hva
Print the host virtual address at which the guest's physical address @var{addr}
is mapped.
ETEXI
#ifdef CONFIG_LINUX
{
.name = "gpa2hpa",
.args_type = "addr:l",
.params = "addr",
.help = "print the host physical address corresponding to a guest physical address",
.cmd = hmp_gpa2hpa,
},
#endif
STEXI
@item gpa2hpa @var{addr}
@findex gpa2hpa
Print the host physical address at which the guest's physical address @var{addr}
is mapped.
ETEXI
{

40
hmp.c
View File

@@ -215,6 +215,9 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
info->ram->normal_bytes >> 10);
monitor_printf(mon, "dirty sync count: %" PRIu64 "\n",
info->ram->dirty_sync_count);
monitor_printf(mon, "page size: %" PRIu64 " kbytes\n",
info->ram->page_size >> 10);
if (info->ram->dirty_pages_rate) {
monitor_printf(mon, "dirty pages rate: %" PRIu64 " pages\n",
info->ram->dirty_pages_rate);
@@ -265,13 +268,11 @@ void hmp_info_migrate_capabilities(Monitor *mon, const QDict *qdict)
caps = qmp_query_migrate_capabilities(NULL);
if (caps) {
monitor_printf(mon, "capabilities: ");
for (cap = caps; cap; cap = cap->next) {
monitor_printf(mon, "%s: %s ",
monitor_printf(mon, "%s: %s\n",
MigrationCapability_lookup[cap->value->capability],
cap->value->state ? "on" : "off");
}
monitor_printf(mon, "\n");
}
qapi_free_MigrationCapabilityStatusList(caps);
@@ -284,46 +285,44 @@ void hmp_info_migrate_parameters(Monitor *mon, const QDict *qdict)
params = qmp_query_migrate_parameters(NULL);
if (params) {
monitor_printf(mon, "parameters:");
assert(params->has_compress_level);
monitor_printf(mon, " %s: %" PRId64,
monitor_printf(mon, "%s: %" PRId64 "\n",
MigrationParameter_lookup[MIGRATION_PARAMETER_COMPRESS_LEVEL],
params->compress_level);
assert(params->has_compress_threads);
monitor_printf(mon, " %s: %" PRId64,
monitor_printf(mon, "%s: %" PRId64 "\n",
MigrationParameter_lookup[MIGRATION_PARAMETER_COMPRESS_THREADS],
params->compress_threads);
assert(params->has_decompress_threads);
monitor_printf(mon, " %s: %" PRId64,
monitor_printf(mon, "%s: %" PRId64 "\n",
MigrationParameter_lookup[MIGRATION_PARAMETER_DECOMPRESS_THREADS],
params->decompress_threads);
assert(params->has_cpu_throttle_initial);
monitor_printf(mon, " %s: %" PRId64,
monitor_printf(mon, "%s: %" PRId64 "\n",
MigrationParameter_lookup[MIGRATION_PARAMETER_CPU_THROTTLE_INITIAL],
params->cpu_throttle_initial);
assert(params->has_cpu_throttle_increment);
monitor_printf(mon, " %s: %" PRId64,
monitor_printf(mon, "%s: %" PRId64 "\n",
MigrationParameter_lookup[MIGRATION_PARAMETER_CPU_THROTTLE_INCREMENT],
params->cpu_throttle_increment);
monitor_printf(mon, " %s: '%s'",
monitor_printf(mon, "%s: '%s'\n",
MigrationParameter_lookup[MIGRATION_PARAMETER_TLS_CREDS],
params->has_tls_creds ? params->tls_creds : "");
monitor_printf(mon, " %s: '%s'",
monitor_printf(mon, "%s: '%s'\n",
MigrationParameter_lookup[MIGRATION_PARAMETER_TLS_HOSTNAME],
params->has_tls_hostname ? params->tls_hostname : "");
assert(params->has_max_bandwidth);
monitor_printf(mon, " %s: %" PRId64 " bytes/second",
monitor_printf(mon, "%s: %" PRId64 " bytes/second\n",
MigrationParameter_lookup[MIGRATION_PARAMETER_MAX_BANDWIDTH],
params->max_bandwidth);
assert(params->has_downtime_limit);
monitor_printf(mon, " %s: %" PRId64 " milliseconds",
monitor_printf(mon, "%s: %" PRId64 " milliseconds\n",
MigrationParameter_lookup[MIGRATION_PARAMETER_DOWNTIME_LIMIT],
params->downtime_limit);
assert(params->has_x_checkpoint_delay);
monitor_printf(mon, " %s: %" PRId64,
monitor_printf(mon, "%s: %" PRId64 "\n",
MigrationParameter_lookup[MIGRATION_PARAMETER_X_CHECKPOINT_DELAY],
params->x_checkpoint_delay);
monitor_printf(mon, "\n");
}
qapi_free_MigrationParameters(params);
@@ -2605,3 +2604,14 @@ void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict)
qapi_free_HotpluggableCPUList(saved);
}
void hmp_info_vm_generation_id(Monitor *mon, const QDict *qdict)
{
Error *err = NULL;
GuidInfo *info = qmp_query_vm_generation_id(&err);
if (info) {
monitor_printf(mon, "%s\n", info->guid);
}
hmp_handle_error(mon, &err);
qapi_free_GuidInfo(info);
}

1
hmp.h
View File

@@ -137,5 +137,6 @@ void hmp_rocker_of_dpa_flows(Monitor *mon, const QDict *qdict);
void hmp_rocker_of_dpa_groups(Monitor *mon, const QDict *qdict);
void hmp_info_dump(Monitor *mon, const QDict *qdict);
void hmp_hotpluggable_cpus(Monitor *mon, const QDict *qdict);
void hmp_info_vm_generation_id(Monitor *mon, const QDict *qdict);
#endif

View File

@@ -349,7 +349,7 @@ static int local_set_cred_passthrough(FsContext *fs_ctx, int dirfd,
const char *name, FsCred *credp)
{
if (fchownat(dirfd, name, credp->fc_uid, credp->fc_gid,
AT_SYMLINK_NOFOLLOW | AT_EMPTY_PATH) < 0) {
AT_SYMLINK_NOFOLLOW) < 0) {
/*
* If we fail to change ownership and if we are
* using security model none. Ignore the error
@@ -435,6 +435,7 @@ static int local_opendir(FsContext *ctx,
stream = fdopendir(dirfd);
if (!stream) {
close(dirfd);
return -1;
}
fs->dir.stream = stream;
@@ -959,7 +960,7 @@ static int local_unlinkat_common(FsContext *ctx, int dirfd, const char *name,
if (flags == AT_REMOVEDIR) {
int fd;
fd = openat(dirfd, name, O_RDONLY | O_DIRECTORY | O_PATH);
fd = openat_dir(dirfd, name);
if (fd == -1) {
goto err_out;
}
@@ -1008,7 +1009,7 @@ static int local_remove(FsContext *ctx, const char *path)
int err = -1;
dirfd = local_opendir_nofollow(ctx, dirpath);
if (dirfd) {
if (dirfd == -1) {
goto out;
}
@@ -1052,6 +1053,9 @@ static int local_statfs(FsContext *s, V9fsPath *fs_path, struct statfs *stbuf)
int fd, ret;
fd = local_open_nofollow(s, fs_path->data, O_RDONLY, 0);
if (fd == -1) {
return -1;
}
ret = fstatfs(fd, stbuf);
close_preserve_errno(fd);
return ret;
@@ -1094,8 +1098,13 @@ static int local_name_to_path(FsContext *ctx, V9fsPath *dir_path,
{
if (dir_path) {
v9fs_path_sprintf(target, "%s/%s", dir_path->data, name);
} else {
} else if (strcmp(name, "/")) {
v9fs_path_sprintf(target, "%s", name);
} else {
/* We want the path of the export root to be relative, otherwise
* "*at()" syscalls would treat it as "/" in the host.
*/
v9fs_path_sprintf(target, "%s", ".");
}
return 0;
}

View File

@@ -165,7 +165,8 @@ static int v9fs_receive_response(V9fsProxy *proxy, int type,
return retval;
}
reply->iov_len = PROXY_HDR_SZ;
proxy_unmarshal(reply, 0, "dd", &header.type, &header.size);
retval = proxy_unmarshal(reply, 0, "dd", &header.type, &header.size);
assert(retval == 4 * 2);
/*
* if response size > PROXY_MAX_IO_SZ, read the response but ignore it and
* return -ENOBUFS
@@ -194,9 +195,7 @@ static int v9fs_receive_response(V9fsProxy *proxy, int type,
if (header.type == T_ERROR) {
int ret;
ret = proxy_unmarshal(reply, PROXY_HDR_SZ, "d", status);
if (ret < 0) {
*status = ret;
}
assert(ret == 4);
return 0;
}
@@ -213,6 +212,7 @@ static int v9fs_receive_response(V9fsProxy *proxy, int type,
&prstat.st_atim_sec, &prstat.st_atim_nsec,
&prstat.st_mtim_sec, &prstat.st_mtim_nsec,
&prstat.st_ctim_sec, &prstat.st_ctim_nsec);
assert(retval == 8 * 3 + 4 * 3 + 8 * 10);
prstat_to_stat(response, &prstat);
break;
}
@@ -225,6 +225,7 @@ static int v9fs_receive_response(V9fsProxy *proxy, int type,
&prstfs.f_files, &prstfs.f_ffree,
&prstfs.f_fsid[0], &prstfs.f_fsid[1],
&prstfs.f_namelen, &prstfs.f_frsize);
assert(retval == 8 * 11);
prstatfs_to_statfs(response, &prstfs);
break;
}
@@ -246,7 +247,8 @@ static int v9fs_receive_response(V9fsProxy *proxy, int type,
break;
}
case T_GETVERSION:
proxy_unmarshal(reply, PROXY_HDR_SZ, "q", response);
retval = proxy_unmarshal(reply, PROXY_HDR_SZ, "q", response);
assert(retval == 8);
break;
default:
return -1;
@@ -274,18 +276,16 @@ static int v9fs_receive_status(V9fsProxy *proxy,
return retval;
}
reply->iov_len = PROXY_HDR_SZ;
proxy_unmarshal(reply, 0, "dd", &header.type, &header.size);
if (header.size != sizeof(int)) {
*status = -ENOBUFS;
return 0;
}
retval = proxy_unmarshal(reply, 0, "dd", &header.type, &header.size);
assert(retval == 4 * 2);
retval = socket_read(proxy->sockfd,
reply->iov_base + PROXY_HDR_SZ, header.size);
if (retval < 0) {
return retval;
}
reply->iov_len += header.size;
proxy_unmarshal(reply, PROXY_HDR_SZ, "d", status);
retval = proxy_unmarshal(reply, PROXY_HDR_SZ, "d", status);
assert(retval == 4);
return 0;
}

View File

@@ -22,7 +22,13 @@ static inline void close_preserve_errno(int fd)
static inline int openat_dir(int dirfd, const char *name)
{
return openat(dirfd, name, O_DIRECTORY | O_RDONLY | O_PATH);
#ifdef O_PATH
#define OPENAT_DIR_O_PATH O_PATH
#else
#define OPENAT_DIR_O_PATH 0
#endif
return openat(dirfd, name,
O_DIRECTORY | O_RDONLY | O_NOFOLLOW | OPENAT_DIR_O_PATH);
}
static inline int openat_file(int dirfd, const char *name, int flags,

View File

@@ -108,6 +108,7 @@ ssize_t v9fs_list_xattr(FsContext *ctx, const char *path,
g_free(name);
close_preserve_errno(dirfd);
if (xattr_len < 0) {
g_free(orig_value);
return -1;
}

View File

@@ -539,14 +539,15 @@ static void coroutine_fn virtfs_reset(V9fsPDU *pdu)
/* Free all fids */
while (s->fid_list) {
/* Get fid */
fidp = s->fid_list;
s->fid_list = fidp->next;
fidp->ref++;
if (fidp->ref) {
fidp->clunked = 1;
} else {
free_fid(pdu, fidp);
}
/* Clunk fid */
s->fid_list = fidp->next;
fidp->clunked = 1;
put_fid(pdu, fidp);
}
}
@@ -1550,6 +1551,10 @@ static void coroutine_fn v9fs_lcreate(void *opaque)
err = -ENOENT;
goto out_nofid;
}
if (fidp->fid_type != P9_FID_NONE) {
err = -EINVAL;
goto out;
}
flags = get_dotl_openflags(pdu->s, flags);
err = v9fs_co_open2(pdu, fidp, &name, gid,
@@ -2153,6 +2158,10 @@ static void coroutine_fn v9fs_create(void *opaque)
err = -EINVAL;
goto out_nofid;
}
if (fidp->fid_type != P9_FID_NONE) {
err = -EINVAL;
goto out;
}
if (perm & P9_STAT_MODE_DIR) {
err = v9fs_co_mkdir(pdu, fidp, &name, perm & 0777,
fidp->uid, -1, &stbuf);
@@ -2353,7 +2362,7 @@ static void coroutine_fn v9fs_flush(void *opaque)
ssize_t err;
int16_t tag;
size_t offset = 7;
V9fsPDU *cancel_pdu;
V9fsPDU *cancel_pdu = NULL;
V9fsPDU *pdu = opaque;
V9fsState *s = pdu->s;
@@ -2364,9 +2373,13 @@ static void coroutine_fn v9fs_flush(void *opaque)
}
trace_v9fs_flush(pdu->tag, pdu->id, tag);
QLIST_FOREACH(cancel_pdu, &s->active_list, next) {
if (cancel_pdu->tag == tag) {
break;
if (pdu->tag == tag) {
error_report("Warning: the guest sent a self-referencing 9P flush request");
} else {
QLIST_FOREACH(cancel_pdu, &s->active_list, next) {
if (cancel_pdu->tag == tag) {
break;
}
}
}
if (cancel_pdu) {
@@ -2375,8 +2388,10 @@ static void coroutine_fn v9fs_flush(void *opaque)
* Wait for pdu to complete.
*/
qemu_co_queue_wait(&cancel_pdu->complete, NULL);
cancel_pdu->cancelled = 0;
pdu_free(cancel_pdu);
if (!qemu_co_queue_next(&cancel_pdu->complete)) {
cancel_pdu->cancelled = 0;
pdu_free(cancel_pdu);
}
}
pdu_complete(pdu, 7);
}

View File

@@ -119,6 +119,12 @@ static inline char *rpath(FsContext *ctx, const char *path)
typedef struct V9fsPDU V9fsPDU;
struct V9fsState;
typedef struct {
uint32_t size_le;
uint8_t id;
uint16_t tag_le;
} QEMU_PACKED P9MsgHeader;
struct V9fsPDU
{
uint32_t size;

View File

@@ -5,5 +5,6 @@ common-obj-y += coth.o cofs.o codir.o cofile.o
common-obj-y += coxattr.o 9p-synth.o
common-obj-$(CONFIG_OPEN_BY_HANDLE) += 9p-handle.o
common-obj-y += 9p-proxy.o
common-obj-$(CONFIG_XEN) += xen-9p-backend.o
obj-y += virtio-9p-device.o

View File

@@ -46,11 +46,7 @@ static void handle_9p_output(VirtIODevice *vdev, VirtQueue *vq)
VirtQueueElement *elem;
while ((pdu = pdu_alloc(s))) {
struct {
uint32_t size_le;
uint8_t id;
uint16_t tag_le;
} QEMU_PACKED out;
P9MsgHeader out;
elem = virtqueue_pop(vq, sizeof(VirtQueueElement));
if (!elem) {

440
hw/9pfs/xen-9p-backend.c Normal file
View File

@@ -0,0 +1,440 @@
/*
* Xen 9p backend
*
* Copyright Aporeto 2017
*
* Authors:
* Stefano Stabellini <stefano@aporeto.com>
*
*/
#include "qemu/osdep.h"
#include "hw/hw.h"
#include "hw/9pfs/9p.h"
#include "hw/xen/xen_backend.h"
#include "hw/9pfs/xen-9pfs.h"
#include "qemu/config-file.h"
#include "fsdev/qemu-fsdev.h"
#define VERSIONS "1"
#define MAX_RINGS 8
#define MAX_RING_ORDER 8
typedef struct Xen9pfsRing {
struct Xen9pfsDev *priv;
int ref;
xenevtchn_handle *evtchndev;
int evtchn;
int local_port;
int ring_order;
struct xen_9pfs_data_intf *intf;
unsigned char *data;
struct xen_9pfs_data ring;
struct iovec *sg;
QEMUBH *bh;
/* local copies, so that we can read/write PDU data directly from
* the ring */
RING_IDX out_cons, out_size, in_cons;
bool inprogress;
} Xen9pfsRing;
typedef struct Xen9pfsDev {
struct XenDevice xendev; /* must be first */
V9fsState state;
char *path;
char *security_model;
char *tag;
char *id;
int num_rings;
Xen9pfsRing *rings;
} Xen9pfsDev;
static void xen_9pfs_in_sg(Xen9pfsRing *ring,
struct iovec *in_sg,
int *num,
uint32_t idx,
uint32_t size)
{
RING_IDX cons, prod, masked_prod, masked_cons;
cons = ring->intf->in_cons;
prod = ring->intf->in_prod;
xen_rmb();
masked_prod = xen_9pfs_mask(prod, XEN_FLEX_RING_SIZE(ring->ring_order));
masked_cons = xen_9pfs_mask(cons, XEN_FLEX_RING_SIZE(ring->ring_order));
if (masked_prod < masked_cons) {
in_sg[0].iov_base = ring->ring.in + masked_prod;
in_sg[0].iov_len = masked_cons - masked_prod;
*num = 1;
} else {
in_sg[0].iov_base = ring->ring.in + masked_prod;
in_sg[0].iov_len = XEN_FLEX_RING_SIZE(ring->ring_order) - masked_prod;
in_sg[1].iov_base = ring->ring.in;
in_sg[1].iov_len = masked_cons;
*num = 2;
}
}
static void xen_9pfs_out_sg(Xen9pfsRing *ring,
struct iovec *out_sg,
int *num,
uint32_t idx)
{
RING_IDX cons, prod, masked_prod, masked_cons;
cons = ring->intf->out_cons;
prod = ring->intf->out_prod;
xen_rmb();
masked_prod = xen_9pfs_mask(prod, XEN_FLEX_RING_SIZE(ring->ring_order));
masked_cons = xen_9pfs_mask(cons, XEN_FLEX_RING_SIZE(ring->ring_order));
if (masked_cons < masked_prod) {
out_sg[0].iov_base = ring->ring.out + masked_cons;
out_sg[0].iov_len = ring->out_size;
*num = 1;
} else {
if (ring->out_size >
(XEN_FLEX_RING_SIZE(ring->ring_order) - masked_cons)) {
out_sg[0].iov_base = ring->ring.out + masked_cons;
out_sg[0].iov_len = XEN_FLEX_RING_SIZE(ring->ring_order) -
masked_cons;
out_sg[1].iov_base = ring->ring.out;
out_sg[1].iov_len = ring->out_size -
(XEN_FLEX_RING_SIZE(ring->ring_order) -
masked_cons);
*num = 2;
} else {
out_sg[0].iov_base = ring->ring.out + masked_cons;
out_sg[0].iov_len = ring->out_size;
*num = 1;
}
}
}
static ssize_t xen_9pfs_pdu_vmarshal(V9fsPDU *pdu,
size_t offset,
const char *fmt,
va_list ap)
{
Xen9pfsDev *xen_9pfs = container_of(pdu->s, Xen9pfsDev, state);
struct iovec in_sg[2];
int num;
xen_9pfs_in_sg(&xen_9pfs->rings[pdu->tag % xen_9pfs->num_rings],
in_sg, &num, pdu->idx, ROUND_UP(offset + 128, 512));
return v9fs_iov_vmarshal(in_sg, num, offset, 0, fmt, ap);
}
static ssize_t xen_9pfs_pdu_vunmarshal(V9fsPDU *pdu,
size_t offset,
const char *fmt,
va_list ap)
{
Xen9pfsDev *xen_9pfs = container_of(pdu->s, Xen9pfsDev, state);
struct iovec out_sg[2];
int num;
xen_9pfs_out_sg(&xen_9pfs->rings[pdu->tag % xen_9pfs->num_rings],
out_sg, &num, pdu->idx);
return v9fs_iov_vunmarshal(out_sg, num, offset, 0, fmt, ap);
}
static void xen_9pfs_init_out_iov_from_pdu(V9fsPDU *pdu,
struct iovec **piov,
unsigned int *pniov)
{
Xen9pfsDev *xen_9pfs = container_of(pdu->s, Xen9pfsDev, state);
Xen9pfsRing *ring = &xen_9pfs->rings[pdu->tag % xen_9pfs->num_rings];
int num;
g_free(ring->sg);
ring->sg = g_malloc0(sizeof(*ring->sg) * 2);
xen_9pfs_out_sg(ring, ring->sg, &num, pdu->idx);
*piov = ring->sg;
*pniov = num;
}
static void xen_9pfs_init_in_iov_from_pdu(V9fsPDU *pdu,
struct iovec **piov,
unsigned int *pniov,
size_t size)
{
Xen9pfsDev *xen_9pfs = container_of(pdu->s, Xen9pfsDev, state);
Xen9pfsRing *ring = &xen_9pfs->rings[pdu->tag % xen_9pfs->num_rings];
int num;
g_free(ring->sg);
ring->sg = g_malloc0(sizeof(*ring->sg) * 2);
xen_9pfs_in_sg(ring, ring->sg, &num, pdu->idx, size);
*piov = ring->sg;
*pniov = num;
}
static void xen_9pfs_push_and_notify(V9fsPDU *pdu)
{
RING_IDX prod;
Xen9pfsDev *priv = container_of(pdu->s, Xen9pfsDev, state);
Xen9pfsRing *ring = &priv->rings[pdu->tag % priv->num_rings];
g_free(ring->sg);
ring->sg = NULL;
ring->intf->out_cons = ring->out_cons;
xen_wmb();
prod = ring->intf->in_prod;
xen_rmb();
ring->intf->in_prod = prod + pdu->size;
xen_wmb();
ring->inprogress = false;
xenevtchn_notify(ring->evtchndev, ring->local_port);
qemu_bh_schedule(ring->bh);
}
static const struct V9fsTransport xen_9p_transport = {
.pdu_vmarshal = xen_9pfs_pdu_vmarshal,
.pdu_vunmarshal = xen_9pfs_pdu_vunmarshal,
.init_in_iov_from_pdu = xen_9pfs_init_in_iov_from_pdu,
.init_out_iov_from_pdu = xen_9pfs_init_out_iov_from_pdu,
.push_and_notify = xen_9pfs_push_and_notify,
};
static int xen_9pfs_init(struct XenDevice *xendev)
{
return 0;
}
static int xen_9pfs_receive(Xen9pfsRing *ring)
{
P9MsgHeader h;
RING_IDX cons, prod, masked_prod, masked_cons;
V9fsPDU *pdu;
if (ring->inprogress) {
return 0;
}
cons = ring->intf->out_cons;
prod = ring->intf->out_prod;
xen_rmb();
if (xen_9pfs_queued(prod, cons, XEN_FLEX_RING_SIZE(ring->ring_order)) <
sizeof(h)) {
return 0;
}
ring->inprogress = true;
masked_prod = xen_9pfs_mask(prod, XEN_FLEX_RING_SIZE(ring->ring_order));
masked_cons = xen_9pfs_mask(cons, XEN_FLEX_RING_SIZE(ring->ring_order));
xen_9pfs_read_packet((uint8_t *) &h, ring->ring.out, sizeof(h),
masked_prod, &masked_cons,
XEN_FLEX_RING_SIZE(ring->ring_order));
/* cannot fail, because we only handle one request per ring at a time */
pdu = pdu_alloc(&ring->priv->state);
pdu->size = le32_to_cpu(h.size_le);
pdu->id = h.id;
pdu->tag = le32_to_cpu(h.tag_le);
ring->out_size = le32_to_cpu(h.size_le);
ring->out_cons = cons + le32_to_cpu(h.size_le);
qemu_co_queue_init(&pdu->complete);
pdu_submit(pdu);
return 0;
}
static void xen_9pfs_bh(void *opaque)
{
Xen9pfsRing *ring = opaque;
xen_9pfs_receive(ring);
}
static void xen_9pfs_evtchn_event(void *opaque)
{
Xen9pfsRing *ring = opaque;
evtchn_port_t port;
port = xenevtchn_pending(ring->evtchndev);
xenevtchn_unmask(ring->evtchndev, port);
qemu_bh_schedule(ring->bh);
}
static int xen_9pfs_free(struct XenDevice *xendev)
{
int i;
Xen9pfsDev *xen_9pdev = container_of(xendev, Xen9pfsDev, xendev);
g_free(xen_9pdev->id);
g_free(xen_9pdev->tag);
g_free(xen_9pdev->path);
g_free(xen_9pdev->security_model);
for (i = 0; i < xen_9pdev->num_rings; i++) {
if (xen_9pdev->rings[i].data != NULL) {
xengnttab_unmap(xen_9pdev->xendev.gnttabdev,
xen_9pdev->rings[i].data,
(1 << xen_9pdev->rings[i].ring_order));
}
if (xen_9pdev->rings[i].intf != NULL) {
xengnttab_unmap(xen_9pdev->xendev.gnttabdev,
xen_9pdev->rings[i].intf,
1);
}
if (xen_9pdev->rings[i].evtchndev > 0) {
qemu_set_fd_handler(xenevtchn_fd(xen_9pdev->rings[i].evtchndev),
NULL, NULL, NULL);
xenevtchn_unbind(xen_9pdev->rings[i].evtchndev,
xen_9pdev->rings[i].local_port);
}
if (xen_9pdev->rings[i].bh != NULL) {
qemu_bh_delete(xen_9pdev->rings[i].bh);
}
}
g_free(xen_9pdev->rings);
return 0;
}
static int xen_9pfs_connect(struct XenDevice *xendev)
{
int i;
Xen9pfsDev *xen_9pdev = container_of(xendev, Xen9pfsDev, xendev);
V9fsState *s = &xen_9pdev->state;
QemuOpts *fsdev;
if (xenstore_read_fe_int(&xen_9pdev->xendev, "num-rings",
&xen_9pdev->num_rings) == -1 ||
xen_9pdev->num_rings > MAX_RINGS || xen_9pdev->num_rings < 1) {
return -1;
}
xen_9pdev->rings = g_malloc0(xen_9pdev->num_rings * sizeof(Xen9pfsRing));
for (i = 0; i < xen_9pdev->num_rings; i++) {
char *str;
int ring_order;
xen_9pdev->rings[i].priv = xen_9pdev;
xen_9pdev->rings[i].evtchn = -1;
xen_9pdev->rings[i].local_port = -1;
str = g_strdup_printf("ring-ref%u", i);
if (xenstore_read_fe_int(&xen_9pdev->xendev, str,
&xen_9pdev->rings[i].ref) == -1) {
goto out;
}
g_free(str);
str = g_strdup_printf("event-channel-%u", i);
if (xenstore_read_fe_int(&xen_9pdev->xendev, str,
&xen_9pdev->rings[i].evtchn) == -1) {
goto out;
}
g_free(str);
xen_9pdev->rings[i].intf = xengnttab_map_grant_ref(
xen_9pdev->xendev.gnttabdev,
xen_9pdev->xendev.dom,
xen_9pdev->rings[i].ref,
PROT_READ | PROT_WRITE);
if (!xen_9pdev->rings[i].intf) {
goto out;
}
ring_order = xen_9pdev->rings[i].intf->ring_order;
if (ring_order > MAX_RING_ORDER) {
goto out;
}
xen_9pdev->rings[i].ring_order = ring_order;
xen_9pdev->rings[i].data = xengnttab_map_domain_grant_refs(
xen_9pdev->xendev.gnttabdev,
(1 << ring_order),
xen_9pdev->xendev.dom,
xen_9pdev->rings[i].intf->ref,
PROT_READ | PROT_WRITE);
if (!xen_9pdev->rings[i].data) {
goto out;
}
xen_9pdev->rings[i].ring.in = xen_9pdev->rings[i].data;
xen_9pdev->rings[i].ring.out = xen_9pdev->rings[i].data +
XEN_FLEX_RING_SIZE(ring_order);
xen_9pdev->rings[i].bh = qemu_bh_new(xen_9pfs_bh, &xen_9pdev->rings[i]);
xen_9pdev->rings[i].out_cons = 0;
xen_9pdev->rings[i].out_size = 0;
xen_9pdev->rings[i].inprogress = false;
xen_9pdev->rings[i].evtchndev = xenevtchn_open(NULL, 0);
if (xen_9pdev->rings[i].evtchndev == NULL) {
goto out;
}
fcntl(xenevtchn_fd(xen_9pdev->rings[i].evtchndev), F_SETFD, FD_CLOEXEC);
xen_9pdev->rings[i].local_port = xenevtchn_bind_interdomain
(xen_9pdev->rings[i].evtchndev,
xendev->dom,
xen_9pdev->rings[i].evtchn);
if (xen_9pdev->rings[i].local_port == -1) {
xen_pv_printf(xendev, 0,
"xenevtchn_bind_interdomain failed port=%d\n",
xen_9pdev->rings[i].evtchn);
goto out;
}
xen_pv_printf(xendev, 2, "bind evtchn port %d\n", xendev->local_port);
qemu_set_fd_handler(xenevtchn_fd(xen_9pdev->rings[i].evtchndev),
xen_9pfs_evtchn_event, NULL, &xen_9pdev->rings[i]);
}
xen_9pdev->security_model = xenstore_read_be_str(xendev, "security_model");
xen_9pdev->path = xenstore_read_be_str(xendev, "path");
xen_9pdev->id = s->fsconf.fsdev_id =
g_strdup_printf("xen9p%d", xendev->dev);
xen_9pdev->tag = s->fsconf.tag = xenstore_read_fe_str(xendev, "tag");
v9fs_register_transport(s, &xen_9p_transport);
fsdev = qemu_opts_create(qemu_find_opts("fsdev"),
s->fsconf.tag,
1, NULL);
qemu_opt_set(fsdev, "fsdriver", "local", NULL);
qemu_opt_set(fsdev, "path", xen_9pdev->path, NULL);
qemu_opt_set(fsdev, "security_model", xen_9pdev->security_model, NULL);
qemu_opts_set_id(fsdev, s->fsconf.fsdev_id);
qemu_fsdev_add(fsdev);
v9fs_device_realize_common(s, NULL);
return 0;
out:
xen_9pfs_free(xendev);
return -1;
}
static void xen_9pfs_alloc(struct XenDevice *xendev)
{
xenstore_write_be_str(xendev, "versions", VERSIONS);
xenstore_write_be_int(xendev, "max-rings", MAX_RINGS);
xenstore_write_be_int(xendev, "max-ring-page-order", MAX_RING_ORDER);
}
static void xen_9pfs_disconnect(struct XenDevice *xendev)
{
/* Dynamic hotplug of PV filesystems at runtime is not supported. */
}
struct XenDevOps xen_9pfs_ops = {
.size = sizeof(Xen9pfsDev),
.flags = DEVOPS_FLAG_NEED_GNTDEV,
.alloc = xen_9pfs_alloc,
.init = xen_9pfs_init,
.initialise = xen_9pfs_connect,
.disconnect = xen_9pfs_disconnect,
.free = xen_9pfs_free,
};

21
hw/9pfs/xen-9pfs.h Normal file
View File

@@ -0,0 +1,21 @@
/*
* Xen 9p backend
*
* Copyright Aporeto 2017
*
* Authors:
* Stefano Stabellini <stefano@aporeto.com>
*
* This work is licensed under the terms of the GNU GPL version 2 or
* later. See the COPYING file in the top-level directory.
*
*/
#include <xen/io/protocols.h>
#include "hw/xen/io/ring.h"
/*
* Do not merge into xen-9p-backend.c: clang doesn't allow unused static
* inline functions in c files.
*/
DEFINE_XEN_FLEX_RING_AND_INTF(xen_9pfs);

View File

@@ -5,6 +5,7 @@ common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu_hotplug.o
common-obj-$(CONFIG_ACPI_MEMORY_HOTPLUG) += memory_hotplug.o
common-obj-$(CONFIG_ACPI_CPU_HOTPLUG) += cpu.o
common-obj-$(CONFIG_ACPI_NVDIMM) += nvdimm.o
common-obj-$(CONFIG_ACPI_VMGENID) += vmgenid.o
common-obj-$(call lnot,$(CONFIG_ACPI_X86)) += acpi-stub.o
common-obj-y += acpi_interface.o

View File

@@ -1559,6 +1559,7 @@ void acpi_build_tables_init(AcpiBuildTables *tables)
tables->rsdp = g_array_new(false, true /* clear */, 1);
tables->table_data = g_array_new(false, true /* clear */, 1);
tables->tcpalog = g_array_new(false, true /* clear */, 1);
tables->vmgenid = g_array_new(false, true /* clear */, 1);
tables->linker = bios_linker_loader_init();
}
@@ -1568,6 +1569,7 @@ void acpi_build_tables_cleanup(AcpiBuildTables *tables, bool mfre)
g_array_free(tables->rsdp, true);
g_array_free(tables->table_data, true);
g_array_free(tables->tcpalog, mfre);
g_array_free(tables->vmgenid, mfre);
}
/* Build rsdt table */

View File

@@ -78,6 +78,21 @@ struct BiosLinkerLoaderEntry {
uint32_t length;
} cksum;
/*
* COMMAND_WRITE_POINTER - write the fw_cfg file (originating from
* @dest_file) at @wr_pointer.offset, by adding a pointer to
* @src_offset within the table originating from @src_file.
* 1,2,4 or 8 byte unsigned addition is used depending on
* @wr_pointer.size.
*/
struct {
char dest_file[BIOS_LINKER_LOADER_FILESZ];
char src_file[BIOS_LINKER_LOADER_FILESZ];
uint32_t dst_offset;
uint32_t src_offset;
uint8_t size;
} wr_pointer;
/* padding */
char pad[124];
};
@@ -85,9 +100,10 @@ struct BiosLinkerLoaderEntry {
typedef struct BiosLinkerLoaderEntry BiosLinkerLoaderEntry;
enum {
BIOS_LINKER_LOADER_COMMAND_ALLOCATE = 0x1,
BIOS_LINKER_LOADER_COMMAND_ADD_POINTER = 0x2,
BIOS_LINKER_LOADER_COMMAND_ADD_CHECKSUM = 0x3,
BIOS_LINKER_LOADER_COMMAND_ALLOCATE = 0x1,
BIOS_LINKER_LOADER_COMMAND_ADD_POINTER = 0x2,
BIOS_LINKER_LOADER_COMMAND_ADD_CHECKSUM = 0x3,
BIOS_LINKER_LOADER_COMMAND_WRITE_POINTER = 0x4,
};
enum {
@@ -278,3 +294,47 @@ void bios_linker_loader_add_pointer(BIOSLinker *linker,
g_array_append_vals(linker->cmd_blob, &entry, sizeof entry);
}
/*
* bios_linker_loader_write_pointer: ask guest to write a pointer to the
* source file into the destination file, and write it back to QEMU via
* fw_cfg DMA.
*
* @linker: linker object instance
* @dest_file: destination file that must be written
* @dst_patched_offset: location within destination file blob to be patched
* with the pointer to @src_file, in bytes
* @dst_patched_offset_size: size of the pointer to be patched
* at @dst_patched_offset in @dest_file blob, in bytes
* @src_file: source file who's address must be taken
* @src_offset: location within source file blob to which
* @dest_file+@dst_patched_offset will point to after
* firmware's executed WRITE_POINTER command
*/
void bios_linker_loader_write_pointer(BIOSLinker *linker,
const char *dest_file,
uint32_t dst_patched_offset,
uint8_t dst_patched_size,
const char *src_file,
uint32_t src_offset)
{
BiosLinkerLoaderEntry entry;
const BiosLinkerFileEntry *source_file =
bios_linker_find_file(linker, src_file);
assert(source_file);
assert(src_offset < source_file->blob->len);
memset(&entry, 0, sizeof entry);
strncpy(entry.wr_pointer.dest_file, dest_file,
sizeof entry.wr_pointer.dest_file - 1);
strncpy(entry.wr_pointer.src_file, src_file,
sizeof entry.wr_pointer.src_file - 1);
entry.command = cpu_to_le32(BIOS_LINKER_LOADER_COMMAND_WRITE_POINTER);
entry.wr_pointer.dst_offset = cpu_to_le32(dst_patched_offset);
entry.wr_pointer.src_offset = cpu_to_le32(src_offset);
entry.wr_pointer.size = dst_patched_size;
assert(dst_patched_size == 1 || dst_patched_size == 2 ||
dst_patched_size == 4 || dst_patched_size == 8);
g_array_append_vals(linker->cmd_blob, &entry, sizeof entry);
}

View File

@@ -75,8 +75,6 @@ static void tco_timer_expired(void *opaque)
if (pm->smi_en & ICH9_PMIO_SMI_EN_TCO_EN) {
ich9_generate_smi();
} else {
ich9_generate_nmi();
}
tr->tco.rld = tr->tco.tmr;
tco_timer_reload(tr);

281
hw/acpi/vmgenid.c Normal file
View File

@@ -0,0 +1,281 @@
/*
* Virtual Machine Generation ID Device
*
* Copyright (C) 2017 Skyport Systems.
*
* Author: Ben Warren <ben@skyportsystems.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*
*/
#include "qemu/osdep.h"
#include "qmp-commands.h"
#include "hw/acpi/acpi.h"
#include "hw/acpi/aml-build.h"
#include "hw/acpi/vmgenid.h"
#include "hw/nvram/fw_cfg.h"
#include "sysemu/sysemu.h"
void vmgenid_build_acpi(VmGenIdState *vms, GArray *table_data, GArray *guid,
BIOSLinker *linker)
{
Aml *ssdt, *dev, *scope, *method, *addr, *if_ctx;
uint32_t vgia_offset;
QemuUUID guid_le;
/* Fill in the GUID values. These need to be converted to little-endian
* first, since that's what the guest expects
*/
g_array_set_size(guid, VMGENID_FW_CFG_SIZE - ARRAY_SIZE(guid_le.data));
guid_le = vms->guid;
qemu_uuid_bswap(&guid_le);
/* The GUID is written at a fixed offset into the fw_cfg file
* in order to implement the "OVMF SDT Header probe suppressor"
* see docs/specs/vmgenid.txt for more details
*/
g_array_insert_vals(guid, VMGENID_GUID_OFFSET, guid_le.data,
ARRAY_SIZE(guid_le.data));
/* Put this in a separate SSDT table */
ssdt = init_aml_allocator();
/* Reserve space for header */
acpi_data_push(ssdt->buf, sizeof(AcpiTableHeader));
/* Storage for the GUID address */
vgia_offset = table_data->len +
build_append_named_dword(ssdt->buf, "VGIA");
scope = aml_scope("\\_SB");
dev = aml_device("VGEN");
aml_append(dev, aml_name_decl("_HID", aml_string("QEMUVGID")));
aml_append(dev, aml_name_decl("_CID", aml_string("VM_Gen_Counter")));
aml_append(dev, aml_name_decl("_DDN", aml_string("VM_Gen_Counter")));
/* Simple status method to check that address is linked and non-zero */
method = aml_method("_STA", 0, AML_NOTSERIALIZED);
addr = aml_local(0);
aml_append(method, aml_store(aml_int(0xf), addr));
if_ctx = aml_if(aml_equal(aml_name("VGIA"), aml_int(0)));
aml_append(if_ctx, aml_store(aml_int(0), addr));
aml_append(method, if_ctx);
aml_append(method, aml_return(addr));
aml_append(dev, method);
/* the ADDR method returns two 32-bit words representing the lower and
* upper halves * of the physical address of the fw_cfg blob
* (holding the GUID)
*/
method = aml_method("ADDR", 0, AML_NOTSERIALIZED);
addr = aml_local(0);
aml_append(method, aml_store(aml_package(2), addr));
aml_append(method, aml_store(aml_add(aml_name("VGIA"),
aml_int(VMGENID_GUID_OFFSET), NULL),
aml_index(addr, aml_int(0))));
aml_append(method, aml_store(aml_int(0), aml_index(addr, aml_int(1))));
aml_append(method, aml_return(addr));
aml_append(dev, method);
aml_append(scope, dev);
aml_append(ssdt, scope);
/* attach an ACPI notify */
method = aml_method("\\_GPE._E05", 0, AML_NOTSERIALIZED);
aml_append(method, aml_notify(aml_name("\\_SB.VGEN"), aml_int(0x80)));
aml_append(ssdt, method);
g_array_append_vals(table_data, ssdt->buf->data, ssdt->buf->len);
/* Allocate guest memory for the Data fw_cfg blob */
bios_linker_loader_alloc(linker, VMGENID_GUID_FW_CFG_FILE, guid, 4096,
false /* page boundary, high memory */);
/* Patch address of GUID fw_cfg blob into the ADDR fw_cfg blob
* so QEMU can write the GUID there. The address is expected to be
* < 4GB, but write 64 bits anyway.
* The address that is patched in is offset in order to implement
* the "OVMF SDT Header probe suppressor"
* see docs/specs/vmgenid.txt for more details.
*/
bios_linker_loader_write_pointer(linker,
VMGENID_ADDR_FW_CFG_FILE, 0, sizeof(uint64_t),
VMGENID_GUID_FW_CFG_FILE, VMGENID_GUID_OFFSET);
/* Patch address of GUID fw_cfg blob into the AML so OSPM can retrieve
* and read it. Note that while we provide storage for 64 bits, only
* the least-signficant 32 get patched into AML.
*/
bios_linker_loader_add_pointer(linker,
ACPI_BUILD_TABLE_FILE, vgia_offset, sizeof(uint32_t),
VMGENID_GUID_FW_CFG_FILE, 0);
build_header(linker, table_data,
(void *)(table_data->data + table_data->len - ssdt->buf->len),
"SSDT", ssdt->buf->len, 1, NULL, "VMGENID");
free_aml_allocator();
}
void vmgenid_add_fw_cfg(VmGenIdState *vms, FWCfgState *s, GArray *guid)
{
/* Create a read-only fw_cfg file for GUID */
fw_cfg_add_file(s, VMGENID_GUID_FW_CFG_FILE, guid->data,
VMGENID_FW_CFG_SIZE);
/* Create a read-write fw_cfg file for Address */
fw_cfg_add_file_callback(s, VMGENID_ADDR_FW_CFG_FILE, NULL, NULL,
vms->vmgenid_addr_le,
ARRAY_SIZE(vms->vmgenid_addr_le), false);
}
static void vmgenid_update_guest(VmGenIdState *vms)
{
Object *obj = object_resolve_path_type("", TYPE_ACPI_DEVICE_IF, NULL);
uint32_t vmgenid_addr;
QemuUUID guid_le;
if (obj) {
/* Write the GUID to guest memory */
memcpy(&vmgenid_addr, vms->vmgenid_addr_le, sizeof(vmgenid_addr));
vmgenid_addr = le32_to_cpu(vmgenid_addr);
/* A zero value in vmgenid_addr means that BIOS has not yet written
* the address
*/
if (vmgenid_addr) {
/* QemuUUID has the first three words as big-endian, and expect
* that any GUIDs passed in will always be BE. The guest,
* however, will expect the fields to be little-endian.
* Perform a byte swap immediately before writing.
*/
guid_le = vms->guid;
qemu_uuid_bswap(&guid_le);
/* The GUID is written at a fixed offset into the fw_cfg file
* in order to implement the "OVMF SDT Header probe suppressor"
* see docs/specs/vmgenid.txt for more details.
*/
cpu_physical_memory_write(vmgenid_addr, guid_le.data,
sizeof(guid_le.data));
/* Send _GPE.E05 event */
acpi_send_event(DEVICE(obj), ACPI_VMGENID_CHANGE_STATUS);
}
}
}
static void vmgenid_set_guid(Object *obj, const char *value, Error **errp)
{
VmGenIdState *vms = VMGENID(obj);
if (!strcmp(value, "auto")) {
qemu_uuid_generate(&vms->guid);
} else if (qemu_uuid_parse(value, &vms->guid) < 0) {
error_setg(errp, "'%s. %s': Failed to parse GUID string: %s",
object_get_typename(OBJECT(vms)), VMGENID_GUID, value);
return;
}
vmgenid_update_guest(vms);
}
/* After restoring an image, we need to update the guest memory and notify
* it of a potential change to VM Generation ID
*/
static int vmgenid_post_load(void *opaque, int version_id)
{
VmGenIdState *vms = opaque;
vmgenid_update_guest(vms);
return 0;
}
static const VMStateDescription vmstate_vmgenid = {
.name = "vmgenid",
.version_id = 1,
.minimum_version_id = 1,
.post_load = vmgenid_post_load,
.fields = (VMStateField[]) {
VMSTATE_UINT8_ARRAY(vmgenid_addr_le, VmGenIdState, sizeof(uint64_t)),
VMSTATE_END_OF_LIST()
},
};
static void vmgenid_handle_reset(void *opaque)
{
VmGenIdState *vms = VMGENID(opaque);
/* Clear the guest-allocated GUID address when the VM resets */
memset(vms->vmgenid_addr_le, 0, ARRAY_SIZE(vms->vmgenid_addr_le));
}
static Property vmgenid_properties[] = {
DEFINE_PROP_BOOL("x-write-pointer-available", VmGenIdState,
write_pointer_available, true),
DEFINE_PROP_END_OF_LIST(),
};
static void vmgenid_realize(DeviceState *dev, Error **errp)
{
VmGenIdState *vms = VMGENID(dev);
if (!vms->write_pointer_available) {
error_setg(errp, "%s requires DMA write support in fw_cfg, "
"which this machine type does not provide", VMGENID_DEVICE);
return;
}
/* Given that this function is executing, there is at least one VMGENID
* device. Check if there are several.
*/
if (!find_vmgenid_dev()) {
error_setg(errp, "at most one %s device is permitted", VMGENID_DEVICE);
return;
}
qemu_register_reset(vmgenid_handle_reset, vms);
}
static void vmgenid_device_class_init(ObjectClass *klass, void *data)
{
DeviceClass *dc = DEVICE_CLASS(klass);
dc->vmsd = &vmstate_vmgenid;
dc->realize = vmgenid_realize;
dc->hotpluggable = false;
dc->props = vmgenid_properties;
object_class_property_add_str(klass, VMGENID_GUID, NULL,
vmgenid_set_guid, NULL);
object_class_property_set_description(klass, VMGENID_GUID,
"Set Global Unique Identifier "
"(big-endian) or auto for random value",
NULL);
}
static const TypeInfo vmgenid_device_info = {
.name = VMGENID_DEVICE,
.parent = TYPE_DEVICE,
.instance_size = sizeof(VmGenIdState),
.class_init = vmgenid_device_class_init,
};
static void vmgenid_register_types(void)
{
type_register_static(&vmgenid_device_info);
}
type_init(vmgenid_register_types)
GuidInfo *qmp_query_vm_generation_id(Error **errp)
{
GuidInfo *info;
VmGenIdState *vms;
Object *obj = find_vmgenid_dev();
if (!obj) {
error_setg(errp, "VM Generation ID device not found");
return NULL;
}
vms = VMGENID(obj);
info = g_malloc0(sizeof(*info));
info->guid = qemu_uuid_unparse_strdup(&vms->guid);
return info;
}

View File

@@ -118,12 +118,6 @@ static void aw_a10_class_init(ObjectClass *oc, void *data)
DeviceClass *dc = DEVICE_CLASS(oc);
dc->realize = aw_a10_realize;
/*
* Reason: creates an ARM CPU, thus use after free(), see
* arm_cpu_class_init()
*/
dc->cannot_destroy_with_object_finalize_yet = true;
}
static const TypeInfo aw_a10_type_info = {

View File

@@ -19,6 +19,7 @@
#include "hw/char/serial.h"
#include "qemu/log.h"
#include "hw/i2c/aspeed_i2c.h"
#include "net/net.h"
#define ASPEED_SOC_UART_5_BASE 0x00184000
#define ASPEED_SOC_IOMEM_SIZE 0x00200000
@@ -33,6 +34,8 @@
#define ASPEED_SOC_TIMER_BASE 0x1E782000
#define ASPEED_SOC_WDT_BASE 0x1E785000
#define ASPEED_SOC_I2C_BASE 0x1E78A000
#define ASPEED_SOC_ETH1_BASE 0x1E660000
#define ASPEED_SOC_ETH2_BASE 0x1E680000
static const int uart_irqs[] = { 9, 32, 33, 34, 10 };
static const int timer_irqs[] = { 16, 17, 18, 35, 36, 37, 38, 39, };
@@ -175,6 +178,10 @@ static void aspeed_soc_init(Object *obj)
object_initialize(&s->wdt, sizeof(s->wdt), TYPE_ASPEED_WDT);
object_property_add_child(obj, "wdt", OBJECT(&s->wdt), NULL);
qdev_set_parent_bus(DEVICE(&s->wdt), sysbus_get_default());
object_initialize(&s->ftgmac100, sizeof(s->ftgmac100), TYPE_FTGMAC100);
object_property_add_child(obj, "ftgmac100", OBJECT(&s->ftgmac100), NULL);
qdev_set_parent_bus(DEVICE(&s->ftgmac100), sysbus_get_default());
}
static void aspeed_soc_realize(DeviceState *dev, Error **errp)
@@ -299,6 +306,20 @@ static void aspeed_soc_realize(DeviceState *dev, Error **errp)
return;
}
sysbus_mmio_map(SYS_BUS_DEVICE(&s->wdt), 0, ASPEED_SOC_WDT_BASE);
/* Net */
qdev_set_nic_properties(DEVICE(&s->ftgmac100), &nd_table[0]);
object_property_set_bool(OBJECT(&s->ftgmac100), true, "aspeed", &err);
object_property_set_bool(OBJECT(&s->ftgmac100), true, "realized",
&local_err);
error_propagate(&err, local_err);
if (err) {
error_propagate(errp, err);
return;
}
sysbus_mmio_map(SYS_BUS_DEVICE(&s->ftgmac100), 0, ASPEED_SOC_ETH1_BASE);
sysbus_connect_irq(SYS_BUS_DEVICE(&s->ftgmac100), 0,
qdev_get_gpio_in(DEVICE(&s->vic), 2));
}
static void aspeed_soc_class_init(ObjectClass *oc, void *data)

View File

@@ -160,12 +160,6 @@ static void bcm2836_class_init(ObjectClass *oc, void *data)
dc->props = bcm2836_props;
dc->realize = bcm2836_realize;
/*
* Reason: creates an ARM CPU, thus use after free(), see
* arm_cpu_class_init()
*/
dc->cannot_destroy_with_object_finalize_yet = true;
}
static const TypeInfo bcm2836_type_info = {

View File

@@ -31,6 +31,9 @@
#define KERNEL_LOAD_ADDR 0x00010000
#define KERNEL64_LOAD_ADDR 0x00080000
#define ARM64_TEXT_OFFSET_OFFSET 8
#define ARM64_MAGIC_OFFSET 56
typedef enum {
FIXUP_NONE = 0, /* do nothing */
FIXUP_TERMINATOR, /* end of insns */
@@ -768,6 +771,49 @@ static uint64_t arm_load_elf(struct arm_boot_info *info, uint64_t *pentry,
return ret;
}
static uint64_t load_aarch64_image(const char *filename, hwaddr mem_base,
hwaddr *entry)
{
hwaddr kernel_load_offset = KERNEL64_LOAD_ADDR;
uint8_t *buffer;
int size;
/* On aarch64, it's the bootloader's job to uncompress the kernel. */
size = load_image_gzipped_buffer(filename, LOAD_IMAGE_MAX_GUNZIP_BYTES,
&buffer);
if (size < 0) {
gsize len;
/* Load as raw file otherwise */
if (!g_file_get_contents(filename, (char **)&buffer, &len, NULL)) {
return -1;
}
size = len;
}
/* check the arm64 magic header value -- very old kernels may not have it */
if (memcmp(buffer + ARM64_MAGIC_OFFSET, "ARM\x64", 4) == 0) {
uint64_t hdrvals[2];
/* The arm64 Image header has text_offset and image_size fields at 8 and
* 16 bytes into the Image header, respectively. The text_offset field
* is only valid if the image_size is non-zero.
*/
memcpy(&hdrvals, buffer + ARM64_TEXT_OFFSET_OFFSET, sizeof(hdrvals));
if (hdrvals[1] != 0) {
kernel_load_offset = le64_to_cpu(hdrvals[0]);
}
}
*entry = mem_base + kernel_load_offset;
rom_add_blob_fixed(filename, buffer, size, *entry);
g_free(buffer);
return size;
}
static void arm_load_kernel_notify(Notifier *notifier, void *data)
{
CPUState *cs;
@@ -776,7 +822,7 @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
int is_linux = 0;
uint64_t elf_entry, elf_low_addr, elf_high_addr;
int elf_machine;
hwaddr entry, kernel_load_offset;
hwaddr entry;
static const ARMInsnFixup *primary_loader;
ArmLoadKernelNotifier *n = DO_UPCAST(ArmLoadKernelNotifier,
notifier, notifier);
@@ -841,14 +887,12 @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64)) {
primary_loader = bootloader_aarch64;
kernel_load_offset = KERNEL64_LOAD_ADDR;
elf_machine = EM_AARCH64;
} else {
primary_loader = bootloader;
if (!info->write_board_setup) {
primary_loader += BOOTLOADER_NO_BOARD_SETUP_OFFSET;
}
kernel_load_offset = KERNEL_LOAD_ADDR;
elf_machine = EM_ARM;
}
@@ -900,17 +944,15 @@ static void arm_load_kernel_notify(Notifier *notifier, void *data)
kernel_size = load_uimage(info->kernel_filename, &entry, NULL,
&is_linux, NULL, NULL);
}
/* On aarch64, it's the bootloader's job to uncompress the kernel. */
if (arm_feature(&cpu->env, ARM_FEATURE_AARCH64) && kernel_size < 0) {
entry = info->loader_start + kernel_load_offset;
kernel_size = load_image_gzipped(info->kernel_filename, entry,
info->ram_size - kernel_load_offset);
kernel_size = load_aarch64_image(info->kernel_filename,
info->loader_start, &entry);
is_linux = 1;
}
if (kernel_size < 0) {
entry = info->loader_start + kernel_load_offset;
} else if (kernel_size < 0) {
/* 32-bit ARM */
entry = info->loader_start + KERNEL_LOAD_ADDR;
kernel_size = load_image_targphys(info->kernel_filename, entry,
info->ram_size - kernel_load_offset);
info->ram_size - KERNEL_LOAD_ADDR);
is_linux = 1;
}
if (kernel_size < 0) {

View File

@@ -101,12 +101,6 @@ static void digic_class_init(ObjectClass *oc, void *data)
DeviceClass *dc = DEVICE_CLASS(oc);
dc->realize = digic_realize;
/*
* Reason: creates an ARM CPU, thus use after free(), see
* arm_cpu_class_init()
*/
dc->cannot_destroy_with_object_finalize_yet = true;
}
static const TypeInfo digic_type_info = {

Some files were not shown because too many files have changed in this diff Show More