The dictzip code in SLE11 received some treatment over time to support
running on big endian hosts. Somewhere in the transition to SLE12 this
support got lost. Add it back in again from the SLE11 code base.
Furthermore while at it, fix up the debug prints to not emit warnings.
[AG: BSC#937572]
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Newer GLib's want unique test paths, and thus moan at dupes.
(Seen on Fedora 23 which has glib 2.46)
Uniqueify the paths.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
In ffc6372851, we swapped the guest
base to the address base register from the address index register.
Except that 31 in the base slot is SP not XZR, so we need to be
more intelligent about which reg gets placed in which slot.
Cc: qemu-stable@nongnu.org (v2.4.0)
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
(cherry picked from commit 352bcb0a2b)
[AF: Backported to GUEST_BASE and CONFIG_USE_GUEST_BASE]
Signed-off-by: Andreas Färber <afaerber@suse.de>
This reverts commit ae6e8ef11e.
Our Factory libseccomp is patched to support ppc, ppc64, armv7l and soon
ppc64le.
Signed-off-by: Andreas Färber <afaerber@suse.de>
On hosts with limited virtual address space (32bit pointers), we can very
easily run out of virtual memory with big thread pools.
Instead, we should limit ourselves to small pools to keep memory footprint
low on those systems.
This patch fixes random VM stalls like
(process:25114): GLib-ERROR **: gmem.c:103: failed to allocate 1048576 bytes
on 32bit ARM systems for me.
Signed-off-by: Alexander Graf <agraf@suse.de>
When doing lseek, SEEK_SET indicates that the offset is an unsigned variable.
Other seek types have parameters that can be negative.
When converting from 32bit to 64bit parameters, we need to take this into
account and enable SEEK_END and SEEK_CUR to be negative, while SEEK_SET stays
absolute positioned which we need to maintain as unsigned.
Signed-off-by: Alexander Graf <agraf@suse.de>
Virtio-Console can only process one character at a time. Using it on S390
gave me strage "lags" where I got the character I pressed before when
pressing one. So I typed in "abc" and only received "a", then pressed "d"
but the guest received "b" and so on.
While the stdio driver calls a poll function that just processes on its
queue in case virtio-console can't take multiple characters at once, the
muxer does not have such callbacks, so it can't empty its queue.
To work around that limitation, I introduced a new timer that only gets
active when the guest can not receive any more characters. In that case
it polls again after a while to check if the guest is now receiving input.
This patch fixes input when using -nographic on s390 for me.
Some termcaps (found using SLES11SP1) use [? sequences. According to man
console_codes (http://linux.die.net/man/4/console_codes) the question mark
is a nop and should simply be ignored.
This patch does exactly that, rendering screen output readable when
outputting guest serial consoles to the graphical console emulator.
Signed-off-by: Alexander Graf <agraf@suse.de>
Tar is a very widely used format to store data in. Sometimes people even put
virtual machine images in there.
So it makes sense for qemu to be able to read from tar files. I implemented a
written from scratch reader that also knows about the GNU sparse format, which
is what pigz creates.
This version checks for filenames that end on well-known extensions. The logic
could be changed to search for filenames given on the command line, but that
would require changes to more parts of qemu.
The tar reader in conjunctiuon with dzip gives us the chance to download
tar'ed up virtual machine images (even via http) and instantly make use of
them.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Bruce Rogers <brogers@novell.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
[TH: Use bdrv_open options instead of filename]
Signed-off-by: Tim Hardeck <thardeck@suse.de>
[AF: bdrv_file_open got an Error **errp argument, bdrv_delete -> brd_unref]
[AF: qemu_opts_create_nofail() -> qemu_opts_create(),
bdrv_file_open() -> bdrv_open(), based on work by brogers]
[AF: error_is_set() dropped for v2.1.0-rc0]
[AF: BlockDriverAIOCB -> BlockAIOCB,
BlockDriverCompletionFunc -> BlockCompletionFunc,
qemu_aio_release() -> qemu_aio_unref(),
drop tar_aio_cancel()]
[AF: common-obj-y -> block-obj-y, drop probe hook (bsc#945778)]
Signed-off-by: Andreas Färber <afaerber@suse.de>
DictZip is an extension to the gzip format that allows random seeks in gzip
compressed files by cutting the file into pieces and storing the piece offsets
in the "extra" header of the gzip format.
Thanks to that extension, we can use gzip compressed files as block backend,
though only in read mode.
This makes a lot of sense when stacked with tar files that can then be shipped
to VM users. If a VM image is inside a tar file that is inside a DictZip
enabled gzip file, the user can run the tar.gz file as is without having to
extract the image first.
Tar patch follows.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Bruce Rogers <brogers@novell.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
[TH: Use bdrv_open options instead of filename]
Signed-off-by: Tim Hardeck <thardeck@suse.de>
[AF: Error **errp added for bdrv_file_open, bdrv_delete -> bdrv_unref]
[AF: qemu_opts_create_nofail() -> qemu_opts_create(),
bdrv_file_open() -> bdrv_open(), based on work by brogers]
[AF: error_is_set() dropped for v2.1.0-rc0]
[AF: BlockDriverAIOCB -> BlockAIOCB,
BlockDriverCompletionFunc -> BlockCompletionFunc,
qemu_aio_release() -> qemu_aio_unref(),
drop dictzip_aio_cancel()]
[AF: common-obj-y -> block-obj-y, drop probe hook (bsc#945778)]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Linux syscalls pass pointers or data length or other information of that sort
to the kernel. This is all stuff you don't want to have sign extended.
Otherwise a host 64bit variable parameter with a size parameter will extend
it to a negative number, breaking lseek for example.
Pass syscall arguments as ulong always.
Signed-off-by: Alexander Graf <agraf@suse.de>
Fedora 17 for ARM reads /proc/cpuinfo and fails if it doesn't contain
ARM related contents. This patch implements a quick hack to expose real
/proc/cpuinfo data taken from a real world machine.
The real fix would be to generate at least the flags automatically based
on the selected CPU. Please do not submit this patch upstream until this
has happened.
Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Rebased for v1.6 and v1.7]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Running multi-threaded code can easily expose some of the fundamental
breakages in QEMU's design. It's just not a well supported scenario.
So if we pin the whole process to a single host CPU, we guarantee that
we will never have concurrent memory access actually happen. We can still
get scheduled away at any time, so it's no complete guarantee, but apparently
it reduces the odds well enough to get my test cases to pass.
This gets Java 1.7 working for me again on my test box.
Signed-off-by: Alexander Graf <agraf@suse.de>
The tcg code generator is not thread safe. Lock its generation between
different threads.
Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Rebased onto exec.c/translate-all.c split for 1.4]
[AF: Rebased for v2.1.0-rc0]
Signed-off-by: Andreas Färber <afaerber@suse.de>
During invocations of losetup, we run into an ioctl that doesn't
exist. However, because of that we output an error, which then
screws up the kiwi logic around that call.
So let's silently ignore that bogus ioctl.
Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Rebased for v2.1.0-rc0]
Signed-off-by: Andreas Färber <afaerber@suse.de>
When we have a working host binary equivalent for the guest binary we're
trying to run, let's just use that instead as it will be a lot faster.
Signed-off-by: Alexander Graf <agraf@suse.de>
When entering the guest we take a lock to ensure that nobody else messes
with our TB chaining while we're doing it. If we get a segfault inside that
code, we manage to work on, but will not unlock the lock.
This patch forces unlocking of that lock in the segv handler. I'm not sure
this is the right approach though. Maybe we should rather make sure we don't
segfault in the code? I would greatly appreciate someone more intelligible
than me to look at this :).
Example code to trigger this is at: http://csgraf.de/tmp/conftest.c
Reported-by: Fabio Erculiani <lxnay@sabayon.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
When using hugetlbfs (which is required for HV mode KVM on 970), we
check for MMU notifiers that on 970 can not be implemented properly.
So disable the check for mmu notifiers on PowerPC guests, making
KVM guests work there, even if possibly racy in some odd circumstances.
When using qemu's linux-user binaries through binfmt, argv[0] gets lost
along the execution because qemu only gets passed in the full file name
to the executable while argv[0] can be something completely different.
This breaks in some subtile situations, such as the grep and make test
suites.
This patch adds a wrapper binary called qemu-$TARGET-binfmt that can be
used with binfmt's P flag which passes the full path _and_ argv[0] to
the binfmt handler.
The binary would be smart enough to be versatile and only exist in the
system once, creating the qemu binary path names from its own argv[0].
However, this seemed like it didn't fit the make system too well, so
we're currently creating a new binary for each target archictecture.
CC: Reinhard Max <max@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Rebased onto new Makefile infrastructure, twice]
[AF: Updated for aarch64 for v2.0.0-rc1]
[AF: Rebased onto Makefile changes for v2.1.0-rc0]
Signed-off-by: Andreas Färber <afaerber@suse.de>
the direction given in the ioctl should be correct so we can assume the
communication is uni-directional. The alsa developers did not like this
concept though and declared ioctls IOC_R and IOC_W even though they were
IOC_RW.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Ulrich Hecht <uli@suse.de>
Hack to prevent ALSA from using mmap() interface to simplify emulation.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Ulrich Hecht <uli@suse.de>
virtio fix for 2.4
Fixes migration in virtio 1 mode.
We still have a known bug with memory hotplug, it doesn't
look like we can fix that in time for 2.4.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Wed 05 Aug 2015 15:57:39 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream:
virtio: fix 1.0 virtqueue migration
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit 488981a4 [block: convert quorum blockdrv to use crypto APIs]
broke qemu-iotest 041 on hosts with GnuTLS < 2.10.0. It converted a
compile-time check to a run-time check at device open time. The result
is that we now advertise a feature (the quorum block driver) that will
never work (on those hosts). There's no way (short of parsing
human-readable error messages) for qemu-iotests or any other API
consumer to recognise that the quorum block driver isn't _actually_
available and shouldn't be used or tested.
Move the run-time check to bdrv_quorum_init() to avoid registering the
quorum block driver if we know it cannot work. This way API consumers
can recognise it's unavailable.
Fixes: 488981a4af
Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1438699705-21761-1-git-send-email-silbe@linux.vnet.ibm.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
1.0 does not requires physically-contiguous pages layout for a
virtqueue. So we could not infer avail and used from desc. This means
we need to migrate vring.avail and vring.used when host support virtio
1.0. This fixes malfunction of virtio 1.0 device after migration.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
X86 queue, 2015-08-04
# gpg: Signature made Tue 04 Aug 2015 16:49:42 BST using RSA key ID 984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/x86-pull-request:
target-i386: fix IvyBridge xlevel in PC_COMPAT_2_3
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
MIPS patches 2015-08-04
Changes:
* fix semihosting for microMIPS R6
* fix an abort when booting mips64 kernel with --enable-tcg-debug
# gpg: Signature made Tue 04 Aug 2015 12:32:17 BST using RSA key ID 0B29DA6B
# gpg: Good signature from "Leon Alrae <leon.alrae@imgtec.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8DD3 2F98 5495 9D66 35D4 4FC0 5211 8E3C 0B29 DA6B
* remotes/lalrae/tags/mips-20150804:
target-mips: Copy restrictions from ext/ins to dext/dins
target-mips: fix semihosting for microMIPS R6
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The checks in dins is required to avoid triggering an assertion
in tcg_gen_deposit_tl. The check in dext is just for completeness.
Fold the other D cases in via fallthru.
In this case the errant dins appears to be data, not code, as
translation failed to stop after a break insn.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
In semihosting mode the SDBBP 1 instructions should trigger UHI syscall,
but in QEMU this does not happen for recently added microMIPS R6.
Consequently bare metal microMIPS R6 programs supporting UHI will not run.
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
cve-2015-5166
# gpg: Signature made Mon 03 Aug 2015 15:27:44 BST using RSA key ID 70E1AE90
# gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"
* remotes/sstabellini/tags/cve-2015-5166-tag:
Fix release_drive on unplugged devices (pci_piix3_xen_ide_unplug)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
xen-migration-2.4
# gpg: Signature made Mon 03 Aug 2015 17:18:36 BST using RSA key ID 70E1AE90
# gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"
* remotes/sstabellini/tags/xen-migration-2.4-tag:
migration: Fix regression for xenfv and pc,accel=xen machine.
migration: Fix global state with Xen.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This fix migration from the same QEMU version and from previous QEMU
version.
>From the global state section, we don't need runstate with Xen. Right now,
the way the Xen toolstack knows when QEMU is ready is when QEMU reach
"running" runstate.
The configuration section and the section footers are not going to be
present in previous version of QEMU with xenfv machine, so we skip them.
The Xen toolstack libxenlight does not specify a particular version of the
'pc' machine, so migration from older version of QEMU used by Xen to newer
one would break due to missing "configuration" section and section footers.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
When doing migration via the QMP command xen_save_devices_state, the
current runstate is not store into the global state section. Also the
current runstate is not the one we want on the receiver side.
During migration, the Xen toolstack paused QEMU before save the devices
state. Also, the toolstack expect QEMU to autostart when the migration is
finished.
So this patch store "running" as it's current runstate.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
pci_piix3_xen_ide_unplug should completely unhook the unplugged
IDEDevice from the corresponding BlockBackend, otherwise the next call
to release_drive will try to detach the drive again.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Pull request
# gpg: Signature made Mon Aug 3 13:08:25 2015 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
* remotes/stefanha/tags/rtl8139-cplus-tx-input-validation-pull-request:
rtl8139: check TCP Data Offset field (CVE-2015-5165)
rtl8139: skip offload on short TCP header (CVE-2015-5165)
rtl8139: check IP Total Length field (CVE-2015-5165)
rtl8139: check IP Header Length field (CVE-2015-5165)
rtl8139: skip offload on short Ethernet/IP header (CVE-2015-5165)
rtl8139: drop tautologous if (ip) {...} statement (CVE-2015-5165)
rtl8139: avoid nested ifs in IP header parsing (CVE-2015-5165)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The TCP Data Offset field contains the length of the header. Make sure
it is valid and does not exceed the IP data length.
Reported-by: 朱东海(启路) <donghai.zdh@alibaba-inc.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
TCP Large Segment Offload accesses the TCP header in the packet. If the
packet is too short we must not attempt to access header fields:
tcp_header *p_tcp_hdr = (tcp_header*)(eth_payload_data + hlen);
int tcp_hlen = TCP_HEADER_DATA_OFFSET(p_tcp_hdr);
Reported-by: 朱东海(启路) <donghai.zdh@alibaba-inc.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The IP Total Length field includes the IP header and data. Make sure it
is valid and does not exceed the Ethernet payload size.
Reported-by: 朱东海(启路) <donghai.zdh@alibaba-inc.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Transmit offload features access Ethernet and IP headers the packet. If
the packet is too short we must not attempt to access header fields:
int proto = be16_to_cpu(*(uint16_t *)(saved_buffer + 12));
...
eth_payload_data = saved_buffer + ETH_HLEN;
...
ip = (ip_header*)eth_payload_data;
if (IP_HEADER_VERSION(ip) != IP_HEADER_VERSION_4) {
Reported-by: 朱东海(启路) <donghai.zdh@alibaba-inc.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The previous patch stopped using the ip pointer as an indicator that the
IP header is present. When we reach the if (ip) {...} statement we know
ip is always non-NULL.
Remove the if statement to reduce nesting.
Reported-by: 朱东海(启路) <donghai.zdh@alibaba-inc.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Transmit offload needs to parse packet headers. If header fields have
unexpected values the offload processing is skipped.
The code currently uses nested ifs because there is relatively little
input validation. The next patches will add missing input validation
and a goto label is more appropriate to avoid deep if statement nesting.
Reported-by: 朱东海(启路) <donghai.zdh@alibaba-inc.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
TCG MIPS and S390 fixes for 2.4.
# gpg: Signature made Mon Aug 3 09:09:59 2015 BST using RSA key ID 1DDD8C9B
# gpg: Good signature from "Aurelien Jarno <aurelien@aurel32.net>"
# gpg: aka "Aurelien Jarno <aurelien@jarno.fr>"
# gpg: aka "Aurelien Jarno <aurel32@debian.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 7746 2642 A9EF 94FD 0F77 196D BA9C 7806 1DDD 8C9B
* remotes/aurel/tags/pull-tcg-mips-s390-20150803:
tcg/mips: fix add2
tcg/s390x: Mask TCGMemOp appropriately for indexing
tcg/mips: Mask TCGMemOp appropriately for indexing
tcg/mips: fix TLB loading for BE host with 32-bit guests
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# gpg: Signature made Fri Jul 31 23:24:06 2015 BST using RSA key ID AAFC390E
# gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: FAEB 9711 A12C F475 812F 18F2 88A9 064D 1835 61EB
# Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76 CBD0 7DEF 8106 AAFC 390E
* remotes/jnsnow/tags/ide-pull-request:
ahci: fix ICC mask definition
macio: re-add TRIM support
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The add2 code in the tcg_out_addsub2 function doesn't take into account
the case where rl == al == bl. In that case we can't compute the carry
after the addition. As it corresponds to a multiplication by 2, the
carry bit is the bit 31.
While this is a corner case, this prevents x86-64 guests to boot on a
MIPS host.
Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Commit 2b7ec66f fixed TCGMemOp masking following the MO_AMASK addition,
but two cases were forgotten in the TCG S390 backend.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Commit 2b7ec66f fixed TCGMemOp masking following the MO_AMASK addition,
but two cases were forgotten in the TCG MIPS backend.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
For 32-bit guest, we load a 32-bit address from the TLB, so there is no
need to compensate for the low or high part. This fixes 32-bit guests on
big-endian hosts.
Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
There are likely others that could be updated, but we'll
go with a light touch for 2.4 for now.
Without the Unsigned specifier, this shifts bits into the
signed bit, which makes clang unhappy and could cause
unwanted behavior.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1437501721-24495-1-git-send-email-jsnow@redhat.com
Pull request
These fixes make dataplane work again after the notify_me optimization was
added. They also solve QEMUBH memory leaks and fix a bug in dataplane's
cleanup code.
# gpg: Signature made Wed Jul 29 14:50:26 2015 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
* remotes/stefanha/tags/block-pull-request:
AioContext: force event loop iteration using BH
AioContext: avoid leaking BHs on cleanup
virtio-blk-dataplane: delete bottom half before the AioContext is freed
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The notify_me optimization introduced in commit eabc977973
("AioContext: fix broken ctx->dispatching optimization") skips
event_notifier_set() calls when the event loop thread is not blocked in
ppoll(2).
This optimization causes a deadlock if two aio_context_acquire() calls
race. notify_me = 0 during the race so the winning thread can enter
ppoll(2) unaware that the other thread is waiting its turn to acquire
the AioContext.
This patch forces ppoll(2) to return by scheduling a BH instead of
calling aio_notify().
The following deadlock with virtio-blk dataplane is fixed:
qemu ... -object iothread,id=iothread0 \
-drive if=none,id=drive0,file=test.img,... \
-device virtio-blk-pci,iothread=iothread0,drive=drive0
This command-line results in a hang early on without this patch.
Thanks to Paolo Bonzini <pbonzini@redhat.com> for investigating this bug
with me.
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1438101249-25166-4-git-send-email-pbonzini@redhat.com
Message-Id: <1438014819-18125-3-git-send-email-stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Pull request
These two .can_receive() are now reviewed. The net subsystem queue for 2.4 is now empty.
# gpg: Signature made Tue Jul 28 13:26:03 2015 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
* remotes/stefanha/tags/net-pull-request:
xen: Drop net_rx_ok
hw/net: handle flow control in mcf_fec driver receiver
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
virtio fixes for 2.4
Mostly virtio 1 spec compliance fixes.
We are unlikely to make it perfectly compliant in
the first release, but it seems worth it to try.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Mon Jul 27 21:55:48 2015 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream:
virtio: minor cleanup
acpi: fix pvpanic device is not shown in ui
virtio-blk: only clear VIRTIO_F_ANY_LAYOUT for legacy device
virtio-blk: fail get_features when both scsi and 1.0 were set
virtio: get_features() can fail
virtio-pci: fix memory MR cleanup for modern
virtio: set any_layout in virtio core
virtio-9p: fix any_layout
virtio-serial: fix ANY_LAYOUT
virtio: hide legacy features from modern guests
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
MIPS patches 2015-07-28
Changes:
* net/dp8393x fixes
* Vectored Interrupts bug fix
* fix for a bug in machine.c which was provoking a warning on FreeBSD
# gpg: Signature made Tue Jul 28 10:47:19 2015 BST using RSA key ID 0B29DA6B
# gpg: Good signature from "Leon Alrae <leon.alrae@imgtec.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8DD3 2F98 5495 9D66 35D4 4FC0 5211 8E3C 0B29 DA6B
* remotes/lalrae/tags/mips-20150728:
net/dp8393x: do not use memory_region_init_rom_device with NULL
net/dp8393x: remove check of runt packets
net/dp8393x: disable user creation
target-mips: fix offset calculation for Interrupts
target-mips: fix passing incompatible pointer type in machine.c
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* crypto fixes
* megasas SIGSEGV fix
* memory refcount change to fix virtio hot-unplug
# gpg: Signature made Tue Jul 28 08:29:07 2015 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream:
memory: do not add a reference to the owner of aliased regions
megasas: Add write function to handle write access to PCI BAR 3
crypto: extend unit tests to cover decryption too
crypto: fix built-in AES decrypt function
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# gpg: Signature made Tue Jul 28 05:22:29 2015 BST using RSA key ID C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg: aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg: aka "Jeffrey Cody <codyprime@gmail.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98 D624 BDBE 7B27 C0DE 3057
* remotes/cody/tags/jtc-for-upstream-pull-request:
block/ssh: Avoid segfault if inet_connect doesn't set errno.
sheepdog: serialize requests to overwrapping area
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Let net_rx_packet() (which checks the same conditions) drops the packet
if the device is not ready. Drop net_xen_info.can_receive and update the
return value for the buffer full case.
We rely on the qemu_flush_queued_packets() in net_event() to wake up
the peer when the buffer becomes available again.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1438077176-378-1-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The network mcf_fec driver emulated receive side method is not dealing
with network queue flow control properly.
Modify the receive side to check if we have enough space in the
descriptors to store the current packet. If not we process none of it
and return 0. When the guest frees up some buffers through its descriptors
we signal the qemu net layer to send more packets.
[Fixed coding style: 4-space indent and curly braces on if statement.
--Stefan]
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Message-id: 1438045374-10358-1-git-send-email-gerg@uclinux.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Replace memory_region_init_rom_device() with memory_region_init_ram() and
memory_region_set_readonly().
This fixes a guest-triggerable QEMU crash when guest tries to write to PROM.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
[leon.alrae@imgtec.com: shorten subject length]
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Ethernet requires that messages are at least 64 bytes on the wire. This
limitation does not exist on emulation (no wire message), so remove the
check. Netcard is now able to receive small network packets.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Netcard needs an address space to write data to, which can't be specified
on command line.
This fixes a crash when user starts QEMU with "-device dp8393x"
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
Block layer patches for 2.4.0-rc3
# gpg: Signature made Mon Jul 27 16:19:17 2015 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
* remotes/kevin/tags/for-upstream:
block: qemu-iotests - add check for multiplication overflow in vpc
block: vpc - prevent overflow if max_table_entries >= 0x40000000
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Correct computation of vector offsets for EXCP_EXT_INTERRUPT.
For instance, if Cause.IV is 0 the vector offset should be 0x180.
Simplify the finding vector number logic for the Vectored Interrupts.
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
[leon.alrae@imgtec.com: cosmetic changes]
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
On some (but not all) systems:
$ qemu-img create -f qcow2 overlay -b ssh://xen/
Segmentation fault
It turns out this happens when inet_connect returns -1 in the
following code, but errno == 0.
s->sock = inet_connect(s->hostport, errp);
if (s->sock < 0) {
ret = -errno;
goto err;
}
In the test case above, no host called "xen" exists, so getaddrinfo fails.
On Fedora 22, getaddrinfo happens to set errno = ENOENT (although it
is *not* documented to do that), so it doesn't segfault.
On RHEL 7, errno is not set by the failing getaddrinfo, so ret =
-errno = 0, so the caller doesn't know there was an error and
continues with a half-initialized BDRVSSHState struct, and everything
goes south from there, eventually resulting in a segfault.
Fix this by setting ret to -EIO (same as block/nbd.c and
block/sheepdog.c). The real error is saved in the Error** errp
struct, so it is printed correctly:
$ ./qemu-img create -f qcow2 overlay -b ssh://xen/
qemu-img: overlay: address resolution failed for xen:22: No address associated with hostname
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reported-by: Jun Li
BZ: https://bugzilla.redhat.com/show_bug.cgi?id=1147343
Signed-off-by: Jeff Cody <jcody@redhat.com>
Current sheepdog driver only serializes create requests in oid
unit. This mechanism isn't enough for handling requests to
overwrapping area spanning multiple oids, so it can result bugs like
below:
https://bugs.launchpad.net/sheepdog-project/+bug/1456421
This patch adds a new serialization mechanism for the problem. The
difference from the old one is:
1. serialize entire aiocb if their targetting areas overwrap
2. serialize all requests (read, write, and discard), not only creates
This patch also removes the old mechanism because the new one can be
an alternative.
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Teruaki Ishizaki <ishizaki.teruaki@lab.ntt.co.jp>
Cc: Vasiliy Tolstov <v.tolstov@selfip.ru>
Signed-off-by: Hitoshi Mitake <mitake.hitoshi@lab.ntt.co.jp>
Tested-by: Vasiliy Tolstov <v.tolstov@selfip.ru>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Very often the owner of the aliased region is the same as the owner of the alias
region itself. When this happens, the reference count can never go back to 0 and
the owner is leaked. This is for example breaking hot-unplug of virtio-pci
devices (the device cannot be plugged back again with the same id).
Another common use for alias is to transform the system I/O address space
into an MMIO regions; in this case the aliased region never dies, so there
is no problem. Otherwise the owner is always the same for aliasing
and aliased region.
I checked all calls to memory_region_init_alias introduced after commit
dfde4e6 (memory: add ref/unref calls, 2013-05-06) and they do not need the
reference in order to keep the owner of the aliased region alive.
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch fixes a QEMU SEGFAULT when a write operation is performed on
the memory region of the PCI BAR 3 (base address space).
When a writeb(0xe0000000) is performed the .write function is invoked to
handle the write access, however, since the .write is not initialised,
the call to 0, causes QEMU to SEGFAULT.
Signed-off-by: Salva Peiró <speirofr@gmail.com>
Acked-by: Hannes Reinecke <hare@suse.com>
Message-Id: <1437987112-24744-1-git-send-email-speirofr@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
gdb expects that the thread ID for c and g-class operations is set to
the CPU we provide when reporting VM stop conditions. If the stub is
still tuned to a different CPU, the wrong information is delivered to
the gdb frontend.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The auto increment bit of the timer control register was wrongly
defined.
See Cortex-A9 MPcore Technical Reference Manual, Section 4.4.2.
Signed-off-by: Johannes Schlatow <schlatow@ida.ing.tu-bs.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
As of d98bc0b65 there are two files that are automatically generated:
ui/shader/texture-blit-frag.h and /ui/shader/texture-blit-vert.h. None
of them is wanted to be tracked by git. Put them into the ignore file
then.
Signed-off-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
We want to have uniform build messages, so fix some messages
which did not follow the standard pattern.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Add documentation comments for various utility string functions
which we have implemented in util/cutils.c:
pstrcpy()
strpadcpy()
pstrcat()
strstart()
stristart()
qemu_strnlen()
qemu_strsep()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Fix buglets for 2.4
# gpg: Signature made Mon Jul 27 15:26:48 2015 BST using RSA key ID 4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg: aka "Richard Henderson <rth@redhat.com>"
# gpg: aka "Richard Henderson <rth@twiddle.net>"
* remotes/rth/tags/pull-tcg-20150727:
tcg: mark temps as mem_coherent = 0 for mov with a constant
tcg: correctly mark dead inputs for mov with a constant
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
aio_notify can be optimized away, and in fact almost always will. However,
qemu_notify_event is used in places where this is incorrect---most notably,
when handling SIGTERM. When aio_notify is optimized away, it is possible that
QEMU enters a blocking ppoll immediately afterwards and stays there, without
reaching main_loop_should_exit().
Fix this by using a bottom half. The bottom half can be optimized too, but
scheduling it is enough for the ppoll not to block. The hang is thus avoided.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1437738175-23624-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This checks that VPC is able to successfully fail (without segfault)
on an image file with a max_table_entries that exceeds 0x40000000.
This table entry is within the valid range for VPC (although too large
for this sample image).
Cc: qemu-stable@nongnu.org
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When we allocate the pagetable based on max_table_entries, we multiply
the max table entry value by 4 to accomodate a table of 32-bit integers.
However, max_table_entries is a uint32_t, and the VPC driver accepts
ranges for that entry over 0x40000000. So during this allocation:
s->pagetable = qemu_try_blockalign(bs->file, s->max_table_entries * 4);
The size arg overflows, allocating significantly less memory than
expected.
Since qemu_try_blockalign() size argument is size_t, cast the
multiplication correctly to prevent overflow.
The value of "max_table_entries * 4" is used elsewhere in the code as
well, so store the correct value for use in all those cases.
We also check the Max Tables Entries value, to make sure that it is <
SIZE_MAX / 4, so we know the pagetable size will fit in size_t.
Cc: qemu-stable@nongnu.org
Reported-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Unfortunately Ubuntu's pkg-config information for gnutls is broken
for the static linking case, and outputs --libs options which the
compiler does not recognize. Work around this problem by testing
that the --cflags/--libs output will at least allow compilation
before enabling gnutls support.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1437758888-22486-1-git-send-email-peter.maydell@linaro.org
Chapter 6.3 of spec said
"
Transitional devices MUST offer, and if offered by the device
transitional drivers MUST accept the following:
VIRTIO_F_ANY_LAYOUT (27)
"
So this patch only clear VIRTIO_F_LAYOUT for legacy device.
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
SCSI passthrough was no longer supported in virtio 1.0, so this patch
fail the get_features() when both 1.0 and scsi is set. And also only
advertise VIRTIO_BLK_F_SCSI for legacy virtio-blk device.
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Each memory_region_add_subregion must be paired with
memory_region_del_subregion.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
When tcg_reg_alloc_mov propagate a constant, we failed to correctly mark
a temp as dead if the liveness analysis hints so. This fixes the
following assert when configure with --enable-debug-tcg:
qemu-x86_64: tcg/tcg.c:1827: tcg_reg_alloc_bb_end: Assertion `ts->val_type == TEMP_VAL_DEAD' failed.
Cc: Richard Henderson <rth@twiddle.net>
Reported-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Message-Id: <1437994568-7825-2-git-send-email-aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Pull request
Here are NIC fixes from Fam Zheng that prevent rx hangs (caused by NIC models
where .can_receive() stops rx but qemu_flush_queued_packets() isn't called).
# gpg: Signature made Mon Jul 27 14:51:48 2015 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
* remotes/stefanha/tags/net-pull-request:
axienet: Flush queued packets when rx is done
dp8393x: Flush packets when link comes up
stellaris_enet: Flush queued packets when read done
mipsnet: Flush queued packets when receiving is enabled
milkymist-minimac2: Flush queued packets when link comes up
mcf_fec: Drop mcf_fec_can_receive
etsec: Flush queue when rx buffer is consumed
etsec: Move etsec_can_receive into etsec_receive
usbnet: Drop usbnet_can_receive
eepro100: Drop nic_can_receive
pcnet: Drop pcnet_can_receive
xgmac: Drop packets with eth_can_rx is false.
hw/net: fix mcf_fec driver receiver
hw/net: add simple phy support to mcf_fec driver
hw/net: add ANLPAR bit definitions to generic mii
hw/net: create common collection of MII definitions
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
eth_can_rx checks s->rxsize and returns false if it is non-zero. Because
of the .can_receive semantics change, this will make the incoming queue
disabled by peer, until it is explicitly flushed. So we should flush it
when s->rxsize is becoming zero.
Squash eth_can_rx semantics into etx_rx and drop .can_receive()
callback, also add flush when rx buffer becomes available again after a
packet gets queued.
The other conditions, "!axienet_rx_resetting(s) &&
axienet_rx_enabled(s)" are OK because enet_write already calls
qemu_flush_queued_packets when the register bits are changed.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1436955553-22791-13-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
.can_receive callback changes semantics that once return 0, backend will
try sending again until explicitly flushed, change the device to meet
that.
dp8393x_can_receive checks SONIC_CR_RXEN bit in SONIC_CR register and
SONIC_ISR_RBE bit in SONIC_ISR register, try flushing the queue when
either bit is being updated.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Message-id: 1436955553-22791-12-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If s->np reaches 31, the queue will be disabled by peer when it sees
stellaris_enet_can_receive() returns false, until we explicitly flushes
it which notifies the peer. Do this when guest is done reading all
existing data.
Move the semantics to stellaris_enet_receive, by returning 0 when the
buffer is full, so that new packets will be queued. In
stellaris_enet_read, flush and restart the queue when guest has done
reading.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Message-id: 1436955553-22791-11-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Drop .can_receive and move the semantics into minimac2_rx, by returning
0.
That is once minimac2_rx returns 0, incoming packets will be queued
until the queue is explicitly flushed. We do this when s->regs[R_STATE0]
or s->regs[R_STATE1] is changed in minimac2_write.
Also drop the unused trace point.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1436955553-22791-9-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The semantics of .can_receive requires us to flush the queue explicitly
when s->rx_enabled becomes true after it returns 0, but the packet being
queued is not meaningful since the guest hasn't activated the card.
Let's just drop the packet in this case.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Message-id: 1436955553-22791-8-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
usbnet_receive already drops packet if rndis_state is not
RNDIS_DATA_INITIALIZED, and queues packet if in buffer is not available.
The only difference is s->dev.config but that is similar to rndis_state.
Drop usbnet_can_receive and move these checks to usbnet_receive, so that
we don't need to explicitly flush the queue when s->dev.config changes
value.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Message-id: 1436955553-22791-5-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
nic_receive already checks the conditions and drop packets if false.
Due to the new semantics since 6e99c63 ("net/socket: Drop
net_socket_can_send"), having .can_receive returning 0 requires us to
explicitly flush the queued packets when the conditions are becoming
true, but queuing the packets when guest driver is not ready doesn't
make much sense.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Message-id: 1436955553-22791-4-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
pcnet_receive already checks the conditions and drop packets if false.
Due to the new semantics since 6e99c63 ("net/socket: Drop
net_socket_can_send"), having .can_receive returning 0 requires us to
explicitly flush the queued packets when the conditions are becoming
true, but queuing the packets when guest driver is not ready doesn't
make much sense.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Message-id: 1436955553-22791-3-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The network mcf_fec driver emulated receive side method is returning a
result of 0 causing the network layer to disable receive for this emulated
device. This results in the guest only ever receiving one packet.
Fix the recieve side processing to return the number of bytes that we
passed back through to the guest.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1435296436-12152-5-git-send-email-gerg@uclinux.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The Linux fec driver needs at least basic phy support to probe and work.
The current qemu mcf_fec emulation has no support for the reading or
writing of the MDIO lines to access an attached phy.
This code adds a very simple set of register results for a fixed phy
setup - very similar to that used on an m5208evb board. This is enough
to probe and identify an emulated attached phy.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1435296436-12152-4-git-send-email-gerg@uclinux.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Add a base set of bit definitions for the standard MII phy "Auto-Negotiation
Link Partner Ability Register" (ANLPAR).
The original definitions moved into mii.h from the allwinner_emac driver
did not define these.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1435296436-12152-3-git-send-email-gerg@uclinux.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Create a common set of definitions of address and register values for
ethernet MII phys. A few of the current ethernet drivers have at least
a partial set of these definitions. Others just use hard coded raw
constant numbers.
This initial set is copied directly from the allwinner_emac code.
Signed-off-by: Greg Ungerer <gerg@uclinux.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1435296436-12152-2-git-send-email-gerg@uclinux.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
# gpg: Signature made Mon Jul 27 13:01:10 2015 BST using RSA key ID AAFC390E
# gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: FAEB 9711 A12C F475 812F 18F2 88A9 064D 1835 61EB
# Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76 CBD0 7DEF 8106 AAFC 390E
* remotes/jnsnow/tags/cve-2015-5154-pull-request:
ide: Clear DRQ after handling all expected accesses
ide/atapi: Fix START STOP UNIT command completion
ide: Check array bounds before writing to io_buffer (CVE-2015-5154)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The current unit test only verifies the encryption API,
resulting in us missing a recently introduced bug in the
decryption API from commit d3462e3. It was fortunately
later discovered & fixed by commit bd09594, thanks to the
QEMU I/O tests for qcow2 encryption, but we should really
detect this directly in the crypto unit tests. Also remove
an accidental debug message and simplify some asserts.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1437468902-23230-1-git-send-email-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
virtio pci allows any device to have a modern interface,
this in turn requires ANY_LAYOUT support.
Fix up ANY_LAYOUT for virtio-9p.
Reported-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Don't assume a specific layout for control messages.
Required by virtio 1.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
NOTIFY_ON_EMPTY, ANY_LAYOUT and BAD are only valid on the legacy
interface.
Hide them from modern guests.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This is additional hardening against an end_transfer_func that fails to
clear the DRQ status bit. The bit must be unset as soon as the PIO
transfer has completed, so it's better to do this in a central place
instead of duplicating the code in all commands (and forgetting it in
some).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
The command must be completed on all code paths. START STOP UNIT with
pwrcnd set should succeed without doing anything.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
If the end_transfer_func of a command is called because enough data has
been read or written for the current PIO transfer, and it fails to
correctly call the command completion functions, the DRQ bit in the
status register and s->end_transfer_func may remain set. This allows the
guest to access further bytes in s->io_buffer beyond s->data_end, and
eventually overflowing the io_buffer.
One case where this currently happens is emulation of the ATAPI command
START STOP UNIT.
This patch fixes the problem by adding explicit array bounds checks
before accessing the buffer instead of relying on end_transfer_func to
function correctly.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
* qemu-char fixes
* SCSI fixes (including CVE-2015-5158)
* RCU fixes
* Framebuffer logic to set DIRTY_MEMORY_VGA
* Fix compiler warning for --disable-vnc
* qemu-doc fixes
* x86 TCG pasto fix
# gpg: Signature made Fri Jul 24 12:57:52 2015 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream:
target-i386/FPU: a misprint in helper_fistll_ST0
qemu-doc: fix typos
framebuffer: set DIRTY_MEMORY_VGA on RAM that is used for the framebuffer
memory: count number of active VGA logging clients
vl: Fix compiler warning for builds without VNC
scsi: Handle no media case for scsi_get_configuration
rcu: actually register threads that have RCU read-side critical sections
scsi: fix buffer overflow in scsi_req_parse_cdb (CVE-2015-5158)
vnc: fix memory leak
qemu-char: Fix missed data on unix socket
qemu-char: handle EINTR for TCP character devices
exec.c: Use atomic_rcu_read() to access dispatch in memory_region_section_get_iotlb()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The MemoryRegionSection contains enough information to access the
RAM region underlying the framebuffer, and can be cached inside the
display device.
By doing this, the new framebuffer_update_memory_section function can
enable dirty memory logging on the relevant RAM region. The function
must be called whenever the stride or base of the framebuffer changes;
a simple way to cover these cases is to call it on every full frame
invalidation, which is a rare case.
framebuffer_update_display now works entirely on a MemoryRegionSection,
without going through cpu_physical_memory_map/unmap.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For a board that has multiple framebuffer devices, both of them
might want to use DIRTY_MEMORY_VGA on the same memory region.
The lack of reference counting in memory_region_set_log makes
this very awkward to implement.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, scsi_get_configuration always returns a current
profile (DVD or CD), even when there is actually no media present.
By comparison, ide/atapi uses a default profile of 0 (MMC_PROFILE_NONE)
for this case and checks for tray_open, so let's do the same for scsi.
This fixes a problem I'm seeing with Fedora 22 guests where systemd
cdrom_id fails to unmount after a QEMU-initiated eject against a
scsi cdrom device because it believes the media is still present
(but unreadable).
Signed-off-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Message-Id: <1436986352-10695-1-git-send-email-mjrosato@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is a guest-triggerable buffer overflow present in QEMU 2.2.0
and newer. scsi_cdb_length returns -1 as an error value, but the
caller does not check it.
Luckily, the massive overflow means that QEMU will just SIGSEGV,
making the impact much smaller.
Reported-by: Zhu Donghai (朱东海) <donghai.zdh@alibaba-inc.com>
Fixes: 1894df0281
Reviewed-by: Fam Zheng <famz@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Similar to the same fix for user-mode, except this instance
occurs on the softmmu path. Again, the tlb addend must be
the base register, while the guest address is the index.
Reviewed-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
NUMA queue, 2015-07-22
# gpg: Signature made Wed Jul 22 19:11:04 2015 BST using RSA key ID 984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/numa-pull-request:
hostmem: Fix qemu_opt_get_bool() crash in host_memory_backend_init()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit 812c1057 introduced HUP detection on unix and tcp sockets prior
to a read in tcp_chr_read. This unfortunately broke CloudStack 4.2
which relied on the old behaviour where data on a socket was readable
even if a HUP was present.
A working solution is to properly check the return values from recv,
handling a closed socket once there is no more data to read.
Also enable polling for G_IO_NVAL to ensure the callback is called
for all possible events as these should now be possible to handle
with the improved error detection.
Signed-off-by: Nils Carlson <pyssling@ludd.ltu.se>
Message-Id: <1437338396-22336-1-git-send-email-pyssling@ludd.ltu.se>
[Do not handle EINTR; use socket_error(). - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When accessing the dispatch pointer in an AddressSpace within an RCU
critical section we should always use atomic_rcu_read(). Fix an
access within memory_region_section_get_iotlb() which was incorrectly
doing a direct pointer access.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1437391637-31576-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
bootindex was incorrectly changed to a device Property during the
platform code split, resulting in it no longer working. Remove it.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: qemu-stable@nongnu.org # v2.3+
The RTL8168 quirk correctly describes using bit 31 as a signal to
mark a latch/completion, but the code mistakenly uses bit 28. This
causes the Realtek driver to spin on this register for quite a while,
20k cycles on Windows 7 v7.092 driver. Then it gets frustrated and
tries to set the bit itself and spins for another 20k cycles. For
some this still results in a working driver, for others not. About
the only thing the code really does in its current form is protect
the guest from sneaking in writes to the real hardware MSI-X table.
The fix is obviously to use bit 31 as we document that we should.
The other problem doesn't seem to affect current drivers as nobody
seems to use these window registers for writes to the MSI-X table, but
we need to use the stored data when a write is triggered, not the
value of the current write, which only provides the offset.
Note that only the Windows drivers from Realtek seem to use these
registers, the Microsoft drivers provided with Windows 8.1 do not
access them, nor do Linux in-kernel drivers.
Link: https://bugs.launchpad.net/qemu/+bug/1384892
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Cc: qemu-stable@nongnu.org # v2.1+
This fixes the following crash, introduced by commit
49d2e648e8:
$ gdb --args qemu-system-x86_64 -machine pc,mem-merge=off -object memory-backend-ram,id=ram-node0,size=1024
[...]
Program received signal SIGABRT, Aborted.
(gdb) bt
#0 0x00007ffff253b8c7 in raise () at /lib64/libc.so.6
#1 0x00007ffff253d52a in abort () at /lib64/libc.so.6
#2 0x00007ffff253446d in __assert_fail_base () at /lib64/libc.so.6
#3 0x00007ffff2534522 in () at /lib64/libc.so.6
#4 0x00005555558bb80a in qemu_opt_get_bool_helper (opts=0x55555621b650, name=name@entry=0x5555558ec922 "mem-merge", defval=defval@entry=true, del=del@entry=false) at qemu/util/qemu-option.c:388
#5 0x00005555558bbb5a in qemu_opt_get_bool (opts=<optimized out>, name=name@entry=0x5555558ec922 "mem-merge", defval=defval@entry=true) at qemu/util/qemu-option.c:398
#6 0x0000555555720a24 in host_memory_backend_init (obj=0x5555562ac970) at qemu/backends/hostmem.c:226
Instead of using qemu_opt_get_bool(), that didn't work with
qemu_machine_opts for a long time, we can use the corresponding
MachineState fields.
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
qxl: build fix for 2.4
# gpg: Signature made Wed Jul 22 15:55:00 2015 BST using DSA key ID F43F0992
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>"
# gpg: aka "Marc-Andre Lureau <marcandre.lureau@gmail.com>"
# gpg: aka "Marc-Andre Lureau <marc-andre.lureau@nokia.com>"
# gpg: aka "Marc-André Lureau <marc-andre.lureau@nokia.com>"
# gpg: aka "Marc-André Lureau (elmarco) <marcandre.lureau@gmail.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 7346 2483 9404 4E20 ABFF 7D48 D864 9487 F43F 0992
* remotes/elmarco/tags/for-upstream:
qxl: Fix new function name for spice-server library
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The new spice-server function to limit the number of monitors (0.12.6)
changed while development from spice_qxl_set_monitors_config_limit to
spice_qxl_max_monitors (accepted upstream).
By mistake I post patch with former name.
This patch fix the function name.
Signed-off-by: Frediano Ziglio <fziglio@redhat.com>
Acked-by: Christophe Fergeau <cfergeau@redhat.com>
Acked-by: Martin Kletzander <mkletzan@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
It is pretty rare for aio_notify to actually set the EventNotifier. It
can happen with worker threads such as thread-pool.c's, but otherwise it
should never be set thanks to the ctx->notify_me optimization. The
previous patch, unfortunately, added an unconditional call to
event_notifier_test_and_clear; now add a userspace fast path that
avoids the call.
Note that it is not possible to do the same with event_notifier_set;
it would break, as proved (again) by the included formal model.
This patch survived over 3000 reboots on aarch64 KVM.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Message-id: 1437487673-23740-7-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
event_notifier_test_and_clear must be called before processing events.
Otherwise, an aio_poll could "eat" the notification before the main
I/O thread invokes ppoll(). The main I/O thread then never wakes up.
This is an example of what could happen:
i/o thread vcpu thread worker thread
---------------------------------------------------------------------
lock_iothread
notify_me = 1
...
unlock_iothread
bh->scheduled = 1
event_notifier_set
lock_iothread
notify_me = 3
ppoll
notify_me = 1
aio_dispatch
aio_bh_poll
thread_pool_completion_bh
bh->scheduled = 1
event_notifier_set
node->io_read(node->opaque)
event_notifier_test_and_clear
ppoll
*** hang ***
"Tracing" with qemu_clock_get_ns shows pretty much the same behavior as
in the previous bug, so there are no new tricks here---just stare more
at the code until it is apparent.
One could also use a formal model, of course. The included one shows
this with three processes: notifier corresponds to a QEMU thread pool
worker, temporary_waiter to a VCPU thread that invokes aio_poll(),
waiter to the main I/O thread. I would be happy to say that the
formal model found the bug for me, but actually I wrote it after the
fact.
This patch is a bit of a big hammer. The next one optimizes it,
with help (this time for real rather than a posteriori :)) from
another, similar formal model.
Reported-by: Richard W. M. Jones <rjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Message-id: 1437487673-23740-6-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This patch rewrites the ctx->dispatching optimization, which was the cause
of some mysterious hangs that could be reproduced on aarch64 KVM only.
The hangs were indirectly caused by aio_poll() and in particular by
flash memory updates's call to blk_write(), which invokes aio_poll().
Fun stuff: they had an extremely short race window, so much that
adding all kind of tracing to either the kernel or QEMU made it
go away (a single printf made it half as reproducible).
On the plus side, the failure mode (a hang until the next keypress)
made it very easy to examine the state of the process with a debugger.
And there was a very nice reproducer from Laszlo, which failed pretty
often (more than half of the time) on any version of QEMU with a non-debug
kernel; it also failed fast, while still in the firmware. So, it could
have been worse.
For some unknown reason they happened only with virtio-scsi, but
that's not important. It's more interesting that they disappeared with
io=native, making thread-pool.c a likely suspect for where the bug arose.
thread-pool.c is also one of the few places which use bottom halves
across threads, by the way.
I hope that no other similar bugs exist, but just in case :) I am
going to describe how the successful debugging went... Since the
likely culprit was the ctx->dispatching optimization, which mostly
affects bottom halves, the first observation was that there are two
qemu_bh_schedule() invocations in the thread pool: the one in the aio
worker and the one in thread_pool_completion_bh. The latter always
causes the optimization to trigger, the former may or may not. In
order to restrict the possibilities, I introduced new functions
qemu_bh_schedule_slow() and qemu_bh_schedule_fast():
/* qemu_bh_schedule_slow: */
ctx = bh->ctx;
bh->idle = 0;
if (atomic_xchg(&bh->scheduled, 1) == 0) {
event_notifier_set(&ctx->notifier);
}
/* qemu_bh_schedule_fast: */
ctx = bh->ctx;
bh->idle = 0;
assert(ctx->dispatching);
atomic_xchg(&bh->scheduled, 1);
Notice how the atomic_xchg is still in qemu_bh_schedule_slow(). This
was already debated a few months ago, so I assumed it to be correct.
In retrospect this was a very good idea, as you'll see later.
Changing thread_pool_completion_bh() to qemu_bh_schedule_fast() didn't
trigger the assertion (as expected). Changing the worker's invocation
to qemu_bh_schedule_slow() didn't hide the bug (another assumption
which luckily held). This already limited heavily the amount of
interaction between the threads, hinting that the problematic events
must have triggered around thread_pool_completion_bh().
As mentioned early, invoking a debugger to examine the state of a
hung process was pretty easy; the iothread was always waiting on a
poll(..., -1) system call. Infinite timeouts are much rarer on x86,
and this could be the reason why the bug was never observed there.
With the buggy sequence more or less resolved to an interaction between
thread_pool_completion_bh() and poll(..., -1), my "tracing" strategy was
to just add a few qemu_clock_get_ns(QEMU_CLOCK_REALTIME) calls, hoping
that the ordering of aio_ctx_prepare(), aio_ctx_dispatch, poll() and
qemu_bh_schedule_fast() would provide some hint. The output was:
(gdb) p last_prepare
$3 = 103885451
(gdb) p last_dispatch
$4 = 103876492
(gdb) p last_poll
$5 = 115909333
(gdb) p last_schedule
$6 = 115925212
Notice how the last call to qemu_poll_ns() came after aio_ctx_dispatch().
This makes little sense unless there is an aio_poll() call involved,
and indeed with a slightly different instrumentation you can see that
there is one:
(gdb) p last_prepare
$3 = 107569679
(gdb) p last_dispatch
$4 = 107561600
(gdb) p last_aio_poll
$5 = 110671400
(gdb) p last_schedule
$6 = 110698917
So the scenario becomes clearer:
iothread VCPU thread
--------------------------------------------------------------------------
aio_ctx_prepare
aio_ctx_check
qemu_poll_ns(timeout=-1)
aio_poll
aio_dispatch
thread_pool_completion_bh
qemu_bh_schedule()
At this point bh->scheduled = 1 and the iothread has not been woken up.
The solution must be close, but this alone should not be a problem,
because the bottom half is only rescheduled to account for rare situations
(see commit 3c80ca1, thread-pool: avoid deadlock in nested aio_poll()
calls, 2014-07-15).
Introducing a third thread---a thread pool worker thread, which
also does qemu_bh_schedule()---does bring out the problematic case.
The third thread must be awakened *after* the callback is complete and
thread_pool_completion_bh has redone the whole loop, explaining the
short race window. And then this is what happens:
thread pool worker
--------------------------------------------------------------------------
<I/O completes>
qemu_bh_schedule()
Tada, bh->scheduled is already 1, so qemu_bh_schedule() does nothing
and the iothread is never woken up. This is where the bh->scheduled
optimization comes into play---it is correct, but removing it would
have masked the bug.
So, what is the bug?
Well, the question asked by the ctx->dispatching optimization ("is any
active aio_poll dispatching?") was wrong. The right question to ask
instead is "is any active aio_poll *not* dispatching", i.e. in the prepare
or poll phases? In that case, the aio_poll is sleeping or might go to
sleep anytime soon, and the EventNotifier must be invoked to wake
it up.
In any other case (including if there is *no* active aio_poll at all!)
we can just wait for the next prepare phase to pick up the event (e.g. a
bottom half); the prepare phase will avoid the blocking and service the
bottom half.
Expressing the invariant with a logic formula, the broken one looked like:
!(exists(thread): in_dispatching(thread)) => !optimize
or equivalently:
!(exists(thread):
in_aio_poll(thread) && in_dispatching(thread)) => !optimize
In the correct one, the negation is in a slightly different place:
(exists(thread):
in_aio_poll(thread) && !in_dispatching(thread)) => !optimize
or equivalently:
(exists(thread): in_prepare_or_poll(thread)) => !optimize
Even if the difference boils down to moving an exclamation mark :)
the implementation is quite different. However, I think the new
one is simpler to understand.
In the old implementation, the "exists" was implemented with a boolean
value. This didn't really support well the case of multiple concurrent
event loops, but I thought that this was okay: aio_poll holds the
AioContext lock so there cannot be concurrent aio_poll invocations, and
I was just considering nested event loops. However, aio_poll _could_
indeed be concurrent with the GSource. This is why I came up with the
wrong invariant.
In the new implementation, "exists" is computed simply by counting how many
threads are in the prepare or poll phases. There are some interesting
points to consider, but the gist of the idea remains:
1) AioContext can be used through GSource as well; as mentioned in the
patch, bit 0 of the counter is reserved for the GSource.
2) the counter need not be updated for a non-blocking aio_poll, because
it won't sleep forever anyway. This is just a matter of checking
the "blocking" variable. This requires some changes to the win32
implementation, but is otherwise not too complicated.
3) as mentioned above, the new implementation will not call aio_notify
when there is *no* active aio_poll at all. The tests have to be
adjusted for this change. The calls to aio_notify in async.c are fine;
they only want to kick aio_poll out of a blocking wait, but need not
do anything if aio_poll is not running.
4) nested aio_poll: these just work with the new implementation; when
a nested event loop is invoked, the outer event loop is never in the
prepare or poll phases. The outer event loop thus has already decremented
the counter.
Reported-by: Richard W. M. Jones <rjones@redhat.com>
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Message-id: 1437487673-23740-5-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Preparatory bugfixes and tweaks to the loop before the next patch:
- disable dispatch optimization during aio_prepare. This fixes a bug.
- do not modify "blocking" until after the first WaitForMultipleObjects
call. This is needed in the next patch.
- change the loop to do...while. This makes it obvious that the loop
is always entered at least once. In the next patch this is important
because the first iteration undoes the ctx->notify_me increment that
happened before entering the loop.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Message-id: 1437487673-23740-4-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In these tests, the purpose of the initial calls to aio_poll and
g_main_context_iteration is simply to put the AioContext in a
known state; the return value of the function does not really
matter. The next patch will change those return values; change
the assertions to a while loop which expresses the intention
better.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Message-id: 1437487673-23740-3-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The normal value for the event is to be set. If we do not do
this, pause_all_vcpus (through qemu_clock_enable) hangs unless
timerlist_run_timers has been run at least once for the timerlist.
This can happen with the following patches, that make aio_notify do
nothing most of the time.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Message-id: 1437487673-23740-2-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
tag for qga-pull-2015-07-21
Small fix to correct schema versioning annotations for recently-added
GuestDiskBusType enum values. Not the end of the world, but ideally
this inconsistency would be corrected prior to 2.4 release.
# gpg: Signature made Tue Jul 21 20:43:24 2015 BST using RSA key ID F108B584
# gpg: Good signature from "Michael Roth <flukshun@gmail.com>"
# gpg: aka "Michael Roth <mdroth@utexas.edu>"
# gpg: aka "Michael Roth <mdroth@linux.vnet.ibm.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: CEAC C9E1 5534 EBAB B82D 3FA0 3353 C9CE F108 B584
* remotes/mdroth/tags/qga-pull-2015-07-21-tag:
qga: fixed versions for guest bus types in qapi-schema
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
target-arm queue:
* don't sync CNTVCT with kernel all the time (fixes VM time weirdnesses)
* fix a warning compiling disas/arm-a64 with -Wextra
# gpg: Signature made Tue Jul 21 12:15:33 2015 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
* remotes/pmaydell/tags/pull-target-arm-20150721:
disas/arm-a64: Add missing compiler attribute GCC_FMT_ATTR
target-arm: kvm: Differentiate registers based on write-back levels
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Some registers like the CNTVCT register should only be written to the
kernel as part of machine initialization or on vmload operations, but
never during runtime, as this can potentially make time go backwards or
create inconsistent time observations between VCPUs.
Introduce a list of registers that should not be written back at runtime
and check this list on syncing the register state to the KVM state.
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Message-id: 1437046488-10773-1-git-send-email-christoffer.dall@linaro.org
[PMM: tweaked a few comments, added the new argument to the stub
write_list_to_kvmstate() in target-arm/kvm-stub.c]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# gpg: Signature made Mon Jul 20 19:27:04 2015 BST using RSA key ID AAFC390E
# gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: FAEB 9711 A12C F475 812F 18F2 88A9 064D 1835 61EB
# Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76 CBD0 7DEF 8106 AAFC 390E
* remotes/jnsnow/tags/ide-pull-request:
tests: Fix broken targets check-report-qtest-*
ahci: Force ICC bits in PxCMD to zero
qtest/ide: add another short PRDT test flavor
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
They need QTEST_QEMU_IMG. Without it, the tests raise an assertion:
$ make -C bin check-report-qtest-i386.xml
make: Entering directory 'bin'
GTESTER check-report-qtest-i386.xml
blkdebug: Suspended request 'A'
blkdebug: Resuming request 'A'
ahci-test: tests/libqos/libqos.c:162:
mkimg: Assertion `qemu_img_path' failed.
main-loop: WARNING: I/O thread spun for 1000 iterations
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1437231284-17455-1-git-send-email-sw@weilnetz.de
Signed-off-by: John Snow <jsnow@redhat.com>
Since commit 6e99c63 "net/socket: Drop net_socket_can_send" and friends,
net queues need to be explicitly flushed after qemu_can_send_packet()
returns false, because the netdev side will disable the polling of fd.
This fixes the case of "cont" after "stop" (or migration).
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1436232067-29144-1-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Convert partially summed packets to be fully checksummed.
In case csum offloaded packet, vmxnet3 implementation always passes an
RxCompDesc with the "Checksum calculated and found correct" notification
to the OS. This emulates the observed ESXi behavior.
Therefore, if packet has the NEEDS_CSUM bit set, we must calculate and
place a fully computed checksum into the tcp/udp header. Otherwise, the
OS driver will receive a checksum-correct indication but with the actual
tcp/udp checksum field having just the pseudo header csum value.
If host OS performs forwarding, it will forward an incorrectly
checksummed packet.
Signed-off-by: Dana Rubin <dana.rubin@ravellosystems.com>
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Message-id: 1436864116-19154-3-git-send-email-shmulik.ladkani@ravellosystems.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The AHCI spec requires that the HBA sets the ICC bits to zero after the
ICC change is done. Since we don't do any ICC change, force the bits to
zero all the time.
This fixes delays with some OSs (e.g. OpenBSD) waiting for the ICC bits
to change to 0.
Signed-off-by: Stefan Fritsch <sf@sfritsch.de>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: E1ZFpg7-00027N-HW@eru.sfritsch.de
Signed-off-by: John Snow <jsnow@redhat.com>
The existing short PRDT test case does not transfer any data because the
first PRD is less than 1 sector.
This patch adds another short PRDT test case where the first sector can
be read but the PRDT is still smaller than the requested number of
sectors. This exercises a different code path in ide_dma_cb().
Cc: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1435770571-9906-1-git-send-email-stefanha@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Commit e0cf11f31c ("timer: Use a single
definition of NSEC_PER_SEC for the whole codebase") renamed
NANOSECONDS_PER_SECOND to NSEC_PER_SEC.
On Mac OS X there is a <dispatch/time.h> system header which also
defines NSEC_PER_SEC. This causes compiler warnings.
Let's use the old name instead. It's longer but it doesn't clash.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1436364609-7929-1-git-send-email-stefanha@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Block layer patches for 2.4.0-rc2
# gpg: Signature made Mon Jul 20 15:48:56 2015 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
* remotes/kevin/tags/for-upstream:
crypto: Fix aes_decrypt_wrapper()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
virtio, vhost, pc fixes for 2.4
The only notable thing here is vhost-user multiqueue
revert. We'll work on making it stable in 2.5,
reverting now means we won't have to maintain
bug for bug compability forever.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Mon Jul 20 12:24:00 2015 BST using RSA key ID D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
* remotes/mst/tags/for_upstream:
virtio-net: remove virtio queues if the guest doesn't support multiqueue
virtio-net: Flush incoming queues when DRIVER_OK is being set
pci_add_capability: remove duplicate comments
virtio-net: unbreak any layout
Revert "vhost-user: add multi queue support"
ich9: fix skipped vmstate_memhp_state subsection
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit d3462e3 broke qcow2's encryption functionality by using encrypt
instead of decrypt in the wrapper function it introduces. This was found
by qemu-iotests case 134.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
commit da51a335 adds all queues in .realize(). But if the
guest doesn't support multiqueue, we forget to remove them. And
we cannot handle the ctrl vq corretly. The guest will hang.
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
This patch fixes network hang after "stop" then "cont", while network
packets keep arriving.
Tested both manually (tap, host pinging guest) and with Jason's qtest
series (plus his "[PATCH 2.4] socket: pass correct size in
net_socket_send()" fix).
As virtio_net_set_status is called when guest driver is setting status
byte and when vm state is changing, it is a good opportunity to flush
queued packets.
This is necessary because during vm stop the backend (e.g. tap) would
stop rx processing after .can_receive returns false, until the queue is
explicitly flushed or purged.
The other interesting condition in .can_receive, virtio_queue_ready(),
is handled by virtio_net_handle_rx() when guest kicks; the 3rd condition
is invalid queue index which doesn't need flushing.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit 032a74a1c0
("virtio-net: byteswap virtio-net header") breaks any layout by
requiring out_sg[0].iov_len >= n->guest_hdr_len. Fixing this by
copying header to temporary buffer if swap is needed, and then use
this buffer as part of out_sg.
Fixes 032a74a1c0
("virtio-net: byteswap virtio-net header")
Cc: qemu-stable@nongnu.org
Cc: clg@fr.ibm.com
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This reverts commit 830d70db69.
The interface isn't fully backwards-compatible, which is bad.
Let's redo this properly after 2.4.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
By declaring another .subsections array for vmstate_tco_io_state made
vmstate_memhp_state not registered anymore. There must be only one
.subsections array for all subsections.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Reported-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Paulo Alcantara <pcacjr@zytor.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Fire timer only when required. Brings down wakeups by a big number.
# gpg: Signature made Fri Jul 17 14:41:40 2015 BST using RSA key ID 854083B6
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg: aka "Amit Shah <amit@kernel.org>"
# gpg: aka "Amit Shah <amitshah@gmx.net>"
* remotes/amit-virtio-rng/tags/vrng-2.4:
virtio-rng: trigger timer only when guest requests for entropy
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch triggers timer only when guest requests for
entropy. As soon as first request from guest for entropy
comes we set the timer. Timer bumps up the quota value
when it gets triggered.
Signed-off-by: Pankaj Gupta <pagupta@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1436962608-9961-2-git-send-email-pagupta@redhat.com>
[Re-worded patch subject, removed extra whitespace -- Amit]
Signed-off-by: Amit Shah <amit.shah@redhat.com>
This reverts commit 4e8cfbe114.
We should not poll via timer, and with ccid being fixed
to properly notify us about pending transfers we don't have to.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Calling a function pointer that was cast from an incompatible function
results in undefined behavior. 'void *' isn't compatible with 'struct
XXX *', so we can't cast to nettle_cipher_func, but have to provide a
wrapper. (Conversion from 'void *' to 'struct XXX *' might require
computation, which won't be done if we drop argument's true type, and
pointers can have different sizes so passing arguments on stack would
bug.)
Having two different prototypes based on nettle version doesn't make
this solution any nicer.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Message-Id: <1437062641-12684-3-git-send-email-rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In nettle 3, cbc_encrypt() accepts 'nettle_cipher_func' instead of
'nettle_crypt_func' and these two differ in 'const' qualifier of the
first argument. The build fails with:
In file included from crypto/cipher.c:71:0:
./crypto/cipher-nettle.c: In function ‘qcrypto_cipher_encrypt’:
./crypto/cipher-nettle.c:154:38: error: passing argument 2 of
‘nettle_cbc_encrypt’ from incompatible pointer type
cbc_encrypt(ctx->ctx_encrypt, ctx->alg_encrypt,
^
In file included from ./crypto/cipher-nettle.c:24:0,
from crypto/cipher.c:71:
/usr/include/nettle/cbc.h:48:1: note: expected
‘void (*)(const void *, size_t, uint8_t *, const uint8_t *)
but argument is of type
‘void (*)( void *, size_t, uint8_t *, const uint8_t *)
To allow both versions, we switch to the new definition and #if typedef
it for old versions.
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Message-Id: <1436548682-9315-2-git-send-email-rkrcmar@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
memory_region_present() leaks a reference to a MemoryRegion in the
case "mr == container". While fixing it, avoid reference counting
altogether for memory_region_present(), by using RCU only.
The return value could in principle be already invalid immediately
after memory_region_present returns, but presumably the caller knows
that and it's using memory_region_present to probe for devices that
are unpluggable, or something like that. The RCU critical section
is needed anyway, because it protects as->current_map.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch allow to limit number of heads using qxl driver. By default
qxl driver is not limited on any kind on head use so can decide to use
as much heads.
libvirt has this as a video card parameter (actually set to 1 but not
used). This parameter will allow to limit setting a use can do (which
could be confusing).
Signed-off-by: Frediano Ziglio <fziglio@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2015-07-16 17:31:05 +02:00
198 changed files with 7587 additions and 1153 deletions
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.