Git-commit: 89fbea8737
References: bsc#1182137
Depending on the client activity, the server can be asked to open a huge
number of file descriptors and eventually hit RLIMIT_NOFILE. This is
currently mitigated using a reclaim logic : the server closes the file
descriptors of idle fids, based on the assumption that it will be able
to re-open them later. This assumption doesn't hold of course if the
client requests the file to be unlinked. In this case, we loop on the
entire fid list and mark all related fids as unreclaimable (the reclaim
logic will just ignore them) and, of course, we open or re-open their
file descriptors if needed since we're about to unlink the file.
This is the purpose of v9fs_mark_fids_unreclaim(). Since the actual
opening of a file can cause the coroutine to yield, another client
request could possibly add a new fid that we may want to mark as
non-reclaimable as well. The loop is thus restarted if the re-open
request was actually transmitted to the backend. This is achieved
by keeping a reference on the first fid (head) before traversing
the list.
This is wrong in several ways:
- a potential clunk request from the client could tear the first
fid down and cause the reference to be stale. This leads to a
use-after-free error that can be detected with ASAN, using a
custom 9p client
- fids are added at the head of the list : restarting from the
previous head will always miss fids added by a some other
potential request
All these problems could be avoided if fids were being added at the
end of the list. This can be achieved with a QSIMPLEQ, but this is
probably too much change for a bug fix. For now let's keep it
simple and just restart the loop from the current head.
Fixes: CVE-2021-20181
Buglink: https://bugs.launchpad.net/qemu/+bug/1911666
Reported-by: Zero Day Initiative <zdi-disclosures@trendmicro.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Message-Id: <161064025265.1838153.15185571283519390907.stgit@bahia.lan>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: b946434f26
References: bsc#1175441, CVE-2020-14364
Store calculated setup_len in a local variable, verify it, and only
write it to the struct (USBDevice->setup_len) in case it passed the
sanity checks.
This prevents other code (do_token_{in,out} functions specifically)
from working with invalid USBDevice->setup_len values and overrunning
the USBDevice->setup_buf[] buffer.
Fixes: CVE-2020-14364
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Message-id: 20200825053636.29648-1-kraxel@redhat.com
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 64c9bc181f
References: bsc#1175441, CVE-2020-14364
When processing remote NDIS control message packets, the USB Net
device emulator uses a fixed length(4096) data buffer. The incoming
packet length could exceed this limit. Add a check to avoid it.
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1455648821-17340-2-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 2e1dcbc0c2af64fcb17009eaf2ceedd81be2b27f
References: bsc#1179467
While processing ARP/NCSI packets in 'arp_input' or 'ncsi_input'
routines, ensure that pkt_len is large enough to accommodate the
respective protocol headers, lest it should do an OOB access.
Add check to avoid it.
CVE-2020-29129 CVE-2020-29130
QEMU: slirp: out-of-bounds access while processing ARP/NCSI packets
-> https://www.openwall.com/lists/oss-security/2020/11/27/1
Reported-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20201126135706.273950-1-ppandit@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
[BR: no NCSI code, so only ARP patched]
Git-commit: 3de46e6fc4
References: bsc#1182577, CVE-2021-20257
During procss_tx_desc(), driver can try to chain data descriptor with
legacy descriptor, when will lead underflow for the following
calculation in process_tx_desc() for bytes:
if (tp->size + bytes > msh)
bytes = msh - tp->size;
This will lead a infinite loop. So check and fail early if tp->size if
greater or equal to msh.
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Cheolwoo Myung <cwmyung@snu.ac.kr>
Reported-by: Ruhr-University Bochum <bugs-syssec@rub.de>
Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 8132122889
References: bsc#1181108, CVE-2020-29443
A case was reported where s->io_buffer_index can be out of range.
The report skimped on the details but it seems to be triggered
by s->lba == -1 on the READ/READ CD paths (e.g. by sending an
ATAPI command with LBA = 0xFFFFFFFF). For now paper over it
with assertions. The first one ensures that there is no overflow
when incrementing s->io_buffer_index, the second checks for the
buffer overrun.
Note that the buffer overrun is only a read, so I am not sure
if the assertion failure is actually less harmful than the overrun.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20201201120926.56559-1-pbonzini@redhat.com
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 1be90ebecc
References: bsc#1176684, CVE-2020-25625
While servicing OHCI transfer descriptors(TD), ohci_service_iso_td
retires a TD if it has passed its time frame. It does not check if
the TD was already processed once and holds an error code in TD_CC.
It may happen if the TD list has a loop. Add check to avoid an
infinite loop condition.
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Message-id: 20200915182259.68522-3-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: f50ab86a26
References: bsc#1172383, CVE-2020-13362
A guest user may set 'reply_queue_head' field of MegasasState to
a negative value. Later in 'megasas_lookup_frame' it is used to
index into s->frames[] array. Use unsigned type to avoid OOB
access issue.
Also check that 'index' value stays within s->frames[] bounds
through the while() loop in 'megasas_lookup_frame' to avoid OOB
access.
Reported-by: Ren Ding <rding@gatech.edu>
Reported-by: Hanqing Zhao <hanqing@gatech.edu>
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Acked-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20200513192540.1583887-2-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
References: bsc#1172385, CVE-2020-12829
This is in place of commit 4a1f253adb which added the log.h
reference, but did other changes we're not interested in.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: f3a60058c9
References: bsc#1172385, CVE-2020-12829
The value from twoD_foreground (which is in host endian format) must
be converted to the endianness of the framebuffer (currently always
little endian) before it can be used to perform the fill operation.
Signed-off-by: Marcus Comstedt <marcus@mc.pp.se>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: eb76243c9d
References: bsc#1172385, CVE-2020-12829
Set the changed memory region dirty after performed a 2D operation to
ensure that the screen is updated properly.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 33159dd7ce
References: bsc#1172385, CVE-2020-12829
Display updates and drawing hardware cursor did not work when frame
buffer address was non-zero. Fix this by taking the frame buffer
address into account in these cases. This fixes screen dragging on
AmigaOS. Based on patch by Sebastian Bauer.
Signed-off-by: Sebastian Bauer <mail@sebastianbauer.info>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 06cb926aaa
References: bsc#1172385, CVE-2020-12829
The sm501 currently implements only a very limited set of raster operation
modes. After this change, unknown raster operation modes are logged so
these can be easily spotted.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: debc7e7dad
References: bsc#1172385, CVE-2020-12829
Add support for the negated destination operation mode. This is used e.g.
by AmigaOS for the INVERSEVID drawing mode. With this change, the cursor
in the shell and non-immediate window adjustment are working now.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 54b2a4339c
References: bsc#1172385, CVE-2020-12829
Before, crt_h_total was used for src_width and dst_width. This is a
property of the current display setting and not relevant for the 2D
operation that also can be done off-screen. The pitch register's purpose
is to describe line pitch relevant of the 2D operation.
Signed-off-by: Sebastian Bauer <mail@sebastianbauer.info>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 5690d9ecef
References: bsc#1172385, CVE-2020-12829
These are not really implemented (just return zero or default values)
but add these so guests accessing them can run.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 6a2a5aae02
References: bsc#1172478, CVE-2020-13765
Rework HWC handling to simplify it and fix cursor not updating on
screen as needed. Previously cursor was not updated because checking
for changes in a line overrode the update flag set for the cursor but
fixing this is not enough because the cursor should also be updated if
its shape or location changes. Introduce hwc_invalidate() function to
handle that similar to other display controller models.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 6970a5e9868b7246656c1d02038dc5d5fa369507.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: e423455c4f
References: bsc#1172478, CVE-2020-13765
Both, "rom->addr" and "addr" are derived from the binary image
that can be loaded with the "-kernel" paramer. The code in
rom_copy() then calculates:
d = dest + (rom->addr - addr);
and uses "d" as destination in a memcpy() some lines later. Now with
bad kernel images, it is possible that rom->addr is smaller than addr,
thus "rom->addr - addr" gets negative and the memcpy() then tries to
copy contents from the image to a bad memory location. This could
maybe be used to inject code from a kernel image into the QEMU binary,
so we better fix it with an additional sanity check here.
Cc: qemu-stable@nongnu.org
Reported-by: Guangming Liu
Buglink: https://bugs.launchpad.net/qemu/+bug/1844635
Message-Id: <20190925130331.27825-1-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 369ff955a8
References: bsc#1172384, CVE-2020-13361
A guest user may set channel frame count via es1370_write()
such that, in es1370_transfer_audio(), total frame count
'size' is lesser than the number of frames that are processed
'cnt'.
int cnt = d->frame_cnt >> 16;
int size = d->frame_cnt & 0xffff;
if (size < cnt), it results in incorrect calculations leading
to OOB access issue(s). Add check to avoid it.
Reported-by: Ren Ding <rding@gatech.edu>
Reported-by: Hanqing Zhao <hanqing@gatech.edu>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20200514200608.1744203-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
When executing script in lsi_execute_script(), the LSI scsi adapter
emulator advances 's->dsp' index to read next opcode. This can lead
to an infinite loop if the next opcode is empty. Move the existing
loop exit after 10k iterations so that it covers no-op opcodes as
well.
Reported-by: Bugs SysSec <bugs-syssec@rub.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit de594e4765)
[BR: BSC#1146873 CVE-2019-12068 - minor trace related tweak applied]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 2faae0f778f818fadc873308f983289df697eb93
References: bsc#1170940, CVE-2020-1983
The q pointer is updated when the mbuf data is moved from m_dat to
m_ext.
m_ext buffer may also be realloc()'ed and moved during m_cat():
q should also be updated in this case.
Reported-by: Aviv Sasson <asasson@paloaltonetworks.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit 9bd6c5913271eabcb7768a58197ed3301fe19f2d)
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 68ccb8021a838066f0951d4b2817eb6b6f10a843
References: bsc#1163018, CVE-2020-8608
Various calls to snprintf() assume that snprintf() returns "only" the
number of bytes written (excluding terminating NUL).
https://pubs.opengroup.org/onlinepubs/9699919799/functions/snprintf.html#tag_16_159_04
"Upon successful completion, the snprintf() function shall return the
number of bytes that would be written to s had n been sufficiently
large excluding the terminating null byte."
Before patch ce131029, if there isn't enough room in "m_data" for the
"DCC ..." message, we overflow "m_data".
After the patch, if there isn't enough room for the same, we don't
overflow "m_data", but we set "m_len" out-of-bounds. The next time an
access is bounded by "m_len", we'll have a buffer overflow then.
Use slirp_fmt*() to fix potential OOB memory access.
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20200127092414.169796-7-marcandre.lureau@redhat.com>
[Since porting from a quite different now libslirp, quite a bit changed]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 82ebe9c370a0e2970fb5695aa19aa5214a6a1c80
References: bsc#1161066, CVE-2020-7039, bsc#1163018, CVE-2020-8608
While emulating services in tcp_emu(), it uses 'mbuf' size
'm->m_size' to write commands via snprintf(3). Use M_FREEROOM(m)
size to avoid possible OOB access.
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20200109094228.79764-3-ppandit@redhat.com>
[Since porting from a quite different now libslirp, quite a bit changed]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: ce131029d6d4a405cb7d3ac6716d03e58fb4a5d9
References: bsc#1161066, CVE-2020-7039, bsc#1163018, CVE-2020-8608
While emulating IRC DCC commands, tcp_emu() uses 'mbuf' size
'm->m_size' to write DCC commands via snprintf(3). This may
lead to OOB write access, because 'bptr' points somewhere in
the middle of 'mbuf' buffer, not at the start. Use M_FREEROOM(m)
size to avoid OOB access.
Reported-by: Vishnu Dev TJ <vishnudevtj@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20200109094228.79764-2-ppandit@redhat.com>
[Since porting from a quite different now libslirp, quite a bit changed]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 2655fffed7a9e765bcb4701dd876e9dab975f289
References: bsc#1161066, CVE2020-7039, bsc#1161066, CVS-2020-7039
The main loop only checks for one available byte, while we sometimes
need two bytes.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Git-commit: 30648c03b27fb8d9611b723184216cd3174b6775
References: bsc#1163018, CVE-2020-8608
Various calls to snprintf() in libslirp assume that snprintf() returns
"only" the number of bytes written (excluding terminating NUL).
https://pubs.opengroup.org/onlinepubs/9699919799/functions/snprintf.html#tag_16_159_04
"Upon successful completion, the snprintf() function shall return the
number of bytes that would be written to s had n been sufficiently
large excluding the terminating null byte."
Introduce slirp_fmt() that handles several pathological cases the
way libslirp usually expect:
- treat error as fatal (instead of silently returning -1)
- fmt0() will always \0 end
- return the number of bytes actually written (instead of what would
have been written, which would usually result in OOB later), including
the ending \0 for fmt0()
- warn if truncation happened (instead of ignoring)
Other less common cases can still be handled with strcpy/snprintf() etc.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20200127092414.169796-2-marcandre.lureau@redhat.com>
[Since porting from a quite different now libslirp, quite a bit changed]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When the first fragment does not fit in the preallocated buffer, q will
already be pointing to the ext buffer, so we mustn't try to update it.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit 126c04acbabd7ad32c2b018fe10dfac2a3bc1210)
[LY: CVE-2019-14378 BSC#1143794]
Signed-off-by: Liang Yan <lyan@suse.com>
For some reason, EMU_IDENT is not like other "emulated" protocols and
tries to reconstitute the original buffer, if it came in multiple
packets. Unfortunately, it does so wrongly, as it doesn't respect the
sbuf circular buffer appending rules, nor does it maintain some of the
invariants (rptr is incremented without bounds, etc): this leads to
further memory corruption revealed by ASAN or various malloc
errors. Furthermore, the so_rcv buffer is regularly flushed, so there
is no guarantee that buffer reconstruction will do what is expected.
Instead, do what the function comment says: "XXX Assumes the whole
command came in one packet", and don't touch so_rcv.
Related to: https://bugzilla.redhat.com/show_bug.cgi?id=1664205
Cc: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Using ip_deq after m_free might read pointers from an allocation reuse.
This would be difficult to exploit, but that is still related with
CVE-2019-14378 which generates fragmented IP packets that would trigger this
issue and at least produce a DoS.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(from libslirp.git commit c59279437eda91841b9d26079c70b8a540d41204)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[BR: BSC#1149811 CVE-2019-15890]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The interface names in qemu-bridge-helper are defined to be
of size IFNAMSIZ(=16), including the terminating null('\0') byte.
The same is applied to interface names read from 'bridge.conf'
file to form ACLs rules. If user supplied '--br=bridge' name
is not restricted to the same length, it could lead to ACL bypass
issue. Restrict bridge name to IFNAMSIZ, including null byte.
Reported-by: Riccardo Schirone <rschiron@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[LY: BSC#1140402 CVE-2019-13164]
Signed-off-by: Liang Yan <lyan@suse.com>
When releasing spice resources in release_resource() routine,
if release info object 'ext.info' is null, it leads to null
pointer dereference. Add check to avoid it.
Reported-by: Bugs SysSec <bugs-syssec@rub.de>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20190425063534.32747-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit d52680fc93)
[LY: BSC#1135902 CVE-2019-12155]
Signed-off-by: Liang Yan <lyan@suse.com>
md-clear is a new CPUID bit which is set when microcode provides the
mechanism to invoke a flush of various exploitable CPU buffers by invoking
the VERW instruction. Add the new feature.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[BR: BSC#1111331 CVE-2018-12126 CVE-2018-12127 CVE-2018-12130
CVE-2019-11091]
Signed-off-by: Bruce Rogers <brogers@suse.com>
A subsequent patch to ppc/spapr needs to load the RTAS blob into
qemu memory rather than target memory (so it can later be copied
into the right spot at machine reset time).
I would use load_image() but it is marked deprecated because it
doesn't take a buffer size as argument, so let's add load_image_size()
that does.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[aik: fixed errors from checkpatch.pl]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit ea87616d6c)
[LM: BSC#1130675 CVE-2018-20815]
Signed-off-by: Lin Ma <lma@suse.com>
Recent commit 5b76ef50f6 fixed a race where v9fs_co_open2() could
possibly overwrite a fid path with v9fs_path_copy() while it is being
accessed by some other thread, ie, use-after-free that can be detected
by ASAN with a custom 9p client.
It turns out that the same can happen at several locations where
v9fs_path_copy() is used to set the fid path. The fix is again to
take the write lock.
Fixes CVE-2018-19364.
Cc: P J P <ppandit@redhat.com>
Reported-by: zhibin hu <noirfate@gmail.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 5b3c77aa58)
[BR: BSC#1116717 CVE-2018-19364]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The assumption that the fid cannot be used by any other operation is
wrong. At least, nothing prevents a misbehaving client to create a
file with a given fid, and to pass this fid to some other operation
at the same time (ie, without waiting for the response to the creation
request). The call to v9fs_path_copy() performed by the worker thread
after the file was created can race with any access to the fid path
performed by some other thread. This causes use-after-free issues that
can be detected by ASAN with a custom 9p client.
Unlike other operations that only read the fid path, v9fs_co_open2()
does modify it. It should hence take the write lock.
Cc: P J P <ppandit@redhat.com>
Reported-by: zhibin hu <noirfate@gmail.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 5b76ef50f6)
[BR: BSC#1116717 CVE-2018-19364]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When using the 9P2000.u version of the protocol, the following shell
command line in the guest can cause QEMU to crash:
while true; do rm -rf aa; mkdir -p a/b & touch a/b/c & mv a aa; done
With 9P2000.u, file renaming is handled by the WSTAT command. The
v9fs_wstat() function calls v9fs_complete_rename(), which calls
v9fs_fix_path() for every fid whose path is affected by the change.
The involved calls to v9fs_path_copy() may race with any other access
to the fid path performed by some worker thread, causing a crash like
shown below:
Thread 12 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555a25da2 in local_open_nofollow (fs_ctx=0x555557d958b8, path=0x0,
flags=65536, mode=0) at hw/9pfs/9p-local.c:59
59 while (*path && fd != -1) {
(gdb) bt
path=0x0, flags=65536, mode=0) at hw/9pfs/9p-local.c:59
path=0x0) at hw/9pfs/9p-local.c:92
fs_path=0x555556b56858, stbuf=0x7fff84830ef0) at hw/9pfs/9p-local.c:185
path=0x555556b56858, stbuf=0x7fff84830ef0) at hw/9pfs/cofile.c:53
at hw/9pfs/9p.c:1083
at util/coroutine-ucontext.c:116
(gdb)
The fix is to take the path write lock when calling v9fs_complete_rename(),
like in v9fs_rename().
Impact: DoS triggered by unprivileged guest users.
Fixes: CVE-2018-19489
Cc: P J P <ppandit@redhat.com>
Reported-by: zhibin hu <noirfate@gmail.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 1d20398694)
[BR: BSC#1117275]
Signed-off-by: Bruce Rogers <brogers@suse.com>
While emulating identification protocol, tcp_emu() does not check
available space in the 'sc_rcv->sb_data' buffer. It could lead to
heap buffer overflow issue. Add check to avoid it.
Reported-by: Kira <864786842@qq.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit a7104eda7d)
[BR: BSC#1123156 CVE-2019-6778, modify patch to use spaces instead of tabs]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The configuration of the timer represented by MSR_IA32_TSCDEADLINE depends on:
- APIC LVT Timer register.
- TSC value.
Change the order to respect the dependency.
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7477cd3897)
[LM: BSC#1109544]
Signed-off-by: Lin Ma <lma@suse.com>
While writing a message in 'lsi_do_msgin', message length value
in 'msg_len' could be invalid due to an invalid migration stream.
Add an assertion to avoid an out of bounds access, and reject
the incoming migration data if it contains an invalid message
length.
Discovered by Deja vu Security. Reported by Oracle.
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20181026194314.18663-1-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e58ccf0396)
[BR: BSC#1114422 CVE-2018-18849]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Due to an integer overflow issue found in ccid_card_vscard_read
(CVE-2018-18438), memory was previously susceptible to corruption. To
mitigate this issue, a number of components, including the
ccid-card-passthrough device are now utilizing size_t in place of int to
prevent an the overflow from occuring when values larger than
available space are passed to corresponding functions.
While CCID smart card readers were the primary target of the
vulnerability, a number of other subsystems have also been modified in the
same manner to prevent them from being exploited in the same way; specifically
functions which are modeled after the IOCanReadHandler and IOReadHandler
typedefs.
Subsystems affected include, but are not limited to:
CCID Smart Cards
Spice consoles
Monitors
Network Sockets
Network Tap Devices
Serial IO
Emulated Shell Serial IO
USB Redirection and Passthrough
Consoles
SLiRP IO
Mouse Emulation
Xen Consoles
SLCP IO
Network Block Devices
(cherry picked from commit 4792016a6119a4d5768bd8f989ae90b52dfef789)
[LD: BSC#1112185 CVE-2018-18438]
Signed-off-by: Larry Dewey <ldewey@suse.com>
There should not be a reason for passing a packet size greater than
INT_MAX. It's usually a hint of bug somewhere, so ignore packet size
greater than INT_MAX in qemu_deliver_packet_iov()
CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 1592a99470)
[LD: BSC#1111013 CVE-2018-17963]
Signed-off-by: Larry Dewey <ldewey@suse.com>
In pcnet_receive(), we try to assign size_ to size which converts from
size_t to integer. This will cause troubles when size_ is greater
INT_MAX, this will lead a negative value in size and it can then pass
the check of size < MIN_BUF_SIZE which may lead out of bound access
for both buf and buf1.
Fixing by converting the type of size to size_t.
CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit b1d80d12c5)
[LD: BSC#1111010 CVE-2018-17962]
Signed-off-by: Larry Dewey <ldewey@suse.com>
In rtl8139_do_receive(), we try to assign size_ to size which converts
from size_t to integer. This will cause troubles when size_ is greater
INT_MAX, this will lead a negative value in size and it can then pass
the check of size < MIN_BUF_SIZE which may lead out of bound access of
for both buf and buf1.
Fixing by converting the type of size to size_t.
CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 1a326646fe)
[LD: BSC#1111006 CVE-2018-17958]
Signed-off-by: Larry Dewey <ldewey@suse.com>
In ne2000_receive(), we try to assign size_ to size which converts
from size_t to integer. This will cause troubles when size_ is greater
INT_MAX, this will lead a negative value in size and it can then pass
the check of size < MIN_BUF_SIZE which may lead out of bound access of
for both buf and buf1.
Fixing by converting the type of size to size_t.
CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit fdc89e90fa)
[LD: BSC#1110910 CVE-2018-10839]
Signed-off-by: Larry Dewey <ldewey@suse.com>
Just before starting a child thread, a call to seccomp_start() has been
added to ensure that each thread is sandboxed properly, and that
seccomp is successfully applied to all threads when the user specifies
that sanboxing should be used either via /etc/libvirt/qemu.conf or via
the command line by passing `--seccomp on` to qemu.
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
AMD Zen expose the Intel equivalant to Speculative Store Bypass Disable
via the 0x80000008_EBX[25] CPUID feature bit.
This needs to be exposed to guest OS to allow them to protect
against CVE-2018-3639.
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20180521215424.13520-3-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 403503b162)
[BR: BSC#1092885 CVE-2018-3639]
Signed-off-by: Bruce Rogers <brogers@suse.com>
"Some AMD processors only support a non-architectural means of enabling
speculative store bypass disable (SSBD). To allow a simplified view of
this to a guest, an architectural definition has been created through a new
CPUID bit, 0x80000008_EBX[25], and a new MSR, 0xc001011f. With this, a
hypervisor can virtualize the existence of this definition and provide an
architectural method for using SSBD to a guest.
Add the new CPUID feature, the new MSR and update the existing SSBD
support to use this MSR when present." (from x86/speculation: Add virtualized
speculative store bypass disable support in Linux).
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20180521215424.13520-4-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit cfeea0c021)
[BR: BSC#1092885 CVE-2018-3639]
Signed-off-by: Bruce Rogers <brogers@suse.com>
While reading file content via 'guest-file-read' command,
'qmp_guest_file_read' routine allocates buffer of count+1
bytes. It could overflow for large values of 'count'.
Add check to avoid it.
Reported-by: Fakhri Zulkifli <mohdfakhrizulkifli@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit 141b197408)
[FL: BSC#1098735 CVE-2018-12617 modify as only qga/commands-posix.c
has related code, but not qga/commands-win32.c]
Signed-off-by: Fei Li <fli@suse.com>
While reassembling incoming fragmented datagrams, 'm_cat' routine
extends the 'mbuf' buffer, if it has insufficient room. It computes
a wrong buffer size, which leads to overwriting adjacent heap buffer
area. Correct this size computation in m_cat.
Reported-by: ZDI Disclosures <zdi-disclosures@trendmicro.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit 864036e251)
[FL: BSC#1096223 CVE-2018-11806]
Signed-off-by: Fei Li <fli@suse.com>
As an attempt to help the user do the right thing, warn if we
detect spec_ctrl data in the migration stream, but where the
cpu defined doesn't have the feature. This would indicate the
migration is from the quick and dirty qemu produced in January
2018 to handle Spectre v2. That qemu version exposed the IBRS
cpu feature to all vcpu types, which helped in the short term
but wasn't a well designed approach.
Warn the user that the now migrated guest needs to be restarted
as soon as possible, using the spec_ctrl cpu feature flag or a
*-IBRS vcpu model specified as appropriate.
Signed-off-by: Bruce Rogers <brogers@suse.com>
The new MSR IA32_SPEC_CTRL MSR was introduced by a recent Intel
microcode updated and can be used by OSes to mitigate
CVE-2017-5715. Unfortunately we can't change the existing CPU
models without breaking existing setups, so users need to
explicitly update their VM configuration to use the new *-IBRS
CPU model if they want to expose IBRS to guests.
The new CPU models are simple copies of the existing CPU models,
with just CPUID_7_0_EDX_SPEC_CTRL added and model_id updated.
Cc: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-6-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit ac96c41354)
[BR: BSC#1068032 CVE-2017-5715 CPU models are reduced as appropriate
to match this QEMU version, and features specified according to current
code conventions]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Add the new feature word and the "ibpb" feature flag.
Based on a patch by Paolo Bonzini.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-5-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 1b3420e1c4)
[BR: BSC#1068032 CVE-2017-5715 change to match current code's feat_name
type. Also adapted to old style feature management.]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The spec can be found in Intel Software Developer Manual or in
Instruction Set Extensions Programming Reference.
Signed-off-by: Piotr Luc <piotr.luc@intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Message-Id: <1477902446-5932-1-git-send-email-he.chen@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 95ea69fb46)
[BR: BSC#1068032 CVE-2017-5715 - orig patch modified to not provide new
feature, just infrastructure, and change to match current code's feat_name
type. Adjust to old style feature tracking as well.]
Signed-off-by: Bruce Rogers <brogers@suse.com>
This is perhaps the easiest way to handle the longer model names
introduced with the -IBRS models. This is in lew of backporting
commit 807e9869b8 and other commits
which that would require.
[BR: BSC#1068032 CVE-2017-5715]
Signed-off-by: Bruce Rogers <brogers@suse.com>
While loading kernel via multiboot-v1 image, (flags & 0x00010000)
indicates that multiboot header contains valid addresses to load
the kernel image. In that, end of the data segment address
'mh_load_end_addr' should be less than the bss segment address
'mh_bss_end_addr'. Add check to validate that.
Reported-by: CERT CC <cert.cc@orange.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[LY: BSC#1083291 CVE-2018-7550]
Signed-off-by: Liang Yan <lyan@suse.com>
Start a vm with qemu-kvm -enable-kvm -vnc :66 -smp 1 -m 1024 -hda
redhat_5.11.qcow2 -device pcnet -vga cirrus,
then use VNC client to connect to VM, and excute the code below in guest
OS will lead to qemu crash:
int main()
{
iopl(3);
srand(time(NULL));
int a,b;
while(1){
a = rand()%0x100;
b = 0x3c0 + (rand()%0x20);
outb(a,b);
}
return 0;
}
The above code is writing the registers of VGA randomly.
We can write VGA CRT controller registers index 0x0C or 0x0D
(which is the start address register) to modify the
the display memory address of the upper left pixel
or character of the screen. The address may be out of the
range of vga ram. So we should check the validation of memory address
when reading or writing it to avoid segfault.
Signed-off-by: linzhecheng <linzhecheng@huawei.com>
Message-id: 20180111132724.13744-1-linzhecheng@huawei.com
Fixes: CVE-2018-5683
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 191f59dc17)
[LY: BSC#1076114 CVE-2018-5683]
Signed-off-by: Liang Yan <lyan@suse.com>
Commit "bea60dd ui/vnc: fix potential memory corruption issues" is
incomplete. vnc_update_stats must calculate width and height the same
way vnc_refresh_server_surface does it, to make sure we don't use width
and height values larger than the qemu vnc server can handle.
Commit "e22492d ui/vnc: disable adaptive update calculations if not
needed" masks the issue in the default configuration. It triggers only
in case the "lossy" option is set to "on" (default is "off").
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1485248428-575-1-git-send-email-kraxel@redhat.com
(cherry picked from commit eebe0b7905)
[BR: BSC#1026612 CVE-2017-2633 (this fix fixes first fix)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The _cmp_bytes variable added by commit "bea60dd ui/vnc: fix potential
memory corruption issues" can become negative. Result is (possibly
exploitable) memory corruption. Reason for that is it uses the stride
instead of bytes per scanline to apply limits.
For the server surface is is actually fine. vnc creates that itself,
there is never any padding and thus scanline length always equals stride.
For the guest surface scanline length and stride are typically identical
too, but it doesn't has to be that way. So add and use a new variable
(guest_ll) for the guest scanline length. Also rename min_stride to
line_bytes to make more clear what it actually is. Finally sprinkle
in an assert() to make sure we never use a negative _cmp_bytes again.
Reported-by: 范祚至(库特) <zuozhi.fzz@alibaba-inc.com>
Reviewed-by: P J P <ppandit@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit eb8934b041)
[BR: BSC#1026612 CVE-2017-2633 (this fix fixes first fix)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
this patch makes the VNC server work correctly if the
server surface and the guest surface have different sizes.
Basically the server surface is adjusted to not exceed VNC_MAX_WIDTH
x VNC_MAX_HEIGHT and additionally the width is rounded up to multiple of
VNC_DIRTY_PIXELS_PER_BIT.
If we have a resolution whose width is not dividable by VNC_DIRTY_PIXELS_PER_BIT
we now get a small black bar on the right of the screen.
If the surface is too big to fit the limits only the upper left area is shown.
On top of that this fixes 2 memory corruption issues:
The first was actually discovered during playing
around with a Windows 7 vServer. During resolution
change in Windows 7 it happens sometimes that Windows
changes to an intermediate resolution where
server_stride % cmp_bytes != 0 (in vnc_refresh_server_surface).
This happens only if width % VNC_DIRTY_PIXELS_PER_BIT != 0.
The second is a theoretical issue, but is maybe exploitable
by the guest. If for some reason the guest surface size is bigger
than VNC_MAX_WIDTH x VNC_MAX_HEIGHT we end up in severe corruption since
this limit is nowhere enforced.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit bea60dd767)
[BR: BSC#1026612 CVE-2017-2633]
Signed-off-by: Bruce Rogers <brogers@suse.com>
this fixes invalid rectangle updates observed after commit 12b316d
with the vmware VGA driver. The issues occured because the server
and client surface update seems to be out of sync at some points
and the max width of the surface is not dividable by
VNC_DIRTY_BITS_PER_PIXEL (16).
Reported-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 2f487a3d40)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
vnc_update_client currently scans the dirty bitmap of each client
bitwise which is a very costly operation if only few bits are dirty.
vnc_refresh_server_surface does almost the same.
this patch optimizes both by utilizing the heavily optimized
function find_next_bit to find the offset of the next dirty
bit in the dirty bitmaps.
The following artifical test (just the bitmap operation part) running
vnc_update_client 65536 times on a 2560x2048 surface illustrates the
performance difference:
All bits clean - vnc_update_client_new: 0.07 secs
vnc_update_client_old: 10.98 secs
All bits dirty - vnc_update_client_new: 11.26 secs
vnc_update_client_old: 20.19 secs
Few bits dirty - vnc_update_client_new: 0.08 secs
vnc_update_client_old: 10.98 secs
The case for all bits dirty is still rather slow, this
is due to the implementation of find_and_clear_dirty_height.
This will be addresses in a separate patch.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 12b316d4c1)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
this allows for setting VNC_DIRTY_PIXELS_PER_BIT to different
values than 16 if desired.
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 6cd859aa8a)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Now that nobody depends on DisplayState in DisplayChangeListener
callbacks any more we can remove the parameter from all callbacks.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit bc2ed9704f)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Rework DisplayStateListener callbacks to not use the DisplayState
any more.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit )5e00d3ac475fb4c9afa17612a908e933fe142f00
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Rework DisplayStateListener callbacks to not use the DisplayState
any more.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 8db9bae94e)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Rework DisplayStateListener callbacks to not use the DisplayState
any more.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit d39fa6d86d)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Add convinence wrappers to query DisplaySurface properties.
Simliar to ds_get_*, but operating in the DisplaySurface
not the DisplayState.
With this patch in place ui frontents can stop using DisplayState
in the rendering code paths, they can simply operate using the
DisplaySurface passed in via dpy_gfx_switch callback.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 626e3b34e3)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Replace the dpy_gfx_resize and dpy_gfx_setdata DisplayChangeListener
callbacks with a dpy_gfx_switch callback which notifies the ui code
when the framebuffer backing storage changes.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit c12aeb860c)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
It's broken by design. There can be multiple DisplayChangeListener
instances, so they simply can't store state in the (single) DisplayState
struct. Try 'qemu -display gtk -vnc :0', watch it crash & burn.
With DisplayChangeListenerOps having a more sane interface now we can
simply use the DisplayChangeListener pointer to get access to our
private data instead.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 21ef45d712)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Split callbacks into separate Ops struct. Pass DisplayChangeListener
pointer as first argument to all callbacks. Uninline a bunch of
display functions and move them from console.h to console.c
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 7c20b4a374)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
9p back-end first queries the size of an extended attribute,
allocates space for it via g_malloc() and then retrieves its
value into allocated buffer. Race between querying attribute
size and retrieving its could lead to memory bytes disclosure.
Use g_malloc0() to avoid it.
Reported-by: Tuomas Tynkkynen <tuomas.tynkkynen@iki.fi>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 7bd9275630)
[BR: BSC#1062069 CVE-2017-15038]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Instead pass around the address (aka offset into vga memory).
Add vga_read_* helper functions which apply vbe_size_mask to
the address, to make sure the address stays within the valid
range, similar to the cirrus blitter fixes (commits ffaf857778
and 026aeffcb4).
Impact: DoS for privileged guest users. qemu crashes with
a segfault, when hitting the guard page after vga memory
allocation, while reading vga memory for display updates.
Fixes: CVE-2017-13672
Cc: P J P <ppandit@redhat.com>
Reported-by: David Buchanan <d@vidbuchanan.co.uk>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170828122906.18993-1-kraxel@redhat.com
(cherry picked from commit 3d90c62548)
[FL: BSC#1056334 CVE-2017-13672, add macro to fix multiple #include]
Signed-off-by: Fei Li <fli@suse.com>
When accessing guest's ram block during DMA operation, use
'qemu_ram_ptr_length' to get ram block pointer. It ensures
that DMA operation of given length is possible; And avoids
any OOB memory access situations.
Reported-by: Alex <broscutamaker@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20170712123840.29328-1-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 04bf2526ce)
[FL: BSC#1048902 CVE-2017-11334]
Signed-off-by: Fei Li <fli@suse.com>
While parsing dhcp options string in 'dhcp_decode', if an options'
length 'len' appeared towards the end of 'bp_vend' array, ensuing
read could lead to an OOB memory access issue. Add check to avoid it.
This is CVE-2017-11434.
Reported-by: Reno Robert <renorobert@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit 413d463f43)
[FL: BSC#1049381 CVE-2017-11434]
Signed-off-by: Fei Li <fli@suse.com>
While loading kernel via multiboot-v1 image, (flags & 0x00010000)
indicates that multiboot header contains valid addresses to load
the kernel image. These addresses are used to compute kernel
size and kernel text offset in the OS image. Validate these
address values to avoid an OOB access issue.
This is CVE-2017-14167.
Reported-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20170907063256.7418-1-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit ed4f86e8b6)
[FL: BSC#1057585 CVE-2017-14167]
Signed-off-by: Fei Li <fli@suse.com>
Don't reinvent a broken wheel, just use the hexdump function we have.
Impact: low, broken code doesn't run unless you have debug logging
enabled.
Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170509110128.27261-1-kraxel@redhat.com
(cherry picked from commit bd4a683505)
[BR: BSC#1047674 CVE-2017-10806]
Signed-off-by: Bruce Rogers <brogers@suse.com>
qemu proper has done so for 13 years
(8a7ddc38a6), qemu-img and qemu-io have
done so for four years (526eda14a6).
Ignoring this signal is especially important in qemu-nbd because
otherwise a client can easily take down the qemu-nbd server by dropping
the connection when the server wants to send something, for example:
$ qemu-nbd -x foo -f raw -t null-co:// &
[1] 12726
$ qemu-io -c quit nbd://localhost/bar
can't open device nbd://localhost/bar: No export with name 'bar' available
[1] + 12726 broken pipe qemu-nbd -x foo -f raw -t null-co://
In this case, the client sends an NBD_OPT_ABORT and closes the
connection (because it is not required to wait for a reply), but the
server replies with an NBD_REP_ACK (because it is required to reply).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20170611123714.31292-1-mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 041e32b8d9)
[BR: BSC#1046636 CVE-2017-10664]
Signed-off-by: Bruce Rogers <brogers@suse.com>
This ensures that the request is unref'ed properly, and avoids a
segmentation fault in the new qtest testcase that is added.
Reported-by: Zhangyanyu <zyy4013@stu.ouc.edu.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[BR: BSC#1043296 CVE-2017-9503, dropped testcase from patch]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Avoid TOC-TOU bugs by storing the DCMD opcode in the MegasasCmd
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[BR: BSC#1043296 CVE-2017-9503]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When trying to remove a file from a directory, both created in non-mapped
mode, the file remains and EBADF is returned to the guest.
This is a regression introduced by commit "df4938a6651b 9pfs: local:
unlinkat: don't follow symlinks" when fixing CVE-2016-9602. It changed the
way we unlink the metadata file from
ret = remove("$dir/.virtfs_metadata/$name");
if (ret < 0 && errno != ENOENT) {
/* Error out */
}
/* Ignore absence of metadata */
to
fd = openat("$dir/.virtfs_metadata")
unlinkat(fd, "$name")
if (ret < 0 && errno != ENOENT) {
/* Error out */
}
/* Ignore absence of metadata */
If $dir was created in non-mapped mode, openat() fails with ENOENT and
we pass -1 to unlinkat(), which fails in turn with EBADF.
We just need to check the return of openat() and ignore ENOENT, in order
to restore the behaviour we had with remove().
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
[groug: rewrote the comments as suggested by Eric]
(cherry picked from commit 6a87e7929f)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When using the mapped-file security mode, we shouldn't let the client mess
with the metadata. The current code already tries to hide the metadata dir
from the client by skipping it in local_readdir(). But the client can still
access or modify it through several other operations. This can be used to
escalate privileges in the guest.
Affected backend operations are:
- local_mknod()
- local_mkdir()
- local_open2()
- local_symlink()
- local_link()
- local_unlinkat()
- local_renameat()
- local_rename()
- local_name_to_path()
Other operations are safe because they are only passed a fid path, which
is computed internally in local_name_to_path().
This patch converts all the functions listed above to fail and return
EINVAL when being passed the name of the metadata dir. This may look
like a poor choice for errno, but there's no such thing as an illegal
path name on Linux and I could not think of anything better.
This fixes CVE-2017-7493.
Reported-by: Leo Gaspard <leo@gaspard.io>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 7a95434e0c)
[BR: BSC#1039495]
Signed-off-by: Bruce Rogers <brogers@suse.com>
It should return 1 if an error occurs when reading iso td.
This will avoid an infinite loop issue in ohci_service_ed_list.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 5899ac3e.1033240a.944d5.9a2d@mx.google.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 26f670a244)
[BR: BSC#1042159 CVE-2017-9330, also includes fix from upstream
commit cf66ee8e where the test for failure of ohci_read_iso_td()
was wrong]
Signed-off-by: Bruce Rogers <brogers@suse.com>
As the pci ahci can be hotplug and unplug, in the ahci unrealize
function it should free all the resource once allocated in the
realized function. This patch add ide_exit to free the resource.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 1488449293-80280-3-git-send-email-liqiang6-s@360.cn
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit c9f086418a)
[BSC#1042801 CVE-2017-9373]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Does basically the same as "cirrus: stop passing around dst pointers in
the blitter", just for the src pointer instead of the dst pointer.
For the src we have to care about cputovideo blits though and fetch the
data from s->cirrus_bltbuf instead of vga memory. The cirrus_src*()
helper functions handle that.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1489584487-3489-1-git-send-email-kraxel@redhat.com
(cherry picked from commit ffaf857778)
[BR: BSC#1035406 CVE-2017-7980]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Instead pass around the address (aka offset into vga memory). Calculate
the pointer in the rop_* functions, after applying the mask to the
address, to make sure the address stays within the valid range.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1489574872-8679-1-git-send-email-kraxel@redhat.com
(cherry picked from commit 026aeffcb4)
[BR: BSC#1035406 CVE-2017-7980]
Signed-off-by: Bruce Rogers <brogers@suse.com>
check the validity of parameters in cirrus_bitblt_rop_fwd_transp_xxx
and cirrus_bitblt_rop_fwd_xxx to avoid the OOB read which causes qemu Segmentation fault.
After the fix, we will touch the assert in
cirrus_invalidate_region:
assert(off_cur_end >= off_cur);
Signed-off-by: fangying <fangying1@huawei.com>
Signed-off-by: hangaohuai <hangaohuai@huawei.com>
Message-id: 20170314063919.16200-1-hangaohuai@huawei.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 215902d7b6)
[BR: BSC#1034908 CVE-2017-7718]
Signed-off-by: Bruce Rogers <brogers@suse.com>
There is a special code path (dpy_gfx_copy) to allow graphic emulation
notify user interface code about bitblit operations carryed out by
guests. It is supported by cirrus and vnc server. The intended purpose
is to optimize display scrolls and just send over the scroll op instead
of a full display update.
This is rarely used these days though because modern guests simply don't
use the cirrus blitter any more. Any linux guest using the cirrus drm
driver doesn't. Any windows guest newer than winxp doesn't ship with a
cirrus driver any more and thus uses the cirrus as simple framebuffer.
So this code tends to bitrot and bugs can go unnoticed for a long time.
See for example commit "3e10c3e vnc: fix qemu crash because of SIGSEGV"
which fixes a bug lingering in the code for almost a year, added by
commit "c7628bf vnc: only alloc server surface with clients connected".
Also the vnc server will throttle the frame rate in case it figures the
network can't keep up (send buffers are full). This doesn't work with
dpy_gfx_copy, for any copy operation sent to the vnc client we have to
send all outstanding updates beforehand, otherwise the vnc client might
run the client side blit on outdated data and thereby corrupt the
display. So this dpy_gfx_copy "optimization" might even make things
worse on slow network links.
Lets kill it once for all.
Oh, and one more reason: Turns out (after writing the patch) we have a
security bug in that code path ...
Fixes: CVE-2016-9603
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1489494419-14340-1-git-send-email-kraxel@redhat.com
(cherry picked from commit 50628d3479)
[BR: BSC#1028656]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Limits should be big enough that normal guest should not hit it.
Add a tracepoint to log them, just in case. Also, while being
at it, log the existing link trb limit too.
Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1486383669-6421-1-git-send-email-kraxel@redhat.com
(cherry picked from commit f89b60f6e5)
[BR: BSC#1025109 CVE-2017-5973]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
hw/usb/hcd-xhci.c
hw/usb/trace-events
While the machine is paused, in guest_phys_blocks_append() we register a
one-shot MemoryListener, solely for the initial collection of the valid
guest-physical memory ranges that happens at listener registration time.
For each range that is reported to guest_phys_blocks_region_add(), we
attempt to merge the range with the preceding one.
Ranges can only be joined if they are contiguous in both guest-physical
address space, and contiguous in host virtual address space.
The "maximal" ranges that remain in the end constitute the guest-physical
memory map that the dump will be based on.
Related RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=981582
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit c5d7f60f06)
[BR: BSC#1049785]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The vmcore must use physical addresses that are visible to the guest, not
addresses that point into linear RAMBlocks. As first step, introduce the
list type into which we'll collect the physical mappings in effect at the
time of the dump.
Related RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=981582
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit 5ee163e8ea)
[BR: BSC#1049785]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Even a trusted & clean-state guest can map more memory than what it was
given. Since the vmcore contains RAMBlocks, mapping sizes should be
clamped to RAMBlock sizes. Otherwise such oversized mappings can exceed
the entire file size, and ELF parsers might refuse even the valid portion
of the PT_LOAD entry.
Related RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=981582
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit 2cac260768)
[BR: BSC#1049785]
Signed-off-by: Bruce Rogers <brogers@suse.com>
If we want to support targets that can change endianness (modern PPC and
ARM for the moment), we need to add a per-CPU class method to be called
from the virtio code. The virtio_ prefix in the name is a hint for people
to avoid misusage (aka. anywhere but from the virtio code).
The default behaviour is to return the compile-time default target
endianness.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit bf7663c4bd)
[LM: BSC#1038396]
Signed-off-by: Lin Ma <lma@suse.com>
There is a need to add some more fields to VirtIODevice that should be
migrated (broken status, endianness). The problem is that we do not
want to break compatibility while adding a new feature... This issue has
been addressed in the generic VMState code with the use of optional
subsections. As a *temporary* alternative to port the whole virtio
migration code to VMState, this patch mimics a similar subsectionning
ability for virtio, using the VMState code.
Since each virtio device is streamed in its own section, the idea is to
stream subsections between the end of the device section and the start
of the next sections. This allows an older QEMU to complain and exit
when fed with subsections:
Unknown savevm section type 5
load of migration failed
Suggested-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 6b321a3df5)
[LM: BSC#1038396]
Signed-off-by: Lin Ma <lma@suse.com>
Move it to qom/cpu.h.
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 4917cf4432)
[LM: BSC#1038396]
Signed-off-by: Lin Ma <lma@suse.com>
This will avoid issues with hwaddr and ram_addr_t when including
sysemu/memory_mapping.h for CONFIG_USER_ONLY, e.g., from qom/cpu.h.
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 6d4d3ae77d)
[LM: BSC#1038396]
Signed-off-by: Lin Ma <lma@suse.com>
Commit a0e640a8 introduced a path processing error.
Pass fstatat the dirpath based path component instead
of the entire path.
[BR: BSC#1045035]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local backend was recently converted to using "at*()" syscalls in order
to ensure all accesses happen below the shared directory. This requires that
we only pass relative paths, otherwise the dirfd argument to the "at*()"
syscalls is ignored and the path is treated as an absolute path in the host.
This is actually the case for paths in all fids, with the notable exception
of the root fid, whose path is "/". This causes the following backend ops to
act on the "/" directory of the host instead of the virtfs shared directory
when the export root is involved:
- lstat
- chmod
- chown
- utimensat
ie, chmod /9p_mount_point in the guest will be converted to chmod / in the
host for example. This could cause security issues with a privileged QEMU.
All "*at()" syscalls are being passed an open file descriptor. In the case
of the export root, this file descriptor points to the path in the host that
was passed to -fsdev.
The fix is thus as simple as changing the path of the export root fid to be
"." instead of "/".
This is CVE-2017-7471.
Cc: qemu-stable@nongnu.org
Reported-by: Léo Gaspard <leo@gaspard.io>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 9c6b899f7a)
[BR: BSC#1034866 CVE-2017-7471]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Free 'orig_value' in error path.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 4ffcdef427)
[BR: BSC#1035950 CVE-2017-8086]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The v9fs_create() and v9fs_lcreate() functions are used to create a file
on the backend and to associate it to a fid. The fid shouldn't be already
in-use, otherwise both functions may silently leak a file descriptor or
allocated memory. The current code doesn't check that.
This patch ensures that the fid isn't already associated to anything
before using it.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
(reworded the changelog, Greg Kurz)
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit d63fb193e7)
[BR: BSC#1032075 CVE-2017-7377]
Signed-off-by: Bruce Rogers <brogers@suse.com>
According to the 9P spec [*], when a client wants to cancel a pending I/O
request identified by a given tag (uint16), it must send a Tflush message
and wait for the server to respond with a Rflush message before reusing this
tag for another I/O. The server may still send a completion message for the
I/O if it wasn't actually cancelled but the Rflush message must arrive after
that.
QEMU hence waits for the flushed PDU to complete before sending the Rflush
message back to the client.
If a client sends 'Tflush tag oldtag' and tag == oldtag, QEMU will then
allocate a PDU identified by tag, find it in the PDU list and wait for
this same PDU to complete... i.e. wait for a completion that will never
happen. This causes a tag and ring slot leak in the guest, and a PDU
leak in QEMU, all of them limited by the maximal number of PDUs (128).
But, worse, this causes QEMU to hang on device reset since v9fs_reset()
wants to drain all pending I/O.
This insane behavior is likely to denote a bug in the client, and it would
deserve an Rerror message to be sent back. Unfortunately, the protocol
allows it and requires all flush requests to suceed (only a Tflush response
is expected).
The only option is to detect when we have to handle a self-referencing
flush request and report success to the client right away.
[*] http://man.cat-v.org/plan_9/5/flush
Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit d5f2af7b95)
Signed-off-by: Bruce Rogers <brogers@suse.com>
We should pass O_NOFOLLOW otherwise openat() will follow symlinks and make
QEMU vulnerable.
While here, we also fix local_unlinkat_common() to use openat_dir() for
the same reasons (it was a leftover in the original patchset actually).
This fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit b003fc0d8a)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When O_PATH is used with O_DIRECTORY, it only acts as an optimization: the
openat() syscall simply finds the name in the VFS, and doesn't trigger the
underlying filesystem.
On systems that don't define O_PATH, because they have glibc version 2.13
or older for example, we can safely omit it. We don't want to deactivate
O_PATH globally though, in case it is used without O_DIRECTORY. The is done
with a dedicated macro.
Systems without O_PATH may thus fail to resolve names that involve
unreadable directories, compared to newer systems succeeding, but such
corner case failure is our only option on those older systems to avoid
the security hole of chasing symlinks inappropriately.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
(added last paragraph to changelog as suggested by Eric Blake)
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 918112c02a)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The name argument can never be an empty string, and dirfd always point to
the containing directory of the file name. AT_EMPTY_PATH is hence useless
here. Also it breaks build with glibc version 2.13 and older.
It is actually an oversight of a previous tentative patch to implement this
function. We can safely drop it.
Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Greg Kurz <groug@kaod.org>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit b314f6a077)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
If we cannot open the given path, we can return right away instead of
passing -1 to fstatfs() and close(). This will make Coverity happy.
(Coverity issue CID1371729)
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Daniel P. berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
(cherry picked from commit 23da0145cc)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Coverity issue CID1371731
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
(cherry picked from commit faab207f11)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
This was spotted by Coverity as a fd leak. This is certainly true, but also
local_remove() would always return without doing anything, unless the fd is
zero, which is very unlikely.
(Coverity issue CID1371732)
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit b7361d46e7)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Now that the all callbacks have been converted to use "at" syscalls, we
can drop this code.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit c23d5f1d5b)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_open2() callback is vulnerable to symlink attacks because it
calls:
(1) open() which follows symbolic links for all path elements but the
rightmost one
(2) local_set_xattr()->setxattr() which follows symbolic links for all
path elements
(3) local_set_mapped_file_attr() which calls in turn local_fopen() and
mkdir(), both functions following symbolic links for all path
elements but the rightmost one
(4) local_post_create_passthrough() which calls in turn lchown() and
chmod(), both functions also following symbolic links
This patch converts local_open2() to rely on opendir_nofollow() and
mkdirat() to fix (1), as well as local_set_xattrat(),
local_set_mapped_file_attrat() and local_set_cred_passthrough() to
fix (2), (3) and (4) respectively. Since local_open2() already opens
a descriptor to the target file, local_set_cred_passthrough() is
modified to reuse it instead of opening a new one.
The mapped and mapped-file security modes are supposed to be identical,
except for the place where credentials and file modes are stored. While
here, we also make that explicit by sharing the call to openat().
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit a565fea565)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_mkdir() callback is vulnerable to symlink attacks because it
calls:
(1) mkdir() which follows symbolic links for all path elements but the
rightmost one
(2) local_set_xattr()->setxattr() which follows symbolic links for all
path elements
(3) local_set_mapped_file_attr() which calls in turn local_fopen() and
mkdir(), both functions following symbolic links for all path
elements but the rightmost one
(4) local_post_create_passthrough() which calls in turn lchown() and
chmod(), both functions also following symbolic links
This patch converts local_mkdir() to rely on opendir_nofollow() and
mkdirat() to fix (1), as well as local_set_xattrat(),
local_set_mapped_file_attrat() and local_set_cred_passthrough() to
fix (2), (3) and (4) respectively.
The mapped and mapped-file security modes are supposed to be identical,
except for the place where credentials and file modes are stored. While
here, we also make that explicit by sharing the call to mkdirat().
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 3f3a16990b)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_mknod() callback is vulnerable to symlink attacks because it
calls:
(1) mknod() which follows symbolic links for all path elements but the
rightmost one
(2) local_set_xattr()->setxattr() which follows symbolic links for all
path elements
(3) local_set_mapped_file_attr() which calls in turn local_fopen() and
mkdir(), both functions following symbolic links for all path
elements but the rightmost one
(4) local_post_create_passthrough() which calls in turn lchown() and
chmod(), both functions also following symbolic links
This patch converts local_mknod() to rely on opendir_nofollow() and
mknodat() to fix (1), as well as local_set_xattrat() and
local_set_mapped_file_attrat() to fix (2) and (3) respectively.
A new local_set_cred_passthrough() helper based on fchownat() and
fchmodat_nofollow() is introduced as a replacement to
local_post_create_passthrough() to fix (4).
The mapped and mapped-file security modes are supposed to be identical,
except for the place where credentials and file modes are stored. While
here, we also make that explicit by sharing the call to mknodat().
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit d815e72190)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_symlink() callback is vulnerable to symlink attacks because it
calls:
(1) symlink() which follows symbolic links for all path elements but the
rightmost one
(2) open(O_NOFOLLOW) which follows symbolic links for all path elements but
the rightmost one
(3) local_set_xattr()->setxattr() which follows symbolic links for all
path elements
(4) local_set_mapped_file_attr() which calls in turn local_fopen() and
mkdir(), both functions following symbolic links for all path
elements but the rightmost one
This patch converts local_symlink() to rely on opendir_nofollow() and
symlinkat() to fix (1), openat(O_NOFOLLOW) to fix (2), as well as
local_set_xattrat() and local_set_mapped_file_attrat() to fix (3) and
(4) respectively.
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 38771613ea)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_chown() callback is vulnerable to symlink attacks because it
calls:
(1) lchown() which follows symbolic links for all path elements but the
rightmost one
(2) local_set_xattr()->setxattr() which follows symbolic links for all
path elements
(3) local_set_mapped_file_attr() which calls in turn local_fopen() and
mkdir(), both functions following symbolic links for all path
elements but the rightmost one
This patch converts local_chown() to rely on open_nofollow() and
fchownat() to fix (1), as well as local_set_xattrat() and
local_set_mapped_file_attrat() to fix (2) and (3) respectively.
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit d369f20763)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_chmod() callback is vulnerable to symlink attacks because it
calls:
(1) chmod() which follows symbolic links for all path elements
(2) local_set_xattr()->setxattr() which follows symbolic links for all
path elements
(3) local_set_mapped_file_attr() which calls in turn local_fopen() and
mkdir(), both functions following symbolic links for all path
elements but the rightmost one
We would need fchmodat() to implement AT_SYMLINK_NOFOLLOW to fix (1). This
isn't the case on linux unfortunately: the kernel doesn't even have a flags
argument to the syscall :-\ It is impossible to fix it in userspace in
a race-free manner. This patch hence converts local_chmod() to rely on
open_nofollow() and fchmod(). This fixes the vulnerability but introduces
a limitation: the target file must readable and/or writable for the call
to openat() to succeed.
It introduces a local_set_xattrat() replacement to local_set_xattr()
based on fsetxattrat() to fix (2), and a local_set_mapped_file_attrat()
replacement to local_set_mapped_file_attr() based on local_fopenat()
and mkdirat() to fix (3). No effort is made to factor out code because
both local_set_xattr() and local_set_mapped_file_attr() will be dropped
when all users have been converted to use the "at" versions.
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e3187a45dd)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_link() callback is vulnerable to symlink attacks because it calls:
(1) link() which follows symbolic links for all path elements but the
rightmost one
(2) local_create_mapped_attr_dir()->mkdir() which follows symbolic links
for all path elements but the rightmost one
This patch converts local_link() to rely on opendir_nofollow() and linkat()
to fix (1), mkdirat() to fix (2).
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ad0b46e6ac)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When using the mapped-file security model, we also have to create a link
for the metadata file if it exists. In case of failure, we should rollback.
That's what this patch does.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6dd4b1f1d0)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_rename() callback is vulnerable to symlink attacks because it
uses rename() which follows symbolic links in all path elements but the
rightmost one.
This patch simply transforms local_rename() into a wrapper around
local_renameat() which is symlink-attack safe.
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit d2767edec5)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_renameat() callback is currently a wrapper around local_rename()
which is vulnerable to symlink attacks.
This patch rewrites local_renameat() to have its own implementation, based
on local_opendir_nofollow() and renameat().
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 99f2cf4b2d)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_lstat() callback is vulnerable to symlink attacks because it
calls:
(1) lstat() which follows symbolic links in all path elements but the
rightmost one
(2) getxattr() which follows symbolic links in all path elements
(3) local_mapped_file_attr()->local_fopen()->openat(O_NOFOLLOW) which
follows symbolic links in all path elements but the rightmost
one
This patch converts local_lstat() to rely on opendir_nofollow() and
fstatat(AT_SYMLINK_NOFOLLOW) to fix (1), fgetxattrat_nofollow() to
fix (2).
A new local_fopenat() helper is introduced as a replacement to
local_fopen() to fix (3). No effort is made to factor out code
because local_fopen() will be dropped when all users have been
converted to call local_fopenat().
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f9aef99b3e)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602. Patch
modified to also fix an existing error path to do proper cleanup.]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_readlink() callback is vulnerable to symlink attacks because it
calls:
(1) open(O_NOFOLLOW) which follows symbolic links for all path elements but
the rightmost one
(2) readlink() which follows symbolic links for all path elements but the
rightmost one
This patch converts local_readlink() to rely on open_nofollow() to fix (1)
and opendir_nofollow(), readlinkat() to fix (2).
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit bec1e9546e)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_truncate() callback is vulnerable to symlink attacks because
it calls truncate() which follows symbolic links in all path elements.
This patch converts local_truncate() to rely on open_nofollow() and
ftruncate() instead.
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ac125d993b)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_statfs() callback is vulnerable to symlink attacks because it
calls statfs() which follows symbolic links in all path elements.
This patch converts local_statfs() to rely on open_nofollow() and fstatfs()
instead.
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 31e51d1c15)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_utimensat() callback is vulnerable to symlink attacks because it
calls qemu_utimens()->utimensat(AT_SYMLINK_NOFOLLOW) which follows symbolic
links in all path elements but the rightmost one or qemu_utimens()->utimes()
which follows symbolic links for all path elements.
This patch converts local_utimensat() to rely on opendir_nofollow() and
utimensat(AT_SYMLINK_NOFOLLOW) directly instead of using qemu_utimens().
It is hence assumed that the OS supports utimensat(), i.e. has glibc 2.6
or higher and linux 2.6.22 or higher, which seems reasonable nowadays.
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit a33eda0dd9)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_remove() callback is vulnerable to symlink attacks because it
calls:
(1) lstat() which follows symbolic links in all path elements but the
rightmost one
(2) remove() which follows symbolic links in all path elements but the
rightmost one
This patch converts local_remove() to rely on opendir_nofollow(),
fstatat(AT_SYMLINK_NOFOLLOW) to fix (1) and unlinkat() to fix (2).
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit a0e640a872)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_unlinkat() callback is vulnerable to symlink attacks because it
calls remove() which follows symbolic links in all path elements but the
rightmost one.
This patch converts local_unlinkat() to rely on opendir_nofollow() and
unlinkat() instead.
Most of the code is moved to a separate local_unlinkat_common() helper
which will be reused in a subsequent patch to fix the same issue in
local_remove().
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit df4938a665)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_lremovexattr() callback is vulnerable to symlink attacks because
it calls lremovexattr() which follows symbolic links in all path elements
but the rightmost one.
This patch introduces a helper to emulate the non-existing fremovexattrat()
function: it is implemented with /proc/self/fd which provides a trusted
path that can be safely passed to lremovexattr().
local_lremovexattr() is converted to use this helper and opendir_nofollow().
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 72f0d0bf51)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_lsetxattr() callback is vulnerable to symlink attacks because
it calls lsetxattr() which follows symbolic links in all path elements but
the rightmost one.
This patch introduces a helper to emulate the non-existing fsetxattrat()
function: it is implemented with /proc/self/fd which provides a trusted
path that can be safely passed to lsetxattr().
local_lsetxattr() is converted to use this helper and opendir_nofollow().
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 3e36aba757)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_llistxattr() callback is vulnerable to symlink attacks because
it calls llistxattr() which follows symbolic links in all path elements but
the rightmost one.
This patch introduces a helper to emulate the non-existing flistxattrat()
function: it is implemented with /proc/self/fd which provides a trusted
path that can be safely passed to llistxattr().
local_llistxattr() is converted to use this helper and opendir_nofollow().
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 5507904e36)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_lgetxattr() callback is vulnerable to symlink attacks because
it calls lgetxattr() which follows symbolic links in all path elements but
the rightmost one.
This patch introduces a helper to emulate the non-existing fgetxattrat()
function: it is implemented with /proc/self/fd which provides a trusted
path that can be safely passed to lgetxattr().
local_lgetxattr() is converted to use this helper and opendir_nofollow().
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 56ad3e54da)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The local_open() and local_opendir() callbacks are vulnerable to symlink
attacks because they call:
(1) open(O_NOFOLLOW) which follows symbolic links in all path elements but
the rightmost one
(2) opendir() which follows symbolic links in all path elements
This patch converts both callbacks to use new helpers based on
openat_nofollow() to only open files and directories if they are
below the virtfs shared folder
This partly fixes CVE-2016-9602.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 996a0d76d7)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
This patch opens the shared folder and caches the file descriptor, so that
it can be used to do symlink-safe path walk.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 0e35a37829)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When using the passthrough security mode, symbolic links created by the
guest are actual symbolic links on the host file system.
Since the resolution of symbolic links during path walk is supposed to
occur on the client side. The server should hence never receive any path
pointing to an actual symbolic link. This isn't guaranteed by the protocol
though, and malicious code in the guest can trick the server to issue
various syscalls on paths whose one or more elements are symbolic links.
In the case of the "local" backend using the "passthrough" or "none"
security modes, the guest can directly create symbolic links to arbitrary
locations on the host (as per spec). The "mapped-xattr" and "mapped-file"
security modes are also affected to a lesser extent as they require some
help from an external entity to create actual symbolic links on the host,
i.e. another guest using "passthrough" mode for example.
The current code hence relies on O_NOFOLLOW and "l*()" variants of system
calls. Unfortunately, this only applies to the rightmost path component.
A guest could maliciously replace any component in a trusted path with a
symbolic link. This could allow any guest to escape a virtfs shared folder.
This patch introduces a variant of the openat() syscall that successively
opens each path element with O_NOFOLLOW. When passing a file descriptor
pointing to a trusted directory, one is guaranteed to be returned a
file descriptor pointing to a path which is beneath the trusted directory.
This will be used by subsequent patches to implement symlink-safe path walk
for any access to the backend.
Symbolic links aren't the only threats actually: a malicious guest could
change a path element to point to other types of file with undesirable
effects:
- a named pipe or any other thing that would cause openat() to block
- a terminal device which would become QEMU's controlling terminal
These issues can be addressed with O_NONBLOCK and O_NOCTTY.
Two helpers are introduced: one to open intermediate path elements and one
to open the rightmost path element.
Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(renamed openat_nofollow() to relative_openat_nofollow(),
assert path is relative and doesn't contain '//',
fixed side-effect in assert, Greg Kurz)
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 6482a96163)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
If these functions fail, they should not change *fs. Let's use local
variables to fix this.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 21328e1e57)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
If this function fails, it should not modify *ctx.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 00c90bd1c2)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
These functions are always called indirectly. It really doesn't make sense
for them to sit in a header file.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 56fc494bdc)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The server can handle MAX_REQ - 1 PDUs at a time and the virtio-9p
device has a MAX_REQ sized virtqueue. If the client manages to fill
up the virtqueue, pdu_alloc() will fail and the request won't be
processed without any notice to the client (it actually causes the
linux 9p client to hang).
This has been there since the beginning (commit 9f10751365 "virtio-9p:
Add a virtio 9p device to qemu"), but it needs an agressive workload to
run in the guest to show up.
We actually allocate MAX_REQ PDUs and I see no reason not to link them
all into the free list, so let's fix the init loop.
Reported-by: Tuomas Tynkkynen <tuomas@tuxera.com>
Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 0d78289c3d)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Correct recalculation of vq->inuse after migration for the corner case
where the avail_idx has already wrapped but used_idx not yet.
Also change the type of the VirtQueue.inuse to unsigned int. This is
done to be consistent with other members representing sizes (VRing.num),
and because C99 guarantees max ring size < UINT_MAX but does not
guarantee max ring size < INT_MAX.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Fixes: bccdef6b ("virtio: recalculate vq->inuse after migration")
CC: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e66bcc4081)
[BR: BSC#1020928]
Signed-off-by: Bruce Rogers <brogers@suse.com>
If the user passes -device virtio-9p without the corresponding -fsdev, QEMU
dereferences a NULL pointer and crashes.
This is a 2.8 regression introduced by commit 702dbcc274.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
(cherry picked from commit f2b58c4375)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
pdus are initialized and used in 9pfs common code. Move the array from
V9fsVirtioState to V9fsState.
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 583f21f8b9)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Currently, the backend of VirtFS doesn't have a cleanup
function. This will lead resource leak issues if the backed
driver allocates resources. This patch addresses this issue.
Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 702dbcc274)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The virtfs_reset() function is called either when the virtio-9p device
gets reset, or when the client starts a new 9P session. In both cases,
if it finds fids from a previous session, the following is printed in
the monitor:
9pfs:virtfs_reset: One or more uncluncked fids found during reset
For example, if a linux guest with a mounted 9P share is reset from the
monitor with system_reset, the message will be printed. This is excessive
since these fids are now clunked and the state is clean.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 79decce35b)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The v9fs_xattr_read() and v9fs_xattr_write() are passed a guest
originated offset: they must ensure this offset does not go beyond
the size of the extended attribute that was set in v9fs_xattrcreate().
Unfortunately, the current code implement these checks with unsafe
calculations on 32 and 64 bit values, which may allow a malicious
guest to cause OOB access anyway.
Fix this by comparing the offset and the xattr size, which are
both uint64_t, before trying to compute the effective number of bytes
to read or write.
Suggested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-By: Guido Günther <agx@sigxcpu.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 7e55d65c56)
[BR: CVE-2016-9104 BSC#1007493]
Signed-off-by: Bruce Rogers <brogers@suse.com>
If an error occurs when marshalling the transfer length to the guest, the
v9fs_write() function doesn't free an IO vector, thus leading to a memory
leak. This patch fixes the issue.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
[groug, rephrased the changelog]
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit fdfcc9aeea)
[BR: CVE-2016-9106 BSC#1007495]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The v9fs_link() function keeps a reference on the source fid object. This
causes a memory leak since the reference never goes down to 0. This patch
fixes the issue.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
[groug, rephrased the changelog]
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 4c1586787f)
[BR: CVE-2016-9105 BSC#1007494]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The 'fs.xattr.value' field in V9fsFidState object doesn't consider the
situation that this field has been allocated previously. Every time, it
will be allocated directly. This leads to a host memory leak issue if
the client sends another Txattrcreate message with the same fid number
before the fid from the previous time got clunked.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
[groug, updated the changelog to indicate how the leak can occur]
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit ff55e94d23)
[BR: CVE-2016-9102 BSC#1007450]
Signed-off-by: Bruce Rogers <brogers@suse.com>
9pfs uses g_malloc() to allocate the xattr memory space, if the guest
reads this memory before writing to it, this will leak host heap memory
to the guest. This patch avoid this.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit eb68760285)
[BR: CVE-2016-9103 BSC#1007454]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Virtio devices should implement the VirtIODevice->reset() function to
perform necessary cleanup actions and to bring the device to a quiescent
state.
In the case of the virtio-9p device, this means:
- emptying the list of active PDUs (i.e. draining all in-flight I/O)
- freeing all fids (i.e. close open file descriptors and free memory)
That's what this patch does.
The reset handler first waits for all active PDUs to complete. Since
completion happens in the QEMU global aio context, we just have to
loop around aio_poll() until the active list is empty.
The freeing part involves some actions to be performed on the backend,
like closing file descriptors or flushing extended attributes to the
underlying filesystem. The virtfs_reset() function already does the
job: it calls free_fid() for all open fids not involved in an ongoing
I/O operation. We are sure this is the case since we have drained
the PDU active list.
The current code implements all backend accesses with coroutines, but we
want to stay synchronous on the reset path. We can either change the
current code to be able to run when not in coroutine context, or create
a coroutine context and wait for virtfs_reset() to complete. This patch
goes for the latter because it results in simpler code.
Note that we also need to create a dummy PDU because it is also an API
to pass the FsContext pointer to all backend callbacks.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 0e44a0fd3f)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
If a PDU has a flush request pending, the current code calls pdu_free()
twice:
1) pdu_complete()->pdu_free() with pdu->cancelled set, which does nothing
2) v9fs_flush()->pdu_free() with pdu->cancelled cleared, which moves the
PDU back to the free list.
This works but it complexifies the logic of pdu_free().
With this patch, pdu_complete() only calls pdu_free() if no flush request
is pending, i.e. qemu_co_queue_next() returns false.
Since pdu_free() is now supposed to be called with pdu->cancelled cleared,
the check in pdu_free() is dropped and replaced by an assertion.
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit f74e27bf0f)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Out of the three users of pdu_free(), none ever passes a NULL pointer to
this function.
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 6868a420c5)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
In 9pfs read dispatch function, it doesn't free two QEMUIOVector
object thus causing potential memory leak. This patch avoid this.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit e95c9a493a)
[BR: CVE-2016-8577 BSC#1003893]
Signed-off-by: Bruce Rogers <brogers@suse.com>
If a guest sends an empty string paramater to any 9P operation, the current
code unmarshals it into a V9fsString equal to { .size = 0, .data = NULL }.
This is unfortunate because it can cause NULL pointer dereference to happen
at various locations in the 9pfs code. And we don't want to check str->data
everywhere we pass it to strcmp() or any other function which expects a
dereferenceable pointer.
This patch enforces the allocation of genuine C empty strings instead, so
callers don't have to bother.
Out of all v9fs_iov_vunmarshal() users, only v9fs_xattrwalk() checks if
the returned string is empty. It now uses v9fs_string_size() since
name.data cannot be NULL anymore.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
[groug, rewritten title and changelog,
fix empty string check in v9fs_xattrwalk()]
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit ba42ebb863)
[BR: CVE-2016-8578 BSC#1003894]
Signed-off-by: Bruce Rogers <brogers@suse.com>
If the call to fid_to_qid() returns an error, we will call v9fs_path_free()
on uninitialized paths.
It is a regression introduced by the following commit:
56f101ecce 9pfs: handle walk of ".." in the root directory
Let's fix this by initializing dpath and path before calling fid_to_qid().
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[groug: updated the changelog to indicate this is regression and to provide
the offending commit SHA1]
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 13fd08e631)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
This helper is similar to v9fs_string_sprintf(), but it includes the
terminating NUL character in the size field.
This is to avoid doing v9fs_string_sprintf((V9fsString *) &path) and
then bumping the size.
Affected users are changed to use this new helper.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit e3e83f2e21)
[BR: support patch for BSC#1034866]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The 9P spec at http://man.cat-v.org/plan_9/5/intro says:
All directories must support walks to the directory .. (dot-dot) meaning
parent directory, although by convention directories contain no explicit
entry for .. or . (dot). The parent of the root directory of a server's
tree is itself.
This means that a client cannot walk further than the root directory
exported by the server. In other words, if the client wants to walk
"/.." or "/foo/../..", the server should answer like the request was
to walk "/".
This patch just does that:
- we cache the QID of the root directory at attach time
- during the walk we compare the QID of each path component with the root
QID to detect if we're in a "/.." situation
- if so, we skip the current component and go to the next one
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 56f101ecce)
[BR: CVE-2016-7116 BSC#996441]
Signed-off-by: Bruce Rogers <brogers@suse.com>
vq->inuse must be zeroed upon device reset like most other virtqueue
fields.
In theory, virtio_reset() just needs assert(vq->inuse == 0) since
devices must clean up in-flight requests during reset (requests cannot
not be leaked!).
In practice, it is difficult to achieve vq->inuse == 0 across reset
because balloon, blk, 9p, etc implement various different strategies for
cleaning up requests. Most devices call g_free(elem) directly without
telling virtio.c that the VirtQueueElement is cleaned up. Therefore
vq->inuse is not decremented during reset.
This patch zeroes vq->inuse and trusts that devices are not leaking
VirtQueueElements across reset.
I will send a follow-up series that refactors request life-cycle across
all devices and converts vq->inuse = 0 into assert(vq->inuse == 0) but
this more invasive approach is not appropriate for stable trees.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Ladi Prosek <lprosek@redhat.com>
(cherry picked from commit 4b7f91ed02)
[BR: BSC#1015048]
Signed-off-by: Bruce Rogers <brogers@suse.com>
virtqueue_discard() moves vq->last_avail_idx back so the element can be
popped again. It's necessary to decrement vq->inuse to avoid "leaking"
the element count.
Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 58a83c6149)
[BR: BSC#1015048]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The vq->inuse field is not migrated. Many devices don't hold
VirtQueueElements across migration so it doesn't matter that vq->inuse
starts at 0 on the destination QEMU.
At least virtio-serial, virtio-blk, and virtio-balloon migrate while
holding VirtQueueElements. For these devices we need to recalculate
vq->inuse upon load so the value is correct.
Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit bccdef6b1a)
[BR: BSC#1015048]
Signed-off-by: Bruce Rogers <brogers@suse.com>
If we are to switch back to readdir(), we need a more complex type than
DIR * to be able to serialize concurrent accesses to the directory stream.
This patch introduces a placeholder type and fixes all users.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
(cherry picked from commit f314ea4e30)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
V9fsState now only contains generic fields. Introduce V9fsVirtioState
for virtio transport. Change virtio-pci and virtio-ccw to use
V9fsVirtioState.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 00588a0aa2)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 2a0c56aa4c)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602. Note:
The previous reference to virtio_9p_get_features is left in place.]
Signed-off-by: Bruce Rogers <brogers@suse.com>
It's not virtio specific.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
(cherry picked from commit 72a189770a)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The new function resides in virtio specific file.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
cherry picked from commit 0d3716b4e6)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Factor out v9fs_iov_v{,un}marshal. Implement pdu_{,un}marshal with those
functions.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 0e2082d9e5)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
This matches naming convention of pdu_marshal and pdu_unmarshal.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit dc295f8353)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
V9fsState can be referenced by pdu->s. Initialise that in device
realization function.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit ad38ce9ed1)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
And rename v9fs_marshal to v9fs_iov_marshal, v9fs_unmarshal to
v9fs_iov_unmarshal.
The rationale behind this change is that, this marshalling interface is
used both by virtio and proxy helper. Renaming files and functions to
reflect the true nature of this interface.
Xen transport is going to have its own marshalling interface.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 2209bd050a)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Break out some generic functions for marshaling 9p state. Pure code
motion plus minor fixes for build system.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 829dd2861a)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Some structures in virtio-9p.h have been unused since 2011 when relevant
functions switched to use coroutines.
The declaration of pdu_packunpack and function do_pdu_unpack are
useless.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 71042cffc0)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The deleted file only contained V9fsConf which wasn't virtio specific.
Merge that to the general header of 9pfs.
Fixed header inclusions as I went along.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 756cb74a59)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
These three files are not virtio specific. Rename them to generic
names.
Fix comments and header inclusion in various files.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 267ae092e2)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Those two files are not virtio specific. Rename them to use generic
names.
Fix includes in various C files. Change define guards and comments
in header files.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 494a8ebe71)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
This file is not virtio specific. Rename it to use generic name.
Fix comment and remove unneeded inclusion of virtio.h.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit d57b78002c)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
This file is not virtio specific. Rename it to use generic name.
Fix comment and remove unneeded inclusion of virtio.h.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit f00d4f596b)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
This file is not virtio specific. Rename it to use generic name.
Fix comment and remove unneeded inclusion of virtio.h.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 3b9ca04653)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Since commit 4652f1640e "virtio-9p: add savevm
handlers", if the user hot-unplugs a quiescent 9p device and live
migrates, the source QEMU crashes before migration completetion...
This happens because virtio-9p devices have a realize handler which
calls virtio_init() and register_savevm(). Both calls store pointers
to the device internals, that get dereferenced during migration even
if the device got unplugged.
This patch simply adds an unrealize handler to perform minimal
cleanup and avoid the crash. Hot unplug of non-quiescent 9p devices
is still not supported in QEMU, and not supported by linux guests
either.
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20151208155457.27775.69441.stgit@bahia.huguette.org
[PMM: rewrapped long lines in commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 6cecf09373)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The coroutine files are currently referenced by the block-obj-y
variable. The coroutine functionality though is already used by
more than just the block code. eg migration code uses coroutine
yield. In the future the I/O channel code will also use the
coroutine yield functionality. Since the coroutine code is nicely
self-contained it can be easily built as part of the libqemuutil.a
library, making it widely available.
The headers are also moved into include/qemu, instead of the
include/block directory, since they are now part of the util
codebase, and the impl was never in the block/ directory
either.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602
In place of upstream commit 10817bf09d,
we just simulate the generalization of the qemu headers by creating
a include/qemu/coroutine.h, which simply includes include/block/
coroutine.h, nothing more. That satisfies our very limited needs]
Signed-off-by: Bruce Rogers <brogers@suse.com>
As only one place in virtio-9p-device.c uses
DEFINE_VIRTIO_9P_PROPERTIES, there is no need to expose it. Inline it
into virtio-9p-device.c to avoid wrongly use.
Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 83a84878da)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Drop a bunch of code duplicated from virtio_config.h and virtio_ring.h.
This makes us rename event index accessors which conflict,
as reusing the ones from virtio_ring.h isn't trivial.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
(cherry picked from commit e9600c6ca9)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602. Only
the part of the commit which seemed essential for the backport was
cherry-picked]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Switch to virtio_ring.h from standard headers.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 4fbe0f322d)
[BR: Fix and/or infrastructure for BSC#1038396]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Add files imported from linux-next (what will become linux 4.0) using
scripts/update-linux-headers.sh
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 9fbe302b2a)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
virtio-9p-pci all duplicate the qdev properties of their
V9fsState child. This approach does not work well with
string or pointer properties since we must be careful
about leaking or double-freeing them.
Use the QOM alias property to forward property accesses to the
V9fsState child. This way no duplication is necessary.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 48833071d9)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The qdev_alias_all_properties() function creates QOM alias properties
for each qdev property on a DeviceState. This is useful for parent
objects that wish to forward property accesses to their children.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
(cherry picked from commit 67cc7e0aac)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Sometimes an object needs to present a property which is actually on
another object, or it needs to provide an alias name for an existing
property.
Examples:
a.foo -> b.foo
a.old_name -> a.new_name
The new object_property_add_alias() API allows objects to alias a
property on the same object or another object. The source and target
names can be different.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
(cherry picked from commit ef7c7ff6d4)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
This is the virtio-access.h header file taken from Rusty's "endian-ambivalent
targets using legacy virtio" patch. It introduces helpers that should be used
when accessing vring data or by drivers for data that contains headers.
The virtio config space is also target endian, but the current code already
handles that with the virtio_is_big_endian() helper. There is no obvious
benefit at using the virtio accessors in this case.
Now we have two distinct paths: a fast inline one for fixed endian targets,
and a slow out-of-line one for targets that define the new TARGET_IS_BIENDIAN
macro.
Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
[ relicensed virtio-access.h to GPLv2+ on Rusty's request,
pass &address_space_memory to physical memory accessors,
per-device endianness,
virtio tswap16 and tswap64 helpers,
faspath for fixed endian targets,
Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Cc: Cédric Le Goater <clg@fr.ibm.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 0f5d1d2a49)
[BR: Fix and/or infrastructure for BSC#1038396]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Some CPU families can dynamically change their endianness. This means we
can have little endian ppc or big endian arm guests for example. This has
an impact on legacy virtio data structures since they are target endian.
We hence introduce a new property to track the endianness of each virtio
device. It is reasonnably assumed that endianness won't change while the
device is in use : we hence capture the device endianness when it gets
reset.
We migrate this property in a subsection, after the device descriptor. This
means the load code must not rely on it until it is restored. As a consequence,
the vring sanity checks had to be moved after the call to vmstate_load_state().
We enforce paranoia by poisoning the property at the begining of virtio_load().
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 616a655219)
[BR: Fix and/or infrastructure for BSC#1038396]
Signed-off-by: Bruce Rogers <brogers@suse.com>
We currently have a virtio_is_big_endian() helper that provides the target
endianness to the virtio code. As of today, the helper returns a fixed
compile-time value. Of course, this will have to change if we want to
support target endianness changes at run-time.
Let's move the TARGET_WORDS_BIGENDIAN bits out to a new helper and have
virtio_is_big_endian() implemented on top of it.
This patch doesn't change any functionality.
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 98ed8ecfc9)
[BR: Fix and/or infrastructure for BSC#1038396]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Add a special Error * that can be passed to error handling APIs to
signal that any errors are fatal and should abort QEMU. There are two
advantages to this:
- allows for brevity when wishing to assert success of Error **
accepting APIs. No need for this pattern:
Error * local_err = NULL;
api_call(foo, bar, &local_err);
assert_no_error(local_err);
This also removes the need for _nofail variants of APIs with
asserting call sites now reduced to 1LOC.
- SIGABRT happens from within the offending API. When a fatal error
occurs in an API call (when the caller is asserting sucess) failure
often means the API itself is broken. With the abort happening in the
API call now, the stack frames into the call are available at debug
time. In the assert_no_error scheme the abort happens after the fact.
The exact semantic is that when an error is raised, if the argument
Error ** matches &error_abort, then the abort occurs immediately. The
error messaged is reported.
For error_propagate, if the destination error is &error_abort, then
the abort happens at propagation time.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit 5d24ee70bc)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 306ec6c3ce)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602. Only
part of this patch is used, and it also retains some of the old
interface to allow old and new to co-exist]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Drop VirtioDeviceClass::init.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0ba94b6f94)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602. Patch
altered to allow init to continue to be in use.]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 59be75227d)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Temporarily allow either VirtioDeviceClass::init or
VirtioDeviceClass::realize.
Introduce VirtioDeviceClass::unrealize for symmetry.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 1d244b42d2)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602. Patch
modified to maintain both old and new ways to init/realize]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Right now we have these pairs:
- virtio_bus_plug_device/virtio_bus_destroy_device. The first
takes a VirtIODevice, the second takes a VirtioBusState
- device_plugged/device_unplug callbacks in the VirtioBusClass
(here it's just the naming that is inconsistent)
- virtio_bus_destroy_device is not called by anyone (and since
it calls qdev_free, it would be called by the proxies---but
then the callback is useless since the proxies can do whatever
they want before calling virtio_bus_destroy_device)
And there is a k->init but no k->exit, hence virtio_device_exit is
overwritten by subclasses (except virtio-9p). This cleans it up by:
- renaming the device_unplug callback to device_unplugged
- renaming virtio_bus_plug_device to virtio_bus_device_plugged,
matching the callback name
- renaming virtio_bus_destroy_device to virtio_bus_device_unplugged,
removing the qdev_free, making it take a VirtIODevice and calling it
from virtio_device_exit
- adding a k->exit callback
virtio_device_exit is still overwritten, the next patches will fix that.
Cc: qemu-stable@nongnu.org
Acked-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 5e96f5d2f8)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
virtqueue_get_avail_bytes: when found a indirect desc, we need loop over it.
/* loop over the indirect descriptor table */
indirect = 1;
max = vring_desc_len(desc_pa, i) / sizeof(VRingDesc);
num_bufs = i = 0;
desc_pa = vring_desc_addr(desc_pa, i);
But, It init i to 0, then use i to update desc_pa. so we will always get:
desc_pa = vring_desc_addr(desc_pa, 0);
the last two line should swap.
Cc: qemu-stable@nongnu.org
Signed-off-by: Yin Yin <yin.yin@cs2c.com.cn>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 1ae2757c6c)
[BR: Fix and/or infrastructure for BSC#1038396]
Signed-off-by: Bruce Rogers <brogers@suse.com>
In virtio_9p_device_init() there are 6x goto out that will lead to
v9fs_path_free() attempting to free unitialized path.data field.
Easiest way to trigger is: qemu-system-x86_64 -device virtio-9p-pci
Fix this by moving v9fs_path_init() before any goto out.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Message-id: 1375315187-16534-1-git-send-email-afaerber@suse.de
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 27915efb97)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Here the virtio-9p-pci is modified for the new API. The device
virtio-9p-pci extends virtio-pci. It creates and connects a
virtio-9p-device during the init. The properties are not changed.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1366708123-19626-3-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 234a336f9e)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Many of these should be cleaned up with proper qdev-/QOM-ification.
Right now there are many catch-all headers in include/hw/ARCH depending
on cpu.h, and this makes it necessary to compile these files per-target.
However, fixing this does not belong in these patches.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0d09e41a51)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602. This patch
is severely edited to only include the changes needed for this backport]
Signed-off-by: Bruce Rogers <brogers@suse.com>
These structures must be made public to avoid two memory allocations for
refactored virtio devices.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Message-id: 1363624648-16906-2-git-send-email-fred.konrad@greensocs.com
Changes V4 <- V3:
* Rebased on current git.
Changes V3 <- V2:
* Style correction spotted by Andreas (virtio-scsi.h).
* Style correction for virtio-net.h.
Changes V2 <- V1:
* Move the dataplane include into the header (virtio-blk).
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit f1b24e840f)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
It is very useful to get the main loop AioContext, which is a static
variable in main-loop.c.
I'm not sure whether qemu_get_aio_context() will be necessary in the
future once devices focus on using their own AioContext instead of the
main loop AioContext, but for now it allows us to refactor code to
support multiple AioContext while actually passing the main loop
AioContext.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 5f3aa1ff47)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Done with this script:
cd hw
for i in `find . -name '*.h' | sed 's/^..//'`; do
echo '\,^#.*include.*["<]'$i'[">], s,'$i',hw/&,'
done | sed -i -f - `find . -type f`
This is so that paths remain valid as files are moved.
Instead, files in hw/dataplane are referenced with the relative path.
We know they are not going to move to include/, and they are the only
include files that are in subdirectories _and_ move.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 83c9f4ca79)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602. Very
severely trimmed to only include what is needful for this backporting
effort]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Also move the 9p.h file to 9pfs/virtio-9p-device.h, for consistency
with the corresponding .c file.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 60653b28f5)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7e6b14dfb5)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Per default any SCSI commands are sent with an infinite timeout,
which essentially disables any command abort mechanism on the
host and causes the guest to stall.
This patch adds a new option 'timeout' for scsi-generic and
scsi-disk which allows the user to set the timeout value to
something sensible.
Instead of disabling command aborts by setting the command timeout
to infinity we should be setting it to '0' per default, allowing
the host to fall back to its default values.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Lin Ma <lma@suse.com>
This patch fixes a bug in scsi_block_new_request() that was introduced
by commit 137745c5c6. If the host cache
is used - i.e. if BDRV_O_NOCACHE is _not_ set - the 'break' statement
needs to be executed to 'fall back' to SG_IO.
Cc: qemu-stable@nongnu.org
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 2fe5a9f73b)
[LM: BSC#1031051]
Signed-off-by: Lin Ma <lma@suse.com>
If the guest sets the sglist size to a value >=2GB, megasas_handle_dcmd
will return MFI_STAT_MEMORY_NOT_AVAILABLE without freeing the memory.
Avoid this by returning only the status from map_dcmd, and loading
cmd->iov_size in the caller.
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 765a707000)
[BR: CVE-2017-5856 BSC#1023053]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When the Intel 6300ESB watchdog is hot unplug. The timer allocated
in realize isn't freed thus leaking memory leak. This patch avoid
this through adding the exit function.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-Id: <583cde9c.3223ed0a.7f0c2.886e@mx.google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit eb7a20a361)
[BR: CVE-2016-10155 BSC#1021129]
Signed-off-by: Bruce Rogers <brogers@suse.com>
CCID device emulator uses Application Protocol Data Units(APDU)
to exchange command and responses to and from the host.
The length in these units couldn't be greater than 65536. Add
check to ensure the same. It'd also avoid potential integer
overflow in emulated_apdu_from_guest.
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20170202192228.10847-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit c7dfbf3225)
[BR: CVE-2017-5898 BSC#1023907]
Signed-off-by: Bruce Rogers <brogers@suse.com>
CIRRUS_BLTMODE_MEMSYSSRC blits do NOT check blit destination
and blit width, at all. Oops. Fix it.
Security impact: high.
The missing blit destination check allows to write to host memory.
Basically same as CVE-2014-8106 for the other blit variants.
The missing blit width check allows to overflow cirrus_bltbuf,
with the attractive target cirrus_srcptr (current cirrus_bltbuf write
position) being located right after cirrus_bltbuf in CirrusVGAState.
Due to cirrus emulation writing cirrus_bltbuf bytewise the attacker
hasn't full control over cirrus_srcptr though, only one byte can be
changed. Once the first byte has been modified further writes land
elsewhere.
[ This is CVE-2017-2620 / XSA-209 - Ian Jackson ]
Fixed compilation by removing extra parameter to blit_is_unsafe. -iwj
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
[BR: BSC#1024972]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The blit_region_is_unsafe checks don't work correctly for the
patterncopy source. It's a fixed-sized region, which doesn't
depend on cirrus_blt_{width,height}. So go do the check in
cirrus_bitblt_common_patterncopy instead, then tell blit_is_unsafe that
it doesn't need to verify the source. Also handle the case where we
blit from cirrus_bitbuf correctly.
This patch replaces 5858dd1801.
Security impact: I think for the most part error on the safe side this
time, refusing blits which should have been allowed.
Only exception is placing the blit source at the end of the video ram,
so cirrus_blt_srcaddr + 256 goes beyond the end of video memory. But
even in that case I'm not fully sure this actually allows read access to
host memory. To trick the commit 5858dd18 security checks one has to
pick very small cirrus_blt_{width,height} values, which in turn implies
only a fraction of the blit source will actually be used.
Cc: Wolfgang Bumiller <w.bumiller@proxmox.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-id: 1486645341-5010-1-git-send-email-kraxel@redhat.com
(cherry picked from commit 95280c31cd)
Signed-off-by: Bruce Rogers <brogers@suse.com>
The "max" value is being compared with >=, but addr + width points to
the first byte that will _not_ be copied. Laszlo suggested using a
"greater than" comparison, instead of subtracting one like it is
already done above for the height, so that max remains always positive.
The mistake is "safe"---it will reject some blits, but will never cause
out-of-bounds writes.
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 1455121059-18280-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit d2ba7ecb34)
Signed-off-by: Bruce Rogers <brogers@suse.com>
Apply the cirrus_addr_mask to cirrus_blt_dstaddr and cirrus_blt_srcaddr
right after assigning them, in cirrus_bitblt_start(), instead of having
this all over the place in the cirrus code, and missing a few places.
Reported-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1485338996-17095-1-git-send-email-kraxel@redhat.com
(cherry picked from commit 60cd23e851)
Signed-off-by: Bruce Rogers <brogers@suse.com>
cirrus_invalidate_region() calls memory_region_set_dirty()
on a per-line basis, always ranging from off_begin to
off_begin+bytesperline. With a negative pitch off_begin
marks the top most used address and thus we need to do an
initial shift backwards by a line for negative pitches of
backward blits, otherwise the first iteration covers the
line going from the start offset forwards instead of
backwards.
Additionally since the start address is inclusive, if we
shift by a full `bytesperline` we move to the first address
*not* included in the blit, so we only shift by one less
than bytesperline.
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Message-id: 1485352137-29367-1-git-send-email-w.bumiller@proxmox.com
[ kraxel: codestyle fixes ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit f153b563f8)
Signed-off-by: Bruce Rogers <brogers@suse.com>
Commit 4299b90 added a check which is too broad, given that the source
pitch value is not required to be initialized for solid fill operations.
This patch refines the blit_is_unsafe() check to ignore source pitch in
that case. After applying the above commit as a security patch, we
noticed the SLES 11 SP4 guest gui failed to initialize properly.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Message-id: 20170109203520.5619-1-brogers@suse.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 913a87885f)
[BR: BSC#1016779]
Signed-off-by: Bruce Rogers <brogers@suse.com>
In Cirrus CLGD 54xx VGA Emulator, if cirrus graphics mode is VGA,
'cirrus_get_bpp' returns zero(0), which could lead to a divide
by zero error in while copying pixel data. The same could occur
via blit pitch values. Add check to avoid it.
Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1476776717-24807-1-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 4299b90e9b)
[BR: CVE-2016-9921 CVE-2016-9922 BSC#1014702 BSC#1015169]
Signed-off-by: Bruce Rogers <brogers@suse.com>
In ehci_init_transfer function, if the 'cpage' is bigger than 4,
it doesn't free the 'p->sgl' once allocated previously thus leading
a memory leak issue. This patch avoid this.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 5821c0f4.091c6b0a.e0c92.e811@mx.google.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 791f97758e)
[BR: CVE-2016-9911 BSC#1014111]
Signed-off-by: Bruce Rogers <brogers@suse.com>
ColdFire Fast Ethernet Controller uses a receive buffer size
register(EMRBR) to hold maximum size of all receive buffers.
It is set by a user before any operation. If it was set to be
zero, ColdFire emulator would go into an infinite loop while
receiving data in mcf_fec_receive. Add check to avoid it.
Reported-by: Wjjzhang <wjjzhang@tencent.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 77d54985b8)
[BR: CVE-2016-9776 BSC#1013285]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The JAZZ RC4030 chipset emulator has a periodic timer and
associated interval reload register. The reload value is used
as divider when computing timer's next tick value. If reload
value is large, it could lead to divide by zero error. Limit
the interval reload value to avoid it.
Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[BR: CVE-2016-8667 BSC#1004702]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Intel HDA emulator uses stream of buffers during DMA data
transfers. Each entry has buffer length and buffer pointer
position, which are used to derive bytes to 'copy'. If this
length and buffer pointer were to be same, 'copy' could be
set to zero(0), leading to an infinite loop. Add check to
avoid it.
Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1476949224-6865-1-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 0c0fc2b5fd)
[BR: CVE-2016-8909 BSC#1006536]
Signed-off-by: Bruce Rogers <brogers@suse.com>
RTL8139 ethernet controller in C+ mode supports multiple
descriptor rings, each with maximum of 64 descriptors. While
processing transmit descriptor ring in 'rtl8139_cplus_transmit',
it does not limit the descriptor count and runs forever. Add
check to avoid it.
Reported-by: Andrew Henderson <hendersa@icculus.org>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit c7c3591669)
[BR: CVE-2016-8910 BSC#1006538]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The exit dispatch of eepro100 network card device doesn't free
the 's->vmstate' field which was allocated in device realize thus
leading a host memory leak. This patch avoid this.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 2634ab7fe2)
[BR: CVE-2016-9101 BSC#1007391]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The AMD PC-Net II emulator has set of control and status(CSR)
registers. Of these, CSR76 and CSR78 hold receive and transmit
descriptor ring length respectively. This ring length could range
from 1 to 65535. Setting ring length to zero leads to an infinite
loop in pcnet_rdra_addr() or pcnet_transmit(). Add check to avoid it.
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 34e29ce754)
[BR: CVE-2016-7909 BSC#1002557]
Signed-off-by: Bruce Rogers <brogers@suse.com>
16550A UART device uses an oscillator to generate frequencies
(baud base), which decide communication speed. This speed could
be changed by dividing it by a divider. If the divider is
greater than the baud base, speed is set to zero, leading to a
divide by zero error. Add check to avoid it.
Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1476251888-20238-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 3592fe0c91)
[BR: CVE-2016-8669 BSC#1004707]
Signed-off-by: Bruce Rogers <brogers@suse.com>
ColdFire Fast Ethernet Controller uses buffer descriptors to manage
data flow to/fro receive & transmit queues. While transmitting
packets, it could continue to read buffer descriptors if a buffer
descriptor has length of zero and has crafted values in bd.flags.
Set upper limit to number of buffer descriptors.
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 070c4b92b8)
[BR: CVE-2016-7908 BSC#1002550]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Check the cursor size more carefully. Also switch to unsigned while
being at it, so they can't be negative.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 5829b09720)
[BR: support patch for CVE-2016-7170 BSC#998516]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The .receive callback of xlnx.xps-ethernetlite doesn't check the length
of data before calling memcpy. As a result, the NetClientState object in
heap will be overflowed. All versions of qemu with xlnx.xps-ethernetlite
will be affected.
Reported-by: chaojianhu <chaojianhu@hotmail.com>
Signed-off-by: chaojianhu <chaojianhu@hotmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit a0d1cbdacf)
[BR: CVE-2016-7161 BSC#1001151]
Signed-off-by: Bruce Rogers <brogers@suse.com>
bits_per_pixel that are less than 8 could result in accessing
non-initialized buffers later in the code due to the expectation
that bytes_per_pixel value that is used to initialize these buffers is
never zero.
To fix this check that bits_per_pixel from the client is one of the
values that the rfb protocol specification allows.
This is CVE-2014-7815.
Signed-off-by: Petr Matousek <pmatouse@redhat.com>
[ kraxel: apply codestyle fix ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit e6908bfe8e)
[BR: BSC#902737]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
ui/vnc.c
virtio back end uses set of buffers to facilitate I/O operations.
An infinite loop unfolds in virtqueue_pop() if a buffer was
of zero size. Add check to avoid it.
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 1e7aed7014)
[BR: CVE-2016-6490 BSC#991466]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
hw/virtio.c
A broken or malicious guest can submit more requests than the virtqueue
size permits, causing unbounded memory allocation in QEMU.
The guest can submit requests without bothering to wait for completion
and is therefore not bound by virtqueue size. This requires reusing
vring descriptors in more than one request, which is not allowed by the
VIRTIO 1.0 specification.
In "3.2.1 Supplying Buffers to The Device", the VIRTIO 1.0 specification
says:
1. The driver places the buffer into free descriptor(s) in the
descriptor table, chaining as necessary
and
Note that the above code does not take precautions against the
available ring buffer wrapping around: this is not possible since the
ring buffer is the same size as the descriptor table, so step (1) will
prevent such a condition.
This implies that placing more buffers into the virtqueue than the
descriptor table size is not allowed.
QEMU is missing the check to prevent this case. Processing a request
allocates a VirtQueueElement leading to unbounded memory allocation
controlled by the guest.
Exit with an error if the guest provides more requests than the
virtqueue size permits. This bounds memory allocation and makes the
buggy guest visible to the user.
This patch fixes CVE-2016-5403 and was reported by Zhenhao Hong from 360
Marvel Team, China.
Reported-by: Zhenhao Hong <hongzhenhao@360.cn>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit afd9096eb1)
[BR: CVE-2016-5403 BSC#991080]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The FIFO contains two bytes; hence the write ptr should be two bytes ahead
of the read pointer.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d020aa504c)
[BR: CVE-2016-5238 BSC#982959]
Signed-off-by: Bruce Rogers <brogers@suse.com>
While reading information via 'megasas_ctrl_get_info' routine,
a local bios version buffer isn't null terminated. Add the
terminating null byte to avoid any OOB access.
Reported-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 844864fbae)
[BR: CVE-2016-5337 BSC#983961]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
hw/megasas.c
The 53C9X Fast SCSI Controller(FSC) comes with internal 16-byte
FIFO buffers. One is used to handle commands and other is for
information transfer. Three control variables 'ti_rptr',
'ti_wptr' and 'ti_size' are used to control r/w access to the
information transfer buffer ti_buf[TI_BUFSZ=16]. In that,
'ti_rptr' is used as read index, where read occurs.
'ti_wptr' is a write index, where write would occur.
'ti_size' indicates total bytes to be read from the buffer.
While reading/writing to this buffer, index could exceed its
size. Add check to avoid OOB r/w access.
Reported-by: Huawei PSIRT <psirt@huawei.com>
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1465230883-22303-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit ff589551c8)
[BR: CVE-2016-5338 BSC#983982]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Sanity checks are applied when the fifo is enabled by the guest
(SVGA_REG_CONFIG_DONE write). Which doesn't help much if the guest
changes the fifo registers afterwards. Move the checks to
vmsvga_fifo_length so they are done each time qemu is about to read
from the fifo.
Fixes: CVE-2016-4454
Cc: qemu-stable@nongnu.org
Cc: P J P <ppandit@redhat.com>
Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1464592161-18348-2-git-send-email-kraxel@redhat.com
(cherry picked from commit 5213602678)
[BR: CVE-2016-4454 BSC#982222]
Signed-off-by: Bruce Rogers <brogers@suse.com>
While doing MegaRAID SAS controller command frame lookup, routine
'megasas_lookup_frame' uses 'read_queue_head' value as an index
into 'frames[MEGASAS_MAX_FRAMES=2048]' array. Limit its value
within array bounds to avoid any OOB access.
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1464179110-18593-1-git-send-email-ppandit@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b60bdd1f1e)
[BR: CVE-2016-5107 BSC#982019]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
hw/megasas.c
When reading MegaRAID SAS controller configuration via MegaRAID
Firmware Interface(MFI) commands, routine megasas_dcmd_cfg_read
uses an uninitialised local data buffer. Initialise this buffer
to avoid stack information leakage.
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1464178304-12831-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d37af74073)
[BR: CVE-2016-5105 BSC#982017]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When setting MegaRAID SAS controller properties via MegaRAID
Firmware Interface(MFI) commands, a user supplied size parameter
is used to set property value. Use appropriate size value to avoid
OOB access issues.
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1464172291-2856-2-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 1b85898025)
[BR: CVE-2016-5106 BSC#982018]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Allocate timer once, at init time, instead of allocating/freeing
it all the time when starting/stopping the bus. Simplifies the
code, also fixes bugs (memory leak) due to missing checks whenever
the time is already allocated or not.
Cc: Prasad J Pandit <pjp@fedoraproject.org>
Reported-by: Zuozhi Fzz <zuozhi.fzz@alibaba-inc.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit fa1298c2d6)
[BR: CVE-2016-2391 BSC#967013]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
hw/usb/hcd-ohci.c
When processing remote NDIS control message packets, the USB Net
device emulator checks to see if the USB configuration descriptor
object is of RNDIS type(2). But it does not check if it is null,
which leads to a null dereference error. Add check to avoid it.
Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1455188480-14688-1-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 80eecda8e5)
[BR: CVE-2016-2392 BSC#967012]
Signed-off-by: Bruce Rogers <brogers@suse.com>
While processing transmit descriptors, it could lead to an infinite
loop if 'bytes' was to become zero; Add a check to avoid it.
[The guest can force 'bytes' to 0 by setting the hdr_len and mss
descriptor fields to 0.
--Stefan]
Signed-off-by: P J P <pjp@fedoraproject.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1441383666-6590-1-git-send-email-stefanha@redhat.com
(cherry picked from commit b947ac2bf2)
[BR: CVE-2015-6815 BSC#944697]
Signed-off-by: Bruce Rogers <brogers@suse.com>
AHCI couldn't cope with asynchronous commands that aren't doing DMA, it
simply wouldn't complete them. Due to the bug fixed in commit f68ec837,
FLUSH commands would seem to have completed immediately even if they
were still running on the host. After the commit, they would simply hang
and never unset the BSY bit, rendering AHCI unusable on any OS sending
flushes.
This patch adds another callback for the completion of asynchronous
commands. This is what AHCI really wants to use for its command
completion logic rather than an DMA completion callback.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit a62eaa26c1)
[LM: BNC#982356]
Signed-off-by: Lin Ma <lma@suse.com>
The 53C9X Fast SCSI Controller(FSC) comes with an internal 16-byte
FIFO buffer. It is used to handle command and data transfer.
Routine get_cmd() uses DMA to read scsi commands into this buffer.
Add check to validate DMA length against buffer size to avoid any
overrun.
Fixes CVE-2016-4441.
Reported-by: Li Qiang <liqiang6-s@360.cn>
Cc: qemu-stable@nongnu.org
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1463654371-11169-3-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 6c1fef6b59)
[BR: CVE-2016-4441 BSC#980723]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The 53C9X Fast SCSI Controller(FSC) comes with an internal 16-byte
FIFO buffer. It is used to handle command and data transfer. While
writing to this command buffer 's->cmdbuf[TI_BUFSZ=16]', a check
was missing to validate input length. Add check to avoid OOB write
access.
Fixes CVE-2016-4439.
Reported-by: Li Qiang <liqiang6-s@360.cn>
Cc: qemu-stable@nongnu.org
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1463654371-11169-2-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit c98c6c105f)
[BR: CVE-2016-4439 BSC#980711]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Commit "fd3c136 vga: make sure vga register setup for vbe stays intact
(CVE-2016-3712)." causes a regression. The win7 installer is unhappy
because it can't freely modify vga registers any more while in vbe mode.
This patch introduces a new sr_vbe register set. The vbe_update_vgaregs
will fill sr_vbe[] instead of sr[]. Normal vga register reads and
writes go to sr[]. Any sr register read access happens through a new
sr() helper function which will read from sr_vbe[] with vbe active and
from sr[] otherwise.
This way we can allow guests update sr[] registers as they want, without
allowing them disrupt vbe video modes that way.
Cc: qemu-stable@nongnu.org
Reported-by: Thomas Lamprecht <thomas@lamprecht.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1463475294-14119-1-git-send-email-kraxel@redhat.com
(cherry picked from commit 94ef4f337f)
[BR: BSC#978160 CVE-2016-3712]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
hw/display/vga.c
hw/display/vga_int.h
Call vbe_update_vgaregs() when the guest touches GFX, SEQ or CRT
registers, to make sure the vga registers will always have the
values needed by vbe mode. This makes sure the sanity checks
applied by vbe_fixup_regs() are effective.
Without this guests can muck with shift_control, can turn on planar
vga modes or text mode emulation while VBE is active, making qemu
take code paths meant for CGA compatibility, but with the very
large display widths and heigts settable using VBE registers.
Which is good for one or another buffer overflow. Not that
critical as they typically read overflows happening somewhere
in the display code. So guests can DoS by crashing qemu with a
segfault, but it is probably not possible to break out of the VM.
Fixes: CVE-2016-3712
Reported-by: Zuozhi Fzz <zuozhi.fzz@alibaba-inc.com>
Reported-by: P J P <ppandit@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
[BR: BSC#978160 CVE-2016-3712]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Call the new vbe_update_vgaregs() function on vbe configuration
changes, to make sure vga registers are up-to-date.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
[BR: BSC#978160 CVE-2016-3712]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When enabling vbe mode qemu will setup a bunch of vga registers to make
sure the vga emulation operates in correct mode for a linear
framebuffer. Move that code to a separate function so we can call it
from other places too.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
[BR: BSC#978160 CVE-2016-3712]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Makes code a bit easier to read.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
[BR: BSC#978160 CVE-2016-3712]
Signed-off-by: Bruce Rogers <brogers@suse.com>
vga allows banked access to video memory using the window at 0xa00000
and it supports a different access modes with different address
calculations.
The VBE bochs extentions support banked access too, using the
VBE_DISPI_INDEX_BANK register. The code tries to take the different
address calculations into account and applies different limits to
VBE_DISPI_INDEX_BANK depending on the current access mode.
Which is probably effective in stopping misprogramming by accident.
But from a security point of view completely useless as an attacker
can easily change access modes after setting the bank register.
Drop the bogus check, add range checks to vga_mem_{readb,writeb}
instead.
Fixes: CVE-2016-3710
Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
[BR: BSC#978158 CVE-2016-3710]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When processing Task Priorty Register(TPR) access, it could leak
automatic stack variable 'imm32' in patch_instruction().
Initialise the variable to avoid it.
Reported by: Donghai Zdh <donghai.zdh@alibaba-inc.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[BR: BSC#975700 CVE-2016-4020]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When receiving packets over MIPSnet network device, it uses
receive buffer of size 1514 bytes. In case the controller
accepts large(MTU) packets, it could lead to memory corruption.
Add check to avoid it.
Reported by: Oleksandr Bazhaniuk <oleksandr.bazhaniuk@intel.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[BR: BSC#975136 CVE-2016-4002]
Signed-off-by: Bruce Rogers <brogers@suse.com>
While computing IP checksum, 'net_checksum_calculate' reads
payload length from the packet. It could exceed the given 'data'
buffer size. Add a check to avoid it.
Reported-by: Liu Ling <liuling-it@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 362786f14a)
[BR: BSC#970037 CVE-2016-2857]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Requests are now created in the RngBackend parent class and the
code path is shared by both rng-egd and rng-random.
This commit fixes the rng-random implementation which processed
only one request at a time and simply discarded all but the most
recent one. In the guest this manifested as delayed completion
of reads from virtio-rng, i.e. a read was completed only after
another read was issued.
By switching rng-random to use the same request queue as rng-egd,
the unsafe stack-based allocation of the entropy buffer is
eliminated and replaced with g_malloc.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1456994238-9585-5-git-send-email-lprosek@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
(cherry picked from commit 60253ed1e6)
[BR: BSC#970036 CVE-2016-2858 (source path modified)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
RngBackend is now in charge of cleaning up the linked list on
instance finalization. It also exposes a function to finalize
individual RngRequest instances, called by its child classes.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1456994238-9585-4-git-send-email-lprosek@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
(cherry picked from commit 9f14b0add1)
[BR: support patch for BSC#970036 (source path modified)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
backends/rng.c
Ne2000 NIC uses ring buffer of NE2000_MEM_SIZE(49152)
bytes to process network packets. Registers PSTART & PSTOP
define ring buffer size & location. Setting these registers
to invalid values could lead to infinite loop or OOB r/w
access issues. Add check to avoid it.
Reported-by: Yang Hongke <yanghongke@huawei.com>
Tested-by: Yang Hongke <yanghongke@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 415ab35a44)
[BR: BSC#969350 CVE-2016-2841 (source path modified)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When processing remote NDIS control message packets,
the USB Net device emulator uses a fixed length(4096) data buffer.
The incoming informationBufferOffset & Length combination could
overflow and cross that range. Check control message buffer
offsets and length to avoid it.
Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1455648821-17340-3-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit fe3c546c5f)
[BR: BSC#967969 CVE-2016-2538]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Due converting PIO to the new memory read/write api we no longer provide
separate I/O region lenghts for read and write operations. As a result,
reading from PIT Mode/Command register will end with accessing
pit->channels with invalid index.
Fix this by ignoring read from the Mode/Command register.
This is CVE-2015-3214.
Reported-by: Matt Tait <matttait@google.com>
Fixes: 0505bcdec8
Cc: qemu-stable@nongnu.org
Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d4862a87e3)
[AF: bsc#934069]
Signed-off-by: Andreas Färber <afaerber@suse.de>
This impacts both BMDMA and AHCI HBA interfaces for IDE.
Currently, we confuse the difference between a PRDT having
"0 bytes" and a PRDT having "0 complete sectors."
When we receive an incomplete sector, inconsistent error checking
leads to an infinite loop wherein the call succeeds, but it
didn't give us enough bytes -- leading us to re-call the
DMA chain over and over again. This leads to, in the BMDMA case,
leaked memory for short PRDTs, and infinite loops and resource
usage in the AHCI case.
The .prepare_buf() callback is reworked to return the number of
bytes that it successfully prepared. 0 is a valid, non-error
answer that means the table was empty and described no bytes.
-1 indicates an error.
Our current implementation uses the io_buffer in IDEState to
ultimately describe the size of a prepared scatter-gather list.
Even though the AHCI PRDT/SGList can be as large as 256GiB, the
AHCI command header limits transactions to just 4GiB. ATA8-ACS3,
however, defines the largest transaction to be an LBA48 command
that transfers 65,536 sectors. With a 512 byte sector size, this
is just 32MiB.
Since our current state structures use the int type to describe
the size of the buffer, and this state is migrated as int32, we
are limited to describing 2GiB buffer sizes unless we change the
migration protocol.
For this reason, this patch begins to unify the assertions in the
IDE pathways that the scatter-gather list provided by either the
AHCI PRDT or the PCI BMDMA PRDs can only describe, at a maximum,
2GiB. This should be resilient enough unless we need a sector
size that exceeds 32KiB.
Further, the likelihood of any guest operating system actually
attempting to transfer this much data in a single operation is
very slim.
To this end, the IDEState variables have been updated to more
explicitly clarify our maximum supported size. Callers to the
prepare_buf callback have been reworked to understand the new
return code, and all versions of the prepare_buf callback have
been adjusted accordingly.
Lastly, the ahci_populate_sglist helper, relied upon by the
AHCI implementation of .prepare_buf() as well as the PCI
implementation of the callback have had overflow assertions
added to help make clear the reasonings behind the various
type changes.
[Added %d -> %"PRId64" fix John sent because off_pos changed from int to
int64_t.
--Stefan]
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1414785819-26209-4-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 3251bdcf1c)
[BR: BSC#928393 CVE-2014-9718]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
hw/ide/ahci.c
hw/ide/core.c
hw/ide/internal.h
hw/ide/macio.c
Switch vmsvga_update_rect over to use vmsvga_verify_rect. Slight change
in behavior: We don't try to automatically fixup rectangles any more.
In case we find invalid update requests we'll do a full-screen update
instead.
Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Don Koch <dkoch@verizon.com>
(cherry picked from commit 1735fe1edb)
[AF: bsc#901508, surface_{width,height}() -> ds_get_{width,height}()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Quick & easy stopgap for CVE-2014-3689: We just compile out the
hardware acceleration functions which lack sanity checks. Thankfully
we have capability bits for them (SVGA_CAP_RECT_COPY and
SVGA_CAP_RECT_FILL), so guests should deal just fine, in theory.
Subsequent patches will add the missing checks and re-enable the
hardware acceleration emulation.
Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Don Koch <dkoch@verizon.com>
(cherry picked from commit 83afa38eb2)
[AF: bsc#901508]
Signed-off-by: Andreas Färber <afaerber@suse.de>
The start_xmit() and e1000_receive_iov() functions implement DMA transfers
iterating over a set of descriptors that the guest's e1000 driver
prepares:
- the TDLEN and RDLEN registers store the total size of the descriptor
area,
- while the TDH and RDH registers store the offset (in whole tx / rx
descriptors) into the area where the transfer is supposed to start.
Each time a descriptor is processed, the TDH and RDH register is bumped
(as appropriate for the transfer direction).
QEMU already contains logic to deal with bogus transfers submitted by the
guest:
- Normally, the transmit case wants to increase TDH from its initial value
to TDT. (TDT is allowed to be numerically smaller than the initial TDH
value; wrapping at or above TDLEN bytes to zero is normal.) The failsafe
that QEMU currently has here is a check against reaching the original
TDH value again -- a complete wraparound, which should never happen.
- In the receive case RDH is increased from its initial value until
"total_size" bytes have been received; preferably in a single step, or
in "s->rxbuf_size" byte steps, if the latter is smaller. However, null
RX descriptors are skipped without receiving data, while RDH is
incremented just the same. QEMU tries to prevent an infinite loop
(processing only null RX descriptors) by detecting whether RDH assumes
its original value during the loop. (Again, wrapping from RDLEN to 0 is
normal.)
What both directions miss is that the guest could program TDLEN and RDLEN
so low, and the initial TDH and RDH so high, that these registers will
immediately be truncated to zero, and then never reassume their initial
values in the loop -- a full wraparound will never occur.
The condition that expresses this is:
xdh_start >= s->mac_reg[XDLEN] / sizeof(desc)
i.e., TDH or RDH start out after the last whole rx or tx descriptor that
fits into the TDLEN or RDLEN sized area.
This condition could be checked before we enter the loops, but
pci_dma_read() / pci_dma_write() knows how to fill in buffers safely for
bogus DMA addresses, so we just extend the existing failsafes with the
above condition.
This is CVE-2016-1981.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Prasad Pandit <ppandit@redhat.com>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: qemu-stable@nongnu.org
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1296044
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit dd793a7488)
[BR: in above description e1000_receive_iov is e1000_receive in backport]
[BR: BSC#963782 CVE-2016-1981]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
hw/net/e1000.c
USB Ehci emulation supports host controller capability registers.
But its mmio '.write' function was missing, which lead to a null
pointer dereference issue. Add a do nothing 'ehci_caps_write'
definition to avoid it; Do nothing because capability registers
are Read Only(RO).
Reported-by: Zuozhi Fzz <zuozhi.fzz@alibaba-inc.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[BR: BSC#964413 CVE-2016-2198]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When processing 'sendkey' command, hmp_sendkey routine null
terminates the 'keyname_buf' array. This results in an OOB
write issue, if 'keyname_len' was to fall outside of
'keyname_buf' array.
Since the keyname's length is known the keyname_buf can be
removed altogether by adding a length parameter to
index_from_key() and using it for the error output as well.
Reported-by: Ling Liu <liuling-it@360.cn>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Message-Id: <20160113080958.GA18934@olga>
[Comparison with "<" dumbed down, test for junk after strtoul()
tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit 64ffbe04ea)
[BR: BSC#960334 CVE-2015-8619]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
hmp.c
include/ui/console.h
ui/input-legacy.c
currently a malicious client could define a payload
size of 2^32 - 1 bytes and send up to that size of
data to the vnc server. The server would allocated
that amount of memory which could easily create an
out of memory condition.
This patch limits the payload size to 1MB max.
Please note that client_cut_text messages are currently
silently ignored.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit f9a70e7939)
[CYL: BSC#944463 CVE-2015-5239]
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Plug a bunch of holes in the bochs dispi interface parameter checking.
Add a function doing verification on all registers. Call that
unconditionally on every register write. That way we should catch
everything, even changing one register affecting the valid range of
another register.
Some of the holes have been added by commit
e9c6149f6a. Before that commit the
maximum possible framebuffer (VBE_DISPI_MAX_XRES * VBE_DISPI_MAX_YRES *
32 bpp) has been smaller than the qemu vga memory (8MB) and the checking
for VBE_DISPI_MAX_XRES + VBE_DISPI_MAX_YRES + VBE_DISPI_MAX_BPP was ok.
Some of the holes have been there forever, such as
VBE_DISPI_INDEX_X_OFFSET and VBE_DISPI_INDEX_Y_OFFSET register writes
lacking any verification.
Security impact:
(1) Guest can make the ui (gtk/vnc/...) use memory rages outside the vga
frame buffer as source -> host memory leak. Memory isn't leaked to
the guest but to the vnc client though.
(2) Qemu will segfault in case the memory range happens to include
unmapped areas -> Guest can DoS itself.
The guest can not modify host memory, so I don't think this can be used
by the guest to escape.
CVE-2014-3615
Cc: qemu-stable@nongnu.org
Cc: secalert@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
(cherry picked from commit c1b886c45d)
[CYL: BSC#895528 CVE-2014-3615]
Signed-off-by: Chunyan Liu <cyliu@suse.com>
VgaState->vram_size is the size of the pci bar. In case of qxl not the
whole pci bar can be used as vga framebuffer. Add a new variable
vbe_size to handle that case. By default (if unset) it equals
vram_size, but qxl can set vbe_size to something else.
This makes sure VBE_DISPI_INDEX_VIDEO_MEMORY_64K returns correct results
and sanity checks are done with the correct size too.
Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
(cherry picked from commit c1b886c45d)
[CYL: BSC#895528 CVE-2014-3615]
Signed-off-by: Chunyan Liu <cyliu@suse.com>
When processing NCQ commands, AHCI device emulation prepares a
NCQ transfer object; To which an aio control block(aiocb) object
is assigned in 'execute_ncq_command'. In case, when the NCQ
command is invalid, the 'aiocb' object is not assigned, and NCQ
transfer object is left as 'used'. This leads to a use after
free kind of error in 'bdrv_aio_cancel_async' via 'ahci_reset_port'.
Reset NCQ transfer object to 'unused' to avoid it.
[Maintainer edit: s/ACHI/AHCI/ in the commit message. --js]
Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1452282511-4116-1-git-send-email-ppandit@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 4ab0359a8a)
[LM: BSC#961333 CVE-2016-1568]
Signed-off-by: Lin Ma <lma@suse.com>
qpci_msix_pending() writes on pba region, causing qemu to SEGV:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7fba8c0 (LWP 25882)]
0x0000000000000000 in ?? ()
(gdb) bt
#0 0x0000000000000000 in ()
#1 0x00005555556556c5 in memory_region_oldmmio_write_accessor (mr=0x5555579f3f80, addr=0, value=0x7fffffffbf68, size=4, shift=0, mask=4294967295, attrs=...) at /home/elmarco/src/qemu/memory.c:434
#2 0x00005555556558e1 in access_with_adjusted_size (addr=0, value=0x7fffffffbf68, size=4, access_size_min=1, access_size_max=4, access=0x55555565563e <memory_region_oldmmio_write_accessor>, mr=0x5555579f3f80, attrs=...) at /home/elmarco/src/qemu/memory.c:506
#3 0x00005555556581eb in memory_region_dispatch_write (mr=0x5555579f3f80, addr=0, data=0, size=4, attrs=...) at /home/elmarco/src/qemu/memory.c:1176
#4 0x000055555560b6f9 in address_space_rw (as=0x555555eff4e0 <address_space_memory>, addr=3759147008, attrs=..., buf=0x7fffffffc1b0 "", len=4, is_write=true) at /home/elmarco/src/qemu/exec.c:2439
#5 0x000055555560baa2 in cpu_physical_memory_rw (addr=3759147008, buf=0x7fffffffc1b0 "", len=4, is_write=1) at /home/elmarco/src/qemu/exec.c:2534
#6 0x000055555564c005 in cpu_physical_memory_write (addr=3759147008, buf=0x7fffffffc1b0, len=4) at /home/elmarco/src/qemu/include/exec/cpu-common.h:80
#7 0x000055555564cd9c in qtest_process_command (chr=0x55555642b890, words=0x5555578de4b0) at /home/elmarco/src/qemu/qtest.c:378
#8 0x000055555564db77 in qtest_process_inbuf (chr=0x55555642b890, inbuf=0x55555641b340) at /home/elmarco/src/qemu/qtest.c:569
#9 0x000055555564dc07 in qtest_read (opaque=0x55555642b890, buf=0x7fffffffc2e0 "writel 0xe0100800 0x0\n", size=22) at /home/elmarco/src/qemu/qtest.c:581
#10 0x000055555574ce3e in qemu_chr_be_write (s=0x55555642b890, buf=0x7fffffffc2e0 "writel 0xe0100800 0x0\n", len=22) at qemu-char.c:306
#11 0x0000555555751263 in tcp_chr_read (chan=0x55555642bcf0, cond=G_IO_IN, opaque=0x55555642b890) at qemu-char.c:2876
#12 0x00007ffff64c9a8a in g_main_context_dispatch (context=0x55555641c400) at gmain.c:3122
(without this patch, this can be reproduced with the ivshmem qtest)
Implement an empty mmio write to avoid the crash.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 43b11a91dd)
[LM: BSC#958917 CVE-2015-7549]
Signed-off-by: Lin Ma <lma@suse.com>
Commit "156a2e4 ehci: make idt processing more robust" tries to avoid a
DoS by the guest (create a circular iTD queue and let qemu ehci
emulation run in circles forever). Unfortunately this has two problems:
First it misses the case of siTDs, and second it reportedly breaks
FreeBSD.
So lets go for a different approach: just count the number of iTDs and
siTDs we have seen per frame and apply a limit. That should really
catch all cases now.
Reported-by: 杜少博 <dushaobo@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 1ae3f2f178)
[BR: BSC#976109 CVE-2016-4037]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Don't assume a specific layout for control messages.
Required by virtio 1.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 7882080388)
[LM: BSC#940929 CVE-2015-5745]
Signed-off-by: Lin Ma <lma@suse.com>
While sending 'SetPixelFormat' messages to a VNC server,
the client could set the 'red-max', 'green-max' and 'blue-max'
values to be zero. This leads to a floating point exception in
write_png_palette while doing frame buffer updates.
Reported-by: Lian Yihan <lianyihan@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 4c65fed8bd)
[LM: BSC#958491 CVE-2015-8504]
Signed-off-by: Lin Ma <lma@suse.com>
While processing controller 'CTRL_GET_INFO' command, the routine
'megasas_ctrl_get_info' overflows the '&info' object size. Use its
appropriate size to null initialise it.
Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <alpine.LFD.2.20.1512211501420.22471@wniryva>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: P J P <ppandit@redhat.com>
(cherry picked from commit 36fef36b91)
[BR: BSC#961556 CVE-2015-8613]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
hw/scsi/megasas.c
Hello,
A null pointer dereference issue was reported by Mr Ling Liu, CC'd here. It
occurs while doing I/O port write operations via hmp interface. In that,
'current_cpu' remains null as it is not called from cpu_exec loop, which
results in the said issue.
Below is a proposed (tested)patch to fix this issue; Does it look okay?
===
From ae88a4947fab9a148cd794f8ad2d812e7f5a1d0f Mon Sep 17 00:00:00 2001
From: Prasad J Pandit <pjp@fedoraproject.org>
Date: Fri, 18 Dec 2015 11:16:07 +0530
Subject: [PATCH] i386: avoid null pointer dereference
When I/O port write operation is called from hmp interface,
'current_cpu' remains null, as it is not called from cpu_exec()
loop. This leads to a null pointer dereference in vapic_write
routine. Add check to avoid it.
Reported-by: Ling Liu <liuling-it@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <alpine.LFD.2.20.1512181129320.9805@wniryva>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: P J P <ppandit@redhat.com>
(cherry picked from commit 4c1396cb57)
[BR: BSC#962320 CVE-2016-1922]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
hw/i386/kvmvapic.c
When processing firmware configurations, an OOB r/w access occurs
if 's->cur_entry' is set to be invalid(FW_CFG_INVALID=0xffff).
Add a check to validate 's->cur_entry' to avoid such access.
Reported-by: Donghai Zdh <donghai.zdh@alibaba-inc.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[BR: BSC#961691 CVE-2016-1714]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The implementation of the ATA FLUSH command invokes a flush at the block
layer, which may on raw files on POSIX entail a synchronous fdatasync().
This may in some cases take so long that the SLES 11 SP1 guest driver
reports I/O errors and filesystems get corrupted or remounted read-only.
Avoid this by setting BUSY_STAT, so that the guest is made aware we are
in the middle of an operation and no ATA commands are attempted to be
processed concurrently.
Addresses BNC#637297.
Suggested-by: Gonglei (Arei) <arei.gonglei@huawei.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f68ec8379e)
[BSC#936132]
Signed-off-by: Bruce Rogers <brogers@suse.com>
We're a little too lenient with what we'll let an ATAPI drive handle.
Clamp down on the IDE command execution table to remove CD_OK permissions
from commands that are not and have never been ATAPI commands.
For ATAPI command validity, please see:
- ATA4 Section 6.5 ("PACKET Command feature set")
- ATA8/ACS Section 4.3 ("The PACKET feature set")
- ACS3 Section 4.3 ("The PACKET feature set")
ACS3 has a historical command validity table in Table B.4
("Historical Command Assignments") that can be referenced to find when
a command was introduced, deprecated, obsoleted, etc.
The only reference for ATAPI command validity is by checking that
version's PACKET feature set section.
ATAPI was introduced by T13 into ATA4, all commands retired prior to ATA4
therefore are assumed to have never been ATAPI commands.
Mandatory commands, as listed in ATA8-ACS3, are:
- DEVICE RESET
- EXECUTE DEVICE DIAGNOSTIC
- IDENTIFY DEVICE
- IDENTIFY PACKET DEVICE
- NOP
- PACKET
- READ SECTOR(S)
- SET FEATURES
Optional commands as listed in ATA8-ACS3, are:
- FLUSH CACHE
- READ LOG DMA EXT
- READ LOG EXT
- WRITE LOG DMA EXT
- WRITE LOG EXT
All other commands are illegal to send to an ATAPI device and should
be rejected by the device.
CD_OK removal justifications:
0x06 WIN_DSM Defined in ACS2. Not valid for ATAPI.
0x21 WIN_READ_ONCE Retired in ATA5. Not ATAPI in ATA4.
0x94 WIN_STANDBYNOW2 Retired in ATA4. Did not coexist with ATAPI.
0x95 WIN_IDLEIMMEDIATE2 Retired in ATA4. Did not coexist with ATAPI.
0x96 WIN_STANDBY2 Retired in ATA4. Did not coexist with ATAPI.
0x97 WIN_SETIDLE2 Retired in ATA4. Did not coexist with ATAPI.
0x98 WIN_CHECKPOWERMODE2 Retired in ATA4. Did not coexist with ATAPI.
0x99 WIN_SLEEPNOW2 Retired in ATA4. Did not coexist with ATAPI.
0xE0 WIN_STANDBYNOW1 Not part of ATAPI in ATA4, ACS or ACS3.
0xE1 WIN_IDLEIMMDIATE Not part of ATAPI in ATA4, ACS or ACS3.
0xE2 WIN_STANDBY Not part of ATAPI in ATA4, ACS or ACS3.
0xE3 WIN_SETIDLE1 Not part of ATAPI in ATA4, ACS or ACS3.
0xE4 WIN_CHECKPOWERMODE1 Not part of ATAPI in ATA4, ACS or ACS3.
0xE5 WIN_SLEEPNOW1 Not part of ATAPI in ATA4, ACS or ACS3.
0xF8 WIN_READ_NATIVE_MAX Obsoleted in ACS3. Not ATAPI in ATA4 or ACS.
This patch fixes a divide by zero fault that can be caused by sending
the WIN_READ_NATIVE_MAX command to an ATAPI drive, which causes it to
attempt to use zeroed CHS values to perform sector arithmetic.
Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1441816082-21031-1-git-send-email-jsnow@redhat.com
CC: qemu-stable@nongnu.org
(cherry picked from commit d9033e1d3a)
[BR: CVE-2015-6855 BSC#945404]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When packet is truncated during receiving, we drop the packets but
neither discard the descriptor nor add and signal used
descriptor. This will lead several issues:
- sg mappings are leaked
- rx will be stalled if a lots of packets were truncated
In order to be consistent with vhost, fix by discarding the descriptor
in this case.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 0cf33fb6b4)
[BR: BSC#947159 CVE-2015-7295]
Signed-off-by: Bruce Rogers <brogers@suse.com>
This patch introduces virtqueue_discard() to discard a descriptor and
unmap the sgs. This will be used by the patch that will discard
descriptor when packet is truncated.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 29b9f5efd7)
[BR: BSC#947159 CVE-2015-7295]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Factor out sg unmapping logic. This will be reused by the patch that
can discard descriptor.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Andrew James <andrew.james@hpe.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit ce31746157)
[BR: BSC#947159 CVE-2015-7295]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Ne2000 NIC uses ring buffer of NE2000_MEM_SIZE(49152)
bytes to process network packets. While receiving packets
via ne2000_receive() routine, a local 'index' variable
could exceed the ring buffer size, leading to an infinite
loop situation.
Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: P J P <pjp@fedoraproject.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 737d2b3c41)
[BR: BSC#945989]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
hw/net/ne2000.c
Ne2000 NIC uses ring buffer of NE2000_MEM_SIZE(49152)
bytes to process network packets. While receiving packets
via ne2000_receive() routine, a local 'index' variable
could exceed the ring buffer size, which could lead to a
memory buffer overflow. Added other checks at initialisation.
Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: P J P <pjp@fedoraproject.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 9bbdbc66e5)
[BR: BSC#945987]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
hw/net/ne2000.c
Include-If: %if 0%{?sp_version} == 4
Unfortunately we had to break migration for previous releases of
kvm on SLE11 SP4, in order to be compatible with SLE11 SP3 as
well as SLE12. In order to help the users understand and get
past this issue, we will output a message when the migration
fails, pointing them to TID # 7017048, which will provide details
about the issue.
NOTE: As soon as the vast majority of users are already past the
likelihood of still hitting this issue, we should remove this
patch.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Backends could provide a packet whose length is greater than buffer
size. Check for this and truncate the packet to avoid rx buffer
overflow in this case.
Cc: Prasad J Pandit <pjp@fedoraproject.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
[BR: BSC#957162]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When we migrate we ask the kernel about its current belief on what the guest
time would be. However, I've seen cases where the kvmclock guest structure
indicates a time more recent than the kvm returned time.
To make sure we never go backwards, calculate what the guest would have seen as time at the point of migration and use that value instead of the kernel returned one when it's more recent.
This bases the view of the kvmclock after migration on the
same foundation in host as well as guest.
Signed-off-by: Alexander Graf <agraf@suse.de>
Cc: qemu-stable@nongnu.org
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9a48bcd1b8)
[BSC#947164, BSC#953187]
Signed-off-by: Bruce Rogers <brogers@suse.com>
kvmclock should not count while vm is paused, because:
1) if the vm is paused for long periods, timekeeping
math can overflow while converting the (large) clocksource
delta to nanoseconds.
2) Users rely on CLOCK_MONOTONIC to count run time, that is,
time which OS has been in a runnable state (see CLOCK_BOOTTIME).
Change kvmclock driver so as to save clock value when vm transitions
from runnable to stopped state, and to restore clock value from stopped
to runnable transition.
Cc: qemu-stable@nongnu.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 00f4d64ee7)
[BSC#947164, BSC#953187]
Signed-off-by: Bruce Rogers <brogers@suse.com>
This is additional hardening against an end_transfer_func that fails to
clear the DRQ status bit. The bit must be unset as soon as the PIO
transfer has completed, so it's better to do this in a central place
instead of duplicating the code in all commands (and forgetting it in
some).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
[BR: BSC#938344]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The command must be completed on all code paths. START STOP UNIT with
pwrcnd set should succeed without doing anything.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
[BR: BSC#938344]
Signed-off-by: Bruce Rogers <brogers@suse.com>
If the end_transfer_func of a command is called because enough data has
been read or written for the current PIO transfer, and it fails to
correctly call the command completion functions, the DRQ bit in the
status register and s->end_transfer_func may remain set. This allows the
guest to access further bytes in s->io_buffer beyond s->data_end, and
eventually overflowing the io_buffer.
One case where this currently happens is emulation of the ATAPI command
START STOP UNIT.
This patch fixes the problem by adding explicit array bounds checks
before accessing the buffer instead of relying on end_transfer_func to
function correctly.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
[BR: BSC#938344]
Signed-off-by: Bruce Rogers <brogers@suse.com>
In this version I used mkdtemp(3) which is:
_BSD_SOURCE
|| /* Since glibc 2.10: */
(_POSIX_C_SOURCE >= 200809L || _XOPEN_SOURCE >= 700)
(POSIX.1-2008), so should be available on systems we care about.
While at it, reset the resulting directory name within smb structure
on error so cleanup function wont try to remove directory which we
failed to create.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit 8b8f1c7e9d)
[BR: BSC#932267]
Signed-off-by: Bruce Rogers <brogers@suse.com>
4096 is the maximum length per TMD and it is also currently the size of
the relay buffer pcnet driver uses for sending the packet data to QEMU
for further processing. With packet spanning multiple TMDs it can
happen that the overall packet size will be bigger than sizeof(buffer),
which results in memory corruption.
Fix this by only allowing to queue maximum sizeof(buffer) bytes.
This is CVE-2015-3209.
Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Reported-by: Matt Tait <matttait@google.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[BR: BSC#932770]
Signed-off-by: Bruce Rogers <brogers@suse.com>
s->xmit_pos maybe assigned to a negative value (-1),
but in this branch variable s->xmit_pos as an index to
array s->buffer. Let's add a check for s->xmit_pos.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 7b50d00911)
[BR: BSC#932770]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
hw/net/pcnet.c
During processing of certain commands such as FD_CMD_READ_ID and
FD_CMD_DRIVE_SPECIFICATION_COMMAND the fifo memory access could
get out of bounds leading to memory corruption with values coming
from the guest.
Fix this by making sure that the index is always bounded by the
allocated memory.
This is CVE-2015-3456.
Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
[AF: BSC#929339]
Signed-off-by: Andreas Färber <afaerber@suse.de>
The VNC server websockets decoder will read and buffer data from
websockets clients until it sees the end of the HTTP headers,
as indicated by \r\n\r\n. In theory this allows a malicious to
trick QEMU into consuming an arbitrary amount of RAM. In practice,
because QEMU runs g_strstr_len() across the buffered header data,
it will spend increasingly long burning CPU time searching for
the substring match and less & less time reading data. So while
this does cause arbitrary memory growth, the bigger problem is
that QEMU will be burning 100% of available CPU time.
A novnc websockets client typically sends headers of around
512 bytes in length. As such it is reasonable to place a 4096
byte limit on the amount of data buffered while searching for
the end of HTTP headers.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 2cdb5e142f)
[AF: BSC#924018]
Signed-off-by: Andreas Färber <afaerber@suse.de>
The logic for decoding websocket frames wants to fully
decode the frame header and payload, before allowing the
VNC server to see any of the payload data. There is no
size limit on websocket payloads, so this allows a
malicious network client to consume 2^64 bytes in memory
in QEMU. It can trigger this denial of service before
the VNC server even performs any authentication.
The fix is to decode the header, and then incrementally
decode the payload data as it is needed. With this fix
the websocket decoder will allow at most 4k of data to
be buffered before decoding and processing payload.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
[ kraxel: fix frequent spurious disconnects, suggested by Peter Maydell ]
[ kraxel: fix 32bit build ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit a2bebfd6e0)
[AF: BSC#924018]
Signed-off-by: Andreas Färber <afaerber@suse.de>
The following sequence happens:
- the SeaBIOS virtio-blk driver does not support the WCE feature, which
causes QEMU to disable writeback caching
- the Linux virtio-blk driver resets the device, finds WCE is available
but writeback caching is disabled; tells block layer to not send cache
flush commands
- the Linux virtio-blk driver sets the DRIVER_OK bit, which causes
writeback caching to be re-enabled, but the Linux virtio-blk driver does
not know of this side effect and cache flushes remain disabled
The bug is at the third step. If the guest does know about CONFIG_WCE,
QEMU should ignore the WCE feature's state. The guest will control the
cache mode solely using configuration space. This change makes Linux
do flushes correctly, but Linux will keep SeaBIOS's writethrough mode.
Hence, whenever the guest is reset, the cache mode of the disk should
be reset to whatever was specified in the "-drive" option. With this
change, the Linux virtio-blk driver finds that writeback caching is
enabled, and tells the block layer to send cache flush commands
appropriately.
Reported-by: Rusty Russell <rusty@au1.ibm.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
[BR: Fixes bsc#920571]
(cherry picked from commit ef5bc96268)
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
hw/virtio-blk.c
hw/virtio-blk.h
In qemu 1.4.x, when performing migrate_cancel during migration,
if the migrate_fd_cancel in main thread is scheduled to run before
thread buffered_file_thread calls migrate_fd_put_buffer, the s->state
will be modified to MIG_STATE_CANCELLED by main thread, then the
migrate_fd_put_buffer in thread buffered_file_thread will return
-EIO if s->state != MIG_STATE_ACTIVE. This return value will trigger
migrate_fd_error to set s->state = MIG_STATE_ERROR.
The patch fixes this issue.
Signed-off-by: Lin Ma <lma@suse.com>
[AF: BNC#843074]
Signed-off-by: Andreas Färber <afaerber@suse.de>
During migration, the values read from migration stream during ram load
are not validated. Especially offset in host_from_stream_offset() and
also the length of the writes in the callers of said function.
To fix this, we need to make sure that the [offset, offset + length]
range fits into one of the allocated memory regions.
Validating addr < len should be sufficient since data seems to always be
managed in TARGET_PAGE_SIZE chunks.
Fixes: CVE-2014-7840
Note: follow-up patches add extra checks on each block->host access.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
(cherry picked from commit 0be839a270)
Signed-off-by: Alexander Graf <agraf@suse.de>
This is CVE-2014-8106.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit bf25983345)
[AF: BNC#907805]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Issues:
* Doesn't check pitches correctly in case it is negative.
* Doesn't check width at all.
Turn macro into functions while being at it, also factor out the check
for one region which we then can simply call twice for src + dst.
This is CVE-2014-8106.
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d3532a0db0)
[AF: BNC#907805]
Signed-off-by: Andreas Färber <afaerber@suse.de>
on incoming migration do not memset pages to zero if they already read as zero.
this will allocate a new zero page and consume memory unnecessarily. even
if we madvise a MADV_DONTNEED later this will only deallocate the memory
asynchronously.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 211ea74022)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Not sending zero pages breaks migration if a page is zero
at the source but not at the destination. This can e.g. happen
if different BIOS versions are used at source and destination.
It has also been reported that migration on pseries is completely
broken with this patch.
This effectively reverts commit f1c72795af.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
arch_init.c
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 9ef051e553)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
at the beginning of migration all pages are marked dirty and
in the first round a bulk migration of all pages is performed.
currently all these pages are copied to the page cache regardless
of whether they are frequently updated or not. this doesn't make sense
since most of these pages are never transferred again.
this patch changes the XBZRLE transfer to only be used after
the bulk stage has been completed. that means a page is added
to the page cache the second time it is transferred and XBZRLE
can benefit from the third time of transfer.
since the page cache is likely smaller than the number of pages
it's also likely that in the second round the page is missing in the
cache due to collisions in the bulk phase.
on the other hand a lot of unnecessary mallocs, memdups and frees
are saved.
the following results have been taken earlier while executing
the test program from docs/xbzrle.txt. (+) with the patch and (-)
without. (thanks to Eric Blake for reformatting and comments)
+ total time: 22185 milliseconds
- total time: 22410 milliseconds
Shaved 0.3 seconds, better than 1%!
+ downtime: 29 milliseconds
- downtime: 21 milliseconds
Not sure why downtime seemed worse, but probably not the end of the world.
+ transferred ram: 706034 kbytes
- transferred ram: 721318 kbytes
Fewer bytes sent - good.
+ remaining ram: 0 kbytes
- remaining ram: 0 kbytes
+ total ram: 1057216 kbytes
- total ram: 1057216 kbytes
+ duplicate: 108556 pages
- duplicate: 105553 pages
+ normal: 175146 pages
- normal: 179589 pages
+ normal bytes: 700584 kbytes
- normal bytes: 718356 kbytes
Fewer normal bytes...
+ cache size: 67108864 bytes
- cache size: 67108864 bytes
+ xbzrle transferred: 3127 kbytes
- xbzrle transferred: 630 kbytes
...and more compressed pages sent - good.
+ xbzrle pages: 117811 pages
- xbzrle pages: 21527 pages
+ xbzrle cache miss: 18750
- xbzrle cache miss: 179589
And very good improvement on the cache miss rate.
+ xbzrle overflow : 0
- xbzrle overflow : 0
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 5cc11c46cf)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
avoid searching for dirty pages just increment the
page offset. all pages are dirty anyway.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 70c8652bf3)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
during bulk stage of ram migration if a page is a
zero page do not send it at all.
the memory at the destination reads as zero anyway.
even if there is an madvise with QEMU_MADV_DONTNEED
at the target upon receipt of a zero page I have observed
that the target starts swapping if the memory is overcommitted.
it seems that the pages are dropped asynchronously.
this patch also updates QMP to return the number of
skipped pages in MigrationStats.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit f1c72795af)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
the first round of ram transfer is special since all pages
are dirty and thus all memory pages are transferred to
the target. this patch adds a boolean variable to track
this stage.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 78d07ae7ac)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
virtually all dup pages are zero pages. remove
the special is_dup_page() function and use the
optimized buffer_find_nonzero_offset() function
instead.
here buffer_find_nonzero_offset() is used directly
to avoid the unnecssary additional checks in
buffer_is_zero().
raw performace gain checking 1 GByte zeroed memory
over is_dup_page() is approx. 10-12% with SSE2
and 8-10% with unsigned long arithmedtic.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 3edcd7e6eb)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
this adds buffer_find_nonzero_offset() which is a SSE2/Altivec
optimized function that searches for non-zero content in a
buffer.
the function starts full unrolling only after the first few chunks have
been checked one by one. analyzing real memory page data has revealed
that non-zero pages are non-zero within the first 256-512 bits in
most cases. as this function is also heavily used to check for zero memory
pages this tweak has been made to avoid the high setup costs of the fully
unrolled check for non-zero pages.
due to the optimizations used in the function there are restrictions
on buffer address and search length. the function
can_use_buffer_find_nonzero_content() can be used to check if
the function can be used safely.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 41a259bd2b)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
vector optimizations will now be used at various places
not just in is_dup_page() in arch_init.c
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit c61ca00ada)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
The current version of raw-posix always uses ioctl(FS_IOC_FIEMAP) if
FIEMAP is available; lseek with SEEK_HOLE/SEEK_DATA are not even
compiled in in this case. However, there may be implementations which
support the latter but not the former (e.g., NFSv4.2) as well as vice
versa.
To cover both cases, try FIEMAP first (as this will return -ENOTSUP if
not supported instead of returning a failsafe value (everything
allocated as a single extent)) and if that does not work, fall back to
SEEK_HOLE/SEEK_DATA.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 4f11aa8a40)
[BR: BNC#908381]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
block/raw-posix.c
strtoul(l) might overflow, in which case it'll return '-1' and set
the appropriate error code. So update the calls to strtoul(l) when
parsing hex properties to avoid silent overflows.
And we should be using an intermediate variable to avoid clobbering
of the passed-in point on error.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Backported the patch from:
http://lists.nongnu.org/archive/html/qemu-devel/2013-11/msg03950.html
[LM: BNC#852397]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Seabios already added a new device type to halt booting.
Qemu can add "HALT" at the end of bootindex string, then
seabios will halt booting after trying to boot from all
selected devices.
This patch added a new boot option to configure if boot
from un-selected devices.
This option only effects when boot priority is changed by
bootindex options, the old style(-boot order=..) will still
try to boot from un-selected devices.
v2: add HALT entry in get_boot_devices_list()
v3: rebase to latest qemu upstream
Signed-off-by: Amos Kong <akong@redhat.com>
Message-id: 1363674207-31496-1-git-send-email-akong@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit c8a6ae8bb9)
Signed-off-by: Lin Ma <lma@suse.com>
[BR: BNC#900084]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Libvirt has no way to probe if an option or property is supported,
This patch introduces a new qmp command to query command line
option information. hmp command isn't added because it's not needed.
Signed-off-by: Amos Kong <akong@redhat.com>
CC: Luiz Capitulino <lcapitulino@redhat.com>
CC: Osier Yang <jyang@redhat.com>
CC: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 1f8f987d34)
[BR: BNC#899144]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
qapi-schema.json
When guest sends udp packet with source port and source addr 0,
uninitialized socket is picked up when looking for matching and already
created udp sockets, and later passed to sosendto() where NULL pointer
dereference is hit during so->slirp->vnetwork_mask.s_addr access.
Fix this by checking that the socket is not just a socket stub.
This is CVE-2014-3640.
Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Reported-by: Xavier Mehrenberger <xavier.mehrenberger@airbus.com>
Reported-by: Stephane Duverger <stephane.duverger@eads.net>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-id: 20140918063537.GX9321@dhcp-25-225.brq.redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 01f7cecf00)
[AF: BNC#897654]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Correct post load checks:
1. dev->setup_len == sizeof(dev->data_buf)
seems fine, no need to fail migration
2. When state is DATA, passing index > len
will cause memcpy with negative length,
resulting in heap overflow
First of the issues was reported by dgilbert.
Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 719ffe1f5f)
[AF: BNC#878541]
Signed-off-by: Andreas Färber <afaerber@suse.de>
A huge image size could cause s->l1_size to overflow. Make sure that
images never require a L1 table larger than what fits in s->l1_size.
This cannot only cause unbounded allocations, but also the allocation of
a too small L1 table, resulting in out-of-bounds array accesses (both
reads and writes).
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 46485de0cb)
[AF: BNC#877645; error_setg() -> qerror_report()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Too large L2 table sizes cause unbounded allocations. Images actually
created by qemu-img only have 512 byte or 4k L2 tables.
To keep things consistent with cluster sizes, allow ranges between 512
bytes and 64k (in fact, down to 1 entry = 8 bytes is technically
working, but L2 table sizes smaller than a cluster don't make a lot of
sense).
This also means that the number of bytes on the virtual disk that are
described by the same L2 table is limited to at most 8k * 64k or 2^29,
preventively avoiding any integer overflows.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
(cherry picked from commit 42eb58179b)
[AF: BNC#877642; error_setg() -> qerror_report()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Huge values for header.cluster_bits cause unbounded allocations (e.g.
for s->cluster_cache) and crash qemu this way. Less huge values may
survive those allocations, but can cause integer overflows later on.
The only cluster sizes that qemu can create are 4k (for standalone
images) and 512 (for images with backing files), so we can limit it
to 64k.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
(cherry picked from commit 7159a45b2b)
[AF: error_setg() -> qerror_report(), disabled iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
We have a limited number of IRQ routing entries. To ensure that we don't
exceed that number, we flush dynamically created MSI route entries when
we realize that we're running out of space.
However, we count the GSI count incorrectly. This patch adds a safetly net
to make sure we're flushing all dynamic MSI routes when we realize that we
would be exceeding the number space.
Signed-off-by: Alexander Graf <agraf@suse.de>
Different virtual CPUs implement different capabilities of features that
get reflected in XSAVE depth. So instead of passing in the maximum XSAVE
capabilities, we should only tell the guest as much as it has to see for
the respective chosen vcpu type.
This patch is heavily based on commit 2560f19f but adjusted to apply on
our ancient code base.
Signed-off-by: Alexander Graf <agraf@suse.de>
On x86 we expose all KVM PMU capabilities back into KVM. Unfortunately our
vPMU emulation breaks Windows Server 2012 R2.
Because we're lacking support for all the vPMU MSRs anyway, disable all PMU
CPUID flags as well, so that Windows is happy and we don't get a half-way
implemented PMU exposed into our guest.
Signed-off-by: Alexander Graf <agraf@suse.de>
KVM tells us the number of GSIs it can handle inside the kernel. That value is
basically KVM_MAX_IRQ_ROUTES. However when we try to set the GSI mapping table,
it checks for
r = -EINVAL;
if (routing.nr >= KVM_MAX_IRQ_ROUTES)
goto out;
erroring out even when we're only using all of the GSIs. To make sure we never
hit that limit, let's reduce the number of GSIs we get from KVM by one.
Signed-off-by: Alexander Graf <agraf@suse.de>
Malformed input can have config_len in migration stream
exceed the array size allocated on destination, the
result will be heap overflow.
To fix, that config_len matches on both sides.
CVE-2014-0182
Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
--
v2: use %ix and %zx to print config_len values
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit a890a2f913)
[AF: BNC#874788]
Signed-off-by: Andreas Färber <afaerber@suse.de>
CVE-2013-4149 QEMU 1.3.0 out-of-bounds buffer write in
virtio_net_load()@hw/net/virtio-net.c
> } else if (n->mac_table.in_use) {
> uint8_t *buf = g_malloc0(n->mac_table.in_use);
We are allocating buffer of size n->mac_table.in_use
> qemu_get_buffer(f, buf, n->mac_table.in_use * ETH_ALEN);
and read to the n->mac_table.in_use size buffer n->mac_table.in_use *
ETH_ALEN bytes, corrupting memory.
If adversary controls state then memory written there is controlled
by adversary.
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 98f93ddd84)
[AF: BNC#864649]
Signed-off-by: Andreas Färber <afaerber@suse.de>
CVE-2013-4534
opp->nb_cpus is read from the wire and used to determine how many
IRQDest elements to read into opp->dst[]. If the value exceeds the
length of opp->dst[], MAX_CPU, opp->dst[] can be overrun with arbitrary
data from the wire.
Fix this by failing migration if the value read from the wire exceeds
MAX_CPU.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 73d963c0a7)
[AF: BNC#864811; backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
CVE-2013-4537
s->arglen is taken from wire and used as idx
in ssi_sd_transfer().
Validate it before access.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit a9c380db3b)
[AF: BNC#864391]
Signed-off-by: Andreas Färber <afaerber@suse.de>
At the moment we require vmstate definitions to set minimum_version_id_old
to the same value as minimum_version_id if they do not provide a
load_state_old handler. Since the load_state_old functionality is
required only for a handful of devices that need to retain migration
compatibility with a pre-vmstate implementation, this means the bulk
of devices have pointless boilerplate. Relax the definition so that
minimum_version_id_old is ignored if there is no load_state_old handler.
Note that under the old scheme we would segfault if the vmstate
specified a minimum_version_id_old that was less than minimum_version_id
but did not provide a load_state_old function, and the incoming state
specified a version number between minimum_version_id_old and
minimum_version_id. Under the new scheme this will just result in
our failing the migration.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 767adce2d9)
[AF: Backported from vmstate.c]
Signed-off-by: Andreas Färber <afaerber@suse.de>
CVE-2013-4541
s->setup_len and s->setup_index are fed into usb_packet_copy as
size/offset into s->data_buf, it's possible for invalid state to exploit
this to load arbitrary data.
setup_len and setup_index should be checked to make sure
they are not negative.
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 9f8e9895c5)
[AF: BNC#864802]
Signed-off-by: Andreas Färber <afaerber@suse.de>
As the macro verifies the value is positive, rename it
to make the function clearer.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 3476436a44)
[AF: backported (target-arm doesn't use VMState yet)]
Signed-off-by: Andreas Färber <afaerber@suse.de>
CVE-2013-4542
hw/scsi/scsi-bus.c invokes load_request.
virtio_scsi_load_request does:
qemu_get_buffer(f, (unsigned char *)&req->elem, sizeof(req->elem));
this probably can make elem invalid, for example,
make in_num or out_num huge, then:
virtio_scsi_parse_req(s, vs->cmd_vqs[n], req);
will do:
if (req->elem.out_num > 1) {
qemu_sgl_init_external(req, &req->elem.out_sg[1],
&req->elem.out_addr[1],
req->elem.out_num - 1);
} else {
qemu_sgl_init_external(req, &req->elem.in_sg[1],
&req->elem.in_addr[1],
req->elem.in_num - 1);
}
and this will access out of array bounds.
Note: this adds security checks within assert calls since
SCSIBusInfo's load_request cannot fail.
For now simply disable builds with NDEBUG - there seems
to be little value in supporting these.
Cc: Andreas Färber <afaerber@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 3c3ce98142)
[AF: BNC#864804, backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
CVE-2013-4540
Within scoop_gpio_handler_update, if prev_level has a high bit set, then
we get bit > 16 and that causes a buffer overrun.
Since prev_level comes from wire indirectly, this can
happen on invalid state load.
Similarly for gpio_level and gpio_dir.
To fix, limit to 16 bit.
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 52f91c3723)
[AF: BNC#864801]
Signed-off-by: Andreas Färber <afaerber@suse.de>
CVE-2013-4539
s->precision, nextprecision, function and nextfunction
come from wire and are used
as idx into resolution[] in TSC_CUT_RESOLUTION.
Validate after load to avoid buffer overrun.
Cc: Andreas Färber <afaerber@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 5193be3be3)
[AF: BNC#864805]
Signed-off-by: Andreas Färber <afaerber@suse.de>
CVE-2013-4538
s->cmd_len used as index in ssd0323_transfer() to store 32-bit field.
Possible this field might then be supplied by guest to overwrite a
return addr somewhere. Same for row/col fields, which are indicies into
framebuffer array.
To fix validate after load.
Additionally, validate that the row/col_start/end are within bounds;
otherwise the guest can provoke an overrun by either setting the _end
field so large that the row++ increments just walk off the end of the
array, or by setting the _start value to something bogus and then
letting the "we hit end of row" logic reset row to row_start.
For completeness, validate mode as well.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit ead7a57df3)
[AF: BNC#864769]
Signed-off-by: Andreas Färber <afaerber@suse.de>
CVE-2013-4533
s->rx_level is read from the wire and used to determine how many bytes
to subsequently read into s->rx_fifo[]. If s->rx_level exceeds the
length of s->rx_fifo[] the buffer can be overrun with arbitrary data
from the wire.
Fix this by validating rx_level against the size of s->rx_fifo.
Cc: Don Koch <dkoch@verizon.com>
Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Don Koch <dkoch@verizon.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit caa881abe0)
[AF: BNC#864655]
Signed-off-by: Andreas Färber <afaerber@suse.de>
CVE-2013-4535
CVE-2013-4536
Both virtio-block and virtio-serial read,
VirtQueueElements are read in as buffers, and passed to
virtqueue_map_sg(), where num_sg is taken from the wire and can force
writes to indicies beyond VIRTQUEUE_MAX_SIZE.
To fix, validate num_sg.
Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 36cf2a3713)
[AF: BNC#864665]
Signed-off-by: Andreas Färber <afaerber@suse.de>
CVE-2013-6399
vdev->queue_sel is read from the wire, and later used in the
emulation code as an index into vdev->vq[]. If the value of
vdev->queue_sel exceeds the length of vdev->vq[], currently
allocated to be VIRTIO_PCI_QUEUE_MAX elements, subsequent PIO
operations such as VIRTIO_PCI_QUEUE_PFN can be used to overrun
the buffer with arbitrary data originating from the source.
Fix this by failing migration if the value from the wire exceeds
VIRTIO_PCI_QUEUE_MAX.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 4b53c2c72c)
[AF: BNC#864814]
Signed-off-by: Andreas Färber <afaerber@suse.de>
CVE-2013-4531
cpreg_vmstate_indexes is a VARRAY_INT32. A negative value for
cpreg_vmstate_array_len will cause a buffer overflow.
VMSTATE_INT32_LE was supposed to protect against this
but doesn't because it doesn't validate that input is
non-negative.
Fix this macro to valide the value appropriately.
The only other user of VMSTATE_INT32_LE doesn't
ever use negative numbers so it doesn't care.
Reported-by: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit d2ef4b61fe)
[AF: BNC#864796, backported from vmstate.c]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Fix comparison of vmstate_info_int32_le so that it succeeds if loaded
value is (l)ess than or (e)qual
When the comparison succeeds, assign the value loaded
This is a change in behaviour but I think the original intent, since
the idea is to check if the version/size of the thing you're loading is
less than some limit, but you might well want to do something based on
the actual version/size in the file
Fix up comment and name text
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 24a370ef23)
[AF: Backported from vmstate.c]
Signed-off-by: Andreas Färber <afaerber@suse.de>
CVE-2013-4530
pl022.c did not bounds check tx_fifo_head and
rx_fifo_head after loading them from file and
before they are used to dereference array.
Reported-by: Michael S. Tsirkin <mst@redhat.com
Reported-by: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit d8d0a0bc7e)
[AF: BNC#864682]
Signed-off-by: Andreas Färber <afaerber@suse.de>
4) CVE-2013-4529
hw/pci/pcie_aer.c pcie aer log can overrun the buffer if log_num is
too large
There are two issues in this file:
1. log_max from remote can be larger than on local
then buffer will overrun with data coming from state file.
2. log_num can be larger then we get data corruption
again with an overflow but not adversary controlled.
Fix both issues.
Reported-by: Anthony Liguori <anthony@codemonkey.ws>
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 5f691ff91d)
[AF: BNC#864678]
Signed-off-by: Andreas Färber <afaerber@suse.de>
CVE-2013-4527 hw/timer/hpet.c buffer overrun
hpet is a VARRAY with a uint8 size but static array of 32
To fix, make sure num_timers is valid using VMSTATE_VALID hook.
Reported-by: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 3f1c49e213)
[AF: BNC#864673]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Validate state using VMS_ARRAY with num = 0 and VMS_MUST_EXIST
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 4082f0889b)
Signed-off-by: Andreas Färber <afaerber@suse.de>
Can be used to verify a required field exists or validate
state in some other way.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 5bf81c8d63)
[AF: Backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
CVE-2013-4526
Within hw/ide/ahci.c, VARRAY refers to ports which is also loaded. So
we use the old version of ports to read the array but then allow any
value for ports. This can cause the code to overflow.
There's no reason to migrate ports - it never changes.
So just make sure it matches.
Reported-by: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit ae2158ad6c)
[AF: BNC#864671]
Signed-off-by: Andreas Färber <afaerber@suse.de>
CVE-2013-4151 QEMU 1.0 out-of-bounds buffer write in
virtio_load@hw/virtio/virtio.c
So we have this code since way back when:
num = qemu_get_be32(f);
for (i = 0; i < num; i++) {
vdev->vq[i].vring.num = qemu_get_be32(f);
array of vqs has size VIRTIO_PCI_QUEUE_MAX, so
on invalid input this will write beyond end of buffer.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit cc45995294)
[AF: BNC#864653]
Signed-off-by: Andreas Färber <afaerber@suse.de>
CVE-2013-4150 QEMU 1.5.0 out-of-bounds buffer write in
virtio_net_load()@hw/net/virtio-net.c
This code is in hw/net/virtio-net.c:
if (n->max_queues > 1) {
if (n->max_queues != qemu_get_be16(f)) {
error_report("virtio-net: different max_queues ");
return -1;
}
n->curr_queues = qemu_get_be16(f);
for (i = 1; i < n->curr_queues; i++) {
n->vqs[i].tx_waiting = qemu_get_be32(f);
}
}
Number of vqs is max_queues, so if we get invalid input here,
for example if max_queues = 2, curr_queues = 3, we get
write beyond end of the buffer, with data that comes from
wire.
This might be used to corrupt qemu memory in hard to predict ways.
Since we have lots of function pointers around, RCE might be possible.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit eea750a562)
[AF: BNC#864650]
Signed-off-by: Andreas Färber <afaerber@suse.de>
CVE-2013-4148 QEMU 1.0 integer conversion in
virtio_net_load()@hw/net/virtio-net.c
Deals with loading a corrupted savevm image.
> n->mac_table.in_use = qemu_get_be32(f);
in_use is int so it can get negative when assigned 32bit unsigned value.
> /* MAC_TABLE_ENTRIES may be different from the saved image */
> if (n->mac_table.in_use <= MAC_TABLE_ENTRIES) {
passing this check ^^^
> qemu_get_buffer(f, n->mac_table.macs,
> n->mac_table.in_use * ETH_ALEN);
with good in_use value, "n->mac_table.in_use * ETH_ALEN" can get
positive and bigger than mac_table.macs. For example 0x81000000
satisfies this condition when ETH_ALEN is 6.
Fix it by making the value unsigned.
For consistency, change first_multi as well.
Note: all call sites were audited to confirm that
making them unsigned didn't cause any issues:
it turns out we actually never do math on them,
so it's easy to validate because both values are
always <= MAC_TABLE_ENTRIES.
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 71f7fe48e1)
[AF: BNC#864812; backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
When VM guest programs multicast addresses for
a virtio net card, it supplies a 32 bit
entries counter for the number of addresses.
These addresses are read into tail portion of
a fixed macs array which has size MAC_TABLE_ENTRIES,
at offset equal to in_use.
To avoid overflow of this array by guest, qemu attempts
to test the size as follows:
- if (in_use + mac_data.entries <= MAC_TABLE_ENTRIES) {
however, as mac_data.entries is uint32_t, this sum
can overflow, e.g. if in_use is 1 and mac_data.entries
is 0xffffffff then in_use + mac_data.entries will be 0.
Qemu will then read guest supplied buffer into this
memory, overflowing buffer on heap.
CVE-2014-0150
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1397218574-25058-1-git-send-email-mst@redhat.com
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit edc2438512)
[AF: Resolves BNC#873235; backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
The SMART self test counter was incorrectly being reset to zero,
not 1. This had the effect that on every 21st SMART EXECUTE OFFLINE:
* We would write off the beginning of a dynamically allocated buffer
* We forgot the SMART history
Fix this.
Signed-off-by: Benoit Canet <benoit@irqsave.net>
Message-id: 1397336390-24664-1-git-send-email-benoit.canet@irqsave.net
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Cc: qemu-stable@nongnu.org
Acked-by: Kevin Wolf <kwolf@redhat.com>
[PMM: tweaked commit message as per suggestions from Markus]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 940973ae0b)
[AF: Addresses CVE-2014-2894; backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
There are several several issues in the current checking:
- The check was based on the minus of unsigned values which can overflow
- It was done after .{set|get}_config() which can lead crash when config_len
is zero since vdev->config is NULL
Fix this by:
- Validate the address in virtio_pci_config_{read|write}() before
.{set|get}_config
- Use addition instead minus to do the validation
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Petr Matousek <pmatouse@redhat.com>
Message-id: 1367905369-10765-1-git-send-email-jasowang@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 5f5a131865)
[AF: BNC#817593, CVE-2013-2016; backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
This avoids a possible division by zero.
Convert s->tracks to unsigned as well because it feels better than
surviving just because the results of calculations with s->tracks are
converted to unsigned anyway.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 9302e863aa)
[AF: BNC#870439; error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
The first test case would cause a huge memory allocation, leading to a
qemu abort; the second one to a too small malloc() for the catalog
(smaller than s->catalog_size), which causes a read-only out-of-bounds
array access and on big endian hosts an endianess conversion for an
undefined memory area.
The sample image used here is not an original Parallels image. It was
created using an hexeditor on the basis of the struct that qemu uses.
Good enough for trying to crash the driver, but not for ensuring
compatibility.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit afbcc40bee)
[AF: BNC#870439; error_setg() -> qerror_report(), dropped iotests]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Even with a limit of 64k snapshots, each snapshot could have a filename
and an ID with up to 64k, which would still lead to pretty large
allocations, which could potentially lead to qemu aborting. Limit the
total size of the snapshot table to an average of 1k per entry when
the limit of 64k snapshots is fully used. This should be plenty for any
reasonable user.
This also fixes potential integer overflows of s->snapshot_size.
Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 5dae6e30c5)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
This avoids an unbounded allocation.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6a83f8b5be)
[AF: BNC#870439; rebased on report_unsupported(),
error_setg() -> error_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
For the L1 table to loaded for an internal snapshot, the code allocated
only enough memory to hold the currently active L1 table. If the
snapshot's L1 table is actually larger than the current one, this leads
to a buffer overflow.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit c05e4667be)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
The qcow2 code assumes that s->snapshots is non-NULL if s->nb_snapshots
!= 0. By having the initialisation of both fields separated in
qcow2_open(), any error occuring in between would cause the error path
to dereference NULL in qcow2_free_snapshots() if the image had any
snapshots.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 11b128f406)
[AF: BNC#870439; dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
bs->total_sectors is not the highest possible sector number that could
be involved in a copy on write operation: VM state is after the end of
the virtual disk. This resulted in wrong values for the number of
sectors to be copied (n).
The code that checks for the end of the image isn't required any more
because the code hasn't been calling the block layer's bdrv_read() for a
long time; instead, it directly calls qcow2_readv(), which doesn't error
out on VM state sector numbers.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6b7d4c5558)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Limiting the size of a single request to INT_MAX not only fixes a
direct integer overflow in bdrv_check_request() (which would only
trigger bad behaviour with ridiculously huge images, as in close to
2^64 bytes), but can also prevent overflows in all block drivers.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 8f4754ede5)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Both compressed and uncompressed I/O is buffered. dmg_open() calculates
the maximum buffer size needed from the metadata in the image file.
There is currently a buffer overflow since ->lengths[] is accounted
against the maximum compressed buffer size but actually uses the
uncompressed buffer:
switch (s->types[chunk]) {
case 1: /* copy */
ret = bdrv_pread(bs->file, s->offsets[chunk],
s->uncompressed_chunk, s->lengths[chunk]);
We must account against the maximum uncompressed buffer size for type=1
chunks.
This patch fixes the maximum buffer size calculation to take into
account the chunk type. It is critical that we update the correct
maximum since there are two buffers ->compressed_chunk and
->uncompressed_chunk.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f0dce23475)
[AF: BNC#870439; rebased]
Signed-off-by: Andreas Färber <afaerber@suse.de>
The DMG metadata is stored as uint64_t, so use the same type for
sector_num. int was a particularly poor choice since it is only 32-bit
and would truncate large values.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 686d7148ec)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Chunk length and sectorcount are used for decompression buffers as well
as the bdrv_pread() count argument. Ensure that they have reasonable
values so neither memory allocation nor conversion from uint64_t to int
will cause problems.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit c165f77580)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Use the right types instead of signed int:
size_t new_size;
This is a byte count for g_realloc() that is calculated from uint32_t
and size_t values.
uint32_t chunk_count;
Use the same type as s->n_chunks, which is used together with
chunk_count.
This patch is a cleanup and does not fix bugs.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit eb71803b04)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
It is not necessary to check errno for EINTR and the block layer does
not produce short reads. Therefore we can drop the loop that attempts
to read a compressed chunk.
The loop is buggy because it incorrectly adds the transferred bytes
twice:
do {
ret = bdrv_pread(...);
i += ret;
} while (ret >= 0 && ret + i < s->lengths[chunk]);
Luckily we can drop the loop completely and perform a single
bdrv_pread().
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit b404bf8542)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
When a terminator is reached the base for offsets and sectors is stored.
The following records that are processed will use this base value.
If the first record we encounter is a terminator, then calculating the
base values would result in out-of-bounds array accesses. Don't do
that.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 73ed27ec28)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Clean up the mix of tabs and spaces, as well as the coding style
violations in block/dmg.c. There are no semantic changes since this
patch simply reformats the code.
This patch is necessary before we can make meaningful changes to this
file, due to the inconsistent formatting and confusing indentation.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 2c1885adcf)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
The size in bytes is assigned to an int later, so check that instead of
the number of entries.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit cab60de930)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
This catches the situation that is described in the bug report at
https://bugs.launchpad.net/qemu/+bug/865518 and goes like this:
$ qemu-img create -f qcow2 huge.qcow2 $((1024*1024))T
Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off
$ qemu-io /tmp/huge.qcow2 -c "write $((1024*1024*1024*1024*1024*1024 - 1024)) 512"
Segmentation fault
With this patch applied the segfault will be avoided, however the case
will still fail, though gracefully:
$ qemu-img create -f qcow2 /tmp/huge.qcow2 $((1024*1024))T
Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off
qemu-img: The image size is too large for file format 'qcow2'
Note that even long before these overflow checks kick in, you get
insanely high memory usage (up to INT_MAX * sizeof(uint64_t) = 16 GB for
the L1 table), so with somewhat smaller image sizes you'll probably see
qemu aborting for a failed g_malloc().
If you need huge image sizes, you should increase the cluster size to
the maximum of 2 MB in order to get higher limits.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 2cf7cfa1cd)
Signed-off-by: Andreas Färber <afaerber@suse.de>
If the size becomes larger than what qcow2_open() would accept, fail the
growing operation.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 2b5d5953ee)
[AF: BNC#870439; rebased on report_unsupported()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
This ensures that the checks catch all invalid cluster indexes
instead of returning the refcount of a wrong cluster.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit db8a31d11d)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
free_cluster_index is only correct if update_refcount() was called from
an allocation function, and even there it's brittle because it's used to
protect unfinished allocations which still have a refcount of 0 - if it
moves in the wrong place, the unfinished allocation can be corrupted.
So not using it any more seems to be a good idea. Instead, use the
first requested cluster to do the calculations. Return -EAGAIN if
unfinished allocations could become invalid and let the caller restart
its search for some free clusters.
The context of creating a snapsnot is one situation where
update_refcount() is called outside of a cluster allocation. For this
case, the change fixes a buffer overflow if a cluster is referenced in
an L2 table that cannot be represented by an existing refcount block.
(new_table[refcount_table_index] was out of bounds)
[Bump the qemu-iotests 026 refblock_alloc.write leak count from 10 to
11.
--Stefan]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit b106ad9185)
[AF: BNC#870439; dropped QCOW2_DISCARD_NEVER argument to
qcow2_free_clusters() and update_refcount(), dropped iotests]
Signed-off-by: Andreas Färber <afaerber@suse.de>
len could become negative and would pass the check then. Nothing bad
happened because bdrv_pread() happens to return an error for negative
length values, but make variables for sizes unsigned anyway.
This patch also changes the behaviour to error out on invalid lengths
instead of silently truncating it to 1023.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6d33e8e7dc)
[AF: BNC#870439; error_setg() -> report_unsupported(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
This avoids an unbounded allocation.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 2d51c32c4b)
[AF: BNC#870439; error_setg() -> report_unsupported(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
This avoid unbounded memory allocation and fixes a potential buffer
overflow on 32 bit hosts.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ce48f2f441)
[AF: BNC#870439; error_setg() -> report_unsupported(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
The end of the refcount table must not exceed INT64_MAX so that integer
overflows are avoided.
Also check for misaligned refcount table. Such images are invalid and
probably the result of data corruption. Error out to avoid further
corruption.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 8c7de28305)
[AF: BNC#870439; error_setg() -> report_unsupported(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Limit the in-memory reference count table size to 8 MB, it's enough in
practice. This fixes an unbounded allocation as well as a buffer
overflow in qcow2_refcount_init().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 5dab2faddc)
[AF: BNC#870439; error_setg() -> report_unsupported(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Header, header extension and the backing file name must all be stored in
the first cluster. Setting the backing file to a much higher value
allowed header extensions to become much bigger than we want them to be
(unbounded allocation).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit a1b3955c94)
[AF: BNC#870439; error_setg() -> report_unsupported(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
This fixes an unbounded allocation for s->unknown_header_fields.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 24342f2cae)
[AF: BNC#870439; error_setg() -> unsupported_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
curl_read_cb is callback function for libcurl when data arrives. The
data size passed in here is not guaranteed to be within the range of
request we submitted, so we may overflow the guest IO buffer. Check the
real size we have before memcpy to buffer to avoid overflow.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6d4b9e55fc)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
The maximum blocks_in_image is 0xffffffff / 4, which also limits the
maximum disk_size for a VDI image to 1024TB. Note that this is the maximum
size that QEMU will currently support with this driver, not necessarily the
maximum size allowed by the image format.
This also fixes an incorrect error message, a bug introduced by commit
5b7aa9b56d (Reported by Stefan Weil)
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 63fa06dc97)
[AF: BNC#870439; changed error_setg() -> logout()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
This fixes some cases of division by zero crashes.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 5e71dfad76)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
This adds checks to make sure that max_table_entries and block_size
are in sane ranges. Memory is allocated based on max_table_entries,
and block_size is used to calculate indices into that allocated
memory, so if these values are incorrect that can lead to potential
unbounded memory allocation, or invalid memory accesses.
Also, the allocation of the pagetable is changed from g_malloc0()
to qemu_blockalign().
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 97f1c45c6f)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
This fixes two possible division by zero crashes: In bochs_open() and in
seek_to_sector().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 8e53abbc20)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
It should neither become negative nor allow unbounded memory
allocations. This fixes aborts in g_malloc() and an s->catalog_bitmap
buffer overflow on big endian hosts.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e3737b820b)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Gets us rid of integer overflows resulting in negative sizes which
aren't correctly checked.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 246f65838d)
[AF: BNC#870439; dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
This is an on-disk structure, so offsets must be accurate.
Before this patch, sizeof(bochs) != sizeof(header_v1), which makes the
memcpy() between both invalid. We're lucky enough that the destination
buffer happened to be the larger one, and the memcpy size to be taken
from the smaller one, so we didn't get a buffer overflow in practice.
This patch unifies the both structures, eliminating the need to do a
memcpy in the first place. The common fields are extracted to the top
level of the struct and the actually differing part gets a union of the
two versions.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 3dd8a6763b)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
cloop stores the number of compressed blocks in the n_blocks header
field. The file actually contains n_blocks + 1 offsets, where the extra
offset is the end-of-file offset.
The following line in cloop_read_block() results in an out-of-bounds
offsets[] access:
uint32_t bytes = s->offsets[block_num + 1] - s->offsets[block_num];
This patch allocates and loads the extra offset so that
cloop_read_block() works correctly when the last block is accessed.
Notice that we must free s->offsets[] unconditionally now since there is
always an end-of-file offset.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 42d43d35d9)
[AF: BNC#870439; dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
The offsets[] array allows efficient seeking and tells us the maximum
compressed data size. If the offsets are bogus the maximum compressed
data size will be unrealistic.
This could cause g_malloc() to abort and bogus offsets mean the image is
broken anyway. Therefore we should refuse such images.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f56b9bc3ae)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Limit offsets_size to 512 MB so that:
1. g_malloc() does not abort due to an unreasonable size argument.
2. offsets_size does not overflow the bdrv_pread() int size argument.
This limit imposes a maximum image size of 16 TB at 256 KB block size.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 7b103b36d6)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
The following integer overflow in offsets_size can lead to out-of-bounds
memory stores when n_blocks has a huge value:
uint32_t n_blocks, offsets_size;
[...]
ret = bdrv_pread(bs->file, 128 + 4, &s->n_blocks, 4);
[...]
s->n_blocks = be32_to_cpu(s->n_blocks);
/* read offsets */
offsets_size = s->n_blocks * sizeof(uint64_t);
s->offsets = g_malloc(offsets_size);
[...]
for(i=0;i<s->n_blocks;i++) {
s->offsets[i] = be64_to_cpu(s->offsets[i]);
offsets_size can be smaller than n_blocks due to integer overflow.
Therefore s->offsets[] is too small when the for loop byteswaps offsets.
This patch refuses to open files if offsets_size would overflow.
Note that changing the type of offsets_size is not a fix since 32-bit
hosts still only have 32-bit size_t.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 509a41bab5)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Avoid unbounded s->uncompressed_block memory allocation by checking that
the block_size header field has a reasonable value. Also enforce the
assumption that the value is a non-zero multiple of 512.
These constraints conform to cloop 2.639's code so we accept existing
image files.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit d65f97a82c)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
r->buf is hardcoded to 2056 which is (256 + 1) * 8, allowing 256 luns at
most. If more than 256 luns are specified by user, we have buffer
overflow in scsi_target_emulate_report_luns.
To fix, we allocate the buffer dynamically.
Signed-off-by: Asias He <asias@redhat.com>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 846424350b)
Signed-off-by: Andreas Färber <afaerber@suse.de>
We have to set the cssid to 0, otherwise the stsch code will
return an operand exception without the m bit. In the same way
we should set m=0.
This case was triggered in some cases during reboot, if for some
reason the location of blk_schid.cssid contains 1 and m was 0.
Turns out that the qemu elf loader does not zero out the bss section
on reboot.
The symptom was an dump of the old kernel with several areas
overwritten. The bootloader does not register a program check
handler, so bios exception jumped back into the old kernel.
Lets just use a local struct with a designed initializer. That
will guarantee that all other subelements are initialized to 0.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 5d739a4787)
Signed-off-by: Andreas Färber <afaerber@suse.de>
The current code does not initialize next_idx in the virtio ring.
As the ccw bios will always use guest memory at a fixed location,
this queue might != 0 after a reboot.
Lets make the initialization explicit.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit d1028f1b5b)
Signed-off-by: Andreas Färber <afaerber@suse.de>
The guest side must not manipulate the index for the used buffers. Instead,
remember the state of the used buffer locally and wait until it has moved.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
(cherry picked from commit 441ea695f9)
Signed-off-by: Andreas Färber <afaerber@suse.de>
With the ccw ipl code sometimes an error message like
"virtio: trying to map MMIO memory" or
"Guest moved used index from %u to %u" appeared. Turns out
that the ccw bios did not zero out the vring, which might
cause stale values in avail->idx and friends, especially
on reboot.
Lets zero out the relevant fields. To activate the patch we
need to rebuild s390-ccw.img as well.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1369309901-418-1-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 39c93c67c5)
Signed-off-by: Andreas Färber <afaerber@suse.de>
Use the passed device, if there is no device, use the first applicable device.
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit ff151f4ec9)
Signed-off-by: Andreas Färber <afaerber@suse.de>
- Use tpi + tsch to get interrupts.
- Return an error if the irb indicates problems.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 776e7f0f21)
Signed-off-by: Andreas Färber <afaerber@suse.de>
stsch is the canonical way to detect devices. As a bonus, we can
abort the loop if we get cc 3, and we need to check only the valid
devices (dnv set).
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 22d67ab55a)
Signed-off-by: Andreas Färber <afaerber@suse.de>
Lets fix this gcc warning:
virtio.c: In function ‘vring_send_buf’:
virtio.c:125:35: error: operation on ‘vr->next_idx’ may be undefined
[-Werror=sequence-point]
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit dc03640b58)
Signed-off-by: Andreas Färber <afaerber@suse.de>
We have to call strip with s390-ccw.elf as input and
s390-ccw.img as output
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 6328801f19)
Signed-off-by: Andreas Färber <afaerber@suse.de>
dont waste cpu power on an error condition. Lets stop the guest
with a disabled wait.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 7f61cbc108)
Signed-off-by: Andreas Färber <afaerber@suse.de>
This patch adds a makefile, so we can build our ccw firmware. Also
add the resulting binaries to .gitignore, so that nobody is annoyed
they might be in the tree.
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit b462fcd57c)
Signed-off-by: Andreas Färber <afaerber@suse.de>
On s390, there is an architected boot map format that we can read to
boot a certain entry off the disk. Implement a simple reader for this
that always boots the first (default) entry.
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 685d49a63e)
Signed-off-by: Andreas Färber <afaerber@suse.de>
Like all great programs, we have to call between different functions in
different object files. And all of them need a common ground of defines.
Provide a file that provides these defines.
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit c9c39d3b5e)
Signed-off-by: Andreas Färber <afaerber@suse.de>
In order to boot, we need to be able to access a virtio-blk device through
the CCW bus. Implement support for this.
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 1e17c2c15b)
Signed-off-by: Andreas Färber <afaerber@suse.de>
In order to communicate with the user, we need an I/O mechanism that he
can read. Implement SCLP ASCII support, which happens to be the default
in the s390 ccw machine.
This file is missing read support for now. It can only print messages.
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 0369b2eb07)
Signed-off-by: Andreas Färber <afaerber@suse.de>
This C file is the main driving piece of the s390 ccw firmware. It
provides a search for a workable block device, sets it as the default
to boot off of and boots from it.
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 92f2ca38b0)
Signed-off-by: Andreas Färber <afaerber@suse.de>
We want to write most of our code in C, so add a small assembly
stub that jumps straight into C code for us to continue booting.
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 80fea6e893)
Signed-off-by: Andreas Färber <afaerber@suse.de>
Our current VNC code does not handle color maps (aka non-true-color) at all
and aborts if a client requests them. There are 2 major issues with this:
1) A VNC viewer on an 8-bit X11 system may request color maps
2) RealVNC _always_ starts requesting color maps, then moves on to full color
In order to support these 2 use cases, let's just create a fake color map
that covers exactly our normal true color 8 bit color space. That way we don't
lose anything over a client that wants true color.
Reported-by: Sascha Wehnert <swehnert@suse.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
qemu-kvm 0.15 uses the same GPE format as qemu 1.4, but as version 2
rather than 3.
Addresses part of BNC#812836.
Signed-off-by: Andreas Färber <afaerber@suse.de>
qemu-kvm 0.15 had a VMSTATE_UINT32(flags, PITState) field that
qemu 1.4 does not have.
Addresses part of BNC#812836.
Signed-off-by: Andreas Färber <afaerber@suse.de>
qemu-kvm.git commit a7fe0297840908a4fd65a1cf742481ccd45960eb
(Extend vram size to 16MB) deviated from qemu.git since kvm-61, and only
in commit 9e56edcf8d (vga: raise default
vgamem size) did qemu.git adjust the VRAM size for v1.2.
Add compatibility properties so that up to and including pc-0.15 we
maintain migration compatibility with qemu-kvm rather than QEMU and
from pc-1.0 on with QEMU (last qemu-kvm release was 1.2).
Addresses part of BNC#812836.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Analogously to other NICs, we have to inform the network layer when
the can_receive handler will no longer report 0. Without this, we may
get stuck waiting on queued incoming packets.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ee76c1f821)
Signed-off-by: Andreas Färber <afaerber@suse.de>
Net queues support efficient "receive disable". For example, tap's file
descriptor will not be polled while its peer has receive disabled. This
saves CPU cycles for needlessly copying and then dropping packets which
the peer cannot receive.
rtl8139 is missing the qemu_flush_queued_packets() call that wakes the
queue up when receive becomes possible again.
As a result, the Windows 7 guest driver reaches a state where the
rtl8139 cannot receive packets. The driver has actually refilled the
receive buffer but we never resume reception.
The bug can be reproduced by running a large FTP 'get' inside a Windows
7 guest:
$ qemu -netdev tap,id=tap0,...
-device rtl8139,netdev=tap0
The Linux guest driver does not trigger the bug, probably due to a
different buffer management strategy.
Reported-by: Oliver Francke <oliver.francke@filoo.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 00b7ade807)
Signed-off-by: Andreas Färber <afaerber@suse.de>
The latest ipl code adaptions collided with some of the virtio
refactoring rework. This resulted in always booting the first
disk. Let's fix booting from a given ID.
The new code also checks for command lines without bootindex to
avoid random behaviour when accessing dev_st (==0).
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 5c8ded6ef5)
[AF: virtio refactoring not applicable, revert dev_st cast change]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Interpretation of the ccws to register (configuration) indicators contained
a thinko: We want to disallow reading from 0, but setting the indicator
pointer to 0 is fine.
Let's fix the handling for CCW_CMD_SET{,_CONF}_IND.
Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit d1db1fa8df)
Signed-off-by: Andreas Färber <afaerber@suse.de>
If no kernel IPL entry is specified, boot the bios and pass if available
device information for the first boot device (as given by the boot index).
The provided information will be used in the next commit from the BIOS.
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit ba1509c0a9)
Signed-off-by: Andreas Färber <afaerber@suse.de>
Instead of manually parsing the boot_list as character stream,
we can access the nth boot device, specified by the position in the
boot order.
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 7dc5af5545)
Signed-off-by: Andreas Färber <afaerber@suse.de>
Since we now have working firmware for s390-ccw in the tree, we can
default to it on our s390-ccw machine, rendering it more useful.
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit ba747cc8f3)
Signed-off-by: Andreas Färber <afaerber@suse.de>
We have a virtio-s390 and a virtio-ccw machine in QEMU. Both use vastly
different ways to do I/O. Having the same firmware blob for both doesn't
really make any sense.
Instead, let's parametrize the firmware file name, so that we can have
different blobs for different machines.
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit d0249ce5a8)
Signed-off-by: Andreas Färber <afaerber@suse.de>
Our firmware blob is always a raw file that we load at a fixed address today.
Support loading an ELF blob instead that we can map high up in memory.
This way we don't have to be so conscious about size constraints.
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 3325995640)
Signed-off-by: Andreas Färber <afaerber@suse.de>
We can have different load addresses for different blobs we boot with.
Make the reset IP dynamic, so that we can handle things more flexibly.
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 74ad2d22c1)
Signed-off-by: Andreas Färber <afaerber@suse.de>
Before, we reserved streams on request submission, going into a potential
endless loop if we run more than 4 parallel requests. There's no need to.
The decoding callback is synchronized already.
Signed-off-by: Alexander Graf <agraf@suse.de>
Yikes - when running on big endian systems, we had some serious bugs
exposed. This patch gets us rolling on those again.
Signed-off-by: Alexander Graf <agraf@suse.de>
Added TLS support to the VNC QEMU Websockets implementation.
VNC-TLS needs to be enabled for this feature to be used.
The required certificates are specified as in case of VNC-TLS
with the VNC parameter "x509=<path>".
If the server certificate isn't signed by a rooth authority it needs to
be manually imported in the browser because at least in case of Firefox
and Chrome there is no user dialog, the connection just gets canceled.
As a side note VEncrypt over Websocket doesn't work atm because TLS can't
be stacked in the current implementation. (It also didn't work before)
Nevertheless to my knowledge there is no HTML 5 VNC client which supports
it and the Websocket connection can be encrypted with regular TLS now so
it should be fine for most use cases.
Signed-off-by: Tim Hardeck <thardeck@suse.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Message-id: 1366727581-5772-1-git-send-email-thardeck@suse.de
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 0057a0d590)
[BNC#821819 / FATE#315032]
Signed-off-by: Tim Hardeck <thardeck@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Allow for guests with higher amounts of ram. The current thought
is that 2TB specified on qemu commandline would be an appropriate
limit. Note that this requires the next higher bit value since
the highest address is actually more than 2TB due to the pci
memory hole.
Signed-off-by: Bruce Rogers <brogers@suse.com>
We were missing a bunch of feature lists. Fix this by simply dumping
the meta list feature_word_info.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 3af60be28c)
Signed-off-by: Andreas Färber <afaerber@suse.de>
kvm_enabled() cannot be true at this point because accelerators are
initialized much later during init. Also, hiding this makes it very hard
to discover for users. Simply dump unconditionally if CONFIG_KVM is set.
Add explanation for "host" CPU type.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 21ad77892d)
Signed-off-by: Andreas Färber <afaerber@suse.de>
Make virtio-rng devices available for s390-ccw-virtio machines.
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
We don't want to confuse users by offering the legacy, broken
machine type -M s390-virtio. Let's just not include the machine
description in the first place..
Signed-off-by: Alexander Graf <agraf@suse.de>
When spawning a drive with -drive if=virtio on s390x, we want to
create a ccw device by default, as that is our default machine.
Signed-off-by: Alexander Graf <agraf@suse.de>
When running with the s390-ccw machine, we need to make sure we
spawn virtio-ccw devices for -net and -drive. Change the aliases
accordingly.
Signed-off-by: Alexander Graf <agraf@suse.de>
Virtio-Console can only process one character at a time. Using it on S390
gave me strage "lags" where I got the character I pressed before when
pressing one. So I typed in "abc" and only received "a", then pressed "d"
but the guest received "b" and so on.
While the stdio driver calls a poll function that just processes on its
queue in case virtio-console can't take multiple characters at once, the
muxer does not have such callbacks, so it can't empty its queue.
To work around that limitation, I introduced a new timer that only gets
active when the guest can not receive any more characters. In that case
it polls again after a while to check if the guest is now receiving input.
This patch fixes input when using -nographic on s390 for me.
On our S390x Virtio machine we don't have anywhere to display early printks
on, because we don't know about VGA or serial ports.
So instead we just forward everything to the virtio console that we created
anyways.
Signed-off-by: Alexander Graf <agraf@suse.de>
Conflicts:
hw/s390-virtio.c
For SLES we want users to be able to use large memory configurations
with KVM without fiddling with ulimit -Sv.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Some termcaps (found using SLES11SP1) use [? sequences. According to man
console_codes (http://linux.die.net/man/4/console_codes) the question mark
is a nop and should simply be ignored.
This patch does exactly that, rendering screen output readable when
outputting guest serial consoles to the graphical console emulator.
Signed-off-by: Alexander Graf <agraf@suse.de>
Tar is a very widely used format to store data in. Sometimes people even put
virtual machine images in there.
So it makes sense for qemu to be able to read from tar files. I implemented a
written from scratch reader that also knows about the GNU sparse format, which
is what pigz creates.
This version checks for filenames that end on well-known extensions. The logic
could be changed to search for filenames given on the command line, but that
would require changes to more parts of qemu.
The tar reader in conjunctiuon with dzip gives us the chance to download
tar'ed up virtual machine images (even via http) and instantly make use of
them.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Bruce Rogers <brogers@novell.com>
Conflicts:
block/Makefile.objs
DictZip is an extension to the gzip format that allows random seeks in gzip
compressed files by cutting the file into pieces and storing the piece offsets
in the "extra" header of the gzip format.
Thanks to that extension, we can use gzip compressed files as block backend,
though only in read mode.
This makes a lot of sense when stacked with tar files that can then be shipped
to VM users. If a VM image is inside a tar file that is inside a DictZip
enabled gzip file, the user can run the tar.gz file as is without having to
extract the image first.
Tar patch follows.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Bruce Rogers <brogers@novell.com>
Conflicts:
block/Makefile.objs
IABR SPR is already registered in gen_spr_603(), called from init_proc_603E().
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
With mapped security models like mapped-xattr and mapped-file, we save the
symlink target as file contents. Now if we ever expose a normal directory
with mapped security model and find real symlinks in export path, never
follow them and return proper error.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When guest tries to chmod a block or char device file over 9pfs,
the qemu process segfaults. With 9p2000.u protocol we use wstat to
change mode bits and client don't send extension information for
chmod. We need to check for size field to check whether extension
info is present or not.
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Acked-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The existing bdrv_co_flush_to_disk implementation uses rbd_flush(),
which is sychronous and causes the main qemu thread to block until it
is complete. This results in unresponsiveness and extra latency for
the guest.
Fix this by using an asynchronous version of flush. This was added to
librbd with a special #define to indicate its presence, since it will
be backported to stable versions. Thus, there is no need to check the
version of librbd.
Implement this as bdrv_aio_flush, since it matches other aio functions
in the rbd block driver, and leave out bdrv_co_flush_to_disk when the
asynchronous version is available.
Reported-by: Oliver Francke <oliver@filoo.de>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit dc7588c1eb)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If zero clusters are erroneously treated as unallocated, "qemu-img rebase"
will copy the backing file's contents onto the cluster.
The bug existed also in image streaming, but since the root cause was in
qcow2's is_allocated implementation it is enough to test it with qemu-img.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit acbf30ec60)
Conflicts:
tests/qemu-iotests/group
* fixed up to account for tests 48/49 being missing from 1.4
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Because dev->actual is uint32_t, the expression 'dev->actual <<
VIRTIO_BALLOON_PFN_SHIFT' is truncated to 32 bits. This overflows when
dev->actual >= 1048576.
To reproduce:
1. Start a VM with a QMP socket and 5G of RAM
2. Connect to the QMP socket, negotiate capabilities and issue:
{ "execute":"balloon", "arguments": { "value": 1073741824 } }
3. Watch for BALLOON_CHANGE QMP events, the last one will incorretly be:
{ "timestamp": { "seconds": 1366228965, "microseconds": 245466 },
"event": "BALLOON_CHANGE", "data": { "actual": 5368709120 } }
To fix it this commit casts it to ram_addr_t, which is ram_size's type.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit dcc6ceffc0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
These are needed for any of the Win32 alarm timer implementations.
They are not tied to mmtimer exclusively.
Jacob tested this patch with both mmtimer and Win32 timers.
Cc: qemu-stable@nongnu.org
Tested-by: Jacob Kroon <jacob.kroon@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
(cherry picked from commit 0727b86754)
Conflicts:
os-win32.c
* updated to retain cpu affinity settings for 1.4
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
This is a back port of 7c2acc7062 to the
1.4 stable branch without needing the new error_exit() function.
configure: Don't fall back to gthread coroutine backend
The gthread coroutine backend is broken and does not produce a working
QEMU; it is only useful for some very limited debugging situations.
Clean up the backend selection logic in configure so that it now runs
"if on windows use windows; else prefer ucontext; else sigaltstack".
To do this we refactor the configure code to separate out "test
whether we have a working ucontext", "pick a default if user didn't
specify" and "validate that user didn't specify something invalid",
rather than having all three of these run together. We also simplify
the Makefile logic so it just links in the backend the configure
script selects.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1365419487-19867-3-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Brad Smith <brad@comstyle.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If no client is connected on the src side, then we won't receive a
parser during migrate, in this case usbredir_post_load() should be a nop,
rather then to try to derefefence the NULL dev->parser pointer.
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 3713e1485e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
LC_ALL=C makeinfo --no-headers --no-split --number-sections --html qemu-doc.texi -o qemu-doc.html
./qemu-options.texi:1521: unknown command `list'
./qemu-options.texi:1521: table requires an argument: the formatter for @item
./qemu-options.texi:1521: warning: @table has text but no @item
This is for 1.4 stable only; master isn't affected, as it was fixed by
another commit (which isn't appropriate for stable):
commit 5d6768e3b8
Author: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Date: Fri Feb 22 12:39:51 2013 +0900
sheepdog: accept URIs
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
In Windows guests this may make a difference.
Since the original patch (commit c689b4f1) sought to be pedantic and to
consider theoretical corner cases of portability, we should fix it up
where it failed to come through in that pursuit.
Suggested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit 8fe6bbca71)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The (unsafe) function cpu_unlink_tb() is now unused, so we can simply
remove it and any code that was only used by it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit 3a808cc407)
Conflicts:
translate-all.c
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Fix some of the nasty TCG race conditions and crashes by implementing
cpu_exit() as setting a flag which is checked at the start of each TB.
This avoids crashes if a thread or signal handler calls cpu_exit()
while the execution thread is itself modifying the TB graph (which
may happen in system emulation mode as well as in linux-user mode
with a multithreaded guest binary).
This fixes the crashes seen in LP:668799; however there are another
class of crashes described in LP:1098729 which stem from the fact
that in linux-user with a multithreaded guest all threads will
use and modify the same global TCG date structures (including the
generated code buffer) without any kind of locking. This means that
multithreaded guest binaries are still in the "unsupported"
category.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit 378df4b237)
Conflicts:
exec.c
include/qom/cpu.h
translate-all.c
include/exec/gen-icount.h
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Conflicts:
cpu-exec.c
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If tcg_qemu_tb_exec() returns a value whose low bits don't indicate a
link to an indexed next TB, this means that the TB execution never
started (eg because the instruction counter hit zero). In this case the
guest PC has to be reset to the address of the start of the TB.
Refactor the cpu-exec code to make all tcg_qemu_tb_exec() calls pass
through a wrapper function which does this restoration if necessary.
Note that the apparent change in cpu_exec_nocache() from calling
cpu_pc_from_tb() with the old TB to calling it with the TB returned by
do_tcg_qemu_tb_exec() is safe, because in the nocache case we can
guarantee that the TB we try to execute is not linked to any others,
so the only possible returned TB is the one we started at. That is,
we should arguably previously have included in cpu_exec_nocache() an
assert(next_tb & ~TB_EXIT_MASK) == tb), since the API requires restore
from next_tb but we were using tb.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit 77211379d7)
Conflicts:
cpu-exec.c
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Document tcg_qemu_tb_exec(). In particular, its return value is a
combination of a pointer to the next translation block and some
extra information in the low two bits. Provide some #defines for
the values passed in these bits to improve code clarity.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>
(cherry picked from commit 0980011b4f)
Conflicts:
tcg/tcg.h
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The qemu guest agent creates a bunch of files with insecure permissions
when started in daemon mode. For example:
-rw-rw-rw- 1 root root /var/log/qemu-ga.log
-rw-rw-rw- 1 root root /var/run/qga.state
-rw-rw-rw- 1 root root /var/log/qga-fsfreeze-hook.log
In addition, at least all files created with the "guest-file-open" QMP
command, and all files created with shell output redirection (or
otherwise) by utilities invoked by the fsfreeze hook script are affected.
For now mask all file mode bits for "group" and "others" in
become_daemon().
Temporarily, for compatibility reasons, stick with the 0666 file-mode in
case of files newly created by the "guest-file-open" QMP call. Do so
without changing the umask temporarily.
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit c689b4f1ba)
Conflicts:
qga/commands-posix.c
*update includes to match stable
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When setcond2 is rewritten into setcond, the state of the destination
temp should be reset, so that a copy of the previous value is not
used instead of the result.
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 66e61b55f1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
gen_muldiv was passing int accumulator arguments directly
to gen_helper_dmult(u). This patch fixes it to use TCGs,
via the gen_helper_0e2i wrapper.
Fixes an --enable-debug-tcg build failure reported by Juergen Lock.
Signed-off-by: Richard Sandiford <rdsandiford@googlemail.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If a guest neglected to register (secondary) indicators but still runs
with notifications enabled, we might end up writing to guest zero;
avoid this by checking for valid indicators and only writing to the
guest and generating an interrupt if indicators have been setup.
Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 7c4869761d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Since commit 249d41720b (qdev: Prepare
"realized" property) setting realized = true would register the device's
VMStateDescription, but realized = false would not unregister it. Fix that.
Moving the code from unparenting also revealed that we were calling
DeviceClass::init through DeviceClass::realize as interim solution but
DeviceClass::exit still at unparenting time with a realized check.
Make this symmetrical by implementing DeviceClass::unrealize to call it,
while we're setting realized = false in the unparenting path.
The only other unrealize user is mac_nvram, which can safely override it.
Thus, mark DeviceClass::exit as obsolete, new devices should implement
DeviceClass::unrealize instead.
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Message-id: 1366043650-9719-1-git-send-email-afaerber@suse.de
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit fe6c211781)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Currently the qemu-nbd program will auto-detect the format of
any disk it is given. This behaviour is known to be insecure.
For example, if qemu-nbd initially exposes a 'raw' file to an
unprivileged app, and that app runs
'qemu-img create -f qcow2 -o backing_file=/etc/shadow /dev/nbd0'
then the next time the app is started, the qemu-nbd will now
detect it as a 'qcow2' file and expose /etc/shadow to the
unprivileged app.
The only way to avoid this is to explicitly tell qemu-nbd what
disk format to use on the command line, completely disabling
auto-detection. This patch adds a '-f' / '--format' arg for
this purpose, mirroring what is already available via qemu-img
and qemu commands.
qemu-nbd --format raw -p 9000 evil.img
will now always use raw, regardless of what format 'evil.img'
looks like it contains
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
[Use errx, not err. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
*fixed conflict due to bdrv_open() not supporting "options" param
in v1.4.1
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Add accumulator arguments to gen_HILO and gen_muldiv, rather than
extracting the accumulator directly from ctx->opcode. The extraction
was only right for the standard encoding: MIPS16 doesn't have access
to the DSP registers, while microMIPS encodes the accumulator register
in a different field (bits 14 and 15).
Passing the accumulator register is probably an over-generalisation
for division and 64-bit multiplication, which never access anything
other than HI and LO, and which always pass 0 as the new argument.
Separating them felt a bit fussy though.
Signed-off-by: Richard Sandiford <rdsandiford@googlemail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 26135ead80)
Conflicts:
target-mips/translate.c
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Allow the clock_gettime() code using monotonic clock to be utilized on
more POSIX compliannt OS's. This started as a fix for OpenBSD which was
listed in one function as part of the previous hard coded list of OS's
for the functions to support but not in the other.
Signed-off-by: Brad Smith <brad@comstyle.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20130405003748.GH884@rox.home.comstyle.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit d05ef16045)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
commit 5ec01c2e96 broke "-cpu ..,enforce",
as it has moved kvm_check_features_against_host() after the
filter_features_for_kvm() call. filter_features_for_kvm() removes all
features not supported by the host, so this effectively made
kvm_check_features_against_host() impossible to fail.
This patch changes the call so we check for host feature support before
filtering the feature bits.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-id: 1364935692-24004-1-git-send-email-ehabkost@redhat.com
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit a509d632c8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
GCC 4.8.0 introduces a new warning:
block/qcow2-snapshot.c: In function 'qcow2_write_snapshots’:
block/qcow2-snapshot.c:252:18: error: typedef 'qemu_build_bug_on__253'
locally defined but not used [-Werror=unused-local-typedefs]
QEMU_BUILD_BUG_ON(offsetof(QCowHeader, snapshots_offset) !=
^
cc1: all warnings being treated as errors
(Caret diagnostics aren't perfect yet with macros... :)) Work around it
with __attribute__((unused)).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1364391272-1128-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 99835e0084)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
this patch ensures that all pending IOs are completed
before a device is resized. this is especially important
if a device is shrinked as it the bdrv_check_request()
result is invalidated.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 92b7a08d64)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
brdv_truncate() is also called from readv/writev commands on self-
growing file based storage. this will result in requests waiting
for theirselves to complete.
This reverts commit 9a665b2b86.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 5c916681ae)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Ask the vga core to update the display. Will trigger dpy_gfx_resize
if needed. More complete than just calling dpy_gfx_resize.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit c099e7aa02)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
End tables before headings, start new ones afterwards. Fixes
incorrect indentation of headings "File system options" and "Virtual
File system pass-through options" in manual page and qemu-doc.
Normalize markup some to increase chances it survives future edits.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1360781383-28635-5-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit c70a01e449)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
While investigating why a 32 bit Windows 2003 guest wasn't able to
successfully perform a shutdown /h, it was discovered that commit
afafe4bbe0 inadvertently dropped the
initialization of the s4_val used to handle s4 shutdown.
Initialize the value as before.
Signed-off-by: Bruce Rogers <brogers@suse.com>
Message-id: 1364928100-487-1-git-send-email-brogers@suse.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 560e639652)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Fix for rndrashift_short_acc to set correct value to higher 64 bits.
This change also corrects conditions when bit 23 of the DSPControl register
is set.
The existing test files have been extended with several examples that
trigger the issues. One bug/example in the test file for EXTR_RS_W has been
found and reported by Klaus Peichl.
Signed-off-by: Petar Jovanovic <petar.jovanovic@imgtec.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 8b758d0568)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The previous implementation incorrectly used same macro to detect overflow
for addition and subtraction. This patch makes distinction between these
two, and creates separate macros. The affected routines are changed
accordingly.
This change also includes additions to the existing tests for SUBQ_S_PH and
SUBQ_S_W that would trigger the fixed issue, and it removes dead code from
the test file. The last test case in subq_s_w.c is a bug found/reported/
isolated by Klaus Peichl from Dolby.
Signed-off-by: Petar Jovanovic <petar.jovanovic@imgtec.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 20c334a797)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Correct sign-propagation before multiplication in MULQ_W helper.
The change also fixes previously incorrect expected values in the
tests for MULQ_RS.W and MULQ_S.W.
Signed-off-by: Petar Jovanovic <petarj@mips.com>
Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit a345481baa)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The change corrects sign-related issue with MULQ_S.PH. It also includes
extension to the already existing test which will trigger the issue.
Signed-off-by: Petar Jovanovic <petarj@mips.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 9c19eb1e20)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When we receive a file descriptor over a UNIX domain socket the
O_NONBLOCK flag is preserved. Clear the O_NONBLOCK flag and rely on
QEMU file descriptor users like migration, SPICE, VNC, block layer, and
others to set non-blocking only when necessary.
This change ensures we don't accidentally expose O_NONBLOCK in the QMP
API. QMP clients should not need to get the non-blocking state
"correct".
A recent real-world example was when libvirt passed a non-blocking TCP
socket for migration where we expected a blocking socket. The source
QEMU produced a corrupted migration stream since its code did not cope
with non-blocking sockets.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e374f7f816171f9783c1d9d00a041f26379f1ac6)
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
socket_connect() sets non-blocking on TCP or UNIX domain sockets if a
callback function is passed. Do the same for file descriptor passing,
otherwise we could unexpectedly be using a blocking file descriptor.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 35fb94fa292173a3e1df0768433e06912a2a88e4)
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
There are several code paths in net_init_socket() depending on how the
socket is created: file descriptor passing, UDP multicast, TCP, or UDP.
Some of these support both listen and connect.
Not all code paths set the socket to non-blocking. This patch addresses
the file descriptor passing and UDP cases which were missing
socket_set_nonblock(fd) calls.
I considered moving socket_set_nonblock(fd) to a central location but it
turns out the code paths are different enough to require non-blocking at
different places.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f05b707279dc7c29ab10d9d13dbf413df6ec22f1)
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The fcntl(fd, F_SETFL, O_NONBLOCK) flag is not specific to sockets.
Rename to qemu_set_nonblock() just like qemu_set_cloexec().
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 399f1c8f8af1f6f8b18ef4e37169c6301264e467)
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Conflicts:
block/sheepdog.c
socket_set_block()/socket_set_nonblock() calls in different locations
include/qemu/sockets.h
socket_set_nodelay() does not exist in v1.4.0, messes up diff context
qemu-char.c
glib G_IO_IN events are not used in v1.4.0, messes up diff context
savevm.c
qemu_fopen_socket() only has read mode in v1.4.0, qemu_set_block() not
necessary.
slirp/misc.c
unportable setsockopt() calls in v1.4.0 mess up diff context
slirp/tcp_subr.c
file was reformatted, diff context is messed up
ui/vnc.c
old dcl->idle instead of vd->dcl.idle messes up diff context
Added:
migration-tcp.c, migration-unix.c
qemu_fopen_socket() write mode does not exist yet, qemu_set_block() call
is needed here.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Alex Williamson (3):
seabios q35: Enable all PIRQn IRQs at startup
seabios q35: Add new PCI slot to irq routing function
seabios: Add a dummy PCI slot to irq mapping function
Avik Sil (1):
USB-EHCI: Fix null pointer assignment
Kevin O'Connor (4):
Update tools/acpi_extract.py to handle iasl 20130117 release.
Fix Makefile - don't reference "out/" directly, instead use "$(OUT)".
build: Don't require $(OUT) to be a sub-directory of the main
directory.
Verify CC is valid during build tests.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 5c75fb1002)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The nature of the kernel ABI for the get_robust_list and set_robust_list
syscalls means we cannot implement them in QEMU. Make get_robust_list
silently return ENOSYS rather than using the default "print message and
then fail ENOSYS" code path, in the same way we already do for
set_robust_list, and add a comment documenting why we do this.
This silences warnings which were being produced for emulating
even trivial programs like 'ls' in x86-64-on-x86-64.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
(cherry picked from commit e9a970a831)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If the guest passes us a bogus negative length for an iovec, fail
EINVAL rather than proceeding blindly forward. This fixes some of
the error cases tests for readv and writev in the LTP.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
(cherry picked from commit dfae8e00f8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Upstream libc has recently changed to start using
FUTEX_WAIT_BITSET instead of FUTEX_WAIT and this
is causing do_futex to return -TARGET_ENOSYS.
Pass bitset in val3 to sys_futex which will be
ignored by kernel for the FUTEX_WAIT case.
Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
(cherry picked from commit cce246e0a2)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Since qcow2 metadata is cached we need to flush the caches, not just the
underlying file. Use bdrv_flush(bs) instead of bdrv_flush(bs->file).
Also add the error return path when bdrv_flush() fails and move the
flush after checking for qcow2_alloc_clusters() failure so that the
qcow2_alloc_clusters() error return value takes precedence.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit f6977f1556)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
update_refcount() affects the refcount cache, it does not write to disk.
Therefore bdrv_flush(bs->file) does nothing. We need to flush the
refcount cache in order to write out the refcount updates!
While we're here also add error returns when qcow2_cache_flush() fails.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 9991923b26)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
XBZRLE encoded migration introduced a MRU page cache
meachnism. Unfortunately, cached items where never freed in
case of a collision in the page cache on cache_insert().
This lead to out of memory conditions during XBZRLE migration
if the page cache was small and there where a lot of collisions
in the cache.
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 32a1c08b60)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
virtio-blk registers a vmstate change handler. Unfortunately this
handler is not unregistered on unplug, leading to some random
crashes if the system is restarted, e.g. via virsh reboot.
Lets unregister the vmstate change handler if the device is removed.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 69b302b204)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
It was defined to ..._MPC8545E_v21 rather than ..._MPC8547E_v21.
Due to both resolving to CPU_POWERPC_e500v2_v21 this did not show.
Fixing this nontheless helps with QOM'ifying CPU aliases.
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 0136d715ad)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Currently the spapr-vlan device does not supply a cleanup call for its
NetClientInfo structure. With current qemu versions, that leads to a SEGV
on exit, when net_cleanup() attempts to call the cleanup handlers on all
net clients.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 156dfaded8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
As of 5a49d3e9 we assume SPICE_PORT_EVENT_BREAK to be defined.
However, it is defined not in 0.12.2 what we require now, but in
0.12.3. Therefore in order to prevent build failure we must
adjust our minimal requirements.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 358689fe29)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
is_tcg_gen_code() checks the upper limit of TCG generated code range wrong, so
that TCG could get broken occasionally only when CONFIG_QEMU_LDST_OPTIMIZATION
enabled. The reason is code_gen_buffer_max_size does not cover the upper range
up to (TCG_MAX_OP_SIZE * OPC_BUF_SIZE), thus code_gen_buffer_max_size should be
modified to code_gen_buffer_size.
CC: qemu-stable@nongnu.org
Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 52ae646d4a)
Conflicts:
translate-all.c
*modified to use non-tcg-ctx version of code_gen_* variables
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Hosts hold on to handles provided by guest-file-open for periods that can
span beyond the life of the qemu-ga process that issued them. Since these
are issued starting from 0 on every restart, we run the risk of issuing
duplicate handles after restarts/reboots.
As a result, users with a stale copy of these handles may end up
reading/writing corrupted data due to their existing handles effectively
being re-assigned to an unexpected file or offset.
We unfortunately do not issue handles as strings, but as integers, so a
solution such as using UUIDs can't be implemented without introducing a
new interface.
As a workaround, we fix this by implementing a persistent key-value store
that will be used to track the value of the last handle that was issued
across restarts/reboots to avoid issuing duplicates.
The store is automatically written to the same directory we currently
set via --statedir to track fsfreeze state, and so should be applicable
for stable releases where this flag is supported.
A follow-up can use this same store for handling fsfreeze state, but
that change is cosmetic and left out for now.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
* fixed guest_file_handle_add() return value from uint64_t to int64_t
(cherry picked from commit 39097daf15)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Otherwise, live migration of the top layer will miss zero clusters and
let the backing file show through. This also matches what is done in qed.
QCOW2_CLUSTER_ZERO clusters are invalid in v2 image files. Check this
directly in qcow2_get_cluster_offset instead of replicating the test
everywhere.
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 381b487d54)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Currently, for the pseries machine the device tree supplied by qemu to SLOF
and from there to the guest does not include a 'compatible property' at the
root level. Usually that works fine, since in this case the compatible
property doesn't really give any information not already found in the
'device_type' or 'model' properties.
However, the lack of 'compatible' confuses the bootloader install in the
SLES11 SP2 and SLES11 SP3 installers. This patch therefore adds a token
'compatible' property to work around that.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit d63919c93e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Enable all virtio-net features for the legacy s390 virtio bus. This also fixes
kernel BUG at /usr/src/packages/BUILD/kernel-default-3.0.58/linux-3.0/drivers/s390/kvm/kvm_virtio.c:121!
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 35569cea79)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Due to library conflicts, Fedora will have to put libiscsi in
/usr/lib/iscsi. Simplify configuration by using a pkg-config
file. The Fedora package will distribute one, and the patch
to add it has been sent to upstream libiscsi as well.
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 3c33ea9640)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Always check it immediately after calling bdrv_acct_done, and
always do a "goto done" in case the "done" label has to free
some memory---as is the case for scsi_unmap_complete in the
previous patch.
This patch could fix problems that happen when a request is
split into multiple parts, and one of them is canceled. Then
the next part is fired, but the HBA's cancellation callbacks have
fired already. Whether this happens or not, depends on how the
block/ driver implements AIO cancellation. It it does a simple
bdrv_drain_all() or similar, then it will not have a problem.
If it only cancels the given AIOCB, this scenario could happen.
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0c92e0e6b6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
We currently maintain a whitelist of commands that are safe during
fsfreeze. During fsfreeze, we disable all commands that aren't part of
that whitelist.
guest-sync-delimited meets the criteria for being whitelisted, and is
also required for qemu-ga clients that rely on guest-sync-delimited for
re-syncing the channel after a timeout.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit c5dcb6ae23)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
In case host and guest endianness differ the vga code first creates
a shared surface (using qemu_create_displaysurface_from), then goes
patch the surface format to indicate that the bytes must be swapped.
The switch to pixman broke that hack as the format patching isn't
propagated into the pixman image, so ui code using the pixman image
directly (such as vnc) uses the wrong format.
Fix that by adding a byteswap parameter to
qemu_create_displaysurface_from, so we'll use the correct format
when creating the surface (and the pixman image) and don't have
to patch the format afterwards.
[ v2: unbreak xen build ]
Cc: qemu-stable@nongnu.org
Cc: mark.cave-ayland@ilande.co.uk
Cc: agraf@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1361349432-23884-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit b1424e0381)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Edivaldo reports a problem that the array of NetClientState in NICState is too
large - MAX_QUEUE_NUM(1024) which will wastes memory even if multiqueue is not
used.
Instead of static arrays, solving this issue by allocating the queues on demand
for both the NetClientState array in NICState and VirtIONetQueue array in
VirtIONet.
Tested by myself, with single virtio-net-pci device. The memory allocation is
almost the same as when multiqueue is not merged.
Cc: Edivaldo de Araujo Pereira <edivaldoapereira@yahoo.com.br>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f6b26cf257)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Current colon position in "waiting for telnet connection" message template
produces messages like:
QEMU waiting for connection on: telnet::127.0.0.16666,server
After moving a colon to the right, we will get a correct messages like:
QEMU waiting for connection on: telnet:127.0.0.1:6666,server
Signed-off-by: Igor Mitsyanko <i.mitsyanko@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit e5545854dd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Obviously, hub does not support multiqueue tap. So this patch forbids creating
multiple queue tap when hub is used to prevent the crash when command line such
as "-net tap,queues=2" is used.
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ce675a7579)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
20000 nested coroutines require 20 GB of virtual address space.
Only nest 1000 of them so that the test (only enabled with
"-m perf" on the command line) runs on 32-bit machines too.
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 027003152f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Unlike derived PVR constants mapped to CPU_POWERPC_G2LEgp3, the
"G2leGP3" model definition itself used the CPU_POWERPC_G2LEgp1 PVR.
Fixing this will allow to alias CPU_POWERPC_G2LEgp3-using types to
"G2leGP3".
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit bfe6d5b0da)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 09:52:13 -05:00
377 changed files with 15042 additions and 5845 deletions
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.