Compare commits

...

625 Commits

Author SHA1 Message Date
Greg Kurz
c648e640f4 9pfs: Fully restart unreclaim loop (CVE-2021-20181)
Git-commit: 89fbea8737
References: bsc#1182137

Depending on the client activity, the server can be asked to open a huge
number of file descriptors and eventually hit RLIMIT_NOFILE. This is
currently mitigated using a reclaim logic : the server closes the file
descriptors of idle fids, based on the assumption that it will be able
to re-open them later. This assumption doesn't hold of course if the
client requests the file to be unlinked. In this case, we loop on the
entire fid list and mark all related fids as unreclaimable (the reclaim
logic will just ignore them) and, of course, we open or re-open their
file descriptors if needed since we're about to unlink the file.

This is the purpose of v9fs_mark_fids_unreclaim(). Since the actual
opening of a file can cause the coroutine to yield, another client
request could possibly add a new fid that we may want to mark as
non-reclaimable as well. The loop is thus restarted if the re-open
request was actually transmitted to the backend. This is achieved
by keeping a reference on the first fid (head) before traversing
the list.

This is wrong in several ways:
- a potential clunk request from the client could tear the first
  fid down and cause the reference to be stale. This leads to a
  use-after-free error that can be detected with ASAN, using a
  custom 9p client
- fids are added at the head of the list : restarting from the
  previous head will always miss fids added by a some other
  potential request

All these problems could be avoided if fids were being added at the
end of the list. This can be achieved with a QSIMPLEQ, but this is
probably too much change for a bug fix. For now let's keep it
simple and just restart the loop from the current head.

Fixes: CVE-2021-20181
Buglink: https://bugs.launchpad.net/qemu/+bug/1911666
Reported-by: Zero Day Initiative <zdi-disclosures@trendmicro.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Message-Id: <161064025265.1838153.15185571283519390907.stgit@bahia.lan>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-04-06 16:51:18 -06:00
Gerd Hoffmann
07a1830167 usb: fix setup_len init (CVE-2020-14364)
Git-commit: b946434f26
References: bsc#1175441, CVE-2020-14364

Store calculated setup_len in a local variable, verify it, and only
write it to the struct (USBDevice->setup_len) in case it passed the
sanity checks.

This prevents other code (do_token_{in,out} functions specifically)
from working with invalid USBDevice->setup_len values and overrunning
the USBDevice->setup_buf[] buffer.

Fixes: CVE-2020-14364
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Message-id: 20200825053636.29648-1-kraxel@redhat.com
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-04-06 16:51:18 -06:00
Prasad J Pandit
c9ae86d5f9 usb: check RNDIS message length
Git-commit: 64c9bc181f
References: bsc#1175441, CVE-2020-14364

When processing remote NDIS control message packets, the USB Net
device emulator uses a fixed length(4096) data buffer. The incoming
packet length could exceed this limit. Add a check to avoid it.

Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1455648821-17340-2-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-04-06 16:51:18 -06:00
Prasad J Pandit
166edbe8ee slirp: check pkt_len before reading protocol header
Git-commit: 2e1dcbc0c2af64fcb17009eaf2ceedd81be2b27f
References: bsc#1179467

While processing ARP/NCSI packets in 'arp_input' or 'ncsi_input'
routines, ensure that pkt_len is large enough to accommodate the
respective protocol headers, lest it should do an OOB access.
Add check to avoid it.

CVE-2020-29129 CVE-2020-29130
  QEMU: slirp: out-of-bounds access while processing ARP/NCSI packets
 -> https://www.openwall.com/lists/oss-security/2020/11/27/1

Reported-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20201126135706.273950-1-ppandit@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
[BR: no NCSI code, so only ARP patched]
2021-04-06 16:50:56 -06:00
Jason Wang
2fc63d89a9 e1000: fail early for evil descriptor
Git-commit: 3de46e6fc4
References: bsc#1182577, CVE-2021-20257

During procss_tx_desc(), driver can try to chain data descriptor with
legacy descriptor, when will lead underflow for the following
calculation in process_tx_desc() for bytes:

            if (tp->size + bytes > msh)
                bytes = msh - tp->size;

This will lead a infinite loop. So check and fail early if tp->size if
greater or equal to msh.

Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Cheolwoo Myung <cwmyung@snu.ac.kr>
Reported-by: Ruhr-University Bochum <bugs-syssec@rub.de>
Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:27:09 -06:00
Paolo Bonzini
eb30f2e105 ide: atapi: assert that the buffer pointer is in range
Git-commit: 8132122889
References: bsc#1181108, CVE-2020-29443

A case was reported where s->io_buffer_index can be out of range.
The report skimped on the details but it seems to be triggered
by s->lba == -1 on the READ/READ CD paths (e.g. by sending an
ATAPI command with LBA = 0xFFFFFFFF).  For now paper over it
with assertions.  The first one ensures that there is no overflow
when incrementing s->io_buffer_index, the second checks for the
buffer overrun.

Note that the buffer overrun is only a read, so I am not sure
if the assertion failure is actually less harmful than the overrun.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20201201120926.56559-1-pbonzini@redhat.com
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:27:09 -06:00
Prasad J Pandit
8d9b761437 hw: usb: hcd-ohci: check for processed TD before retire
Git-commit: 1be90ebecc
References: bsc#1176684, CVE-2020-25625

While servicing OHCI transfer descriptors(TD), ohci_service_iso_td
retires a TD if it has passed its time frame. It does not check if
the TD was already processed once and holds an error code in TD_CC.
It may happen if the TD list has a loop. Add check to avoid an
infinite loop condition.

Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Message-id: 20200915182259.68522-3-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:27:09 -06:00
Prasad J Pandit
c9149f3d0c hw: usb: hcd-ohci: check len and frame_number variables
Git-commit: 1328fe0c32
References: bsc#1176682, CVE-2020-25624

While servicing the OHCI transfer descriptors(TD), OHCI host
controller derives variables 'start_addr', 'end_addr', 'len'
etc. from values supplied by the host controller driver.
Host controller driver may supply values such that using
above variables leads to out-of-bounds access issues.
Add checks to avoid them.

AddressSanitizer: stack-buffer-overflow on address 0x7ffd53af76a0
  READ of size 2 at 0x7ffd53af76a0 thread T0
  #0 ohci_service_iso_td ../hw/usb/hcd-ohci.c:734
  #1 ohci_service_ed_list ../hw/usb/hcd-ohci.c:1180
  #2 ohci_process_lists ../hw/usb/hcd-ohci.c:1214
  #3 ohci_frame_boundary ../hw/usb/hcd-ohci.c:1257
  #4 timerlist_run_timers ../util/qemu-timer.c:572
  #5 qemu_clock_run_timers ../util/qemu-timer.c:586
  #6 qemu_clock_run_all_timers ../util/qemu-timer.c:672
  #7 main_loop_wait ../util/main-loop.c:527
  #8 qemu_main_loop ../softmmu/vl.c:1676
  #9 main ../softmmu/main.c:50

Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Reported-by: Yongkang Jia <j_kangel@163.com>
Reported-by: Yi Ren <yunye.ry@alibaba-inc.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20200915182259.68522-2-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:27:09 -06:00
Li Qiang
0d4f2ccbd1 hw: ehci: check return value of 'usb_packet_map'
Git-commit: 2fdb42d840
References: bsc#1178934, CVE-2020-25723

If 'usb_packet_map' fails, we should stop to process the usb
request.

Signed-off-by: Li Qiang <liq3ea@163.com>
Message-Id: <20200812161727.29412-1-liq3ea@163.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:27:09 -06:00
Li Qiang
c6ab88d778 hw: xhci: check return value of 'usb_packet_map'
Git-commit: 21bc31524e
References: bsc#1176673, CVE-2020-25084

Currently we don't check the return value of 'usb_packet_map',
this will cause an UAF issue. This is LP#1891341.
Following is the reproducer provided in:
-->https://bugs.launchpad.net/qemu/+bug/1891341

cat << EOF | ./i386-softmmu/qemu-system-i386 -device nec-usb-xhci \
-trace usb\* -device usb-audio -device usb-storage,drive=mydrive \
-drive id=mydrive,file=null-co://,size=2M,format=raw,if=none \
-nodefaults -nographic -qtest stdio
outl 0xcf8 0x80001016
outl 0xcfc 0x3c009f0d
outl 0xcf8 0x80001004
outl 0xcfc 0xc77695e
writel 0x9f0d000000000040 0xffff3655
writeq 0x9f0d000000002000 0xff2f9e0000000000
write 0x1d 0x1 0x27
write 0x2d 0x1 0x2e
write 0x17232 0x1 0x03
write 0x17254 0x1 0x06
write 0x17278 0x1 0x34
write 0x3d 0x1 0x27
write 0x40 0x1 0x2e
write 0x41 0x1 0x72
write 0x42 0x1 0x01
write 0x4d 0x1 0x2e
write 0x4f 0x1 0x01
writeq 0x9f0d000000002000 0x5c051a0100000000
write 0x34001d 0x1 0x13
write 0x340026 0x1 0x30
write 0x340028 0x1 0x08
write 0x34002c 0x1 0xfe
write 0x34002d 0x1 0x08
write 0x340037 0x1 0x5e
write 0x34003a 0x1 0x05
write 0x34003d 0x1 0x05
write 0x34004d 0x1 0x13
writeq 0x9f0d000000002000 0xff00010100400009
EOF

This patch fixes this.

Buglink: https://bugs.launchpad.net/qemu/+bug/1891341
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Li Qiang <liq3ea@163.com>
Message-id: 20200812153139.15146-1-liq3ea@163.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:27:09 -06:00
Prasad J Pandit
46d6d8fca2 megasas: use unsigned type for reply_queue_head and check index
Git-commit: f50ab86a26
References: bsc#1172383, CVE-2020-13362

A guest user may set 'reply_queue_head' field of MegasasState to
a negative value. Later in 'megasas_lookup_frame' it is used to
index into s->frames[] array. Use unsigned type to avoid OOB
access issue.

Also check that 'index' value stays within s->frames[] bounds
through the while() loop in 'megasas_lookup_frame' to avoid OOB
access.

Reported-by: Ren Ding <rding@gatech.edu>
Reported-by: Hanqing Zhao <hanqing@gatech.edu>
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Acked-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20200513192540.1583887-2-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:27:09 -06:00
Bruce Rogers
ea1d325869 sm501: add qemu/log.h include
References: bsc#1172385, CVE-2020-12829

This is in place of commit 4a1f253adb which added the log.h
reference, but did other changes we're not interested in.

Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:27:09 -06:00
BALATON Zoltan
4389a1bcdc sm501: Replace hand written implementation with pixman where possible
Git-commit: b15a22bbcb
References: bsc#1172385, CVE-2020-12829

Besides being faster this should also prevent malicious guests to
abuse 2D engine to overwrite data or cause a crash.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-id: 58666389b6cae256e4e972a32c05cf8aa51bffc0.1590089984.git.balaton@eik.bme.hu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
(cherry picked from commit 6bed160dd4e4f68a85e032b08d40d6bd5449274f)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:27:09 -06:00
BALATON Zoltan
29697e5a02 sm501: Clean up local variables in sm501_2d_operation
Git-commit: 3d0b096298
References: bsc#1172385, CVE-2020-12829

Make variables local to the block they are used in to make it clearer
which operation they are needed for.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: ae59f8138afe7f6a5a4a82539d0f61496a906b06.1590089984.git.balaton@eik.bme.hu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
(cherry picked from commit e8a152c862db6fdd316345d9cd932e4602e9b29e)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:27:09 -06:00
BALATON Zoltan
b75f462fc3 sm501: Use BIT(x) macro to shorten constant
Git-commit: 2824809b7f
References: bsc#1172385, CVE-2020-12829

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 124bf5de8d7cf503b32b377d0445029a76bfbd49.1590089984.git.balaton@eik.bme.hu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
(cherry picked from commit 6f7231dc2939aca80adc5080464696e7d5f0ca45)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:27:09 -06:00
BALATON Zoltan
ad47df9169 sm501: Shorten long variable names in sm501_2d_operation
Git-commit: 6f8183b5dc
References: bsc#1172385, CVE-2020-12829

This increases readability and cleans up some confusing naming.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-id: b9b67b94c46e945252a73c77dfd117132c63c4fb.1590089984.git.balaton@eik.bme.hu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:26:22 -06:00
BALATON Zoltan
4c3d3ab515 sm501: Convert printf + abort to qemu_log_mask
Git-commit: e29da77e5f
References: bsc#1172385, CVE-2020-12829

Some places already use qemu_log_mask() to log unimplemented features
or errors but some others have printf() then abort(). Convert these to
qemu_log_mask() and avoid aborting to prevent guests to easily cause
denial of service.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 305af87f59d81e92f2aaff09eb8a3603b8baa322.1590089984.git.balaton@eik.bme.hu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:26:22 -06:00
Marcus Comstedt
57d5a72a94 sm501: Adjust endianness of pixel value in rectangle fill
Git-commit: f3a60058c9
References: bsc#1172385, CVE-2020-12829

The value from twoD_foreground (which is in host endian format) must
be converted to the endianness of the framebuffer (currently always
little endian) before it can be used to perform the fill operation.

Signed-off-by: Marcus Comstedt <marcus@mc.pp.se>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:26:22 -06:00
BALATON Zoltan
63e5c6df37 sm501: Set updated region dirty after 2D operation
Git-commit: eb76243c9d
References: bsc#1172385, CVE-2020-12829

Set the changed memory region dirty after performed a 2D operation to
ensure that the screen is updated properly.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:26:22 -06:00
BALATON Zoltan
dba0fc2f7c sm501: Fix support for non-zero frame buffer start address
Git-commit: 33159dd7ce
References: bsc#1172385, CVE-2020-12829

Display updates and drawing hardware cursor did not work when frame
buffer address was non-zero. Fix this by taking the frame buffer
address into account in these cases. This fixes screen dragging on
AmigaOS. Based on patch by Sebastian Bauer.

Signed-off-by: Sebastian Bauer <mail@sebastianbauer.info>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:26:22 -06:00
Sebastian Bauer
abe3c0057a sm501: Log unimplemented raster operation modes
Git-commit: 06cb926aaa
References: bsc#1172385, CVE-2020-12829

The sm501 currently implements only a very limited set of raster operation
modes. After this change, unknown raster operation modes are logged so
these can be easily spotted.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:26:22 -06:00
Sebastian Bauer
619cb98abb sm501: Implement negated destination raster operation mode
Git-commit: debc7e7dad
References: bsc#1172385, CVE-2020-12829

Add support for the negated destination operation mode. This is used e.g.
by AmigaOS for the INVERSEVID drawing mode. With this change, the cursor
in the shell and non-immediate window adjustment are working now.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:26:22 -06:00
Sebastian Bauer
a2beb809ea sm501: Use values from the pitch register for 2D operations
Git-commit: 54b2a4339c
References: bsc#1172385, CVE-2020-12829

Before, crt_h_total was used for src_width and dst_width. This is a
property of the current display setting and not relevant for the 2D
operation that also can be done off-screen. The pitch register's purpose
is to describe line pitch relevant of the 2D operation.

Signed-off-by: Sebastian Bauer <mail@sebastianbauer.info>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:26:22 -06:00
BALATON Zoltan
0b87f89c10 sm501: Add some more unimplemented registers
Git-commit: 5690d9ecef
References: bsc#1172385, CVE-2020-12829

These are not really implemented (just return zero or default values)
but add these so guests accessing them can run.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:26:22 -06:00
BALATON Zoltan
9016d3b697 sm501: Add support for panel layer
Git-commit: 1ae5e6eb42
References: bsc#1172385, CVE-2020-12829

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 2029a276362c0c3a14c78acb56baa9466848dd51.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:25:04 -06:00
BALATON Zoltan
57c52ffa89 sm501: Fix hardware cursor
Git-commit: 6a2a5aae02
References: bsc#1172478, CVE-2020-13765

Rework HWC handling to simplify it and fix cursor not updating on
screen as needed. Previously cursor was not updated because checking
for changes in a line overrode the update flag set for the cursor but
fixing this is not enough because the cursor should also be updated if
its shape or location changes. Introduce hwc_invalidate() function to
handle that similar to other display controller models.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 6970a5e9868b7246656c1d02038dc5d5fa369507.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:23:45 -06:00
Thomas Huth
4e018f4995 hw/core/loader: Fix possible crash in rom_copy()
Git-commit: e423455c4f
References: bsc#1172478, CVE-2020-13765

Both, "rom->addr" and "addr" are derived from the binary image
that can be loaded with the "-kernel" paramer. The code in
rom_copy() then calculates:

    d = dest + (rom->addr - addr);

and uses "d" as destination in a memcpy() some lines later. Now with
bad kernel images, it is possible that rom->addr is smaller than addr,
thus "rom->addr - addr" gets negative and the memcpy() then tries to
copy contents from the image to a bad memory location. This could
maybe be used to inject code from a kernel image into the QEMU binary,
so we better fix it with an additional sanity check here.

Cc: qemu-stable@nongnu.org
Reported-by: Guangming Liu
Buglink: https://bugs.launchpad.net/qemu/+bug/1844635
Message-Id: <20190925130331.27825-1-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Prasad J Pandit
6c02b007ca es1370: check total frame count against current frame
Git-commit: 369ff955a8
References: bsc#1172384, CVE-2020-13361

A guest user may set channel frame count via es1370_write()
such that, in es1370_transfer_audio(), total frame count
'size' is lesser than the number of frames that are processed
'cnt'.

    int cnt = d->frame_cnt >> 16;
    int size = d->frame_cnt & 0xffff;

if (size < cnt), it results in incorrect calculations leading
to OOB access issue(s). Add check to avoid it.

Reported-by: Ren Ding <rding@gatech.edu>
Reported-by: Hanqing Zhao <hanqing@gatech.edu>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20200514200608.1744203-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Paolo Bonzini
ab7cb4d57d scsi: lsi: exit infinite loop while executing script (CVE-2019-12068)
When executing script in lsi_execute_script(), the LSI scsi adapter
emulator advances 's->dsp' index to read next opcode. This can lead
to an infinite loop if the next opcode is empty. Move the existing
loop exit after 10k iterations so that it covers no-op opcodes as
well.

Reported-by: Bugs SysSec <bugs-syssec@rub.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit de594e4765)
[BR: BSC#1146873 CVE-2019-12068 - minor trace related tweak applied]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Sven Schnelle
883d8ce467 lsi: use enum type for s->waiting
This makes the code easier to read - no functional change.

Signed-off-by: Sven Schnelle <svens@stackframe.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190305195519.24303-3-svens@stackframe.org>
(cherry picked from commit f08ec2b82a)
[BR: BSC#1146873 CVE-2019-12068 - small tweak made related to trace code]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Marc-André Lureau
6a278e26be Fix use-afte-free in ip_reass() (CVE-2020-1983)
Git-commit: 2faae0f778f818fadc873308f983289df697eb93
References: bsc#1170940, CVE-2020-1983

The q pointer is updated when the mbuf data is moved from m_dat to
m_ext.

m_ext buffer may also be realloc()'ed and moved during m_cat():
q should also be updated in this case.

Reported-by: Aviv Sasson <asasson@paloaltonetworks.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit 9bd6c5913271eabcb7768a58197ed3301fe19f2d)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Marc-André Lureau
8a4f553333 tcp_emu: fix unsafe snprintf() usages
Git-commit: 68ccb8021a838066f0951d4b2817eb6b6f10a843
References: bsc#1163018, CVE-2020-8608

Various calls to snprintf() assume that snprintf() returns "only" the
number of bytes written (excluding terminating NUL).

https://pubs.opengroup.org/onlinepubs/9699919799/functions/snprintf.html#tag_16_159_04

"Upon successful completion, the snprintf() function shall return the
number of bytes that would be written to s had n been sufficiently
large excluding the terminating null byte."

Before patch ce131029, if there isn't enough room in "m_data" for the
"DCC ..." message, we overflow "m_data".

After the patch, if there isn't enough room for the same, we don't
overflow "m_data", but we set "m_len" out-of-bounds. The next time an
access is bounded by "m_len", we'll have a buffer overflow then.

Use slirp_fmt*() to fix potential OOB memory access.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20200127092414.169796-7-marcandre.lureau@redhat.com>
[Since porting from a quite different now libslirp, quite a bit changed]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Prasad J Pandit
1298490017 slirp: use correct size while emulating commands
Git-commit: 82ebe9c370a0e2970fb5695aa19aa5214a6a1c80
References: bsc#1161066, CVE-2020-7039, bsc#1163018, CVE-2020-8608

While emulating services in tcp_emu(), it uses 'mbuf' size
'm->m_size' to write commands via snprintf(3). Use M_FREEROOM(m)
size to avoid possible OOB access.

Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20200109094228.79764-3-ppandit@redhat.com>
[Since porting from a quite different now libslirp, quite a bit changed]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Prasad J Pandit
d98c149d9a slirp: use correct size while emulating IRC commands
Git-commit: ce131029d6d4a405cb7d3ac6716d03e58fb4a5d9
References: bsc#1161066, CVE-2020-7039, bsc#1163018, CVE-2020-8608

While emulating IRC DCC commands, tcp_emu() uses 'mbuf' size
'm->m_size' to write DCC commands via snprintf(3). This may
lead to OOB write access, because 'bptr' points somewhere in
the middle of 'mbuf' buffer, not at the start. Use M_FREEROOM(m)
size to avoid OOB access.

Reported-by: Vishnu Dev TJ <vishnudevtj@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20200109094228.79764-2-ppandit@redhat.com>
[Since porting from a quite different now libslirp, quite a bit changed]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Samuel Thibault
7070672a7f tcp_emu: Fix oob access
Git-commit: 2655fffed7a9e765bcb4701dd876e9dab975f289
References: bsc#1161066, CVE2020-7039, bsc#1161066, CVS-2020-7039

The main loop only checks for one available byte, while we sometimes
need two bytes.

Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Marc-André Lureau
0298cd5e7a util: add slirp_fmt() helpers
Git-commit: 30648c03b27fb8d9611b723184216cd3174b6775
References: bsc#1163018, CVE-2020-8608

Various calls to snprintf() in libslirp assume that snprintf() returns
"only" the number of bytes written (excluding terminating NUL).

https://pubs.opengroup.org/onlinepubs/9699919799/functions/snprintf.html#tag_16_159_04

"Upon successful completion, the snprintf() function shall return the
number of bytes that would be written to s had n been sufficiently
large excluding the terminating null byte."

Introduce slirp_fmt() that handles several pathological cases the
way libslirp usually expect:

- treat error as fatal (instead of silently returning -1)

- fmt0() will always \0 end

- return the number of bytes actually written (instead of what would
have been written, which would usually result in OOB later), including
the ending \0 for fmt0()

- warn if truncation happened (instead of ignoring)

Other less common cases can still be handled with strcpy/snprintf() etc.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20200127092414.169796-2-marcandre.lureau@redhat.com>
[Since porting from a quite different now libslirp, quite a bit changed]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Samuel Thibault
fcd085d819 Fix heap overflow in ip_reass on big packet input
When the first fragment does not fit in the preallocated buffer, q will
already be pointing to the ext buffer, so we mustn't try to update it.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit 126c04acbabd7ad32c2b018fe10dfac2a3bc1210)
[LY: CVE-2019-14378 BSC#1143794]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-18 17:15:18 -06:00
Marc-André Lureau
5a45fde238 slirp: don't manipulate so_rcv in tcp_emu()
For some reason, EMU_IDENT is not like other "emulated" protocols and
tries to reconstitute the original buffer, if it came in multiple
packets. Unfortunately, it does so wrongly, as it doesn't respect the
sbuf circular buffer appending rules, nor does it maintain some of the
invariants (rptr is incremented without bounds, etc): this leads to
further memory corruption revealed by ASAN or various malloc
errors. Furthermore, the so_rcv buffer is regularly flushed, so there
is no guarantee that buffer reconstruction will do what is expected.

Instead, do what the function comment says: "XXX Assumes the whole
command came in one packet", and don't touch so_rcv.

Related to: https://bugzilla.redhat.com/show_bug.cgi?id=1664205

Cc: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2021-03-18 17:15:18 -06:00
Marc-André Lureau
dde2698649 slirp: ensure there is enough space in mbuf to null-terminate
Prevents from buffer overflows.
Related to: https://bugzilla.redhat.com/show_bug.cgi?id=1664205

Cc: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[BFR: In support of BSC#1123156]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Samuel Thibault
4e2a617d9c slirp: fix big/little endian conversion in ident protocol
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

---
Based-on: <1551476756-25749-1-git-send-email-will@wbowling.info>

(cherry picked from commit 1fd71067da)
[BFR: In support of BSC#1123156]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Michael Roth
9ecdbbdc79 slrip: ip_reass: Fix use after free
Using ip_deq after m_free might read pointers from an allocation reuse.

This would be difficult to exploit, but that is still related with
CVE-2019-14378 which generates fragmented IP packets that would trigger this
issue and at least produce a DoS.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(from libslirp.git commit c59279437eda91841b9d26079c70b8a540d41204)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[BR: BSC#1149811 CVE-2019-15890]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Liang Yan
02dbadd58d qemu-bridge-helper: restrict interface name
The interface names in qemu-bridge-helper are defined to be
of size IFNAMSIZ(=16), including the terminating null('\0') byte.
The same is applied to interface names read from 'bridge.conf'
file to form ACLs rules. If user supplied '--br=bridge' name
is not restricted to the same length, it could lead to ACL bypass
issue. Restrict bridge name to IFNAMSIZ, including null byte.

Reported-by: Riccardo Schirone <rschiron@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[LY: BSC#1140402 CVE-2019-13164]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-18 17:15:18 -06:00
Prasad J Pandit
40727784a3 qxl: check release info object
When releasing spice resources in release_resource() routine,
if release info object 'ext.info' is null, it leads to null
pointer dereference. Add check to avoid it.

Reported-by: Bugs SysSec <bugs-syssec@rub.de>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20190425063534.32747-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit d52680fc93)
[LY: BSC#1135902 CVE-2019-12155]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-18 17:15:18 -06:00
Paolo Bonzini
d3f21be02a target/i386: define md-clear bit
md-clear is a new CPUID bit which is set when microcode provides the
mechanism to invoke a flush of various exploitable CPU buffers by invoking
the VERW instruction.  Add the new feature.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[BR: BSC#1111331 CVE-2018-12126 CVE-2018-12127 CVE-2018-12130
CVE-2019-11091]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Peter Maydell
83ae211f18 device_tree.c: Don't use load_image()
The load_image() function is deprecated, as it does not let the
caller specify how large the buffer to read the file into is.
Instead use load_image_size().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20181130151712.2312-9-peter.maydell@linaro.org
(cherry picked from commit da885fe1ee)
[LM: BSC#1130675 CVE-2018-20815]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-18 17:15:18 -06:00
Benjamin Herrenschmidt
76374bbc62 loader: Add load_image_size() to replace load_image()
A subsequent patch to ppc/spapr needs to load the RTAS blob into
qemu memory rather than target memory (so it can later be copied
into the right spot at machine reset time).

I would use load_image() but it is marked deprecated because it
doesn't take a buffer size as argument, so let's add load_image_size()
that does.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[aik: fixed errors from checkpatch.pl]
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit ea87616d6c)
[LM: BSC#1130675 CVE-2018-20815]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-18 17:15:18 -06:00
William Bowling
ff394a03dc slirp: check sscanf result when emulating ident
When emulating ident in tcp_emu, if the strchr checks passed but the
sscanf check failed, two uninitialized variables would be copied and
sent in the reply, so move this code inside the if(sscanf()) clause.

Signed-off-by: William Bowling <will@wbowling.info>
Cc: qemu-stable@nongnu.org
Cc: secalert@redhat.com
Message-Id: <1551476756-25749-1-git-send-email-will@wbowling.info>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
(cherry picked from commit d3222975c7)
[LM: BSC#1129622 CVE-2019-9824 To pass our checkpatch check, I changed
the patch to use spaces, not tabs, as in the initially proposed]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-18 17:15:18 -06:00
Greg Kurz
574e860fa2 9p: take write lock on fid path updates (CVE-2018-19364)
Recent commit 5b76ef50f6 fixed a race where v9fs_co_open2() could
possibly overwrite a fid path with v9fs_path_copy() while it is being
accessed by some other thread, ie, use-after-free that can be detected
by ASAN with a custom 9p client.

It turns out that the same can happen at several locations where
v9fs_path_copy() is used to set the fid path. The fix is again to
take the write lock.

Fixes CVE-2018-19364.

Cc: P J P <ppandit@redhat.com>
Reported-by: zhibin hu <noirfate@gmail.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 5b3c77aa58)
[BR: BSC#1116717 CVE-2018-19364]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Greg Kurz
53ff9e57bd 9p: write lock path in v9fs_co_open2()
The assumption that the fid cannot be used by any other operation is
wrong. At least, nothing prevents a misbehaving client to create a
file with a given fid, and to pass this fid to some other operation
at the same time (ie, without waiting for the response to the creation
request). The call to v9fs_path_copy() performed by the worker thread
after the file was created can race with any access to the fid path
performed by some other thread. This causes use-after-free issues that
can be detected by ASAN with a custom 9p client.

Unlike other operations that only read the fid path, v9fs_co_open2()
does modify it. It should hence take the write lock.

Cc: P J P <ppandit@redhat.com>
Reported-by: zhibin hu <noirfate@gmail.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 5b76ef50f6)
[BR: BSC#1116717 CVE-2018-19364]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Greg Kurz
7d88df8b5f 9p: fix QEMU crash when renaming files
When using the 9P2000.u version of the protocol, the following shell
command line in the guest can cause QEMU to crash:

    while true; do rm -rf aa; mkdir -p a/b & touch a/b/c & mv a aa; done

With 9P2000.u, file renaming is handled by the WSTAT command. The
v9fs_wstat() function calls v9fs_complete_rename(), which calls
v9fs_fix_path() for every fid whose path is affected by the change.
The involved calls to v9fs_path_copy() may race with any other access
to the fid path performed by some worker thread, causing a crash like
shown below:

Thread 12 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555a25da2 in local_open_nofollow (fs_ctx=0x555557d958b8, path=0x0,
 flags=65536, mode=0) at hw/9pfs/9p-local.c:59
59          while (*path && fd != -1) {
(gdb) bt
 path=0x0, flags=65536, mode=0) at hw/9pfs/9p-local.c:59
 path=0x0) at hw/9pfs/9p-local.c:92
 fs_path=0x555556b56858, stbuf=0x7fff84830ef0) at hw/9pfs/9p-local.c:185
 path=0x555556b56858, stbuf=0x7fff84830ef0) at hw/9pfs/cofile.c:53
 at hw/9pfs/9p.c:1083
 at util/coroutine-ucontext.c:116
(gdb)

The fix is to take the path write lock when calling v9fs_complete_rename(),
like in v9fs_rename().

Impact:  DoS triggered by unprivileged guest users.

Fixes: CVE-2018-19489
Cc: P J P <ppandit@redhat.com>
Reported-by: zhibin hu <noirfate@gmail.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 1d20398694)
[BR: BSC#1117275]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Prasad J Pandit
d8bc1b7f8f slirp: check data length while emulating ident function
While emulating identification protocol, tcp_emu() does not check
available space in the 'sc_rcv->sb_data' buffer. It could lead to
heap buffer overflow issue. Add check to avoid it.

Reported-by: Kira <864786842@qq.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit a7104eda7d)
[BR: BSC#1123156 CVE-2019-6778,  modify patch to use spaces instead of tabs]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Marcelo Tosatti
46ceab3424 kvm: i386: fix LAPIC TSC deadline timer save/restore
The configuration of the timer represented by MSR_IA32_TSCDEADLINE depends on:

- APIC LVT Timer register.
- TSC value.

Change the order to respect the dependency.

Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7477cd3897)
[LM: BSC#1109544]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-18 17:15:18 -06:00
Paolo Bonzini
94303aed73 lsi53c895a: check message length value is valid
While writing a message in 'lsi_do_msgin', message length value
in 'msg_len' could be invalid due to an invalid migration stream.
Add an assertion to avoid an out of bounds access, and reject
the incoming migration data if it contains an invalid message
length.

Discovered by Deja vu Security. Reported by Oracle.

Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20181026194314.18663-1-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e58ccf0396)
[BR: BSC#1114422 CVE-2018-18849]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Larry Dewey
7637b1ae35 chardev: Converting public IO impls to size_t
Due to an integer overflow issue found in ccid_card_vscard_read
(CVE-2018-18438), memory was previously susceptible to corruption. To
mitigate this issue, a number of components, including the
ccid-card-passthrough device are now utilizing size_t in place of int to
prevent an the overflow from occuring when values larger than
available space are passed to corresponding functions.

While CCID smart card readers were the primary target of the
vulnerability, a number of other subsystems have also been modified in the
same manner to prevent them from being exploited in the same way; specifically
functions which are modeled after the IOCanReadHandler and IOReadHandler
typedefs.

Subsystems affected include, but are not limited to:
CCID Smart Cards
Spice consoles
Monitors
Network Sockets
Network Tap Devices
Serial IO
Emulated Shell Serial IO
USB Redirection and Passthrough
Consoles
SLiRP IO
Mouse Emulation
Xen Consoles
SLCP IO
Network Block Devices

(cherry picked from commit 4792016a6119a4d5768bd8f989ae90b52dfef789)
[LD: BSC#1112185 CVE-2018-18438]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-18 17:15:18 -06:00
Jason Wang
703f8d205c net: ignore packet size greater than INT_MAX
There should not be a reason for passing a packet size greater than
INT_MAX. It's usually a hint of bug somewhere, so ignore packet size
greater than INT_MAX in qemu_deliver_packet_iov()

CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 1592a99470)
[LD: BSC#1111013 CVE-2018-17963]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-18 17:15:18 -06:00
Jason Wang
798e7abec5 pcnet: fix possible buffer overflow
In pcnet_receive(), we try to assign size_ to size which converts from
size_t to integer. This will cause troubles when size_ is greater
INT_MAX, this will lead a negative value in size and it can then pass
the check of size < MIN_BUF_SIZE which may lead out of bound access
for both buf and buf1.

Fixing by converting the type of size to size_t.

CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit b1d80d12c5)
[LD: BSC#1111010 CVE-2018-17962]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-18 17:15:18 -06:00
Jason Wang
7851133be4 rtl8139: fix possible out of bound access
In rtl8139_do_receive(), we try to assign size_ to size which converts
from size_t to integer. This will cause troubles when size_ is greater
INT_MAX, this will lead a negative value in size and it can then pass
the check of size < MIN_BUF_SIZE which may lead out of bound access of
for both buf and buf1.

Fixing by converting the type of size to size_t.

CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 1a326646fe)
[LD: BSC#1111006 CVE-2018-17958]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-18 17:15:18 -06:00
Jason Wang
0b1b67cce8 ne2000: fix possible out of bound access in ne2000_receive
In ne2000_receive(), we try to assign size_ to size which converts
from size_t to integer. This will cause troubles when size_ is greater
INT_MAX, this will lead a negative value in size and it can then pass
the check of size < MIN_BUF_SIZE which may lead out of bound access of
for both buf and buf1.

Fixing by converting the type of size to size_t.

CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit fdc89e90fa)
[LD: BSC#1110910 CVE-2018-10839]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-18 17:15:18 -06:00
Larry Dewey
747650ce86 seccomp: secure all threads with seccomp manually.
Just before starting a child thread, a call to seccomp_start() has been
added to ensure that each thread is sandboxed properly, and that
seccomp is successfully applied to all threads when the user specifies
that sanboxing should be used either via /etc/libvirt/qemu.conf or via
the command line by passing `--seccomp on` to qemu.

[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-18 17:15:18 -06:00
Konrad Rzeszutek Wilk
3c6c6bc6ae i386: define the AMD 'virt-ssbd' CPUID feature bit (CVE-2018-3639)
AMD Zen expose the Intel equivalant to Speculative Store Bypass Disable
via the 0x80000008_EBX[25] CPUID feature bit.

This needs to be exposed to guest OS to allow them to protect
against CVE-2018-3639.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20180521215424.13520-3-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 403503b162)
[BR: BSC#1092885 CVE-2018-3639]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Konrad Rzeszutek Wilk
ed89784bcb i386: Define the Virt SSBD MSR and handling of it (CVE-2018-3639)
"Some AMD processors only support a non-architectural means of enabling
speculative store bypass disable (SSBD).  To allow a simplified view of
this to a guest, an architectural definition has been created through a new
CPUID bit, 0x80000008_EBX[25], and a new MSR, 0xc001011f.  With this, a
hypervisor can virtualize the existence of this definition and provide an
architectural method for using SSBD to a guest.

Add the new CPUID feature, the new MSR and update the existing SSBD
support to use this MSR when present." (from x86/speculation: Add virtualized
speculative store bypass disable support in Linux).

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20180521215424.13520-4-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit cfeea0c021)
[BR: BSC#1092885 CVE-2018-3639]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Prasad J Pandit
3abda5c691 qga: check bytes count read by guest-file-read
While reading file content via 'guest-file-read' command,
'qmp_guest_file_read' routine allocates buffer of count+1
bytes. It could overflow for large values of 'count'.
Add check to avoid it.

Reported-by: Fakhri Zulkifli <mohdfakhrizulkifli@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit 141b197408)
[FL: BSC#1098735 CVE-2018-12617 modify as only qga/commands-posix.c
has related code, but not qga/commands-win32.c]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-18 17:15:18 -06:00
Prasad J Pandit
912ee8e297 slirp: correct size computation while concatenating mbuf
While reassembling incoming fragmented datagrams, 'm_cat' routine
extends the 'mbuf' buffer, if it has insufficient room. It computes
a wrong buffer size, which leads to overwriting adjacent heap buffer
area. Correct this size computation in m_cat.

Reported-by: ZDI Disclosures <zdi-disclosures@trendmicro.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit 864036e251)
[FL: BSC#1096223 CVE-2018-11806]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-18 17:15:18 -06:00
Michael Tokarev
1c2f33e221 slirp: remove mbuf(m_hdr,m_dat) indirection
Git-commit: 0e44486cdc

Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Daniel P. Berrangé
321aff8e90 i386: define the 'ssbd' CPUID feature bit (CVE-2018-3639)
New microcode introduces the "Speculative Store Bypass Disable"
CPUID feature bit. This needs to be exposed to guest OS to allow
them to protect against CVE-2018-3639.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Message-Id: <20180521215424.13520-2-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit d19d1f9659)
[BR: BSC#1092885 CVE-2018-3639]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Bruce Rogers
993e993e25 migration: warn about inconsistent spec_ctrl state
As an attempt to help the user do the right thing, warn if we
detect spec_ctrl data in the migration stream, but where the
cpu defined doesn't have the feature. This would indicate the
migration is from the quick and dirty qemu produced in January
2018 to handle Spectre v2. That qemu version exposed the IBRS
cpu feature to all vcpu types, which helped in the short term
but wasn't a well designed approach.
Warn the user that the now migrated guest needs to be restarted
as soon as possible, using the spec_ctrl cpu feature flag or a
*-IBRS vcpu model specified as appropriate.

Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Eduardo Habkost
9f8e1dc03d i386: Add new -IBRS versions of Intel CPU models
The new MSR IA32_SPEC_CTRL MSR was introduced by a recent Intel
microcode updated and can be used by OSes to mitigate
CVE-2017-5715.  Unfortunately we can't change the existing CPU
models without breaking existing setups, so users need to
explicitly update their VM configuration to use the new *-IBRS
CPU model if they want to expose IBRS to guests.

The new CPU models are simple copies of the existing CPU models,
with just CPUID_7_0_EDX_SPEC_CTRL added and model_id updated.

Cc: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-6-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit ac96c41354)
[BR: BSC#1068032 CVE-2017-5715 CPU models are reduced as appropriate
to match this QEMU version, and features specified according to current
code conventions]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Eduardo Habkost
56bb250e19 i386: Add FEAT_8000_0008_EBX CPUID feature word
Add the new feature word and the "ibpb" feature flag.

Based on a patch by Paolo Bonzini.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-5-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 1b3420e1c4)
[BR: BSC#1068032 CVE-2017-5715 change to match current code's feat_name
type. Also adapted to old style feature management.]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Eduardo Habkost
65970c2360 i386: Add spec-ctrl CPUID bit
Add the feature name and a CPUID_7_0_EDX_SPEC_CTRL macro.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-4-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit a2381f0934)
[BR: BSC#1068032 CVE-2017-5715 modify to match current code's feat_name
type]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Luwei Kang
25b35a80b2 x86: add infrastructure for 7_0_EDX features
The spec can be found in Intel Software Developer Manual or in
Instruction Set Extensions Programming Reference.

Signed-off-by: Piotr Luc <piotr.luc@intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Message-Id: <1477902446-5932-1-git-send-email-he.chen@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 95ea69fb46)
[BR: BSC#1068032 CVE-2017-5715 - orig patch modified to not provide new
feature, just infrastructure, and change to match current code's feat_name
type. Adjust to old style feature tracking as well.]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Bruce Rogers
f7aa8ece96 x86: cpu: increase model_id array size from 48 to 49
This is perhaps the easiest way to handle the longer model names
introduced with the -IBRS models. This is in lew of backporting
commit 807e9869b8 and other commits
which that would require.
[BR: BSC#1068032 CVE-2017-5715]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Prasad J Pandit
21b668cafc multiboot: check mh_load_end_addr address field
While loading kernel via multiboot-v1 image, (flags & 0x00010000)
indicates that multiboot header contains valid addresses to load
the kernel image. In that, end of the data segment address
'mh_load_end_addr' should be less than the bss segment address
'mh_bss_end_addr'. Add check to validate that.

Reported-by: CERT CC <cert.cc@orange.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[LY: BSC#1083291 CVE-2018-7550]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-18 17:15:18 -06:00
linzhecheng
8ec6aa8515 vga: check the validation of memory addr when draw text
Start a vm with qemu-kvm -enable-kvm -vnc :66 -smp 1 -m 1024 -hda
redhat_5.11.qcow2  -device pcnet -vga cirrus,
then use VNC client to connect to VM, and excute the code below in guest
OS will lead to qemu crash:

int main()
 {
    iopl(3);
    srand(time(NULL));
    int a,b;
    while(1){
	a = rand()%0x100;
	b = 0x3c0 + (rand()%0x20);
        outb(a,b);
    }
    return 0;
}

The above code is writing the registers of VGA randomly.
We can write VGA CRT controller registers index 0x0C or 0x0D
(which is the start address register) to modify the
the display memory address of the upper left pixel
or character of the screen. The address may be out of the
range of vga ram. So we should check the validation of memory address
when reading or writing it to avoid segfault.

Signed-off-by: linzhecheng <linzhecheng@huawei.com>
Message-id: 20180111132724.13744-1-linzhecheng@huawei.com
Fixes: CVE-2018-5683
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 191f59dc17)
[LY: BSC#1076114 CVE-2018-5683]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-18 17:15:18 -06:00
Paolo Bonzini
644bced347 i386: Add support for SPEC_CTRL MSR
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-3-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit a33a2cfe2f)
[BR: BSC#1068032 CVE-2017-5715]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Gerd Hoffmann
9f419e4ccc vnc: fix overflow in vnc_update_stats
Commit "bea60dd ui/vnc: fix potential memory corruption issues" is
incomplete.  vnc_update_stats must calculate width and height the same
way vnc_refresh_server_surface does it, to make sure we don't use width
and height values larger than the qemu vnc server can handle.

Commit "e22492d ui/vnc: disable adaptive update calculations if not
needed" masks the issue in the default configuration.  It triggers only
in case the "lossy" option is set to "on" (default is "off").

Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1485248428-575-1-git-send-email-kraxel@redhat.com
(cherry picked from commit eebe0b7905)
[BR: BSC#1026612 CVE-2017-2633 (this fix fixes first fix)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Gerd Hoffmann
a7a5aa7a17 vnc: fix memory corruption (CVE-2015-5225)
The _cmp_bytes variable added by commit "bea60dd ui/vnc: fix potential
memory corruption issues" can become negative.  Result is (possibly
exploitable) memory corruption.  Reason for that is it uses the stride
instead of bytes per scanline to apply limits.

For the server surface is is actually fine.  vnc creates that itself,
there is never any padding and thus scanline length always equals stride.

For the guest surface scanline length and stride are typically identical
too, but it doesn't has to be that way.  So add and use a new variable
(guest_ll) for the guest scanline length.  Also rename min_stride to
line_bytes to make more clear what it actually is.  Finally sprinkle
in an assert() to make sure we never use a negative _cmp_bytes again.

Reported-by: 范祚至(库特) <zuozhi.fzz@alibaba-inc.com>
Reviewed-by: P J P <ppandit@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit eb8934b041)
[BR: BSC#1026612 CVE-2017-2633 (this fix fixes first fix)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Peter Lieven
dff189aaac ui/vnc: fix potential memory corruption issues
this patch makes the VNC server work correctly if the
server surface and the guest surface have different sizes.

Basically the server surface is adjusted to not exceed VNC_MAX_WIDTH
x VNC_MAX_HEIGHT and additionally the width is rounded up to multiple of
VNC_DIRTY_PIXELS_PER_BIT.

If we have a resolution whose width is not dividable by VNC_DIRTY_PIXELS_PER_BIT
we now get a small black bar on the right of the screen.

If the surface is too big to fit the limits only the upper left area is shown.

On top of that this fixes 2 memory corruption issues:

The first was actually discovered during playing
around with a Windows 7 vServer. During resolution
change in Windows 7 it happens sometimes that Windows
changes to an intermediate resolution where
server_stride % cmp_bytes != 0 (in vnc_refresh_server_surface).
This happens only if width % VNC_DIRTY_PIXELS_PER_BIT != 0.

The second is a theoretical issue, but is maybe exploitable
by the guest. If for some reason the guest surface size is bigger
than VNC_MAX_WIDTH x VNC_MAX_HEIGHT we end up in severe corruption since
this limit is nowhere enforced.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit bea60dd767)
[BR: BSC#1026612 CVE-2017-2633]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Peter Lieven
464c0013e9 ui/vnc: fix vmware VGA incompatiblities
this fixes invalid rectangle updates observed after commit 12b316d
with the vmware VGA driver. The issues occured because the server
and client surface update seems to be out of sync at some points
and the max width of the surface is not dividable by
VNC_DIRTY_BITS_PER_PIXEL (16).

Reported-by: Serge Hallyn <serge.hallyn@ubuntu.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 2f487a3d40)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Peter Lieven
9da0965b8a ui/vnc: disable adaptive update calculations if not needed
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit e22492d332)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Peter Lieven
f2c90ca3e2 ui/vnc: optimize setting in vnc_dpy_update()
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 919372251c)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Peter Lieven
3cd587ccd3 ui/vnc: optimize dirty bitmap tracking
vnc_update_client currently scans the dirty bitmap of each client
bitwise which is a very costly operation if only few bits are dirty.
vnc_refresh_server_surface does almost the same.
this patch optimizes both by utilizing the heavily optimized
function find_next_bit to find the offset of the next dirty
bit in the dirty bitmaps.

The following artifical test (just the bitmap operation part) running
vnc_update_client 65536 times on a 2560x2048 surface illustrates the
performance difference:

All bits clean - vnc_update_client_new: 0.07 secs
 vnc_update_client_old: 10.98 secs

All bits dirty - vnc_update_client_new: 11.26 secs
 vnc_update_client_old: 20.19 secs

Few bits dirty - vnc_update_client_new: 0.08 secs
 vnc_update_client_old: 10.98 secs

The case for all bits dirty is still rather slow, this
is due to the implementation of find_and_clear_dirty_height.
This will be addresses in a separate patch.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 12b316d4c1)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Peter Lieven
7c1dc413e2 ui/vnc: derive cmp_bytes from VNC_DIRTY_PIXELS_PER_BIT
this allows for setting VNC_DIRTY_PIXELS_PER_BIT to different
values than 16 if desired.

Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 6cd859aa8a)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Peter Lieven
ecd581fa02 ui/vnc: introduce VNC_DIRTY_PIXELS_PER_BIT macro
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Wenchao Xia <xiawenc@linux.vnet.ibm.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit b4c85ddcec)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Gerd Hoffmann
7597e2b200 console: zap displaystate from dcl callbacks
Now that nobody depends on DisplayState in DisplayChangeListener
callbacks any more we can remove the parameter from all callbacks.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit bc2ed9704f)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Gerd Hoffmann
6eca8fd580 cocoa: stop using DisplayState
Rework DisplayStateListener callbacks to not use the DisplayState
any more.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit )5e00d3ac475fb4c9afa17612a908e933fe142f00
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Gerd Hoffmann
bbc77dc48b sdl: stop using DisplayState
Rework DisplayStateListener callbacks to not use the DisplayState
any more.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 8db9bae94e)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Gerd Hoffmann
6144b773e4 vnc: stop using DisplayState
Rework DisplayStateListener callbacks to not use the DisplayState
any more.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit d39fa6d86d)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Gerd Hoffmann
eddcf049a9 console: add surface_*() getters
Add convinence wrappers to query DisplaySurface properties.
Simliar to ds_get_*, but operating in the DisplaySurface
not the DisplayState.

With this patch in place ui frontents can stop using DisplayState
in the rendering code paths, they can simply operate using the
DisplaySurface passed in via dpy_gfx_switch callback.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 626e3b34e3)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Gerd Hoffmann
f8fa2a87a0 console: rework DisplaySurface handling [dcl/ui side]
Replace the dpy_gfx_resize and dpy_gfx_setdata DisplayChangeListener
callbacks with a dpy_gfx_switch callback which notifies the ui code
when the framebuffer backing storage changes.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit c12aeb860c)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Gerd Hoffmann
8122dc8f36 console: rework DisplaySurface handling [vga emu side]
Decouple DisplaySurface allocation & deallocation from DisplayState.
Replace dpy_gfx_resize + dpy_gfx_setdata with a dpy_gfx_replace_surface
function.

This handles the graphic hardware emulation.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit da229ef3b3)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Gerd Hoffmann
8ef722bb31 sdl: drop dead code
DisplayAllocator removal (commit
187cd1d9f3) made this a nop.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 468dfd6de2)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Gerd Hoffmann
2dc158f27c console: kill DisplayState->opaque
It's broken by design.  There can be multiple DisplayChangeListener
instances, so they simply can't store state in the (single) DisplayState
struct.  Try 'qemu -display gtk -vnc :0', watch it crash & burn.

With DisplayChangeListenerOps having a more sane interface now we can
simply use the DisplayChangeListener pointer to get access to our
private data instead.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 21ef45d712)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Gerd Hoffmann
9dcfc66aca console: fix displaychangelisteners interface
Split callbacks into separate Ops struct.  Pass DisplayChangeListener
pointer as first argument to all callbacks.  Uninline a bunch of
display functions and move them from console.h to console.c

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 7c20b4a374)
[BR: BSC#1026612 CVE-2017-2633 (infrastructure patch)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Prasad J Pandit
a6979f3c18 9pfs: use g_malloc0 to allocate space for xattr
9p back-end first queries the size of an extended attribute,
allocates space for it via g_malloc() and then retrieves its
value into allocated buffer. Race between querying attribute
size and retrieving its could lead to memory bytes disclosure.
Use g_malloc0() to avoid it.

Reported-by: Tuomas Tynkkynen <tuomas.tynkkynen@iki.fi>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 7bd9275630)
[BR: BSC#1062069 CVE-2017-15038]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Gerd Hoffmann
300115cc21 cirrus: fix oob access in mode4and5 write functions
Move dst calculation into the loop, so we apply the mask on each
interation and will not overflow vga memory.

Cc: Prasad J Pandit <pjp@fedoraproject.org>
Reported-by: Niu Guoxiang <niuguoxiang@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20171011084314.21752-1-kraxel@redhat.com
[BR: BSC#1063122 CVE-2017-15289]
Signed-off-by: Bruce Rogers <brogers@suse.com
2021-03-18 17:15:18 -06:00
Gerd Hoffmann
46ca74f304 vga: stop passing pointers to vga_draw_line* functions
Instead pass around the address (aka offset into vga memory).
Add vga_read_* helper functions which apply vbe_size_mask to
the address, to make sure the address stays within the valid
range, similar to the cirrus blitter fixes (commits ffaf857778
and 026aeffcb4).

Impact:  DoS for privileged guest users.  qemu crashes with
a segfault, when hitting the guard page after vga memory
allocation, while reading vga memory for display updates.

Fixes: CVE-2017-13672
Cc: P J P <ppandit@redhat.com>
Reported-by: David Buchanan <d@vidbuchanan.co.uk>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170828122906.18993-1-kraxel@redhat.com
(cherry picked from commit 3d90c62548)
[FL: BSC#1056334 CVE-2017-13672, add macro to fix multiple #include]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-18 17:15:18 -06:00
Prasad J Pandit
45abd51274 exec: use qemu_ram_ptr_length to access guest ram
When accessing guest's ram block during DMA operation, use
'qemu_ram_ptr_length' to get ram block pointer. It ensures
that DMA operation of given length is possible; And avoids
any OOB memory access situations.

Reported-by: Alex <broscutamaker@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20170712123840.29328-1-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 04bf2526ce)
[FL: BSC#1048902 CVE-2017-11334]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-18 17:15:18 -06:00
Prasad J Pandit
e92d5f4846 slirp: check len against dhcp options array end
While parsing dhcp options string in 'dhcp_decode', if an options'
length 'len' appeared towards the end of 'bp_vend' array, ensuing
read could lead to an OOB memory access issue. Add check to avoid it.

This is CVE-2017-11434.

Reported-by: Reno Robert <renorobert@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit 413d463f43)
[FL: BSC#1049381 CVE-2017-11434]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-18 17:15:18 -06:00
Prasad J Pandit
044969b5e8 multiboot: validate multiboot header address values
While loading kernel via multiboot-v1 image, (flags & 0x00010000)
indicates that multiboot header contains valid addresses to load
the kernel image. These addresses are used to compute kernel
size and kernel text offset in the OS image. Validate these
address values to avoid an OOB access issue.

This is CVE-2017-14167.

Reported-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20170907063256.7418-1-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit ed4f86e8b6)
[FL: BSC#1057585 CVE-2017-14167]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-18 17:15:18 -06:00
Gerd Hoffmann
4645de669c usb-redir: fix stack overflow in usbredir_log_data
Don't reinvent a broken wheel, just use the hexdump function we have.

Impact: low, broken code doesn't run unless you have debug logging
enabled.

Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170509110128.27261-1-kraxel@redhat.com
(cherry picked from commit bd4a683505)
[BR: BSC#1047674 CVE-2017-10806]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Max Reitz
5888061431 qemu-nbd: Ignore SIGPIPE
qemu proper has done so for 13 years
(8a7ddc38a6), qemu-img and qemu-io have
done so for four years (526eda14a6).
Ignoring this signal is especially important in qemu-nbd because
otherwise a client can easily take down the qemu-nbd server by dropping
the connection when the server wants to send something, for example:

$ qemu-nbd -x foo -f raw -t null-co:// &
[1] 12726
$ qemu-io -c quit nbd://localhost/bar
can't open device nbd://localhost/bar: No export with name 'bar' available
[1]  + 12726 broken pipe  qemu-nbd -x foo -f raw -t null-co://

In this case, the client sends an NBD_OPT_ABORT and closes the
connection (because it is not required to wait for a reply), but the
server replies with an NBD_REP_ACK (because it is required to reply).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20170611123714.31292-1-mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 041e32b8d9)
[BR: BSC#1046636 CVE-2017-10664]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Paolo Bonzini
50c105187c megasas: always store SCSIRequest* into MegasasCmd
This ensures that the request is unref'ed properly, and avoids a
segmentation fault in the new qtest testcase that is added.

Reported-by: Zhangyanyu <zyy4013@stu.ouc.edu.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[BR: BSC#1043296 CVE-2017-9503, dropped testcase from patch]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:18 -06:00
Paolo Bonzini
0182b5013f megasas: do not read DCMD opcode more than once from frame
Avoid TOC-TOU bugs by storing the DCMD opcode in the MegasasCmd

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[BR: BSC#1043296 CVE-2017-9503]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Li Qiang
8bb8f0a95f serial: fix memory leak in serial exit
The serial_exit_core function doesn't free some resources.
This can lead memory leak when hotplug and unplug. This
patch avoid this.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-Id: <586cb5ab.f31d9d0a.38ac3.acf2@mx.google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 8409dc884a)
[BR: BSC#1021741 CVE-2017-5579]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
a4895f9d7a 9pfs: local: fix unlink of alien files in mapped-file mode
When trying to remove a file from a directory, both created in non-mapped
mode, the file remains and EBADF is returned to the guest.

This is a regression introduced by commit "df4938a6651b 9pfs: local:
unlinkat: don't follow symlinks" when fixing CVE-2016-9602. It changed the
way we unlink the metadata file from

    ret = remove("$dir/.virtfs_metadata/$name");
    if (ret < 0 && errno != ENOENT) {
         /* Error out */
    }
    /* Ignore absence of metadata */

to

    fd = openat("$dir/.virtfs_metadata")
    unlinkat(fd, "$name")
    if (ret < 0 && errno != ENOENT) {
         /* Error out */
    }
    /* Ignore absence of metadata */

If $dir was created in non-mapped mode, openat() fails with ENOENT and
we pass -1 to unlinkat(), which fails in turn with EBADF.

We just need to check the return of openat() and ignore ENOENT, in order
to restore the behaviour we had with remove().

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
[groug: rewrote the comments as suggested by Eric]
(cherry picked from commit 6a87e7929f)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
0d92ebe2e0 9pfs: local: forbid client access to metadata (CVE-2017-7493)
When using the mapped-file security mode, we shouldn't let the client mess
with the metadata. The current code already tries to hide the metadata dir
from the client by skipping it in local_readdir(). But the client can still
access or modify it through several other operations. This can be used to
escalate privileges in the guest.

Affected backend operations are:
- local_mknod()
- local_mkdir()
- local_open2()
- local_symlink()
- local_link()
- local_unlinkat()
- local_renameat()
- local_rename()
- local_name_to_path()

Other operations are safe because they are only passed a fid path, which
is computed internally in local_name_to_path().

This patch converts all the functions listed above to fail and return
EINVAL when being passed the name of the metadata dir. This may look
like a poor choice for errno, but there's no such thing as an illegal
path name on Linux and I could not think of anything better.

This fixes CVE-2017-7493.

Reported-by: Leo Gaspard <leo@gaspard.io>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 7a95434e0c)
[BR: BSC#1039495]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Gerd Hoffmann
05182638e6 audio: release capture buffers
AUD_add_capture() allocates two buffers which are never released.
Add the missing calls to AUD_del_capture().

Impact: Allows vnc clients to exhaust host memory by repeatedly
starting and stopping audio capture.

Fixes: CVE-2017-8309
Cc: P J P <ppandit@redhat.com>
Cc: Huawei PSIRT <PSIRT@huawei.com>
Reported-by: "Jiangxin (hunter, SCC)" <jiangxin1@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20170428075612.9997-1-kraxel@redhat.com
(cherry picked from commit 3268a845f4)
[BR: BSC#1037242]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Li Qiang
ff0c0f12f4 usb: ohci: fix error return code in servicing iso td
It should return 1 if an error occurs when reading iso td.
This will avoid an infinite loop issue in ohci_service_ed_list.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 5899ac3e.1033240a.944d5.9a2d@mx.google.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 26f670a244)
[BR: BSC#1042159 CVE-2017-9330, also includes fix from upstream
commit cf66ee8e where the test for failure of ohci_read_iso_td()
was wrong]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Li Qiang
84abd63e2d ide: ahci: call cleanup function in ahci unit
This can avoid memory leak when hotunplug the ahci device.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 1488449293-80280-4-git-send-email-liqiang6-s@360.cn
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit d68f0f778e)
[BSC#1042801 CVE-2017-9373]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Li Qiang
cc0f8de694 ide: core: add cleanup function
As the pci ahci can be hotplug and unplug, in the ahci unrealize
function it should free all the resource once allocated in the
realized function. This patch add ide_exit to free the resource.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 1488449293-80280-3-git-send-email-liqiang6-s@360.cn
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit c9f086418a)
[BSC#1042801 CVE-2017-9373]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Gerd Hoffmann
03482fda68 xhci: guard xhci_kick_epctx against recursive calls
Track xhci_kick_epctx processing being active in a variable.  Check the
variable before calling xhci_kick_epctx from xhci_kick_ep.  Add an
assert to make sure we don't call recursively into xhci_kick_epctx.

Cc: 1653384@bugs.launchpad.net
Fixes: 94b037f2a4
Reported-by: Fabian Lesniak <fabian@lesniak-it.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1486035372-3621-1-git-send-email-kraxel@redhat.com
Message-id: 1485790607-31399-5-git-send-email-kraxel@redhat.com
(cherry picked from commit 96d87bdda3)
[BR: BSC#1042800 CVE-2017-9375]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Gerd Hoffmann
054f460e4d cirrus: fix off-by-one in cirrus_bitblt_rop_bkwd_transp_*_16
The switch from pointers to addresses (commit
026aeffcb4 and
ffaf857778) added
a off-by-one bug to 16bit backward blits.  Fix.

Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 1489735296-19047-1-git-send-email-kraxel@redhat.com
(cherry picked from commit f019722cbb)
[BR: BSC#1035406 CVE-2017-7980]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Gerd Hoffmann
0125f9ca1b cirrus: stop passing around src pointers in the blitter
Does basically the same as "cirrus: stop passing around dst pointers in
the blitter", just for the src pointer instead of the dst pointer.

For the src we have to care about cputovideo blits though and fetch the
data from s->cirrus_bltbuf instead of vga memory.  The cirrus_src*()
helper functions handle that.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1489584487-3489-1-git-send-email-kraxel@redhat.com
(cherry picked from commit ffaf857778)
[BR: BSC#1035406 CVE-2017-7980]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Gerd Hoffmann
1d2ade350a cirrus: stop passing around dst pointers in the blitter
Instead pass around the address (aka offset into vga memory).  Calculate
the pointer in the rop_* functions, after applying the mask to the
address, to make sure the address stays within the valid range.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1489574872-8679-1-git-send-email-kraxel@redhat.com
(cherry picked from commit 026aeffcb4)
[BR: BSC#1035406 CVE-2017-7980]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
hangaohuai
ae7ad17ab0 fix :cirrus_vga fix OOB read case qemu Segmentation fault
check the validity of parameters in cirrus_bitblt_rop_fwd_transp_xxx
and cirrus_bitblt_rop_fwd_xxx to avoid the OOB read which causes qemu Segmentation fault.

After the fix, we will touch the assert in
cirrus_invalidate_region:
assert(off_cur_end >= off_cur);

Signed-off-by: fangying <fangying1@huawei.com>
Signed-off-by: hangaohuai <hangaohuai@huawei.com>
Message-id: 20170314063919.16200-1-hangaohuai@huawei.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 215902d7b6)
[BR: BSC#1034908 CVE-2017-7718]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Pranith Kumar
7ea1521c6b tcg/i386: Check the size of instruction being translated
This fixes the bug: 'user-to-root privesc inside VM via bad translation
caching' reported by Jann Horn here:
https://bugs.chromium.org/p/project-zero/issues/detail?id=1122

Reviewed-by: Richard Henderson <rth@twiddle.net>
CC: Peter Maydell <peter.maydell@linaro.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20170323175851.14342-1-bobby.prani@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 30663fd26c)
[BR: BSC#1030624]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Gerd Hoffmann
02bcee6460 cirrus/vnc: zap bitblit support from console code.
There is a special code path (dpy_gfx_copy) to allow graphic emulation
notify user interface code about bitblit operations carryed out by
guests.  It is supported by cirrus and vnc server.  The intended purpose
is to optimize display scrolls and just send over the scroll op instead
of a full display update.

This is rarely used these days though because modern guests simply don't
use the cirrus blitter any more.  Any linux guest using the cirrus drm
driver doesn't.  Any windows guest newer than winxp doesn't ship with a
cirrus driver any more and thus uses the cirrus as simple framebuffer.

So this code tends to bitrot and bugs can go unnoticed for a long time.
See for example commit "3e10c3e vnc: fix qemu crash because of SIGSEGV"
which fixes a bug lingering in the code for almost a year, added by
commit "c7628bf vnc: only alloc server surface with clients connected".

Also the vnc server will throttle the frame rate in case it figures the
network can't keep up (send buffers are full).  This doesn't work with
dpy_gfx_copy, for any copy operation sent to the vnc client we have to
send all outstanding updates beforehand, otherwise the vnc client might
run the client side blit on outdated data and thereby corrupt the
display.  So this dpy_gfx_copy "optimization" might even make things
worse on slow network links.

Lets kill it once for all.

Oh, and one more reason: Turns out (after writing the patch) we have a
security bug in that code path ...

Fixes: CVE-2016-9603
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1489494419-14340-1-git-send-email-kraxel@redhat.com
(cherry picked from commit 50628d3479)
[BR: BSC#1028656]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Li Qiang
8de40e0696 usb: ohci: limit the number of link eds
The guest may builds an infinite loop with link eds. This patch
limit the number of linked ed to avoid this.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 5899a02e.45ca240a.6c373.93c1@mx.google.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 95ed56939e)
[BR: BSC#1028184 CVE-2017-6505]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Gerd Hoffmann
377d636e3b xhci: apply limits to loops
Limits should be big enough that normal guest should not hit it.
Add a tracepoint to log them, just in case.  Also, while being
at it, log the existing link trb limit too.

Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1486383669-6421-1-git-send-email-kraxel@redhat.com
(cherry picked from commit f89b60f6e5)
[BR: BSC#1025109 CVE-2017-5973]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	hw/usb/hcd-xhci.c
	hw/usb/trace-events
2021-03-18 17:15:17 -06:00
Laszlo Ersek
650e166c42 dump: rebase from host-private RAMBlock offsets to guest-physical addresses
RAMBlock.offset                   --> GuestPhysBlock.target_start
RAMBlock.offset + RAMBlock.length --> GuestPhysBlock.target_end
RAMBlock.length                   --> GuestPhysBlock.target_end -
                                      GuestPhysBlock.target_start

"GuestPhysBlock.host_addr" is only used when writing the dump contents.

This patch enables "crash" to work with the vmcore by rebasing the vmcore
from the left side of the following diagram to the right side:

host-private
offset
relative
to ram_addr   RAMBlock                  guest-visible paddrs
            0 +-------------------+.....+-------------------+ 0
              |         ^         |     |        ^          |
              |       640 KB      |     |      640 KB       |
              |         v         |     |        v          |
  0x0000a0000 +-------------------+.....+-------------------+ 0x0000a0000
              |         ^         |     |XXXXXXXXXXXXXXXXXXX|
              |       384 KB      |     |XXXXXXXXXXXXXXXXXXX|
              |         v         |     |XXXXXXXXXXXXXXXXXXX|
  0x000100000 +-------------------+.....+-------------------+ 0x000100000
              |         ^         |     |        ^          |
              |       3583 MB     |     |      3583 MB      |
              |         v         |     |        v          |
  0x0e0000000 +-------------------+.....+-------------------+ 0x0e0000000
              |         ^         |.    |XXXXXXXXXXXXXXXXXXX|
              | above_4g_mem_size | .   |XXXX PCI hole XXXXX|
              |         v         |  .  |XXXX          XXXXX|
     ram_size +-------------------+   . |XXXX  512 MB  XXXXX|
                                   .   .|XXXXXXXXXXXXXXXXXXX|
                                    .   +-------------------+ 0x100000000
                                     .  |         ^         |
                                      . | above_4g_mem_size |
                                       .|         v         |
                                        +-------------------+ ram_size
                                                              + 512 MB

Related RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=981582

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit 56c4bfb3f0)
[BR: BSC#1049785]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Laszlo Ersek
e881861139 dump: populate guest_phys_blocks
While the machine is paused, in guest_phys_blocks_append() we register a
one-shot MemoryListener, solely for the initial collection of the valid
guest-physical memory ranges that happens at listener registration time.

For each range that is reported to guest_phys_blocks_region_add(), we
attempt to merge the range with the preceding one.

Ranges can only be joined if they are contiguous in both guest-physical
address space, and contiguous in host virtual address space.

The "maximal" ranges that remain in the end constitute the guest-physical
memory map that the dump will be based on.

Related RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=981582

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit c5d7f60f06)
[BR: BSC#1049785]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Laszlo Ersek
8d6c453a47 dump: introduce GuestPhysBlockList
The vmcore must use physical addresses that are visible to the guest, not
addresses that point into linear RAMBlocks. As first step, introduce the
list type into which we'll collect the physical mappings in effect at the
time of the dump.

Related RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=981582

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit 5ee163e8ea)
[BR: BSC#1049785]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Laszlo Ersek
44145d968a dump: clamp guest-provided mapping lengths to ramblock sizes
Even a trusted & clean-state guest can map more memory than what it was
given. Since the vmcore contains RAMBlocks, mapping sizes should be
clamped to RAMBlock sizes. Otherwise such oversized mappings can exceed
the entire file size, and ELF parsers might refuse even the valid portion
of the PT_LOAD entry.

Related RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=981582

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit 2cac260768)
[BR: BSC#1049785]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
40769097de Adjust some code for fixing bsc#1038396 and Add missing part of 616a6552
[LM: BSC#1038396]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
a3e4c45773 cpu: introduce CPUClass::virtio_is_big_endian()
If we want to support targets that can change endianness (modern PPC and
ARM for the moment), we need to add a per-CPU class method to be called
from the virtio code. The virtio_ prefix in the name is a hint for people
to avoid misusage (aka. anywhere but from the virtio code).

The default behaviour is to return the compile-time default target
endianness.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit bf7663c4bd)
[LM: BSC#1038396]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
e934bfd326 virtio: add subsections to the migration stream
There is a need to add some more fields to VirtIODevice that should be
migrated (broken status, endianness). The problem is that we do not
want to break compatibility while adding a new feature... This issue has
been addressed in the generic VMState code with the use of optional
subsections. As a *temporary* alternative to port the whole virtio
migration code to VMState, this patch mimics a similar subsectionning
ability for virtio, using the VMState code.

Since each virtio device is streamed in its own section, the idea is to
stream subsections between the end of the device section and the start
of the next sections. This allows an older QEMU to complain and exit
when fed with subsections:

Unknown savevm section type 5
load of migration failed

Suggested-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 6b321a3df5)
[LM: BSC#1038396]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-18 17:15:17 -06:00
Andreas Färber
8c08db6d3e cpu: Replace cpu_single_env with CPUState current_cpu
Move it to qom/cpu.h.

Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 4917cf4432)
[LM: BSC#1038396]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-18 17:15:17 -06:00
Andreas Färber
952453d8b8 cpu: Turn cpu_get_memory_mapping() into a CPUState hook
Change error reporting from return value to Error argument.

Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
[AF: Fixed cpu_get_memory_mapping() documentation]
Signed-off-by: Andreas Färber <afaerber@suse.de>

(cherry picked from commit a23bbfda75)
[LM: BSC#1038396]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-18 17:15:17 -06:00
Andreas Färber
e79f0d0e4d memory_mapping: Move MemoryMappingList typedef to qemu/typedefs.h
This will avoid issues with hwaddr and ram_addr_t when including
sysemu/memory_mapping.h for CONFIG_USER_ONLY, e.g., from qom/cpu.h.

Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 6d4d3ae77d)
[LM: BSC#1038396]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-18 17:15:17 -06:00
Andreas Färber
0b9396137a cpu: Turn cpu_paging_enabled() into a CPUState hook
Relocate assignment of x86 get_arch_id to have all hooks in one place.

Reviewed-by: Jens Freimann <jfrei@linux.vnet.ibm.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 444d559078)
[LM: BSC#1038396]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-18 17:15:17 -06:00
Bruce Rogers
c21dbaa49e 9pfs: local: remove: use correct path component
Commit a0e640a8 introduced a path processing error.
Pass fstatat the dirpath based path component instead
of the entire path.

[BR: BSC#1045035]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
90063d8c7d 9pfs: local: set the path of the export root to "."
The local backend was recently converted to using "at*()" syscalls in order
to ensure all accesses happen below the shared directory. This requires that
we only pass relative paths, otherwise the dirfd argument to the "at*()"
syscalls is ignored and the path is treated as an absolute path in the host.
This is actually the case for paths in all fids, with the notable exception
of the root fid, whose path is "/". This causes the following backend ops to
act on the "/" directory of the host instead of the virtfs shared directory
when the export root is involved:
- lstat
- chmod
- chown
- utimensat

ie, chmod /9p_mount_point in the guest will be converted to chmod / in the
host for example. This could cause security issues with a privileged QEMU.

All "*at()" syscalls are being passed an open file descriptor. In the case
of the export root, this file descriptor points to the path in the host that
was passed to -fsdev.

The fix is thus as simple as changing the path of the export root fid to be
"." instead of "/".

This is CVE-2017-7471.

Cc: qemu-stable@nongnu.org
Reported-by: Léo Gaspard <leo@gaspard.io>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 9c6b899f7a)
[BR: BSC#1034866 CVE-2017-7471]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Li Qiang
cf73a2e5d0 9pfs: xattr: fix memory leak in v9fs_list_xattr
Free 'orig_value' in error path.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 4ffcdef427)
[BR: BSC#1035950 CVE-2017-8086]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Li Qiang
1ffef00f49 9pfs: fix file descriptor leak
The v9fs_create() and v9fs_lcreate() functions are used to create a file
on the backend and to associate it to a fid. The fid shouldn't be already
in-use, otherwise both functions may silently leak a file descriptor or
allocated memory. The current code doesn't check that.

This patch ensures that the fid isn't already associated to anything
before using it.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
(reworded the changelog, Greg Kurz)
Signed-off-by: Greg Kurz <groug@kaod.org>

(cherry picked from commit d63fb193e7)
[BR: BSC#1032075 CVE-2017-7377]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
d8e7eacec1 9pfs: don't try to flush self and avoid QEMU hang on reset
According to the 9P spec [*], when a client wants to cancel a pending I/O
request identified by a given tag (uint16), it must send a Tflush message
and wait for the server to respond with a Rflush message before reusing this
tag for another I/O. The server may still send a completion message for the
I/O if it wasn't actually cancelled but the Rflush message must arrive after
that.

QEMU hence waits for the flushed PDU to complete before sending the Rflush
message back to the client.

If a client sends 'Tflush tag oldtag' and tag == oldtag, QEMU will then
allocate a PDU identified by tag, find it in the PDU list and wait for
this same PDU to complete... i.e. wait for a completion that will never
happen. This causes a tag and ring slot leak in the guest, and a PDU
leak in QEMU, all of them limited by the maximal number of PDUs (128).
But, worse, this causes QEMU to hang on device reset since v9fs_reset()
wants to drain all pending I/O.

This insane behavior is likely to denote a bug in the client, and it would
deserve an Rerror message to be sent back. Unfortunately, the protocol
allows it and requires all flush requests to suceed (only a Tflush response
is expected).

The only option is to detect when we have to handle a self-referencing
flush request and report success to the client right away.

[*] http://man.cat-v.org/plan_9/5/flush

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit d5f2af7b95)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
41606dac49 9pfs: fix vulnerability in openat_dir() and local_unlinkat_common()
We should pass O_NOFOLLOW otherwise openat() will follow symlinks and make
QEMU vulnerable.

While here, we also fix local_unlinkat_common() to use openat_dir() for
the same reasons (it was a leftover in the original patchset actually).

This fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit b003fc0d8a)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
a47f831ad5 9pfs: fix O_PATH build break with older glibc versions
When O_PATH is used with O_DIRECTORY, it only acts as an optimization: the
openat() syscall simply finds the name in the VFS, and doesn't trigger the
underlying filesystem.

On systems that don't define O_PATH, because they have glibc version 2.13
or older for example, we can safely omit it. We don't want to deactivate
O_PATH globally though, in case it is used without O_DIRECTORY. The is done
with a dedicated macro.

Systems without O_PATH may thus fail to resolve names that involve
unreadable directories, compared to newer systems succeeding, but such
corner case failure is our only option on those older systems to avoid
the security hole of chasing symlinks inappropriately.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
(added last paragraph to changelog as suggested by Eric Blake)
Signed-off-by: Greg Kurz <groug@kaod.org>

(cherry picked from commit 918112c02a)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
c272662580 9pfs: don't use AT_EMPTY_PATH in local_set_cred_passthrough()
The name argument can never be an empty string, and dirfd always point to
the containing directory of the file name. AT_EMPTY_PATH is hence useless
here. Also it breaks build with glibc version 2.13 and older.

It is actually an oversight of a previous tentative patch to implement this
function. We can safely drop it.

Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Greg Kurz <groug@kaod.org>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit b314f6a077)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
c0c0eba573 9pfs: fail local_statfs() earlier
If we cannot open the given path, we can return right away instead of
passing -1 to fstatfs() and close(). This will make Coverity happy.

(Coverity issue CID1371729)

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Daniel P. berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
(cherry picked from commit 23da0145cc)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
6df3ba047c 9pfs: fix fd leak in local_opendir()
Coverity issue CID1371731

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
(cherry picked from commit faab207f11)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
8c1013182c 9pfs: fix bogus fd check in local_remove()
This was spotted by Coverity as a fd leak. This is certainly true, but also
local_remove() would always return without doing anything, unless the fd is
zero, which is very unlikely.

(Coverity issue CID1371732)

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit b7361d46e7)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
6efdf0ce12 9pfs: local: drop unused code
Now that the all callbacks have been converted to use "at" syscalls, we
can drop this code.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit c23d5f1d5b)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
1065eea81f 9pfs: local: open2: don't follow symlinks
The local_open2() callback is vulnerable to symlink attacks because it
calls:

(1) open() which follows symbolic links for all path elements but the
    rightmost one
(2) local_set_xattr()->setxattr() which follows symbolic links for all
    path elements
(3) local_set_mapped_file_attr() which calls in turn local_fopen() and
    mkdir(), both functions following symbolic links for all path
    elements but the rightmost one
(4) local_post_create_passthrough() which calls in turn lchown() and
    chmod(), both functions also following symbolic links

This patch converts local_open2() to rely on opendir_nofollow() and
mkdirat() to fix (1), as well as local_set_xattrat(),
local_set_mapped_file_attrat() and local_set_cred_passthrough() to
fix (2), (3) and (4) respectively. Since local_open2() already opens
a descriptor to the target file, local_set_cred_passthrough() is
modified to reuse it instead of opening a new one.

The mapped and mapped-file security modes are supposed to be identical,
except for the place where credentials and file modes are stored. While
here, we also make that explicit by sharing the call to openat().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit a565fea565)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
5efdf5c89e 9pfs: local: mkdir: don't follow symlinks
The local_mkdir() callback is vulnerable to symlink attacks because it
calls:

(1) mkdir() which follows symbolic links for all path elements but the
    rightmost one
(2) local_set_xattr()->setxattr() which follows symbolic links for all
    path elements
(3) local_set_mapped_file_attr() which calls in turn local_fopen() and
    mkdir(), both functions following symbolic links for all path
    elements but the rightmost one
(4) local_post_create_passthrough() which calls in turn lchown() and
    chmod(), both functions also following symbolic links

This patch converts local_mkdir() to rely on opendir_nofollow() and
mkdirat() to fix (1), as well as local_set_xattrat(),
local_set_mapped_file_attrat() and local_set_cred_passthrough() to
fix (2), (3) and (4) respectively.

The mapped and mapped-file security modes are supposed to be identical,
except for the place where credentials and file modes are stored. While
here, we also make that explicit by sharing the call to mkdirat().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 3f3a16990b)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
3551d84f50 9pfs: local: mknod: don't follow symlinks
The local_mknod() callback is vulnerable to symlink attacks because it
calls:

(1) mknod() which follows symbolic links for all path elements but the
    rightmost one
(2) local_set_xattr()->setxattr() which follows symbolic links for all
    path elements
(3) local_set_mapped_file_attr() which calls in turn local_fopen() and
    mkdir(), both functions following symbolic links for all path
    elements but the rightmost one
(4) local_post_create_passthrough() which calls in turn lchown() and
    chmod(), both functions also following symbolic links

This patch converts local_mknod() to rely on opendir_nofollow() and
mknodat() to fix (1), as well as local_set_xattrat() and
local_set_mapped_file_attrat() to fix (2) and (3) respectively.

A new local_set_cred_passthrough() helper based on fchownat() and
fchmodat_nofollow() is introduced as a replacement to
local_post_create_passthrough() to fix (4).

The mapped and mapped-file security modes are supposed to be identical,
except for the place where credentials and file modes are stored. While
here, we also make that explicit by sharing the call to mknodat().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit d815e72190)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
0ed94ea19c 9pfs: local: symlink: don't follow symlinks
The local_symlink() callback is vulnerable to symlink attacks because it
calls:

(1) symlink() which follows symbolic links for all path elements but the
    rightmost one
(2) open(O_NOFOLLOW) which follows symbolic links for all path elements but
    the rightmost one
(3) local_set_xattr()->setxattr() which follows symbolic links for all
    path elements
(4) local_set_mapped_file_attr() which calls in turn local_fopen() and
    mkdir(), both functions following symbolic links for all path
    elements but the rightmost one

This patch converts local_symlink() to rely on opendir_nofollow() and
symlinkat() to fix (1), openat(O_NOFOLLOW) to fix (2), as well as
local_set_xattrat() and local_set_mapped_file_attrat() to fix (3) and
(4) respectively.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 38771613ea)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
f5c7500547 9pfs: local: chown: don't follow symlinks
The local_chown() callback is vulnerable to symlink attacks because it
calls:

(1) lchown() which follows symbolic links for all path elements but the
    rightmost one
(2) local_set_xattr()->setxattr() which follows symbolic links for all
    path elements
(3) local_set_mapped_file_attr() which calls in turn local_fopen() and
    mkdir(), both functions following symbolic links for all path
    elements but the rightmost one

This patch converts local_chown() to rely on open_nofollow() and
fchownat() to fix (1), as well as local_set_xattrat() and
local_set_mapped_file_attrat() to fix (2) and (3) respectively.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit d369f20763)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
42fff2efbf 9pfs: local: chmod: don't follow symlinks
The local_chmod() callback is vulnerable to symlink attacks because it
calls:

(1) chmod() which follows symbolic links for all path elements
(2) local_set_xattr()->setxattr() which follows symbolic links for all
    path elements
(3) local_set_mapped_file_attr() which calls in turn local_fopen() and
    mkdir(), both functions following symbolic links for all path
    elements but the rightmost one

We would need fchmodat() to implement AT_SYMLINK_NOFOLLOW to fix (1). This
isn't the case on linux unfortunately: the kernel doesn't even have a flags
argument to the syscall :-\ It is impossible to fix it in userspace in
a race-free manner. This patch hence converts local_chmod() to rely on
open_nofollow() and fchmod(). This fixes the vulnerability but introduces
a limitation: the target file must readable and/or writable for the call
to openat() to succeed.

It introduces a local_set_xattrat() replacement to local_set_xattr()
based on fsetxattrat() to fix (2), and a local_set_mapped_file_attrat()
replacement to local_set_mapped_file_attr() based on local_fopenat()
and mkdirat() to fix (3). No effort is made to factor out code because
both local_set_xattr() and local_set_mapped_file_attr() will be dropped
when all users have been converted to use the "at" versions.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e3187a45dd)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
66c1bca070 9pfs: local: link: don't follow symlinks
The local_link() callback is vulnerable to symlink attacks because it calls:

(1) link() which follows symbolic links for all path elements but the
    rightmost one
(2) local_create_mapped_attr_dir()->mkdir() which follows symbolic links
    for all path elements but the rightmost one

This patch converts local_link() to rely on opendir_nofollow() and linkat()
to fix (1), mkdirat() to fix (2).

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ad0b46e6ac)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
d5c103c2bb 9pfs: local: improve error handling in link op
When using the mapped-file security model, we also have to create a link
for the metadata file if it exists. In case of failure, we should rollback.

That's what this patch does.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6dd4b1f1d0)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
63b83ed531 9pfs: local: rename: use renameat
The local_rename() callback is vulnerable to symlink attacks because it
uses rename() which follows symbolic links in all path elements but the
rightmost one.

This patch simply transforms local_rename() into a wrapper around
local_renameat() which is symlink-attack safe.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit d2767edec5)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
5d26d9f2e7 9pfs: local: renameat: don't follow symlinks
The local_renameat() callback is currently a wrapper around local_rename()
which is vulnerable to symlink attacks.

This patch rewrites local_renameat() to have its own implementation, based
on local_opendir_nofollow() and renameat().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 99f2cf4b2d)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
a38dc42428 9pfs: local: lstat: don't follow symlinks
The local_lstat() callback is vulnerable to symlink attacks because it
calls:

(1) lstat() which follows symbolic links in all path elements but the
    rightmost one
(2) getxattr() which follows symbolic links in all path elements
(3) local_mapped_file_attr()->local_fopen()->openat(O_NOFOLLOW) which
    follows symbolic links in all path elements but the rightmost
    one

This patch converts local_lstat() to rely on opendir_nofollow() and
fstatat(AT_SYMLINK_NOFOLLOW) to fix (1), fgetxattrat_nofollow() to
fix (2).

A new local_fopenat() helper is introduced as a replacement to
local_fopen() to fix (3). No effort is made to factor out code
because local_fopen() will be dropped when all users have been
converted to call local_fopenat().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f9aef99b3e)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602. Patch
modified to also fix an existing error path to do proper cleanup.]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
a601a931a9 9pfs: local: readlink: don't follow symlinks
The local_readlink() callback is vulnerable to symlink attacks because it
calls:

(1) open(O_NOFOLLOW) which follows symbolic links for all path elements but
    the rightmost one
(2) readlink() which follows symbolic links for all path elements but the
    rightmost one

This patch converts local_readlink() to rely on open_nofollow() to fix (1)
and opendir_nofollow(), readlinkat() to fix (2).

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit bec1e9546e)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
fb55b5349f 9pfs: local: truncate: don't follow symlinks
The local_truncate() callback is vulnerable to symlink attacks because
it calls truncate() which follows symbolic links in all path elements.

This patch converts local_truncate() to rely on open_nofollow() and
ftruncate() instead.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ac125d993b)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
d166a4bdd5 9pfs: local: statfs: don't follow symlinks
The local_statfs() callback is vulnerable to symlink attacks because it
calls statfs() which follows symbolic links in all path elements.

This patch converts local_statfs() to rely on open_nofollow() and fstatfs()
instead.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 31e51d1c15)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
44ceec5b20 9pfs: local: utimensat: don't follow symlinks
The local_utimensat() callback is vulnerable to symlink attacks because it
calls qemu_utimens()->utimensat(AT_SYMLINK_NOFOLLOW) which follows symbolic
links in all path elements but the rightmost one or qemu_utimens()->utimes()
which follows symbolic links for all path elements.

This patch converts local_utimensat() to rely on opendir_nofollow() and
utimensat(AT_SYMLINK_NOFOLLOW) directly instead of using qemu_utimens().
It is hence assumed that the OS supports utimensat(), i.e. has glibc 2.6
or higher and linux 2.6.22 or higher, which seems reasonable nowadays.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit a33eda0dd9)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
c04a6aae0c 9pfs: local: remove: don't follow symlinks
The local_remove() callback is vulnerable to symlink attacks because it
calls:

(1) lstat() which follows symbolic links in all path elements but the
    rightmost one
(2) remove() which follows symbolic links in all path elements but the
    rightmost one

This patch converts local_remove() to rely on opendir_nofollow(),
fstatat(AT_SYMLINK_NOFOLLOW) to fix (1) and unlinkat() to fix (2).

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit a0e640a872)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
b3e4d49cc4 9pfs: local: unlinkat: don't follow symlinks
The local_unlinkat() callback is vulnerable to symlink attacks because it
calls remove() which follows symbolic links in all path elements but the
rightmost one.

This patch converts local_unlinkat() to rely on opendir_nofollow() and
unlinkat() instead.

Most of the code is moved to a separate local_unlinkat_common() helper
which will be reused in a subsequent patch to fix the same issue in
local_remove().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit df4938a665)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
a47953ad34 9pfs: local: lremovexattr: don't follow symlinks
The local_lremovexattr() callback is vulnerable to symlink attacks because
it calls lremovexattr() which follows symbolic links in all path elements
but the rightmost one.

This patch introduces a helper to emulate the non-existing fremovexattrat()
function: it is implemented with /proc/self/fd which provides a trusted
path that can be safely passed to lremovexattr().

local_lremovexattr() is converted to use this helper and opendir_nofollow().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 72f0d0bf51)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
43397f9f2c 9pfs: local: lsetxattr: don't follow symlinks
The local_lsetxattr() callback is vulnerable to symlink attacks because
it calls lsetxattr() which follows symbolic links in all path elements but
the rightmost one.

This patch introduces a helper to emulate the non-existing fsetxattrat()
function: it is implemented with /proc/self/fd which provides a trusted
path that can be safely passed to lsetxattr().

local_lsetxattr() is converted to use this helper and opendir_nofollow().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 3e36aba757)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
15ba579486 9pfs: local: llistxattr: don't follow symlinks
The local_llistxattr() callback is vulnerable to symlink attacks because
it calls llistxattr() which follows symbolic links in all path elements but
the rightmost one.

This patch introduces a helper to emulate the non-existing flistxattrat()
function: it is implemented with /proc/self/fd which provides a trusted
path that can be safely passed to llistxattr().

local_llistxattr() is converted to use this helper and opendir_nofollow().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 5507904e36)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
d00af4715d 9pfs: local: lgetxattr: don't follow symlinks
The local_lgetxattr() callback is vulnerable to symlink attacks because
it calls lgetxattr() which follows symbolic links in all path elements but
the rightmost one.

This patch introduces a helper to emulate the non-existing fgetxattrat()
function: it is implemented with /proc/self/fd which provides a trusted
path that can be safely passed to lgetxattr().

local_lgetxattr() is converted to use this helper and opendir_nofollow().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 56ad3e54da)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
6173ab9591 9pfs: local: open/opendir: don't follow symlinks
The local_open() and local_opendir() callbacks are vulnerable to symlink
attacks because they call:

(1) open(O_NOFOLLOW) which follows symbolic links in all path elements but
    the rightmost one
(2) opendir() which follows symbolic links in all path elements

This patch converts both callbacks to use new helpers based on
openat_nofollow() to only open files and directories if they are
below the virtfs shared folder

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 996a0d76d7)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
a27ec4ad5b 9pfs: local: keep a file descriptor on the shared folder
This patch opens the shared folder and caches the file descriptor, so that
it can be used to do symlink-safe path walk.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 0e35a37829)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
1a9beb2f55 9pfs: introduce relative_openat_nofollow() helper
When using the passthrough security mode, symbolic links created by the
guest are actual symbolic links on the host file system.

Since the resolution of symbolic links during path walk is supposed to
occur on the client side. The server should hence never receive any path
pointing to an actual symbolic link. This isn't guaranteed by the protocol
though, and malicious code in the guest can trick the server to issue
various syscalls on paths whose one or more elements are symbolic links.
In the case of the "local" backend using the "passthrough" or "none"
security modes, the guest can directly create symbolic links to arbitrary
locations on the host (as per spec). The "mapped-xattr" and "mapped-file"
security modes are also affected to a lesser extent as they require some
help from an external entity to create actual symbolic links on the host,
i.e. another guest using "passthrough" mode for example.

The current code hence relies on O_NOFOLLOW and "l*()" variants of system
calls. Unfortunately, this only applies to the rightmost path component.
A guest could maliciously replace any component in a trusted path with a
symbolic link. This could allow any guest to escape a virtfs shared folder.

This patch introduces a variant of the openat() syscall that successively
opens each path element with O_NOFOLLOW. When passing a file descriptor
pointing to a trusted directory, one is guaranteed to be returned a
file descriptor pointing to a path which is beneath the trusted directory.
This will be used by subsequent patches to implement symlink-safe path walk
for any access to the backend.

Symbolic links aren't the only threats actually: a malicious guest could
change a path element to point to other types of file with undesirable
effects:
- a named pipe or any other thing that would cause openat() to block
- a terminal device which would become QEMU's controlling terminal

These issues can be addressed with O_NONBLOCK and O_NOCTTY.

Two helpers are introduced: one to open intermediate path elements and one
to open the rightmost path element.

Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(renamed openat_nofollow() to relative_openat_nofollow(),
 assert path is relative and doesn't contain '//',
 fixed side-effect in assert, Greg Kurz)
Signed-off-by: Greg Kurz <groug@kaod.org>

(cherry picked from commit 6482a96163)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
7c8fdacdd4 9pfs: remove side-effects in local_open() and local_opendir()
If these functions fail, they should not change *fs. Let's use local
variables to fix this.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 21328e1e57)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
fe9601182f 9pfs: remove side-effects in local_init()
If this function fails, it should not modify *ctx.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 00c90bd1c2)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
d8e24d9f27 9pfs: local: move xattr security ops to 9p-xattr.c
These functions are always called indirectly. It really doesn't make sense
for them to sit in a header file.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 56fc494bdc)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
60718e51d9 9pfs: fix off-by-one error in PDU free list
The server can handle MAX_REQ - 1 PDUs at a time and the virtio-9p
device has a MAX_REQ sized virtqueue. If the client manages to fill
up the virtqueue, pdu_alloc() will fail and the request won't be
processed without any notice to the client (it actually causes the
linux 9p client to hang).

This has been there since the beginning (commit 9f10751365 "virtio-9p:
Add a virtio 9p device to qemu"), but it needs an agressive workload to
run in the guest to show up.

We actually allocate MAX_REQ PDUs and I see no reason not to link them
all into the free list, so let's fix the init loop.

Reported-by: Tuomas Tynkkynen <tuomas@tuxera.com>
Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 0d78289c3d)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Halil Pasic
e19a47bf48 virtio: fix vq->inuse recalc after migr
Correct recalculation of vq->inuse after migration for the corner case
where the avail_idx has already wrapped but used_idx not yet.

Also change the type of the VirtQueue.inuse to unsigned int. This is
done to be consistent with other members representing sizes (VRing.num),
and because C99 guarantees max ring size < UINT_MAX but does not
guarantee max ring size < INT_MAX.

Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Fixes: bccdef6b ("virtio: recalculate vq->inuse after migration")
CC: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e66bcc4081)
[BR: BSC#1020928]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
548950c1ae 9pfs: fix crash when fsdev is missing
If the user passes -device virtio-9p without the corresponding -fsdev, QEMU
dereferences a NULL pointer and crashes.

This is a 2.8 regression introduced by commit 702dbcc274.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
(cherry picked from commit f2b58c4375)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Stefano Stabellini
3cc5d172ba 9pfs: move pdus to V9fsState
pdus are initialized and used in 9pfs common code. Move the array from
V9fsVirtioState to V9fsState.

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 583f21f8b9)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Li Qiang
dc62ce5e2f 9pfs: add cleanup operation in FileOperations
Currently, the backend of VirtFS doesn't have a cleanup
function. This will lead resource leak issues if the backed
driver allocates resources. This patch addresses this issue.

Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 702dbcc274)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
f7b5ea7211 9pfs: drop excessive error message from virtfs_reset()
The virtfs_reset() function is called either when the virtio-9p device
gets reset, or when the client starts a new 9P session. In both cases,
if it finds fids from a previous session, the following is printed in
the monitor:

9pfs:virtfs_reset: One or more uncluncked fids found during reset

For example, if a linux guest with a mounted 9P share is reset from the
monitor with system_reset, the message will be printed. This is excessive
since these fids are now clunked and the state is clean.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 79decce35b)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Li Qiang
eeb82dbacd 9pfs: fix integer overflow issue in xattr read/write
The v9fs_xattr_read() and v9fs_xattr_write() are passed a guest
originated offset: they must ensure this offset does not go beyond
the size of the extended attribute that was set in v9fs_xattrcreate().
Unfortunately, the current code implement these checks with unsafe
calculations on 32 and 64 bit values, which may allow a malicious
guest to cause OOB access anyway.

Fix this by comparing the offset and the xattr size, which are
both uint64_t, before trying to compute the effective number of bytes
to read or write.

Suggested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-By: Guido Günther <agx@sigxcpu.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 7e55d65c56)
[BR: CVE-2016-9104 BSC#1007493]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Li Qiang
6d2cb4ac00 9pfs: fix memory leak in v9fs_write
If an error occurs when marshalling the transfer length to the guest, the
v9fs_write() function doesn't free an IO vector, thus leading to a memory
leak. This patch fixes the issue.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
[groug, rephrased the changelog]
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit fdfcc9aeea)
[BR: CVE-2016-9106 BSC#1007495]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Li Qiang
13335933ae 9pfs: fix memory leak in v9fs_link
The v9fs_link() function keeps a reference on the source fid object. This
causes a memory leak since the reference never goes down to 0. This patch
fixes the issue.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
[groug, rephrased the changelog]
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 4c1586787f)
[BR: CVE-2016-9105 BSC#1007494]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Li Qiang
1e7e415ed2 9pfs: fix memory leak in v9fs_xattrcreate
The 'fs.xattr.value' field in V9fsFidState object doesn't consider the
situation that this field has been allocated previously. Every time, it
will be allocated directly. This leads to a host memory leak issue if
the client sends another Txattrcreate message with the same fid number
before the fid from the previous time got clunked.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
[groug, updated the changelog to indicate how the leak can occur]
Signed-off-by: Greg Kurz <groug@kaod.org>

(cherry picked from commit ff55e94d23)
[BR: CVE-2016-9102 BSC#1007450]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Li Qiang
702e7c4530 9pfs: fix information leak in xattr read
9pfs uses g_malloc() to allocate the xattr memory space, if the guest
reads this memory before writing to it, this will leak host heap memory
to the guest. This patch avoid this.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit eb68760285)
[BR: CVE-2016-9103 BSC#1007454]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
b6c386dfc0 virtio-9p: add reset handler
Virtio devices should implement the VirtIODevice->reset() function to
perform necessary cleanup actions and to bring the device to a quiescent
state.

In the case of the virtio-9p device, this means:
- emptying the list of active PDUs (i.e. draining all in-flight I/O)
- freeing all fids (i.e. close open file descriptors and free memory)

That's what this patch does.

The reset handler first waits for all active PDUs to complete. Since
completion happens in the QEMU global aio context, we just have to
loop around aio_poll() until the active list is empty.

The freeing part involves some actions to be performed on the backend,
like closing file descriptors or flushing extended attributes to the
underlying filesystem. The virtfs_reset() function already does the
job: it calls free_fid() for all open fids not involved in an ongoing
I/O operation. We are sure this is the case since we have drained
the PDU active list.

The current code implements all backend accesses with coroutines, but we
want to stay synchronous on the reset path. We can either change the
current code to be able to run when not in coroutine context, or create
a coroutine context and wait for virtfs_reset() to complete. This patch
goes for the latter because it results in simpler code.

Note that we also need to create a dummy PDU because it is also an API
to pass the FsContext pointer to all backend callbacks.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 0e44a0fd3f)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
36bfafa37e 9pfs: only free completed request if not flushed
If a PDU has a flush request pending, the current code calls pdu_free()
twice:

1) pdu_complete()->pdu_free() with pdu->cancelled set, which does nothing

2) v9fs_flush()->pdu_free() with pdu->cancelled cleared, which moves the
   PDU back to the free list.

This works but it complexifies the logic of pdu_free().

With this patch, pdu_complete() only calls pdu_free() if no flush request
is pending, i.e. qemu_co_queue_next() returns false.

Since pdu_free() is now supposed to be called with pdu->cancelled cleared,
the check in pdu_free() is dropped and replaced by an assertion.

Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit f74e27bf0f)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
a9f2ace76d 9pfs: drop useless check in pdu_free()
Out of the three users of pdu_free(), none ever passes a NULL pointer to
this function.

Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 6868a420c5)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Li Qiang
dee5157d53 9pfs: fix potential host memory leak in v9fs_read
In 9pfs read dispatch function, it doesn't free two QEMUIOVector
object thus causing potential memory leak. This patch avoid this.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit e95c9a493a)
[BR: CVE-2016-8577 BSC#1003893]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Li Qiang
0256b94251 9pfs: allocate space for guest originated empty strings
If a guest sends an empty string paramater to any 9P operation, the current
code unmarshals it into a V9fsString equal to { .size = 0, .data = NULL }.

This is unfortunate because it can cause NULL pointer dereference to happen
at various locations in the 9pfs code. And we don't want to check str->data
everywhere we pass it to strcmp() or any other function which expects a
dereferenceable pointer.

This patch enforces the allocation of genuine C empty strings instead, so
callers don't have to bother.

Out of all v9fs_iov_vunmarshal() users, only v9fs_xattrwalk() checks if
the returned string is empty. It now uses v9fs_string_size() since
name.data cannot be NULL anymore.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
[groug, rewritten title and changelog,
 fix empty string check in v9fs_xattrwalk()]
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit ba42ebb863)
[BR: CVE-2016-8578 BSC#1003894]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
2474863600 9pfs: fix potential segfault during walk
If the call to fid_to_qid() returns an error, we will call v9fs_path_free()
on uninitialized paths.

It is a regression introduced by the following commit:

56f101ecce 9pfs: handle walk of ".." in the root directory

Let's fix this by initializing dpath and path before calling fid_to_qid().

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[groug: updated the changelog to indicate this is regression and to provide
        the offending commit SHA1]
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 13fd08e631)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
357adfcd9e 9pfs: introduce v9fs_path_sprintf() helper
This helper is similar to v9fs_string_sprintf(), but it includes the
terminating NUL character in the size field.

This is to avoid doing v9fs_string_sprintf((V9fsString *) &path) and
then bumping the size.

Affected users are changed to use this new helper.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit e3e83f2e21)
[BR: support patch for BSC#1034866]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
5f7e165eda 9pfs: handle walk of ".." in the root directory
The 9P spec at http://man.cat-v.org/plan_9/5/intro says:

All directories must support walks to the directory .. (dot-dot) meaning
parent directory, although by convention directories contain no explicit
entry for .. or . (dot).  The parent of the root directory of a server's
tree is itself.

This means that a client cannot walk further than the root directory
exported by the server. In other words, if the client wants to walk
"/.." or "/foo/../..", the server should answer like the request was
to walk "/".

This patch just does that:
- we cache the QID of the root directory at attach time
- during the walk we compare the QID of each path component with the root
  QID to detect if we're in a "/.." situation
- if so, we skip the current component and go to the next one

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 56f101ecce)
[BR: CVE-2016-7116 BSC#996441]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Stefan Hajnoczi
fcfd6ebc66 virtio: zero vq->inuse in virtio_reset()
vq->inuse must be zeroed upon device reset like most other virtqueue
fields.

In theory, virtio_reset() just needs assert(vq->inuse == 0) since
devices must clean up in-flight requests during reset (requests cannot
not be leaked!).

In practice, it is difficult to achieve vq->inuse == 0 across reset
because balloon, blk, 9p, etc implement various different strategies for
cleaning up requests.  Most devices call g_free(elem) directly without
telling virtio.c that the VirtQueueElement is cleaned up.  Therefore
vq->inuse is not decremented during reset.

This patch zeroes vq->inuse and trusts that devices are not leaking
VirtQueueElements across reset.

I will send a follow-up series that refactors request life-cycle across
all devices and converts vq->inuse = 0 into assert(vq->inuse == 0) but
this more invasive approach is not appropriate for stable trees.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Ladi Prosek <lprosek@redhat.com>
(cherry picked from commit 4b7f91ed02)
[BR: BSC#1015048]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Stefan Hajnoczi
d3b42bcf40 virtio: decrement vq->inuse in virtqueue_discard()
virtqueue_discard() moves vq->last_avail_idx back so the element can be
popped again.  It's necessary to decrement vq->inuse to avoid "leaking"
the element count.

Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 58a83c6149)
[BR: BSC#1015048]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Stefan Hajnoczi
6881b77ddf virtio: recalculate vq->inuse after migration
The vq->inuse field is not migrated.  Many devices don't hold
VirtQueueElements across migration so it doesn't matter that vq->inuse
starts at 0 on the destination QEMU.

At least virtio-serial, virtio-blk, and virtio-balloon migrate while
holding VirtQueueElements.  For these devices we need to recalculate
vq->inuse upon load so the value is correct.

Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit bccdef6b1a)
[BR: BSC#1015048]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
a67b8ddfb3 9p: introduce the V9fsDir type
If we are to switch back to readdir(), we need a more complex type than
DIR * to be able to serialize concurrent accesses to the directory stream.

This patch introduces a placeholder type and fixes all users.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
(cherry picked from commit f314ea4e30)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Vincenzo Maffione
a93cf8810d virtio: cache used_idx in a VirtQueue field
Accessing used_idx in the VQ requires an expensive access to
guest physical memory. Before this patch, 3 accesses are normally
done for each pop/push/notify call. However, since the used_idx is
only written by us, we can track it in our internal data structure.

Signed-off-by: Vincenzo Maffione <v.maffione@gmail.com>
Message-Id: <3d062ec54e9a7bf9fb325c1fd693564951f2b319.1450218353.git.v.maffione@gmail.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit b796fcd1bf)
[BR: BSC#1015048]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
3791dc6525 9pfs: introduce V9fsVirtioState
V9fsState now only contains generic fields. Introduce V9fsVirtioState
for virtio transport.  Change virtio-pci and virtio-ccw to use
V9fsVirtioState.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 00588a0aa2)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
02a91b01a3 9pfs: factor out v9fs_device_{,un}realize_common
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 2a0c56aa4c)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602. Note:
The previous reference to virtio_9p_get_features is left in place.]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
95bbacf64b 9pfs: rename virtio-9p.c to 9p.c
Now that file only contains generic code.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 60ce86c714)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
a7df3233ee 9pfs: rename virtio_9p_set_fd_limit to use v9fs_ prefix
It's not virtio specific.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
(cherry picked from commit 72a189770a)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
193790b694 9pfs: move handle_9p_output and make it static function
It's only used in virtio device.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 0192cc5d79)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
1796e84666 9pfs: export pdu_{submit,alloc,free}
They will be used in later patches.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 4b311c5f0b)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
5eeb0ec8ac 9pfs: factor out virtio_9p_push_and_notify
The new function resides in virtio specific file.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
cherry picked from commit 0d3716b4e6)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
06de0af1de 9pfs: break out 9p.h from virtio-9p.h
Move out generic definitions from virtio-9p.h to 9p.h. Fix header
inclusions.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit ebe74f8ba2)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
08e06f889f 9pfs: break out virtio_init_iov_from_pdu
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 592707af7f)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
3198e15ff2 9pfs: factor out pdu_push_and_notify
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit f657b17a63)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
21fbcd8a76 9pfs: factor out virtio_pdu_{,un}marshal
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit fe9fa96d7c)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
dad4b21660 9pfs: make pdu_{,un}marshal proper functions
Factor out v9fs_iov_v{,un}marshal. Implement pdu_{,un}marshal with those
functions.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 0e2082d9e5)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
39c96419f7 9pfs: PDU processing functions should start pdu_ prefix
This matches naming convention of pdu_marshal and pdu_unmarshal.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit dc295f8353)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
2ee0a32795 9pfs: PDU processing functions don't need to take V9fsState as argument
V9fsState can be referenced by pdu->s. Initialise that in device
realization function.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit ad38ce9ed1)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
0951f62a59 fsdev: rename virtio-9p-marshal.{c,h} to 9p-iov-marshal.{c,h}
And rename v9fs_marshal to v9fs_iov_marshal, v9fs_unmarshal to
v9fs_iov_unmarshal.

The rationale behind this change is that, this marshalling interface is
used both by virtio and proxy helper. Renaming files and functions to
reflect the true nature of this interface.

Xen transport is going to have its own marshalling interface.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 2209bd050a)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
7184ee97ff fsdev: break out 9p-marshal.{c,h} from virtio-9p-marshal.{c,h}
Break out some generic functions for marshaling 9p state. Pure code
motion plus minor fixes for build system.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 829dd2861a)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
f50d9dbe8a 9pfs: remove dead code
Some structures in virtio-9p.h have been unused since 2011 when relevant
functions switched to use coroutines.

The declaration of pdu_packunpack and function do_pdu_unpack are
useless.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 71042cffc0)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
e9bb27cf74 9pfs: merge hw/virtio/virtio-9p.h into hw/9pfs/virtio-9p.h
The deleted file only contained V9fsConf which wasn't virtio specific.
Merge that to the general header of 9pfs.

Fixed header inclusions as I went along.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 756cb74a59)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
dc2ee725ff 9pfs: rename virtio-9p-xattr{,-user}.{c,h} to 9p-xattr{,-user}.{c,h}
These three files are not virtio specific. Rename them to generic
names.

Fix comments and header inclusion in various files.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 267ae092e2)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
5e4a9d648b 9pfs: rename virtio-9p-proxy.{c,h} to 9p-proxy.{c,h}
Those two files are not virtio specific. Rename them to use generic
names.

Fix includes in various C files. Change define guards and comments
in header files.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 494a8ebe71)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
b0dc4234f3 9pfs: rename virtio-9p-posix-acl.c to 9p-posix-acl.c
This file is not virtio specific. Rename it to use generic name.

Fix comment and remove unneeded inclusion of virtio.h.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit d57b78002c)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
7d0e489d5d 9pfs: rename virtio-9p-local.c to 9p-local.c
This file is not virtio specific. Rename it to use generic name.

Fix comment and remove unneeded inclusion of virtio.h.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit f00d4f596b)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Wei Liu
c5a2335cfe 9pfs: rename virtio-9p-handle.c to 9p-handle.c
This file is not virtio specific. Rename it to use generic name.

Fix comment and remove unneeded inclusion of virtio.h.

Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
(cherry picked from commit 3b9ca04653)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
465556e58e virtio-9p-device: add minimal unrealize handler
Since commit 4652f1640e "virtio-9p: add savevm
handlers", if the user hot-unplugs a quiescent 9p device and live
migrates, the source QEMU crashes before migration completetion...
This happens because virtio-9p devices have a realize handler which
calls virtio_init() and register_savevm().  Both calls store pointers
to the device internals, that get dereferenced during migration even
if the device got unplugged.

This patch simply adds an unrealize handler to perform minimal
cleanup and avoid the crash.  Hot unplug of non-quiescent 9p devices
is still not supported in QEMU, and not supported by linux guests
either.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 20151208155457.27775.69441.stgit@bahia.huguette.org
[PMM: rewrapped long lines in commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 6cecf09373)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Bruce Rogers
c3e0978a73 coroutine: move into libqemuutil.a library
The coroutine files are currently referenced by the block-obj-y
variable. The coroutine functionality though is already used by
more than just the block code. eg migration code uses coroutine
yield. In the future the I/O channel code will also use the
coroutine yield functionality. Since the coroutine code is nicely
self-contained it can be easily built as part of the libqemuutil.a
library, making it widely available.

The headers are also moved into include/qemu, instead of the
include/block directory, since they are now part of the util
codebase, and the impl was never in the block/ directory
either.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602
In place of upstream commit 10817bf09d,
we just simulate the generalization of the qemu headers by creating
a include/qemu/coroutine.h, which simply includes include/block/
coroutine.h, nothing more. That satisfies our very limited needs]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Shannon Zhao
b929e52924 virtio-9p-device: move qdev properties into virtio-9p-device.c
As only one place in virtio-9p-device.c uses
DEFINE_VIRTIO_9P_PROPERTIES, there is no need to expose it. Inline it
into virtio-9p-device.c to avoid wrongly use.

Signed-off-by: Shannon Zhao <zhaoshenglong@huawei.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 83a84878da)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Michael S. Tsirkin
06565c79b3 virtio-9p: use standard headers
Drop code duplicated from standard headers.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
(cherry picked from commit 8744a6a8d5)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Michael S. Tsirkin
e6cd749943 virtio: use standard-headers
Drop a bunch of code duplicated from virtio_config.h and virtio_ring.h.
This makes us rename event index accessors which conflict,
as reusing the ones from virtio_ring.h isn't trivial.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
(cherry picked from commit e9600c6ca9)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602. Only
the part of the commit which seemed essential for the backport was
cherry-picked]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Michael S. Tsirkin
da2a529b13 virtio: use standard virtio_ring.h
Switch to virtio_ring.h from standard headers.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 4fbe0f322d)
[BR: Fix and/or infrastructure for BSC#1038396]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Michael S. Tsirkin
44dd6205ae include: import virtio headers from linux 4.0
Add files imported from linux-next (what will become linux 4.0) using
scripts/update-linux-headers.sh

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 9fbe302b2a)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Gonglei
8d73b914bf virtio-9p: use aliases instead of duplicate qdev properties
virtio-9p-pci all duplicate the qdev properties of their
V9fsState child. This approach does not work well with
string or pointer properties since we must be careful
about leaking or double-freeing them.

Use the QOM alias property to forward property accesses to the
V9fsState child.  This way no duplication is necessary.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 48833071d9)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Stefan Hajnoczi
f7ed5892ce qdev: add qdev_alias_all_properties()
The qdev_alias_all_properties() function creates QOM alias properties
for each qdev property on a DeviceState.  This is useful for parent
objects that wish to forward property accesses to their children.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
(cherry picked from commit 67cc7e0aac)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Stefan Hajnoczi
38aa4e05f4 qom: add object_property_add_alias()
Sometimes an object needs to present a property which is actually on
another object, or it needs to provide an alias name for an existing
property.

Examples:
  a.foo -> b.foo
  a.old_name -> a.new_name

The new object_property_add_alias() API allows objects to alias a
property on the same object or another object.  The source and target
names can be different.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
(cherry picked from commit ef7c7ff6d4)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Rusty Russell
7eaaa4117b virtio: allow byte swapping for vring
Quoting original text from Rusty: "This is based on a simpler patch by Anthony
Liguouri".

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
[ add VirtIODevice * argument to most helpers,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit cee3ca0028)
[BR: Fix and/or infrastructure for BSC#1038396]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
9947983130 virtio: memory accessors for endian-ambivalent targets
This is the virtio-access.h header file taken from Rusty's "endian-ambivalent
targets using legacy virtio" patch. It introduces helpers that should be used
when accessing vring data or by drivers for data that contains headers.
The virtio config space is also target endian, but the current code already
handles that with the virtio_is_big_endian() helper. There is no obvious
benefit at using the virtio accessors in this case.

Now we have two distinct paths: a fast inline one for fixed endian targets,
and a slow out-of-line one for targets that define the new TARGET_IS_BIENDIAN
macro.

Signed-off-by: Rusty Russell <rusty@rustcorp.com.au>
[ relicensed virtio-access.h to GPLv2+ on Rusty's request,
  pass &address_space_memory to physical memory accessors,
  per-device endianness,
  virtio tswap16 and tswap64 helpers,
  faspath for fixed endian targets,
  Greg Kurz <gkurz@linux.vnet.ibm.com> ]
Cc: Cédric Le Goater <clg@fr.ibm.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 0f5d1d2a49)
[BR: Fix and/or infrastructure for BSC#1038396]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
430828e8b3 virtio: add endian-ambivalent support to VirtIODevice
Some CPU families can dynamically change their endianness. This means we
can have little endian ppc or big endian arm guests for example. This has
an impact on legacy virtio data structures since they are target endian.
We hence introduce a new property to track the endianness of each virtio
device. It is reasonnably assumed that endianness won't change while the
device is in use : we hence capture the device endianness when it gets
reset.

We migrate this property in a subsection, after the device descriptor. This
means the load code must not rely on it until it is restored. As a consequence,
the vring sanity checks had to be moved after the call to vmstate_load_state().
We enforce paranoia by poisoning the property at the begining of virtio_load().

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 616a655219)
[BR: Fix and/or infrastructure for BSC#1038396]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Greg Kurz
3e6da8d653 exec: introduce target_words_bigendian() helper
We currently have a virtio_is_big_endian() helper that provides the target
endianness to the virtio code. As of today, the helper returns a fixed
compile-time value. Of course, this will have to change if we want to
support target endianness changes at run-time.

Let's move the TARGET_WORDS_BIGENDIAN bits out to a new helper and have
virtio_is_big_endian() implemented on top of it.

This patch doesn't change any functionality.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 98ed8ecfc9)
[BR: Fix and/or infrastructure for BSC#1038396]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Peter Crosthwaite
70c46da6ed error: Add error_abort
Add a special Error * that can be passed to error handling APIs to
signal that any errors are fatal and should abort QEMU. There are two
advantages to this:

- allows for brevity when wishing to assert success of Error **
  accepting APIs. No need for this pattern:
        Error * local_err = NULL;
        api_call(foo, bar, &local_err);
        assert_no_error(local_err);
  This also removes the need for _nofail variants of APIs with
  asserting call sites now reduced to 1LOC.
- SIGABRT happens from within the offending API. When a fatal error
  occurs in an API call (when the caller is asserting sucess) failure
  often means the API itself is broken. With the abort happening in the
  API call now, the stack frames into the call are available at debug
  time. In the assert_no_error scheme the abort happens after the fact.

The exact semantic is that when an error is raised, if the argument
Error ** matches &error_abort, then the abort occurs immediately. The
error messaged is reported.

For error_propagate, if the destination error is &error_abort, then
the abort happens at propagation time.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit 5d24ee70bc)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Andreas Färber
6f6f454785 virtio: Convert exit to unrealize
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 306ec6c3ce)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602. Only
part of this patch is used, and it also retains some of the old
interface to allow old and new to co-exist]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Andreas Färber
bb37892302 virtio: Complete converting VirtioDevice to QOM realize
Drop VirtioDeviceClass::init.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0ba94b6f94)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602. Patch
altered to allow init to continue to be in use.]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Andreas Färber
32886d6ede virtio-9p: Convert to QOM realize
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 59be75227d)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Andreas Färber
4455b77144 virtio: Start converting VirtioDevice to QOM realize
Temporarily allow either VirtioDeviceClass::init or
VirtioDeviceClass::realize.

Introduce VirtioDeviceClass::unrealize for symmetry.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 1d244b42d2)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602. Patch
modified to maintain both old and new ways to init/realize]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Andreas Färber
d546524e1f virtio-9p: QOM realize preparations
Avoid unnecessary VIRTIO_DEVICE().

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0f3657ec36)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Paolo Bonzini
90ed6cfdab virtio-bus: cleanup plug/unplug interface
Right now we have these pairs:

- virtio_bus_plug_device/virtio_bus_destroy_device.  The first
  takes a VirtIODevice, the second takes a VirtioBusState

- device_plugged/device_unplug callbacks in the VirtioBusClass
  (here it's just the naming that is inconsistent)

- virtio_bus_destroy_device is not called by anyone (and since
  it calls qdev_free, it would be called by the proxies---but
  then the callback is useless since the proxies can do whatever
  they want before calling virtio_bus_destroy_device)

And there is a k->init but no k->exit, hence virtio_device_exit is
overwritten by subclasses (except virtio-9p).  This cleans it up by:

- renaming the device_unplug callback to device_unplugged

- renaming virtio_bus_plug_device to virtio_bus_device_plugged,
  matching the callback name

- renaming virtio_bus_destroy_device to virtio_bus_device_unplugged,
  removing the qdev_free, making it take a VirtIODevice and calling it
  from virtio_device_exit

- adding a k->exit callback

virtio_device_exit is still overwritten, the next patches will fix that.

Cc: qemu-stable@nongnu.org
Acked-by: Andreas Faerber <afaerber@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 5e96f5d2f8)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
yinyin
bb02c6a957 virtio: virtqueue_get_avail_bytes: fix desc_pa when loop over the indirect descriptor table
virtqueue_get_avail_bytes: when found a indirect desc, we need loop over it.
           /* loop over the indirect descriptor table */
           indirect = 1;
           max = vring_desc_len(desc_pa, i) / sizeof(VRingDesc);
           num_bufs = i = 0;
           desc_pa = vring_desc_addr(desc_pa, i);
But, It init i to 0, then use i to update desc_pa. so we will always get:
desc_pa = vring_desc_addr(desc_pa, 0);
the last two line should swap.

Cc: qemu-stable@nongnu.org
Signed-off-by: Yin Yin <yin.yin@cs2c.com.cn>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 1ae2757c6c)
[BR: Fix and/or infrastructure for BSC#1038396]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:17 -06:00
Andreas Färber
868047039e virtio-9p-device: Avoid freeing uninitialized memory
In virtio_9p_device_init() there are 6x goto out that will lead to
v9fs_path_free() attempting to free unitialized path.data field.
Easiest way to trigger is: qemu-system-x86_64 -device virtio-9p-pci

Fix this by moving v9fs_path_init() before any goto out.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Message-id: 1375315187-16534-1-git-send-email-afaerber@suse.de
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 27915efb97)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
M. Mohan Kumar
ac6fb856f2 hw/9pfs: Fix memory leak in error path
Fix few more memory leaks in virtio-9p-device.c detected using valgrind.

Signed-off-by: M. Mohan Kumar <mohan@in.ibm.com>
Message-id: 1372929678-14341-1-git-send-email-mohan@in.ibm.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 92304bf399)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
KONRAD Frederic
60babba4d3 virtio: add virtio_device_set_child_bus_name.
Add virtio_device_set_child_bus_name function.

It will be used with virtio-serial-x and virtio-scsi-x to set the
child bus name before calling virtio-x-device's init.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Message-id: 1367330931-12994-3-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 1034e9cf4d)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
KONRAD Frederic
a1a9e82370 virtio: drop unused function prototypes.
This removes the unused prototypes in virtio.h.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1366791683-5350-8-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit fca0a70cdb)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
KONRAD Frederic
8c90c1e6b8 virtio-9p: cleanup: QOM casts.
As the virtio-9p-pci is switched to the new API, we can use QOM casts.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1366708123-19626-5-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 13daf6cad0)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
KONRAD Frederic
121b959a2b virtio-9p: cleanup: init function.
This remove old init function as it is no longer needed.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1366708123-19626-4-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit e8111e5055)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
KONRAD Frederic
8663156735 virtio-9p-pci: switch to the new API.
Here the virtio-9p-pci is modified for the new API. The device
virtio-9p-pci extends virtio-pci. It creates and connects a
virtio-9p-device during the init. The properties are not changed.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1366708123-19626-3-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 234a336f9e)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
KONRAD Frederic
c3b9f7e9b8 virtio-9p: add the virtio-9p device.
Create virtio-9p-device which extends virtio-device, so it can be connected on
virtio-bus.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1366708123-19626-2-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit e7303c4303)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Hans de Goede
df41258a5e virtio-9p: Fix virtio-9p no longer building after hw-dirs branch merge
Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Message-id: 1365495755-10902-1-git-send-email-hdegoede@redhat.com
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 93b48c201e)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Paolo Bonzini
7d3bd60d64 hw: move headers to include/
Many of these should be cleaned up with proper qdev-/QOM-ification.
Right now there are many catch-all headers in include/hw/ARCH depending
on cpu.h, and this makes it necessary to compile these files per-target.
However, fixing this does not belong in these patches.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0d09e41a51)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602. This patch
is severely edited to only include the changes needed for this backport]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
KONRAD Frederic
d092dee64e virtio-pci: fix hot unplug.
Hot unplug failed because it tried to free the virtio device two times.

This fix the issue by removing the call to virtio_bus_destroy_device.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1363624648-16906-4-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 10479a8089)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
KONRAD Frederic
b63bdb8817 virtio-x-bus: fix allow_hotplug assertion.
This set allow_hotplug for each existing virtio-x-bus, allowing the
refactored devices to be hot pluggable.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1363624648-16906-3-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit cbd19063e7)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
KONRAD Frederic
dc635dfd63 virtio: make virtio device's structures public.
These structures must be made public to avoid two memory allocations for
refactored virtio devices.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Message-id: 1363624648-16906-2-git-send-email-fred.konrad@greensocs.com

Changes V4 <- V3:
   * Rebased on current git.

Changes V3 <- V2:
    * Style correction spotted by Andreas (virtio-scsi.h).
    * Style correction for virtio-net.h.

Changes V2 <- V1:
    * Move the dataplane include into the header (virtio-blk).
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit f1b24e840f)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Stefan Hajnoczi
8afcf31a2a main-loop: add qemu_get_aio_context()
It is very useful to get the main loop AioContext, which is a static
variable in main-loop.c.

I'm not sure whether qemu_get_aio_context() will be necessary in the
future once devices focus on using their own AioContext instead of the
main loop AioContext, but for now it allows us to refactor code to
support multiple AioContext while actually passing the main loop
AioContext.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 5f3aa1ff47)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Paolo Bonzini
7d4b291c57 hw: include hw header files with full paths
Done with this script:

cd hw
for i in `find . -name '*.h' | sed 's/^..//'`; do
  echo '\,^#.*include.*["<]'$i'[">], s,'$i',hw/&,'
done | sed -i -f - `find . -type f`

This is so that paths remain valid as files are moved.

Instead, files in hw/dataplane are referenced with the relative path.
We know they are not going to move to include/, and they are the only
include files that are in subdirectories _and_ move.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 83c9f4ca79)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602. Very
severely trimmed to only include what is needful for this backporting
effort]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Paolo Bonzini
8d465a0d5c virtio-9p: remove PCI dependencies from hw/9pfs/
Also move the 9p.h file to 9pfs/virtio-9p-device.h, for consistency
with the corresponding .c file.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 60653b28f5)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Paolo Bonzini
55742cff0b virtio-9p: use CONFIG_VIRTFS, not CONFIG_LINUX
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7e6b14dfb5)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
332d63dad6 scsi: make default command timeout user-settable
Per default any SCSI commands are sent with an infinite timeout,
which essentially disables any command abort mechanism on the
host and causes the guest to stall.
This patch adds a new option 'timeout' for scsi-generic and
scsi-disk which allows the user to set the timeout value to
something sensible.

Instead of disabling command aborts by setting the command timeout
to infinity we should be setting it to '0' per default, allowing
the host to fall back to its default values.

Signed-off-by: Hannes Reinecke <hare@suse.com>
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-18 17:15:16 -06:00
Ulrich Obergfell
2c4624a124 scsi-disk: fix bug in scsi_block_new_request() introduced by commit 137745c
This patch fixes a bug in scsi_block_new_request() that was introduced
by commit 137745c5c6. If the host cache
is used - i.e. if BDRV_O_NOCACHE is _not_ set - the 'break' statement
needs to be executed to 'fall back' to SG_IO.

Cc: qemu-stable@nongnu.org
Signed-off-by: Ulrich Obergfell <uobergfe@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 2fe5a9f73b)
[LM: BSC#1031051]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-18 17:15:16 -06:00
Paolo Bonzini
8e6e2e8155 megasas: fix guest-triggered memory leak
If the guest sets the sglist size to a value >=2GB, megasas_handle_dcmd
will return MFI_STAT_MEMORY_NOT_AVAILABLE without freeing the memory.
Avoid this by returning only the status from map_dcmd, and loading
cmd->iov_size in the caller.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 765a707000)
[BR: CVE-2017-5856 BSC#1023053]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Li Qiang
bb43e90ce0 watchdog: 6300esb: add exit function
When the Intel 6300ESB watchdog is hot unplug. The timer allocated
in realize isn't freed thus leaking memory leak. This patch avoid
this through adding the exit function.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-Id: <583cde9c.3223ed0a.7f0c2.886e@mx.google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit eb7a20a361)
[BR: CVE-2016-10155 BSC#1021129]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
21aa22ccb3 usb: ccid: check ccid apdu length
CCID device emulator uses Application Protocol Data Units(APDU)
to exchange command and responses to and from the host.
The length in these units couldn't be greater than 65536. Add
check to ensure the same. It'd also avoid potential integer
overflow in emulated_apdu_from_guest.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20170202192228.10847-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit c7dfbf3225)
[BR: CVE-2017-5898 BSC#1023907]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Gerd Hoffmann
a84a506957 cirrus: add blit_is_unsafe call to cirrus_bitblt_cputovideo
CIRRUS_BLTMODE_MEMSYSSRC blits do NOT check blit destination
and blit width, at all.  Oops.  Fix it.

Security impact: high.

The missing blit destination check allows to write to host memory.
Basically same as CVE-2014-8106 for the other blit variants.

The missing blit width check allows to overflow cirrus_bltbuf,
with the attractive target cirrus_srcptr (current cirrus_bltbuf write
position) being located right after cirrus_bltbuf in CirrusVGAState.

Due to cirrus emulation writing cirrus_bltbuf bytewise the attacker
hasn't full control over cirrus_srcptr though, only one byte can be
changed.  Once the first byte has been modified further writes land
elsewhere.

[ This is CVE-2017-2620 / XSA-209  - Ian Jackson ]

Fixed compilation by removing extra parameter to blit_is_unsafe. -iwj

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
[BR: BSC#1024972]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Gerd Hoffmann
5deab92cdb cirrus: fix patterncopy checks
The blit_region_is_unsafe checks don't work correctly for the
patterncopy source.  It's a fixed-sized region, which doesn't
depend on cirrus_blt_{width,height}.  So go do the check in
cirrus_bitblt_common_patterncopy instead, then tell blit_is_unsafe that
it doesn't need to verify the source.  Also handle the case where we
blit from cirrus_bitbuf correctly.

This patch replaces 5858dd1801.

Security impact:  I think for the most part error on the safe side this
time, refusing blits which should have been allowed.

Only exception is placing the blit source at the end of the video ram,
so cirrus_blt_srcaddr + 256 goes beyond the end of video memory.  But
even in that case I'm not fully sure this actually allows read access to
host memory.  To trick the commit 5858dd18 security checks one has to
pick very small cirrus_blt_{width,height} values, which in turn implies
only a fraction of the blit source will actually be used.

Cc: Wolfgang Bumiller <w.bumiller@proxmox.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-id: 1486645341-5010-1-git-send-email-kraxel@redhat.com
(cherry picked from commit 95280c31cd)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Li Qiang
05909e4bfb cirrus: fix oob access issue (CVE-2017-2615)
When doing bitblt copy in backward mode, we should minus the
blt width first just like the adding in the forward mode. This
can avoid the oob access of the front of vga's vram.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>

{ kraxel: with backward blits (negative pitch) addr is the topmost
          address, so check it as-is against vram size ]

Cc: qemu-stable@nongnu.org
Cc: P J P <ppandit@redhat.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Wolfgang Bumiller <w.bumiller@proxmox.com>
Fixes: d3532a0db0 (CVE-2014-8106)
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1485938101-26602-1-git-send-email-kraxel@redhat.com
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
(cherry picked from commit 62d4c6bd52)
[BR: CVE-2017-2615 BSC#1023004]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Paolo Bonzini
b0f2dfb575 cirrus_vga: fix off-by-one in blit_region_is_unsafe
The "max" value is being compared with >=, but addr + width points to
the first byte that will _not_ be copied.  Laszlo suggested using a
"greater than" comparison, instead of subtracting one like it is
already done above for the height, so that max remains always positive.

The mistake is "safe"---it will reject some blits, but will never cause
out-of-bounds writes.

Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 1455121059-18280-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit d2ba7ecb34)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Gerd Hoffmann
57bf00d878 cirrus: fix blit address mask handling
Apply the cirrus_addr_mask to cirrus_blt_dstaddr and cirrus_blt_srcaddr
right after assigning them, in cirrus_bitblt_start(), instead of having
this all over the place in the cirrus code, and missing a few places.

Reported-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1485338996-17095-1-git-send-email-kraxel@redhat.com
(cherry picked from commit 60cd23e851)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Wolfgang Bumiller
a9fd1684ae cirrus: handle negative pitch in cirrus_invalidate_region()
cirrus_invalidate_region() calls memory_region_set_dirty()
on a per-line basis, always ranging from off_begin to
off_begin+bytesperline. With a negative pitch off_begin
marks the top most used address and thus we need to do an
initial shift backwards by a line for negative pitches of
backward blits, otherwise the first iteration covers the
line going from the start offset forwards instead of
backwards.
Additionally since the start address is inclusive, if we
shift by a full `bytesperline` we move to the first address
*not* included in the blit, so we only shift by one less
than bytesperline.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Message-id: 1485352137-29367-1-git-send-email-w.bumiller@proxmox.com

[ kraxel: codestyle fixes ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit f153b563f8)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Bruce Rogers
4e95d1c44e display: cirrus: ignore source pitch value as needed in blit_is_unsafe
Commit 4299b90 added a check which is too broad, given that the source
pitch value is not required to be initialized for solid fill operations.
This patch refines the blit_is_unsafe() check to ignore source pitch in
that case. After applying the above commit as a security patch, we
noticed the SLES 11 SP4 guest gui failed to initialize properly.

Signed-off-by: Bruce Rogers <brogers@suse.com>
Message-id: 20170109203520.5619-1-brogers@suse.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 913a87885f)
[BR: BSC#1016779]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
cba145c494 display: cirrus: check vga bits per pixel(bpp) value
In Cirrus CLGD 54xx VGA Emulator, if cirrus graphics mode is VGA,
'cirrus_get_bpp' returns zero(0), which could lead to a divide
by zero error in while copying pixel data. The same could occur
via blit pitch values. Add check to avoid it.

Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1476776717-24807-1-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 4299b90e9b)
[BR: CVE-2016-9921 CVE-2016-9922 BSC#1014702 BSC#1015169]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Li Qiang
75a2db63a9 usbredir: free vm_change_state_handler in usbredir destroy dispatch
In usbredir destroy dispatch function, it doesn't free the vm change
state handler once registered in usbredir_realize function. This will
lead a memory leak issue. This patch avoid this.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 58216976.d0236b0a.77b99.bcd6@mx.google.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 07b026fd82)
[BR: CVE-2016-9907 BSC#1014109]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Li Qiang
17a95a18d9 usb: ehci: fix memory leak in ehci_init_transfer
In ehci_init_transfer function, if the 'cpage' is bigger than 4,
it doesn't free the 'p->sgl' once allocated previously thus leading
a memory leak issue. This patch avoid this.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 5821c0f4.091c6b0a.e0c92.e811@mx.google.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 791f97758e)
[BR: CVE-2016-9911 BSC#1014111]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
25343fb99a net: mcf: check receive buffer size register value
ColdFire Fast Ethernet Controller uses a receive buffer size
register(EMRBR) to hold maximum size of all receive buffers.
It is set by a user before any operation. If it was set to be
zero, ColdFire emulator would go into an infinite loop while
receiving data in mcf_fec_receive. Add check to avoid it.

Reported-by: Wjjzhang <wjjzhang@tencent.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 77d54985b8)
[BR: CVE-2016-9776 BSC#1013285]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
P J P
1fcbe9577b dma: rc4030: limit interval timer reload value
The JAZZ RC4030 chipset emulator has a periodic timer and
associated interval reload register. The reload value is used
as divider when computing timer's next tick value. If reload
value is large, it could lead to divide by zero error. Limit
the interval reload value to avoid it.

Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[BR: CVE-2016-8667 BSC#1004702]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
30f303353c audio: intel-hda: check stream entry count during transfer
Intel HDA emulator uses stream of buffers during DMA data
transfers. Each entry has buffer length and buffer pointer
position, which are used to derive bytes to 'copy'. If this
length and buffer pointer were to be same, 'copy' could be
set to zero(0), leading to an infinite loop. Add check to
avoid it.

Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1476949224-6865-1-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 0c0fc2b5fd)
[BR: CVE-2016-8909 BSC#1006536]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
e7e9758890 net: rtl8139: limit processing of ring descriptors
RTL8139 ethernet controller in C+ mode supports multiple
descriptor rings, each with maximum of 64 descriptors. While
processing transmit descriptor ring in 'rtl8139_cplus_transmit',
it does not limit the descriptor count and runs forever. Add
check to avoid it.

Reported-by: Andrew Henderson <hendersa@icculus.org>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit c7c3591669)
[BR: CVE-2016-8910 BSC#1006538]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Li Qiang
faea87b9ea net: eepro100: fix memory leak in device uninit
The exit dispatch of eepro100 network card device doesn't free
the 's->vmstate' field which was allocated in device realize thus
leading a host memory leak. This patch avoid this.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 2634ab7fe2)
[BR: CVE-2016-9101 BSC#1007391]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
77cee1b17f net: pcnet: check rx/tx descriptor ring length
The AMD PC-Net II emulator has set of control and status(CSR)
registers. Of these, CSR76 and CSR78 hold receive and transmit
descriptor ring length respectively. This ring length could range
from 1 to 65535. Setting ring length to zero leads to an infinite
loop in pcnet_rdra_addr() or pcnet_transmit(). Add check to avoid it.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 34e29ce754)
[BR: CVE-2016-7909 BSC#1002557]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
b89561e8b9 char: serial: check divider value against baud base
16550A UART device uses an oscillator to generate frequencies
(baud base), which decide communication speed. This speed could
be changed by dividing it by a divider. If the divider is
greater than the baud base, speed is set to zero, leading to a
divide by zero error. Add check to avoid it.

Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1476251888-20238-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 3592fe0c91)
[BR: CVE-2016-8669 BSC#1004707]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Gerd Hoffmann
da70237f21 xhci: limit the number of link trbs we are willing to process
Needed to avoid we run in circles forever in case the guest builds
an endless loop with link trbs.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Tested-by: P J P <ppandit@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1476096382-7981-1-git-send-email-kraxel@redhat.com
(cherry picked from commit 05f43d44e4)
[BR: CVE-2016-8576 BSC#1003878]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
e5d8696ae8 net: mcf: limit buffer descriptor count
ColdFire Fast Ethernet Controller uses buffer descriptors to manage
data flow to/fro receive & transmit queues. While transmitting
packets, it could continue to read buffer descriptors if a buffer
descriptor has length of zero and has crafted values in bd.flags.
Set upper limit to number of buffer descriptors.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 070c4b92b8)
[BR: CVE-2016-7908 BSC#1002550]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
c3f64a4897 vmsvga: correct bitmap and pixmap size checks
When processing svga command DEFINE_CURSOR in vmsvga_fifo_run,
the computed BITMAP and PIXMAP size are checked against the
'cursor.mask[]' and 'cursor.image[]' array sizes in bytes.
Correct these checks to avoid OOB memory access.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1473338754-15430-1-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 167d97a3de)
[BR: CVE-2016-7170 BSC#998516]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Gerd Hoffmann
af282c8fb6 vmsvga: more cursor checks
Check the cursor size more carefully.  Also switch to unsigned while
being at it, so they can't be negative.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 5829b09720)
[BR: support patch for CVE-2016-7170 BSC#998516]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
chaojianhu
b12ad4db7d hw/net: Fix a heap overflow in xlnx.xps-ethernetlite
The .receive callback of xlnx.xps-ethernetlite doesn't check the length
of data before calling memcpy. As a result, the NetClientState object in
heap will be overflowed. All versions of qemu with xlnx.xps-ethernetlite
will be affected.

Reported-by: chaojianhu <chaojianhu@hotmail.com>
Signed-off-by: chaojianhu <chaojianhu@hotmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit a0d1cbdacf)
[BR: CVE-2016-7161 BSC#1001151]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Petr Matousek
40e9bde33f vnc: sanitize bits_per_pixel from the client
bits_per_pixel that are less than 8 could result in accessing
non-initialized buffers later in the code due to the expectation
that bytes_per_pixel value that is used to initialize these buffers is
never zero.

To fix this check that bits_per_pixel from the client is one of the
values that the rfb protocol specification allows.

This is CVE-2014-7815.

Signed-off-by: Petr Matousek <pmatouse@redhat.com>

[ kraxel: apply codestyle fix ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit e6908bfe8e)
[BR: BSC#902737]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	ui/vnc.c
2021-03-18 17:15:16 -06:00
Prasad J Pandit
4ee6bf080e virtio: check vring descriptor buffer length
virtio back end uses set of buffers to facilitate I/O operations.
An infinite loop unfolds in virtqueue_pop() if a buffer was
of zero size. Add check to avoid it.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 1e7aed7014)
[BR: CVE-2016-6490 BSC#991466]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	hw/virtio.c
2021-03-18 17:15:16 -06:00
Stefan Hajnoczi
11c3b6e4ab virtio: error out if guest exceeds virtqueue size
A broken or malicious guest can submit more requests than the virtqueue
size permits, causing unbounded memory allocation in QEMU.

The guest can submit requests without bothering to wait for completion
and is therefore not bound by virtqueue size.  This requires reusing
vring descriptors in more than one request, which is not allowed by the
VIRTIO 1.0 specification.

In "3.2.1 Supplying Buffers to The Device", the VIRTIO 1.0 specification
says:

  1. The driver places the buffer into free descriptor(s) in the
     descriptor table, chaining as necessary

and

  Note that the above code does not take precautions against the
  available ring buffer wrapping around: this is not possible since the
  ring buffer is the same size as the descriptor table, so step (1) will
  prevent such a condition.

This implies that placing more buffers into the virtqueue than the
descriptor table size is not allowed.

QEMU is missing the check to prevent this case.  Processing a request
allocates a VirtQueueElement leading to unbounded memory allocation
controlled by the guest.

Exit with an error if the guest provides more requests than the
virtqueue size permits.  This bounds memory allocation and makes the
buggy guest visible to the user.

This patch fixes CVE-2016-5403 and was reported by Zhenhao Hong from 360
Marvel Team, China.

Reported-by: Zhenhao Hong <hongzhenhao@360.cn>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit afd9096eb1)
[BR: CVE-2016-5403 BSC#991080]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Paolo Bonzini
35319a3197 scsi: esp: respect FIFO invariant after message phase
The FIFO contains two bytes; hence the write ptr should be two bytes ahead
of the read pointer.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d020aa504c)
[BR: CVE-2016-5238 BSC#982959]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
ac5d72b393 scsi: megasas: null terminate bios version buffer
While reading information via 'megasas_ctrl_get_info' routine,
a local bios version buffer isn't null terminated. Add the
terminating null byte to avoid any OOB access.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 844864fbae)
[BR: CVE-2016-5337 BSC#983961]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	hw/megasas.c
2021-03-18 17:15:16 -06:00
Prasad J Pandit
3ba1a7bc0e scsi: esp: check TI buffer index before read/write
The 53C9X Fast SCSI Controller(FSC) comes with internal 16-byte
FIFO buffers. One is used to handle commands and other is for
information transfer. Three control variables 'ti_rptr',
'ti_wptr' and 'ti_size' are used to control r/w access to the
information transfer buffer ti_buf[TI_BUFSZ=16]. In that,

'ti_rptr' is used as read index, where read occurs.
'ti_wptr' is a write index, where write would occur.
'ti_size' indicates total bytes to be read from the buffer.

While reading/writing to this buffer, index could exceed its
size. Add check to avoid OOB r/w access.

Reported-by: Huawei PSIRT <psirt@huawei.com>
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1465230883-22303-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit ff589551c8)
[BR: CVE-2016-5338 BSC#983982]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Gerd Hoffmann
6aefc57b17 vmsvga: don't process more than 1024 fifo commands at once
vmsvga_fifo_run is called in regular intervals (on each display update)
and will resume where it left off.  So we can simply exit the loop,
without having to worry about how processing will continue.

Fixes: CVE-2016-4453
Cc: qemu-stable@nongnu.org
Cc: P J P <ppandit@redhat.com>
Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1464592161-18348-5-git-send-email-kraxel@redhat.com
(cherry picked from commit 4e68a0ee17)
[BR: CVE-2016-4453 BSC#982223]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Gerd Hoffmann
1263662655 vmsvga: move fifo sanity checks to vmsvga_fifo_length
Sanity checks are applied when the fifo is enabled by the guest
(SVGA_REG_CONFIG_DONE write).  Which doesn't help much if the guest
changes the fifo registers afterwards.  Move the checks to
vmsvga_fifo_length so they are done each time qemu is about to read
from the fifo.

Fixes: CVE-2016-4454
Cc: qemu-stable@nongnu.org
Cc: P J P <ppandit@redhat.com>
Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1464592161-18348-2-git-send-email-kraxel@redhat.com
(cherry picked from commit 5213602678)
[BR: CVE-2016-4454 BSC#982222]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Peter Lieven
670d4a7c7b block/iscsi: avoid potential overflow of acb->task->cdb
at least in the path via virtio-blk the maximum size is not
restricted.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <1464080368-29584-1-git-send-email-pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit a6b3167fa0)
[BR: CVE-2016-5126 BSC#982285]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	block/iscsi.c
2021-03-18 17:15:16 -06:00
Prasad J Pandit
b66921534f scsi: megasas: check 'read_queue_head' index value
While doing MegaRAID SAS controller command frame lookup, routine
'megasas_lookup_frame' uses 'read_queue_head' value as an index
into 'frames[MEGASAS_MAX_FRAMES=2048]' array. Limit its value
within array bounds to avoid any OOB access.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1464179110-18593-1-git-send-email-ppandit@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b60bdd1f1e)
[BR: CVE-2016-5107 BSC#982019]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	hw/megasas.c
2021-03-18 17:15:16 -06:00
Prasad J Pandit
fdd886a365 scsi: megasas: initialise local configuration data buffer
When reading MegaRAID SAS controller configuration via MegaRAID
Firmware Interface(MFI) commands, routine megasas_dcmd_cfg_read
uses an uninitialised local data buffer. Initialise this buffer
to avoid stack information leakage.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1464178304-12831-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d37af74073)
[BR: CVE-2016-5105 BSC#982017]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
e5c5403858 scsi: megasas: use appropriate property buffer size
When setting MegaRAID SAS controller properties via MegaRAID
Firmware Interface(MFI) commands, a user supplied size parameter
is used to set property value. Use appropriate size value to avoid
OOB access issues.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1464172291-2856-2-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 1b85898025)
[BR: CVE-2016-5106 BSC#982018]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Gerd Hoffmann
f2a69101c7 ohci: allocate timer only once.
Allocate timer once, at init time, instead of allocating/freeing
it all the time when starting/stopping the bus.  Simplifies the
code, also fixes bugs (memory leak) due to missing checks whenever
the time is already allocated or not.

Cc: Prasad J Pandit <pjp@fedoraproject.org>
Reported-by: Zuozhi Fzz <zuozhi.fzz@alibaba-inc.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit fa1298c2d6)
[BR: CVE-2016-2391 BSC#967013]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	hw/usb/hcd-ohci.c
2021-03-18 17:15:16 -06:00
Prasad J Pandit
b2c865b536 usb: check USB configuration descriptor object
When processing remote NDIS control message packets, the USB Net
device emulator checks to see if the USB configuration descriptor
object is of RNDIS type(2). But it does not check if it is null,
which leads to a null dereference error. Add check to avoid it.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1455188480-14688-1-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 80eecda8e5)
[BR: CVE-2016-2392 BSC#967012]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
P J P
0e9c0b22a3 e1000: Avoid infinite loop in processing transmit descriptor (CVE-2015-6815)
While processing transmit descriptors, it could lead to an infinite
loop if 'bytes' was to become zero; Add a check to avoid it.

[The guest can force 'bytes' to 0 by setting the hdr_len and mss
descriptor fields to 0.
--Stefan]

Signed-off-by: P J P <pjp@fedoraproject.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1441383666-6590-1-git-send-email-stefanha@redhat.com
(cherry picked from commit b947ac2bf2)
[BR: CVE-2015-6815 BSC#944697]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Kevin Wolf
6a675ebf45 ahci: Fix FLUSH command
AHCI couldn't cope with asynchronous commands that aren't doing DMA, it
simply wouldn't complete them. Due to the bug fixed in commit f68ec837,
FLUSH commands would seem to have completed immediately even if they
were still running on the host. After the commit, they would simply hang
and never unset the BSY bit, rendering AHCI unusable on any OS sending
flushes.

This patch adds another callback for the completion of asynchronous
commands. This is what AHCI really wants to use for its command
completion logic rather than an DMA completion callback.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit a62eaa26c1)
[LM: BNC#982356]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
87df91c34d esp: check dma length before reading scsi command(CVE-2016-4441)
The 53C9X Fast SCSI Controller(FSC) comes with an internal 16-byte
FIFO buffer. It is used to handle command and data transfer.
Routine get_cmd() uses DMA to read scsi commands into this buffer.
Add check to validate DMA length against buffer size to avoid any
overrun.

Fixes CVE-2016-4441.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Cc: qemu-stable@nongnu.org
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1463654371-11169-3-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 6c1fef6b59)
[BR: CVE-2016-4441 BSC#980723]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
c5f968a13b esp: check command buffer length before write(CVE-2016-4439)
The 53C9X Fast SCSI Controller(FSC) comes with an internal 16-byte
FIFO buffer. It is used to handle command and data transfer. While
writing to this command buffer 's->cmdbuf[TI_BUFSZ=16]', a check
was missing to validate input length. Add check to avoid OOB write
access.

Fixes CVE-2016-4439.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Cc: qemu-stable@nongnu.org
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1463654371-11169-2-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit c98c6c105f)
[BR: CVE-2016-4439 BSC#980711]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Gerd Hoffmann
59d47733a5 vga: add sr_vbe register set
Commit "fd3c136 vga: make sure vga register setup for vbe stays intact
(CVE-2016-3712)." causes a regression.  The win7 installer is unhappy
because it can't freely modify vga registers any more while in vbe mode.

This patch introduces a new sr_vbe register set.  The vbe_update_vgaregs
will fill sr_vbe[] instead of sr[].  Normal vga register reads and
writes go to sr[].  Any sr register read access happens through a new
sr() helper function which will read from sr_vbe[] with vbe active and
from sr[] otherwise.

This way we can allow guests update sr[] registers as they want, without
allowing them disrupt vbe video modes that way.

Cc: qemu-stable@nongnu.org
Reported-by: Thomas Lamprecht <thomas@lamprecht.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1463475294-14119-1-git-send-email-kraxel@redhat.com
(cherry picked from commit 94ef4f337f)
[BR: BSC#978160 CVE-2016-3712]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	hw/display/vga.c
	hw/display/vga_int.h
2021-03-18 17:15:16 -06:00
Gerd Hoffmann
4c292728ce vga: make sure vga register setup for vbe stays intact (CVE-2016-3712).
Call vbe_update_vgaregs() when the guest touches GFX, SEQ or CRT
registers, to make sure the vga registers will always have the
values needed by vbe mode.  This makes sure the sanity checks
applied by vbe_fixup_regs() are effective.

Without this guests can muck with shift_control, can turn on planar
vga modes or text mode emulation while VBE is active, making qemu
take code paths meant for CGA compatibility, but with the very
large display widths and heigts settable using VBE registers.

Which is good for one or another buffer overflow.  Not that
critical as they typically read overflows happening somewhere
in the display code.  So guests can DoS by crashing qemu with a
segfault, but it is probably not possible to break out of the VM.

Fixes: CVE-2016-3712
Reported-by: Zuozhi Fzz <zuozhi.fzz@alibaba-inc.com>
Reported-by: P J P <ppandit@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
[BR: BSC#978160 CVE-2016-3712]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Gerd Hoffmann
46e2949b4b vga: update vga register setup on vbe changes
Call the new vbe_update_vgaregs() function on vbe configuration
changes, to make sure vga registers are up-to-date.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
[BR: BSC#978160 CVE-2016-3712]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Gerd Hoffmann
948754c8d1 vga: factor out vga register setup
When enabling vbe mode qemu will setup a bunch of vga registers to make
sure the vga emulation operates in correct mode for a linear
framebuffer.  Move that code to a separate function so we can call it
from other places too.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
[BR: BSC#978160 CVE-2016-3712]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Gerd Hoffmann
69f29d65de vga: add vbe_enabled() helper
Makes code a bit easier to read.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
[BR: BSC#978160 CVE-2016-3712]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Gerd Hoffmann
7676b7308b vga: fix banked access bounds checking (CVE-2016-3710)
vga allows banked access to video memory using the window at 0xa00000
and it supports a different access modes with different address
calculations.

The VBE bochs extentions support banked access too, using the
VBE_DISPI_INDEX_BANK register.  The code tries to take the different
address calculations into account and applies different limits to
VBE_DISPI_INDEX_BANK depending on the current access mode.

Which is probably effective in stopping misprogramming by accident.
But from a security point of view completely useless as an attacker
can easily change access modes after setting the bank register.

Drop the bogus check, add range checks to vga_mem_{readb,writeb}
instead.

Fixes: CVE-2016-3710
Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
[BR: BSC#978158 CVE-2016-3710]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
2f96715adf i386: kvmvapic: initialise imm32 variable
When processing Task Priorty Register(TPR) access, it could leak
automatic stack variable 'imm32' in patch_instruction().
Initialise the variable to avoid it.

Reported by: Donghai Zdh <donghai.zdh@alibaba-inc.com>

Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[BR: BSC#975700 CVE-2016-4020]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
9f61701313 net: mipsnet: check packet length against buffer
When receiving packets over MIPSnet network device, it uses
 receive buffer of size 1514 bytes. In case the controller
accepts large(MTU) packets, it could lead to memory corruption.
Add check to avoid it.

Reported by: Oleksandr Bazhaniuk <oleksandr.bazhaniuk@intel.com>

Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[BR: BSC#975136 CVE-2016-4002]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
d1a8028b43 net: stellaris_enet: check packet length against receive buffer
When receiving packets over Stellaris ethernet controller, it
uses receive buffer of size 2048 bytes. In case the controller
accepts large(MTU) packets, it could lead to memory corruption.
Add check to avoid it.

Reported-by: Oleksandr Bazhaniuk <oleksandr.bazhaniuk@intel.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1460095428-22698-1-git-send-email-ppandit@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 3a15cc0e1e)
[BR: BSC#975128 CVE-2016-4001 (source path modified)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
cd1897fd36 net: check packet payload length
While computing IP checksum, 'net_checksum_calculate' reads
payload length from the packet. It could exceed the given 'data'
buffer size. Add a check to avoid it.

Reported-by: Liu Ling <liuling-it@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 362786f14a)
[BR: BSC#970037 CVE-2016-2857]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Ladi Prosek
9b4f1ab7f0 rng: add request queue support to rng-random
Requests are now created in the RngBackend parent class and the
code path is shared by both rng-egd and rng-random.

This commit fixes the rng-random implementation which processed
only one request at a time and simply discarded all but the most
recent one. In the guest this manifested as delayed completion
of reads from virtio-rng, i.e. a read was completed only after
another read was issued.

By switching rng-random to use the same request queue as rng-egd,
the unsafe stack-based allocation of the entropy buffer is
eliminated and replaced with g_malloc.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1456994238-9585-5-git-send-email-lprosek@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
(cherry picked from commit 60253ed1e6)
[BR: BSC#970036 CVE-2016-2858 (source path modified)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Ladi Prosek
a6390bf22c rng: move request queue cleanup from RngEgd to RngBackend
RngBackend is now in charge of cleaning up the linked list on
instance finalization. It also exposes a function to finalize
individual RngRequest instances, called by its child classes.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1456994238-9585-4-git-send-email-lprosek@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
(cherry picked from commit 9f14b0add1)
[BR: support patch for BSC#970036 (source path modified)]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	backends/rng.c
2021-03-18 17:15:16 -06:00
Ladi Prosek
e9d1ef475f rng: move request queue from RngEgd to RngBackend
The 'requests' field now lives in the RngBackend parent class.
There are no functional changes in this commit.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1456994238-9585-3-git-send-email-lprosek@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
(cherry picked from commit 74074e8a7c)
[BR: support patch for BSC#970036 (source path modified)]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	include/sysemu/rng.h
2021-03-18 17:15:16 -06:00
Ladi Prosek
3933ac6171 rng: remove the unused request cancellation code
rng_backend_cancel_requests had no callers and none of the code
deleted in this commit ever ran.

Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1456994238-9585-2-git-send-email-lprosek@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
(cherry picked from commit 3c52ddcdc5)
[BR: support patch for BSC#970036 (source path modified)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
39ffe45d06 net: ne2000: check ring buffer control registers
Ne2000 NIC uses ring buffer of NE2000_MEM_SIZE(49152)
bytes to process network packets. Registers PSTART & PSTOP
define ring buffer size & location. Setting these registers
to invalid values could lead to infinite loop or OOB r/w
access issues. Add check to avoid it.

Reported-by: Yang Hongke <yanghongke@huawei.com>
Tested-by: Yang Hongke <yanghongke@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 415ab35a44)
[BR: BSC#969350 CVE-2016-2841 (source path modified)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
a9d3db9e35 usb: check RNDIS buffer offsets & length
When processing remote NDIS control message packets,
the USB Net device emulator uses a fixed length(4096) data buffer.
The incoming informationBufferOffset & Length combination could
overflow and cross that range. Check control message buffer
offsets and length to avoid it.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1455648821-17340-3-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit fe3c546c5f)
[BR: BSC#967969 CVE-2016-2538]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Petr Matousek
f892bc471e i8254: fix out-of-bounds memory access in pit_ioport_read()
Due converting PIO to the new memory read/write api we no longer provide
separate I/O region lenghts for read and write operations. As a result,
reading from PIT Mode/Command register will end with accessing
pit->channels with invalid index.

Fix this by ignoring read from the Mode/Command register.

This is CVE-2015-3214.

Reported-by: Matt Tait <matttait@google.com>
Fixes: 0505bcdec8
Cc: qemu-stable@nongnu.org
Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d4862a87e3)
[AF: bsc#934069]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-18 17:15:16 -06:00
John Snow
178b3b92c8 ide: Correct handling of malformed/short PRDTs
This impacts both BMDMA and AHCI HBA interfaces for IDE.
Currently, we confuse the difference between a PRDT having
"0 bytes" and a PRDT having "0 complete sectors."

When we receive an incomplete sector, inconsistent error checking
leads to an infinite loop wherein the call succeeds, but it
didn't give us enough bytes -- leading us to re-call the
DMA chain over and over again. This leads to, in the BMDMA case,
leaked memory for short PRDTs, and infinite loops and resource
usage in the AHCI case.

The .prepare_buf() callback is reworked to return the number of
bytes that it successfully prepared. 0 is a valid, non-error
answer that means the table was empty and described no bytes.
-1 indicates an error.

Our current implementation uses the io_buffer in IDEState to
ultimately describe the size of a prepared scatter-gather list.
Even though the AHCI PRDT/SGList can be as large as 256GiB, the
AHCI command header limits transactions to just 4GiB. ATA8-ACS3,
however, defines the largest transaction to be an LBA48 command
that transfers 65,536 sectors. With a 512 byte sector size, this
is just 32MiB.

Since our current state structures use the int type to describe
the size of the buffer, and this state is migrated as int32, we
are limited to describing 2GiB buffer sizes unless we change the
migration protocol.

For this reason, this patch begins to unify the assertions in the
IDE pathways that the scatter-gather list provided by either the
AHCI PRDT or the PCI BMDMA PRDs can only describe, at a maximum,
2GiB. This should be resilient enough unless we need a sector
size that exceeds 32KiB.

Further, the likelihood of any guest operating system actually
attempting to transfer this much data in a single operation is
very slim.

To this end, the IDEState variables have been updated to more
explicitly clarify our maximum supported size. Callers to the
prepare_buf callback have been reworked to understand the new
return code, and all versions of the prepare_buf callback have
been adjusted accordingly.

Lastly, the ahci_populate_sglist helper, relied upon by the
AHCI implementation of .prepare_buf() as well as the PCI
implementation of the callback have had overflow assertions
added to help make clear the reasonings behind the various
type changes.

[Added %d -> %"PRId64" fix John sent because off_pos changed from int to
int64_t.
--Stefan]

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1414785819-26209-4-git-send-email-jsnow@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 3251bdcf1c)
[BR: BSC#928393 CVE-2014-9718]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	hw/ide/ahci.c
	hw/ide/core.c
	hw/ide/internal.h
	hw/ide/macio.c
2021-03-18 17:15:16 -06:00
Gerd Hoffmann
9a54f3039d vmware-vga: use vmsvga_verify_rect in vmsvga_fill_rect
Add verification to vmsvga_fill_rect, re-enable HW_FILL_ACCEL.

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Don Koch <dkoch@verizon.com>
(cherry picked from commit bd9ccd8517)
[AF: bsc#901508, change vmsvga_verify_rect() argument]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-18 17:15:16 -06:00
Gerd Hoffmann
79d5f1f7ac vmware-vga: use vmsvga_verify_rect in vmsvga_copy_rect
Add verification to vmsvga_copy_rect, re-enable HW_RECT_ACCEL.

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Don Koch <dkoch@verizon.com>
(cherry picked from commit 61b41b4c20)
[AF: bsc#901508, change vmsvga_verify_rect() argument]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-18 17:15:16 -06:00
Gerd Hoffmann
9e91a2c7d5 vmware-vga: use vmsvga_verify_rect in vmsvga_update_rect
Switch vmsvga_update_rect over to use vmsvga_verify_rect.  Slight change
in behavior:  We don't try to automatically fixup rectangles any more.
In case we find invalid update requests we'll do a full-screen update
instead.

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Don Koch <dkoch@verizon.com>
(cherry picked from commit 1735fe1edb)
[AF: bsc#901508, surface_{width,height}() -> ds_get_{width,height}()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-18 17:15:16 -06:00
Gerd Hoffmann
75bad7d22c vmware-vga: add vmsvga_verify_rect
Add verification function for rectangles, returning
true if verification passes and false otherwise.

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Don Koch <dkoch@verizon.com>
(cherry picked from commit 07258900fd)
[AF: bsc#901508, ds_get_{width,height}() -> surface_{width,height}()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-18 17:15:16 -06:00
Gerd Hoffmann
89f795cd96 vmware-vga: CVE-2014-3689: turn off hw accel
Quick & easy stopgap for CVE-2014-3689:  We just compile out the
hardware acceleration functions which lack sanity checks.  Thankfully
we have capability bits for them (SVGA_CAP_RECT_COPY and
SVGA_CAP_RECT_FILL), so guests should deal just fine, in theory.

Subsequent patches will add the missing checks and re-enable the
hardware acceleration emulation.

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Don Koch <dkoch@verizon.com>
(cherry picked from commit 83afa38eb2)
[AF: bsc#901508]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-18 17:15:16 -06:00
Laszlo Ersek
071d6d468b e1000: eliminate infinite loops on out-of-bounds transfer start
The start_xmit() and e1000_receive_iov() functions implement DMA transfers
iterating over a set of descriptors that the guest's e1000 driver
prepares:

- the TDLEN and RDLEN registers store the total size of the descriptor
  area,

- while the TDH and RDH registers store the offset (in whole tx / rx
  descriptors) into the area where the transfer is supposed to start.

Each time a descriptor is processed, the TDH and RDH register is bumped
(as appropriate for the transfer direction).

QEMU already contains logic to deal with bogus transfers submitted by the
guest:

- Normally, the transmit case wants to increase TDH from its initial value
  to TDT. (TDT is allowed to be numerically smaller than the initial TDH
  value; wrapping at or above TDLEN bytes to zero is normal.) The failsafe
  that QEMU currently has here is a check against reaching the original
  TDH value again -- a complete wraparound, which should never happen.

- In the receive case RDH is increased from its initial value until
  "total_size" bytes have been received; preferably in a single step, or
  in "s->rxbuf_size" byte steps, if the latter is smaller. However, null
  RX descriptors are skipped without receiving data, while RDH is
  incremented just the same. QEMU tries to prevent an infinite loop
  (processing only null RX descriptors) by detecting whether RDH assumes
  its original value during the loop. (Again, wrapping from RDLEN to 0 is
  normal.)

What both directions miss is that the guest could program TDLEN and RDLEN
so low, and the initial TDH and RDH so high, that these registers will
immediately be truncated to zero, and then never reassume their initial
values in the loop -- a full wraparound will never occur.

The condition that expresses this is:

  xdh_start >= s->mac_reg[XDLEN] / sizeof(desc)

i.e., TDH or RDH start out after the last whole rx or tx descriptor that
fits into the TDLEN or RDLEN sized area.

This condition could be checked before we enter the loops, but
pci_dma_read() / pci_dma_write() knows how to fill in buffers safely for
bogus DMA addresses, so we just extend the existing failsafes with the
above condition.

This is CVE-2016-1981.

Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Prasad Pandit <ppandit@redhat.com>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: qemu-stable@nongnu.org
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1296044
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit dd793a7488)
[BR: in above description e1000_receive_iov is e1000_receive in backport]
[BR: BSC#963782 CVE-2016-1981]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	hw/net/e1000.c
2021-03-18 17:15:16 -06:00
Prasad J Pandit
b10e717d42 usb: ehci: add capability mmio write function
USB Ehci emulation supports host controller capability registers.
But its mmio '.write' function was missing, which lead to a null
pointer dereference issue. Add a do nothing 'ehci_caps_write'
definition to avoid it; Do nothing because capability registers
are Read Only(RO).

Reported-by: Zuozhi Fzz <zuozhi.fzz@alibaba-inc.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[BR: BSC#964413 CVE-2016-2198]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Wolfgang Bumiller
d2f89ad738 hmp: fix sendkey out of bounds write (CVE-2015-8619)
When processing 'sendkey' command, hmp_sendkey routine null
terminates the 'keyname_buf' array. This results in an OOB
write issue, if 'keyname_len' was to fall outside of
'keyname_buf' array.

Since the keyname's length is known the keyname_buf can be
removed altogether by adding a length parameter to
index_from_key() and using it for the error output as well.

Reported-by: Ling Liu <liuling-it@360.cn>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Message-Id: <20160113080958.GA18934@olga>
[Comparison with "<" dumbed down, test for junk after strtoul()
tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>

(cherry picked from commit 64ffbe04ea)
[BR: BSC#960334 CVE-2015-8619]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	hmp.c
	include/ui/console.h
	ui/input-legacy.c
2021-03-18 17:15:16 -06:00
Peter Lieven
8959003f68 ui/vnc: limit client_cut_text msg payload size
currently a malicious client could define a payload
size of 2^32 - 1 bytes and send up to that size of
data to the vnc server. The server would allocated
that amount of memory which could easily create an
out of memory condition.

This patch limits the payload size to 1MB max.

Please note that client_cut_text messages are currently
silently ignored.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit f9a70e7939)
[CYL: BSC#944463 CVE-2015-5239]
Signed-off-by: Chunyan Liu <cyliu@suse.com>
2021-03-18 17:15:16 -06:00
Chunyan Liu
f8cbdbf983 vbe: rework sanity checks
Plug a bunch of holes in the bochs dispi interface parameter checking.
Add a function doing verification on all registers.  Call that
unconditionally on every register write.  That way we should catch
everything, even changing one register affecting the valid range of
another register.

Some of the holes have been added by commit
e9c6149f6a.  Before that commit the
maximum possible framebuffer (VBE_DISPI_MAX_XRES * VBE_DISPI_MAX_YRES *
32 bpp) has been smaller than the qemu vga memory (8MB) and the checking
for VBE_DISPI_MAX_XRES + VBE_DISPI_MAX_YRES + VBE_DISPI_MAX_BPP was ok.

Some of the holes have been there forever, such as
VBE_DISPI_INDEX_X_OFFSET and VBE_DISPI_INDEX_Y_OFFSET register writes
lacking any verification.

Security impact:

(1) Guest can make the ui (gtk/vnc/...) use memory rages outside the vga
frame buffer as source  ->  host memory leak.  Memory isn't leaked to
the guest but to the vnc client though.

(2) Qemu will segfault in case the memory range happens to include
unmapped areas  ->  Guest can DoS itself.

The guest can not modify host memory, so I don't think this can be used
by the guest to escape.

CVE-2014-3615

Cc: qemu-stable@nongnu.org
Cc: secalert@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
(cherry picked from commit c1b886c45d)
[CYL: BSC#895528 CVE-2014-3615]
Signed-off-by: Chunyan Liu <cyliu@suse.com>
2021-03-18 17:15:16 -06:00
Chunyan Liu
65cae03d31 vbe: make bochs dispi interface return the correct memory size with qxl
VgaState->vram_size is the size of the pci bar.  In case of qxl not the
whole pci bar can be used as vga framebuffer.  Add a new variable
vbe_size to handle that case.  By default (if unset) it equals
vram_size, but qxl can set vbe_size to something else.

This makes sure VBE_DISPI_INDEX_VIDEO_MEMORY_64K returns correct results
and sanity checks are done with the correct size too.

Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
(cherry picked from commit c1b886c45d)
[CYL: BSC#895528 CVE-2014-3615]
Signed-off-by: Chunyan Liu <cyliu@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
e8d412a913 ide: ahci: reset ncq object to unused on error
When processing NCQ commands, AHCI device emulation prepares a
NCQ transfer object; To which an aio control block(aiocb) object
is assigned in 'execute_ncq_command'. In case, when the NCQ
command is invalid, the 'aiocb' object is not assigned, and NCQ
transfer object is left as 'used'. This leads to a use after
free kind of error in 'bdrv_aio_cancel_async' via 'ahci_reset_port'.
Reset NCQ transfer object to 'unused' to avoid it.

[Maintainer edit: s/ACHI/AHCI/ in the commit message. --js]

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1452282511-4116-1-git-send-email-ppandit@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 4ab0359a8a)
[LM: BSC#961333 CVE-2016-1568]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-18 17:15:16 -06:00
John Snow
31fff2a787 ahci: add ncq_err helper
Set some appropriate error bits for NCQ for us.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1435016308-6150-4-git-send-email-jsnow@redhat.com
(cherry picked from commit a55c8231d0)
[LM: BSC#961333 CVE-2016-1568]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
536dec7447 net: ne2000: fix bounds check in ioport operations
While doing ioport r/w operations, ne2000 device emulation suffers
from OOB r/w errors. Update respective array bounds check to avoid
OOB access.

Reported-by: Ling Liu <liuling-it@360.cn>
Cc: qemu-stable@nongnu.org
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit aa7f9966df)
[LM: BSC#960725 CVE-2015-8743]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-18 17:15:16 -06:00
Marc-André Lureau
7f0879dc5a msix: implement pba write (but read-only)
qpci_msix_pending() writes on pba region, causing qemu to SEGV:

  Program received signal SIGSEGV, Segmentation fault.
  [Switching to Thread 0x7ffff7fba8c0 (LWP 25882)]
  0x0000000000000000 in ?? ()
  (gdb) bt
  #0  0x0000000000000000 in  ()
  #1  0x00005555556556c5 in memory_region_oldmmio_write_accessor (mr=0x5555579f3f80, addr=0, value=0x7fffffffbf68, size=4, shift=0, mask=4294967295, attrs=...) at /home/elmarco/src/qemu/memory.c:434
  #2  0x00005555556558e1 in access_with_adjusted_size (addr=0, value=0x7fffffffbf68, size=4, access_size_min=1, access_size_max=4, access=0x55555565563e <memory_region_oldmmio_write_accessor>, mr=0x5555579f3f80, attrs=...) at /home/elmarco/src/qemu/memory.c:506
  #3  0x00005555556581eb in memory_region_dispatch_write (mr=0x5555579f3f80, addr=0, data=0, size=4, attrs=...) at /home/elmarco/src/qemu/memory.c:1176
  #4  0x000055555560b6f9 in address_space_rw (as=0x555555eff4e0 <address_space_memory>, addr=3759147008, attrs=..., buf=0x7fffffffc1b0 "", len=4, is_write=true) at /home/elmarco/src/qemu/exec.c:2439
  #5  0x000055555560baa2 in cpu_physical_memory_rw (addr=3759147008, buf=0x7fffffffc1b0 "", len=4, is_write=1) at /home/elmarco/src/qemu/exec.c:2534
  #6  0x000055555564c005 in cpu_physical_memory_write (addr=3759147008, buf=0x7fffffffc1b0, len=4) at /home/elmarco/src/qemu/include/exec/cpu-common.h:80
  #7  0x000055555564cd9c in qtest_process_command (chr=0x55555642b890, words=0x5555578de4b0) at /home/elmarco/src/qemu/qtest.c:378
  #8  0x000055555564db77 in qtest_process_inbuf (chr=0x55555642b890, inbuf=0x55555641b340) at /home/elmarco/src/qemu/qtest.c:569
  #9  0x000055555564dc07 in qtest_read (opaque=0x55555642b890, buf=0x7fffffffc2e0 "writel 0xe0100800 0x0\n", size=22) at /home/elmarco/src/qemu/qtest.c:581
  #10 0x000055555574ce3e in qemu_chr_be_write (s=0x55555642b890, buf=0x7fffffffc2e0 "writel 0xe0100800 0x0\n", len=22) at qemu-char.c:306
  #11 0x0000555555751263 in tcp_chr_read (chan=0x55555642bcf0, cond=G_IO_IN, opaque=0x55555642b890) at qemu-char.c:2876
  #12 0x00007ffff64c9a8a in g_main_context_dispatch (context=0x55555641c400) at gmain.c:3122

(without this patch, this can be reproduced with the ivshmem qtest)

Implement an empty mmio write to avoid the crash.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 43b11a91dd)
[LM: BSC#958917 CVE-2015-7549]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-18 17:15:16 -06:00
Gerd Hoffmann
b0948314b2 ehci: apply limit to iTD/sidt descriptors
Commit "156a2e4 ehci: make idt processing more robust" tries to avoid a
DoS by the guest (create a circular iTD queue and let qemu ehci
emulation run in circles forever).  Unfortunately this has two problems:
First it misses the case of siTDs, and second it reportedly breaks
FreeBSD.

So lets go for a different approach: just count the number of iTDs and
siTDs we have seen per frame and apply a limit.  That should really
catch all cases now.

Reported-by: 杜少博 <dushaobo@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 1ae3f2f178)
[BR: BSC#976109 CVE-2016-4037]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Michael S. Tsirkin
339105813a virtio-serial: fix ANY_LAYOUT
Don't assume a specific layout for control messages.
Required by virtio 1.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 7882080388)
[LM: BSC#940929 CVE-2015-5745]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-18 17:15:16 -06:00
Prasad J Pandit
cdb4131a16 ui: vnc: avoid floating point exception
While sending 'SetPixelFormat' messages to a VNC server,
the client could set the 'red-max', 'green-max' and 'blue-max'
values to be zero. This leads to a floating point exception in
write_png_palette while doing frame buffer updates.

Reported-by: Lian Yihan <lianyihan@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 4c65fed8bd)
[LM: BSC#958491 CVE-2015-8504]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-18 17:15:16 -06:00
P J P
483c94a8c7 scsi: initialise info object with appropriate size
While processing controller 'CTRL_GET_INFO' command, the routine
'megasas_ctrl_get_info' overflows the '&info' object size. Use its
appropriate size to null initialise it.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <alpine.LFD.2.20.1512211501420.22471@wniryva>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: P J P <ppandit@redhat.com>
(cherry picked from commit 36fef36b91)
[BR: BSC#961556 CVE-2015-8613]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	hw/scsi/megasas.c
2021-03-18 17:15:16 -06:00
P J P
a9f5400b1b i386: avoid null pointer dereference
Hello,

A null pointer dereference issue was reported by Mr Ling Liu, CC'd here. It
occurs while doing I/O port write operations via hmp interface. In that,
'current_cpu' remains null as it is not called from cpu_exec loop, which
results in the said issue.

Below is a proposed (tested)patch to fix this issue; Does it look okay?

===
From ae88a4947fab9a148cd794f8ad2d812e7f5a1d0f Mon Sep 17 00:00:00 2001
From: Prasad J Pandit <pjp@fedoraproject.org>
Date: Fri, 18 Dec 2015 11:16:07 +0530
Subject: [PATCH] i386: avoid null pointer dereference

When I/O port write operation is called from hmp interface,
'current_cpu' remains null, as it is not called from cpu_exec()
loop. This leads to a null pointer dereference in vapic_write
routine. Add check to avoid it.

Reported-by: Ling Liu <liuling-it@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <alpine.LFD.2.20.1512181129320.9805@wniryva>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: P J P <ppandit@redhat.com>
(cherry picked from commit 4c1396cb57)
[BR: BSC#962320 CVE-2016-1922]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	hw/i386/kvmvapic.c
2021-03-18 17:15:16 -06:00
Prasad J Pandit
3de75379ef fw_cfg: add check to validate current entry value
When processing firmware configurations, an OOB r/w access occurs
if 's->cur_entry' is set to be invalid(FW_CFG_INVALID=0xffff).
Add a check to validate 's->cur_entry' to avoid such access.

Reported-by: Donghai Zdh <donghai.zdh@alibaba-inc.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[BR: BSC#961691 CVE-2016-1714]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Andreas Färber
26f8c6e533 ide: Set BSY bit during FLUSH
The implementation of the ATA FLUSH command invokes a flush at the block
layer, which may on raw files on POSIX entail a synchronous fdatasync().
This may in some cases take so long that the SLES 11 SP1 guest driver
reports I/O errors and filesystems get corrupted or remounted read-only.

Avoid this by setting BUSY_STAT, so that the guest is made aware we are
in the middle of an operation and no ATA commands are attempted to be
processed concurrently.

Addresses BNC#637297.

Suggested-by: Gonglei (Arei) <arei.gonglei@huawei.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f68ec8379e)
[BSC#936132]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
John Snow
fc266e683d ide: fix ATAPI command permissions
We're a little too lenient with what we'll let an ATAPI drive handle.
Clamp down on the IDE command execution table to remove CD_OK permissions
from commands that are not and have never been ATAPI commands.

For ATAPI command validity, please see:
- ATA4 Section 6.5 ("PACKET Command feature set")
- ATA8/ACS Section 4.3 ("The PACKET feature set")
- ACS3 Section 4.3 ("The PACKET feature set")

ACS3 has a historical command validity table in Table B.4
("Historical Command Assignments") that can be referenced to find when
a command was introduced, deprecated, obsoleted, etc.

The only reference for ATAPI command validity is by checking that
version's PACKET feature set section.

ATAPI was introduced by T13 into ATA4, all commands retired prior to ATA4
therefore are assumed to have never been ATAPI commands.

Mandatory commands, as listed in ATA8-ACS3, are:

- DEVICE RESET
- EXECUTE DEVICE DIAGNOSTIC
- IDENTIFY DEVICE
- IDENTIFY PACKET DEVICE
- NOP
- PACKET
- READ SECTOR(S)
- SET FEATURES

Optional commands as listed in ATA8-ACS3, are:

- FLUSH CACHE
- READ LOG DMA EXT
- READ LOG EXT
- WRITE LOG DMA EXT
- WRITE LOG EXT

All other commands are illegal to send to an ATAPI device and should
be rejected by the device.

CD_OK removal justifications:

0x06 WIN_DSM              Defined in ACS2. Not valid for ATAPI.
0x21 WIN_READ_ONCE        Retired in ATA5. Not ATAPI in ATA4.
0x94 WIN_STANDBYNOW2      Retired in ATA4. Did not coexist with ATAPI.
0x95 WIN_IDLEIMMEDIATE2   Retired in ATA4. Did not coexist with ATAPI.
0x96 WIN_STANDBY2         Retired in ATA4. Did not coexist with ATAPI.
0x97 WIN_SETIDLE2         Retired in ATA4. Did not coexist with ATAPI.
0x98 WIN_CHECKPOWERMODE2  Retired in ATA4. Did not coexist with ATAPI.
0x99 WIN_SLEEPNOW2        Retired in ATA4. Did not coexist with ATAPI.
0xE0 WIN_STANDBYNOW1      Not part of ATAPI in ATA4, ACS or ACS3.
0xE1 WIN_IDLEIMMDIATE     Not part of ATAPI in ATA4, ACS or ACS3.
0xE2 WIN_STANDBY          Not part of ATAPI in ATA4, ACS or ACS3.
0xE3 WIN_SETIDLE1         Not part of ATAPI in ATA4, ACS or ACS3.
0xE4 WIN_CHECKPOWERMODE1  Not part of ATAPI in ATA4, ACS or ACS3.
0xE5 WIN_SLEEPNOW1        Not part of ATAPI in ATA4, ACS or ACS3.
0xF8 WIN_READ_NATIVE_MAX  Obsoleted in ACS3. Not ATAPI in ATA4 or ACS.

This patch fixes a divide by zero fault that can be caused by sending
the WIN_READ_NATIVE_MAX command to an ATAPI drive, which causes it to
attempt to use zeroed CHS values to perform sector arithmetic.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1441816082-21031-1-git-send-email-jsnow@redhat.com
CC: qemu-stable@nongnu.org
(cherry picked from commit d9033e1d3a)
[BR: CVE-2015-6855 BSC#945404]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Jason Wang
c7bdf03a5a virtio-net: correctly drop truncated packets
When packet is truncated during receiving, we drop the packets but
neither discard the descriptor nor add and signal used
descriptor. This will lead several issues:

- sg mappings are leaked
- rx will be stalled if a lots of packets were truncated

In order to be consistent with vhost, fix by discarding the descriptor
in this case.

Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

(cherry picked from commit 0cf33fb6b4)
[BR: BSC#947159 CVE-2015-7295]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Jason Wang
d846293609 virtio: introduce virtqueue_discard()
This patch introduces virtqueue_discard() to discard a descriptor and
unmap the sgs. This will be used by the patch that will discard
descriptor when packet is truncated.

Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

(cherry picked from commit 29b9f5efd7)
[BR: BSC#947159 CVE-2015-7295]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
Jason Wang
2953bff061 virtio: introduce virtqueue_unmap_sg()
Factor out sg unmapping logic. This will be reused by the patch that
can discard descriptor.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Andrew James <andrew.james@hpe.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

(cherry picked from commit ce31746157)
[BR: BSC#947159 CVE-2015-7295]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:15:16 -06:00
P J P
028f5d1153 net: avoid infinite loop when receiving packets(CVE-2015-5278)
Ne2000 NIC uses ring buffer of NE2000_MEM_SIZE(49152)
bytes to process network packets. While receiving packets
via ne2000_receive() routine, a local 'index' variable
could exceed the ring buffer size, leading to an infinite
loop situation.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: P J P <pjp@fedoraproject.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 737d2b3c41)
[BR: BSC#945989]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	hw/net/ne2000.c
2021-03-18 17:15:16 -06:00
P J P
ab49866dd1 net: add checks to validate ring buffer pointers(CVE-2015-5279)
Ne2000 NIC uses ring buffer of NE2000_MEM_SIZE(49152)
bytes to process network packets. While receiving packets
via ne2000_receive() routine, a local 'index' variable
could exceed the ring buffer size, which could lead to a
memory buffer overflow. Added other checks at initialisation.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: P J P <pjp@fedoraproject.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 9bbdbc66e5)
[BR: BSC#945987]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	hw/net/ne2000.c
2021-03-18 17:15:16 -06:00
Bruce Rogers
5e40fd2e0e savevm: Produce message in case of failed migration to assist user
Include-If: %if 0%{?sp_version} == 4

Unfortunately we had to break migration for previous releases of
kvm on SLE11 SP4, in order to be compatible with SLE11 SP3 as
well as SLE12. In order to help the users understand and get
past this issue, we will output a message when the migration
fails, pointing them to TID # 7017048, which will provide details
about the issue.

NOTE: As soon as the vast majority of users are already past the
likelihood of still hitting this issue, we should remove this
patch.

Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-18 17:14:24 -06:00
Jason Wang
dc618750bd pcnet: fix rx buffer overflow(CVE-2015-7512)
Backends could provide a packet whose length is greater than buffer
size. Check for this and truncate the packet to avoid rx buffer
overflow in this case.

Cc: Prasad J Pandit <pjp@fedoraproject.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
[BR: BSC#957162]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:20 -06:00
Chunyan Liu
651e2a9b57 eepro100: Prevent two endless loops
http://lists.nongnu.org/archive/html/qemu-devel/2015-11/msg04592.html
shows an example how an endless loop in function action_command can
be achieved.

During my code review, I noticed a 2nd case which can result in an
endless loop.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 00837731d2)
(CYL: BSC#956829 CVE-2015-8345)
Signed-off-by: Chunyan Liu <cyliu@suse.com>

Conflicts:
    hw/net/eepro100.c
2021-03-17 21:09:20 -06:00
Alexander Graf
c546895fdc kvmclock: Ensure time in migration never goes backward
When we migrate we ask the kernel about its current belief on what the guest
time would be. However, I've seen cases where the kvmclock guest structure
indicates a time more recent than the kvm returned time.

To make sure we never go backwards, calculate what the guest would have seen as time at the point of migration and use that value instead of the kernel returned one when it's more recent.
This bases the view of the kvmclock after migration on the
same foundation in host as well as guest.

Signed-off-by: Alexander Graf <agraf@suse.de>
Cc: qemu-stable@nongnu.org
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9a48bcd1b8)
[BSC#947164, BSC#953187]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:20 -06:00
Marcelo Tosatti
41f6bcba97 kvmclock: Ensure proper env->tsc value for kvmclock_current_nsec calculation
Ensure proper env->tsc value for kvmclock_current_nsec calculation.

Reported-by: Marcin Gibuła <m.gibula@beyond.pl>
Analyzed-by: Marcin Gibuła <m.gibula@beyond.pl>
Cc: qemu-stable@nongnu.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 317b0a6d8b)
[BSC#947164, BSC#953187]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:20 -06:00
Marcelo Tosatti
f46a6f4c71 Introduce cpu_clean_all_dirty
Introduce cpu_clean_all_dirty, to force subsequent cpu_synchronize_all_states
to read in-kernel register state.

Cc: qemu-stable@nongnu.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit de9d61e83d)
[BSC#947164, BSC#953187]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	cpus.c
	include/sysemu/kvm.h
	kvm-all.c
2021-03-17 21:09:20 -06:00
Marcelo Tosatti
44a8bb827f kvmclock: clock should count only if vm is running
kvmclock should not count while vm is paused, because:

1) if the vm is paused for long periods, timekeeping
math can overflow while converting the (large) clocksource
delta to nanoseconds.

2) Users rely on CLOCK_MONOTONIC to count run time, that is,
time which OS has been in a runnable state (see CLOCK_BOOTTIME).

Change kvmclock driver so as to save clock value when vm transitions
from runnable to stopped state, and to restore clock value from stopped
to runnable transition.

Cc: qemu-stable@nongnu.org
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Gleb Natapov <gleb@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 00f4d64ee7)
[BSC#947164, BSC#953187]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:20 -06:00
Kevin Wolf
a2afcb9f7e ide: Clear DRQ after handling all expected accesses
This is additional hardening against an end_transfer_func that fails to
clear the DRQ status bit. The bit must be unset as soon as the PIO
transfer has completed, so it's better to do this in a central place
instead of duplicating the code in all commands (and forgetting it in
some).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
[BR: BSC#938344]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:20 -06:00
Kevin Wolf
02fd0e8c47 ide/atapi: Fix START STOP UNIT command completion
The command must be completed on all code paths. START STOP UNIT with
pwrcnd set should succeed without doing anything.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
[BR: BSC#938344]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:20 -06:00
Kevin Wolf
2b33e451b0 ide: Check array bounds before writing to io_buffer (CVE-2015-5154)
If the end_transfer_func of a command is called because enough data has
been read or written for the current PIO transfer, and it fails to
correctly call the command completion functions, the DRQ bit in the
status register and s->end_transfer_func may remain set. This allows the
guest to access further bytes in s->io_buffer beyond s->data_end, and
eventually overflowing the io_buffer.

One case where this currently happens is emulation of the ATAPI command
START STOP UNIT.

This patch fixes the problem by adding explicit array bounds checks
before accessing the buffer instead of relying on end_transfer_func to
function correctly.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
[BR: BSC#938344]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:20 -06:00
Michael Tokarev
d4268af7a5 slirp: use less predictable directory name in /tmp for smb config (CVE-2015-4037)
In this version I used mkdtemp(3) which is:

        _BSD_SOURCE
        || /* Since glibc 2.10: */
            (_POSIX_C_SOURCE >= 200809L || _XOPEN_SOURCE >= 700)

(POSIX.1-2008), so should be available on systems we care about.

While at it, reset the resulting directory name within smb structure
on error so cleanup function wont try to remove directory which we
failed to create.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit 8b8f1c7e9d)
[BR: BSC#932267]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:20 -06:00
Petr Matousek
ee9fd13005 pcnet: force the buffer access to be in bounds during tx
4096 is the maximum length per TMD and it is also currently the size of
the relay buffer pcnet driver uses for sending the packet data to QEMU
for further processing. With packet spanning multiple TMDs it can
happen that the overall packet size will be bigger than sizeof(buffer),
which results in memory corruption.

Fix this by only allowing to queue maximum sizeof(buffer) bytes.

This is CVE-2015-3209.

Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Reported-by: Matt Tait <matttait@google.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[BR: BSC#932770]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:20 -06:00
Gonglei
cb3e3659d0 pcnet: fix Negative array index read
s->xmit_pos maybe assigned to a negative value (-1),
but in this branch variable s->xmit_pos as an index to
array s->buffer. Let's add a check for s->xmit_pos.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 7b50d00911)
[BR: BSC#932770]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	hw/net/pcnet.c
2021-03-17 21:09:20 -06:00
Petr Matousek
1a16c42ecc fdc: force the fifo access to be in bounds of the allocated buffer
During processing of certain commands such as FD_CMD_READ_ID and
FD_CMD_DRIVE_SPECIFICATION_COMMAND the fifo memory access could
get out of bounds leading to memory corruption with values coming
from the guest.

Fix this by making sure that the index is always bounded by the
allocated memory.

This is CVE-2015-3456.

Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
[AF: BSC#929339]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:20 -06:00
Daniel P. Berrange
9b91ab56ba CVE-2015-1779: limit size of HTTP headers from websockets clients
The VNC server websockets decoder will read and buffer data from
websockets clients until it sees the end of the HTTP headers,
as indicated by \r\n\r\n. In theory this allows a malicious to
trick QEMU into consuming an arbitrary amount of RAM. In practice,
because QEMU runs g_strstr_len() across the buffered header data,
it will spend increasingly long burning CPU time searching for
the substring match and less & less time reading data. So while
this does cause arbitrary memory growth, the bigger problem is
that QEMU will be burning 100% of available CPU time.

A novnc websockets client typically sends headers of around
512 bytes in length. As such it is reasonable to place a 4096
byte limit on the amount of data buffered while searching for
the end of HTTP headers.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 2cdb5e142f)
[AF: BSC#924018]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:20 -06:00
Daniel P. Berrange
0d4de284e8 CVE-2015-1779: incrementally decode websocket frames
The logic for decoding websocket frames wants to fully
decode the frame header and payload, before allowing the
VNC server to see any of the payload data. There is no
size limit on websocket payloads, so this allows a
malicious network client to consume 2^64 bytes in memory
in QEMU. It can trigger this denial of service before
the VNC server even performs any authentication.

The fix is to decode the header, and then incrementally
decode the payload data as it is needed. With this fix
the websocket decoder will allow at most 4k of data to
be buffered before decoding and processing payload.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
[ kraxel: fix frequent spurious disconnects, suggested by Peter Maydell ]
[ kraxel: fix 32bit build ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit a2bebfd6e0)
[AF: BSC#924018]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:20 -06:00
Paolo Bonzini
157e595970 virtio-blk: do not relay a previous driver's WCE configuration to the current
The following sequence happens:
- the SeaBIOS virtio-blk driver does not support the WCE feature, which
causes QEMU to disable writeback caching

- the Linux virtio-blk driver resets the device, finds WCE is available
but writeback caching is disabled; tells block layer to not send cache
flush commands

- the Linux virtio-blk driver sets the DRIVER_OK bit, which causes
writeback caching to be re-enabled, but the Linux virtio-blk driver does
not know of this side effect and cache flushes remain disabled

The bug is at the third step.  If the guest does know about CONFIG_WCE,
QEMU should ignore the WCE feature's state.  The guest will control the
cache mode solely using configuration space.  This change makes Linux
do flushes correctly, but Linux will keep SeaBIOS's writethrough mode.

Hence, whenever the guest is reset, the cache mode of the disk should
be reset to whatever was specified in the "-drive" option.  With this
change, the Linux virtio-blk driver finds that writeback caching is
enabled, and tells the block layer to send cache flush commands
appropriately.

Reported-by: Rusty Russell <rusty@au1.ibm.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
[BR: Fixes bsc#920571]
(cherry picked from commit ef5bc96268)
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	hw/virtio-blk.c
	hw/virtio-blk.h
2021-03-17 21:09:20 -06:00
8652076a8a migration: Fix incorrect state information for migrate_cancel
In qemu 1.4.x, when performing migrate_cancel during migration,
if the migrate_fd_cancel in main thread is scheduled to run before
thread buffered_file_thread calls migrate_fd_put_buffer, the s->state
will be modified to MIG_STATE_CANCELLED by main thread, then the
migrate_fd_put_buffer in thread buffered_file_thread will return
-EIO if s->state != MIG_STATE_ACTIVE. This return value will trigger
migrate_fd_error to set s->state = MIG_STATE_ERROR.

The patch fixes this issue.

Signed-off-by: Lin Ma <lma@suse.com>
[AF: BNC#843074]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:20 -06:00
Michael S. Tsirkin
4f6297ffde cpu: verify that block->host is set
If it isn't, access at an offset will cause memory corruption.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
(cherry picked from commit b78accf614)

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 21:09:20 -06:00
Michael S. Tsirkin
1eb10086b3 cpu: assert host pointer offset within block
Make accesses safer in case we missed some
check somewhere.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
(cherry picked from commit fd5f3b6367)

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 21:09:20 -06:00
Michael S. Tsirkin
d19868d2ae exec: add wrapper for host pointer access
host pointer accesses force pointer math, let's
add a wrapper to make them safer.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amos Kong <akong@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
(cherry picked from commit 1240be2435)
Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 21:09:20 -06:00
Michael S. Tsirkin
774865abda migration: fix parameter validation on ram load
During migration, the values read from migration stream during ram load
are not validated. Especially offset in host_from_stream_offset() and
also the length of the writes in the callers of said function.

To fix this, we need to make sure that the [offset, offset + length]
range fits into one of the allocated memory regions.

Validating addr < len should be sufficient since data seems to always be
managed in TARGET_PAGE_SIZE chunks.

Fixes: CVE-2014-7840

Note: follow-up patches add extra checks on each block->host access.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
(cherry picked from commit 0be839a270)
Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 21:09:20 -06:00
Gerd Hoffmann
4c4e7b58f9 cirrus: don't overflow CirrusVGAState->cirrus_bltbuf
This is CVE-2014-8106.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit bf25983345)
[AF: BNC#907805]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:20 -06:00
Gerd Hoffmann
42f0b2c45e cirrus: fix blit region check
Issues:
 * Doesn't check pitches correctly in case it is negative.
 * Doesn't check width at all.

Turn macro into functions while being at it, also factor out the check
for one region which we then can simply call twice for src + dst.

This is CVE-2014-8106.

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d3532a0db0)
[AF: BNC#907805]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:20 -06:00
Peter Lieven
7e1b8bf7ee migration: do not overwrite zero pages
on incoming migration do not memset pages to zero if they already read as zero.
this will allocate a new zero page and consume memory unnecessarily. even
if we madvise a MADV_DONTNEED later this will only deallocate the memory
asynchronously.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 211ea74022)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:20 -06:00
Peter Lieven
68fcd8b903 Revert "migration: do not sent zero pages in bulk stage"
Not sending zero pages breaks migration if a page is zero
at the source but not at the destination. This can e.g. happen
if different BIOS versions are used at source and destination.
It has also been reported that migration on pseries is completely
broken with this patch.

This effectively reverts commit f1c72795af.

Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:

	arch_init.c

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 9ef051e553)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:20 -06:00
Peter Lieven
62d134cd3a migration: use XBZRLE only after bulk stage
at the beginning of migration all pages are marked dirty and
in the first round a bulk migration of all pages is performed.

currently all these pages are copied to the page cache regardless
of whether they are frequently updated or not. this doesn't make sense
since most of these pages are never transferred again.

this patch changes the XBZRLE transfer to only be used after
the bulk stage has been completed. that means a page is added
to the page cache the second time it is transferred and XBZRLE
can benefit from the third time of transfer.

since the page cache is likely smaller than the number of pages
it's also likely that in the second round the page is missing in the
cache due to collisions in the bulk phase.

on the other hand a lot of unnecessary mallocs, memdups and frees
are saved.

the following results have been taken earlier while executing
the test program from docs/xbzrle.txt. (+) with the patch and (-)
without. (thanks to Eric Blake for reformatting and comments)

+ total time: 22185 milliseconds
- total time: 22410 milliseconds

Shaved 0.3 seconds, better than 1%!

+ downtime: 29 milliseconds
- downtime: 21 milliseconds

Not sure why downtime seemed worse, but probably not the end of the world.

+ transferred ram: 706034 kbytes
- transferred ram: 721318 kbytes

Fewer bytes sent - good.

+ remaining ram: 0 kbytes
- remaining ram: 0 kbytes
+ total ram: 1057216 kbytes
- total ram: 1057216 kbytes
+ duplicate: 108556 pages
- duplicate: 105553 pages
+ normal: 175146 pages
- normal: 179589 pages
+ normal bytes: 700584 kbytes
- normal bytes: 718356 kbytes

Fewer normal bytes...

+ cache size: 67108864 bytes
- cache size: 67108864 bytes
+ xbzrle transferred: 3127 kbytes
- xbzrle transferred: 630 kbytes

...and more compressed pages sent - good.

+ xbzrle pages: 117811 pages
- xbzrle pages: 21527 pages
+ xbzrle cache miss: 18750
- xbzrle cache miss: 179589

And very good improvement on the cache miss rate.

+ xbzrle overflow : 0
- xbzrle overflow : 0

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 5cc11c46cf)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:20 -06:00
Peter Lieven
d8e61c490d migration: do not search dirty pages in bulk stage
avoid searching for dirty pages just increment the
page offset. all pages are dirty anyway.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 70c8652bf3)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:20 -06:00
Peter Lieven
7e43f15817 migration: do not sent zero pages in bulk stage
during bulk stage of ram migration if a page is a
zero page do not send it at all.
the memory at the destination reads as zero anyway.

even if there is an madvise with QEMU_MADV_DONTNEED
at the target upon receipt of a zero page I have observed
that the target starts swapping if the memory is overcommitted.
it seems that the pages are dropped asynchronously.

this patch also updates QMP to return the number of
skipped pages in MigrationStats.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit f1c72795af)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:20 -06:00
Peter Lieven
3d163225bf migration: add an indicator for bulk state of ram migration
the first round of ram transfer is special since all pages
are dirty and thus all memory pages are transferred to
the target. this patch adds a boolean variable to track
this stage.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 78d07ae7ac)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:20 -06:00
Peter Lieven
583c04d8af migration: search for zero instead of dup pages
virtually all dup pages are zero pages. remove
the special is_dup_page() function and use the
optimized buffer_find_nonzero_offset() function
instead.

here buffer_find_nonzero_offset() is used directly
to avoid the unnecssary additional checks in
buffer_is_zero().

raw performace gain checking 1 GByte zeroed memory
over is_dup_page() is approx. 10-12% with SSE2
and 8-10% with unsigned long arithmedtic.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Orit Wasserman <owasserm@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 3edcd7e6eb)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:20 -06:00
Peter Lieven
5f829735d9 cutils: add a function to find non-zero content in a buffer
this adds buffer_find_nonzero_offset() which is a SSE2/Altivec
optimized function that searches for non-zero content in a
buffer.

the function starts full unrolling only after the first few chunks have
been checked one by one. analyzing real memory page data has revealed
that non-zero pages are non-zero within the first 256-512 bits in
most cases. as this function is also heavily used to check for zero memory
pages this tweak has been made to avoid the high setup costs of the fully
unrolled check for non-zero pages.

due to the optimizations used in the function there are restrictions
on buffer address and search length. the function
can_use_buffer_find_nonzero_content() can be used to check if
the function can be used safely.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 41a259bd2b)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:20 -06:00
Peter Lieven
51381a8f5e move vector definitions to qemu-common.h
vector optimizations will now be used at various places
not just in is_dup_page() in arch_init.c

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit c61ca00ada)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:20 -06:00
Juan Quintela
79b7d17769 migration: Improve QMP documentation
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 817c60457f)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:20 -06:00
Peter Crosthwaite
9205eaf678 iov: Factor out hexdumper
Factor out the hexdumper functionality from iov for all to use. Useful for
creating verbose debug printfery that dumps packet data.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: faaac219c55ea586d3f748befaf5a2788fd271b8.1361853677.git.peter.crosthwaite@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 6ff66f50f0)
[LM: BNC#878350]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:19 -06:00
Tony Breeds
0522639474 block/raw-posix: use seek_hole ahead of fiemap
try_fiemap() uses FIEMAP_FLAG_SYNC which has a significant performance
impact.

Prefer seek_hole() over fiemap() to avoid this impact where possible.
seek_hole is more widely used and, arguably, has potential to be
optimised in the kernel.

Reported-By: Michael Steffens <michael_steffens@posteo.de>
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: Pádraig Brady <pbrady@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 7c15903789)
[BR: BNC#908381]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	block/raw-posix.c
2021-03-17 21:09:19 -06:00
Tony Breeds
1d4a975d59 block/raw-posix: Fix disk corruption in try_fiemap
Using fiemap without FIEMAP_FLAG_SYNC is a known corrupter.

Add the FIEMAP_FLAG_SYNC flag to the FS_IOC_FIEMAP ioctl.  This has
the downside of significantly reducing performance.

Reported-By: Michael Steffens <michael_steffens@posteo.de>
Signed-off-by: Tony Breeds <tony@bakeyournoodle.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: Pádraig Brady <pbrady@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 38c4d0aea3)
[BR: BNC#908381]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:19 -06:00
Max Reitz
2026c9d65d block/raw-posix: Try both FIEMAP and SEEK_HOLE
The current version of raw-posix always uses ioctl(FS_IOC_FIEMAP) if
FIEMAP is available; lseek with SEEK_HOLE/SEEK_DATA are not even
compiled in in this case. However, there may be implementations which
support the latter but not the former (e.g., NFSv4.2) as well as vice
versa.

To cover both cases, try FIEMAP first (as this will return -ENOTSUP if
not supported instead of returning a failsafe value (everything
allocated as a single extent)) and if that does not work, fall back to
SEEK_HOLE/SEEK_DATA.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 4f11aa8a40)
[BR: BNC#908381]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	block/raw-posix.c
2021-03-17 21:09:19 -06:00
85b5143a9d qdev: Validate hex properties
strtoul(l) might overflow, in which case it'll return '-1' and set
the appropriate error code. So update the calls to strtoul(l) when
parsing hex properties to avoid silent overflows.
And we should be using an intermediate variable to avoid clobbering
of the passed-in point on error.

Signed-off-by: Hannes Reinecke <hare@suse.de>

Backported the patch from:
http://lists.nongnu.org/archive/html/qemu-devel/2013-11/msg03950.html
[LM: BNC#852397]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:19 -06:00
Amos Kong
c8992c2856 add a boot option to do strict boot
Seabios already added a new device type to halt booting.
Qemu can add "HALT" at the end of bootindex string, then
seabios will halt booting after trying to boot from all
selected devices.

This patch added a new boot option to configure if boot
from un-selected devices.

This option only effects when boot priority is changed by
bootindex options, the old style(-boot order=..) will still
try to boot from un-selected devices.

v2: add HALT entry in get_boot_devices_list()
v3: rebase to latest qemu upstream

Signed-off-by: Amos Kong <akong@redhat.com>
Message-id: 1363674207-31496-1-git-send-email-akong@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit c8a6ae8bb9)
Signed-off-by: Lin Ma <lma@suse.com>
[BR: BNC#900084]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:19 -06:00
Amos Kong
386ce302c5 monitor: introduce query-command-line-options
Libvirt has no way to probe if an option or property is supported,
This patch introduces a new qmp command to query command line
option information. hmp command isn't added because it's not needed.

Signed-off-by: Amos Kong <akong@redhat.com>
CC: Luiz Capitulino <lcapitulino@redhat.com>
CC: Osier Yang <jyang@redhat.com>
CC: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 1f8f987d34)
[BR: BNC#899144]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	qapi-schema.json
2021-03-17 21:09:19 -06:00
Petr Matousek
4a8248c367 slirp: udp: fix NULL pointer dereference because of uninitialized socket
When guest sends udp packet with source port and source addr 0,
uninitialized socket is picked up when looking for matching and already
created udp sockets, and later passed to sosendto() where NULL pointer
dereference is hit during so->slirp->vnetwork_mask.s_addr access.

Fix this by checking that the socket is not just a socket stub.

This is CVE-2014-3640.

Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Reported-by: Xavier Mehrenberger <xavier.mehrenberger@airbus.com>
Reported-by: Stephane Duverger <stephane.duverger@eads.net>
Reviewed-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-id: 20140918063537.GX9321@dhcp-25-225.brq.redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 01f7cecf00)
[AF: BNC#897654]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
f0fa303aaf usb: fix up post load checks
Correct post load checks:
1. dev->setup_len == sizeof(dev->data_buf)
    seems fine, no need to fail migration
2. When state is DATA, passing index > len
   will cause memcpy with negative length,
   resulting in heap overflow

First of the issues was reported by dgilbert.

Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 719ffe1f5f)
[AF: BNC#878541]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
33d5bbb8e7 qcow1: Validate image size (CVE-2014-0223)
A huge image size could cause s->l1_size to overflow. Make sure that
images never require a L1 table larger than what fits in s->l1_size.

This cannot only cause unbounded allocations, but also the allocation of
a too small L1 table, resulting in out-of-bounds array accesses (both
reads and writes).

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 46485de0cb)
[AF: BNC#877645; error_setg() -> qerror_report()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
687ff44c54 qcow1: Validate L2 table size (CVE-2014-0222)
Too large L2 table sizes cause unbounded allocations. Images actually
created by qemu-img only have 512 byte or 4k L2 tables.

To keep things consistent with cluster sizes, allow ranges between 512
bytes and 64k (in fact, down to 1 entry = 8 bytes is technically
working, but L2 table sizes smaller than a cluster don't make a lot of
sense).

This also means that the number of bytes on the virtual disk that are
described by the same L2 table is limited to at most 8k * 64k or 2^29,
preventively avoiding any integer overflows.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
(cherry picked from commit 42eb58179b)
[AF: BNC#877642; error_setg() -> qerror_report()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
dce4a6fc84 qcow1: Check maximum cluster size
Huge values for header.cluster_bits cause unbounded allocations (e.g.
for s->cluster_cache) and crash qemu this way. Less huge values may
survive those allocations, but can cause integer overflows later on.

The only cluster sizes that qemu can create are 4k (for standalone
images) and 512 (for images with backing files), so we can limit it
to 64k.

Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Benoit Canet <benoit@irqsave.net>
(cherry picked from commit 7159a45b2b)
[AF: error_setg() -> qerror_report(), disabled iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Alexander Graf
4fb4d6b29b KVM: Extend dynamic MSI route flush
We have a limited number of IRQ routing entries. To ensure that we don't
exceed that number, we flush dynamically created MSI route entries when
we realize that we're running out of space.

However, we count the GSI count incorrectly. This patch adds a safetly net
to make sure we're flushing all dynamic MSI routes when we realize that we
would be exceeding the number space.

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 21:09:19 -06:00
Alexander Graf
afb255de4a x86 XSAVE: Reconstruct xsave cpuid leafs
Different virtual CPUs implement different capabilities of features that
get reflected in XSAVE depth. So instead of passing in the maximum XSAVE
capabilities, we should only tell the guest as much as it has to see for
the respective chosen vcpu type.

This patch is heavily based on commit 2560f19f but adjusted to apply on
our ancient code base.

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 21:09:19 -06:00
Alexander Graf
d2b641e419 x86 PMU: Disable vPMU cpuid exposure
On x86 we expose all KVM PMU capabilities back into KVM. Unfortunately our
vPMU emulation breaks Windows Server 2012 R2.

Because we're lacking support for all the vPMU MSRs anyway, disable all PMU
CPUID flags as well, so that Windows is happy and we don't get a half-way
implemented PMU exposed into our guest.

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 21:09:19 -06:00
Alexander Graf
921b44a765 KVM: Fix GSI number space limit
KVM tells us the number of GSIs it can handle inside the kernel. That value is
basically KVM_MAX_IRQ_ROUTES. However when we try to set the GSI mapping table,
it checks for

    r = -EINVAL;
    if (routing.nr >= KVM_MAX_IRQ_ROUTES)
        goto out;

erroring out even when we're only using all of the GSIs. To make sure we never
hit that limit, let's reduce the number of GSIs we get from KVM by one.

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
cae32837da virtio: validate config_len on load
Malformed input can have config_len in migration stream
exceed the array size allocated on destination, the
result will be heap overflow.

To fix, that config_len matches on both sides.

CVE-2014-0182

Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>

--

v2: use %ix and %zx to print config_len values
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit a890a2f913)
[AF: BNC#874788]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
81eb2c4322 virtio-net: out-of-bounds buffer write on load
CVE-2013-4149 QEMU 1.3.0 out-of-bounds buffer write in
virtio_net_load()@hw/net/virtio-net.c

>         } else if (n->mac_table.in_use) {
>             uint8_t *buf = g_malloc0(n->mac_table.in_use);

We are allocating buffer of size n->mac_table.in_use

>             qemu_get_buffer(f, buf, n->mac_table.in_use * ETH_ALEN);

and read to the n->mac_table.in_use size buffer n->mac_table.in_use *
ETH_ALEN bytes, corrupting memory.

If adversary controls state then memory written there is controlled
by adversary.

Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 98f93ddd84)
[AF: BNC#864649]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael Roth
9a18e51d76 openpic: avoid buffer overrun on incoming migration
CVE-2013-4534

opp->nb_cpus is read from the wire and used to determine how many
IRQDest elements to read into opp->dst[]. If the value exceeds the
length of opp->dst[], MAX_CPU, opp->dst[] can be overrun with arbitrary
data from the wire.

Fix this by failing migration if the value read from the wire exceeds
MAX_CPU.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 73d963c0a7)
[AF: BNC#864811; backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
0698489509 ssi-sd: fix buffer overrun on invalid state load
CVE-2013-4537

s->arglen is taken from wire and used as idx
in ssi_sd_transfer().

Validate it before access.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit a9c380db3b)
[AF: BNC#864391]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Peter Maydell
fde2b7d50e savevm: Ignore minimum_version_id_old if there is no load_state_old
At the moment we require vmstate definitions to set minimum_version_id_old
to the same value as minimum_version_id if they do not provide a
load_state_old handler. Since the load_state_old functionality is
required only for a handful of devices that need to retain migration
compatibility with a pre-vmstate implementation, this means the bulk
of devices have pointless boilerplate. Relax the definition so that
minimum_version_id_old is ignored if there is no load_state_old handler.

Note that under the old scheme we would segfault if the vmstate
specified a minimum_version_id_old that was less than minimum_version_id
but did not provide a load_state_old function, and the incoming state
specified a version number between minimum_version_id_old and
minimum_version_id. Under the new scheme this will just result in
our failing the migration.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 767adce2d9)
[AF: Backported from vmstate.c]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
e6cb432c45 usb: sanity check setup_index+setup_len in post_load
CVE-2013-4541

s->setup_len and s->setup_index are fed into usb_packet_copy as
size/offset into s->data_buf, it's possible for invalid state to exploit
this to load arbitrary data.

setup_len and setup_index should be checked to make sure
they are not negative.

Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 9f8e9895c5)
[AF: BNC#864802]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
fe7ba41ebc vmstate: s/VMSTATE_INT32_LE/VMSTATE_INT32_POSITIVE_LE/
As the macro verifies the value is positive, rename it
to make the function clearer.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 3476436a44)
[AF: backported (target-arm doesn't use VMState yet)]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
61274a8253 virtio-scsi: fix buffer overrun on invalid state load
CVE-2013-4542

hw/scsi/scsi-bus.c invokes load_request.

 virtio_scsi_load_request does:
    qemu_get_buffer(f, (unsigned char *)&req->elem, sizeof(req->elem));

this probably can make elem invalid, for example,
make in_num or out_num huge, then:

    virtio_scsi_parse_req(s, vs->cmd_vqs[n], req);

will do:

    if (req->elem.out_num > 1) {
        qemu_sgl_init_external(req, &req->elem.out_sg[1],
                               &req->elem.out_addr[1],
                               req->elem.out_num - 1);
    } else {
        qemu_sgl_init_external(req, &req->elem.in_sg[1],
                               &req->elem.in_addr[1],
                               req->elem.in_num - 1);
    }

and this will access out of array bounds.

Note: this adds security checks within assert calls since
SCSIBusInfo's load_request cannot fail.
For now simply disable builds with NDEBUG - there seems
to be little value in supporting these.

Cc: Andreas Färber <afaerber@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 3c3ce98142)
[AF: BNC#864804, backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
b84de338f9 zaurus: fix buffer overrun on invalid state load
CVE-2013-4540

Within scoop_gpio_handler_update, if prev_level has a high bit set, then
we get bit > 16 and that causes a buffer overrun.

Since prev_level comes from wire indirectly, this can
happen on invalid state load.

Similarly for gpio_level and gpio_dir.

To fix, limit to 16 bit.

Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 52f91c3723)
[AF: BNC#864801]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
ea0170cca4 tsc210x: fix buffer overrun on invalid state load
CVE-2013-4539

s->precision, nextprecision, function and nextfunction
come from wire and are used
as idx into resolution[] in TSC_CUT_RESOLUTION.

Validate after load to avoid buffer overrun.

Cc: Andreas Färber <afaerber@suse.de>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 5193be3be3)
[AF: BNC#864805]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
99604c72ed ssd0323: fix buffer overun on invalid state load
CVE-2013-4538

s->cmd_len used as index in ssd0323_transfer() to store 32-bit field.
Possible this field might then be supplied by guest to overwrite a
return addr somewhere. Same for row/col fields, which are indicies into
framebuffer array.

To fix validate after load.

Additionally, validate that the row/col_start/end are within bounds;
otherwise the guest can provoke an overrun by either setting the _end
field so large that the row++ increments just walk off the end of the
array, or by setting the _start value to something bogus and then
letting the "we hit end of row" logic reset row to row_start.

For completeness, validate mode as well.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit ead7a57df3)
[AF: BNC#864769]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
36cfdceeb6 pxa2xx: avoid buffer overrun on incoming migration
CVE-2013-4533

s->rx_level is read from the wire and used to determine how many bytes
to subsequently read into s->rx_fifo[]. If s->rx_level exceeds the
length of s->rx_fifo[] the buffer can be overrun with arbitrary data
from the wire.

Fix this by validating rx_level against the size of s->rx_fifo.

Cc: Don Koch <dkoch@verizon.com>
Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Don Koch <dkoch@verizon.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit caa881abe0)
[AF: BNC#864655]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
a60c88f157 virtio: validate num_sg when mapping
CVE-2013-4535
CVE-2013-4536

Both virtio-block and virtio-serial read,
VirtQueueElements are read in as buffers, and passed to
virtqueue_map_sg(), where num_sg is taken from the wire and can force
writes to indicies beyond VIRTQUEUE_MAX_SIZE.

To fix, validate num_sg.

Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Cc: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 36cf2a3713)
[AF: BNC#864665]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael Roth
f0fa74faab virtio: avoid buffer overrun on incoming migration
CVE-2013-6399

vdev->queue_sel is read from the wire, and later used in the
emulation code as an index into vdev->vq[]. If the value of
vdev->queue_sel exceeds the length of vdev->vq[], currently
allocated to be VIRTIO_PCI_QUEUE_MAX elements, subsequent PIO
operations such as VIRTIO_PCI_QUEUE_PFN can be used to overrun
the buffer with arbitrary data originating from the source.

Fix this by failing migration if the value from the wire exceeds
VIRTIO_PCI_QUEUE_MAX.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 4b53c2c72c)
[AF: BNC#864814]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
0f15235aa3 vmstate: fix buffer overflow in target-arm/machine.c
CVE-2013-4531

cpreg_vmstate_indexes is a VARRAY_INT32. A negative value for
cpreg_vmstate_array_len will cause a buffer overflow.

VMSTATE_INT32_LE was supposed to protect against this
but doesn't because it doesn't validate that input is
non-negative.

Fix this macro to valide the value appropriately.

The only other user of VMSTATE_INT32_LE doesn't
ever use negative numbers so it doesn't care.

Reported-by: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit d2ef4b61fe)
[AF: BNC#864796, backported from vmstate.c]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Dr. David Alan Gilbert
39acd639aa Fix vmstate_info_int32_le comparison/assign
Fix comparison of vmstate_info_int32_le so that it succeeds if loaded
value is (l)ess than or (e)qual

When the comparison succeeds, assign the value loaded
  This is a change in behaviour but I think the original intent, since
  the idea is to check if the version/size of the thing you're loading is
  less than some limit, but you might well want to do something based on
  the actual version/size in the file

Fix up comment and name text

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 24a370ef23)
[AF: Backported from vmstate.c]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
f9e6598dbf pl022: fix buffer overun on invalid state load
CVE-2013-4530

pl022.c did not bounds check tx_fifo_head and
rx_fifo_head after loading them from file and
before they are used to dereference array.

Reported-by: Michael S. Tsirkin <mst@redhat.com
Reported-by: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit d8d0a0bc7e)
[AF: BNC#864682]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
5b2547bea6 hw/pci/pcie_aer.c: fix buffer overruns on invalid state load
4) CVE-2013-4529
hw/pci/pcie_aer.c    pcie aer log can overrun the buffer if log_num is
                     too large

There are two issues in this file:
1. log_max from remote can be larger than on local
then buffer will overrun with data coming from state file.
2. log_num can be larger then we get data corruption
again with an overflow but not adversary controlled.

Fix both issues.

Reported-by: Anthony Liguori <anthony@codemonkey.ws>
Reported-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 5f691ff91d)
[AF: BNC#864678]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
07054ebc47 hpet: fix buffer overrun on invalid state load
CVE-2013-4527 hw/timer/hpet.c buffer overrun

hpet is a VARRAY with a uint8 size but static array of 32

To fix, make sure num_timers is valid using VMSTATE_VALID hook.

Reported-by: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 3f1c49e213)
[AF: BNC#864673]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
8253e0d82e vmstate: add VMSTATE_VALIDATE
Validate state using VMS_ARRAY with num = 0 and VMS_MUST_EXIST

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 4082f0889b)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
23adb56bf0 vmstate: add VMS_MUST_EXIST
Can be used to verify a required field exists or validate
state in some other way.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 5bf81c8d63)
[AF: Backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
c4cd1f6e7a ahci: fix buffer overrun on invalid state load
CVE-2013-4526

Within hw/ide/ahci.c, VARRAY refers to ports which is also loaded.  So
we use the old version of ports to read the array but then allow any
value for ports.  This can cause the code to overflow.

There's no reason to migrate ports - it never changes.
So just make sure it matches.

Reported-by: Anthony Liguori <anthony@codemonkey.ws>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit ae2158ad6c)
[AF: BNC#864671]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
a2b29c3489 virtio: out-of-bounds buffer write on invalid state load
CVE-2013-4151 QEMU 1.0 out-of-bounds buffer write in
virtio_load@hw/virtio/virtio.c

So we have this code since way back when:

    num = qemu_get_be32(f);

    for (i = 0; i < num; i++) {
        vdev->vq[i].vring.num = qemu_get_be32(f);

array of vqs has size VIRTIO_PCI_QUEUE_MAX, so
on invalid input this will write beyond end of buffer.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit cc45995294)
[AF: BNC#864653]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
d058533c65 virtio-net: out-of-bounds buffer write on invalid state load
CVE-2013-4150 QEMU 1.5.0 out-of-bounds buffer write in
virtio_net_load()@hw/net/virtio-net.c

This code is in hw/net/virtio-net.c:

    if (n->max_queues > 1) {
        if (n->max_queues != qemu_get_be16(f)) {
            error_report("virtio-net: different max_queues ");
            return -1;
        }

        n->curr_queues = qemu_get_be16(f);
        for (i = 1; i < n->curr_queues; i++) {
            n->vqs[i].tx_waiting = qemu_get_be32(f);
        }
    }

Number of vqs is max_queues, so if we get invalid input here,
for example if max_queues = 2, curr_queues = 3, we get
write beyond end of the buffer, with data that comes from
wire.

This might be used to corrupt qemu memory in hard to predict ways.
Since we have lots of function pointers around, RCE might be possible.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit eea750a562)
[AF: BNC#864650]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
4c6198d076 virtio-net: fix buffer overflow on invalid state load
CVE-2013-4148 QEMU 1.0 integer conversion in
virtio_net_load()@hw/net/virtio-net.c

Deals with loading a corrupted savevm image.

>         n->mac_table.in_use = qemu_get_be32(f);

in_use is int so it can get negative when assigned 32bit unsigned value.

>         /* MAC_TABLE_ENTRIES may be different from the saved image */
>         if (n->mac_table.in_use <= MAC_TABLE_ENTRIES) {

passing this check ^^^

>             qemu_get_buffer(f, n->mac_table.macs,
>                             n->mac_table.in_use * ETH_ALEN);

with good in_use value, "n->mac_table.in_use * ETH_ALEN" can get
positive and bigger than mac_table.macs. For example 0x81000000
satisfies this condition when ETH_ALEN is 6.

Fix it by making the value unsigned.
For consistency, change first_multi as well.

Note: all call sites were audited to confirm that
making them unsigned didn't cause any issues:
it turns out we actually never do math on them,
so it's easy to validate because both values are
always <= MAC_TABLE_ENTRIES.

Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 71f7fe48e1)
[AF: BNC#864812; backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Michael S. Tsirkin
ae2d6aa056 virtio-net: fix guest-triggerable buffer overrun
When VM guest programs multicast addresses for
a virtio net card, it supplies a 32 bit
entries counter for the number of addresses.
These addresses are read into tail portion of
a fixed macs array which has size MAC_TABLE_ENTRIES,
at offset equal to in_use.

To avoid overflow of this array by guest, qemu attempts
to test the size as follows:
-    if (in_use + mac_data.entries <= MAC_TABLE_ENTRIES) {

however, as mac_data.entries is uint32_t, this sum
can overflow, e.g. if in_use is 1 and mac_data.entries
is 0xffffffff then in_use + mac_data.entries will be 0.

Qemu will then read guest supplied buffer into this
memory, overflowing buffer on heap.

CVE-2014-0150

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1397218574-25058-1-git-send-email-mst@redhat.com
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit edc2438512)
[AF: Resolves BNC#873235; backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Benoît Canet
0c8626fc06 ide: Correct improper smart self test counter reset in ide core.
The SMART self test counter was incorrectly being reset to zero,
not 1. This had the effect that on every 21st SMART EXECUTE OFFLINE:
 * We would write off the beginning of a dynamically allocated buffer
 * We forgot the SMART history
Fix this.

Signed-off-by: Benoit Canet <benoit@irqsave.net>
Message-id: 1397336390-24664-1-git-send-email-benoit.canet@irqsave.net
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Cc: qemu-stable@nongnu.org
Acked-by: Kevin Wolf <kwolf@redhat.com>
[PMM: tweaked commit message as per suggestions from Markus]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

(cherry picked from commit 940973ae0b)
[AF: Addresses CVE-2014-2894; backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Jason Wang
8643a13d4a virtio: properly validate address before accessing config
There are several several issues in the current checking:

- The check was based on the minus of unsigned values which can overflow
- It was done after .{set|get}_config() which can lead crash when config_len
  is zero since vdev->config is NULL

Fix this by:

- Validate the address in virtio_pci_config_{read|write}() before
  .{set|get}_config
- Use addition instead minus to do the validation

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Petr Matousek <pmatouse@redhat.com>
Message-id: 1367905369-10765-1-git-send-email-jasowang@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 5f5a131865)
[AF: BNC#817593, CVE-2013-2016; backported]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
a125af5b32 parallels: Sanity check for s->tracks (CVE-2014-0142)
This avoids a possible division by zero.

Convert s->tracks to unsigned as well because it feels better than
surviving just because the results of calculations with s->tracks are
converted to unsigned anyway.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 9302e863aa)
[AF: BNC#870439; error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
b6d96fa269 parallels: Fix catalog size integer overflow (CVE-2014-0143)
The first test case would cause a huge memory allocation, leading to a
qemu abort; the second one to a too small malloc() for the catalog
(smaller than s->catalog_size), which causes a read-only out-of-bounds
array access and on big endian hosts an endianess conversion for an
undefined memory area.

The sample image used here is not an original Parallels image. It was
created using an hexeditor on the basis of the struct that qemu uses.
Good enough for trying to crash the driver, but not for ensuring
compatibility.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit afbcc40bee)
[AF: BNC#870439; error_setg() -> qerror_report(), dropped iotests]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
7ce6986d8e qcow2: Limit snapshot table size
Even with a limit of 64k snapshots, each snapshot could have a filename
and an ID with up to 64k, which would still lead to pretty large
allocations, which could potentially lead to qemu aborting. Limit the
total size of the snapshot table to an average of 1k per entry when
the limit of 64k snapshots is fully used. This should be plenty for any
reasonable user.

This also fixes potential integer overflows of s->snapshot_size.

Suggested-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 5dae6e30c5)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
8932352174 qcow2: Check maximum L1 size in qcow2_snapshot_load_tmp() (CVE-2014-0143)
This avoids an unbounded allocation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6a83f8b5be)
[AF: BNC#870439; rebased on report_unsupported(),
     error_setg() -> error_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
b6f0cb127e qcow2: Fix L1 allocation size in qcow2_snapshot_load_tmp() (CVE-2014-0145)
For the L1 table to loaded for an internal snapshot, the code allocated
only enough memory to hold the currently active L1 table. If the
snapshot's L1 table is actually larger than the current one, this leads
to a buffer overflow.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit c05e4667be)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
6868abd2ad qcow2: Fix NULL dereference in qcow2_open() error path (CVE-2014-0146)
The qcow2 code assumes that s->snapshots is non-NULL if s->nb_snapshots
!= 0. By having the initialisation of both fields separated in
qcow2_open(), any error occuring in between would cause the error path
to dereference NULL in qcow2_free_snapshots() if the image had any
snapshots.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 11b128f406)
[AF: BNC#870439; dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
144778eb92 qcow2: Fix copy_sectors() with VM state
bs->total_sectors is not the highest possible sector number that could
be involved in a copy on write operation: VM state is after the end of
the virtual disk. This resulted in wrong values for the number of
sectors to be copied (n).

The code that checks for the end of the image isn't required any more
because the code hasn't been calling the block layer's bdrv_read() for a
long time; instead, it directly calls qcow2_readv(), which doesn't error
out on VM state sector numbers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6b7d4c5558)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
8d3c5b239a block: Limit request size (CVE-2014-0143)
Limiting the size of a single request to INT_MAX not only fixes a
direct integer overflow in bdrv_check_request() (which would only
trigger bad behaviour with ridiculously huge images, as in close to
2^64 bytes), but can also prevent overflows in all block drivers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 8f4754ede5)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Stefan Hajnoczi
d60d49abd9 dmg: prevent chunk buffer overflow (CVE-2014-0145)
Both compressed and uncompressed I/O is buffered.  dmg_open() calculates
the maximum buffer size needed from the metadata in the image file.

There is currently a buffer overflow since ->lengths[] is accounted
against the maximum compressed buffer size but actually uses the
uncompressed buffer:

  switch (s->types[chunk]) {
  case 1: /* copy */
      ret = bdrv_pread(bs->file, s->offsets[chunk],
                       s->uncompressed_chunk, s->lengths[chunk]);

We must account against the maximum uncompressed buffer size for type=1
chunks.

This patch fixes the maximum buffer size calculation to take into
account the chunk type.  It is critical that we update the correct
maximum since there are two buffers ->compressed_chunk and
->uncompressed_chunk.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f0dce23475)
[AF: BNC#870439; rebased]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Stefan Hajnoczi
01c88e1e46 dmg: use uint64_t consistently for sectors and lengths
The DMG metadata is stored as uint64_t, so use the same type for
sector_num.  int was a particularly poor choice since it is only 32-bit
and would truncate large values.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 686d7148ec)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Stefan Hajnoczi
02ceca2065 dmg: sanitize chunk length and sectorcount (CVE-2014-0145)
Chunk length and sectorcount are used for decompression buffers as well
as the bdrv_pread() count argument.  Ensure that they have reasonable
values so neither memory allocation nor conversion from uint64_t to int
will cause problems.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit c165f77580)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Stefan Hajnoczi
d3086a5380 dmg: use appropriate types when reading chunks
Use the right types instead of signed int:

  size_t new_size;

  This is a byte count for g_realloc() that is calculated from uint32_t
  and size_t values.

  uint32_t chunk_count;

  Use the same type as s->n_chunks, which is used together with
  chunk_count.

This patch is a cleanup and does not fix bugs.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit eb71803b04)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Stefan Hajnoczi
c6bf1d09dc dmg: drop broken bdrv_pread() loop
It is not necessary to check errno for EINTR and the block layer does
not produce short reads.  Therefore we can drop the loop that attempts
to read a compressed chunk.

The loop is buggy because it incorrectly adds the transferred bytes
twice:

  do {
      ret = bdrv_pread(...);
      i += ret;
  } while (ret >= 0 && ret + i < s->lengths[chunk]);

Luckily we can drop the loop completely and perform a single
bdrv_pread().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit b404bf8542)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Stefan Hajnoczi
c1d52eaa3c dmg: prevent out-of-bounds array access on terminator
When a terminator is reached the base for offsets and sectors is stored.
The following records that are processed will use this base value.

If the first record we encounter is a terminator, then calculating the
base values would result in out-of-bounds array accesses.  Don't do
that.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 73ed27ec28)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Stefan Hajnoczi
9adbfc9661 dmg: coding style and indentation cleanup
Clean up the mix of tabs and spaces, as well as the coding style
violations in block/dmg.c.  There are no semantic changes since this
patch simply reformats the code.

This patch is necessary before we can make meaningful changes to this
file, due to the inconsistent formatting and confusing indentation.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 2c1885adcf)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
2fcca8dc9f qcow2: Fix new L1 table size check (CVE-2014-0143)
The size in bytes is assigned to an int later, so check that instead of
the number of entries.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit cab60de930)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
ac79fe7e15 qcow2: Catch some L1 table index overflows
This catches the situation that is described in the bug report at
https://bugs.launchpad.net/qemu/+bug/865518 and goes like this:

    $ qemu-img create -f qcow2 huge.qcow2 $((1024*1024))T
    Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off
    $ qemu-io /tmp/huge.qcow2 -c "write $((1024*1024*1024*1024*1024*1024 - 1024)) 512"
    Segmentation fault

With this patch applied the segfault will be avoided, however the case
will still fail, though gracefully:

    $ qemu-img create -f qcow2 /tmp/huge.qcow2 $((1024*1024))T
    Formatting 'huge.qcow2', fmt=qcow2 size=1152921504606846976 encryption=off cluster_size=65536 lazy_refcounts=off
    qemu-img: The image size is too large for file format 'qcow2'

Note that even long before these overflow checks kick in, you get
insanely high memory usage (up to INT_MAX * sizeof(uint64_t) = 16 GB for
the L1 table), so with somewhat smaller image sizes you'll probably see
qemu aborting for a failed g_malloc().

If you need huge image sizes, you should increase the cluster size to
the maximum of 2 MB in order to get higher limits.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 2cf7cfa1cd)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
9d98758f5c qcow2: Protect against some integer overflows in bdrv_check
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 0abe740f1d)
[AF: BNC#870439; rebased]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
ed835bfeeb qcow2: Fix types in qcow2_alloc_clusters and alloc_clusters_noref
In order to avoid integer overflows.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit bb572aefbd)
[AF: BNC#870439; rebased]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
61108e6f74 qcow2: Check new refcount table size on growth
If the size becomes larger than what qcow2_open() would accept, fail the
growing operation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 2b5d5953ee)
[AF: BNC#870439; rebased on report_unsupported()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
db3621a1c5 qcow2: Avoid integer overflow in get_refcount (CVE-2014-0143)
This ensures that the checks catch all invalid cluster indexes
instead of returning the refcount of a wrong cluster.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit db8a31d11d)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
e5f4152b4d qcow2: Don't rely on free_cluster_index in alloc_refcount_block() (CVE-2014-0147)
free_cluster_index is only correct if update_refcount() was called from
an allocation function, and even there it's brittle because it's used to
protect unfinished allocations which still have a refcount of 0 - if it
moves in the wrong place, the unfinished allocation can be corrupted.

So not using it any more seems to be a good idea. Instead, use the
first requested cluster to do the calculations. Return -EAGAIN if
unfinished allocations could become invalid and let the caller restart
its search for some free clusters.

The context of creating a snapsnot is one situation where
update_refcount() is called outside of a cluster allocation. For this
case, the change fixes a buffer overflow if a cluster is referenced in
an L2 table that cannot be represented by an existing refcount block.
(new_table[refcount_table_index] was out of bounds)

[Bump the qemu-iotests 026 refblock_alloc.write leak count from 10 to
11.
--Stefan]

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit b106ad9185)
[AF: BNC#870439; dropped QCOW2_DISCARD_NEVER argument to
     qcow2_free_clusters() and update_refcount(), dropped iotests]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
452e3119e7 qcow2: Fix backing file name length check
len could become negative and would pass the check then. Nothing bad
happened because bdrv_pread() happens to return an error for negative
length values, but make variables for sizes unsigned anyway.

This patch also changes the behaviour to error out on invalid lengths
instead of silently truncating it to 1023.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6d33e8e7dc)
[AF: BNC#870439; error_setg() -> report_unsupported(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
c53fabbda9 qcow2: Validate active L1 table offset and size (CVE-2014-0144)
This avoids an unbounded allocation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 2d51c32c4b)
[AF: BNC#870439; error_setg() -> report_unsupported(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
390abc9bbe qcow2: Validate snapshot table offset/size (CVE-2014-0144)
This avoid unbounded memory allocation and fixes a potential buffer
overflow on 32 bit hosts.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ce48f2f441)
[AF: BNC#870439; error_setg() -> report_unsupported(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
6c9ede177a qcow2: Validate refcount table offset
The end of the refcount table must not exceed INT64_MAX so that integer
overflows are avoided.

Also check for misaligned refcount table. Such images are invalid and
probably the result of data corruption. Error out to avoid further
corruption.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 8c7de28305)
[AF: BNC#870439; error_setg() -> report_unsupported(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
5ad5d31414 qcow2: Check refcount table size (CVE-2014-0144)
Limit the in-memory reference count table size to 8 MB, it's enough in
practice. This fixes an unbounded allocation as well as a buffer
overflow in qcow2_refcount_init().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 5dab2faddc)
[AF: BNC#870439; error_setg() -> report_unsupported(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
d0c7a2de8f qcow2: Check backing_file_offset (CVE-2014-0144)
Header, header extension and the backing file name must all be stored in
the first cluster. Setting the backing file to a much higher value
allowed header extensions to become much bigger than we want them to be
(unbounded allocation).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit a1b3955c94)
[AF: BNC#870439; error_setg() -> report_unsupported(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
7c628f176e qcow2: Check header_length (CVE-2014-0144)
This fixes an unbounded allocation for s->unknown_header_fields.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 24342f2cae)
[AF: BNC#870439; error_setg() -> unsupported_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Fam Zheng
152474e9cd curl: check data size before memcpy to local buffer. (CVE-2014-0144)
curl_read_cb is callback function for libcurl when data arrives. The
data size passed in here is not guaranteed to be within the range of
request we submitted, so we may overflow the guest IO buffer. Check the
real size we have before memcpy to buffer to avoid overflow.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6d4b9e55fc)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Jeff Cody
054771e1b4 vdi: add bounds checks for blocks_in_image and disk_size header fields (CVE-2014-0144)
The maximum blocks_in_image is 0xffffffff / 4, which also limits the
maximum disk_size for a VDI image to 1024TB.  Note that this is the maximum
size that QEMU will currently support with this driver, not necessarily the
maximum size allowed by the image format.

This also fixes an incorrect error message, a bug introduced by commit
5b7aa9b56d (Reported by Stefan Weil)

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 63fa06dc97)
[AF: BNC#870439; changed error_setg() -> logout()]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
85c8e398c4 vpc: Validate block size (CVE-2014-0142)
This fixes some cases of division by zero crashes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 5e71dfad76)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Jeff Cody
38026b9b14 vpc/vhd: add bounds check for max_table_entries and block_size (CVE-2014-0144)
This adds checks to make sure that max_table_entries and block_size
are in sane ranges.  Memory is allocated based on max_table_entries,
and block_size is used to calculate indices into that allocated
memory, so if these values are incorrect that can lead to potential
unbounded memory allocation, or invalid memory accesses.

Also, the allocation of the pagetable is changed from g_malloc0()
to qemu_blockalign().

Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 97f1c45c6f)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
d67e105f86 bochs: Fix bitmap offset calculation
32 bit truncation could let us access the wrong offset in the image.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit a9ba36a45d)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
ac2a60feb2 bochs: Check extent_size header field (CVE-2014-0142)
This fixes two possible division by zero crashes: In bochs_open() and in
seek_to_sector().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 8e53abbc20)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
c612c16a66 bochs: Check catalog_size header field (CVE-2014-0143)
It should neither become negative nor allow unbounded memory
allocations. This fixes aborts in g_malloc() and an s->catalog_bitmap
buffer overflow on big endian hosts.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e3737b820b)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
3b37554011 bochs: Use unsigned variables for offsets and sizes (CVE-2014-0147)
Gets us rid of integer overflows resulting in negative sizes which
aren't correctly checked.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 246f65838d)
[AF: BNC#870439; dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Kevin Wolf
12aaa72057 bochs: Unify header structs and make them QEMU_PACKED
This is an on-disk structure, so offsets must be accurate.

Before this patch, sizeof(bochs) != sizeof(header_v1), which makes the
memcpy() between both invalid. We're lucky enough that the destination
buffer happened to be the larger one, and the memcpy size to be taken
from the smaller one, so we didn't get a buffer overflow in practice.

This patch unifies the both structures, eliminating the need to do a
memcpy in the first place. The common fields are extracted to the top
level of the struct and the actually differing part gets a union of the
two versions.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 3dd8a6763b)
[AF: BNC#870439]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Stefan Hajnoczi
b95824b29d block/cloop: fix offsets[] size off-by-one
cloop stores the number of compressed blocks in the n_blocks header
field.  The file actually contains n_blocks + 1 offsets, where the extra
offset is the end-of-file offset.

The following line in cloop_read_block() results in an out-of-bounds
offsets[] access:

    uint32_t bytes = s->offsets[block_num + 1] - s->offsets[block_num];

This patch allocates and loads the extra offset so that
cloop_read_block() works correctly when the last block is accessed.

Notice that we must free s->offsets[] unconditionally now since there is
always an end-of-file offset.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 42d43d35d9)
[AF: BNC#870439; dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Stefan Hajnoczi
48d335bdd0 block/cloop: refuse images with bogus offsets (CVE-2014-0144)
The offsets[] array allows efficient seeking and tells us the maximum
compressed data size.  If the offsets are bogus the maximum compressed
data size will be unrealistic.

This could cause g_malloc() to abort and bogus offsets mean the image is
broken anyway.  Therefore we should refuse such images.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f56b9bc3ae)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Stefan Hajnoczi
8816866725 block/cloop: refuse images with huge offsets arrays (CVE-2014-0144)
Limit offsets_size to 512 MB so that:

1. g_malloc() does not abort due to an unreasonable size argument.

2. offsets_size does not overflow the bdrv_pread() int size argument.

This limit imposes a maximum image size of 16 TB at 256 KB block size.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 7b103b36d6)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Stefan Hajnoczi
d910bde971 block/cloop: prevent offsets_size integer overflow (CVE-2014-0143)
The following integer overflow in offsets_size can lead to out-of-bounds
memory stores when n_blocks has a huge value:

    uint32_t n_blocks, offsets_size;
    [...]
    ret = bdrv_pread(bs->file, 128 + 4, &s->n_blocks, 4);
    [...]
    s->n_blocks = be32_to_cpu(s->n_blocks);

    /* read offsets */
    offsets_size = s->n_blocks * sizeof(uint64_t);
    s->offsets = g_malloc(offsets_size);

    [...]

    for(i=0;i<s->n_blocks;i++) {
        s->offsets[i] = be64_to_cpu(s->offsets[i]);

offsets_size can be smaller than n_blocks due to integer overflow.
Therefore s->offsets[] is too small when the for loop byteswaps offsets.

This patch refuses to open files if offsets_size would overflow.

Note that changing the type of offsets_size is not a fix since 32-bit
hosts still only have 32-bit size_t.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 509a41bab5)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Stefan Hajnoczi
ed924dda52 block/cloop: validate block_size header field (CVE-2014-0144)
Avoid unbounded s->uncompressed_block memory allocation by checking that
the block_size header field has a reasonable value.  Also enforce the
assumption that the value is a non-zero multiple of 512.

These constraints conform to cloop 2.639's code so we accept existing
image files.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit d65f97a82c)
[AF: BNC#870439; changed error_setg() -> qerror_report(), dropped iotest]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Asias He
f48539108e scsi: Allocate SCSITargetReq r->buf dynamically [CVE-2013-4344]
r->buf is hardcoded to 2056 which is (256 + 1) * 8, allowing 256 luns at
most. If more than 256 luns are specified by user, we have buffer
overflow in scsi_target_emulate_report_luns.

To fix, we allocate the buffer dynamically.

Signed-off-by: Asias He <asias@redhat.com>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 846424350b)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Gerd Hoffmann
8c68c03a81 usb: sanity check setup_index+setup_len in post_load
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit c60174e847)
[AF: BNC#864802 / CVE-2013-4541]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Christian Borntraeger
7691134cff s390-ccw.img: Fix sporadic errors with ccw boot image - initialize css
We have to set the cssid to 0, otherwise the stsch code will
return an operand exception without the m bit. In the same way
we should set m=0.

This case was triggered in some cases during reboot, if for some
reason the location of blk_schid.cssid contains 1 and m was 0.
Turns out that the qemu elf loader does not zero out the bss section
on reboot.

The symptom was an dump of the old kernel with several areas
overwritten. The bootloader does not register a program check
handler, so bios exception jumped back into the old kernel.

Lets just use a local struct with a designed initializer. That
will guarantee that all other subelements are initialized to 0.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 5d739a4787)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Christian Borntraeger
39f3f0461f s390-ccw.img: Fix sporadic reboot hangs: Initialize next_idx
The current code does not initialize next_idx in the virtio ring.
As the ccw bios will always use guest memory at a fixed location,
this queue might != 0 after a reboot.
Lets make the initialization explicit.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit d1028f1b5b)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Cornelia Huck
abe2609e03 s390/ipl: Fix waiting for virtio processing
The guest side must not manipulate the index for the used buffers. Instead,
remember the state of the used buffer locally and wait until it has moved.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Acked-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
(cherry picked from commit 441ea695f9)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Christian Borntraeger
8eb51e7d14 s390/ipl: Fix spurious errors in virtio
With the ccw ipl code sometimes an error message like
"virtio: trying to map MMIO memory" or
"Guest moved used index from %u to %u" appeared. Turns out
that the ccw bios did not zero out the vring, which might
cause stale values in avail->idx and friends, especially
on reboot.

Lets zero out the relevant fields. To activate the patch we
need to rebuild s390-ccw.img as well.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 1369309901-418-1-git-send-email-borntraeger@de.ibm.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 39c93c67c5)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Dominik Dingel
de942d0024 S390: BIOS boot from given device
Use the passed device, if there is no device, use the first applicable device.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit ff151f4ec9)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Cornelia Huck
d08d0016d4 s390-ccw.img: Get queue config from host.
Ask the host about the configuration instead of guessing it.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit abbbe3de4a)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Cornelia Huck
3ac361387f s390-ccw.img: Rudimentary error checking.
Try to handle at least some of the errors that may happen.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 0f3f1f302f)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Cornelia Huck
0fe3cae764 s390-ccw.img: Enhance drain_irqs().
- Use tpi + tsch to get interrupts.
- Return an error if the irb indicates problems.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 776e7f0f21)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Cornelia Huck
73e87d269c s390-ccw.img: Detect devices with stsch.
stsch is the canonical way to detect devices. As a bonus, we can
abort the loop if we get cc 3, and we need to check only the valid
devices (dnv set).

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 22d67ab55a)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Christian Borntraeger
75c54b01b4 s390-ccw.img: Fix compile warning in s390 ccw virtio code
Lets fix this gcc warning:

virtio.c: In function ‘vring_send_buf’:
virtio.c:125:35: error: operation on ‘vr->next_idx’ may be undefined
[-Werror=sequence-point]

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit dc03640b58)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Christian Borntraeger
700961a5ce s390-ccw.img: Take care of the elf->img transition
We have to call strip with s390-ccw.elf as input and
s390-ccw.img as output

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 6328801f19)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Christian Borntraeger
1927c5a842 s390-ccw.img: replace while loop with a disabled wait on s390 bios
dont waste cpu power on an error condition. Lets stop the guest
with a disabled wait.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 7f61cbc108)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Alexander Graf
6dcbbf3e70 S390: ccw firmware: Add Makefile
This patch adds a makefile, so we can build our ccw firmware. Also
add the resulting binaries to .gitignore, so that nobody is annoyed
they might be in the tree.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit b462fcd57c)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Alexander Graf
e84b764b58 S390: ccw firmware: Add bootmap interpreter
On s390, there is an architected boot map format that we can read to
boot a certain entry off the disk. Implement a simple reader for this
that always boots the first (default) entry.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 685d49a63e)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Alexander Graf
c00bd5404d S390: ccw firmware: Add glue header
Like all great programs, we have to call between different functions in
different object files. And all of them need a common ground of defines.

Provide a file that provides these defines.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit c9c39d3b5e)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Alexander Graf
1a197d51b2 S390: ccw firmware: Add virtio device drivers
In order to boot, we need to be able to access a virtio-blk device through
the CCW bus. Implement support for this.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 1e17c2c15b)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Alexander Graf
b1768718de S390: ccw firmware: Add sclp output
In order to communicate with the user, we need an I/O mechanism that he
can read. Implement SCLP ASCII support, which happens to be the default
in the s390 ccw machine.

This file is missing read support for now. It can only print messages.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 0369b2eb07)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Alexander Graf
b2a9daf83c S390: ccw firmware: Add main program
This C file is the main driving piece of the s390 ccw firmware. It
provides a search for a workable block device, sets it as the default
to boot off of and boots from it.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 92f2ca38b0)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Alexander Graf
213624f969 S390: ccw firmware: Add start assembly
We want to write most of our code in C, so add a small assembly
stub that jumps straight into C code for us to continue booting.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 80fea6e893)
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Bruce Rogers
6104825124 vnc: provide fake color map
Our current VNC code does not handle color maps (aka non-true-color) at all
and aborts if a client requests them. There are 2 major issues with this:

 1) A VNC viewer on an 8-bit X11 system may request color maps
 2) RealVNC _always_ starts requesting color maps, then moves on to full color

In order to support these 2 use cases, let's just create a fake color map
that covers exactly our normal true color 8 bit color space. That way we don't
lose anything over a client that wants true color.

Reported-by: Sascha Wehnert <swehnert@suse.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 21:09:19 -06:00
Andreas Färber
7cc4beb7e8 acpi_piix4: Fix migration from SLE11 SP2
qemu-kvm 0.15 uses the same GPE format as qemu 1.4, but as version 2
rather than 3.

Addresses part of BNC#812836.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Andreas Färber
c45f394fa1 i8254: Fix migration from SP2
qemu-kvm 0.15 had a VMSTATE_UINT32(flags, PITState) field that
qemu 1.4 does not have.

Addresses part of BNC#812836.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Andreas Färber
28cd4b8a28 vga: Raise VRAM to 16 MiB for pc-0.15 and below
qemu-kvm.git commit a7fe0297840908a4fd65a1cf742481ccd45960eb
(Extend vram size to 16MB) deviated from qemu.git since kvm-61, and only
in commit 9e56edcf8d (vga: raise default
vgamem size) did qemu.git adjust the VRAM size for v1.2.

Add compatibility properties so that up to and including pc-0.15 we
maintain migration compatibility with qemu-kvm rather than QEMU and
from pc-1.0 on with QEMU (last qemu-kvm release was 1.2).

Addresses part of BNC#812836.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Jan Kiszka
7f0e8c8d44 pcnet: Flush queued packets on end of STOP state
Analogously to other NICs, we have to inform the network layer when
the can_receive handler will no longer report 0. Without this, we may
get stuck waiting on queued incoming packets.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ee76c1f821)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Stefan Hajnoczi
be93f8e544 rtl8139: flush queued packets when RxBufPtr is written
Net queues support efficient "receive disable".  For example, tap's file
descriptor will not be polled while its peer has receive disabled.  This
saves CPU cycles for needlessly copying and then dropping packets which
the peer cannot receive.

rtl8139 is missing the qemu_flush_queued_packets() call that wakes the
queue up when receive becomes possible again.

As a result, the Windows 7 guest driver reaches a state where the
rtl8139 cannot receive packets.  The driver has actually refilled the
receive buffer but we never resume reception.

The bug can be reproduced by running a large FTP 'get' inside a Windows
7 guest:

  $ qemu -netdev tap,id=tap0,...
         -device rtl8139,netdev=tap0

The Linux guest driver does not trigger the bug, probably due to a
different buffer management strategy.

Reported-by: Oliver Francke <oliver.francke@filoo.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 00b7ade807)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Christian Borntraeger
04e0bfdd48 s390/ipl: Fix boot order
The latest ipl code adaptions collided with some of the virtio
refactoring rework. This resulted in always booting the first
disk. Let's fix booting from a given ID.
The new code also checks for command lines without bootindex to
avoid random behaviour when accessing dev_st (==0).

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 5c8ded6ef5)

[AF: virtio refactoring not applicable, revert dev_st cast change]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Cornelia Huck
a9fb15b8ea s390x/css: Fix concurrent sense.
Fix an off-by-one error when indicating availablity of concurrent
sense data.

Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 8312976e73)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Cornelia Huck
a3831a43e3 virtio-ccw: Fix unsetting of indicators.
Interpretation of the ccws to register (configuration) indicators contained
a thinko: We want to disallow reading from 0, but setting the indicator
pointer to 0 is fine.

Let's fix the handling for CCW_CMD_SET{,_CONF}_IND.

Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit d1db1fa8df)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Christian Borntraeger
bcb164497b s390/virtio-ccw: Fix virtio reset
On virtio reset we must reset the indicator to avoid stale interrupts,
e.g. after a reset.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Dominik Dingel
1bd8effe86 S390: Add virtio-blk boot
If no kernel IPL entry is specified, boot the bios and pass if available
device information for the first boot device (as given by the boot index).

The provided information will be used in the next commit from the BIOS.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit ba1509c0a9)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Dominik Dingel
4f6d9028cd Common: Add quick access to first boot device
Instead of manually parsing the boot_list as character stream,
we can access the nth boot device, specified by the position in the
boot order.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 7dc5af5545)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Dominik Dingel
25e275bdcc S390: Merging s390_ipl_cpu and s390_ipl_reset
There is no use in have this splitted in two functions.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 2c4c71ee3a)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Dominik Dingel
7ba7628d75 S390: BIOS check for file
Add a check if the BIOS blob exists before trying to load.

Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 1f7de85330)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Alexander Graf
a7183f4e57 S390: CCW: Use new, working firmware by default
Since we now have working firmware for s390-ccw in the tree, we can
default to it on our s390-ccw machine, rendering it more useful.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit ba747cc8f3)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Alexander Graf
8d97a9e990 S390: IPL: Use different firmware for different machines
We have a virtio-s390 and a virtio-ccw machine in QEMU. Both use vastly
different ways to do I/O. Having the same firmware blob for both doesn't
really make any sense.

Instead, let's parametrize the firmware file name, so that we can have
different blobs for different machines.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit d0249ce5a8)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Alexander Graf
c272ed2b2c S390: IPL: Support ELF firmware
Our firmware blob is always a raw file that we load at a fixed address today.
Support loading an ELF blob instead that we can map high up in memory.

This way we don't have to be so conscious about size constraints.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 3325995640)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Alexander Graf
4c7b66406c S390: Make IPL reset address dynamic
We can have different load addresses for different blobs we boot with.
Make the reset IP dynamic, so that we can handle things more flexibly.

Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 74ad2d22c1)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:19 -06:00
Alexander Graf
70b243d20e Dictzip: Compile in block bucket, so qemu-img gets support as well
Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 21:09:19 -06:00
Alexander Graf
95a7b57853 Dictzip: Fix potential endless loop
Before, we reserved streams on request submission, going into a potential
endless loop if we run more than 4 parallel requests. There's no need to.
The decoding callback is synchronized already.

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 21:09:19 -06:00
Alexander Graf
3ba7da378a Dictzip: Fix endianness issues
Yikes - when running on big endian systems, we had some serious bugs
exposed. This patch gets us rolling on those again.

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 21:09:19 -06:00
Tim Hardeck
45e0967553 TLS support for VNC Websockets
Added TLS support to the VNC QEMU Websockets implementation.
VNC-TLS needs to be enabled for this feature to be used.

The required certificates are specified as in case of VNC-TLS
with the VNC parameter "x509=<path>".

If the server certificate isn't signed by a rooth authority it needs to
be manually imported in the browser because at least in case of Firefox
and Chrome there is no user dialog, the connection just gets canceled.

As a side note VEncrypt over Websocket doesn't work atm because TLS can't
be stacked in the current implementation. (It also didn't work before)
Nevertheless to my knowledge there is no HTML 5 VNC client which supports
it and the Websocket connection can be encrypted with regular TLS now so
it should be fine for most use cases.

Signed-off-by: Tim Hardeck <thardeck@suse.de>
Reviewed-by: Anthony Liguori <aliguori@us.ibm.com>
Message-id: 1366727581-5772-1-git-send-email-thardeck@suse.de
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 0057a0d590)

[BNC#821819 / FATE#315032]
Signed-off-by: Tim Hardeck <thardeck@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:18 -06:00
Bruce Rogers
563b2c3b48 increase x86_64 physical bits to 42
Allow for guests with higher amounts of ram. The current thought
is that 2TB specified on qemu commandline would be an appropriate
limit. Note that this requires the next higher bit value since
the highest address is actually more than 2TB due to the pci
memory hole.

Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 21:09:18 -06:00
Jan Kiszka
fac90bfc6d target-i386: Improve -cpu ? features output
We were missing a bunch of feature lists. Fix this by simply dumping
the meta list feature_word_info.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 3af60be28c)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:18 -06:00
Jan Kiszka
6c662cc172 target-i386: Fix including "host" in -cpu ? output
kvm_enabled() cannot be true at this point because accelerators are
initialized much later during init. Also, hiding this makes it very hard
to discover for users. Simply dump unconditionally if CONFIG_KVM is set.

Add explanation for "host" CPU type.

Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 21ad77892d)

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:18 -06:00
Cornelia Huck
a6e68a82b3 virtio-ccw: Wire up virtio-rng.
Make virtio-rng devices available for s390-ccw-virtio machines.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:18 -06:00
Stefan Berger
b8f274bd95 rng-random: Use qemu_open / qemu_close
In the rng backend use qemu_open and qemu_close rather than POSIX
open/close.

Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 21:09:18 -06:00
Alexander Graf
9e32a0a2bc s390: Remove legacy s390-virtio machine type
We don't want to confuse users by offering the legacy, broken
machine type -M s390-virtio. Let's just not include the machine
description in the first place..

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 21:09:18 -06:00
Alexander Graf
fe583ee43a s390: Default virtio-blk to ccw
When spawning a drive with -drive if=virtio on s390x, we want to
create a ccw device by default, as that is our default machine.

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 21:09:18 -06:00
Alexander Graf
6452d71938 s390: Update s390-zipl.rom with a ccw capable version
Include-If: %if 0%{?patch-possibly-applied-elsewhere}

Source compiled from commit 3c3828f74e11 of:

  git://repo.or.cz/s390-tools.git virtio-ccw-zipl

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 21:08:26 -06:00
Bruce Rogers
a4d378b1ce Add syscalls to white list which allow sdl output
Signed-off-by: Bruce Rogers <brogers@suse.com>
2013-06-12 16:23:07 +02:00
Alexander Graf
de8526eefe s390: restrict early printk to legacy s390 machine
Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-12 16:23:07 +02:00
Alexander Graf
3ab66401c3 Legacy Patch kvm-qemu-preXX-report-default-mac-used.patch 2013-06-12 16:23:06 +02:00
Bruce Rogers
f6af7357df s390: set s390-ccw as the default machine type for s390
Signed-off-by: Bruce Rogers <brogers@suse.com>
2013-06-12 16:23:06 +02:00
Alexander Graf
c8af02494d s390: default virtio aliases to ccw bus
When running with the s390-ccw machine, we need to make sure we
spawn virtio-ccw devices for -net and -drive. Change the aliases
accordingly.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-12 16:23:06 +02:00
Alexander Graf
1bb261f222 Legacy Patch kvm-studio-vnc.patch 2013-06-12 16:23:06 +02:00
Alexander Graf
f0922ef574 Legacy Patch kvm-studio-slirp-nooutgoing.patch 2013-06-12 16:23:06 +02:00
Alexander Graf
a6aa2f9c54 Make char muxer more robust wrt small FIFOs
Virtio-Console can only process one character at a time. Using it on S390
gave me strage "lags" where I got the character I pressed before when
pressing one. So I typed in "abc" and only received "a", then pressed "d"
but the guest received "b" and so on.

While the stdio driver calls a poll function that just processes on its
queue in case virtio-console can't take multiple characters at once, the
muxer does not have such callbacks, so it can't empty its queue.

To work around that limitation, I introduced a new timer that only gets
active when the guest can not receive any more characters. In that case
it polls again after a while to check if the guest is now receiving input.

This patch fixes input when using -nographic on s390 for me.
2013-06-12 16:23:06 +02:00
Alexander Graf
bf5b9c24c5 Implement early printk in virtio-console
On our S390x Virtio machine we don't have anywhere to display early printks
on, because we don't know about VGA or serial ports.

So instead we just forward everything to the virtio console that we created
anyways.

Signed-off-by: Alexander Graf <agraf@suse.de>

Conflicts:

	hw/s390-virtio.c
2013-06-12 16:23:06 +02:00
Alexander Graf
63d9acbe99 Legacy Patch kvm-qemu-tweak-sandboxing-syscall-whitelist.patch 2013-06-12 16:23:06 +02:00
Andreas Färber
632f4958a0 Raise soft address space limit to hard limit
For SLES we want users to be able to use large memory configurations
with KVM without fiddling with ulimit -Sv.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2013-06-12 16:23:06 +02:00
Alexander Graf
66db770b81 console: add question-mark escape operator
Some termcaps (found using SLES11SP1) use [? sequences. According to man
console_codes (http://linux.die.net/man/4/console_codes) the question mark
is a nop and should simply be ignored.

This patch does exactly that, rendering screen output readable when
outputting guest serial consoles to the graphical console emulator.

Signed-off-by: Alexander Graf <agraf@suse.de>
2013-06-12 16:23:05 +02:00
Alexander Graf
363b2fc4f0 Legacy Patch kvm-qemu-preXX-dictzip3.patch 2013-06-12 16:23:05 +02:00
Alexander Graf
287b7249b6 Add tar container format
Tar is a very widely used format to store data in. Sometimes people even put
virtual machine images in there.

So it makes sense for qemu to be able to read from tar files. I implemented a
written from scratch reader that also knows about the GNU sparse format, which
is what pigz creates.

This version checks for filenames that end on well-known extensions. The logic
could be changed to search for filenames given on the command line, but that
would require changes to more parts of qemu.

The tar reader in conjunctiuon with dzip gives us the chance to download
tar'ed up virtual machine images (even via http) and instantly make use of
them.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Bruce Rogers <brogers@novell.com>

Conflicts:

	block/Makefile.objs
2013-06-12 16:23:05 +02:00
Alexander Graf
ea55d53b1b Add support for DictZip enabled gzip files
DictZip is an extension to the gzip format that allows random seeks in gzip
compressed files by cutting the file into pieces and storing the piece offsets
in the "extra" header of the gzip format.

Thanks to that extension, we can use gzip compressed files as block backend,
though only in read mode.

This makes a lot of sense when stacked with tar files that can then be shipped
to VM users. If a VM image is inside a tar file that is inside a DictZip
enabled gzip file, the user can run the tar.gz file as is without having to
extract the image first.

Tar patch follows.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Bruce Rogers <brogers@novell.com>

Conflicts:

	block/Makefile.objs
2013-06-12 16:23:05 +02:00
Alexander Graf
8e4eb52196 Legacy Patch kvm-qemu-make-rtl8139-default-nic.patch
We need this, but perhaps we can drop in SLES 12.
2013-06-12 16:23:05 +02:00
Alexander Graf
5efb9ade7e Legacy Patch kvm-qemu-enable-kvm-acceleration.patch 2013-06-12 16:23:05 +02:00
Alexander Graf
73aab0ebf2 Legacy Patch kvm-qemu-avoid-redunant-declaration-error.patch 2013-06-12 16:23:05 +02:00
Alexander Graf
bad6f6bc3f Legacy Patch kvm-qemu-provide-__u64-for-broken-sys-capability-h.patch 2013-06-12 16:23:05 +02:00
Alexander Graf
e536419507 Legacy Patch kvm-qemu-avoid-deprecated-gnutls-types.patch 2013-06-12 16:23:04 +02:00
Alexander Graf
1bb6f0527b Legacy Patch kvm-qemu-madvise-DONTFORK-for-tight-memory-migration.patch 2013-06-12 16:23:04 +02:00
Alexander Graf
a45b1c7069 Legacy Patch kvm-qemu-default-memsize.patch 2013-06-12 16:23:04 +02:00
Alexander Graf
827326be7b Legacy Patch qemu-datadir.diff 2013-06-12 16:23:04 +02:00
Michael Roth
89400a80f5 update VERSION for 1.4.2
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-23 17:12:44 -05:00
Hervé Poussineau
e85b521519 ppc: do not register IABR SPR twice for 603e
IABR SPR is already registered in gen_spr_603(), called from init_proc_603E().

Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-20 16:30:36 -05:00
Aneesh Kumar K.V
f890185392 hw/9pfs: use O_NOFOLLOW for mapped readlink operation
With mapped security models like mapped-xattr and mapped-file, we save the
symlink target as file contents. Now if we ever expose a normal directory
with mapped security model and find real symlinks in export path, never
follow them and return proper error.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-20 16:23:43 -05:00
Aneesh Kumar K.V
745f6c0ef7 hw/9pfs: Fix segfault with 9p2000.u
When guest tries to chmod a block or char device file over 9pfs,
the qemu process segfaults. With 9p2000.u protocol we use wstat to
change mode bits and client don't send extension information for
chmod. We need to check for size field to check whether extension
info is present or not.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Acked-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-20 11:25:00 -05:00
Josh Durgin
0182df5ae5 rbd: add an asynchronous flush
The existing bdrv_co_flush_to_disk implementation uses rbd_flush(),
which is sychronous and causes the main qemu thread to block until it
is complete. This results in unresponsiveness and extra latency for
the guest.

Fix this by using an asynchronous version of flush.  This was added to
librbd with a special #define to indicate its presence, since it will
be backported to stable versions. Thus, there is no need to check the
version of librbd.

Implement this as bdrv_aio_flush, since it matches other aio functions
in the rbd block driver, and leave out bdrv_co_flush_to_disk when the
asynchronous version is available.

Reported-by: Oliver Francke <oliver@filoo.de>
Signed-off-by: Josh Durgin <josh.durgin@inktank.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit dc7588c1eb)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-17 15:52:55 -05:00
Paolo Bonzini
7f28f0f1f6 qemu-iotests: add tests for rebasing zero clusters
If zero clusters are erroneously treated as unallocated, "qemu-img rebase"
will copy the backing file's contents onto the cluster.

The bug existed also in image streaming, but since the root cause was in
qcow2's is_allocated implementation it is enough to test it with qemu-img.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit acbf30ec60)

Conflicts:

	tests/qemu-iotests/group

* fixed up to account for tests 48/49 being missing from 1.4

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-17 13:10:52 -05:00
Luiz Capitulino
45bbe1fa89 virtio-balloon: fix integer overflow in BALLOON_CHANGE QMP event
Because dev->actual is uint32_t, the expression 'dev->actual <<
VIRTIO_BALLOON_PFN_SHIFT' is truncated to 32 bits. This overflows when
dev->actual >= 1048576.

To reproduce:

 1. Start a VM with a QMP socket and 5G of RAM
 2. Connect to the QMP socket, negotiate capabilities and issue:

   { "execute":"balloon", "arguments": { "value": 1073741824 } }

 3. Watch for BALLOON_CHANGE QMP events, the last one will incorretly be:

   { "timestamp": { "seconds": 1366228965, "microseconds": 245466 },
     "event": "BALLOON_CHANGE", "data": { "actual": 5368709120 } }

To fix it this commit casts it to ram_addr_t, which is ram_size's type.

Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit dcc6ceffc0)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-17 12:02:18 -05:00
Paolo Bonzini
06efdc4f4d qemu-timer: move timeBeginPeriod/timeEndPeriod to os-win32
These are needed for any of the Win32 alarm timer implementations.
They are not tied to mmtimer exclusively.

Jacob tested this patch with both mmtimer and Win32 timers.

Cc: qemu-stable@nongnu.org
Tested-by: Jacob Kroon <jacob.kroon@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
(cherry picked from commit 0727b86754)

Conflicts:

	os-win32.c

* updated to retain cpu affinity settings for 1.4

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-16 17:22:34 -05:00
Brad Smith
0c70b5ad59 configure: Don't fall back to gthread coroutine backend
This is a back port of 7c2acc7062 to the
1.4 stable branch without needing the new error_exit() function.

configure: Don't fall back to gthread coroutine backend

The gthread coroutine backend is broken and does not produce a working
QEMU; it is only useful for some very limited debugging situations.
Clean up the backend selection logic in configure so that it now runs
"if on windows use windows; else prefer ucontext; else sigaltstack".

To do this we refactor the configure code to separate out "test
whether we have a working ucontext", "pick a default if user didn't
specify" and "validate that user didn't specify something invalid",
rather than having all three of these run together. We also simplify
the Makefile logic so it just links in the backend the configure
script selects.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1365419487-19867-3-git-send-email-peter.maydell@linaro.org
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
Signed-off-by: Brad Smith <brad@comstyle.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-16 14:35:48 -05:00
Hans de Goede
b90fd157f7 usb-redir: Fix crash on migration with no client connected
If no client is connected on the src side, then we won't receive a
parser during migrate, in this case usbredir_post_load() should be a nop,
rather then to try to derefefence the NULL dev->parser pointer.

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 3713e1485e)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-16 12:06:36 -05:00
Cole Robinson
7322cb17fa docs: Fix generating qemu-doc.html with texinfo 5
LC_ALL=C makeinfo --no-headers --no-split --number-sections --html qemu-doc.texi -o qemu-doc.html
./qemu-options.texi:1521: unknown command `list'
./qemu-options.texi:1521: table requires an argument: the formatter for @item
./qemu-options.texi:1521: warning: @table has text but no @item

This is for 1.4 stable only; master isn't affected, as it was fixed by
another commit (which isn't appropriate for stable):

commit 5d6768e3b8
Author: MORITA Kazutaka <morita.kazutaka@lab.ntt.co.jp>
Date:   Fri Feb 22 12:39:51 2013 +0900

    sheepdog: accept URIs

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-16 12:04:13 -05:00
Laszlo Ersek
1d7723ffc7 qga: unlink just created guest-file if fchmod() or fdopen() fails on it
We shouldn't allow guest filesystem pollution on error paths.

Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit 2b72001806)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14 16:18:25 -05:00
Laszlo Ersek
67b460a404 qga: distinguish binary modes in "guest_file_open_modes" map
In Windows guests this may make a difference.

Since the original patch (commit c689b4f1) sought to be pedantic and to
consider theoretical corner cases of portability, we should fix it up
where it failed to come through in that pursuit.

Suggested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit 8fe6bbca71)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14 16:18:15 -05:00
Peter Maydell
84247bbe28 translate-all.c: Remove cpu_unlink_tb()
The (unsafe) function cpu_unlink_tb() is now unused, so we can simply
remove it and any code that was only used by it.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

(cherry picked from commit 3a808cc407)

Conflicts:
	translate-all.c

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14 15:48:38 -05:00
Peter Maydell
2ebcc590c9 Handle CPU interrupts by inline checking of a flag
Fix some of the nasty TCG race conditions and crashes by implementing
cpu_exit() as setting a flag which is checked at the start of each TB.
This avoids crashes if a thread or signal handler calls cpu_exit()
while the execution thread is itself modifying the TB graph (which
may happen in system emulation mode as well as in linux-user mode
with a multithreaded guest binary).

This fixes the crashes seen in LP:668799; however there are another
class of crashes described in LP:1098729 which stem from the fact
that in linux-user with a multithreaded guest all threads will
use and modify the same global TCG date structures (including the
generated code buffer) without any kind of locking. This means that
multithreaded guest binaries are still in the "unsupported"
category.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

(cherry picked from commit 378df4b237)

Conflicts:
	exec.c
	include/qom/cpu.h
	translate-all.c
	include/exec/gen-icount.h

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>

Conflicts:
	cpu-exec.c

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14 15:48:21 -05:00
Peter Maydell
69001b3145 cpu-exec: wrap tcg_qemu_tb_exec() in a fn to restore the PC
If tcg_qemu_tb_exec() returns a value whose low bits don't indicate a
link to an indexed next TB, this means that the TB execution never
started (eg because the instruction counter hit zero).  In this case the
guest PC has to be reset to the address of the start of the TB.
Refactor the cpu-exec code to make all tcg_qemu_tb_exec() calls pass
through a wrapper function which does this restoration if necessary.

Note that the apparent change in cpu_exec_nocache() from calling
cpu_pc_from_tb() with the old TB to calling it with the TB returned by
do_tcg_qemu_tb_exec() is safe, because in the nocache case we can
guarantee that the TB we try to execute is not linked to any others,
so the only possible returned TB is the one we started at. That is,
we should arguably previously have included in cpu_exec_nocache() an
assert(next_tb & ~TB_EXIT_MASK) == tb), since the API requires restore
from next_tb but we were using tb.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

(cherry picked from commit 77211379d7)

Conflicts:
	cpu-exec.c

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14 15:48:14 -05:00
Peter Maydell
3accab7365 tcg: Document tcg_qemu_tb_exec() and provide constants for low bit uses
Document tcg_qemu_tb_exec(). In particular, its return value is a
combination of a pointer to the next translation block and some
extra information in the low two bits. Provide some #defines for
the values passed in these bits to improve code clarity.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Blue Swirl <blauwirbel@gmail.com>

(cherry picked from commit 0980011b4f)

Conflicts:
	tcg/tcg.h

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14 15:47:53 -05:00
Laszlo Ersek
60259539ee qga: set umask 0077 when daemonizing (CVE-2013-2007)
The qemu guest agent creates a bunch of files with insecure permissions
when started in daemon mode. For example:

  -rw-rw-rw- 1 root root /var/log/qemu-ga.log
  -rw-rw-rw- 1 root root /var/run/qga.state
  -rw-rw-rw- 1 root root /var/log/qga-fsfreeze-hook.log

In addition, at least all files created with the "guest-file-open" QMP
command, and all files created with shell output redirection (or
otherwise) by utilities invoked by the fsfreeze hook script are affected.

For now mask all file mode bits for "group" and "others" in
become_daemon().

Temporarily, for compatibility reasons, stick with the 0666 file-mode in
case of files newly created by the "guest-file-open" QMP call. Do so
without changing the umask temporarily.

Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit c689b4f1ba)

Conflicts:

	qga/commands-posix.c

*update includes to match stable

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14 13:30:33 -05:00
Aurelien Jarno
93399d0827 tcg/optimize: fix setcond2 optimization
When setcond2 is rewritten into setcond, the state of the destination
temp should be reset, so that a copy of the previous value is not
used instead of the result.

Reported-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 66e61b55f1)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14 13:03:45 -05:00
Richard Sandiford
074dd56a01 target-mips: Fix accumulator arguments to gen_helper_dmult(u)
gen_muldiv was passing int accumulator arguments directly
to gen_helper_dmult(u).  This patch fixes it to use TCGs,
via the gen_helper_0e2i wrapper.

Fixes an --enable-debug-tcg build failure reported by Juergen Lock.

Signed-off-by: Richard Sandiford <rdsandiford@googlemail.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14 13:01:34 -05:00
Andreas Färber
d10d2510b9 configure: Pick up libseccomp include path
openSUSE 12.3 has seccomp.h in /usr/include/libseccomp-1.0.1,
so add `pkg-config --cflags libseccomp` output to QEMU_CFLAGS.

Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 372e47e9b5)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-14 05:30:53 -05:00
Cornelia Huck
5613bda4ac virtio-ccw: Check indicators location.
If a guest neglected to register (secondary) indicators but still runs
with notifications enabled, we might end up writing to guest zero;
avoid this by checking for valid indicators and only writing to the
guest and generating an interrupt if indicators have been setup.

Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 7c4869761d)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-13 11:53:19 -05:00
Jason Wang
c5675a98bb tap: properly initialize vhostfds
Only tap->vhostfd were checked net_init_tap_one(), but tap->vhostfds were
forgot, this will lead qemu to ignore all fds passed by management through
vhostfds, and tries to create vhost_net device itself. Fix by adding this check
also.

Reportyed-by: Michal Privoznik <mprivozn@redhat.com>
Cc: Michal Privoznik <mprivozn@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michal Privoznik <mprivozn@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 7873df408d)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-13 11:52:06 -05:00
Amit Shah
e355efd962 rng random backend: check for -EAGAIN errors on read
Not handling EAGAIN triggers the assert

qemu/backends/rng-random.c:44:entropy_available: assertion failed: (len != -1)
Aborted (core dumped)

This happens when starting a guest with '-device virtio-rng-pci',
issuing a 'cat /dev/hwrng' in the guest, while also doing 'cat
/dev/random' on the host.

Reported-by: yunpingzheng <yunzheng@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
Message-id: eacda84dfaf2d99cf6d250b678be4e4d6c2088fb.1366108096.git.amit.shah@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit acbbc03661)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-13 11:50:35 -05:00
Andreas Färber
4d7f4556fc qdev: Fix QOM unrealize behavior
Since commit 249d41720b (qdev: Prepare
"realized" property) setting realized = true would register the device's
VMStateDescription, but realized = false would not unregister it. Fix that.

Moving the code from unparenting also revealed that we were calling
DeviceClass::init through DeviceClass::realize as interim solution but
DeviceClass::exit still at unparenting time with a realized check.
Make this symmetrical by implementing DeviceClass::unrealize to call it,
while we're setting realized = false in the unparenting path.
The only other unrealize user is mac_nvram, which can safely override it.

Thus, mark DeviceClass::exit as obsolete, new devices should implement
DeviceClass::unrealize instead.

Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
Message-id: 1366043650-9719-1-git-send-email-afaerber@suse.de
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit fe6c211781)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-13 11:48:56 -05:00
Stefan Hajnoczi
0486c27a36 nbd: unlock mutex in nbd_co_send_request() error path
Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 6760c47aa4)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-05-13 11:47:07 -05:00
Michael Roth
57105f7480 update VERSION for 1.4.1
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-15 14:18:25 -05:00
Daniel P. Berrange
6e8865313f Add -f FMT / --format FMT arg to qemu-nbd
Currently the qemu-nbd program will auto-detect the format of
any disk it is given. This behaviour is known to be insecure.
For example, if qemu-nbd initially exposes a 'raw' file to an
unprivileged app, and that app runs

   'qemu-img create -f qcow2 -o backing_file=/etc/shadow /dev/nbd0'

then the next time the app is started, the qemu-nbd will now
detect it as a 'qcow2' file and expose /etc/shadow to the
unprivileged app.

The only way to avoid this is to explicitly tell qemu-nbd what
disk format to use on the command line, completely disabling
auto-detection. This patch adds a '-f' / '--format' arg for
this purpose, mirroring what is already available via qemu-img
and qemu commands.

  qemu-nbd --format raw -p 9000 evil.img

will now always use raw, regardless of what format 'evil.img'
looks like it contains

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
[Use errx, not err. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>

*fixed conflict due to bdrv_open() not supporting "options" param
in v1.4.1

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-09 10:00:20 -05:00
Richard Sandiford
6d0b135a98 target-mips: Fix accumulator selection for MIPS16 and microMIPS
Add accumulator arguments to gen_HILO and gen_muldiv, rather than
extracting the accumulator directly from ctx->opcode.  The extraction
was only right for the standard encoding: MIPS16 doesn't have access
to the DSP registers, while microMIPS encodes the accumulator register
in a different field (bits 14 and 15).

Passing the accumulator register is probably an over-generalisation
for division and 64-bit multiplication, which never access anything
other than HI and LO, and which always pass 0 as the new argument.
Separating them felt a bit fussy though.

Signed-off-by: Richard Sandiford <rdsandiford@googlemail.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 26135ead80)

Conflicts:
	target-mips/translate.c

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-09 09:59:17 -05:00
Brad Smith
d89f9ba43b Allow clock_gettime() monotonic clock to be utilized on more OS's
Allow the clock_gettime() code using monotonic clock to be utilized on
more POSIX compliannt OS's. This started as a fix for OpenBSD which was
listed in one function as part of the previous hard coded list of OS's
for the functions to support but not in the other.

Signed-off-by: Brad Smith <brad@comstyle.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20130405003748.GH884@rox.home.comstyle.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit d05ef16045)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-06 16:38:15 -05:00
Eduardo Habkost
46f9071a23 target-i386: Check for host features before filter_features_for_kvm()
commit 5ec01c2e96 broke "-cpu ..,enforce",
as it has moved kvm_check_features_against_host() after the
filter_features_for_kvm() call. filter_features_for_kvm() removes all
features not supported by the host, so this effectively made
kvm_check_features_against_host() impossible to fail.

This patch changes the call so we check for host feature support before
filtering the feature bits.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-id: 1364935692-24004-1-git-send-email-ehabkost@redhat.com
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit a509d632c8)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-05 14:01:33 -05:00
Jason Wang
f85e082a36 help: add docs for missing 'queues' option of tap
Cc: Markus Armbruster <armbru@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-id: 1361545072-30426-1-git-send-email-jasowang@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit ec3960148f)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-05 13:57:17 -05:00
Paolo Bonzini
da78a1bc7a compiler: fix warning with GCC 4.8.0
GCC 4.8.0 introduces a new warning:

    block/qcow2-snapshot.c: In function 'qcow2_write_snapshots’:
    block/qcow2-snapshot.c:252:18: error: typedef 'qemu_build_bug_on__253'
              locally defined but not used [-Werror=unused-local-typedefs]
         QEMU_BUILD_BUG_ON(offsetof(QCowHeader, snapshots_offset) !=
                  ^
    cc1: all warnings being treated as errors

(Caret diagnostics aren't perfect yet with macros... :)) Work around it
with __attribute__((unused)).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1364391272-1128-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 99835e0084)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 19:53:21 -05:00
Peter Lieven
2b92aa36d1 block: complete all IOs before resizing a device
this patch ensures that all pending IOs are completed
before a device is resized. this is especially important
if a device is shrinked as it the bdrv_check_request()
result is invalidated.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 92b7a08d64)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 17:36:43 -05:00
Peter Lieven
e4cce2d3e9 Revert "block: complete all IOs before .bdrv_truncate"
brdv_truncate() is also called from readv/writev commands on self-
growing file based storage. this will result in requests waiting
for theirselves to complete.

This reverts commit 9a665b2b86.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 5c916681ae)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 17:35:43 -05:00
Gerd Hoffmann
d15b1aa30c qxl: better vga init in enter_vga_mode
Ask the vga core to update the display.  Will trigger dpy_gfx_resize
if needed.  More complete than just calling dpy_gfx_resize.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit c099e7aa02)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 17:33:55 -05:00
Markus Armbruster
65fe29ec00 doc: Fix texinfo @table markup in qemu-options.hx
End tables before headings, start new ones afterwards.  Fixes
incorrect indentation of headings "File system options" and "Virtual
File system pass-through options" in manual page and qemu-doc.

Normalize markup some to increase chances it survives future edits.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1360781383-28635-5-git-send-email-armbru@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit c70a01e449)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 17:29:28 -05:00
Bruce Rogers
888e036eb4 acpi: initialize s4_val used in s4 shutdown
While investigating why a 32 bit Windows 2003 guest wasn't able to
successfully perform a shutdown /h, it was discovered that commit
afafe4bbe0 inadvertently dropped the
initialization of the s4_val used to handle s4 shutdown.
Initialize the value as before.

Signed-off-by: Bruce Rogers <brogers@suse.com>
Message-id: 1364928100-487-1-git-send-email-brogers@suse.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 560e639652)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 17:24:55 -05:00
Petar Jovanovic
d019dd928c target-mips: fix rndrashift_short_acc and code for EXTR_ instructions
Fix for rndrashift_short_acc to set correct value to higher 64 bits.
This change also corrects conditions when bit 23 of the DSPControl register
is set.

The existing test files have been extended with several examples that
trigger the issues. One bug/example in the test file for EXTR_RS_W has been
found and reported by Klaus Peichl.

Signed-off-by: Petar Jovanovic <petar.jovanovic@imgtec.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 8b758d0568)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 16:58:41 -05:00
Petar Jovanovic
dac077f0e6 target-mips: fix DSP overflow macro and affected routines
The previous implementation incorrectly used same macro to detect overflow
for addition and subtraction. This patch makes distinction between these
two, and creates separate macros. The affected routines are changed
accordingly.

This change also includes additions to the existing tests for SUBQ_S_PH and
SUBQ_S_W that would trigger the fixed issue, and it removes dead code from
the test file. The last test case in subq_s_w.c is a bug found/reported/
isolated by Klaus Peichl from Dolby.

Signed-off-by: Petar Jovanovic <petar.jovanovic@imgtec.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 20c334a797)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 16:32:39 -05:00
Petar Jovanovic
b09a673164 target-mips: fix for sign-issue in MULQ_W helper
Correct sign-propagation before multiplication in MULQ_W helper.
The change also fixes previously incorrect expected values in the
tests for MULQ_RS.W and MULQ_S.W.

Signed-off-by: Petar Jovanovic <petarj@mips.com>
Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit a345481baa)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 16:31:57 -05:00
Petar Jovanovic
79a4dd4085 target-mips: fix for incorrect multiplication with MULQ_S.PH
The change corrects sign-related issue with MULQ_S.PH. It also includes
extension to the already existing test which will trigger the issue.

Signed-off-by: Petar Jovanovic <petarj@mips.com>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 9c19eb1e20)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 16:31:23 -05:00
Hans de Goede
57e929c19c usb-tablet: Don't claim wakeup capability for USB-2 version
Our ehci code does not implement wakeup support, so claiming support for
it with usb-tablet in USB-2 mode causes all tablet events to get lost.

http://bugzilla.redhat.com/show_bug.cgi?id=929068

Signed-off-by: Hans de Goede <hdegoede@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit aa1c9e971e)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 15:33:52 -05:00
Stefan Hajnoczi
27c71355fb chardev: clear O_NONBLOCK on SCM_RIGHTS file descriptors
When we receive a file descriptor over a UNIX domain socket the
O_NONBLOCK flag is preserved.  Clear the O_NONBLOCK flag and rely on
QEMU file descriptor users like migration, SPICE, VNC, block layer, and
others to set non-blocking only when necessary.

This change ensures we don't accidentally expose O_NONBLOCK in the QMP
API.  QMP clients should not need to get the non-blocking state
"correct".

A recent real-world example was when libvirt passed a non-blocking TCP
socket for migration where we expected a blocking socket.  The source
QEMU produced a corrupted migration stream since its code did not cope
with non-blocking sockets.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e374f7f816171f9783c1d9d00a041f26379f1ac6)

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 15:17:32 -05:00
Stefan Hajnoczi
283b7de6a5 qemu-socket: set passed fd non-blocking in socket_connect()
socket_connect() sets non-blocking on TCP or UNIX domain sockets if a
callback function is passed.  Do the same for file descriptor passing,
otherwise we could unexpectedly be using a blocking file descriptor.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 35fb94fa292173a3e1df0768433e06912a2a88e4)

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 15:17:32 -05:00
Stefan Hajnoczi
a1cb89f3fe net: ensure "socket" backend uses non-blocking fds
There are several code paths in net_init_socket() depending on how the
socket is created: file descriptor passing, UDP multicast, TCP, or UDP.
Some of these support both listen and connect.

Not all code paths set the socket to non-blocking.  This patch addresses
the file descriptor passing and UDP cases which were missing
socket_set_nonblock(fd) calls.

I considered moving socket_set_nonblock(fd) to a central location but it
turns out the code paths are different enough to require non-blocking at
different places.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f05b707279dc7c29ab10d9d13dbf413df6ec22f1)

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 15:17:32 -05:00
Stefan Hajnoczi
68f9df5990 oslib-posix: rename socket_set_nonblock() to qemu_set_nonblock()
The fcntl(fd, F_SETFL, O_NONBLOCK) flag is not specific to sockets.
Rename to qemu_set_nonblock() just like qemu_set_cloexec().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 399f1c8f8af1f6f8b18ef4e37169c6301264e467)

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>

Conflicts:
	block/sheepdog.c

socket_set_block()/socket_set_nonblock() calls in different locations

	include/qemu/sockets.h

socket_set_nodelay() does not exist in v1.4.0, messes up diff context

	qemu-char.c

glib G_IO_IN events are not used in v1.4.0, messes up diff context

	savevm.c

qemu_fopen_socket() only has read mode in v1.4.0, qemu_set_block() not
necessary.

	slirp/misc.c

unportable setsockopt() calls in v1.4.0 mess up diff context

	slirp/tcp_subr.c

file was reformatted, diff context is messed up

	ui/vnc.c

old dcl->idle instead of vd->dcl.idle messes up diff context

Added:
	migration-tcp.c, migration-unix.c

qemu_fopen_socket() write mode does not exist yet, qemu_set_block() call
is needed here.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-04 15:17:32 -05:00
Gerd Hoffmann
0135796271 update seabios to 1.7.2.1
Alex Williamson (3):
      seabios q35: Enable all PIRQn IRQs at startup
      seabios q35: Add new PCI slot to irq routing function
      seabios: Add a dummy PCI slot to irq mapping function

Avik Sil (1):
      USB-EHCI: Fix null pointer assignment

Kevin O'Connor (4):
      Update tools/acpi_extract.py to handle iasl 20130117 release.
      Fix Makefile - don't reference "out/" directly, instead use "$(OUT)".
      build: Don't require $(OUT) to be a sub-directory of the main
directory.
      Verify CC is valid during build tests.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 5c75fb1002)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 16:34:06 -05:00
Peter Maydell
799a34a48b linux-user/syscall.c: Don't warn about unimplemented get_robust_list
The nature of the kernel ABI for the get_robust_list and set_robust_list
syscalls means we cannot implement them in QEMU. Make get_robust_list
silently return ENOSYS rather than using the default "print message and
then fail ENOSYS" code path, in the same way we already do for
set_robust_list, and add a comment documenting why we do this.

This silences warnings which were being produced for emulating
even trivial programs like 'ls' in x86-64-on-x86-64.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
(cherry picked from commit e9a970a831)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 16:28:53 -05:00
Peter Maydell
8378910554 linux-user: make bogus negative iovec lengths fail EINVAL
If the guest passes us a bogus negative length for an iovec, fail
EINVAL rather than proceeding blindly forward. This fixes some of
the error cases tests for readv and writev in the LTP.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
(cherry picked from commit dfae8e00f8)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 16:23:52 -05:00
John Rigby
7a238b9fbd linux-user: fix futex strace of FUTEX_CLOCK_REALTIME
Handle same as existing FUTEX_PRIVATE_FLAG.

Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
(cherry picked from commit bfb669f39f)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 15:49:19 -05:00
John Rigby
02493ee490 linux-user/syscall.c: handle FUTEX_WAIT_BITSET in do_futex
Upstream libc has recently changed to start using
FUTEX_WAIT_BITSET instead of FUTEX_WAIT and this
is causing do_futex to return -TARGET_ENOSYS.

Pass bitset in val3 to sys_futex which will be
ignored by kernel for the FUTEX_WAIT case.

Signed-off-by: John Rigby <john.rigby@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
(cherry picked from commit cce246e0a2)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 15:48:35 -05:00
Stefan Hajnoczi
7d47b243d6 qcow2: flush refcount cache correctly in qcow2_write_snapshots()
Since qcow2 metadata is cached we need to flush the caches, not just the
underlying file.  Use bdrv_flush(bs) instead of bdrv_flush(bs->file).

Also add the error return path when bdrv_flush() fails and move the
flush after checking for qcow2_alloc_clusters() failure so that the
qcow2_alloc_clusters() error return value takes precedence.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit f6977f1556)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 15:47:09 -05:00
Stefan Hajnoczi
02ea844746 qcow2: flush refcount cache correctly in alloc_refcount_block()
update_refcount() affects the refcount cache, it does not write to disk.
Therefore bdrv_flush(bs->file) does nothing.  We need to flush the
refcount cache in order to write out the refcount updates!

While we're here also add error returns when qcow2_cache_flush() fails.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 9991923b26)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 15:45:40 -05:00
Peter Lieven
0fcf00b55c page_cache: fix memory leak
XBZRLE encoded migration introduced a MRU page cache
meachnism. Unfortunately, cached items where never freed in
case of a collision in the page cache on cache_insert().

This lead to out of memory conditions during XBZRLE migration
if the page cache was small and there where a lot of collisions
in the cache.

Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Orit Wasserman <owasserm@redhat.com>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 32a1c08b60)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 15:44:43 -05:00
Orit Wasserman
5610ef5863 Fix page_cache leak in cache_resize
Signed-off-by: Orit Wasserman <owasserm@redhat.com>

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
(cherry picked from commit 0db65d624e)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 15:44:02 -05:00
Christian Borntraeger
7a687aed28 virtio-blk: fix unplug + virsh reboot
virtio-blk registers a vmstate change handler. Unfortunately this
handler is not unregistered on unplug, leading to some random
crashes if the system is restarted, e.g. via virsh reboot.
Lets unregister the vmstate change handler if the device is removed.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 69b302b204)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 15:41:50 -05:00
Mark Cave-Ayland
b91aee5810 ide/macio: Fix macio DMA initialisation.
Commit 07a7484e5d accidentally introduced a bug
in the initialisation of the second macio DMA device which could cause some
DMA operations to segfault QEMU.

CC: Andreas Färber <afaerber@suse.de>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Acked-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 02d583c723)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 15:39:59 -05:00
Andreas Färber
e09b99b54f target-ppc: Fix CPU_POWERPC_MPC8547E
It was defined to ..._MPC8545E_v21 rather than ..._MPC8547E_v21.
Due to both resolving to CPU_POWERPC_e500v2_v21 this did not show.

Fixing this nontheless helps with QOM'ifying CPU aliases.

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 0136d715ad)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 11:53:18 -05:00
David Gibson
611c7f2c3a pseries: Add cleanup hook for PAPR virtual LAN device
Currently the spapr-vlan device does not supply a cleanup call for its
NetClientInfo structure.  With current qemu versions, that leads to a SEGV
on exit, when net_cleanup() attempts to call the cleanup handlers on all
net clients.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 156dfaded8)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 11:51:39 -05:00
Michal Privoznik
4e4566ce78 configure: Require at least spice-protocol-0.12.3
As of 5a49d3e9 we assume SPICE_PORT_EVENT_BREAK to be defined.
However, it is defined not in 0.12.2 what we require now, but in
0.12.3.  Therefore in order to prevent build failure we must
adjust our minimal requirements.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 358689fe29)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 11:48:51 -05:00
Paolo Bonzini
43e00611bc qemu-bridge-helper: force usage of a very high MAC address for the bridge
Linux uses the lowest enslaved MAC address as the MAC address of
the bridge.  Set MAC address to a high value so that it does not
affect the MAC address of the bridge.

Changing the MAC address of the bridge could cause a few seconds
of network downtime.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1363971468-21154-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 226ecabfbd)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 11:31:58 -05:00
Cornelia Huck
3c3de7c6b4 virtio-ccw: Queue sanity check for notify hypercall.
Verify that the virtio-ccw notify hypercall passed a reasonable
value for queue.

Cc: qemu-stable@nongnu.org
Reported-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit b57ed9bf07)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 11:30:51 -05:00
Yeongkyoon Lee
b0da310a69 tcg: Fix occasional TCG broken problem when ldst optimization enabled
is_tcg_gen_code() checks the upper limit of TCG generated code range wrong, so
that TCG could get broken occasionally only when CONFIG_QEMU_LDST_OPTIMIZATION
enabled. The reason is code_gen_buffer_max_size does not cover the upper range
up to (TCG_MAX_OP_SIZE * OPC_BUF_SIZE), thus code_gen_buffer_max_size should be
modified to code_gen_buffer_size.

CC: qemu-stable@nongnu.org
Signed-off-by: Yeongkyoon Lee <yeongkyoon.lee@samsung.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 52ae646d4a)

Conflicts:

	translate-all.c

*modified to use non-tcg-ctx version of code_gen_* variables

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 11:28:39 -05:00
Peter Crosthwaite
d26efd2d39 qga/main.c: Don't use g_key_file_get/set_int64
These functions don't exist until glib version 2.26. QEMU is currently only
mandating glib 2.12.

This patch replaces the functions with g_key_file_get/set_integer.

Unbreaks the build on Ubuntu 10.04 and RHEL 5.6.

Regression was introduced by 39097daf15

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-id: 1363323879-682-1-git-send-email-peter.crosthwaite@xilinx.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 4f30649618)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 11:18:47 -05:00
Michael Roth
f305d504ab qemu-ga: use key-value store to avoid recycling fd handles after restart
Hosts hold on to handles provided by guest-file-open for periods that can
span beyond the life of the qemu-ga process that issued them. Since these
are issued starting from 0 on every restart, we run the risk of issuing
duplicate handles after restarts/reboots.

As a result, users with a stale copy of these handles may end up
reading/writing corrupted data due to their existing handles effectively
being re-assigned to an unexpected file or offset.

We unfortunately do not issue handles as strings, but as integers, so a
solution such as using UUIDs can't be implemented without introducing a
new interface.

As a workaround, we fix this by implementing a persistent key-value store
that will be used to track the value of the last handle that was issued
across restarts/reboots to avoid issuing duplicates.

The store is automatically written to the same directory we currently
set via --statedir to track fsfreeze state, and so should be applicable
for stable releases where this flag is supported.

A follow-up can use this same store for handling fsfreeze state, but
that change is cosmetic and left out for now.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org

* fixed guest_file_handle_add() return value from uint64_t to int64_t
(cherry picked from commit 39097daf15)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 11:16:31 -05:00
Paolo Bonzini
d3652a1b28 qcow2: make is_allocated return true for zero clusters
Otherwise, live migration of the top layer will miss zero clusters and
let the backing file show through.  This also matches what is done in qed.

QCOW2_CLUSTER_ZERO clusters are invalid in v2 image files.  Check this
directly in qcow2_get_cluster_offset instead of replicating the test
everywhere.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 381b487d54)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 11:00:19 -05:00
David Gibson
51943504d5 pseries: Add compatible property to root of device tree
Currently, for the pseries machine the device tree supplied by qemu to SLOF
and from there to the guest does not include a 'compatible property' at the
root level.  Usually that works fine, since in this case the compatible
property doesn't really give any information not already found in the
'device_type' or 'model' properties.

However, the lack of 'compatible' confuses the bootloader install in the
SLES11 SP2 and SLES11 SP3 installers.  This patch therefore adds a token
'compatible' property to work around that.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit d63919c93e)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:59:03 -05:00
Christian Borntraeger
4d1cdb9efd Allow virtio-net features for legacy s390 virtio bus
Enable all virtio-net features for the legacy s390 virtio bus. This also fixes
kernel BUG at /usr/src/packages/BUILD/kernel-default-3.0.58/linux-3.0/drivers/s390/kvm/kvm_virtio.c:121!

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 35569cea79)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:57:53 -05:00
Cole Robinson
c3b81e01b8 rtc-test: Fix test failures with recent glib
As of glib 2.35.4, glib changed its logic for ordering test cases:

https://bugzilla.gnome.org/show_bug.cgi?id=694487

This was causing failures in rtc-test. Group the reordered test
cases into their own suite, which maintains the original ordering.

CC: qemu-stable@nongnu.org
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit eeb29fb9aa)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:56:19 -05:00
Paolo Bonzini
99b1f39bd2 scsi-disk: do not complete canceled UNMAP requests
Canceled requests should never be completed, and doing that could cause
accesses to a NULL hba_private field.

Cc: qemu-stable@nongnu.org
Reported-by: Stefan Priebe <s.priebe@profihost.ag>
Tested-by: Stefan Priebe <s.priebe@profihost.ag>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d0242eadc5)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:54:35 -05:00
Paolo Bonzini
f23ab037c7 scsi: do not call scsi_read_data/scsi_write_data for a canceled request
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 6f6710aa99)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:53:33 -05:00
Paolo Bonzini
0c918dd600 iscsi: look for pkg-config file too
Due to library conflicts, Fedora will have to put libiscsi in
/usr/lib/iscsi.  Simplify configuration by using a pkg-config
file.  The Fedora package will distribute one, and the patch
to add it has been sent to upstream libiscsi as well.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 3c33ea9640)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:52:33 -05:00
Paolo Bonzini
a8b090ef08 scsi-disk: handle io_canceled uniformly and correctly
Always check it immediately after calling bdrv_acct_done, and
always do a "goto done" in case the "done" label has to free
some memory---as is the case for scsi_unmap_complete in the
previous patch.

This patch could fix problems that happen when a request is
split into multiple parts, and one of them is canceled.  Then
the next part is fired, but the HBA's cancellation callbacks have
fired already.  Whether this happens or not, depends on how the
block/ driver implements AIO cancellation.  It it does a simple
bdrv_drain_all() or similar, then it will not have a problem.
If it only cancels the given AIOCB, this scenario could happen.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0c92e0e6b6)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:50:28 -05:00
Michael Roth
4a38944326 qemu-ga: make guest-sync-delimited available during fsfreeze
We currently maintain a whitelist of commands that are safe during
fsfreeze. During fsfreeze, we disable all commands that aren't part of
that whitelist.

guest-sync-delimited meets the criteria for being whitelisted, and is
also required for qemu-ga clients that rely on guest-sync-delimited for
re-syncing the channel after a timeout.

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit c5dcb6ae23)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:49:47 -05:00
Markus Armbruster
b7ff1a7a00 qmp: netdev_add is like -netdev, not -net, fix documentation
Cc: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit af347aa5a5)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:43:46 -05:00
Gerd Hoffmann
d49fed4c55 vga: fix byteswapping.
In case host and guest endianness differ the vga code first creates
a shared surface (using qemu_create_displaysurface_from), then goes
patch the surface format to indicate that the bytes must be swapped.

The switch to pixman broke that hack as the format patching isn't
propagated into the pixman image, so ui code using the pixman image
directly (such as vnc) uses the wrong format.

Fix that by adding a byteswap parameter to
qemu_create_displaysurface_from, so we'll use the correct format
when creating the surface (and the pixman image) and don't have
to patch the format afterwards.

[ v2: unbreak xen build ]

Cc: qemu-stable@nongnu.org
Cc: mark.cave-ayland@ilande.co.uk
Cc: agraf@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1361349432-23884-1-git-send-email-kraxel@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit b1424e0381)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:34:41 -05:00
Jason Wang
cebb8ebe41 help: add docs for multiqueue tap options
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Message-id: 1361354641-51969-1-git-send-email-jasowang@redhat.com
Signed-off-by: Anthony Liguori <aliguori@us.ibm.com>
(cherry picked from commit 2ca81baa0b)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:32:37 -05:00
Jason Wang
3b39a11cde net: reduce the unnecessary memory allocation of multiqueue
Edivaldo reports a problem that the array of NetClientState in NICState is too
large - MAX_QUEUE_NUM(1024) which will wastes memory even if multiqueue is not
used.

Instead of static arrays, solving this issue by allocating the queues on demand
for both the NetClientState array in NICState and VirtIONetQueue array in
VirtIONet.

Tested by myself, with single virtio-net-pci device. The memory allocation is
almost the same as when multiqueue is not merged.

Cc: Edivaldo de Araujo Pereira <edivaldoapereira@yahoo.com.br>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f6b26cf257)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:28:29 -05:00
Igor Mitsyanko
ec9f828341 qemu-char.c: fix waiting for telnet connection message
Current colon position in "waiting for telnet connection" message template
produces messages like:
QEMU waiting for connection on: telnet::127.0.0.16666,server

After moving a colon to the right, we will get a correct messages like:
QEMU waiting for connection on: telnet:127.0.0.1:6666,server

Signed-off-by: Igor Mitsyanko <i.mitsyanko@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit e5545854dd)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:25:05 -05:00
Jason Wang
332e93417a tap: forbid creating multiqueue tap when hub is used
Obviously, hub does not support multiqueue tap. So this patch forbids creating
multiple queue tap when hub is used to prevent the crash when command line such
as "-net tap,queues=2" is used.

Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ce675a7579)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:07:24 -05:00
Peter Lieven
e6b795f34e block: complete all IOs before .bdrv_truncate
bdrv_truncate() invalidates the bdrv_check_request() result for
in-flight requests, so there should better be none.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Reported-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 9a665b2b86)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:05:31 -05:00
Paolo Bonzini
51968b8503 coroutine: trim down nesting level in perf_nesting test
20000 nested coroutines require 20 GB of virtual address space.
Only nest 1000 of them so that the test (only enabled with
"-m perf" on the command line) runs on 32-bit machines too.

Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 027003152f)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 10:04:32 -05:00
Andreas Färber
80d8b5da48 target-ppc: Fix "G2leGP3" PVR
Unlike derived PVR constants mapped to CPU_POWERPC_G2LEgp3, the
"G2leGP3" model definition itself used the CPU_POWERPC_G2LEgp1 PVR.

Fixing this will allow to alias CPU_POWERPC_G2LEgp3-using types to
"G2leGP3".

Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit bfe6d5b0da)

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2013-04-02 09:52:13 -05:00
377 changed files with 15042 additions and 5845 deletions

2
.gitignore vendored
View File

@@ -92,6 +92,8 @@ pc-bios/optionrom/multiboot.img
pc-bios/optionrom/kvmvapic.bin
pc-bios/optionrom/kvmvapic.raw
pc-bios/optionrom/kvmvapic.img
pc-bios/s390-ccw/s390-ccw.elf
pc-bios/s390-ccw/s390-ccw.img
.stgit-*
cscope.*
tags

View File

@@ -170,7 +170,7 @@ qemu-io$(EXESUF): qemu-io.o cmd.o $(block-obj-y) libqemuutil.a libqemustub.a
qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o
fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o fsdev/virtio-9p-marshal.o libqemuutil.a libqemustub.a
fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o fsdev/9p-marshal.o fsdev/9p-iov-marshal.o libqemuutil.a libqemustub.a
fsdev/virtfs-proxy-helper$(EXESUF): LIBS += -lcap
qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx

View File

@@ -16,16 +16,7 @@ block-obj-y += qapi-types.o qapi-visit.o
block-obj-y += qemu-coroutine.o qemu-coroutine-lock.o qemu-coroutine-io.o
block-obj-y += qemu-coroutine-sleep.o
ifeq ($(CONFIG_UCONTEXT_COROUTINE),y)
block-obj-$(CONFIG_POSIX) += coroutine-ucontext.o
else
ifeq ($(CONFIG_SIGALTSTACK_COROUTINE),y)
block-obj-$(CONFIG_POSIX) += coroutine-sigaltstack.o
else
block-obj-$(CONFIG_POSIX) += coroutine-gthread.o
endif
endif
block-obj-$(CONFIG_WIN32) += coroutine-win32.o
block-obj-y += coroutine-$(CONFIG_COROUTINE_BACKEND).o
ifeq ($(CONFIG_VIRTIO)$(CONFIG_VIRTFS)$(CONFIG_PCI),yyy)
# Lots of the fsdev/9pcode is pulled in by vl.c via qemu_fsdev_add.

View File

@@ -1 +1 @@
1.4.0
1.4.2

View File

@@ -114,26 +114,6 @@ const uint32_t arch_type = QEMU_ARCH;
#define RAM_SAVE_FLAG_CONTINUE 0x20
#define RAM_SAVE_FLAG_XBZRLE 0x40
#ifdef __ALTIVEC__
#include <altivec.h>
#define VECTYPE vector unsigned char
#define SPLAT(p) vec_splat(vec_ld(0, p), 0)
#define ALL_EQ(v1, v2) vec_all_eq(v1, v2)
/* altivec.h may redefine the bool macro as vector type.
* Reset it to POSIX semantics. */
#undef bool
#define bool _Bool
#elif defined __SSE2__
#include <emmintrin.h>
#define VECTYPE __m128i
#define SPLAT(p) _mm_set1_epi8(*(p))
#define ALL_EQ(v1, v2) (_mm_movemask_epi8(_mm_cmpeq_epi8(v1, v2)) == 0xFFFF)
#else
#define VECTYPE unsigned long
#define SPLAT(p) (*(p) * (~0UL / 255))
#define ALL_EQ(v1, v2) ((v1) == (v2))
#endif
static struct defconfig_file {
const char *filename;
@@ -164,19 +144,10 @@ int qemu_read_default_config_files(bool userconfig)
return 0;
}
static int is_dup_page(uint8_t *page)
static inline bool is_zero_page(uint8_t *p)
{
VECTYPE *p = (VECTYPE *)page;
VECTYPE val = SPLAT(page);
int i;
for (i = 0; i < TARGET_PAGE_SIZE / sizeof(VECTYPE); i++) {
if (!ALL_EQ(val, p[i])) {
return 0;
}
}
return 1;
return buffer_find_nonzero_offset(p, TARGET_PAGE_SIZE) ==
TARGET_PAGE_SIZE;
}
/* struct contains XBZRLE cache and a static page
@@ -210,6 +181,7 @@ int64_t xbzrle_cache_resize(int64_t new_size)
/* accounting for migration statistics */
typedef struct AccountingInfo {
uint64_t dup_pages;
uint64_t skipped_pages;
uint64_t norm_pages;
uint64_t iterations;
uint64_t xbzrle_bytes;
@@ -235,6 +207,16 @@ uint64_t dup_mig_pages_transferred(void)
return acct_info.dup_pages;
}
uint64_t skipped_mig_bytes_transferred(void)
{
return acct_info.skipped_pages * TARGET_PAGE_SIZE;
}
uint64_t skipped_mig_pages_transferred(void)
{
return acct_info.skipped_pages;
}
uint64_t norm_mig_bytes_transferred(void)
{
return acct_info.norm_pages * TARGET_PAGE_SIZE;
@@ -347,6 +329,7 @@ static ram_addr_t last_offset;
static unsigned long *migration_bitmap;
static uint64_t migration_dirty_pages;
static uint32_t last_version;
static bool ram_bulk_stage;
static inline
ram_addr_t migration_bitmap_find_and_reset_dirty(MemoryRegion *mr,
@@ -356,7 +339,13 @@ ram_addr_t migration_bitmap_find_and_reset_dirty(MemoryRegion *mr,
unsigned long nr = base + (start >> TARGET_PAGE_BITS);
unsigned long size = base + (int128_get64(mr->size) >> TARGET_PAGE_BITS);
unsigned long next = find_next_bit(migration_bitmap, size, nr);
unsigned long next;
if (ram_bulk_stage && nr > base) {
next = nr + 1;
} else {
next = find_next_bit(migration_bitmap, size, nr);
}
if (next < size) {
clear_bit(next, migration_bitmap);
@@ -451,6 +440,7 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
if (!block) {
block = QTAILQ_FIRST(&ram_list.blocks);
complete_round = true;
ram_bulk_stage = false;
}
} else {
uint8_t *p;
@@ -461,13 +451,13 @@ static int ram_save_block(QEMUFile *f, bool last_stage)
/* In doubt sent page as normal */
bytes_sent = -1;
if (is_dup_page(p)) {
if (is_zero_page(p)) {
acct_info.dup_pages++;
bytes_sent = save_block_hdr(f, block, offset, cont,
RAM_SAVE_FLAG_COMPRESS);
qemu_put_byte(f, *p);
bytes_sent += 1;
} else if (migrate_use_xbzrle()) {
qemu_put_byte(f, 0);
bytes_sent++;
} else if (!ram_bulk_stage && migrate_use_xbzrle()) {
current_addr = block->offset + offset;
bytes_sent = save_xbzrle_page(f, p, current_addr, block,
offset, cont, last_stage);
@@ -554,6 +544,7 @@ static void reset_ram_globals(void)
last_sent_block = NULL;
last_offset = 0;
last_version = ram_list.version;
ram_bulk_stage = true;
}
#define MAX_WAIT 50 /* ms, half buffered_file limit */
@@ -745,7 +736,7 @@ static inline void *host_from_stream_offset(QEMUFile *f,
uint8_t len;
if (flags & RAM_SAVE_FLAG_CONTINUE) {
if (!block) {
if (!block || block->length <= offset) {
fprintf(stderr, "Ack, bad migration stream!\n");
return NULL;
}
@@ -758,8 +749,9 @@ static inline void *host_from_stream_offset(QEMUFile *f,
id[len] = 0;
QTAILQ_FOREACH(block, &ram_list.blocks, next) {
if (!strncmp(id, block->idstr, sizeof(id)))
if (!strncmp(id, block->idstr, sizeof(id)) && block->length > offset) {
return memory_region_get_ram_ptr(block->mr) + offset;
}
}
fprintf(stderr, "Can't find block %s!\n", id);
@@ -833,14 +825,16 @@ static int ram_load(QEMUFile *f, void *opaque, int version_id)
}
ch = qemu_get_byte(f);
memset(host, ch, TARGET_PAGE_SIZE);
if (ch != 0 || !is_zero_page(host)) {
memset(host, ch, TARGET_PAGE_SIZE);
#ifndef _WIN32
if (ch == 0 &&
(!kvm_enabled() || kvm_has_sync_mmu()) &&
getpagesize() <= TARGET_PAGE_SIZE) {
qemu_madvise(host, TARGET_PAGE_SIZE, QEMU_MADV_DONTNEED);
}
if (ch == 0 &&
(!kvm_enabled() || kvm_has_sync_mmu()) &&
getpagesize() <= TARGET_PAGE_SIZE) {
qemu_madvise(host, TARGET_PAGE_SIZE, QEMU_MADV_DONTNEED);
}
#endif
}
} else if (flags & RAM_SAVE_FLAG_PAGE) {
void *host;

View File

@@ -2054,6 +2054,8 @@ void AUD_del_capture (CaptureVoiceOut *cap, void *cb_opaque)
sw = sw1;
}
QLIST_REMOVE (cap, entries);
g_free (cap->hw.mix_buf);
g_free (cap->buf);
g_free (cap);
}
return;

View File

@@ -24,33 +24,12 @@ typedef struct RngEgd
CharDriverState *chr;
char *chr_name;
GSList *requests;
} RngEgd;
typedef struct RngRequest
{
EntropyReceiveFunc *receive_entropy;
uint8_t *data;
void *opaque;
size_t offset;
size_t size;
} RngRequest;
static void rng_egd_request_entropy(RngBackend *b, size_t size,
EntropyReceiveFunc *receive_entropy,
void *opaque)
static void rng_egd_request_entropy(RngBackend *b, RngRequest *req)
{
RngEgd *s = RNG_EGD(b);
RngRequest *req;
req = g_malloc(sizeof(*req));
req->offset = 0;
req->size = size;
req->receive_entropy = receive_entropy;
req->opaque = opaque;
req->data = g_malloc(req->size);
size_t size = req->size;
while (size > 0) {
uint8_t header[2];
@@ -64,23 +43,15 @@ static void rng_egd_request_entropy(RngBackend *b, size_t size,
size -= len;
}
s->requests = g_slist_append(s->requests, req);
}
static void rng_egd_free_request(RngRequest *req)
{
g_free(req->data);
g_free(req);
}
static int rng_egd_chr_can_read(void *opaque)
static size_t rng_egd_chr_can_read(void *opaque)
{
RngEgd *s = RNG_EGD(opaque);
GSList *i;
int size = 0;
size_t size = 0;
for (i = s->requests; i; i = i->next) {
for (i = s->parent.requests; i; i = i->next) {
RngRequest *req = i->data;
size += req->size - req->offset;
}
@@ -88,12 +59,12 @@ static int rng_egd_chr_can_read(void *opaque)
return size;
}
static void rng_egd_chr_read(void *opaque, const uint8_t *buf, int size)
static void rng_egd_chr_read(void *opaque, const uint8_t *buf, size_t size)
{
RngEgd *s = RNG_EGD(opaque);
while (size > 0 && s->requests) {
RngRequest *req = s->requests->data;
while (size > 0 && s->parent.requests) {
RngRequest *req = s->parent.requests->data;
int len = MIN(size, req->size - req->offset);
memcpy(req->data + req->offset, buf, len);
@@ -101,38 +72,13 @@ static void rng_egd_chr_read(void *opaque, const uint8_t *buf, int size)
size -= len;
if (req->offset == req->size) {
s->requests = g_slist_remove_link(s->requests, s->requests);
req->receive_entropy(req->opaque, req->data, req->size);
rng_egd_free_request(req);
rng_backend_finalize_request(&s->parent, req);
}
}
}
static void rng_egd_free_requests(RngEgd *s)
{
GSList *i;
for (i = s->requests; i; i = i->next) {
rng_egd_free_request(i->data);
}
g_slist_free(s->requests);
s->requests = NULL;
}
static void rng_egd_cancel_requests(RngBackend *b)
{
RngEgd *s = RNG_EGD(b);
/* We simply delete the list of pending requests. If there is data in the
* queue waiting to be read, this is okay, because there will always be
* more data than we requested originally
*/
rng_egd_free_requests(s);
}
static void rng_egd_opened(RngBackend *b, Error **errp)
{
RngEgd *s = RNG_EGD(b);
@@ -194,8 +140,6 @@ static void rng_egd_finalize(Object *obj)
}
g_free(s->chr_name);
rng_egd_free_requests(s);
}
static void rng_egd_class_init(ObjectClass *klass, void *data)
@@ -203,7 +147,6 @@ static void rng_egd_class_init(ObjectClass *klass, void *data)
RngBackendClass *rbc = RNG_BACKEND_CLASS(klass);
rbc->request_entropy = rng_egd_request_entropy;
rbc->cancel_requests = rng_egd_cancel_requests;
rbc->opened = rng_egd_opened;
}

View File

@@ -21,10 +21,6 @@ struct RndRandom
int fd;
char *filename;
EntropyReceiveFunc *receive_func;
void *opaque;
size_t size;
};
/**
@@ -37,33 +33,35 @@ struct RndRandom
static void entropy_available(void *opaque)
{
RndRandom *s = RNG_RANDOM(opaque);
uint8_t buffer[s->size];
ssize_t len;
len = read(s->fd, buffer, s->size);
g_assert(len != -1);
while (s->parent.requests != NULL) {
RngRequest *req = s->parent.requests->data;
ssize_t len;
s->receive_func(s->opaque, buffer, len);
s->receive_func = NULL;
len = read(s->fd, req->data, req->size);
if (len < 0 && errno == EAGAIN) {
return;
}
g_assert(len != -1);
req->receive_entropy(req->opaque, req->data, len);
rng_backend_finalize_request(&s->parent, req);
}
/* We've drained all requests, the fd handler can be reset. */
qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
}
static void rng_random_request_entropy(RngBackend *b, size_t size,
EntropyReceiveFunc *receive_entropy,
void *opaque)
static void rng_random_request_entropy(RngBackend *b, RngRequest *req)
{
RndRandom *s = RNG_RANDOM(b);
if (s->receive_func) {
s->receive_func(s->opaque, NULL, 0);
if (s->parent.requests == NULL) {
/* If there are no pending requests yet, we need to
* install our fd handler. */
qemu_set_fd_handler(s->fd, entropy_available, NULL, s);
}
s->receive_func = receive_entropy;
s->opaque = opaque;
s->size = size;
qemu_set_fd_handler(s->fd, entropy_available, NULL, s);
}
static void rng_random_opened(RngBackend *b, Error **errp)
@@ -74,7 +72,7 @@ static void rng_random_opened(RngBackend *b, Error **errp)
error_set(errp, QERR_INVALID_PARAMETER_VALUE,
"filename", "a valid filename");
} else {
s->fd = open(s->filename, O_RDONLY | O_NONBLOCK);
s->fd = qemu_open(s->filename, O_RDONLY | O_NONBLOCK);
if (s->fd == -1) {
error_set(errp, QERR_OPEN_FILE_FAILED, s->filename);
@@ -130,7 +128,7 @@ static void rng_random_finalize(Object *obj)
qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
if (s->fd != -1) {
close(s->fd);
qemu_close(s->fd);
}
g_free(s->filename);

View File

@@ -18,18 +18,20 @@ void rng_backend_request_entropy(RngBackend *s, size_t size,
void *opaque)
{
RngBackendClass *k = RNG_BACKEND_GET_CLASS(s);
RngRequest *req;
if (k->request_entropy) {
k->request_entropy(s, size, receive_entropy, opaque);
}
}
req = g_malloc(sizeof(*req));
void rng_backend_cancel_requests(RngBackend *s)
{
RngBackendClass *k = RNG_BACKEND_GET_CLASS(s);
req->offset = 0;
req->size = size;
req->receive_entropy = receive_entropy;
req->opaque = opaque;
req->data = g_malloc(req->size);
if (k->cancel_requests) {
k->cancel_requests(s);
k->request_entropy(s, req);
s->requests = g_slist_append(s->requests, req);
}
}
@@ -68,6 +70,30 @@ static void rng_backend_prop_set_opened(Object *obj, bool value, Error **errp)
}
}
static void rng_backend_free_request(RngRequest *req)
{
g_free(req->data);
g_free(req);
}
static void rng_backend_free_requests(RngBackend *s)
{
GSList *i;
for (i = s->requests; i; i = i->next) {
rng_backend_free_request(i->data);
}
g_slist_free(s->requests);
s->requests = NULL;
}
void rng_backend_finalize_request(RngBackend *s, RngRequest *req)
{
s->requests = g_slist_remove(s->requests, req);
rng_backend_free_request(req);
}
static void rng_backend_init(Object *obj)
{
object_property_add_bool(obj, "opened",
@@ -76,11 +102,19 @@ static void rng_backend_init(Object *obj)
NULL);
}
static void rng_backend_finalize(Object *obj)
{
RngBackend *s = RNG_BACKEND(obj);
rng_backend_free_requests(s);
}
static const TypeInfo rng_backend_info = {
.name = TYPE_RNG_BACKEND,
.parent = TYPE_OBJECT,
.instance_size = sizeof(RngBackend),
.instance_init = rng_backend_init,
.instance_finalize = rng_backend_finalize,
.class_size = sizeof(RngBackendClass),
.abstract = true,
};

View File

@@ -1940,6 +1940,10 @@ static int bdrv_check_byte_request(BlockDriverState *bs, int64_t offset,
static int bdrv_check_request(BlockDriverState *bs, int64_t sector_num,
int nb_sectors)
{
if (nb_sectors > INT_MAX / BDRV_SECTOR_SIZE) {
return -EIO;
}
return bdrv_check_byte_request(bs, sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE);
}

View File

@@ -18,5 +18,7 @@ endif
common-obj-y += stream.o
common-obj-y += commit.o
common-obj-y += mirror.o
block-obj-y += dictzip.o
block-obj-y += tar.o
$(obj)/curl.o: QEMU_CFLAGS+=$(CURL_CFLAGS)

View File

@@ -38,57 +38,42 @@
// not allocated: 0xffffffff
// always little-endian
struct bochs_header_v1 {
char magic[32]; // "Bochs Virtual HD Image"
char type[16]; // "Redolog"
char subtype[16]; // "Undoable" / "Volatile" / "Growing"
uint32_t version;
uint32_t header; // size of header
union {
struct {
uint32_t catalog; // num of entries
uint32_t bitmap; // bitmap size
uint32_t extent; // extent size
uint64_t disk; // disk size
char padding[HEADER_SIZE - 64 - 8 - 20];
} redolog;
char padding[HEADER_SIZE - 64 - 8];
} extra;
};
// always little-endian
struct bochs_header {
char magic[32]; // "Bochs Virtual HD Image"
char type[16]; // "Redolog"
char subtype[16]; // "Undoable" / "Volatile" / "Growing"
char magic[32]; /* "Bochs Virtual HD Image" */
char type[16]; /* "Redolog" */
char subtype[16]; /* "Undoable" / "Volatile" / "Growing" */
uint32_t version;
uint32_t header; // size of header
uint32_t header; /* size of header */
uint32_t catalog; /* num of entries */
uint32_t bitmap; /* bitmap size */
uint32_t extent; /* extent size */
union {
struct {
uint32_t catalog; // num of entries
uint32_t bitmap; // bitmap size
uint32_t extent; // extent size
uint32_t reserved; // for ???
uint64_t disk; // disk size
char padding[HEADER_SIZE - 64 - 8 - 24];
} redolog;
char padding[HEADER_SIZE - 64 - 8];
struct {
uint32_t reserved; /* for ??? */
uint64_t disk; /* disk size */
char padding[HEADER_SIZE - 64 - 20 - 12];
} QEMU_PACKED redolog;
struct {
uint64_t disk; /* disk size */
char padding[HEADER_SIZE - 64 - 20 - 8];
} QEMU_PACKED redolog_v1;
char padding[HEADER_SIZE - 64 - 20];
} extra;
};
} QEMU_PACKED;
typedef struct BDRVBochsState {
CoMutex lock;
uint32_t *catalog_bitmap;
int catalog_size;
uint32_t catalog_size;
int data_offset;
uint32_t data_offset;
int bitmap_blocks;
int extent_blocks;
int extent_size;
uint32_t bitmap_blocks;
uint32_t extent_blocks;
uint32_t extent_size;
} BDRVBochsState;
static int bochs_probe(const uint8_t *buf, int buf_size, const char *filename)
@@ -111,9 +96,8 @@ static int bochs_probe(const uint8_t *buf, int buf_size, const char *filename)
static int bochs_open(BlockDriverState *bs, int flags)
{
BDRVBochsState *s = bs->opaque;
int i;
uint32_t i;
struct bochs_header bochs;
struct bochs_header_v1 header_v1;
int ret;
bs->read_only = 1; // no write support yet
@@ -132,13 +116,20 @@ static int bochs_open(BlockDriverState *bs, int flags)
}
if (le32_to_cpu(bochs.version) == HEADER_V1) {
memcpy(&header_v1, &bochs, sizeof(bochs));
bs->total_sectors = le64_to_cpu(header_v1.extra.redolog.disk) / 512;
bs->total_sectors = le64_to_cpu(bochs.extra.redolog_v1.disk) / 512;
} else {
bs->total_sectors = le64_to_cpu(bochs.extra.redolog.disk) / 512;
bs->total_sectors = le64_to_cpu(bochs.extra.redolog.disk) / 512;
}
/* Limit to 1M entries to avoid unbounded allocation. This is what is
* needed for the largest image that bximage can create (~8 TB). */
s->catalog_size = le32_to_cpu(bochs.catalog);
if (s->catalog_size > 0x100000) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"Catalog size is too large");
return -EFBIG;
}
s->catalog_size = le32_to_cpu(bochs.extra.redolog.catalog);
s->catalog_bitmap = g_malloc(s->catalog_size * 4);
ret = bdrv_pread(bs->file, le32_to_cpu(bochs.header), s->catalog_bitmap,
@@ -152,10 +143,27 @@ static int bochs_open(BlockDriverState *bs, int flags)
s->data_offset = le32_to_cpu(bochs.header) + (s->catalog_size * 4);
s->bitmap_blocks = 1 + (le32_to_cpu(bochs.extra.redolog.bitmap) - 1) / 512;
s->extent_blocks = 1 + (le32_to_cpu(bochs.extra.redolog.extent) - 1) / 512;
s->bitmap_blocks = 1 + (le32_to_cpu(bochs.bitmap) - 1) / 512;
s->extent_blocks = 1 + (le32_to_cpu(bochs.extent) - 1) / 512;
s->extent_size = le32_to_cpu(bochs.extra.redolog.extent);
s->extent_size = le32_to_cpu(bochs.extent);
if (s->extent_size == 0) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"Extent size may not be zero");
return -EINVAL;
} else if (s->extent_size > 0x800000) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"Extent size %" PRIu32 " is too large",
s->extent_size);
return -EINVAL;
}
if (s->catalog_size < bs->total_sectors / s->extent_size) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"Catalog size is too small for this disk size");
ret = -EINVAL;
goto fail;
}
qemu_co_mutex_init(&s->lock);
return 0;
@@ -168,8 +176,8 @@ fail:
static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
{
BDRVBochsState *s = bs->opaque;
int64_t offset = sector_num * 512;
int64_t extent_index, extent_offset, bitmap_offset;
uint64_t offset = sector_num * 512;
uint64_t extent_index, extent_offset, bitmap_offset;
char bitmap_entry;
// seek to sector
@@ -180,8 +188,9 @@ static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
return -1; /* not allocated */
}
bitmap_offset = s->data_offset + (512 * s->catalog_bitmap[extent_index] *
(s->extent_blocks + s->bitmap_blocks));
bitmap_offset = s->data_offset +
(512 * (uint64_t) s->catalog_bitmap[extent_index] *
(s->extent_blocks + s->bitmap_blocks));
/* read in bitmap for current extent */
if (bdrv_pread(bs->file, bitmap_offset + (extent_offset / 8),

View File

@@ -26,6 +26,9 @@
#include "qemu/module.h"
#include <zlib.h>
/* Maximum compressed block size */
#define MAX_BLOCK_SIZE (64 * 1024 * 1024)
typedef struct BDRVCloopState {
CoMutex lock;
uint32_t block_size;
@@ -67,6 +70,29 @@ static int cloop_open(BlockDriverState *bs, int flags)
return ret;
}
s->block_size = be32_to_cpu(s->block_size);
if (s->block_size % 512) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"block_size %u must be a multiple of 512",
s->block_size);
return -EINVAL;
}
if (s->block_size == 0) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"block_size cannot be zero");
return -EINVAL;
}
/* cloop's create_compressed_fs.c warns about block sizes beyond 256 KB but
* we can accept more. Prevent ridiculous values like 4 GB - 1 since we
* need a buffer this big.
*/
if (s->block_size > MAX_BLOCK_SIZE) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"block_size %u must be %u MB or less",
s->block_size,
MAX_BLOCK_SIZE / (1024 * 1024));
return -EINVAL;
}
ret = bdrv_pread(bs->file, 128 + 4, &s->n_blocks, 4);
if (ret < 0) {
@@ -75,7 +101,25 @@ static int cloop_open(BlockDriverState *bs, int flags)
s->n_blocks = be32_to_cpu(s->n_blocks);
/* read offsets */
offsets_size = s->n_blocks * sizeof(uint64_t);
if (s->n_blocks > (UINT32_MAX - 1) / sizeof(uint64_t)) {
/* Prevent integer overflow */
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"n_blocks %u must be %zu or less",
s->n_blocks,
(UINT32_MAX - 1) / sizeof(uint64_t));
return -EINVAL;
}
offsets_size = (s->n_blocks + 1) * sizeof(uint64_t);
if (offsets_size > 512 * 1024 * 1024) {
/* Prevent ridiculous offsets_size which causes memory allocation to
* fail or overflows bdrv_pread() size. In practice the 512 MB
* offsets[] limit supports 16 TB images at 256 KB block size.
*/
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"image requires too many offsets, "
"try increasing block size");
return -EINVAL;
}
s->offsets = g_malloc(offsets_size);
ret = bdrv_pread(bs->file, 128 + 4 + 4, s->offsets, offsets_size);
@@ -83,13 +127,39 @@ static int cloop_open(BlockDriverState *bs, int flags)
goto fail;
}
for(i=0;i<s->n_blocks;i++) {
for (i = 0; i < s->n_blocks + 1; i++) {
uint64_t size;
s->offsets[i] = be64_to_cpu(s->offsets[i]);
if (i > 0) {
uint32_t size = s->offsets[i] - s->offsets[i - 1];
if (size > max_compressed_block_size) {
max_compressed_block_size = size;
}
if (i == 0) {
continue;
}
if (s->offsets[i] < s->offsets[i - 1]) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"offsets not monotonically increasing at "
"index %u, image file is corrupt", i);
ret = -EINVAL;
goto fail;
}
size = s->offsets[i] - s->offsets[i - 1];
/* Compressed blocks should be smaller than the uncompressed block size
* but maybe compression performed poorly so the compressed block is
* actually bigger. Clamp down on unrealistic values to prevent
* ridiculous s->compressed_block allocation.
*/
if (size > 2 * MAX_BLOCK_SIZE) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"invalid compressed block size at index %u, "
"image file is corrupt", i);
ret = -EINVAL;
goto fail;
}
if (size > max_compressed_block_size) {
max_compressed_block_size = size;
}
}
@@ -179,9 +249,7 @@ static coroutine_fn int cloop_co_read(BlockDriverState *bs, int64_t sector_num,
static void cloop_close(BlockDriverState *bs)
{
BDRVCloopState *s = bs->opaque;
if (s->n_blocks > 0) {
g_free(s->offsets);
}
g_free(s->offsets);
g_free(s->compressed_block);
g_free(s->uncompressed_block);
inflateEnd(&s->zstream);

View File

@@ -134,6 +134,11 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
if (!s || !s->orig_buf)
goto read_end;
if (s->buf_off >= s->buf_len) {
/* buffer full, read nothing */
return 0;
}
realsize = MIN(realsize, s->buf_len - s->buf_off);
memcpy(s->orig_buf + s->buf_off, ptr, realsize);
s->buf_off += realsize;

572
block/dictzip.c Normal file
View File

@@ -0,0 +1,572 @@
/*
* DictZip Block driver for dictzip enabled gzip files
*
* Use the "dictzip" tool from the "dictd" package to create gzip files that
* contain the extra DictZip headers.
*
* dictzip(1) is a compression program which creates compressed files in the
* gzip format (see RFC 1952). However, unlike gzip(1), dictzip(1) compresses
* the file in pieces and stores an index to the pieces in the gzip header.
* This allows random access to the file at the granularity of the compressed
* pieces (currently about 64kB) while maintaining good compression ratios
* (within 5% of the expected ratio for dictionary data).
* dictd(8) uses files stored in this format.
*
* For details on DictZip see http://dict.org/.
*
* Copyright (c) 2009 Alexander Graf <agraf@suse.de>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "block/block_int.h"
#include <zlib.h>
// #define DEBUG
#ifdef DEBUG
#define dprintf(fmt, ...) do { printf("dzip: " fmt, ## __VA_ARGS__); } while (0)
#else
#define dprintf(fmt, ...) do { } while (0)
#endif
#define SECTOR_SIZE 512
#define Z_STREAM_COUNT 4
#define CACHE_COUNT 20
/* magic values */
#define GZ_MAGIC1 0x1f
#define GZ_MAGIC2 0x8b
#define DZ_MAGIC1 'R'
#define DZ_MAGIC2 'A'
#define GZ_FEXTRA 0x04 /* Optional field (random access index) */
#define GZ_FNAME 0x08 /* Original name */
#define GZ_COMMENT 0x10 /* Zero-terminated, human-readable comment */
#define GZ_FHCRC 0x02 /* Header CRC16 */
/* offsets */
#define GZ_ID 0 /* GZ_MAGIC (16bit) */
#define GZ_FLG 3 /* FLaGs (see above) */
#define GZ_XLEN 10 /* eXtra LENgth (16bit) */
#define GZ_SI 12 /* Subfield ID (16bit) */
#define GZ_VERSION 16 /* Version for subfield format */
#define GZ_CHUNKSIZE 18 /* Chunk size (16bit) */
#define GZ_CHUNKCNT 20 /* Number of chunks (16bit) */
#define GZ_RNDDATA 22 /* Random access data (16bit) */
#define GZ_99_CHUNKSIZE 18 /* Chunk size (32bit) */
#define GZ_99_CHUNKCNT 22 /* Number of chunks (32bit) */
#define GZ_99_FILESIZE 26 /* Size of unpacked file (64bit) */
#define GZ_99_RNDDATA 34 /* Random access data (32bit) */
struct BDRVDictZipState;
typedef struct DictZipAIOCB {
BlockDriverAIOCB common;
struct BDRVDictZipState *s;
QEMUIOVector *qiov; /* QIOV of the original request */
QEMUIOVector *qiov_gz; /* QIOV of the gz subrequest */
QEMUBH *bh; /* BH for cache */
z_stream *zStream; /* stream to use for decoding */
int zStream_id; /* stream id of the above pointer */
size_t start; /* offset into the uncompressed file */
size_t len; /* uncompressed bytes to read */
uint8_t *gzipped; /* the gzipped data */
uint8_t *buf; /* cached result */
size_t gz_len; /* amount of gzip data */
size_t gz_start; /* uncompressed starting point of gzip data */
uint64_t offset; /* offset for "start" into the uncompressed chunk */
int chunks_len; /* amount of uncompressed data in all gzip data */
} DictZipAIOCB;
typedef struct dict_cache {
size_t start;
size_t len;
uint8_t *buf;
} DictCache;
typedef struct BDRVDictZipState {
BlockDriverState *hd;
z_stream zStream[Z_STREAM_COUNT];
DictCache cache[CACHE_COUNT];
int cache_index;
uint8_t stream_in_use;
uint64_t chunk_len;
uint32_t chunk_cnt;
uint16_t *chunks;
uint32_t *chunks32;
uint64_t *offsets;
int64_t file_len;
} BDRVDictZipState;
static int dictzip_probe(const uint8_t *buf, int buf_size, const char *filename)
{
if (buf_size < 2)
return 0;
/* We match on every gzip file */
if ((buf[0] == GZ_MAGIC1) && (buf[1] == GZ_MAGIC2))
return 100;
return 0;
}
static int start_zStream(z_stream *zStream)
{
zStream->zalloc = NULL;
zStream->zfree = NULL;
zStream->opaque = NULL;
zStream->next_in = 0;
zStream->avail_in = 0;
zStream->next_out = NULL;
zStream->avail_out = 0;
return inflateInit2( zStream, -15 );
}
static int dictzip_open(BlockDriverState *bs, const char *filename, int flags)
{
BDRVDictZipState *s = bs->opaque;
const char *err = "Unknown (read error?)";
uint8_t magic[2];
char buf[100];
uint8_t header_flags;
uint16_t chunk_len16;
uint16_t chunk_cnt16;
uint32_t chunk_len32;
uint16_t header_ver;
uint16_t tmp_short;
uint64_t offset;
int chunks_len;
int headerLength = GZ_XLEN - 1;
int rnd_offs;
int ret;
int i;
const char *fname = filename;
if (!strncmp(filename, "dzip://", 7))
fname += 7;
else if (!strncmp(filename, "dzip:", 5))
fname += 5;
ret = bdrv_file_open(&s->hd, fname, flags);
if (ret < 0)
return ret;
/* initialize zlib streams */
for (i = 0; i < Z_STREAM_COUNT; i++) {
if (start_zStream( &s->zStream[i] ) != Z_OK) {
err = s->zStream[i].msg;
goto fail;
}
}
/* gzip header */
if (bdrv_pread(s->hd, GZ_ID, &magic, sizeof(magic)) != sizeof(magic))
goto fail;
if (!((magic[0] == GZ_MAGIC1) && (magic[1] == GZ_MAGIC2))) {
err = "No gzip file";
goto fail;
}
/* dzip header */
if (bdrv_pread(s->hd, GZ_FLG, &header_flags, 1) != 1)
goto fail;
if (!(header_flags & GZ_FEXTRA)) {
err = "Not a dictzip file (wrong flags)";
goto fail;
}
/* extra length */
if (bdrv_pread(s->hd, GZ_XLEN, &tmp_short, 2) != 2)
goto fail;
headerLength += le16_to_cpu(tmp_short) + 2;
/* DictZip magic */
if (bdrv_pread(s->hd, GZ_SI, &magic, 2) != 2)
goto fail;
if (magic[0] != DZ_MAGIC1 || magic[1] != DZ_MAGIC2) {
err = "Not a dictzip file (missing extra magic)";
goto fail;
}
/* DictZip version */
if (bdrv_pread(s->hd, GZ_VERSION, &header_ver, 2) != 2)
goto fail;
header_ver = le16_to_cpu(header_ver);
switch (header_ver) {
case 1: /* Normal DictZip */
/* number of chunks */
if (bdrv_pread(s->hd, GZ_CHUNKSIZE, &chunk_len16, 2) != 2)
goto fail;
s->chunk_len = le16_to_cpu(chunk_len16);
/* chunk count */
if (bdrv_pread(s->hd, GZ_CHUNKCNT, &chunk_cnt16, 2) != 2)
goto fail;
s->chunk_cnt = le16_to_cpu(chunk_cnt16);
chunks_len = sizeof(short) * s->chunk_cnt;
rnd_offs = GZ_RNDDATA;
break;
case 99: /* Special Alex pigz version */
/* number of chunks */
if (bdrv_pread(s->hd, GZ_99_CHUNKSIZE, &chunk_len32, 4) != 4)
goto fail;
dprintf("chunk len [%#x] = %d\n", GZ_99_CHUNKSIZE, chunk_len32);
s->chunk_len = le32_to_cpu(chunk_len32);
/* chunk count */
if (bdrv_pread(s->hd, GZ_99_CHUNKCNT, &s->chunk_cnt, 4) != 4)
goto fail;
s->chunk_cnt = le32_to_cpu(s->chunk_cnt);
dprintf("chunk len | count = %d | %d\n", s->chunk_len, s->chunk_cnt);
/* file size */
if (bdrv_pread(s->hd, GZ_99_FILESIZE, &s->file_len, 8) != 8)
goto fail;
s->file_len = le64_to_cpu(s->file_len);
chunks_len = sizeof(int) * s->chunk_cnt;
rnd_offs = GZ_99_RNDDATA;
break;
default:
err = "Invalid DictZip version";
goto fail;
}
/* random access data */
s->chunks = g_malloc(chunks_len);
if (header_ver == 99)
s->chunks32 = (uint32_t *)s->chunks;
if (bdrv_pread(s->hd, rnd_offs, s->chunks, chunks_len) != chunks_len)
goto fail;
/* orig filename */
if (header_flags & GZ_FNAME) {
if (bdrv_pread(s->hd, headerLength + 1, buf, sizeof(buf)) != sizeof(buf))
goto fail;
buf[sizeof(buf) - 1] = '\0';
headerLength += strlen(buf) + 1;
if (strlen(buf) == sizeof(buf))
goto fail;
dprintf("filename: %s\n", buf);
}
/* comment field */
if (header_flags & GZ_COMMENT) {
if (bdrv_pread(s->hd, headerLength, buf, sizeof(buf)) != sizeof(buf))
goto fail;
buf[sizeof(buf) - 1] = '\0';
headerLength += strlen(buf) + 1;
if (strlen(buf) == sizeof(buf))
goto fail;
dprintf("comment: %s\n", buf);
}
if (header_flags & GZ_FHCRC)
headerLength += 2;
/* uncompressed file length*/
if (!s->file_len) {
uint32_t file_len;
if (bdrv_pread(s->hd, bdrv_getlength(s->hd) - 4, &file_len, 4) != 4)
goto fail;
s->file_len = le32_to_cpu(file_len);
}
/* compute offsets */
s->offsets = g_malloc(sizeof( *s->offsets ) * s->chunk_cnt);
for (offset = headerLength + 1, i = 0; i < s->chunk_cnt; i++) {
s->offsets[i] = offset;
switch (header_ver) {
case 1:
offset += le16_to_cpu(s->chunks[i]);
break;
case 99:
offset += le32_to_cpu(s->chunks32[i]);
break;
}
dprintf("chunk %#x - %#x = offset %#x -> %#x\n", i * s->chunk_len, (i+1) * s->chunk_len, s->offsets[i], offset);
}
return 0;
fail:
fprintf(stderr, "DictZip: Error opening file: %s\n", err);
bdrv_delete(s->hd);
if (s->chunks)
g_free(s->chunks);
return -EINVAL;
}
/* This callback gets invoked when we have the result in cache already */
static void dictzip_cache_cb(void *opaque)
{
DictZipAIOCB *acb = (DictZipAIOCB *)opaque;
qemu_iovec_from_buf(acb->qiov, 0, acb->buf, acb->len);
acb->common.cb(acb->common.opaque, 0);
qemu_bh_delete(acb->bh);
qemu_aio_release(acb);
}
/* This callback gets invoked by the underlying block reader when we have
* all compressed data. We uncompress in here. */
static void dictzip_read_cb(void *opaque, int ret)
{
DictZipAIOCB *acb = (DictZipAIOCB *)opaque;
struct BDRVDictZipState *s = acb->s;
uint8_t *buf;
DictCache *cache;
int r, i;
buf = g_malloc(acb->chunks_len);
/* try to find zlib stream for decoding */
do {
for (i = 0; i < Z_STREAM_COUNT; i++) {
if (!(s->stream_in_use & (1 << i))) {
s->stream_in_use |= (1 << i);
acb->zStream_id = i;
acb->zStream = &s->zStream[i];
break;
}
}
} while(!acb->zStream);
/* sure, we could handle more streams, but this callback should be single
threaded and when it's not, we really want to know! */
assert(i == 0);
/* uncompress the chunk */
acb->zStream->next_in = acb->gzipped;
acb->zStream->avail_in = acb->gz_len;
acb->zStream->next_out = buf;
acb->zStream->avail_out = acb->chunks_len;
r = inflate( acb->zStream, Z_PARTIAL_FLUSH );
if ( (r != Z_OK) && (r != Z_STREAM_END) )
fprintf(stderr, "Error inflating: [%d] %s\n", r, acb->zStream->msg);
if ( r == Z_STREAM_END )
inflateReset(acb->zStream);
dprintf("inflating [%d] left: %d | %d bytes\n", r, acb->zStream->avail_in, acb->zStream->avail_out);
s->stream_in_use &= ~(1 << acb->zStream_id);
/* nofity the caller */
qemu_iovec_from_buf(acb->qiov, 0, buf + acb->offset, acb->len);
acb->common.cb(acb->common.opaque, 0);
/* fill the cache */
cache = &s->cache[s->cache_index];
s->cache_index++;
if (s->cache_index == CACHE_COUNT)
s->cache_index = 0;
cache->len = 0;
if (cache->buf)
g_free(cache->buf);
cache->start = acb->gz_start;
cache->buf = buf;
cache->len = acb->chunks_len;
/* free occupied ressources */
g_free(acb->qiov_gz);
qemu_aio_release(acb);
}
static void dictzip_aio_cancel(BlockDriverAIOCB *blockacb)
{
}
static const AIOCBInfo dictzip_aiocb_info = {
.aiocb_size = sizeof(DictZipAIOCB),
.cancel = dictzip_aio_cancel,
};
/* This is where we get a request from a caller to read something */
static BlockDriverAIOCB *dictzip_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVDictZipState *s = bs->opaque;
DictZipAIOCB *acb;
QEMUIOVector *qiov_gz;
struct iovec *iov;
uint8_t *buf;
size_t start = sector_num * SECTOR_SIZE;
size_t len = nb_sectors * SECTOR_SIZE;
size_t end = start + len;
size_t gz_start;
size_t gz_len;
int64_t gz_sector_num;
int gz_nb_sectors;
int first_chunk, last_chunk;
int first_offset;
int i;
acb = qemu_aio_get(&dictzip_aiocb_info, bs, cb, opaque);
if (!acb)
return NULL;
/* Search Cache */
for (i = 0; i < CACHE_COUNT; i++) {
if (!s->cache[i].len)
continue;
if ((start >= s->cache[i].start) &&
(end <= (s->cache[i].start + s->cache[i].len))) {
acb->buf = s->cache[i].buf + (start - s->cache[i].start);
acb->len = len;
acb->qiov = qiov;
acb->bh = qemu_bh_new(dictzip_cache_cb, acb);
qemu_bh_schedule(acb->bh);
return &acb->common;
}
}
/* No cache, so let's decode */
/* We need to read these chunks */
first_chunk = start / s->chunk_len;
first_offset = start - first_chunk * s->chunk_len;
last_chunk = end / s->chunk_len;
gz_start = s->offsets[first_chunk];
gz_len = 0;
for (i = first_chunk; i <= last_chunk; i++) {
if (s->chunks32)
gz_len += le32_to_cpu(s->chunks32[i]);
else
gz_len += le16_to_cpu(s->chunks[i]);
}
gz_sector_num = gz_start / SECTOR_SIZE;
gz_nb_sectors = (gz_len / SECTOR_SIZE);
/* account for tail and heads */
while ((gz_start + gz_len) > ((gz_sector_num + gz_nb_sectors) * SECTOR_SIZE))
gz_nb_sectors++;
/* Allocate qiov, iov and buf in one chunk so we only need to free qiov */
qiov_gz = g_malloc0(sizeof(QEMUIOVector) + sizeof(struct iovec) +
(gz_nb_sectors * SECTOR_SIZE));
iov = (struct iovec *)(((char *)qiov_gz) + sizeof(QEMUIOVector));
buf = ((uint8_t *)iov) + sizeof(struct iovec *);
/* Kick off the read by the backing file, so we can start decompressing */
iov->iov_base = (void *)buf;
iov->iov_len = gz_nb_sectors * 512;
qemu_iovec_init_external(qiov_gz, iov, 1);
dprintf("read %d - %d => %d - %d\n", start, end, gz_start, gz_start + gz_len);
acb->s = s;
acb->qiov = qiov;
acb->qiov_gz = qiov_gz;
acb->start = start;
acb->len = len;
acb->gzipped = buf + (gz_start % SECTOR_SIZE);
acb->gz_len = gz_len;
acb->gz_start = first_chunk * s->chunk_len;
acb->offset = first_offset;
acb->chunks_len = (last_chunk - first_chunk + 1) * s->chunk_len;
return bdrv_aio_readv(s->hd, gz_sector_num, qiov_gz, gz_nb_sectors,
dictzip_read_cb, acb);
}
static void dictzip_close(BlockDriverState *bs)
{
BDRVDictZipState *s = bs->opaque;
int i;
for (i = 0; i < CACHE_COUNT; i++) {
if (!s->cache[i].len)
continue;
g_free(s->cache[i].buf);
}
for (i = 0; i < Z_STREAM_COUNT; i++) {
inflateEnd(&s->zStream[i]);
}
if (s->chunks)
g_free(s->chunks);
if (s->offsets)
g_free(s->offsets);
dprintf("Close\n");
}
static int64_t dictzip_getlength(BlockDriverState *bs)
{
BDRVDictZipState *s = bs->opaque;
dprintf("getlength -> %ld\n", s->file_len);
return s->file_len;
}
static BlockDriver bdrv_dictzip = {
.format_name = "dzip",
.protocol_name = "dzip",
.instance_size = sizeof(BDRVDictZipState),
.bdrv_file_open = dictzip_open,
.bdrv_close = dictzip_close,
.bdrv_getlength = dictzip_getlength,
.bdrv_probe = dictzip_probe,
.bdrv_aio_readv = dictzip_aio_readv,
};
static void dictzip_block_init(void)
{
bdrv_register(&bdrv_dictzip);
}
block_init(dictzip_block_init);

View File

@@ -27,6 +27,14 @@
#include "qemu/module.h"
#include <zlib.h>
enum {
/* Limit chunk sizes to prevent unreasonable amounts of memory being used
* or truncating when converting to 32-bit types
*/
DMG_LENGTHS_MAX = 64 * 1024 * 1024, /* 64 MB */
DMG_SECTORCOUNTS_MAX = DMG_LENGTHS_MAX / 512,
};
typedef struct BDRVDMGState {
CoMutex lock;
/* each chunk contains a certain number of sectors,
@@ -85,12 +93,43 @@ static int read_uint32(BlockDriverState *bs, int64_t offset, uint32_t *result)
return 0;
}
/* Increase max chunk sizes, if necessary. This function is used to calculate
* the buffer sizes needed for compressed/uncompressed chunk I/O.
*/
static void update_max_chunk_size(BDRVDMGState *s, uint32_t chunk,
uint32_t *max_compressed_size,
uint32_t *max_sectors_per_chunk)
{
uint32_t compressed_size = 0;
uint32_t uncompressed_sectors = 0;
switch (s->types[chunk]) {
case 0x80000005: /* zlib compressed */
compressed_size = s->lengths[chunk];
uncompressed_sectors = s->sectorcounts[chunk];
break;
case 1: /* copy */
uncompressed_sectors = (s->lengths[chunk] + 511) / 512;
break;
case 2: /* zero */
uncompressed_sectors = s->sectorcounts[chunk];
break;
}
if (compressed_size > *max_compressed_size) {
*max_compressed_size = compressed_size;
}
if (uncompressed_sectors > *max_sectors_per_chunk) {
*max_sectors_per_chunk = uncompressed_sectors;
}
}
static int dmg_open(BlockDriverState *bs, int flags)
{
BDRVDMGState *s = bs->opaque;
uint64_t info_begin,info_end,last_in_offset,last_out_offset;
uint64_t info_begin, info_end, last_in_offset, last_out_offset;
uint32_t count, tmp;
uint32_t max_compressed_size=1,max_sectors_per_chunk=1,i;
uint32_t max_compressed_size = 1, max_sectors_per_chunk = 1, i;
int64_t offset;
int ret;
@@ -152,37 +191,40 @@ static int dmg_open(BlockDriverState *bs, int flags)
goto fail;
}
if (type == 0x6d697368 && count >= 244) {
int new_size, chunk_count;
if (type == 0x6d697368 && count >= 244) {
size_t new_size;
uint32_t chunk_count;
offset += 4;
offset += 200;
chunk_count = (count-204)/40;
new_size = sizeof(uint64_t) * (s->n_chunks + chunk_count);
s->types = g_realloc(s->types, new_size/2);
s->offsets = g_realloc(s->offsets, new_size);
s->lengths = g_realloc(s->lengths, new_size);
s->sectors = g_realloc(s->sectors, new_size);
s->sectorcounts = g_realloc(s->sectorcounts, new_size);
chunk_count = (count - 204) / 40;
new_size = sizeof(uint64_t) * (s->n_chunks + chunk_count);
s->types = g_realloc(s->types, new_size / 2);
s->offsets = g_realloc(s->offsets, new_size);
s->lengths = g_realloc(s->lengths, new_size);
s->sectors = g_realloc(s->sectors, new_size);
s->sectorcounts = g_realloc(s->sectorcounts, new_size);
for (i = s->n_chunks; i < s->n_chunks + chunk_count; i++) {
ret = read_uint32(bs, offset, &s->types[i]);
if (ret < 0) {
goto fail;
}
offset += 4;
if(s->types[i]!=0x80000005 && s->types[i]!=1 && s->types[i]!=2) {
if(s->types[i]==0xffffffff) {
last_in_offset = s->offsets[i-1]+s->lengths[i-1];
last_out_offset = s->sectors[i-1]+s->sectorcounts[i-1];
}
chunk_count--;
i--;
offset += 36;
continue;
}
offset += 4;
offset += 4;
if (s->types[i] != 0x80000005 && s->types[i] != 1 &&
s->types[i] != 2) {
if (s->types[i] == 0xffffffff && i > 0) {
last_in_offset = s->offsets[i - 1] + s->lengths[i - 1];
last_out_offset = s->sectors[i - 1] +
s->sectorcounts[i - 1];
}
chunk_count--;
i--;
offset += 36;
continue;
}
offset += 4;
ret = read_uint64(bs, offset, &s->sectors[i]);
if (ret < 0) {
@@ -197,6 +239,14 @@ static int dmg_open(BlockDriverState *bs, int flags)
}
offset += 8;
if (s->sectorcounts[i] > DMG_SECTORCOUNTS_MAX) {
error_report("sector count %" PRIu64 " for chunk %u is "
"larger than max (%u)",
s->sectorcounts[i], i, DMG_SECTORCOUNTS_MAX);
ret = -EINVAL;
goto fail;
}
ret = read_uint64(bs, offset, &s->offsets[i]);
if (ret < 0) {
goto fail;
@@ -210,19 +260,25 @@ static int dmg_open(BlockDriverState *bs, int flags)
}
offset += 8;
if(s->lengths[i]>max_compressed_size)
max_compressed_size = s->lengths[i];
if(s->sectorcounts[i]>max_sectors_per_chunk)
max_sectors_per_chunk = s->sectorcounts[i];
}
s->n_chunks+=chunk_count;
}
if (s->lengths[i] > DMG_LENGTHS_MAX) {
error_report("length %" PRIu64 " for chunk %u is larger "
"than max (%u)",
s->lengths[i], i, DMG_LENGTHS_MAX);
ret = -EINVAL;
goto fail;
}
update_max_chunk_size(s, i, &max_compressed_size,
&max_sectors_per_chunk);
}
s->n_chunks += chunk_count;
}
}
/* initialize zlib engine */
s->compressed_chunk = g_malloc(max_compressed_size+1);
s->uncompressed_chunk = g_malloc(512*max_sectors_per_chunk);
if(inflateInit(&s->zstream) != Z_OK) {
s->compressed_chunk = g_malloc(max_compressed_size + 1);
s->uncompressed_chunk = g_malloc(512 * max_sectors_per_chunk);
if (inflateInit(&s->zstream) != Z_OK) {
ret = -EINVAL;
goto fail;
}
@@ -244,83 +300,82 @@ fail:
}
static inline int is_sector_in_chunk(BDRVDMGState* s,
uint32_t chunk_num,int sector_num)
uint32_t chunk_num, uint64_t sector_num)
{
if(chunk_num>=s->n_chunks || s->sectors[chunk_num]>sector_num ||
s->sectors[chunk_num]+s->sectorcounts[chunk_num]<=sector_num)
return 0;
else
return -1;
if (chunk_num >= s->n_chunks || s->sectors[chunk_num] > sector_num ||
s->sectors[chunk_num] + s->sectorcounts[chunk_num] <= sector_num) {
return 0;
} else {
return -1;
}
}
static inline uint32_t search_chunk(BDRVDMGState* s,int sector_num)
static inline uint32_t search_chunk(BDRVDMGState *s, uint64_t sector_num)
{
/* binary search */
uint32_t chunk1=0,chunk2=s->n_chunks,chunk3;
while(chunk1!=chunk2) {
chunk3 = (chunk1+chunk2)/2;
if(s->sectors[chunk3]>sector_num)
chunk2 = chunk3;
else if(s->sectors[chunk3]+s->sectorcounts[chunk3]>sector_num)
return chunk3;
else
chunk1 = chunk3;
uint32_t chunk1 = 0, chunk2 = s->n_chunks, chunk3;
while (chunk1 != chunk2) {
chunk3 = (chunk1 + chunk2) / 2;
if (s->sectors[chunk3] > sector_num) {
chunk2 = chunk3;
} else if (s->sectors[chunk3] + s->sectorcounts[chunk3] > sector_num) {
return chunk3;
} else {
chunk1 = chunk3;
}
}
return s->n_chunks; /* error */
}
static inline int dmg_read_chunk(BlockDriverState *bs, int sector_num)
static inline int dmg_read_chunk(BlockDriverState *bs, uint64_t sector_num)
{
BDRVDMGState *s = bs->opaque;
if(!is_sector_in_chunk(s,s->current_chunk,sector_num)) {
int ret;
uint32_t chunk = search_chunk(s,sector_num);
if (!is_sector_in_chunk(s, s->current_chunk, sector_num)) {
int ret;
uint32_t chunk = search_chunk(s, sector_num);
if(chunk>=s->n_chunks)
return -1;
if (chunk >= s->n_chunks) {
return -1;
}
s->current_chunk = s->n_chunks;
switch(s->types[chunk]) {
case 0x80000005: { /* zlib compressed */
int i;
s->current_chunk = s->n_chunks;
switch (s->types[chunk]) {
case 0x80000005: { /* zlib compressed */
/* we need to buffer, because only the chunk as whole can be
* inflated. */
ret = bdrv_pread(bs->file, s->offsets[chunk],
s->compressed_chunk, s->lengths[chunk]);
if (ret != s->lengths[chunk]) {
return -1;
}
/* we need to buffer, because only the chunk as whole can be
* inflated. */
i=0;
do {
ret = bdrv_pread(bs->file, s->offsets[chunk] + i,
s->compressed_chunk+i, s->lengths[chunk]-i);
if(ret<0 && errno==EINTR)
ret=0;
i+=ret;
} while(ret>=0 && ret+i<s->lengths[chunk]);
if (ret != s->lengths[chunk])
return -1;
s->zstream.next_in = s->compressed_chunk;
s->zstream.avail_in = s->lengths[chunk];
s->zstream.next_out = s->uncompressed_chunk;
s->zstream.avail_out = 512*s->sectorcounts[chunk];
ret = inflateReset(&s->zstream);
if(ret != Z_OK)
return -1;
ret = inflate(&s->zstream, Z_FINISH);
if(ret != Z_STREAM_END || s->zstream.total_out != 512*s->sectorcounts[chunk])
return -1;
break; }
case 1: /* copy */
ret = bdrv_pread(bs->file, s->offsets[chunk],
s->zstream.next_in = s->compressed_chunk;
s->zstream.avail_in = s->lengths[chunk];
s->zstream.next_out = s->uncompressed_chunk;
s->zstream.avail_out = 512 * s->sectorcounts[chunk];
ret = inflateReset(&s->zstream);
if (ret != Z_OK) {
return -1;
}
ret = inflate(&s->zstream, Z_FINISH);
if (ret != Z_STREAM_END ||
s->zstream.total_out != 512 * s->sectorcounts[chunk]) {
return -1;
}
break; }
case 1: /* copy */
ret = bdrv_pread(bs->file, s->offsets[chunk],
s->uncompressed_chunk, s->lengths[chunk]);
if (ret != s->lengths[chunk])
return -1;
break;
case 2: /* zero */
memset(s->uncompressed_chunk, 0, 512*s->sectorcounts[chunk]);
break;
}
s->current_chunk = chunk;
if (ret != s->lengths[chunk]) {
return -1;
}
break;
case 2: /* zero */
memset(s->uncompressed_chunk, 0, 512 * s->sectorcounts[chunk]);
break;
}
s->current_chunk = chunk;
}
return 0;
}
@@ -331,12 +386,14 @@ static int dmg_read(BlockDriverState *bs, int64_t sector_num,
BDRVDMGState *s = bs->opaque;
int i;
for(i=0;i<nb_sectors;i++) {
uint32_t sector_offset_in_chunk;
if(dmg_read_chunk(bs, sector_num+i) != 0)
return -1;
sector_offset_in_chunk = sector_num+i-s->sectors[s->current_chunk];
memcpy(buf+i*512,s->uncompressed_chunk+sector_offset_in_chunk*512,512);
for (i = 0; i < nb_sectors; i++) {
uint32_t sector_offset_in_chunk;
if (dmg_read_chunk(bs, sector_num + i) != 0) {
return -1;
}
sector_offset_in_chunk = sector_num + i - s->sectors[s->current_chunk];
memcpy(buf + i * 512,
s->uncompressed_chunk + sector_offset_in_chunk * 512, 512);
}
return 0;
}
@@ -368,12 +425,12 @@ static void dmg_close(BlockDriverState *bs)
}
static BlockDriver bdrv_dmg = {
.format_name = "dmg",
.instance_size = sizeof(BDRVDMGState),
.bdrv_probe = dmg_probe,
.bdrv_open = dmg_open,
.bdrv_read = dmg_co_read,
.bdrv_close = dmg_close,
.format_name = "dmg",
.instance_size = sizeof(BDRVDMGState),
.bdrv_probe = dmg_probe,
.bdrv_open = dmg_open,
.bdrv_read = dmg_co_read,
.bdrv_close = dmg_close,
};
static void bdrv_dmg_init(void)

View File

@@ -602,6 +602,13 @@ static BlockDriverAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
acb->buf = NULL;
acb->ioh = buf;
if (acb->ioh->cmd_len > SCSI_CDB_MAX_SIZE) {
error_report("iSCSI: ioctl error CDB exceeds max size (%d > %d)",
acb->ioh->cmd_len, SCSI_CDB_MAX_SIZE);
qemu_aio_unref(acb);
return NULL;
}
acb->task = malloc(sizeof(struct scsi_task));
if (acb->task == NULL) {
error_report("iSCSI: Failed to allocate task for scsi command. %s",

View File

@@ -274,7 +274,7 @@ static int nbd_co_send_request(BDRVNBDState *s, struct nbd_request *request,
ret = qemu_co_sendv(s->sock, qiov->iov, qiov->niov,
offset, request->len);
if (ret != request->len) {
return -EIO;
rc = -EIO;
}
}
qemu_aio_set_fd_handler(s->sock, nbd_reply_ready, NULL,
@@ -350,7 +350,7 @@ static int nbd_establish_connection(BlockDriverState *bs)
/* Now that we're connected, set the socket to be non-blocking and
* kick the reply mechanism. */
socket_set_nonblock(sock);
qemu_set_nonblock(sock);
qemu_aio_set_fd_handler(sock, nbd_reply_ready, NULL,
nbd_have_request, s);

View File

@@ -49,9 +49,9 @@ typedef struct BDRVParallelsState {
CoMutex lock;
uint32_t *catalog_bitmap;
int catalog_size;
unsigned int catalog_size;
int tracks;
unsigned int tracks;
} BDRVParallelsState;
static int parallels_probe(const uint8_t *buf, int buf_size, const char *filename)
@@ -91,8 +91,19 @@ static int parallels_open(BlockDriverState *bs, int flags)
bs->total_sectors = le32_to_cpu(ph.nb_sectors);
s->tracks = le32_to_cpu(ph.tracks);
if (s->tracks == 0) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"Invalid image: Zero sectors per track");
ret = -EINVAL;
goto fail;
}
s->catalog_size = le32_to_cpu(ph.catalog_entries);
if (s->catalog_size > INT_MAX / 4) {
qerror_report(ERROR_CLASS_GENERIC_ERROR, "Catalog too large");
ret = -EFBIG;
goto fail;
}
s->catalog_bitmap = g_malloc(s->catalog_size * 4);
ret = bdrv_pread(bs->file, 64, s->catalog_bitmap, s->catalog_size * 4);

View File

@@ -60,7 +60,7 @@ typedef struct BDRVQcowState {
int cluster_sectors;
int l2_bits;
int l2_size;
int l1_size;
unsigned int l1_size;
uint64_t cluster_offset_mask;
uint64_t l1_table_offset;
uint64_t *l1_table;
@@ -124,10 +124,28 @@ static int qcow_open(BlockDriverState *bs, int flags)
goto fail;
}
if (header.size <= 1 || header.cluster_bits < 9) {
if (header.size <= 1) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"Image size is too small (must be at least 2 bytes)");
ret = -EINVAL;
goto fail;
}
if (header.cluster_bits < 9 || header.cluster_bits > 16) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"Cluster size must be between 512 and 64k");
ret = -EINVAL;
goto fail;
}
/* l2_bits specifies number of entries; storing a uint64_t in each entry,
* so bytes = num_entries << 3. */
if (header.l2_bits < 9 - 3 || header.l2_bits > 16 - 3) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"L2 table size must be between 512 and 64k");
ret = -EINVAL;
goto fail;
}
if (header.crypt_method > QCOW_CRYPT_AES) {
ret = -EINVAL;
goto fail;
@@ -146,7 +164,19 @@ static int qcow_open(BlockDriverState *bs, int flags)
/* read the level 1 table */
shift = s->cluster_bits + s->l2_bits;
s->l1_size = (header.size + (1LL << shift) - 1) >> shift;
if (header.size > UINT64_MAX - (1LL << shift)) {
qerror_report(ERROR_CLASS_GENERIC_ERROR, "Image too large");
ret = -EINVAL;
goto fail;
} else {
uint64_t l1_size = (header.size + (1LL << shift) - 1) >> shift;
if (l1_size > INT_MAX / sizeof(uint64_t)) {
qerror_report(ERROR_CLASS_GENERIC_ERROR, "Image too large");
ret = -EINVAL;
goto fail;
}
s->l1_size = l1_size;
}
s->l1_table_offset = header.l1_table_offset;
s->l1_table = g_malloc(s->l1_size * sizeof(uint64_t));

View File

@@ -29,12 +29,13 @@
#include "block/qcow2.h"
#include "trace.h"
int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size)
int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
bool exact_size)
{
BDRVQcowState *s = bs->opaque;
int new_l1_size, new_l1_size2, ret, i;
int new_l1_size2, ret, i;
uint64_t *new_l1_table;
int64_t new_l1_table_offset;
int64_t new_l1_table_offset, new_l1_size;
uint8_t data[12];
if (min_size <= s->l1_size)
@@ -53,8 +54,13 @@ int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size)
}
}
if (new_l1_size > INT_MAX / sizeof(uint64_t)) {
return -EFBIG;
}
#ifdef DEBUG_ALLOC2
fprintf(stderr, "grow l1_table from %d to %d\n", s->l1_size, new_l1_size);
fprintf(stderr, "grow l1_table from %d to %" PRId64 "\n",
s->l1_size, new_l1_size);
#endif
new_l1_size2 = sizeof(uint64_t) * new_l1_size;
@@ -324,15 +330,6 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
struct iovec iov;
int n, ret;
/*
* If this is the last cluster and it is only partially used, we must only
* copy until the end of the image, or bdrv_check_request will fail for the
* bdrv_read/write calls below.
*/
if (start_sect + n_end > bs->total_sectors) {
n_end = bs->total_sectors - start_sect;
}
n = n_end - n_start;
if (n <= 0) {
return 0;
@@ -391,8 +388,8 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
int *num, uint64_t *cluster_offset)
{
BDRVQcowState *s = bs->opaque;
unsigned int l1_index, l2_index;
uint64_t l2_offset, *l2_table;
unsigned int l2_index;
uint64_t l1_index, l2_offset, *l2_table;
int l1_bits, c;
unsigned int index_in_cluster, nb_clusters;
uint64_t nb_available, nb_needed;
@@ -454,6 +451,9 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
*cluster_offset &= L2E_COMPRESSED_OFFSET_SIZE_MASK;
break;
case QCOW2_CLUSTER_ZERO:
if (s->qcow_version < 3) {
return -EIO;
}
c = count_contiguous_clusters(nb_clusters, s->cluster_size,
&l2_table[l2_index], 0,
QCOW_OFLAG_COMPRESSED | QCOW_OFLAG_ZERO);
@@ -504,8 +504,8 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
int *new_l2_index)
{
BDRVQcowState *s = bs->opaque;
unsigned int l1_index, l2_index;
uint64_t l2_offset;
unsigned int l2_index;
uint64_t l1_index, l2_offset;
uint64_t *l2_table = NULL;
int ret;
@@ -519,6 +519,7 @@ static int get_cluster_table(BlockDriverState *bs, uint64_t offset,
}
}
assert(l1_index < s->l1_size);
l2_offset = s->l1_table[l1_index] & L1E_OFFSET_MASK;
/* seek the l2 table of the given l2 offset */

View File

@@ -26,7 +26,7 @@
#include "block/block_int.h"
#include "block/qcow2.h"
static int64_t alloc_clusters_noref(BlockDriverState *bs, int64_t size);
static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size);
static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
int64_t offset, int64_t length,
int addend);
@@ -38,8 +38,10 @@ static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
int qcow2_refcount_init(BlockDriverState *bs)
{
BDRVQcowState *s = bs->opaque;
int ret, refcount_table_size2, i;
unsigned int refcount_table_size2, i;
int ret;
assert(s->refcount_table_size <= INT_MAX / sizeof(uint64_t));
refcount_table_size2 = s->refcount_table_size * sizeof(uint64_t);
s->refcount_table = g_malloc(refcount_table_size2);
if (s->refcount_table_size > 0) {
@@ -85,7 +87,7 @@ static int load_refcount_block(BlockDriverState *bs,
static int get_refcount(BlockDriverState *bs, int64_t cluster_index)
{
BDRVQcowState *s = bs->opaque;
int refcount_table_index, block_index;
uint64_t refcount_table_index, block_index;
int64_t refcount_block_offset;
int ret;
uint16_t *refcount_block;
@@ -189,10 +191,11 @@ static int alloc_refcount_block(BlockDriverState *bs,
* they can describe them themselves.
*
* - We need to consider that at this point we are inside update_refcounts
* and doing the initial refcount increase. This means that some clusters
* have already been allocated by the caller, but their refcount isn't
* accurate yet. free_cluster_index tells us where this allocation ends
* as long as we don't overwrite it by freeing clusters.
* and potentially doing an initial refcount increase. This means that
* some clusters have already been allocated by the caller, but their
* refcount isn't accurate yet. If we allocate clusters for metadata, we
* need to return -EAGAIN to signal the caller that it needs to restart
* the search for free clusters.
*
* - alloc_clusters_noref and qcow2_free_clusters may load a different
* refcount block into the cache
@@ -201,7 +204,10 @@ static int alloc_refcount_block(BlockDriverState *bs,
*refcount_block = NULL;
/* We write to the refcount table, so we might depend on L2 tables */
qcow2_cache_flush(bs, s->l2_table_cache);
ret = qcow2_cache_flush(bs, s->l2_table_cache);
if (ret < 0) {
return ret;
}
/* Allocate the refcount block itself and mark it as used */
int64_t new_block = alloc_clusters_noref(bs, s->cluster_size);
@@ -237,7 +243,10 @@ static int alloc_refcount_block(BlockDriverState *bs,
goto fail_block;
}
bdrv_flush(bs->file);
ret = qcow2_cache_flush(bs, s->refcount_block_cache);
if (ret < 0) {
goto fail_block;
}
/* Initialize the new refcount block only after updating its refcount,
* update_refcount uses the refcount cache itself */
@@ -270,7 +279,10 @@ static int alloc_refcount_block(BlockDriverState *bs,
}
s->refcount_table[refcount_table_index] = new_block;
return 0;
/* The new refcount block may be where the caller intended to put its
* data, so let it restart the search. */
return -EAGAIN;
}
ret = qcow2_cache_put(bs, s->refcount_block_cache, (void**) refcount_block);
@@ -293,8 +305,11 @@ static int alloc_refcount_block(BlockDriverState *bs,
/* Calculate the number of refcount blocks needed so far */
uint64_t refcount_block_clusters = 1 << (s->cluster_bits - REFCOUNT_SHIFT);
uint64_t blocks_used = (s->free_cluster_index +
refcount_block_clusters - 1) / refcount_block_clusters;
uint64_t blocks_used = DIV_ROUND_UP(cluster_index, refcount_block_clusters);
if (blocks_used > QCOW_MAX_REFTABLE_SIZE / sizeof(uint64_t)) {
return -EFBIG;
}
/* And now we need at least one block more for the new metadata */
uint64_t table_size = next_refcount_table_size(s, blocks_used + 1);
@@ -327,8 +342,6 @@ static int alloc_refcount_block(BlockDriverState *bs,
uint16_t *new_blocks = g_malloc0(blocks_clusters * s->cluster_size);
uint64_t *new_table = g_malloc0(table_size * sizeof(uint64_t));
assert(meta_offset >= (s->free_cluster_index * s->cluster_size));
/* Fill the new refcount table */
memcpy(new_table, s->refcount_table,
s->refcount_table_size * sizeof(uint64_t));
@@ -391,17 +404,18 @@ static int alloc_refcount_block(BlockDriverState *bs,
s->refcount_table_size = table_size;
s->refcount_table_offset = table_offset;
/* Free old table. Remember, we must not change free_cluster_index */
uint64_t old_free_cluster_index = s->free_cluster_index;
/* Free old table. */
qcow2_free_clusters(bs, old_table_offset, old_table_size * sizeof(uint64_t));
s->free_cluster_index = old_free_cluster_index;
ret = load_refcount_block(bs, new_block, (void**) refcount_block);
if (ret < 0) {
return ret;
}
return 0;
/* If we were trying to do the initial refcount update for some cluster
* allocation, we might have used the same clusters to store newly
* allocated metadata. Make the caller search some new space. */
return -EAGAIN;
fail_table:
g_free(new_table);
@@ -539,15 +553,16 @@ static int update_cluster_refcount(BlockDriverState *bs,
/* return < 0 if error */
static int64_t alloc_clusters_noref(BlockDriverState *bs, int64_t size)
static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size)
{
BDRVQcowState *s = bs->opaque;
int i, nb_clusters, refcount;
uint64_t i, nb_clusters;
int refcount;
nb_clusters = size_to_clusters(s, size);
retry:
for(i = 0; i < nb_clusters; i++) {
int64_t next_cluster_index = s->free_cluster_index++;
uint64_t next_cluster_index = s->free_cluster_index++;
refcount = get_refcount(bs, next_cluster_index);
if (refcount < 0) {
@@ -564,18 +579,21 @@ retry:
return (s->free_cluster_index - nb_clusters) << s->cluster_bits;
}
int64_t qcow2_alloc_clusters(BlockDriverState *bs, int64_t size)
int64_t qcow2_alloc_clusters(BlockDriverState *bs, uint64_t size)
{
int64_t offset;
int ret;
BLKDBG_EVENT(bs->file, BLKDBG_CLUSTER_ALLOC);
offset = alloc_clusters_noref(bs, size);
if (offset < 0) {
return offset;
}
do {
offset = alloc_clusters_noref(bs, size);
if (offset < 0) {
return offset;
}
ret = update_refcount(bs, offset, size, 1);
} while (ret == -EAGAIN);
ret = update_refcount(bs, offset, size, 1);
if (ret < 0) {
return ret;
}
@@ -588,32 +606,29 @@ int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,
{
BDRVQcowState *s = bs->opaque;
uint64_t cluster_index;
uint64_t old_free_cluster_index;
int i, refcount, ret;
/* Check how many clusters there are free */
cluster_index = offset >> s->cluster_bits;
for(i = 0; i < nb_clusters; i++) {
refcount = get_refcount(bs, cluster_index++);
do {
/* Check how many clusters there are free */
cluster_index = offset >> s->cluster_bits;
for(i = 0; i < nb_clusters; i++) {
refcount = get_refcount(bs, cluster_index++);
if (refcount < 0) {
return refcount;
} else if (refcount != 0) {
break;
if (refcount < 0) {
return refcount;
} else if (refcount != 0) {
break;
}
}
}
/* And then allocate them */
old_free_cluster_index = s->free_cluster_index;
s->free_cluster_index = cluster_index + i;
/* And then allocate them */
ret = update_refcount(bs, offset, i << s->cluster_bits, 1);
} while (ret == -EAGAIN);
ret = update_refcount(bs, offset, i << s->cluster_bits, 1);
if (ret < 0) {
return ret;
}
s->free_cluster_index = old_free_cluster_index;
return i;
}
@@ -884,8 +899,7 @@ static void inc_refcounts(BlockDriverState *bs,
int64_t offset, int64_t size)
{
BDRVQcowState *s = bs->opaque;
int64_t start, last, cluster_offset;
int k;
uint64_t start, last, cluster_offset, k;
if (size <= 0)
return;
@@ -895,11 +909,7 @@ static void inc_refcounts(BlockDriverState *bs,
for(cluster_offset = start; cluster_offset <= last;
cluster_offset += s->cluster_size) {
k = cluster_offset >> s->cluster_bits;
if (k < 0) {
fprintf(stderr, "ERROR: invalid cluster offset=0x%" PRIx64 "\n",
cluster_offset);
res->corruptions++;
} else if (k >= refcount_table_size) {
if (k >= refcount_table_size) {
fprintf(stderr, "Warning: cluster offset=0x%" PRIx64 " is after "
"the end of the image file, can't properly check refcounts.\n",
cluster_offset);
@@ -1112,14 +1122,19 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
BdrvCheckMode fix)
{
BDRVQcowState *s = bs->opaque;
int64_t size, i;
int nb_clusters, refcount1, refcount2;
int64_t size, i, nb_clusters;
int refcount1, refcount2;
QCowSnapshot *sn;
uint16_t *refcount_table;
int ret;
size = bdrv_getlength(bs->file);
nb_clusters = size_to_clusters(s, size);
if (nb_clusters > INT_MAX) {
res->check_errors++;
return -EFBIG;
}
refcount_table = g_malloc0(nb_clusters * sizeof(uint16_t));
/* header */

View File

@@ -26,31 +26,6 @@
#include "block/block_int.h"
#include "block/qcow2.h"
typedef struct QEMU_PACKED QCowSnapshotHeader {
/* header is 8 byte aligned */
uint64_t l1_table_offset;
uint32_t l1_size;
uint16_t id_str_size;
uint16_t name_size;
uint32_t date_sec;
uint32_t date_nsec;
uint64_t vm_clock_nsec;
uint32_t vm_state_size;
uint32_t extra_data_size; /* for extension */
/* extra data follows */
/* id_str follows */
/* name follows */
} QCowSnapshotHeader;
typedef struct QEMU_PACKED QCowSnapshotExtraData {
uint64_t vm_state_size_large;
uint64_t disk_size;
} QCowSnapshotExtraData;
void qcow2_free_snapshots(BlockDriverState *bs)
{
BDRVQcowState *s = bs->opaque;
@@ -141,8 +116,14 @@ int qcow2_read_snapshots(BlockDriverState *bs)
}
offset += name_size;
sn->name[name_size] = '\0';
if (offset - s->snapshots_offset > QCOW_MAX_SNAPSHOTS_SIZE) {
ret = -EFBIG;
goto fail;
}
}
assert(offset - s->snapshots_offset <= INT_MAX);
s->snapshots_size = offset - s->snapshots_offset;
return 0;
@@ -163,7 +144,7 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
uint32_t nb_snapshots;
uint64_t snapshots_offset;
} QEMU_PACKED header_data;
int64_t offset, snapshots_offset;
int64_t offset, snapshots_offset = 0;
int ret;
/* compute the size of the snapshots */
@@ -175,16 +156,26 @@ static int qcow2_write_snapshots(BlockDriverState *bs)
offset += sizeof(extra);
offset += strlen(sn->id_str);
offset += strlen(sn->name);
if (offset > QCOW_MAX_SNAPSHOTS_SIZE) {
ret = -EFBIG;
goto fail;
}
}
assert(offset <= INT_MAX);
snapshots_size = offset;
/* Allocate space for the new snapshot list */
snapshots_offset = qcow2_alloc_clusters(bs, snapshots_size);
bdrv_flush(bs->file);
offset = snapshots_offset;
if (offset < 0) {
return offset;
}
ret = bdrv_flush(bs);
if (ret < 0) {
return ret;
}
/* Write all snapshots to the new list */
for(i = 0; i < s->nb_snapshots; i++) {
@@ -322,6 +313,10 @@ int qcow2_snapshot_create(BlockDriverState *bs, QEMUSnapshotInfo *sn_info)
uint64_t *l1_table = NULL;
int64_t l1_table_offset;
if (s->nb_snapshots >= QCOW_MAX_SNAPSHOTS) {
return -EFBIG;
}
memset(sn, 0, sizeof(*sn));
/* Generate an ID if it wasn't passed */
@@ -636,7 +631,11 @@ int qcow2_snapshot_load_tmp(BlockDriverState *bs, const char *snapshot_name)
sn = &s->snapshots[snapshot_index];
/* Allocate and read in the snapshot's L1 table */
new_l1_bytes = s->l1_size * sizeof(uint64_t);
if (sn->l1_size > QCOW_MAX_L1_SIZE) {
error_report("Snapshot L1 table too large");
return -EFBIG;
}
new_l1_bytes = sn->l1_size * sizeof(uint64_t);
new_l1_table = g_malloc0(align_offset(new_l1_bytes, 512));
ret = bdrv_pread(bs->file, sn->l1_table_offset, new_l1_table, new_l1_bytes);

View File

@@ -285,12 +285,40 @@ static int qcow2_check(BlockDriverState *bs, BdrvCheckResult *result,
return ret;
}
static int validate_table_offset(BlockDriverState *bs, uint64_t offset,
uint64_t entries, size_t entry_len)
{
BDRVQcowState *s = bs->opaque;
uint64_t size;
/* Use signed INT64_MAX as the maximum even for uint64_t header fields,
* because values will be passed to qemu functions taking int64_t. */
if (entries > INT64_MAX / entry_len) {
return -EINVAL;
}
size = entries * entry_len;
if (INT64_MAX - size < offset) {
return -EINVAL;
}
/* Tables must be cluster aligned */
if (offset & (s->cluster_size - 1)) {
return -EINVAL;
}
return 0;
}
static int qcow2_open(BlockDriverState *bs, int flags)
{
BDRVQcowState *s = bs->opaque;
int len, i, ret = 0;
unsigned int len, i;
int ret = 0;
QCowHeader header;
uint64_t ext_end;
uint64_t l1_vm_state_index;
ret = bdrv_pread(bs->file, 0, &header, sizeof(header));
if (ret < 0) {
@@ -322,6 +350,19 @@ static int qcow2_open(BlockDriverState *bs, int flags)
s->qcow_version = header.version;
/* Initialise cluster size */
if (header.cluster_bits < MIN_CLUSTER_BITS ||
header.cluster_bits > MAX_CLUSTER_BITS) {
report_unsupported(bs, "Unsupported cluster size: 2^%i",
header.cluster_bits);
ret = -EINVAL;
goto fail;
}
s->cluster_bits = header.cluster_bits;
s->cluster_size = 1 << s->cluster_bits;
s->cluster_sectors = 1 << (s->cluster_bits - 9);
/* Initialise version 3 header fields */
if (header.version == 2) {
header.incompatible_features = 0;
@@ -335,6 +376,18 @@ static int qcow2_open(BlockDriverState *bs, int flags)
be64_to_cpus(&header.autoclear_features);
be32_to_cpus(&header.refcount_order);
be32_to_cpus(&header.header_length);
if (header.header_length < 104) {
report_unsupported(bs, "qcow2 header too short");
ret = -EINVAL;
goto fail;
}
}
if (header.header_length > s->cluster_size) {
report_unsupported(bs, "qcow2 header exceeds cluster size");
ret = -EINVAL;
goto fail;
}
if (header.header_length > sizeof(header)) {
@@ -347,6 +400,12 @@ static int qcow2_open(BlockDriverState *bs, int flags)
}
}
if (header.backing_file_offset > s->cluster_size) {
report_unsupported(bs, "Invalid backing file offset");
ret = -EINVAL;
goto fail;
}
if (header.backing_file_offset) {
ext_end = header.backing_file_offset;
} else {
@@ -377,11 +436,6 @@ static int qcow2_open(BlockDriverState *bs, int flags)
goto fail;
}
if (header.cluster_bits < MIN_CLUSTER_BITS ||
header.cluster_bits > MAX_CLUSTER_BITS) {
ret = -EINVAL;
goto fail;
}
if (header.crypt_method > QCOW_CRYPT_AES) {
ret = -EINVAL;
goto fail;
@@ -390,32 +444,77 @@ static int qcow2_open(BlockDriverState *bs, int flags)
if (s->crypt_method_header) {
bs->encrypted = 1;
}
s->cluster_bits = header.cluster_bits;
s->cluster_size = 1 << s->cluster_bits;
s->cluster_sectors = 1 << (s->cluster_bits - 9);
s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */
s->l2_size = 1 << s->l2_bits;
bs->total_sectors = header.size / 512;
s->csize_shift = (62 - (s->cluster_bits - 8));
s->csize_mask = (1 << (s->cluster_bits - 8)) - 1;
s->cluster_offset_mask = (1LL << s->csize_shift) - 1;
s->refcount_table_offset = header.refcount_table_offset;
s->refcount_table_size =
header.refcount_table_clusters << (s->cluster_bits - 3);
s->snapshots_offset = header.snapshots_offset;
s->nb_snapshots = header.nb_snapshots;
if (header.refcount_table_clusters > qcow2_max_refcount_clusters(s)) {
report_unsupported(bs, "Reference count table too large");
ret = -EINVAL;
goto fail;
}
ret = validate_table_offset(bs, s->refcount_table_offset,
s->refcount_table_size, sizeof(uint64_t));
if (ret < 0) {
report_unsupported(bs, "Invalid reference count table offset");
goto fail;
}
/* Snapshot table offset/length */
if (header.nb_snapshots > QCOW_MAX_SNAPSHOTS) {
report_unsupported(bs, "Too many snapshots");
ret = -EINVAL;
goto fail;
}
ret = validate_table_offset(bs, header.snapshots_offset,
header.nb_snapshots,
sizeof(QCowSnapshotHeader));
if (ret < 0) {
report_unsupported(bs, "Invalid snapshot table offset");
goto fail;
}
/* read the level 1 table */
if (header.l1_size > QCOW_MAX_L1_SIZE) {
report_unsupported(bs, "Active L1 table too large");
ret = -EFBIG;
goto fail;
}
s->l1_size = header.l1_size;
s->l1_vm_state_index = size_to_l1(s, header.size);
l1_vm_state_index = size_to_l1(s, header.size);
if (l1_vm_state_index > INT_MAX) {
ret = -EFBIG;
goto fail;
}
s->l1_vm_state_index = l1_vm_state_index;
/* the L1 table must contain at least enough entries to put
header.size bytes */
if (s->l1_size < s->l1_vm_state_index) {
ret = -EINVAL;
goto fail;
}
ret = validate_table_offset(bs, header.l1_table_offset,
header.l1_size, sizeof(uint64_t));
if (ret < 0) {
report_unsupported(bs, "Invalid L1 table offset");
goto fail;
}
s->l1_table_offset = header.l1_table_offset;
if (s->l1_size > 0) {
s->l1_table = g_malloc0(
align_offset(s->l1_size * sizeof(uint64_t), 512));
@@ -456,8 +555,10 @@ static int qcow2_open(BlockDriverState *bs, int flags)
/* read the backing file name */
if (header.backing_file_offset != 0) {
len = header.backing_file_size;
if (len > 1023) {
len = 1023;
if (len > MIN(1023, s->cluster_size - header.backing_file_offset)) {
report_unsupported(bs, "Backing file name too long");
ret = -EINVAL;
goto fail;
}
ret = bdrv_pread(bs->file, header.backing_file_offset,
bs->backing_file, len);
@@ -467,6 +568,10 @@ static int qcow2_open(BlockDriverState *bs, int flags)
bs->backing_file[len] = '\0';
}
/* Internal snapshots */
s->snapshots_offset = header.snapshots_offset;
s->nb_snapshots = header.nb_snapshots;
ret = qcow2_read_snapshots(bs);
if (ret < 0) {
goto fail;
@@ -584,7 +689,7 @@ static int coroutine_fn qcow2_co_is_allocated(BlockDriverState *bs,
*pnum = 0;
}
return (cluster_offset != 0);
return (cluster_offset != 0) || (ret == QCOW2_CLUSTER_ZERO);
}
/* handle reading after the end of the backing file */
@@ -665,10 +770,6 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
break;
case QCOW2_CLUSTER_ZERO:
if (s->qcow_version < 3) {
ret = -EIO;
goto fail;
}
qemu_iovec_memset(&hd_qiov, 0, 0, 512 * cur_nr_sectors);
break;
@@ -1205,7 +1306,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
*/
BlockDriverState* bs;
QCowHeader header;
uint8_t* refcount_table;
uint64_t* refcount_table;
int ret;
ret = bdrv_create_file(filename, options);
@@ -1247,9 +1348,10 @@ static int qcow2_create2(const char *filename, int64_t total_size,
goto out;
}
/* Write an empty refcount table */
refcount_table = g_malloc0(cluster_size);
ret = bdrv_pwrite(bs, cluster_size, refcount_table, cluster_size);
/* Write a refcount table with one refcount block */
refcount_table = g_malloc0(2 * cluster_size);
refcount_table[0] = cpu_to_be64(2 * cluster_size);
ret = bdrv_pwrite(bs, cluster_size, refcount_table, 2 * cluster_size);
g_free(refcount_table);
if (ret < 0) {
@@ -1271,7 +1373,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
goto out;
}
ret = qcow2_alloc_clusters(bs, 2 * cluster_size);
ret = qcow2_alloc_clusters(bs, 3 * cluster_size);
if (ret < 0) {
goto out;
@@ -1433,7 +1535,8 @@ static coroutine_fn int qcow2_co_discard(BlockDriverState *bs,
static int qcow2_truncate(BlockDriverState *bs, int64_t offset)
{
BDRVQcowState *s = bs->opaque;
int ret, new_l1_size;
int64_t new_l1_size;
int ret;
if (offset & 511) {
error_report("The new size must be a multiple of 512");

View File

@@ -38,6 +38,19 @@
#define QCOW_CRYPT_AES 1
#define QCOW_MAX_CRYPT_CLUSTERS 32
#define QCOW_MAX_SNAPSHOTS 65536
/* 8 MB refcount table is enough for 2 PB images at 64k cluster size
* (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
#define QCOW_MAX_REFTABLE_SIZE 0x800000
/* 32 MB L1 table is enough for 2 PB images at 64k cluster size
* (128 GB for 512 byte clusters, 2 EB for 2 MB clusters) */
#define QCOW_MAX_L1_SIZE 0x2000000
/* Allow for an average of 1k per snapshot table entry, should be plenty of
* space for snapshot names and IDs */
#define QCOW_MAX_SNAPSHOTS_SIZE (1024 * QCOW_MAX_SNAPSHOTS)
/* indicate that the refcount of the referenced cluster is exactly one. */
#define QCOW_OFLAG_COPIED (1LL << 63)
@@ -82,6 +95,32 @@ typedef struct QCowHeader {
uint32_t header_length;
} QCowHeader;
typedef struct QEMU_PACKED QCowSnapshotHeader {
/* header is 8 byte aligned */
uint64_t l1_table_offset;
uint32_t l1_size;
uint16_t id_str_size;
uint16_t name_size;
uint32_t date_sec;
uint32_t date_nsec;
uint64_t vm_clock_nsec;
uint32_t vm_state_size;
uint32_t extra_data_size; /* for extension */
/* extra data follows */
/* id_str follows */
/* name follows */
} QCowSnapshotHeader;
typedef struct QEMU_PACKED QCowSnapshotExtraData {
uint64_t vm_state_size_large;
uint64_t disk_size;
} QCowSnapshotExtraData;
typedef struct QCowSnapshot {
uint64_t l1_table_offset;
uint32_t l1_size;
@@ -157,8 +196,8 @@ typedef struct BDRVQcowState {
uint64_t *refcount_table;
uint64_t refcount_table_offset;
uint32_t refcount_table_size;
int64_t free_cluster_index;
int64_t free_byte_offset;
uint64_t free_cluster_index;
uint64_t free_byte_offset;
CoMutex lock;
@@ -168,7 +207,7 @@ typedef struct BDRVQcowState {
AES_KEY aes_decrypt_key;
uint64_t snapshots_offset;
int snapshots_size;
int nb_snapshots;
unsigned int nb_snapshots;
QCowSnapshot *snapshots;
int flags;
@@ -267,7 +306,7 @@ static inline int size_to_clusters(BDRVQcowState *s, int64_t size)
return (size + (s->cluster_size - 1)) >> s->cluster_bits;
}
static inline int size_to_l1(BDRVQcowState *s, int64_t size)
static inline int64_t size_to_l1(BDRVQcowState *s, int64_t size)
{
int shift = s->cluster_bits + s->l2_bits;
return (size + (1ULL << shift) - 1) >> shift;
@@ -279,6 +318,11 @@ static inline int64_t align_offset(int64_t offset, int n)
return offset;
}
static inline uint64_t qcow2_max_refcount_clusters(BDRVQcowState *s)
{
return QCOW_MAX_REFTABLE_SIZE >> s->cluster_bits;
}
static inline int qcow2_get_cluster_type(uint64_t l2_entry)
{
if (l2_entry & QCOW_OFLAG_COMPRESSED) {
@@ -311,7 +355,7 @@ int qcow2_update_header(BlockDriverState *bs);
int qcow2_refcount_init(BlockDriverState *bs);
void qcow2_refcount_close(BlockDriverState *bs);
int64_t qcow2_alloc_clusters(BlockDriverState *bs, int64_t size);
int64_t qcow2_alloc_clusters(BlockDriverState *bs, uint64_t size);
int qcow2_alloc_clusters_at(BlockDriverState *bs, uint64_t offset,
int nb_clusters);
int64_t qcow2_alloc_bytes(BlockDriverState *bs, int size);
@@ -327,7 +371,8 @@ int qcow2_check_refcounts(BlockDriverState *bs, BdrvCheckResult *res,
BdrvCheckMode fix);
/* qcow2-cluster.c functions */
int qcow2_grow_l1_table(BlockDriverState *bs, int min_size, bool exact_size);
int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
bool exact_size);
void qcow2_l2_cache_reset(BlockDriverState *bs);
int qcow2_decompress_cluster(BlockDriverState *bs, uint64_t cluster_offset);
void qcow2_encrypt_sectors(BDRVQcowState *s, int64_t sector_num,

View File

@@ -142,6 +142,9 @@ typedef struct BDRVRawState {
bool is_xfs : 1;
#endif
bool has_discard : 1;
#ifdef CONFIG_FIEMAP
bool skip_fiemap;
#endif
} BDRVRawState;
typedef struct BDRVRawReopenState {
@@ -1035,6 +1038,79 @@ static int raw_create(const char *filename, QEMUOptionParameter *options)
return result;
}
static int try_fiemap(BlockDriverState *bs, off_t start, off_t *data,
off_t *hole, int nb_sectors, int *pnum)
{
#ifdef CONFIG_FIEMAP
BDRVRawState *s = bs->opaque;
struct {
struct fiemap fm;
struct fiemap_extent fe;
} f;
if (s->skip_fiemap) {
return 1;
}
f.fm.fm_start = start;
f.fm.fm_length = (int64_t)nb_sectors * BDRV_SECTOR_SIZE;
f.fm.fm_flags = FIEMAP_FLAG_SYNC;
f.fm.fm_extent_count = 1;
f.fm.fm_reserved = 0;
if (ioctl(s->fd, FS_IOC_FIEMAP, &f) == -1) {
/* Assume everything is allocated. */
s->skip_fiemap = true;
return 1;
}
if (f.fm.fm_mapped_extents == 0) {
/* No extents found, data is beyond f.fm.fm_start + f.fm.fm_length.
* f.fm.fm_start + f.fm.fm_length must be clamped to the file size!
*/
off_t length = lseek(s->fd, 0, SEEK_END);
*hole = f.fm.fm_start;
*data = MIN(f.fm.fm_start + f.fm.fm_length, length);
} else {
*data = f.fe.fe_logical;
*hole = f.fe.fe_logical + f.fe.fe_length;
}
return 0;
#else
return 1;
#endif
}
static int64_t try_seek_hole(BlockDriverState *bs, off_t start, off_t *data,
off_t *hole, int *pnum)
{
#if defined SEEK_HOLE && defined SEEK_DATA
BDRVRawState *s = bs->opaque;
*hole = lseek(s->fd, start, SEEK_HOLE);
if (*hole == -1) {
/* -ENXIO indicates that sector_num was past the end of the file.
* There is a virtual hole there. */
assert(errno != -ENXIO);
return 1;
}
if (*hole > start) {
*data = start;
} else {
/* On a hole. We need another syscall to find its end. */
*data = lseek(s->fd, start, SEEK_DATA);
if (*data == -1) {
*data = lseek(s->fd, 0, SEEK_END);
}
}
return 0;
#else
return 1;
#endif
}
/*
* Returns true iff the specified sector is present in the disk image. Drivers
* not implementing the functionality are assumed to not support backing files,
@@ -1054,7 +1130,7 @@ static int coroutine_fn raw_co_is_allocated(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors, int *pnum)
{
off_t start, data, hole;
off_t start, data = 0, hole = 0;
int ret;
ret = fd_open(bs);
@@ -1064,65 +1140,15 @@ static int coroutine_fn raw_co_is_allocated(BlockDriverState *bs,
start = sector_num * BDRV_SECTOR_SIZE;
#ifdef CONFIG_FIEMAP
BDRVRawState *s = bs->opaque;
struct {
struct fiemap fm;
struct fiemap_extent fe;
} f;
f.fm.fm_start = start;
f.fm.fm_length = (int64_t)nb_sectors * BDRV_SECTOR_SIZE;
f.fm.fm_flags = 0;
f.fm.fm_extent_count = 1;
f.fm.fm_reserved = 0;
if (ioctl(s->fd, FS_IOC_FIEMAP, &f) == -1) {
/* Assume everything is allocated. */
*pnum = nb_sectors;
return 1;
}
if (f.fm.fm_mapped_extents == 0) {
/* No extents found, data is beyond f.fm.fm_start + f.fm.fm_length.
* f.fm.fm_start + f.fm.fm_length must be clamped to the file size!
*/
off_t length = lseek(s->fd, 0, SEEK_END);
hole = f.fm.fm_start;
data = MIN(f.fm.fm_start + f.fm.fm_length, length);
} else {
data = f.fe.fe_logical;
hole = f.fe.fe_logical + f.fe.fe_length;
}
#elif defined SEEK_HOLE && defined SEEK_DATA
BDRVRawState *s = bs->opaque;
hole = lseek(s->fd, start, SEEK_HOLE);
if (hole == -1) {
/* -ENXIO indicates that sector_num was past the end of the file.
* There is a virtual hole there. */
assert(errno != -ENXIO);
/* Most likely EINVAL. Assume everything is allocated. */
*pnum = nb_sectors;
return 1;
}
if (hole > start) {
data = start;
} else {
/* On a hole. We need another syscall to find its end. */
data = lseek(s->fd, start, SEEK_DATA);
if (data == -1) {
data = lseek(s->fd, 0, SEEK_END);
ret = try_seek_hole(bs, start, &data, &hole, pnum);
if (ret) {
ret = try_fiemap(bs, start, &data, &hole, nb_sectors, pnum);
if (ret) {
/* Assume everything is allocated. */
data = 0;
hole = start + nb_sectors * BDRV_SECTOR_SIZE;
}
}
#else
*pnum = nb_sectors;
return 1;
#endif
if (data <= start) {
/* On a data extent, compute sectors to the end of the extent. */

View File

@@ -63,7 +63,8 @@
typedef enum {
RBD_AIO_READ,
RBD_AIO_WRITE,
RBD_AIO_DISCARD
RBD_AIO_DISCARD,
RBD_AIO_FLUSH
} RBDAIOCmd;
typedef struct RBDAIOCB {
@@ -379,8 +380,7 @@ static void qemu_rbd_complete_aio(RADOSCB *rcb)
r = rcb->ret;
if (acb->cmd == RBD_AIO_WRITE ||
acb->cmd == RBD_AIO_DISCARD) {
if (acb->cmd != RBD_AIO_READ) {
if (r < 0) {
acb->ret = r;
acb->error = 1;
@@ -658,6 +658,16 @@ static int rbd_aio_discard_wrapper(rbd_image_t image,
#endif
}
static int rbd_aio_flush_wrapper(rbd_image_t image,
rbd_completion_t comp)
{
#ifdef LIBRBD_SUPPORTS_AIO_FLUSH
return rbd_aio_flush(image, comp);
#else
return -ENOTSUP;
#endif
}
static BlockDriverAIOCB *rbd_start_aio(BlockDriverState *bs,
int64_t sector_num,
QEMUIOVector *qiov,
@@ -678,7 +688,7 @@ static BlockDriverAIOCB *rbd_start_aio(BlockDriverState *bs,
acb = qemu_aio_get(&rbd_aiocb_info, bs, cb, opaque);
acb->cmd = cmd;
acb->qiov = qiov;
if (cmd == RBD_AIO_DISCARD) {
if (cmd == RBD_AIO_DISCARD || cmd == RBD_AIO_FLUSH) {
acb->bounce = NULL;
} else {
acb->bounce = qemu_blockalign(bs, qiov->size);
@@ -722,6 +732,9 @@ static BlockDriverAIOCB *rbd_start_aio(BlockDriverState *bs,
case RBD_AIO_DISCARD:
r = rbd_aio_discard_wrapper(s->image, off, size, c);
break;
case RBD_AIO_FLUSH:
r = rbd_aio_flush_wrapper(s->image, c);
break;
default:
r = -EINVAL;
}
@@ -761,6 +774,16 @@ static BlockDriverAIOCB *qemu_rbd_aio_writev(BlockDriverState *bs,
RBD_AIO_WRITE);
}
#ifdef LIBRBD_SUPPORTS_AIO_FLUSH
static BlockDriverAIOCB *qemu_rbd_aio_flush(BlockDriverState *bs,
BlockDriverCompletionFunc *cb,
void *opaque)
{
return rbd_start_aio(bs, 0, NULL, 0, cb, opaque, RBD_AIO_FLUSH);
}
#else
static int qemu_rbd_co_flush(BlockDriverState *bs)
{
#if LIBRBD_VERSION_CODE >= LIBRBD_VERSION(0, 1, 1)
@@ -771,6 +794,7 @@ static int qemu_rbd_co_flush(BlockDriverState *bs)
return 0;
#endif
}
#endif
static int qemu_rbd_getinfo(BlockDriverState *bs, BlockDriverInfo *bdi)
{
@@ -948,7 +972,12 @@ static BlockDriver bdrv_rbd = {
.bdrv_aio_readv = qemu_rbd_aio_readv,
.bdrv_aio_writev = qemu_rbd_aio_writev,
#ifdef LIBRBD_SUPPORTS_AIO_FLUSH
.bdrv_aio_flush = qemu_rbd_aio_flush,
#else
.bdrv_co_flush_to_disk = qemu_rbd_co_flush,
#endif
#ifdef LIBRBD_SUPPORTS_DISCARD
.bdrv_aio_discard = qemu_rbd_aio_discard,

View File

@@ -549,7 +549,7 @@ static coroutine_fn void do_co_req(void *opaque)
co = qemu_coroutine_self();
qemu_aio_set_fd_handler(sockfd, NULL, restart_co_req, NULL, co);
socket_set_block(sockfd);
qemu_set_block(sockfd);
ret = send_co_req(sockfd, hdr, data, wlen);
if (ret < 0) {
goto out;
@@ -579,7 +579,7 @@ static coroutine_fn void do_co_req(void *opaque)
ret = 0;
out:
qemu_aio_set_fd_handler(sockfd, NULL, NULL, NULL, NULL);
socket_set_nonblock(sockfd);
qemu_set_nonblock(sockfd);
srco->ret = ret;
srco->finished = true;
@@ -812,7 +812,7 @@ static int get_sheep_fd(BDRVSheepdogState *s)
return fd;
}
socket_set_nonblock(fd);
qemu_set_nonblock(fd);
ret = set_nodelay(fd);
if (ret) {

365
block/tar.c Normal file
View File

@@ -0,0 +1,365 @@
/*
* Tar block driver
*
* Copyright (c) 2009 Alexander Graf <agraf@suse.de>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu-common.h"
#include "block/block_int.h"
// #define DEBUG
#ifdef DEBUG
#define dprintf(fmt, ...) do { printf("tar: " fmt, ## __VA_ARGS__); } while (0)
#else
#define dprintf(fmt, ...) do { } while (0)
#endif
#define SECTOR_SIZE 512
#define POSIX_TAR_MAGIC "ustar"
#define OFFS_LENGTH 0x7c
#define OFFS_TYPE 0x9c
#define OFFS_MAGIC 0x101
#define OFFS_S_SP 0x182
#define OFFS_S_EXT 0x1e2
#define OFFS_S_LENGTH 0x1e3
#define OFFS_SX_EXT 0x1f8
typedef struct SparseCache {
uint64_t start;
uint64_t end;
} SparseCache;
typedef struct BDRVTarState {
BlockDriverState *hd;
size_t file_sec;
uint64_t file_len;
SparseCache *sparse;
int sparse_num;
uint64_t last_end;
char longfile[2048];
} BDRVTarState;
static int tar_probe(const uint8_t *buf, int buf_size, const char *filename)
{
if (buf_size < OFFS_MAGIC + 5)
return 0;
/* we only support newer tar */
if (!strncmp((char*)buf + OFFS_MAGIC, POSIX_TAR_MAGIC, 5))
return 100;
return 0;
}
static int str_ends(char *str, const char *end)
{
int end_len = strlen(end);
int str_len = strlen(str);
if (str_len < end_len)
return 0;
return !strncmp(str + str_len - end_len, end, end_len);
}
static int is_target_file(BlockDriverState *bs, char *filename,
char *header)
{
int retval = 0;
if (str_ends(filename, ".raw"))
retval = 1;
if (str_ends(filename, ".qcow"))
retval = 1;
if (str_ends(filename, ".qcow2"))
retval = 1;
if (str_ends(filename, ".vmdk"))
retval = 1;
if (retval &&
(header[OFFS_TYPE] != '0') &&
(header[OFFS_TYPE] != 'S')) {
retval = 0;
}
dprintf("does filename %s match? %s\n", filename, retval ? "yes" : "no");
/* make sure we're not using this name again */
filename[0] = '\0';
return retval;
}
static uint64_t tar2u64(char *ptr)
{
uint64_t retval;
char oldend = ptr[12];
ptr[12] = '\0';
if (*ptr & 0x80) {
/* XXX we only support files up to 64 bit length */
retval = be64_to_cpu(*(uint64_t *)(ptr+4));
dprintf("Convert %lx -> %#lx\n", *(uint64_t*)(ptr+4), retval);
} else {
retval = strtol(ptr, NULL, 8);
dprintf("Convert %s -> %#lx\n", ptr, retval);
}
ptr[12] = oldend;
return retval;
}
static void tar_sparse(BDRVTarState *s, uint64_t offs, uint64_t len)
{
SparseCache *sparse;
if (!len)
return;
if (!(offs - s->last_end)) {
s->last_end += len;
return;
}
if (s->last_end > offs)
return;
dprintf("Last chunk until %lx new chunk at %lx\n", s->last_end, offs);
s->sparse = g_realloc(s->sparse, (s->sparse_num + 1) * sizeof(SparseCache));
sparse = &s->sparse[s->sparse_num];
sparse->start = s->last_end;
sparse->end = offs;
s->last_end = offs + len;
s->sparse_num++;
dprintf("Sparse at %lx end=%lx\n", sparse->start,
sparse->end);
}
static int tar_open(BlockDriverState *bs, const char *filename, int flags)
{
BDRVTarState *s = bs->opaque;
char header[SECTOR_SIZE];
char *real_file = header;
char *magic;
const char *fname = filename;
size_t header_offs = 0;
int ret;
if (!strncmp(filename, "tar://", 6))
fname += 6;
else if (!strncmp(filename, "tar:", 4))
fname += 4;
ret = bdrv_file_open(&s->hd, fname, flags);
if (ret < 0)
return ret;
/* Search the file for an image */
do {
/* tar header */
if (bdrv_pread(s->hd, header_offs, header, SECTOR_SIZE) != SECTOR_SIZE)
goto fail;
if ((header_offs > 1) && !header[0]) {
fprintf(stderr, "Tar: No image file found in archive\n");
goto fail;
}
magic = &header[OFFS_MAGIC];
if (strncmp(magic, POSIX_TAR_MAGIC, 5)) {
fprintf(stderr, "Tar: Invalid magic: %s\n", magic);
goto fail;
}
dprintf("file type: %c\n", header[OFFS_TYPE]);
/* file length*/
s->file_len = (tar2u64(&header[OFFS_LENGTH]) + (SECTOR_SIZE - 1)) &
~(SECTOR_SIZE - 1);
s->file_sec = (header_offs / SECTOR_SIZE) + 1;
header_offs += s->file_len + SECTOR_SIZE;
if (header[OFFS_TYPE] == 'L') {
bdrv_pread(s->hd, header_offs - s->file_len, s->longfile,
sizeof(s->longfile));
s->longfile[sizeof(s->longfile)-1] = '\0';
real_file = header;
} else if (s->longfile[0]) {
real_file = s->longfile;
} else {
real_file = header;
}
} while(!is_target_file(bs, real_file, header));
/* We found an image! */
if (header[OFFS_TYPE] == 'S') {
uint8_t isextended;
int i;
for (i = OFFS_S_SP; i < (OFFS_S_SP + (4 * 24)); i += 24)
tar_sparse(s, tar2u64(&header[i]), tar2u64(&header[i+12]));
s->file_len = tar2u64(&header[OFFS_S_LENGTH]);
isextended = header[OFFS_S_EXT];
while (isextended) {
if (bdrv_pread(s->hd, s->file_sec * SECTOR_SIZE, header,
SECTOR_SIZE) != SECTOR_SIZE)
goto fail;
for (i = 0; i < (21 * 24); i += 24)
tar_sparse(s, tar2u64(&header[i]), tar2u64(&header[i+12]));
isextended = header[OFFS_SX_EXT];
s->file_sec++;
}
tar_sparse(s, s->file_len, 1);
}
return 0;
fail:
fprintf(stderr, "Tar: Error opening file\n");
bdrv_delete(s->hd);
return -EINVAL;
}
typedef struct TarAIOCB {
BlockDriverAIOCB common;
QEMUBH *bh;
} TarAIOCB;
/* This callback gets invoked when we have pure sparseness */
static void tar_sparse_cb(void *opaque)
{
TarAIOCB *acb = (TarAIOCB *)opaque;
acb->common.cb(acb->common.opaque, 0);
qemu_bh_delete(acb->bh);
qemu_aio_release(acb);
}
static void tar_aio_cancel(BlockDriverAIOCB *blockacb)
{
}
static AIOCBInfo tar_aiocb_info = {
.aiocb_size = sizeof(TarAIOCB),
.cancel = tar_aio_cancel,
};
/* This is where we get a request from a caller to read something */
static BlockDriverAIOCB *tar_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockDriverCompletionFunc *cb, void *opaque)
{
BDRVTarState *s = bs->opaque;
SparseCache *sparse;
int64_t sec_file = sector_num + s->file_sec;
int64_t start = sector_num * SECTOR_SIZE;
int64_t end = start + (nb_sectors * SECTOR_SIZE);
int i;
TarAIOCB *acb;
for (i = 0; i < s->sparse_num; i++) {
sparse = &s->sparse[i];
if (sparse->start > end) {
/* We expect the cache to be start increasing */
break;
} else if ((sparse->start < start) && (sparse->end <= start)) {
/* sparse before our offset */
sec_file -= (sparse->end - sparse->start) / SECTOR_SIZE;
} else if ((sparse->start <= start) && (sparse->end >= end)) {
/* all our sectors are sparse */
char *buf = g_malloc0(nb_sectors * SECTOR_SIZE);
acb = qemu_aio_get(&tar_aiocb_info, bs, cb, opaque);
qemu_iovec_from_buf(qiov, 0, buf, nb_sectors * SECTOR_SIZE);
g_free(buf);
acb->bh = qemu_bh_new(tar_sparse_cb, acb);
qemu_bh_schedule(acb->bh);
return &acb->common;
} else if (((sparse->start >= start) && (sparse->start < end)) ||
((sparse->end >= start) && (sparse->end < end))) {
/* we're semi-sparse (worst case) */
/* let's go synchronous and read all sectors individually */
char *buf = g_malloc(nb_sectors * SECTOR_SIZE);
uint64_t offs;
for (offs = 0; offs < (nb_sectors * SECTOR_SIZE);
offs += SECTOR_SIZE) {
bdrv_pread(bs, (sector_num * SECTOR_SIZE) + offs,
buf + offs, SECTOR_SIZE);
}
qemu_iovec_from_buf(qiov, 0, buf, nb_sectors * SECTOR_SIZE);
acb = qemu_aio_get(&tar_aiocb_info, bs, cb, opaque);
acb->bh = qemu_bh_new(tar_sparse_cb, acb);
qemu_bh_schedule(acb->bh);
return &acb->common;
}
}
return bdrv_aio_readv(s->hd, sec_file, qiov, nb_sectors,
cb, opaque);
}
static void tar_close(BlockDriverState *bs)
{
dprintf("Close\n");
}
static int64_t tar_getlength(BlockDriverState *bs)
{
BDRVTarState *s = bs->opaque;
dprintf("getlength -> %ld\n", s->file_len);
return s->file_len;
}
static BlockDriver bdrv_tar = {
.format_name = "tar",
.protocol_name = "tar",
.instance_size = sizeof(BDRVTarState),
.bdrv_file_open = tar_open,
.bdrv_close = tar_close,
.bdrv_getlength = tar_getlength,
.bdrv_probe = tar_probe,
.bdrv_aio_readv = tar_aio_readv,
};
static void tar_block_init(void)
{
bdrv_register(&bdrv_tar);
}
block_init(tar_block_init);

View File

@@ -120,6 +120,11 @@ typedef unsigned char uuid_t[16];
#define VDI_IS_ALLOCATED(X) ((X) < VDI_DISCARDED)
/* max blocks in image is (0xffffffff / 4) */
#define VDI_BLOCKS_IN_IMAGE_MAX 0x3fffffff
#define VDI_DISK_SIZE_MAX ((uint64_t)VDI_BLOCKS_IN_IMAGE_MAX * \
(uint64_t)DEFAULT_CLUSTER_SIZE)
#if !defined(CONFIG_UUID)
static inline void uuid_generate(uuid_t out)
{
@@ -383,6 +388,14 @@ static int vdi_open(BlockDriverState *bs, int flags)
vdi_header_print(&header);
#endif
if (header.disk_size > VDI_DISK_SIZE_MAX) {
logout("Unsupported VDI image size (size is 0x%" PRIx64
", max supported is 0x%" PRIx64 ")\n",
header.disk_size, VDI_DISK_SIZE_MAX);
ret = -ENOTSUP;
goto fail;
}
if (header.disk_size % SECTOR_SIZE != 0) {
/* 'VBoxManage convertfromraw' can create images with odd disk sizes.
We accept them but round the disk size to the next multiple of
@@ -415,8 +428,9 @@ static int vdi_open(BlockDriverState *bs, int flags)
logout("unsupported sector size %u B\n", header.sector_size);
ret = -ENOTSUP;
goto fail;
} else if (header.block_size != 1 * MiB) {
logout("unsupported block size %u B\n", header.block_size);
} else if (header.block_size != DEFAULT_CLUSTER_SIZE) {
logout("unsupported VDI image (block size %u is not %u)\n",
header.block_size, DEFAULT_CLUSTER_SIZE);
ret = -ENOTSUP;
goto fail;
} else if (header.disk_size >
@@ -432,6 +446,11 @@ static int vdi_open(BlockDriverState *bs, int flags)
logout("parent uuid != 0, unsupported\n");
ret = -ENOTSUP;
goto fail;
} else if (header.blocks_in_image > VDI_BLOCKS_IN_IMAGE_MAX) {
logout("unsupported VDI image (too many blocks %u, max is %u)\n",
header.blocks_in_image, VDI_BLOCKS_IN_IMAGE_MAX);
ret = -ENOTSUP;
goto fail;
}
bs->total_sectors = header.disk_size / SECTOR_SIZE;
@@ -668,11 +687,20 @@ static int vdi_create(const char *filename, QEMUOptionParameter *options)
options++;
}
if (bytes > VDI_DISK_SIZE_MAX) {
result = -ENOTSUP;
logout("Unsupported VDI image size (size is 0x%" PRIx64
", max supported is 0x%" PRIx64 ")\n",
bytes, VDI_DISK_SIZE_MAX);
goto exit;
}
fd = qemu_open(filename,
O_WRONLY | O_CREAT | O_TRUNC | O_BINARY | O_LARGEFILE,
0644);
if (fd < 0) {
return -errno;
result = -errno;
goto exit;
}
/* We need enough blocks to store the given disk size,
@@ -733,6 +761,7 @@ static int vdi_create(const char *filename, QEMUOptionParameter *options)
result = -errno;
}
exit:
return result;
}

View File

@@ -45,6 +45,8 @@ enum vhd_type {
// Seconds since Jan 1, 2000 0:00:00 (UTC)
#define VHD_TIMESTAMP_BASE 946684800
#define VHD_MAX_SECTORS (65535LL * 255 * 255)
// always big-endian
struct vhd_footer {
char creator[8]; // "conectix"
@@ -163,6 +165,7 @@ static int vpc_open(BlockDriverState *bs, int flags)
struct vhd_dyndisk_header* dyndisk_header;
uint8_t buf[HEADER_SIZE];
uint32_t checksum;
uint64_t computed_size;
int disk_type = VHD_DYNAMIC;
int ret;
@@ -211,7 +214,7 @@ static int vpc_open(BlockDriverState *bs, int flags)
be16_to_cpu(footer->cyls) * footer->heads * footer->secs_per_cyl;
/* Allow a maximum disk size of approximately 2 TB */
if (bs->total_sectors >= 65535LL * 255 * 255) {
if (bs->total_sectors >= VHD_MAX_SECTORS) {
ret = -EFBIG;
goto fail;
}
@@ -231,10 +234,32 @@ static int vpc_open(BlockDriverState *bs, int flags)
}
s->block_size = be32_to_cpu(dyndisk_header->block_size);
if (!is_power_of_2(s->block_size) || s->block_size < BDRV_SECTOR_SIZE) {
qerror_report(ERROR_CLASS_GENERIC_ERROR,
"Invalid block size %" PRIu32, s->block_size);
ret = -EINVAL;
goto fail;
}
s->bitmap_size = ((s->block_size / (8 * 512)) + 511) & ~511;
s->max_table_entries = be32_to_cpu(dyndisk_header->max_table_entries);
s->pagetable = g_malloc(s->max_table_entries * 4);
if ((bs->total_sectors * 512) / s->block_size > 0xffffffffU) {
ret = -EINVAL;
goto fail;
}
if (s->max_table_entries > (VHD_MAX_SECTORS * 512) / s->block_size) {
ret = -EINVAL;
goto fail;
}
computed_size = (uint64_t) s->max_table_entries * s->block_size;
if (computed_size < bs->total_sectors * 512) {
ret = -EINVAL;
goto fail;
}
s->pagetable = qemu_blockalign(bs, s->max_table_entries * 4);
s->bat_offset = be64_to_cpu(dyndisk_header->table_offset);
@@ -280,7 +305,7 @@ static int vpc_open(BlockDriverState *bs, int flags)
return 0;
fail:
g_free(s->pagetable);
qemu_vfree(s->pagetable);
#ifdef CACHE
g_free(s->pageentry_u8);
#endif
@@ -789,7 +814,7 @@ static int vpc_create(const char *filename, QEMUOptionParameter *options)
static void vpc_close(BlockDriverState *bs)
{
BDRVVPCState *s = bs->opaque;
g_free(s->pagetable);
qemu_vfree(s->pagetable);
#ifdef CACHE
g_free(s->pageentry_u8);
#endif

View File

@@ -570,7 +570,7 @@ DriveInfo *drive_init(QemuOpts *opts, BlockInterfaceType block_default_type)
/* add virtio block device */
opts = qemu_opts_create_nofail(qemu_find_opts("device"));
if (arch_type == QEMU_ARCH_S390X) {
qemu_opt_set(opts, "driver", "virtio-blk-s390");
qemu_opt_set(opts, "driver", "virtio-blk-ccw");
} else {
qemu_opt_set(opts, "driver", "virtio-blk-pci");
}
@@ -1043,6 +1043,9 @@ void qmp_block_resize(const char *device, int64_t size, Error **errp)
return;
}
/* complete all in-flight operations before resizing the device */
bdrv_drain_all();
switch (bdrv_truncate(bs, size)) {
case 0:
break;

94
configure vendored
View File

@@ -283,7 +283,7 @@ sdl_config="${SDL_CONFIG-${cross_prefix}sdl-config}"
# default flags for all hosts
QEMU_CFLAGS="-fno-strict-aliasing $QEMU_CFLAGS"
QEMU_CFLAGS="-Wall -Wundef -Wwrite-strings -Wmissing-prototypes $QEMU_CFLAGS"
QEMU_CFLAGS="-Wstrict-prototypes -Wredundant-decls $QEMU_CFLAGS"
QEMU_CFLAGS="-Wstrict-prototypes $QEMU_CFLAGS"
QEMU_CFLAGS="-D_GNU_SOURCE -D_FILE_OFFSET_BITS=64 -D_LARGEFILE_SOURCE $QEMU_CFLAGS"
QEMU_INCLUDES="-I. -I\$(SRC_PATH) -I\$(SRC_PATH)/include"
if test "$debug_info" = "yes"; then
@@ -1435,6 +1435,7 @@ fi
if test "$seccomp" != "no" ; then
if $pkg_config --atleast-version=1.0.0 libseccomp --modversion >/dev/null 2>&1; then
libs_softmmu="$libs_softmmu `$pkg_config --libs libseccomp`"
QEMU_CFLAGS="$QEMU_CFLAGS `$pkg_config --cflags libseccomp`"
seccomp="yes"
else
if test "$seccomp" = "yes"; then
@@ -2759,7 +2760,13 @@ if test "$libiscsi" != "no" ; then
#include <iscsi/iscsi.h>
int main(void) { iscsi_unmap_sync(NULL,0,0,0,NULL,0); return 0; }
EOF
if compile_prog "" "-liscsi" ; then
if $pkg_config --atleast-version=1.7.0 libiscsi --modversion >/dev/null 2>&1; then
libiscsi="yes"
libiscsi_cflags=$($pkg_config --cflags libiscsi 2>/dev/null)
libiscsi_libs=$($pkg_config --libs libiscsi 2>/dev/null)
CFLAGS="$CFLAGS $libiscsi_cflags"
LIBS="$LIBS $libiscsi_libs"
elif compile_prog "" "-liscsi" ; then
libiscsi="yes"
LIBS="$LIBS -liscsi"
else
@@ -2827,7 +2834,7 @@ EOF
spice_cflags=$($pkg_config --cflags spice-protocol spice-server 2>/dev/null)
spice_libs=$($pkg_config --libs spice-protocol spice-server 2>/dev/null)
if $pkg_config --atleast-version=0.12.0 spice-server >/dev/null 2>&1 && \
$pkg_config --atleast-version=0.12.2 spice-protocol > /dev/null 2>&1 && \
$pkg_config --atleast-version=0.12.3 spice-protocol > /dev/null 2>&1 && \
compile_prog "$spice_cflags" "$spice_libs" ; then
spice="yes"
libs_softmmu="$libs_softmmu $spice_libs"
@@ -3029,34 +3036,67 @@ fi
##########################################
# check and set a backend for coroutine
# default is ucontext, but always fallback to gthread
# windows autodetected by make
if test "$coroutine" = "" -o "$coroutine" = "ucontext"; then
if test "$darwin" != "yes"; then
cat > $TMPC << EOF
# We prefer ucontext, but it's not always possible. The fallback
# is sigcontext. gthread is not selectable except explicitly, because
# it is not functional enough to run QEMU proper. (It is occasionally
# useful for debugging purposes.) On Windows the only valid backend
# is the Windows-specific one.
ucontext_works=no
if test "$darwin" != "yes"; then
cat > $TMPC << EOF
#include <ucontext.h>
#ifdef __stub_makecontext
#error Ignoring glibc stub makecontext which will always fail
#endif
int main(void) { makecontext(0, 0, 0); return 0; }
EOF
if compile_prog "" "" ; then
coroutine_backend=ucontext
else
coroutine_backend=gthread
fi
if compile_prog "" "" ; then
ucontext_works=yes
fi
fi
if test "$coroutine" = ""; then
if test "$mingw32" = "yes"; then
coroutine=win32
elif test "$ucontext_works" = "yes"; then
coroutine=ucontext
else
coroutine=sigaltstack
fi
elif test "$coroutine" = "gthread" ; then
coroutine_backend=gthread
elif test "$coroutine" = "windows" ; then
coroutine_backend=windows
elif test "$coroutine" = "sigaltstack" ; then
coroutine_backend=sigaltstack
else
echo
echo "Error: unknown coroutine backend $coroutine"
echo
exit 1
case $coroutine in
windows)
if test "$mingw32" != "yes"; then
echo
echo "Error: 'windows' coroutine backend only valid for Windows"
echo
exit 1
fi
# Unfortunately the user visible backend name doesn't match the
# coroutine-*.c filename for this case, so we have to adjust it here.
coroutine=win32
;;
ucontext)
if test "$ucontext_works" != "yes"; then
feature_not_found "ucontext"
fi
;;
gthread|sigaltstack)
if test "$mingw32" = "yes"; then
echo
echo "Error: only the 'windows' coroutine backend is valid for Windows"
echo
exit 1
fi
;;
*)
echo
echo "Error: unknown coroutine backend $coroutine"
echo
exit 1
;;
esac
fi
##########################################
@@ -3339,7 +3379,7 @@ echo "OpenGL support $opengl"
echo "libiscsi support $libiscsi"
echo "build guest agent $guest_agent"
echo "seccomp support $seccomp"
echo "coroutine backend $coroutine_backend"
echo "coroutine backend $coroutine"
echo "GlusterFS support $glusterfs"
echo "virtio-blk-data-plane $virtio_blk_data_plane"
echo "gcov $gcov_tool"
@@ -3662,11 +3702,7 @@ if test "$rbd" = "yes" ; then
echo "CONFIG_RBD=y" >> $config_host_mak
fi
if test "$coroutine_backend" = "ucontext" ; then
echo "CONFIG_UCONTEXT_COROUTINE=y" >> $config_host_mak
elif test "$coroutine_backend" = "sigaltstack" ; then
echo "CONFIG_SIGALTSTACK_COROUTINE=y" >> $config_host_mak
fi
echo "CONFIG_COROUTINE_BACKEND=$coroutine" >> $config_host_mak
if test "$open_by_handle_at" = "yes" ; then
echo "CONFIG_OPEN_BY_HANDLE=y" >> $config_host_mak

View File

@@ -51,12 +51,32 @@ void cpu_resume_from_signal(CPUArchState *env, void *puc)
}
#endif
/* Execute a TB, and fix up the CPU state afterwards if necessary */
static inline tcg_target_ulong cpu_tb_exec(CPUArchState *env, uint8_t *tb_ptr)
{
tcg_target_ulong next_tb = tcg_qemu_tb_exec(env, tb_ptr);
if ((next_tb & TB_EXIT_MASK) > TB_EXIT_IDX1) {
/* We didn't start executing this TB (eg because the instruction
* counter hit zero); we must restore the guest PC to the address
* of the start of the TB.
*/
TranslationBlock *tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK);
cpu_pc_from_tb(env, tb);
}
if ((next_tb & TB_EXIT_MASK) == TB_EXIT_REQUESTED) {
/* We were asked to stop executing TBs (probably a pending
* interrupt. We've now stopped, so clear the flag.
*/
env->tcg_exit_req = 0;
}
return next_tb;
}
/* Execute the code without caching the generated code. An interpreter
could be used if available. */
static void cpu_exec_nocache(CPUArchState *env, int max_cycles,
TranslationBlock *orig_tb)
{
tcg_target_ulong next_tb;
TranslationBlock *tb;
/* Should never happen.
@@ -68,14 +88,8 @@ static void cpu_exec_nocache(CPUArchState *env, int max_cycles,
max_cycles);
env->current_tb = tb;
/* execute the generated code */
next_tb = tcg_qemu_tb_exec(env, tb->tc_ptr);
cpu_tb_exec(env, tb->tc_ptr);
env->current_tb = NULL;
if ((next_tb & 3) == 2) {
/* Restore PC. This may happen if async event occurs before
the TB starts executing. */
cpu_pc_from_tb(env, tb);
}
tb_phys_invalidate(tb, -1);
tb_free(tb);
}
@@ -196,6 +210,7 @@ int cpu_exec(CPUArchState *env)
}
cpu_single_env = env;
current_cpu = cpu;
if (unlikely(exit_request)) {
env->exit_request = 1;
@@ -583,7 +598,8 @@ int cpu_exec(CPUArchState *env)
spans two pages, we cannot safely do a direct
jump. */
if (next_tb != 0 && tb->page_addr[1] == -1) {
tb_add_jump((TranslationBlock *)(next_tb & ~3), next_tb & 3, tb);
tb_add_jump((TranslationBlock *)(next_tb & ~TB_EXIT_MASK),
next_tb & TB_EXIT_MASK, tb);
}
spin_unlock(&tb_lock);
@@ -596,13 +612,24 @@ int cpu_exec(CPUArchState *env)
if (likely(!env->exit_request)) {
tc_ptr = tb->tc_ptr;
/* execute the generated code */
next_tb = tcg_qemu_tb_exec(env, tc_ptr);
if ((next_tb & 3) == 2) {
next_tb = cpu_tb_exec(env, tc_ptr);
switch (next_tb & TB_EXIT_MASK) {
case TB_EXIT_REQUESTED:
/* Something asked us to stop executing
* chained TBs; just continue round the main
* loop. Whatever requested the exit will also
* have set something else (eg exit_request or
* interrupt_request) which we will handle
* next time around the loop.
*/
tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK);
next_tb = 0;
break;
case TB_EXIT_ICOUNT_EXPIRED:
{
/* Instruction counter expired. */
int insns_left;
tb = (TranslationBlock *)(next_tb & ~3);
/* Restore PC. */
cpu_pc_from_tb(env, tb);
tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK);
insns_left = env->icount_decr.u32;
if (env->icount_extra && insns_left >= 0) {
/* Refill decrementer and continue execution. */
@@ -623,6 +650,10 @@ int cpu_exec(CPUArchState *env)
next_tb = 0;
cpu_loop_exit(env);
}
break;
}
default:
break;
}
}
env->current_tb = NULL;
@@ -633,6 +664,7 @@ int cpu_exec(CPUArchState *env)
/* Reload env after longjmp - the compiler may have smashed all
* local variables as longjmp is marked 'noreturn'. */
env = cpu_single_env;
cpu = current_cpu;
}
} /* for(;;) */
@@ -667,5 +699,6 @@ int cpu_exec(CPUArchState *env)
/* fail safe : never use cpu_single_env outside cpu_exec() */
cpu_single_env = NULL;
current_cpu = NULL;
return ret;
}

43
cpus.c
View File

@@ -38,6 +38,10 @@
#include "qemu/main-loop.h"
#include "qemu/bitmap.h"
#ifdef CONFIG_SECCOMP
#include "sysemu/seccomp.h"
#endif
#ifndef _WIN32
#include "qemu/compatfd.h"
#endif
@@ -432,6 +436,15 @@ void cpu_synchronize_all_post_init(void)
}
}
void cpu_clean_all_dirty(void)
{
CPUArchState *cpu;
for (cpu = first_cpu; cpu; cpu = cpu->next_cpu) {
cpu_clean_state(cpu);
}
}
bool cpu_is_stopped(CPUState *cpu)
{
return !runstate_is_running() || cpu->stopped;
@@ -665,9 +678,11 @@ void run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data)
qemu_cpu_kick(cpu);
while (!wi.done) {
CPUArchState *self_env = cpu_single_env;
CPUState *self_cpu = current_cpu;
qemu_cond_wait(&qemu_work_cond, &qemu_global_mutex);
cpu_single_env = self_env;
current_cpu = self_cpu;
}
}
@@ -737,10 +752,15 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
CPUState *cpu = ENV_GET_CPU(env);
int r;
#ifdef CONFIG_SECCOMP
seccomp_start(!!0);
#endif
qemu_mutex_lock(&qemu_global_mutex);
qemu_thread_get_self(cpu->thread);
cpu->thread_id = qemu_get_thread_id();
cpu_single_env = env;
current_cpu = cpu;
r = kvm_init_vcpu(cpu);
if (r < 0) {
@@ -778,6 +798,10 @@ static void *qemu_dummy_cpu_thread_fn(void *arg)
sigset_t waitset;
int r;
#ifdef CONFIG_SECCOMP
seccomp_start(!!0);
#endif
qemu_mutex_lock_iothread();
qemu_thread_get_self(cpu->thread);
cpu->thread_id = qemu_get_thread_id();
@@ -790,8 +814,10 @@ static void *qemu_dummy_cpu_thread_fn(void *arg)
qemu_cond_signal(&qemu_cpu_cond);
cpu_single_env = env;
current_cpu = cpu;
while (1) {
cpu_single_env = NULL;
current_cpu = NULL;
qemu_mutex_unlock_iothread();
do {
int sig;
@@ -803,6 +829,7 @@ static void *qemu_dummy_cpu_thread_fn(void *arg)
}
qemu_mutex_lock_iothread();
cpu_single_env = env;
current_cpu = cpu;
qemu_wait_io_event_common(cpu);
}
@@ -817,6 +844,10 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
CPUState *cpu = arg;
CPUArchState *env;
#ifdef CONFIG_SECCOMP
seccomp_start(!!0);
#endif
qemu_tcg_init_cpu_signals();
qemu_thread_get_self(cpu->thread);
@@ -1328,6 +1359,18 @@ exit:
fclose(f);
}
bool spec_ctrl_is_inconsistent(void)
{
#if defined(TARGET_I386)
X86CPU *x86_cpu = X86_CPU(current_cpu);
CPUX86State *env = x86_cpu != NULL ? &x86_cpu->env : NULL;
if (env && !(env->cpuid_7_0_edx_features & CPUID_7_0_EDX_SPEC_CTRL) &&
env->spec_ctrl)
return true;
#endif
return false;
}
void qmp_inject_nmi(Error **errp)
{
#if defined(TARGET_I386)

View File

@@ -85,7 +85,7 @@ void *load_device_tree(const char *filename_path, int *sizep)
/* First allocate space in qemu for device tree */
fdt = g_malloc0(dt_size);
dt_file_load_size = load_image(filename_path, fdt);
dt_file_load_size = load_image_size(filename_path, fdt, dt_size);
if (dt_file_load_size < 0) {
printf("Unable to open device tree file '%s'\n",
filename_path);

View File

@@ -157,7 +157,6 @@ static const VMStateDescription vmstate_kbd = {
.name = "pckbd",
.version_id = 3,
.minimum_version_id = 3,
.minimum_version_id_old = 3,
.fields = (VMStateField []) {
VMSTATE_UINT8(write_cmd, KBDState),
VMSTATE_UINT8(status, KBDState),
@@ -186,12 +185,13 @@ You can see that there are several version fields:
- minimum_version_id: the minimum version_id that VMState is able to understand
for that device.
- minimum_version_id_old: For devices that were not able to port to vmstate, we can
assign a function that knows how to read this old state.
assign a function that knows how to read this old state. This field is
ignored if there is no load_state_old handler.
So, VMState is able to read versions from minimum_version_id to
version_id. And the function load_state_old() is able to load state
from minimum_version_id_old to minimum_version_id. This function is
deprecated and will be removed when no more users are left.
version_id. And the function load_state_old() (if present) is able to
load state from minimum_version_id_old to minimum_version_id. This
function is deprecated and will be removed when no more users are left.
=== Massaging functions ===
@@ -272,7 +272,6 @@ const VMStateDescription vmstate_ide_drive_pio_state = {
.name = "ide_drive/pio_state",
.version_id = 1,
.minimum_version_id = 1,
.minimum_version_id_old = 1,
.pre_save = ide_drive_pio_pre_save,
.post_load = ide_drive_pio_post_load,
.fields = (VMStateField []) {
@@ -292,7 +291,6 @@ const VMStateDescription vmstate_ide_drive = {
.name = "ide_drive",
.version_id = 3,
.minimum_version_id = 0,
.minimum_version_id_old = 0,
.post_load = ide_drive_post_load,
.fields = (VMStateField []) {
.... several fields ....

View File

@@ -52,7 +52,8 @@ int cpu_write_elf32_qemunote(write_core_dump_function f,
return -1;
}
int cpu_get_dump_info(ArchDumpInfo *info)
int cpu_get_dump_info(ArchDumpInfo *info,
const struct GuestPhysBlockList *guest_phys_blocks)
{
return -1;
}

172
dump.c
View File

@@ -21,6 +21,7 @@
#include "sysemu/dump.h"
#include "sysemu/sysemu.h"
#include "sysemu/memory_mapping.h"
#include "sysemu/cpus.h"
#include "qapi/error.h"
#include "qmp-commands.h"
#include "exec/gdbstub.h"
@@ -59,6 +60,7 @@ static uint64_t cpu_convert_to_target64(uint64_t val, int endian)
}
typedef struct DumpState {
GuestPhysBlockList guest_phys_blocks;
ArchDumpInfo dump_info;
MemoryMappingList list;
uint16_t phdr_num;
@@ -69,7 +71,7 @@ typedef struct DumpState {
hwaddr memory_offset;
int fd;
RAMBlock *block;
GuestPhysBlock *next_block;
ram_addr_t start;
bool has_filter;
int64_t begin;
@@ -81,6 +83,7 @@ static int dump_cleanup(DumpState *s)
{
int ret = 0;
guest_phys_blocks_free(&s->guest_phys_blocks);
memory_mapping_list_free(&s->list);
if (s->fd != -1) {
close(s->fd);
@@ -187,7 +190,8 @@ static int write_elf32_header(DumpState *s)
}
static int write_elf64_load(DumpState *s, MemoryMapping *memory_mapping,
int phdr_index, hwaddr offset)
int phdr_index, hwaddr offset,
hwaddr filesz)
{
Elf64_Phdr phdr;
int ret;
@@ -197,15 +201,12 @@ static int write_elf64_load(DumpState *s, MemoryMapping *memory_mapping,
phdr.p_type = cpu_convert_to_target32(PT_LOAD, endian);
phdr.p_offset = cpu_convert_to_target64(offset, endian);
phdr.p_paddr = cpu_convert_to_target64(memory_mapping->phys_addr, endian);
if (offset == -1) {
/* When the memory is not stored into vmcore, offset will be -1 */
phdr.p_filesz = 0;
} else {
phdr.p_filesz = cpu_convert_to_target64(memory_mapping->length, endian);
}
phdr.p_filesz = cpu_convert_to_target64(filesz, endian);
phdr.p_memsz = cpu_convert_to_target64(memory_mapping->length, endian);
phdr.p_vaddr = cpu_convert_to_target64(memory_mapping->virt_addr, endian);
assert(memory_mapping->length >= filesz);
ret = fd_write_vmcore(&phdr, sizeof(Elf64_Phdr), s);
if (ret < 0) {
dump_error(s, "dump: failed to write program header table.\n");
@@ -216,7 +217,8 @@ static int write_elf64_load(DumpState *s, MemoryMapping *memory_mapping,
}
static int write_elf32_load(DumpState *s, MemoryMapping *memory_mapping,
int phdr_index, hwaddr offset)
int phdr_index, hwaddr offset,
hwaddr filesz)
{
Elf32_Phdr phdr;
int ret;
@@ -226,15 +228,12 @@ static int write_elf32_load(DumpState *s, MemoryMapping *memory_mapping,
phdr.p_type = cpu_convert_to_target32(PT_LOAD, endian);
phdr.p_offset = cpu_convert_to_target32(offset, endian);
phdr.p_paddr = cpu_convert_to_target32(memory_mapping->phys_addr, endian);
if (offset == -1) {
/* When the memory is not stored into vmcore, offset will be -1 */
phdr.p_filesz = 0;
} else {
phdr.p_filesz = cpu_convert_to_target32(memory_mapping->length, endian);
}
phdr.p_filesz = cpu_convert_to_target32(filesz, endian);
phdr.p_memsz = cpu_convert_to_target32(memory_mapping->length, endian);
phdr.p_vaddr = cpu_convert_to_target32(memory_mapping->virt_addr, endian);
assert(memory_mapping->length >= filesz);
ret = fd_write_vmcore(&phdr, sizeof(Elf32_Phdr), s);
if (ret < 0) {
dump_error(s, "dump: failed to write program header table.\n");
@@ -388,14 +387,14 @@ static int write_data(DumpState *s, void *buf, int length)
}
/* write the memroy to vmcore. 1 page per I/O. */
static int write_memory(DumpState *s, RAMBlock *block, ram_addr_t start,
static int write_memory(DumpState *s, GuestPhysBlock *block, ram_addr_t start,
int64_t size)
{
int64_t i;
int ret;
for (i = 0; i < size / TARGET_PAGE_SIZE; i++) {
ret = write_data(s, block->host + start + i * TARGET_PAGE_SIZE,
ret = write_data(s, block->host_addr + start + i * TARGET_PAGE_SIZE,
TARGET_PAGE_SIZE);
if (ret < 0) {
return ret;
@@ -403,7 +402,7 @@ static int write_memory(DumpState *s, RAMBlock *block, ram_addr_t start,
}
if ((size % TARGET_PAGE_SIZE) != 0) {
ret = write_data(s, block->host + start + i * TARGET_PAGE_SIZE,
ret = write_data(s, block->host_addr + start + i * TARGET_PAGE_SIZE,
size % TARGET_PAGE_SIZE);
if (ret < 0) {
return ret;
@@ -413,57 +412,71 @@ static int write_memory(DumpState *s, RAMBlock *block, ram_addr_t start,
return 0;
}
/* get the memory's offset in the vmcore */
static hwaddr get_offset(hwaddr phys_addr,
DumpState *s)
/* get the memory's offset and size in the vmcore */
static void get_offset_range(hwaddr phys_addr,
ram_addr_t mapping_length,
DumpState *s,
hwaddr *p_offset,
hwaddr *p_filesz)
{
RAMBlock *block;
GuestPhysBlock *block;
hwaddr offset = s->memory_offset;
int64_t size_in_block, start;
/* When the memory is not stored into vmcore, offset will be -1 */
*p_offset = -1;
*p_filesz = 0;
if (s->has_filter) {
if (phys_addr < s->begin || phys_addr >= s->begin + s->length) {
return -1;
return;
}
}
QTAILQ_FOREACH(block, &ram_list.blocks, next) {
QTAILQ_FOREACH(block, &s->guest_phys_blocks.head, next) {
if (s->has_filter) {
if (block->offset >= s->begin + s->length ||
block->offset + block->length <= s->begin) {
if (block->target_start >= s->begin + s->length ||
block->target_end <= s->begin) {
/* This block is out of the range */
continue;
}
if (s->begin <= block->offset) {
start = block->offset;
if (s->begin <= block->target_start) {
start = block->target_start;
} else {
start = s->begin;
}
size_in_block = block->length - (start - block->offset);
if (s->begin + s->length < block->offset + block->length) {
size_in_block -= block->offset + block->length -
(s->begin + s->length);
size_in_block = block->target_end - start;
if (s->begin + s->length < block->target_end) {
size_in_block -= block->target_end - (s->begin + s->length);
}
} else {
start = block->offset;
size_in_block = block->length;
start = block->target_start;
size_in_block = block->target_end - block->target_start;
}
if (phys_addr >= start && phys_addr < start + size_in_block) {
return phys_addr - start + offset;
*p_offset = phys_addr - start + offset;
/* The offset range mapped from the vmcore file must not spill over
* the GuestPhysBlock, clamp it. The rest of the mapping will be
* zero-filled in memory at load time; see
* <http://refspecs.linuxbase.org/elf/gabi4+/ch5.pheader.html>.
*/
*p_filesz = phys_addr + mapping_length <= start + size_in_block ?
mapping_length :
size_in_block - (phys_addr - start);
return;
}
offset += size_in_block;
}
return -1;
}
static int write_elf_loads(DumpState *s)
{
hwaddr offset;
hwaddr offset, filesz;
MemoryMapping *memory_mapping;
uint32_t phdr_index = 1;
int ret;
@@ -476,11 +489,15 @@ static int write_elf_loads(DumpState *s)
}
QTAILQ_FOREACH(memory_mapping, &s->list.head, next) {
offset = get_offset(memory_mapping->phys_addr, s);
get_offset_range(memory_mapping->phys_addr,
memory_mapping->length,
s, &offset, &filesz);
if (s->dump_info.d_class == ELFCLASS64) {
ret = write_elf64_load(s, memory_mapping, phdr_index++, offset);
ret = write_elf64_load(s, memory_mapping, phdr_index++, offset,
filesz);
} else {
ret = write_elf32_load(s, memory_mapping, phdr_index++, offset);
ret = write_elf32_load(s, memory_mapping, phdr_index++, offset,
filesz);
}
if (ret < 0) {
@@ -591,7 +608,7 @@ static int dump_completed(DumpState *s)
return 0;
}
static int get_next_block(DumpState *s, RAMBlock *block)
static int get_next_block(DumpState *s, GuestPhysBlock *block)
{
while (1) {
block = QTAILQ_NEXT(block, next);
@@ -601,16 +618,16 @@ static int get_next_block(DumpState *s, RAMBlock *block)
}
s->start = 0;
s->block = block;
s->next_block = block;
if (s->has_filter) {
if (block->offset >= s->begin + s->length ||
block->offset + block->length <= s->begin) {
if (block->target_start >= s->begin + s->length ||
block->target_end <= s->begin) {
/* This block is out of the range */
continue;
}
if (s->begin > block->offset) {
s->start = s->begin - block->offset;
if (s->begin > block->target_start) {
s->start = s->begin - block->target_start;
}
}
@@ -621,18 +638,18 @@ static int get_next_block(DumpState *s, RAMBlock *block)
/* write all memory to vmcore */
static int dump_iterate(DumpState *s)
{
RAMBlock *block;
GuestPhysBlock *block;
int64_t size;
int ret;
while (1) {
block = s->block;
block = s->next_block;
size = block->length;
size = block->target_end - block->target_start;
if (s->has_filter) {
size -= s->start;
if (s->begin + s->length < block->offset + block->length) {
size -= block->offset + block->length - (s->begin + s->length);
if (s->begin + s->length < block->target_end) {
size -= block->target_end - (s->begin + s->length);
}
}
ret = write_memory(s, block, s->start, size);
@@ -667,23 +684,23 @@ static int create_vmcore(DumpState *s)
static ram_addr_t get_start_block(DumpState *s)
{
RAMBlock *block;
GuestPhysBlock *block;
if (!s->has_filter) {
s->block = QTAILQ_FIRST(&ram_list.blocks);
s->next_block = QTAILQ_FIRST(&s->guest_phys_blocks.head);
return 0;
}
QTAILQ_FOREACH(block, &ram_list.blocks, next) {
if (block->offset >= s->begin + s->length ||
block->offset + block->length <= s->begin) {
QTAILQ_FOREACH(block, &s->guest_phys_blocks.head, next) {
if (block->target_start >= s->begin + s->length ||
block->target_end <= s->begin) {
/* This block is out of the range */
continue;
}
s->block = block;
if (s->begin > block->offset) {
s->start = s->begin - block->offset;
s->next_block = block;
if (s->begin > block->target_start) {
s->start = s->begin - block->target_start;
} else {
s->start = 0;
}
@@ -707,32 +724,35 @@ static int dump_init(DumpState *s, int fd, bool paging, bool has_filter,
s->resume = false;
}
/* If we use KVM, we should synchronize the registers before we get dump
* info or physmap info.
*/
cpu_synchronize_all_states();
nr_cpus = 0;
for (env = first_cpu; env != NULL; env = env->next_cpu) {
nr_cpus++;
}
s->errp = errp;
s->fd = fd;
s->has_filter = has_filter;
s->begin = begin;
s->length = length;
guest_phys_blocks_init(&s->guest_phys_blocks);
guest_phys_blocks_append(&s->guest_phys_blocks);
s->start = get_start_block(s);
if (s->start == -1) {
error_set(errp, QERR_INVALID_PARAMETER, "begin");
goto cleanup;
}
/*
* get dump info: endian, class and architecture.
/* get dump info: endian, class and architecture.
* If the target architecture is not supported, cpu_get_dump_info() will
* return -1.
*
* if we use kvm, we should synchronize the register before we get dump
* info.
*/
nr_cpus = 0;
for (env = first_cpu; env != NULL; env = env->next_cpu) {
cpu_synchronize_state(env);
nr_cpus++;
}
ret = cpu_get_dump_info(&s->dump_info);
ret = cpu_get_dump_info(&s->dump_info, &s->guest_phys_blocks);
if (ret < 0) {
error_set(errp, QERR_UNSUPPORTED);
goto cleanup;
@@ -748,9 +768,9 @@ static int dump_init(DumpState *s, int fd, bool paging, bool has_filter,
/* get memory mapping */
memory_mapping_list_init(&s->list);
if (paging) {
qemu_get_guest_memory_mapping(&s->list);
qemu_get_guest_memory_mapping(&s->list, &s->guest_phys_blocks);
} else {
qemu_get_guest_simple_memory_mapping(&s->list);
qemu_get_guest_simple_memory_mapping(&s->list, &s->guest_phys_blocks);
}
if (s->has_filter) {
@@ -802,6 +822,8 @@ static int dump_init(DumpState *s, int fd, bool paging, bool has_filter,
return 0;
cleanup:
guest_phys_blocks_free(&s->guest_phys_blocks);
if (s->resume) {
vm_start();
}
@@ -849,7 +871,7 @@ void qmp_dump_guest_memory(bool paging, const char *file, bool has_begin,
return;
}
s = g_malloc(sizeof(DumpState));
s = g_malloc0(sizeof(DumpState));
ret = dump_init(s, fd, paging, has_begin, begin, length, errp);
if (ret < 0) {

31
exec.c
View File

@@ -75,6 +75,7 @@ CPUArchState *first_cpu;
/* current CPU in the current thread. It is only valid inside
cpu_exec() */
DEFINE_TLS(CPUArchState *,cpu_single_env);
DEFINE_TLS(CPUState *, current_cpu);
/* 0 = Do not count executed instructions.
1 = Precise instruction counting.
2 = Adaptive rate instruction counting. */
@@ -493,7 +494,7 @@ void cpu_reset_interrupt(CPUArchState *env, int mask)
void cpu_exit(CPUArchState *env)
{
env->exit_request = 1;
cpu_unlink_tb(env);
env->tcg_exit_req = 1;
}
void cpu_abort(CPUArchState *env, const char *fmt, ...)
@@ -1080,6 +1081,7 @@ ram_addr_t qemu_ram_alloc_from_ptr(ram_addr_t size, void *host,
qemu_ram_setup_dump(new_block->host, size);
qemu_madvise(new_block->host, size, QEMU_MADV_HUGEPAGE);
qemu_madvise(new_block->host, size, QEMU_MADV_DONTFORK);
if (kvm_enabled())
kvm_setup_guest_memory(new_block->host, size);
@@ -1164,7 +1166,7 @@ void qemu_ram_remap(ram_addr_t addr, ram_addr_t length)
QTAILQ_FOREACH(block, &ram_list.blocks, next) {
offset = addr - block->offset;
if (offset < block->length) {
vaddr = block->host + offset;
vaddr = ramblock_ptr(block, offset);
if (block->flags & RAM_PREALLOC_MASK) {
;
} else {
@@ -1255,7 +1257,7 @@ found:
xen_map_cache(block->offset, block->length, 1);
}
}
return block->host + (addr - block->offset);
return ramblock_ptr(block, addr - block->offset);
}
/* Return a host pointer to ram allocated with qemu_ram_alloc. Same as
@@ -1282,7 +1284,7 @@ static void *qemu_safe_ram_ptr(ram_addr_t addr)
xen_map_cache(block->offset, block->length, 1);
}
}
return block->host + (addr - block->offset);
return ramblock_ptr(block, addr - block->offset);
}
}
@@ -1308,7 +1310,7 @@ static void *qemu_ram_ptr_length(ram_addr_t addr, ram_addr_t *size)
if (addr - block->offset < block->length) {
if (addr - block->offset + *size > block->length)
*size = block->length - addr + block->offset;
return block->host + (addr - block->offset);
return ramblock_ptr(block, addr - block->offset);
}
}
@@ -1868,7 +1870,7 @@ void address_space_rw(AddressSpace *as, hwaddr addr, uint8_t *buf,
int len, bool is_write)
{
AddressSpaceDispatch *d = as->dispatch;
int l;
ram_addr_t l;
uint8_t *ptr;
uint32_t val;
hwaddr page;
@@ -1908,7 +1910,7 @@ void address_space_rw(AddressSpace *as, hwaddr addr, uint8_t *buf,
addr1 = memory_region_get_ram_addr(section->mr)
+ memory_region_section_addr(section, addr);
/* RAM case */
ptr = qemu_get_ram_ptr(addr1);
ptr = qemu_ram_ptr_length(addr1, &l);
memcpy(ptr, buf, l);
invalidate_and_set_dirty(addr1, l);
qemu_put_ram_ptr(ptr);
@@ -1937,9 +1939,10 @@ void address_space_rw(AddressSpace *as, hwaddr addr, uint8_t *buf,
}
} else {
/* RAM case */
ptr = qemu_get_ram_ptr(section->mr->ram_addr
+ memory_region_section_addr(section,
addr));
ptr = qemu_ram_ptr_length(section->mr->ram_addr +
memory_region_section_addr(section,
addr),
&l);
memcpy(buf, ptr, l);
qemu_put_ram_ptr(ptr);
}
@@ -2584,14 +2587,12 @@ int cpu_memory_rw_debug(CPUArchState *env, target_ulong addr,
}
#endif
#if !defined(CONFIG_USER_ONLY)
/*
* A helper function for the _utterly broken_ virtio device model to find out if
* it's running on a big endian machine. Don't do this at home kids!
*/
bool virtio_is_big_endian(void);
bool virtio_is_big_endian(void)
bool target_words_bigendian(void);
bool target_words_bigendian(void)
{
#if defined(TARGET_WORDS_BIGENDIAN)
return true;
@@ -2600,8 +2601,6 @@ bool virtio_is_big_endian(void)
#endif
}
#endif
#ifndef CONFIG_USER_ONLY
bool cpu_physical_memory_is_io(hwaddr phys_addr)
{

View File

@@ -1,5 +1,5 @@
/*
* Virtio 9p backend
* 9p backend
*
* Copyright IBM, Corp. 2010
*
@@ -23,40 +23,9 @@
#include <errno.h>
#include "qemu/compiler.h"
#include "virtio-9p-marshal.h"
#include "9p-iov-marshal.h"
#include "qemu/bswap.h"
void v9fs_string_free(V9fsString *str)
{
g_free(str->data);
str->data = NULL;
str->size = 0;
}
void v9fs_string_null(V9fsString *str)
{
v9fs_string_free(str);
}
void GCC_FMT_ATTR(2, 3)
v9fs_string_sprintf(V9fsString *str, const char *fmt, ...)
{
va_list ap;
v9fs_string_free(str);
va_start(ap, fmt);
str->size = g_vasprintf(&str->data, fmt, ap);
va_end(ap);
}
void v9fs_string_copy(V9fsString *lhs, V9fsString *rhs)
{
v9fs_string_free(lhs);
v9fs_string_sprintf(lhs, "%s", rhs->data);
}
static ssize_t v9fs_packunpack(void *addr, struct iovec *sg, int sg_count,
size_t offset, size_t size, int pack)
{
@@ -108,15 +77,13 @@ ssize_t v9fs_pack(struct iovec *in_sg, int in_num, size_t offset,
return v9fs_packunpack((void *)src, in_sg, in_num, offset, size, 1);
}
ssize_t v9fs_unmarshal(struct iovec *out_sg, int out_num, size_t offset,
int bswap, const char *fmt, ...)
ssize_t v9fs_iov_vunmarshal(struct iovec *out_sg, int out_num, size_t offset,
int bswap, const char *fmt, va_list ap)
{
int i;
va_list ap;
ssize_t copied = 0;
size_t old_offset = offset;
va_start(ap, fmt);
for (i = 0; fmt[i]; i++) {
switch (fmt[i]) {
case 'b': {
@@ -159,14 +126,14 @@ ssize_t v9fs_unmarshal(struct iovec *out_sg, int out_num, size_t offset,
}
case 's': {
V9fsString *str = va_arg(ap, V9fsString *);
copied = v9fs_unmarshal(out_sg, out_num, offset, bswap,
"w", &str->size);
copied = v9fs_iov_unmarshal(out_sg, out_num, offset, bswap,
"w", &str->size);
if (copied > 0) {
offset += copied;
str->data = g_malloc(str->size + 1);
copied = v9fs_unpack(str->data, out_sg, out_num, offset,
str->size);
if (copied > 0) {
if (copied >= 0) {
str->data[str->size] = 0;
} else {
v9fs_string_free(str);
@@ -176,56 +143,70 @@ ssize_t v9fs_unmarshal(struct iovec *out_sg, int out_num, size_t offset,
}
case 'Q': {
V9fsQID *qidp = va_arg(ap, V9fsQID *);
copied = v9fs_unmarshal(out_sg, out_num, offset, bswap, "bdq",
&qidp->type, &qidp->version, &qidp->path);
copied = v9fs_iov_unmarshal(out_sg, out_num, offset, bswap,
"bdq", &qidp->type, &qidp->version,
&qidp->path);
break;
}
case 'S': {
V9fsStat *statp = va_arg(ap, V9fsStat *);
copied = v9fs_unmarshal(out_sg, out_num, offset, bswap,
"wwdQdddqsssssddd",
&statp->size, &statp->type, &statp->dev,
&statp->qid, &statp->mode, &statp->atime,
&statp->mtime, &statp->length,
&statp->name, &statp->uid, &statp->gid,
&statp->muid, &statp->extension,
&statp->n_uid, &statp->n_gid,
&statp->n_muid);
copied = v9fs_iov_unmarshal(out_sg, out_num, offset, bswap,
"wwdQdddqsssssddd",
&statp->size, &statp->type,
&statp->dev, &statp->qid,
&statp->mode, &statp->atime,
&statp->mtime, &statp->length,
&statp->name, &statp->uid,
&statp->gid, &statp->muid,
&statp->extension,
&statp->n_uid, &statp->n_gid,
&statp->n_muid);
break;
}
case 'I': {
V9fsIattr *iattr = va_arg(ap, V9fsIattr *);
copied = v9fs_unmarshal(out_sg, out_num, offset, bswap,
"ddddqqqqq",
&iattr->valid, &iattr->mode,
&iattr->uid, &iattr->gid, &iattr->size,
&iattr->atime_sec, &iattr->atime_nsec,
&iattr->mtime_sec, &iattr->mtime_nsec);
copied = v9fs_iov_unmarshal(out_sg, out_num, offset, bswap,
"ddddqqqqq",
&iattr->valid, &iattr->mode,
&iattr->uid, &iattr->gid,
&iattr->size, &iattr->atime_sec,
&iattr->atime_nsec,
&iattr->mtime_sec,
&iattr->mtime_nsec);
break;
}
default:
break;
}
if (copied < 0) {
va_end(ap);
return copied;
}
offset += copied;
}
va_end(ap);
return offset - old_offset;
}
ssize_t v9fs_marshal(struct iovec *in_sg, int in_num, size_t offset,
int bswap, const char *fmt, ...)
ssize_t v9fs_iov_unmarshal(struct iovec *out_sg, int out_num, size_t offset,
int bswap, const char *fmt, ...)
{
ssize_t ret;
va_list ap;
va_start(ap, fmt);
ret = v9fs_iov_vunmarshal(out_sg, out_num, offset, bswap, fmt, ap);
va_end(ap);
return ret;
}
ssize_t v9fs_iov_vmarshal(struct iovec *in_sg, int in_num, size_t offset,
int bswap, const char *fmt, va_list ap)
{
int i;
va_list ap;
ssize_t copied = 0;
size_t old_offset = offset;
va_start(ap, fmt);
for (i = 0; fmt[i]; i++) {
switch (fmt[i]) {
case 'b': {
@@ -265,8 +246,8 @@ ssize_t v9fs_marshal(struct iovec *in_sg, int in_num, size_t offset,
}
case 's': {
V9fsString *str = va_arg(ap, V9fsString *);
copied = v9fs_marshal(in_sg, in_num, offset, bswap,
"w", str->size);
copied = v9fs_iov_marshal(in_sg, in_num, offset, bswap,
"w", str->size);
if (copied > 0) {
offset += copied;
copied = v9fs_pack(in_sg, in_num, offset, str->data, str->size);
@@ -275,49 +256,65 @@ ssize_t v9fs_marshal(struct iovec *in_sg, int in_num, size_t offset,
}
case 'Q': {
V9fsQID *qidp = va_arg(ap, V9fsQID *);
copied = v9fs_marshal(in_sg, in_num, offset, bswap, "bdq",
qidp->type, qidp->version, qidp->path);
copied = v9fs_iov_marshal(in_sg, in_num, offset, bswap, "bdq",
qidp->type, qidp->version,
qidp->path);
break;
}
case 'S': {
V9fsStat *statp = va_arg(ap, V9fsStat *);
copied = v9fs_marshal(in_sg, in_num, offset, bswap,
"wwdQdddqsssssddd",
statp->size, statp->type, statp->dev,
&statp->qid, statp->mode, statp->atime,
statp->mtime, statp->length, &statp->name,
&statp->uid, &statp->gid, &statp->muid,
&statp->extension, statp->n_uid,
statp->n_gid, statp->n_muid);
copied = v9fs_iov_marshal(in_sg, in_num, offset, bswap,
"wwdQdddqsssssddd",
statp->size, statp->type, statp->dev,
&statp->qid, statp->mode, statp->atime,
statp->mtime, statp->length,
&statp->name,
&statp->uid, &statp->gid, &statp->muid,
&statp->extension, statp->n_uid,
statp->n_gid, statp->n_muid);
break;
}
case 'A': {
V9fsStatDotl *statp = va_arg(ap, V9fsStatDotl *);
copied = v9fs_marshal(in_sg, in_num, offset, bswap,
"qQdddqqqqqqqqqqqqqqq",
statp->st_result_mask,
&statp->qid, statp->st_mode,
statp->st_uid, statp->st_gid,
statp->st_nlink, statp->st_rdev,
statp->st_size, statp->st_blksize,
statp->st_blocks, statp->st_atime_sec,
statp->st_atime_nsec, statp->st_mtime_sec,
statp->st_mtime_nsec, statp->st_ctime_sec,
statp->st_ctime_nsec, statp->st_btime_sec,
statp->st_btime_nsec, statp->st_gen,
statp->st_data_version);
copied = v9fs_iov_marshal(in_sg, in_num, offset, bswap,
"qQdddqqqqqqqqqqqqqqq",
statp->st_result_mask,
&statp->qid, statp->st_mode,
statp->st_uid, statp->st_gid,
statp->st_nlink, statp->st_rdev,
statp->st_size, statp->st_blksize,
statp->st_blocks, statp->st_atime_sec,
statp->st_atime_nsec,
statp->st_mtime_sec,
statp->st_mtime_nsec,
statp->st_ctime_sec,
statp->st_ctime_nsec,
statp->st_btime_sec,
statp->st_btime_nsec, statp->st_gen,
statp->st_data_version);
break;
}
default:
break;
}
if (copied < 0) {
va_end(ap);
return copied;
}
offset += copied;
}
va_end(ap);
return offset - old_offset;
}
ssize_t v9fs_iov_marshal(struct iovec *in_sg, int in_num, size_t offset,
int bswap, const char *fmt, ...)
{
ssize_t ret;
va_list ap;
va_start(ap, fmt);
ret = v9fs_iov_vmarshal(in_sg, in_num, offset, bswap, fmt, ap);
va_end(ap);
return ret;
}

18
fsdev/9p-iov-marshal.h Normal file
View File

@@ -0,0 +1,18 @@
#ifndef _QEMU_9P_IOV_MARSHAL_H
#define _QEMU_9P_IOV_MARSHAL_H
#include "9p-marshal.h"
ssize_t v9fs_pack(struct iovec *in_sg, int in_num, size_t offset,
const void *src, size_t size);
ssize_t v9fs_iov_unmarshal(struct iovec *out_sg, int out_num, size_t offset,
int bswap, const char *fmt, ...);
ssize_t v9fs_iov_marshal(struct iovec *in_sg, int in_num, size_t offset,
int bswap, const char *fmt, ...);
ssize_t v9fs_iov_vunmarshal(struct iovec *out_sg, int out_num, size_t offset,
int bswap, const char *fmt, va_list ap);
ssize_t v9fs_iov_vmarshal(struct iovec *in_sg, int in_num, size_t offset,
int bswap, const char *fmt, va_list ap);
#endif

56
fsdev/9p-marshal.c Normal file
View File

@@ -0,0 +1,56 @@
/*
* 9p backend
*
* Copyright IBM, Corp. 2010
*
* Authors:
* Anthony Liguori <aliguori@us.ibm.com>
*
* This work is licensed under the terms of the GNU GPL, version 2. See
* the COPYING file in the top-level directory.
*
*/
#include <glib.h>
#include <glib/gprintf.h>
#include <sys/types.h>
#include <dirent.h>
#include <sys/time.h>
#include <utime.h>
#include <sys/uio.h>
#include <string.h>
#include <stdint.h>
#include <errno.h>
#include "qemu/compiler.h"
#include "9p-marshal.h"
void v9fs_string_free(V9fsString *str)
{
g_free(str->data);
str->data = NULL;
str->size = 0;
}
void v9fs_string_null(V9fsString *str)
{
v9fs_string_free(str);
}
void GCC_FMT_ATTR(2, 3)
v9fs_string_sprintf(V9fsString *str, const char *fmt, ...)
{
va_list ap;
v9fs_string_free(str);
va_start(ap, fmt);
str->size = g_vasprintf(&str->data, fmt, ap);
va_end(ap);
}
void v9fs_string_copy(V9fsString *lhs, V9fsString *rhs)
{
v9fs_string_free(lhs);
v9fs_string_sprintf(lhs, "%s", rhs->data);
}

View File

@@ -1,5 +1,5 @@
#ifndef _QEMU_VIRTIO_9P_MARSHAL_H
#define _QEMU_VIRTIO_9P_MARSHAL_H
#ifndef _QEMU_9P_MARSHAL_H
#define _QEMU_9P_MARSHAL_H
typedef struct V9fsString
{
@@ -30,7 +30,7 @@ typedef struct V9fsStat
V9fsString muid;
/* 9p2000.u */
V9fsString extension;
int32_t n_uid;
int32_t n_uid;
int32_t n_gid;
int32_t n_muid;
} V9fsStat;
@@ -81,10 +81,4 @@ extern void v9fs_string_null(V9fsString *str);
extern void v9fs_string_sprintf(V9fsString *str, const char *fmt, ...);
extern void v9fs_string_copy(V9fsString *lhs, V9fsString *rhs);
ssize_t v9fs_pack(struct iovec *in_sg, int in_num, size_t offset,
const void *src, size_t size);
ssize_t v9fs_unmarshal(struct iovec *out_sg, int out_num, size_t offset,
int bswap, const char *fmt, ...);
ssize_t v9fs_marshal(struct iovec *in_sg, int in_num, size_t offset,
int bswap, const char *fmt, ...);
#endif

View File

@@ -1,5 +1,5 @@
ifeq ($(CONFIG_REALLY_VIRTFS),y)
common-obj-y = qemu-fsdev.o virtio-9p-marshal.o
common-obj-y = qemu-fsdev.o 9p-marshal.o 9p-iov-marshal.o
else
common-obj-y = qemu-fsdev-dummy.o
endif

View File

@@ -102,6 +102,7 @@ struct FileOperations
{
int (*parse_opts)(QemuOpts *, struct FsDriverEntry *);
int (*init)(struct FsContext *);
void (*cleanup)(struct FsContext *);
int (*lstat)(FsContext *, V9fsPath *, struct stat *);
ssize_t (*readlink)(FsContext *, V9fsPath *, char *, size_t);
int (*chmod)(FsContext *, V9fsPath *, FsCred *);

View File

@@ -9,6 +9,10 @@
* the COPYING file in the top-level directory.
*/
/* work around a broken sys/capability.h */
#if defined(__i386__)
typedef unsigned long long __u64;
#endif
#include <sys/resource.h>
#include <getopt.h>
#include <syslog.h>
@@ -23,9 +27,9 @@
#include "qemu-common.h"
#include "qemu/sockets.h"
#include "qemu/xattr.h"
#include "virtio-9p-marshal.h"
#include "hw/9pfs/virtio-9p-proxy.h"
#include "fsdev/virtio-9p-marshal.h"
#include "9p-iov-marshal.h"
#include "hw/9pfs/9p-proxy.h"
#include "fsdev/9p-iov-marshal.h"
#define PROGNAME "virtfs-proxy-helper"

View File

@@ -2926,14 +2926,14 @@ void gdbserver_fork(CPUArchState *env)
cpu_watchpoint_remove_all(env, BP_GDB);
}
#else
static int gdb_chr_can_receive(void *opaque)
static size_t gdb_chr_can_receive(void *opaque)
{
/* We can handle an arbitrarily large amount of data.
Pick the maximum packet size, which is as good as anything. */
return MAX_PACKET_LENGTH;
}
static void gdb_chr_receive(void *opaque, const uint8_t *buf, int size)
static void gdb_chr_receive(void *opaque, const uint8_t *buf, size_t size)
{
int i;
@@ -2965,7 +2965,7 @@ static void gdb_monitor_output(GDBState *s, const char *msg, int len)
put_packet(s, buf);
}
static int gdb_monitor_write(CharDriverState *chr, const uint8_t *buf, int len)
static size_t gdb_monitor_write(CharDriverState *chr, const uint8_t *buf, size_t len)
{
const char *p = (const char *)buf;
int max_sz;

20
hmp.c
View File

@@ -173,6 +173,8 @@ void hmp_info_migrate(Monitor *mon, const QDict *qdict)
info->ram->total >> 10);
monitor_printf(mon, "duplicate: %" PRIu64 " pages\n",
info->ram->duplicate);
monitor_printf(mon, "skipped: %" PRIu64 " pages\n",
info->ram->skipped);
monitor_printf(mon, "normal: %" PRIu64 " pages\n",
info->ram->normal);
monitor_printf(mon, "normal bytes: %" PRIu64 " kbytes\n",
@@ -1211,21 +1213,18 @@ void hmp_send_key(Monitor *mon, const QDict *qdict)
int has_hold_time = qdict_haskey(qdict, "hold-time");
int hold_time = qdict_get_try_int(qdict, "hold-time", -1);
Error *err = NULL;
char keyname_buf[16];
char *separator;
int keyname_len;
while (1) {
separator = strchr(keys, '-');
keyname_len = separator ? separator - keys : strlen(keys);
pstrcpy(keyname_buf, sizeof(keyname_buf), keys);
/* Be compatible with old interface, convert user inputted "<" */
if (!strncmp(keyname_buf, "<", 1) && keyname_len == 1) {
pstrcpy(keyname_buf, sizeof(keyname_buf), "less");
if (keys[0] == '<' && keyname_len == 1) {
keys = "less";
keyname_len = 4;
}
keyname_buf[keyname_len] = 0;
keylist = g_malloc0(sizeof(*keylist));
keylist->value = g_malloc0(sizeof(*keylist->value));
@@ -1238,16 +1237,17 @@ void hmp_send_key(Monitor *mon, const QDict *qdict)
}
tmp = keylist;
if (strstart(keyname_buf, "0x", NULL)) {
if (strstart(keys, "0x", NULL)) {
char *endp;
int value = strtoul(keyname_buf, &endp, 0);
if (*endp != '\0') {
int value = strtoul(keys, &endp, 0);
assert(endp <= keys + keyname_len);
if (endp != keys + keyname_len) {
goto err_out;
}
keylist->value->kind = KEY_VALUE_KIND_NUMBER;
keylist->value->number = value;
} else {
int idx = index_from_key(keyname_buf);
int idx = index_from_key(keys, keyname_len);
if (idx == Q_KEY_CODE_MAX) {
goto err_out;
}
@@ -1269,7 +1269,7 @@ out:
return;
err_out:
monitor_printf(mon, "invalid parameter: %s\n", keyname_buf);
monitor_printf(mon, "invalid parameter: %.*s\n", keyname_len, keys);
goto out;
}

24
hw/9p.h
View File

@@ -1,24 +0,0 @@
/*
* Virtio 9p
*
* Copyright IBM, Corp. 2010
*
* Authors:
* Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
*
* This work is licensed under the terms of the GNU GPL, version 2. See
* the COPYING file in the top-level directory.
*
*/
#ifndef QEMU_9P_H
#define QEMU_9P_H
typedef struct V9fsConf
{
/* tag name for the device */
char *tag;
char *fsdev_id;
} V9fsConf;
#endif

View File

@@ -1,5 +1,5 @@
/*
* Virtio 9p handle callback
* 9p handle callback
*
* Copyright IBM, Corp. 2011
*
@@ -11,9 +11,8 @@
*
*/
#include "hw/virtio.h"
#include "virtio-9p.h"
#include "virtio-9p-xattr.h"
#include "9p.h"
#include "9p-xattr.h"
#include <arpa/inet.h>
#include <pwd.h>
#include <grp.h>
@@ -111,7 +110,7 @@ static int handle_close(FsContext *ctx, V9fsFidOpenState *fs)
static int handle_closedir(FsContext *ctx, V9fsFidOpenState *fs)
{
return closedir(fs->dir);
return closedir(fs->dir.stream);
}
static int handle_open(FsContext *ctx, V9fsPath *fs_path,
@@ -131,8 +130,8 @@ static int handle_opendir(FsContext *ctx,
if (ret < 0) {
return -1;
}
fs->dir = fdopendir(ret);
if (!fs->dir) {
fs->dir.stream = fdopendir(ret);
if (!fs->dir.stream) {
return -1;
}
return 0;
@@ -140,24 +139,24 @@ static int handle_opendir(FsContext *ctx,
static void handle_rewinddir(FsContext *ctx, V9fsFidOpenState *fs)
{
return rewinddir(fs->dir);
rewinddir(fs->dir.stream);
}
static off_t handle_telldir(FsContext *ctx, V9fsFidOpenState *fs)
{
return telldir(fs->dir);
return telldir(fs->dir.stream);
}
static int handle_readdir_r(FsContext *ctx, V9fsFidOpenState *fs,
struct dirent *entry,
struct dirent **result)
{
return readdir_r(fs->dir, entry, result);
return readdir_r(fs->dir.stream, entry, result);
}
static void handle_seekdir(FsContext *ctx, V9fsFidOpenState *fs, off_t off)
{
return seekdir(fs->dir, off);
seekdir(fs->dir.stream, off);
}
static ssize_t handle_preadv(FsContext *ctx, V9fsFidOpenState *fs,
@@ -261,7 +260,7 @@ static int handle_fstat(FsContext *fs_ctx, int fid_type,
int fd;
if (fid_type == P9_FID_DIR) {
fd = dirfd(fs->dir);
fd = dirfd(fs->dir.stream);
} else {
fd = fs->fd;
}
@@ -408,7 +407,7 @@ static int handle_fsync(FsContext *ctx, int fid_type,
int fd;
if (fid_type == P9_FID_DIR) {
fd = dirfd(fs->dir);
fd = dirfd(fs->dir.stream);
} else {
fd = fs->fd;
}

1424
hw/9pfs/9p-local.c Normal file

File diff suppressed because it is too large Load Diff

20
hw/9pfs/9p-local.h Normal file
View File

@@ -0,0 +1,20 @@
/*
* 9p local backend utilities
*
* Copyright IBM, Corp. 2017
*
* Authors:
* Greg Kurz <groug@kaod.org>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#ifndef QEMU_9P_LOCAL_H
#define QEMU_9P_LOCAL_H
int local_open_nofollow(FsContext *fs_ctx, const char *path, int flags,
mode_t mode);
int local_opendir_nofollow(FsContext *fs_ctx, const char *path);
#endif

View File

@@ -1,5 +1,5 @@
/*
* Virtio 9p system.posix* xattr callback
* 9p system.posix* xattr callback
*
* Copyright IBM, Corp. 2010
*
@@ -13,10 +13,9 @@
#include <sys/types.h>
#include "qemu/xattr.h"
#include "hw/virtio.h"
#include "virtio-9p.h"
#include "9p.h"
#include "fsdev/file-op-9p.h"
#include "virtio-9p-xattr.h"
#include "9p-xattr.h"
#define MAP_ACL_ACCESS "user.virtfs.system.posix_acl_access"
#define MAP_ACL_DEFAULT "user.virtfs.system.posix_acl_default"
@@ -26,8 +25,7 @@
static ssize_t mp_pacl_getxattr(FsContext *ctx, const char *path,
const char *name, void *value, size_t size)
{
char buffer[PATH_MAX];
return lgetxattr(rpath(ctx, path, buffer), MAP_ACL_ACCESS, value, size);
return local_getxattr_nofollow(ctx, path, MAP_ACL_ACCESS, value, size);
}
static ssize_t mp_pacl_listxattr(FsContext *ctx, const char *path,
@@ -52,17 +50,16 @@ static ssize_t mp_pacl_listxattr(FsContext *ctx, const char *path,
static int mp_pacl_setxattr(FsContext *ctx, const char *path, const char *name,
void *value, size_t size, int flags)
{
char buffer[PATH_MAX];
return lsetxattr(rpath(ctx, path, buffer), MAP_ACL_ACCESS, value,
size, flags);
return local_setxattr_nofollow(ctx, path, MAP_ACL_ACCESS, value, size,
flags);
}
static int mp_pacl_removexattr(FsContext *ctx,
const char *path, const char *name)
{
int ret;
char buffer[PATH_MAX];
ret = lremovexattr(rpath(ctx, path, buffer), MAP_ACL_ACCESS);
ret = local_removexattr_nofollow(ctx, path, MAP_ACL_ACCESS);
if (ret == -1 && errno == ENODATA) {
/*
* We don't get ENODATA error when trying to remove a
@@ -78,8 +75,7 @@ static int mp_pacl_removexattr(FsContext *ctx,
static ssize_t mp_dacl_getxattr(FsContext *ctx, const char *path,
const char *name, void *value, size_t size)
{
char buffer[PATH_MAX];
return lgetxattr(rpath(ctx, path, buffer), MAP_ACL_DEFAULT, value, size);
return local_getxattr_nofollow(ctx, path, MAP_ACL_DEFAULT, value, size);
}
static ssize_t mp_dacl_listxattr(FsContext *ctx, const char *path,
@@ -104,17 +100,16 @@ static ssize_t mp_dacl_listxattr(FsContext *ctx, const char *path,
static int mp_dacl_setxattr(FsContext *ctx, const char *path, const char *name,
void *value, size_t size, int flags)
{
char buffer[PATH_MAX];
return lsetxattr(rpath(ctx, path, buffer), MAP_ACL_DEFAULT, value,
size, flags);
return local_setxattr_nofollow(ctx, path, MAP_ACL_DEFAULT, value, size,
flags);
}
static int mp_dacl_removexattr(FsContext *ctx,
const char *path, const char *name)
{
int ret;
char buffer[PATH_MAX];
ret = lremovexattr(rpath(ctx, path, buffer), MAP_ACL_DEFAULT);
ret = local_removexattr_nofollow(ctx, path, MAP_ACL_DEFAULT);
if (ret == -1 && errno == ENODATA) {
/*
* We don't get ENODATA error when trying to remove a

View File

@@ -1,5 +1,5 @@
/*
* Virtio 9p Proxy callback
* 9p Proxy callback
*
* Copyright IBM, Corp. 2011
*
@@ -11,10 +11,9 @@
*/
#include <sys/socket.h>
#include <sys/un.h>
#include "hw/virtio.h"
#include "virtio-9p.h"
#include "9p.h"
#include "fsdev/qemu-fsdev.h"
#include "virtio-9p-proxy.h"
#include "9p-proxy.h"
typedef struct V9fsProxy {
int sockfd;
@@ -631,7 +630,7 @@ static int proxy_close(FsContext *ctx, V9fsFidOpenState *fs)
static int proxy_closedir(FsContext *ctx, V9fsFidOpenState *fs)
{
return closedir(fs->dir);
return closedir(fs->dir.stream);
}
static int proxy_open(FsContext *ctx, V9fsPath *fs_path,
@@ -650,14 +649,14 @@ static int proxy_opendir(FsContext *ctx,
{
int serrno, fd;
fs->dir = NULL;
fs->dir.stream = NULL;
fd = v9fs_request(ctx->private, T_OPEN, NULL, "sd", fs_path, O_DIRECTORY);
if (fd < 0) {
errno = -fd;
return -1;
}
fs->dir = fdopendir(fd);
if (!fs->dir) {
fs->dir.stream = fdopendir(fd);
if (!fs->dir.stream) {
serrno = errno;
close(fd);
errno = serrno;
@@ -668,24 +667,24 @@ static int proxy_opendir(FsContext *ctx,
static void proxy_rewinddir(FsContext *ctx, V9fsFidOpenState *fs)
{
return rewinddir(fs->dir);
rewinddir(fs->dir.stream);
}
static off_t proxy_telldir(FsContext *ctx, V9fsFidOpenState *fs)
{
return telldir(fs->dir);
return telldir(fs->dir.stream);
}
static int proxy_readdir_r(FsContext *ctx, V9fsFidOpenState *fs,
struct dirent *entry,
struct dirent **result)
{
return readdir_r(fs->dir, entry, result);
return readdir_r(fs->dir.stream, entry, result);
}
static void proxy_seekdir(FsContext *ctx, V9fsFidOpenState *fs, off_t off)
{
return seekdir(fs->dir, off);
seekdir(fs->dir.stream, off);
}
static ssize_t proxy_preadv(FsContext *ctx, V9fsFidOpenState *fs,
@@ -791,7 +790,7 @@ static int proxy_fstat(FsContext *fs_ctx, int fid_type,
int fd;
if (fid_type == P9_FID_DIR) {
fd = dirfd(fs->dir);
fd = dirfd(fs->dir.stream);
} else {
fd = fs->fd;
}
@@ -936,7 +935,7 @@ static int proxy_fsync(FsContext *ctx, int fid_type,
int fd;
if (fid_type == P9_FID_DIR) {
fd = dirfd(fs->dir);
fd = dirfd(fs->dir.stream);
} else {
fd = fs->fd;
}
@@ -1033,13 +1032,10 @@ static int proxy_name_to_path(FsContext *ctx, V9fsPath *dir_path,
const char *name, V9fsPath *target)
{
if (dir_path) {
v9fs_string_sprintf((V9fsString *)target, "%s/%s",
dir_path->data, name);
v9fs_path_sprintf(target, "%s/%s", dir_path->data, name);
} else {
v9fs_string_sprintf((V9fsString *)target, "%s", name);
v9fs_path_sprintf(target, "%s", name);
}
/* Bump the size for including terminating NULL */
target->size++;
return 0;
}

View File

@@ -1,5 +1,5 @@
/*
* Virtio 9p Proxy callback
* 9p Proxy callback
*
* Copyright IBM, Corp. 2011
*
@@ -9,8 +9,8 @@
* This work is licensed under the terms of the GNU GPL, version 2. See
* the COPYING file in the top-level directory.
*/
#ifndef _QEMU_VIRTIO_9P_PROXY_H
#define _QEMU_VIRTIO_9P_PROXY_H
#ifndef _QEMU_9P_PROXY_H
#define _QEMU_9P_PROXY_H
#define PROXY_MAX_IO_SZ (64 * 1024)
#define V9FS_FD_VALID INT_MAX
@@ -20,9 +20,9 @@
* marsha/unmarshal doesn't do little endian conversion.
*/
#define proxy_unmarshal(in_sg, offset, fmt, args...) \
v9fs_unmarshal(in_sg, 1, offset, 0, fmt, ##args)
v9fs_iov_unmarshal(in_sg, 1, offset, 0, fmt, ##args)
#define proxy_marshal(out_sg, offset, fmt, args...) \
v9fs_marshal(out_sg, 1, offset, 0, fmt, ##args)
v9fs_iov_marshal(out_sg, 1, offset, 0, fmt, ##args)
union MsgControl {
struct cmsghdr cmsg;

77
hw/9pfs/9p-util.c Normal file
View File

@@ -0,0 +1,77 @@
/*
* 9p utilities
*
* Copyright IBM, Corp. 2017
*
* Authors:
* Greg Kurz <groug@kaod.org>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include <sys/types.h>
#include <string.h>
#include <unistd.h>
#include <errno.h>
#include <fcntl.h>
#include <sys/stat.h>
#include <assert.h>
#include "glib.h"
#include "qemu/osdep.h"
#include "qemu/xattr.h"
#include "9p-util.h"
int relative_openat_nofollow(int dirfd, const char *path, int flags,
mode_t mode)
{
int fd;
fd = dup(dirfd);
if (fd == -1) {
return -1;
}
while (*path) {
const char *c;
int next_fd;
char *head;
/* Only relative paths without consecutive slashes */
assert(path[0] != '/');
head = g_strdup(path);
c = strchr(path, '/');
if (c) {
head[c - path] = 0;
next_fd = openat_dir(fd, head);
} else {
next_fd = openat_file(fd, head, flags, mode);
}
g_free(head);
if (next_fd == -1) {
close_preserve_errno(fd);
return -1;
}
close(fd);
fd = next_fd;
if (!c) {
break;
}
path = c + 1;
}
return fd;
}
ssize_t fgetxattrat_nofollow(int dirfd, const char *filename, const char *name,
void *value, size_t size)
{
char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
int ret;
ret = lgetxattr(proc_path, name, value, size);
g_free(proc_path);
return ret;
}

60
hw/9pfs/9p-util.h Normal file
View File

@@ -0,0 +1,60 @@
/*
* 9p utilities
*
* Copyright IBM, Corp. 2017
*
* Authors:
* Greg Kurz <groug@kaod.org>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#ifndef QEMU_9P_UTIL_H
#define QEMU_9P_UTIL_H
static inline void close_preserve_errno(int fd)
{
int serrno = errno;
close(fd);
errno = serrno;
}
static inline int openat_dir(int dirfd, const char *name)
{
#ifdef O_PATH
#define OPENAT_DIR_O_PATH O_PATH
#else
#define OPENAT_DIR_O_PATH 0
#endif
return openat(dirfd, name,
O_DIRECTORY | O_RDONLY | O_NOFOLLOW | OPENAT_DIR_O_PATH);
}
static inline int openat_file(int dirfd, const char *name, int flags,
mode_t mode)
{
int fd, serrno, ret;
fd = openat(dirfd, name, flags | O_NOFOLLOW | O_NOCTTY | O_NONBLOCK,
mode);
if (fd == -1) {
return -1;
}
serrno = errno;
/* O_NONBLOCK was only needed to open the file. Let's drop it. */
ret = fcntl(fd, F_SETFL, flags);
assert(!ret);
errno = serrno;
return fd;
}
int relative_openat_nofollow(int dirfd, const char *path, int flags,
mode_t mode);
ssize_t fgetxattrat_nofollow(int dirfd, const char *path, const char *name,
void *value, size_t size);
int fsetxattrat_nofollow(int dirfd, const char *path, const char *name,
void *value, size_t size, int flags);
#endif

View File

@@ -1,5 +1,5 @@
/*
* Virtio 9p user. xattr callback
* 9p user. xattr callback
*
* Copyright IBM, Corp. 2010
*
@@ -12,16 +12,14 @@
*/
#include <sys/types.h>
#include "hw/virtio.h"
#include "virtio-9p.h"
#include "9p.h"
#include "fsdev/file-op-9p.h"
#include "virtio-9p-xattr.h"
#include "9p-xattr.h"
static ssize_t mp_user_getxattr(FsContext *ctx, const char *path,
const char *name, void *value, size_t size)
{
char buffer[PATH_MAX];
if (strncmp(name, "user.virtfs.", 12) == 0) {
/*
* Don't allow fetch of user.virtfs namesapce
@@ -30,7 +28,7 @@ static ssize_t mp_user_getxattr(FsContext *ctx, const char *path,
errno = ENOATTR;
return -1;
}
return lgetxattr(rpath(ctx, path, buffer), name, value, size);
return local_getxattr_nofollow(ctx, path, name, value, size);
}
static ssize_t mp_user_listxattr(FsContext *ctx, const char *path,
@@ -69,7 +67,6 @@ static ssize_t mp_user_listxattr(FsContext *ctx, const char *path,
static int mp_user_setxattr(FsContext *ctx, const char *path, const char *name,
void *value, size_t size, int flags)
{
char buffer[PATH_MAX];
if (strncmp(name, "user.virtfs.", 12) == 0) {
/*
* Don't allow fetch of user.virtfs namesapce
@@ -78,13 +75,12 @@ static int mp_user_setxattr(FsContext *ctx, const char *path, const char *name,
errno = EACCES;
return -1;
}
return lsetxattr(rpath(ctx, path, buffer), name, value, size, flags);
return local_setxattr_nofollow(ctx, path, name, value, size, flags);
}
static int mp_user_removexattr(FsContext *ctx,
const char *path, const char *name)
{
char buffer[PATH_MAX];
if (strncmp(name, "user.virtfs.", 12) == 0) {
/*
* Don't allow fetch of user.virtfs namesapce
@@ -93,7 +89,7 @@ static int mp_user_removexattr(FsContext *ctx,
errno = EACCES;
return -1;
}
return lremovexattr(rpath(ctx, path, buffer), name);
return local_removexattr_nofollow(ctx, path, name);
}
XattrOperations mapped_user_xattr = {

318
hw/9pfs/9p-xattr.c Normal file
View File

@@ -0,0 +1,318 @@
/*
* 9p xattr callback
*
* Copyright IBM, Corp. 2010
*
* Authors:
* Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
*
* This work is licensed under the terms of the GNU GPL, version 2. See
* the COPYING file in the top-level directory.
*
*/
#include "9p.h"
#include "fsdev/file-op-9p.h"
#include "9p-xattr.h"
#include "9p-util.h"
#include "9p-local.h"
static XattrOperations *get_xattr_operations(XattrOperations **h,
const char *name)
{
XattrOperations *xops;
for (xops = *(h)++; xops != NULL; xops = *(h)++) {
if (!strncmp(name, xops->name, strlen(xops->name))) {
return xops;
}
}
return NULL;
}
ssize_t v9fs_get_xattr(FsContext *ctx, const char *path,
const char *name, void *value, size_t size)
{
XattrOperations *xops = get_xattr_operations(ctx->xops, name);
if (xops) {
return xops->getxattr(ctx, path, name, value, size);
}
errno = -EOPNOTSUPP;
return -1;
}
ssize_t pt_listxattr(FsContext *ctx, const char *path,
char *name, void *value, size_t size)
{
int name_size = strlen(name) + 1;
if (!value) {
return name_size;
}
if (size < name_size) {
errno = ERANGE;
return -1;
}
/* no need for strncpy: name_size is strlen(name)+1 */
memcpy(value, name, name_size);
return name_size;
}
static ssize_t flistxattrat_nofollow(int dirfd, const char *filename,
char *list, size_t size)
{
char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
int ret;
ret = llistxattr(proc_path, list, size);
g_free(proc_path);
return ret;
}
/*
* Get the list and pass to each layer to find out whether
* to send the data or not
*/
ssize_t v9fs_list_xattr(FsContext *ctx, const char *path,
void *value, size_t vsize)
{
ssize_t size = 0;
void *ovalue = value;
XattrOperations *xops;
char *orig_value, *orig_value_start;
ssize_t xattr_len, parsed_len = 0, attr_len;
char *dirpath, *name;
int dirfd;
/* Get the actual len */
dirpath = g_path_get_dirname(path);
dirfd = local_opendir_nofollow(ctx, dirpath);
g_free(dirpath);
if (dirfd == -1) {
return -1;
}
name = g_path_get_basename(path);
xattr_len = flistxattrat_nofollow(dirfd, name, value, 0);
if (xattr_len <= 0) {
g_free(name);
close_preserve_errno(dirfd);
return xattr_len;
}
/* Now fetch the xattr and find the actual size */
orig_value = g_malloc(xattr_len);
xattr_len = flistxattrat_nofollow(dirfd, name, orig_value, xattr_len);
g_free(name);
close_preserve_errno(dirfd);
if (xattr_len < 0) {
g_free(orig_value);
return -1;
}
/* store the orig pointer */
orig_value_start = orig_value;
while (xattr_len > parsed_len) {
xops = get_xattr_operations(ctx->xops, orig_value);
if (!xops) {
goto next_entry;
}
if (!value) {
size += xops->listxattr(ctx, path, orig_value, value, vsize);
} else {
size = xops->listxattr(ctx, path, orig_value, value, vsize);
if (size < 0) {
goto err_out;
}
value += size;
vsize -= size;
}
next_entry:
/* Got the next entry */
attr_len = strlen(orig_value) + 1;
parsed_len += attr_len;
orig_value += attr_len;
}
if (value) {
size = value - ovalue;
}
err_out:
g_free(orig_value_start);
return size;
}
int v9fs_set_xattr(FsContext *ctx, const char *path, const char *name,
void *value, size_t size, int flags)
{
XattrOperations *xops = get_xattr_operations(ctx->xops, name);
if (xops) {
return xops->setxattr(ctx, path, name, value, size, flags);
}
errno = -EOPNOTSUPP;
return -1;
}
int v9fs_remove_xattr(FsContext *ctx,
const char *path, const char *name)
{
XattrOperations *xops = get_xattr_operations(ctx->xops, name);
if (xops) {
return xops->removexattr(ctx, path, name);
}
errno = -EOPNOTSUPP;
return -1;
}
ssize_t local_getxattr_nofollow(FsContext *ctx, const char *path,
const char *name, void *value, size_t size)
{
char *dirpath = g_path_get_dirname(path);
char *filename = g_path_get_basename(path);
int dirfd;
ssize_t ret = -1;
dirfd = local_opendir_nofollow(ctx, dirpath);
if (dirfd == -1) {
goto out;
}
ret = fgetxattrat_nofollow(dirfd, filename, name, value, size);
close_preserve_errno(dirfd);
out:
g_free(dirpath);
g_free(filename);
return ret;
}
ssize_t pt_getxattr(FsContext *ctx, const char *path, const char *name,
void *value, size_t size)
{
return local_getxattr_nofollow(ctx, path, name, value, size);
}
int fsetxattrat_nofollow(int dirfd, const char *filename, const char *name,
void *value, size_t size, int flags)
{
char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
int ret;
ret = lsetxattr(proc_path, name, value, size, flags);
g_free(proc_path);
return ret;
}
ssize_t local_setxattr_nofollow(FsContext *ctx, const char *path,
const char *name, void *value, size_t size,
int flags)
{
char *dirpath = g_path_get_dirname(path);
char *filename = g_path_get_basename(path);
int dirfd;
ssize_t ret = -1;
dirfd = local_opendir_nofollow(ctx, dirpath);
if (dirfd == -1) {
goto out;
}
ret = fsetxattrat_nofollow(dirfd, filename, name, value, size, flags);
close_preserve_errno(dirfd);
out:
g_free(dirpath);
g_free(filename);
return ret;
}
int pt_setxattr(FsContext *ctx, const char *path, const char *name, void *value,
size_t size, int flags)
{
return local_setxattr_nofollow(ctx, path, name, value, size, flags);
}
static ssize_t fremovexattrat_nofollow(int dirfd, const char *filename,
const char *name)
{
char *proc_path = g_strdup_printf("/proc/self/fd/%d/%s", dirfd, filename);
int ret;
ret = lremovexattr(proc_path, name);
g_free(proc_path);
return ret;
}
ssize_t local_removexattr_nofollow(FsContext *ctx, const char *path,
const char *name)
{
char *dirpath = g_path_get_dirname(path);
char *filename = g_path_get_basename(path);
int dirfd;
ssize_t ret = -1;
dirfd = local_opendir_nofollow(ctx, dirpath);
if (dirfd == -1) {
goto out;
}
ret = fremovexattrat_nofollow(dirfd, filename, name);
close_preserve_errno(dirfd);
out:
g_free(dirpath);
g_free(filename);
return ret;
}
int pt_removexattr(FsContext *ctx, const char *path, const char *name)
{
return local_removexattr_nofollow(ctx, path, name);
}
ssize_t notsup_getxattr(FsContext *ctx, const char *path, const char *name,
void *value, size_t size)
{
errno = ENOTSUP;
return -1;
}
int notsup_setxattr(FsContext *ctx, const char *path, const char *name,
void *value, size_t size, int flags)
{
errno = ENOTSUP;
return -1;
}
ssize_t notsup_listxattr(FsContext *ctx, const char *path, char *name,
void *value, size_t size)
{
return 0;
}
int notsup_removexattr(FsContext *ctx, const char *path, const char *name)
{
errno = ENOTSUP;
return -1;
}
XattrOperations *mapped_xattr_ops[] = {
&mapped_user_xattr,
&mapped_pacl_xattr,
&mapped_dacl_xattr,
NULL,
};
XattrOperations *passthrough_xattr_ops[] = {
&passthrough_user_xattr,
&passthrough_acl_xattr,
NULL,
};
/* for .user none model should be same as passthrough */
XattrOperations *none_xattr_ops[] = {
&passthrough_user_xattr,
&none_acl_xattr,
NULL,
};

View File

@@ -1,5 +1,5 @@
/*
* Virtio 9p
* 9p
*
* Copyright IBM, Corp. 2010
*
@@ -10,8 +10,8 @@
* the COPYING file in the top-level directory.
*
*/
#ifndef _QEMU_VIRTIO_9P_XATTR_H
#define _QEMU_VIRTIO_9P_XATTR_H
#ifndef _QEMU_9P_XATTR_H
#define _QEMU_9P_XATTR_H
#include "qemu/xattr.h"
@@ -28,6 +28,13 @@ typedef struct xattr_operations
const char *path, const char *name);
} XattrOperations;
ssize_t local_getxattr_nofollow(FsContext *ctx, const char *path,
const char *name, void *value, size_t size);
ssize_t local_setxattr_nofollow(FsContext *ctx, const char *path,
const char *name, void *value, size_t size,
int flags);
ssize_t local_removexattr_nofollow(FsContext *ctx, const char *path,
const char *name);
extern XattrOperations mapped_user_xattr;
extern XattrOperations passthrough_user_xattr;
@@ -48,58 +55,21 @@ ssize_t v9fs_list_xattr(FsContext *ctx, const char *path, void *value,
int v9fs_set_xattr(FsContext *ctx, const char *path, const char *name,
void *value, size_t size, int flags);
int v9fs_remove_xattr(FsContext *ctx, const char *path, const char *name);
ssize_t pt_listxattr(FsContext *ctx, const char *path, char *name, void *value,
size_t size);
ssize_t pt_getxattr(FsContext *ctx, const char *path, const char *name,
void *value, size_t size);
int pt_setxattr(FsContext *ctx, const char *path, const char *name, void *value,
size_t size, int flags);
int pt_removexattr(FsContext *ctx, const char *path, const char *name);
static inline ssize_t pt_getxattr(FsContext *ctx, const char *path,
const char *name, void *value, size_t size)
{
char buffer[PATH_MAX];
return lgetxattr(rpath(ctx, path, buffer), name, value, size);
}
static inline int pt_setxattr(FsContext *ctx, const char *path,
const char *name, void *value,
size_t size, int flags)
{
char buffer[PATH_MAX];
return lsetxattr(rpath(ctx, path, buffer), name, value, size, flags);
}
static inline int pt_removexattr(FsContext *ctx,
const char *path, const char *name)
{
char buffer[PATH_MAX];
return lremovexattr(rpath(ctx, path, buffer), name);
}
static inline ssize_t notsup_getxattr(FsContext *ctx, const char *path,
const char *name, void *value,
size_t size)
{
errno = ENOTSUP;
return -1;
}
static inline int notsup_setxattr(FsContext *ctx, const char *path,
const char *name, void *value,
size_t size, int flags)
{
errno = ENOTSUP;
return -1;
}
static inline ssize_t notsup_listxattr(FsContext *ctx, const char *path,
char *name, void *value, size_t size)
{
return 0;
}
static inline int notsup_removexattr(FsContext *ctx,
const char *path, const char *name)
{
errno = ENOTSUP;
return -1;
}
ssize_t notsup_getxattr(FsContext *ctx, const char *path, const char *name,
void *value, size_t size);
int notsup_setxattr(FsContext *ctx, const char *path, const char *name,
void *value, size_t size, int flags);
ssize_t notsup_listxattr(FsContext *ctx, const char *path, char *name,
void *value, size_t size);
int notsup_removexattr(FsContext *ctx, const char *path, const char *name);
#endif

File diff suppressed because it is too large Load Diff

334
hw/9pfs/9p.h Normal file
View File

@@ -0,0 +1,334 @@
#ifndef _QEMU_9P_H
#define _QEMU_9P_H
#include <sys/types.h>
#include <dirent.h>
#include <sys/time.h>
#include <utime.h>
#include <sys/resource.h>
#include <glib.h>
#include "standard-headers/linux/virtio_9p.h"
#include "hw/virtio.h"
#include "fsdev/file-op-9p.h"
#include "fsdev/9p-iov-marshal.h"
#include "qemu/thread.h"
#include "qemu/coroutine.h"
enum {
P9_TLERROR = 6,
P9_RLERROR,
P9_TSTATFS = 8,
P9_RSTATFS,
P9_TLOPEN = 12,
P9_RLOPEN,
P9_TLCREATE = 14,
P9_RLCREATE,
P9_TSYMLINK = 16,
P9_RSYMLINK,
P9_TMKNOD = 18,
P9_RMKNOD,
P9_TRENAME = 20,
P9_RRENAME,
P9_TREADLINK = 22,
P9_RREADLINK,
P9_TGETATTR = 24,
P9_RGETATTR,
P9_TSETATTR = 26,
P9_RSETATTR,
P9_TXATTRWALK = 30,
P9_RXATTRWALK,
P9_TXATTRCREATE = 32,
P9_RXATTRCREATE,
P9_TREADDIR = 40,
P9_RREADDIR,
P9_TFSYNC = 50,
P9_RFSYNC,
P9_TLOCK = 52,
P9_RLOCK,
P9_TGETLOCK = 54,
P9_RGETLOCK,
P9_TLINK = 70,
P9_RLINK,
P9_TMKDIR = 72,
P9_RMKDIR,
P9_TRENAMEAT = 74,
P9_RRENAMEAT,
P9_TUNLINKAT = 76,
P9_RUNLINKAT,
P9_TVERSION = 100,
P9_RVERSION,
P9_TAUTH = 102,
P9_RAUTH,
P9_TATTACH = 104,
P9_RATTACH,
P9_TERROR = 106,
P9_RERROR,
P9_TFLUSH = 108,
P9_RFLUSH,
P9_TWALK = 110,
P9_RWALK,
P9_TOPEN = 112,
P9_ROPEN,
P9_TCREATE = 114,
P9_RCREATE,
P9_TREAD = 116,
P9_RREAD,
P9_TWRITE = 118,
P9_RWRITE,
P9_TCLUNK = 120,
P9_RCLUNK,
P9_TREMOVE = 122,
P9_RREMOVE,
P9_TSTAT = 124,
P9_RSTAT,
P9_TWSTAT = 126,
P9_RWSTAT,
};
/* qid.types */
enum {
P9_QTDIR = 0x80,
P9_QTAPPEND = 0x40,
P9_QTEXCL = 0x20,
P9_QTMOUNT = 0x10,
P9_QTAUTH = 0x08,
P9_QTTMP = 0x04,
P9_QTSYMLINK = 0x02,
P9_QTLINK = 0x01,
P9_QTFILE = 0x00,
};
enum p9_proto_version {
V9FS_PROTO_2000U = 0x01,
V9FS_PROTO_2000L = 0x02,
};
#define P9_NOTAG (u16)(~0)
#define P9_NOFID (u32)(~0)
#define P9_MAXWELEM 16
#define FID_REFERENCED 0x1
#define FID_NON_RECLAIMABLE 0x2
static inline char *rpath(FsContext *ctx, const char *path)
{
return g_strdup_printf("%s/%s", ctx->fs_root, path);
}
/*
* ample room for Twrite/Rread header
* size[4] Tread/Twrite tag[2] fid[4] offset[8] count[4]
*/
#define P9_IOHDRSZ 24
typedef struct V9fsPDU V9fsPDU;
struct V9fsState;
struct V9fsPDU
{
uint32_t size;
uint16_t tag;
uint8_t id;
uint8_t cancelled;
CoQueue complete;
struct V9fsState *s;
QLIST_ENTRY(V9fsPDU) next;
uint32_t idx;
};
/* FIXME
* 1) change user needs to set groups and stuff
*/
#define MAX_REQ 128
#define MAX_TAG_LEN 32
#define BUG_ON(cond) assert(!(cond))
typedef struct V9fsFidState V9fsFidState;
enum {
P9_FID_NONE = 0,
P9_FID_FILE,
P9_FID_DIR,
P9_FID_XATTR,
};
typedef struct V9fsConf
{
/* tag name for the device */
char *tag;
char *fsdev_id;
} V9fsConf;
typedef struct V9fsXattr
{
int64_t copied_len;
int64_t len;
void *value;
V9fsString name;
int flags;
} V9fsXattr;
typedef struct V9fsDir {
DIR *stream;
} V9fsDir;
/*
* Filled by fs driver on open and other
* calls.
*/
union V9fsFidOpenState {
int fd;
V9fsDir dir;
V9fsXattr xattr;
/*
* private pointer for fs drivers, that
* have its own internal representation of
* open files.
*/
void *private;
};
struct V9fsFidState
{
int fid_type;
int32_t fid;
V9fsPath path;
V9fsFidOpenState fs;
V9fsFidOpenState fs_reclaim;
int flags;
int open_flags;
uid_t uid;
int ref;
int clunked;
V9fsFidState *next;
V9fsFidState *rclm_lst;
};
typedef struct V9fsState
{
QLIST_HEAD(, V9fsPDU) free_list;
QLIST_HEAD(, V9fsPDU) active_list;
V9fsFidState *fid_list;
FileOperations *ops;
FsContext ctx;
char *tag;
enum p9_proto_version proto_version;
int32_t msize;
V9fsPDU pdus[MAX_REQ];
/*
* lock ensuring atomic path update
* on rename.
*/
CoRwlock rename_lock;
int32_t root_fid;
Error *migration_blocker;
V9fsConf fsconf;
V9fsQID root_qid;
} V9fsState;
/* 9p2000.L open flags */
#define P9_DOTL_RDONLY 00000000
#define P9_DOTL_WRONLY 00000001
#define P9_DOTL_RDWR 00000002
#define P9_DOTL_NOACCESS 00000003
#define P9_DOTL_CREATE 00000100
#define P9_DOTL_EXCL 00000200
#define P9_DOTL_NOCTTY 00000400
#define P9_DOTL_TRUNC 00001000
#define P9_DOTL_APPEND 00002000
#define P9_DOTL_NONBLOCK 00004000
#define P9_DOTL_DSYNC 00010000
#define P9_DOTL_FASYNC 00020000
#define P9_DOTL_DIRECT 00040000
#define P9_DOTL_LARGEFILE 00100000
#define P9_DOTL_DIRECTORY 00200000
#define P9_DOTL_NOFOLLOW 00400000
#define P9_DOTL_NOATIME 01000000
#define P9_DOTL_CLOEXEC 02000000
#define P9_DOTL_SYNC 04000000
/* 9p2000.L at flags */
#define P9_DOTL_AT_REMOVEDIR 0x200
/* 9P2000.L lock type */
#define P9_LOCK_TYPE_RDLCK 0
#define P9_LOCK_TYPE_WRLCK 1
#define P9_LOCK_TYPE_UNLCK 2
#define P9_LOCK_SUCCESS 0
#define P9_LOCK_BLOCKED 1
#define P9_LOCK_ERROR 2
#define P9_LOCK_GRACE 3
#define P9_LOCK_FLAGS_BLOCK 1
#define P9_LOCK_FLAGS_RECLAIM 2
typedef struct V9fsFlock
{
uint8_t type;
uint32_t flags;
uint64_t start; /* absolute offset */
uint64_t length;
uint32_t proc_id;
V9fsString client_id;
} V9fsFlock;
typedef struct V9fsGetlock
{
uint8_t type;
uint64_t start; /* absolute offset */
uint64_t length;
uint32_t proc_id;
V9fsString client_id;
} V9fsGetlock;
extern int open_fd_hw;
extern int total_open_fd;
static inline void v9fs_path_write_lock(V9fsState *s)
{
if (s->ctx.export_flags & V9FS_PATHNAME_FSCONTEXT) {
qemu_co_rwlock_wrlock(&s->rename_lock);
}
}
static inline void v9fs_path_read_lock(V9fsState *s)
{
if (s->ctx.export_flags & V9FS_PATHNAME_FSCONTEXT) {
qemu_co_rwlock_rdlock(&s->rename_lock);
}
}
static inline void v9fs_path_unlock(V9fsState *s)
{
if (s->ctx.export_flags & V9FS_PATHNAME_FSCONTEXT) {
qemu_co_rwlock_unlock(&s->rename_lock);
}
}
static inline uint8_t v9fs_request_cancelled(V9fsPDU *pdu)
{
return pdu->cancelled;
}
extern void v9fs_reclaim_fd(V9fsPDU *pdu);
extern void v9fs_path_init(V9fsPath *path);
extern void v9fs_path_free(V9fsPath *path);
extern void v9fs_path_sprintf(V9fsPath *path, const char *fmt, ...);
extern void v9fs_path_copy(V9fsPath *lhs, V9fsPath *rhs);
extern int v9fs_name_to_path(V9fsState *s, V9fsPath *dirpath,
const char *name, V9fsPath *path);
extern int v9fs_device_realize_common(V9fsState *s, Error **errp);
extern void v9fs_device_unrealize_common(V9fsState *s, Error **errp);
ssize_t pdu_marshal(V9fsPDU *pdu, size_t offset, const char *fmt, ...);
ssize_t pdu_unmarshal(V9fsPDU *pdu, size_t offset, const char *fmt, ...);
V9fsPDU *pdu_alloc(V9fsState *s);
void pdu_free(V9fsPDU *pdu);
void pdu_submit(V9fsPDU *pdu);
void v9fs_reset(V9fsState *s);
#endif

View File

@@ -1,9 +1,9 @@
common-obj-y = virtio-9p.o
common-obj-y += virtio-9p-local.o virtio-9p-xattr.o
common-obj-y += virtio-9p-xattr-user.o virtio-9p-posix-acl.o
common-obj-y = 9p.o 9p-util.o
common-obj-y += 9p-local.o 9p-xattr.o
common-obj-y += 9p-xattr-user.o 9p-posix-acl.o
common-obj-y += virtio-9p-coth.o cofs.o codir.o cofile.o
common-obj-y += coxattr.o virtio-9p-synth.o
common-obj-$(CONFIG_OPEN_BY_HANDLE) += virtio-9p-handle.o
common-obj-y += virtio-9p-proxy.o
common-obj-$(CONFIG_OPEN_BY_HANDLE) += 9p-handle.o
common-obj-y += 9p-proxy.o
obj-y += virtio-9p-device.o

View File

@@ -138,10 +138,10 @@ int v9fs_co_open2(V9fsPDU *pdu, V9fsFidState *fidp, V9fsString *name, gid_t gid,
cred.fc_gid = gid;
/*
* Hold the directory fid lock so that directory path name
* don't change. Read lock is fine because this fid cannot
* be used by any other operation.
* don't change. Take the write lock to be sure this fid
* cannot be used by another operation.
*/
v9fs_path_read_lock(s);
v9fs_path_write_lock(s);
v9fs_co_run_in_worker(
{
err = s->ops->open2(&s->ctx, &fidp->path,

View File

@@ -14,11 +14,62 @@
#include "hw/virtio.h"
#include "hw/pc.h"
#include "qemu/sockets.h"
#include "hw/virtio-pci.h"
#include "virtio-9p.h"
#include "fsdev/qemu-fsdev.h"
#include "virtio-9p-xattr.h"
#include "9p-xattr.h"
#include "virtio-9p-coth.h"
#include "qemu/iov.h"
void virtio_9p_push_and_notify(V9fsPDU *pdu)
{
V9fsState *s = pdu->s;
V9fsVirtioState *v = container_of(s, V9fsVirtioState, state);
VirtQueueElement *elem = &v->elems[pdu->idx];
/* push onto queue and notify */
virtqueue_push(v->vq, elem, pdu->size);
/* FIXME: we should batch these completions */
virtio_notify(VIRTIO_DEVICE(v), v->vq);
}
static void handle_9p_output(VirtIODevice *vdev, VirtQueue *vq)
{
V9fsVirtioState *v = (V9fsVirtioState *)vdev;
V9fsState *s = &v->state;
V9fsPDU *pdu;
ssize_t len;
while ((pdu = pdu_alloc(s))) {
struct {
uint32_t size_le;
uint8_t id;
uint16_t tag_le;
} QEMU_PACKED out;
VirtQueueElement *elem = &v->elems[pdu->idx];
len = virtqueue_pop(vq, elem);
if (!len) {
pdu_free(pdu);
break;
}
BUG_ON(elem->out_num == 0 || elem->in_num == 0);
QEMU_BUILD_BUG_ON(sizeof out != 7);
len = iov_to_buf(elem->out_sg, elem->out_num, 0,
&out, sizeof out);
BUG_ON(len != sizeof out);
pdu->size = le32_to_cpu(out.size_le);
pdu->id = out.id;
pdu->tag = le16_to_cpu(out.tag_le);
qemu_co_queue_init(&pdu->complete);
pdu_submit(pdu);
}
}
static uint32_t virtio_9p_get_features(VirtIODevice *vdev, uint32_t features)
{
@@ -26,164 +77,125 @@ static uint32_t virtio_9p_get_features(VirtIODevice *vdev, uint32_t features)
return features;
}
static V9fsState *to_virtio_9p(VirtIODevice *vdev)
{
return (V9fsState *)vdev;
}
static void virtio_9p_get_config(VirtIODevice *vdev, uint8_t *config)
{
int len;
struct virtio_9p_config *cfg;
V9fsState *s = to_virtio_9p(vdev);
V9fsVirtioState *v = VIRTIO_9P(vdev);
V9fsState *s = &v->state;
len = strlen(s->tag);
cfg = g_malloc0(sizeof(struct virtio_9p_config) + len);
stw_raw(&cfg->tag_len, len);
/* We don't copy the terminating null to config space */
memcpy(cfg->tag, s->tag, len);
memcpy(config, cfg, s->config_size);
memcpy(config, cfg, v->config_size);
g_free(cfg);
}
VirtIODevice *virtio_9p_init(DeviceState *dev, V9fsConf *conf)
static void virtio_9p_device_realize(DeviceState *dev, Error **errp)
{
V9fsState *s;
int i, len;
struct stat stat;
FsDriverEntry *fse;
V9fsPath path;
VirtIODevice *vdev = VIRTIO_DEVICE(dev);
V9fsVirtioState *v = VIRTIO_9P(dev);
V9fsState *s = &v->state;
s = (V9fsState *)virtio_common_init("virtio-9p",
VIRTIO_ID_9P,
sizeof(struct virtio_9p_config)+
MAX_TAG_LEN,
sizeof(V9fsState));
/* initialize pdu allocator */
QLIST_INIT(&s->free_list);
QLIST_INIT(&s->active_list);
for (i = 0; i < (MAX_REQ - 1); i++) {
QLIST_INSERT_HEAD(&s->free_list, &s->pdus[i], next);
if (v9fs_device_realize_common(s, errp)) {
goto out;
}
vdev->get_config = virtio_9p_get_config;
s->vq = virtio_add_queue(&s->vdev, MAX_REQ, handle_9p_output);
v->config_size = sizeof(struct virtio_9p_config) + strlen(s->fsconf.tag);
virtio_init(vdev, "virtio-9p", VIRTIO_ID_9P, v->config_size);
v->vq = virtio_add_queue(vdev, MAX_REQ, handle_9p_output);
fse = get_fsdev_fsentry(conf->fsdev_id);
if (!fse) {
/* We don't have a fsdev identified by fsdev_id */
fprintf(stderr, "Virtio-9p device couldn't find fsdev with the "
"id = %s\n", conf->fsdev_id ? conf->fsdev_id : "NULL");
exit(1);
}
if (!conf->tag) {
/* we haven't specified a mount_tag */
fprintf(stderr, "fsdev with id %s needs mount_tag arguments\n",
conf->fsdev_id);
exit(1);
}
s->ctx.export_flags = fse->export_flags;
s->ctx.fs_root = g_strdup(fse->path);
s->ctx.exops.get_st_gen = NULL;
len = strlen(conf->tag);
if (len > MAX_TAG_LEN - 1) {
fprintf(stderr, "mount tag '%s' (%d bytes) is longer than "
"maximum (%d bytes)", conf->tag, len, MAX_TAG_LEN - 1);
exit(1);
}
s->tag = g_strdup(conf->tag);
s->ctx.uid = -1;
s->ops = fse->ops;
s->vdev.get_features = virtio_9p_get_features;
s->config_size = sizeof(struct virtio_9p_config) + len;
s->vdev.get_config = virtio_9p_get_config;
s->fid_list = NULL;
qemu_co_rwlock_init(&s->rename_lock);
if (s->ops->init(&s->ctx) < 0) {
fprintf(stderr, "Virtio-9p Failed to initialize fs-driver with id:%s"
" and export path:%s\n", conf->fsdev_id, s->ctx.fs_root);
exit(1);
}
if (v9fs_init_worker_threads() < 0) {
fprintf(stderr, "worker thread initialization failed\n");
exit(1);
}
/*
* Check details of export path, We need to use fs driver
* call back to do that. Since we are in the init path, we don't
* use co-routines here.
*/
v9fs_path_init(&path);
if (s->ops->name_to_path(&s->ctx, NULL, "/", &path) < 0) {
fprintf(stderr,
"error in converting name to path %s", strerror(errno));
exit(1);
}
if (s->ops->lstat(&s->ctx, &path, &stat)) {
fprintf(stderr, "share path %s does not exist\n", fse->path);
exit(1);
} else if (!S_ISDIR(stat.st_mode)) {
fprintf(stderr, "share path %s is not a directory\n", fse->path);
exit(1);
}
v9fs_path_free(&path);
return &s->vdev;
out:
return;
}
static int virtio_9p_init_pci(PCIDevice *pci_dev)
static void virtio_9p_device_unrealize(DeviceState *dev, Error **errp)
{
VirtIOPCIProxy *proxy = DO_UPCAST(VirtIOPCIProxy, pci_dev, pci_dev);
VirtIODevice *vdev;
VirtIODevice *vdev = VIRTIO_DEVICE(dev);
V9fsVirtioState *v = VIRTIO_9P(dev);
V9fsState *s = &v->state;
vdev = virtio_9p_init(&pci_dev->qdev, &proxy->fsconf);
vdev->nvectors = proxy->nvectors;
virtio_init_pci(proxy, vdev);
/* make the actual value visible */
proxy->nvectors = vdev->nvectors;
return 0;
virtio_cleanup(vdev);
v9fs_device_unrealize_common(s, errp);
}
static void virtio_9p_reset(VirtIODevice *vdev)
{
V9fsVirtioState *v = (V9fsVirtioState *)vdev;
v9fs_reset(&v->state);
}
ssize_t virtio_pdu_vmarshal(V9fsPDU *pdu, size_t offset,
const char *fmt, va_list ap)
{
V9fsState *s = pdu->s;
V9fsVirtioState *v = container_of(s, V9fsVirtioState, state);
VirtQueueElement *elem = &v->elems[pdu->idx];
return v9fs_iov_vmarshal(elem->in_sg, elem->in_num, offset, 1, fmt, ap);
}
ssize_t virtio_pdu_vunmarshal(V9fsPDU *pdu, size_t offset,
const char *fmt, va_list ap)
{
V9fsState *s = pdu->s;
V9fsVirtioState *v = container_of(s, V9fsVirtioState, state);
VirtQueueElement *elem = &v->elems[pdu->idx];
return v9fs_iov_vunmarshal(elem->out_sg, elem->out_num, offset, 1, fmt, ap);
}
void virtio_init_iov_from_pdu(V9fsPDU *pdu, struct iovec **piov,
unsigned int *pniov, bool is_write)
{
V9fsState *s = pdu->s;
V9fsVirtioState *v = container_of(s, V9fsVirtioState, state);
VirtQueueElement *elem = &v->elems[pdu->idx];
if (is_write) {
*piov = elem->out_sg;
*pniov = elem->out_num;
} else {
*piov = elem->in_sg;
*pniov = elem->in_num;
}
}
/* virtio-9p device */
static Property virtio_9p_properties[] = {
DEFINE_PROP_BIT("ioeventfd", VirtIOPCIProxy, flags, VIRTIO_PCI_FLAG_USE_IOEVENTFD_BIT, true),
DEFINE_PROP_UINT32("vectors", VirtIOPCIProxy, nvectors, 2),
DEFINE_VIRTIO_COMMON_FEATURES(VirtIOPCIProxy, host_features),
DEFINE_PROP_STRING("mount_tag", VirtIOPCIProxy, fsconf.tag),
DEFINE_PROP_STRING("fsdev", VirtIOPCIProxy, fsconf.fsdev_id),
DEFINE_PROP_STRING("mount_tag", V9fsVirtioState, state.fsconf.tag),
DEFINE_PROP_STRING("fsdev", V9fsVirtioState, state.fsconf.fsdev_id),
DEFINE_PROP_END_OF_LIST(),
};
static void virtio_9p_class_init(ObjectClass *klass, void *data)
{
DeviceClass *dc = DEVICE_CLASS(klass);
PCIDeviceClass *k = PCI_DEVICE_CLASS(klass);
VirtioDeviceClass *vdc = VIRTIO_DEVICE_CLASS(klass);
k->init = virtio_9p_init_pci;
k->vendor_id = PCI_VENDOR_ID_REDHAT_QUMRANET;
k->device_id = PCI_DEVICE_ID_VIRTIO_9P;
k->revision = VIRTIO_PCI_ABI_VERSION;
k->class_id = 0x2;
dc->props = virtio_9p_properties;
dc->reset = virtio_pci_reset;
vdc->realize = virtio_9p_device_realize;
vdc->unrealize = virtio_9p_device_unrealize;
vdc->get_features = virtio_9p_get_features;
vdc->get_config = virtio_9p_get_config;
vdc->reset = virtio_9p_reset;
}
static const TypeInfo virtio_9p_info = {
.name = "virtio-9p-pci",
.parent = TYPE_PCI_DEVICE,
.instance_size = sizeof(VirtIOPCIProxy),
.class_init = virtio_9p_class_init,
static const TypeInfo virtio_device_info = {
.name = TYPE_VIRTIO_9P,
.parent = TYPE_VIRTIO_DEVICE,
.instance_size = sizeof(V9fsVirtioState),
.class_init = virtio_9p_class_init,
};
static void virtio_9p_register_types(void)
{
type_register_static(&virtio_9p_info);
virtio_9p_set_fd_limit();
type_register_static(&virtio_device_info);
}
type_init(virtio_9p_register_types)

File diff suppressed because it is too large Load Diff

View File

@@ -13,8 +13,8 @@
*/
#include "hw/virtio.h"
#include "virtio-9p.h"
#include "virtio-9p-xattr.h"
#include "9p.h"
#include "9p-xattr.h"
#include "fsdev/qemu-fsdev.h"
#include "virtio-9p-synth.h"

View File

@@ -1,161 +0,0 @@
/*
* Virtio 9p xattr callback
*
* Copyright IBM, Corp. 2010
*
* Authors:
* Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
*
* This work is licensed under the terms of the GNU GPL, version 2. See
* the COPYING file in the top-level directory.
*
*/
#include "hw/virtio.h"
#include "virtio-9p.h"
#include "fsdev/file-op-9p.h"
#include "virtio-9p-xattr.h"
static XattrOperations *get_xattr_operations(XattrOperations **h,
const char *name)
{
XattrOperations *xops;
for (xops = *(h)++; xops != NULL; xops = *(h)++) {
if (!strncmp(name, xops->name, strlen(xops->name))) {
return xops;
}
}
return NULL;
}
ssize_t v9fs_get_xattr(FsContext *ctx, const char *path,
const char *name, void *value, size_t size)
{
XattrOperations *xops = get_xattr_operations(ctx->xops, name);
if (xops) {
return xops->getxattr(ctx, path, name, value, size);
}
errno = -EOPNOTSUPP;
return -1;
}
ssize_t pt_listxattr(FsContext *ctx, const char *path,
char *name, void *value, size_t size)
{
int name_size = strlen(name) + 1;
if (!value) {
return name_size;
}
if (size < name_size) {
errno = ERANGE;
return -1;
}
/* no need for strncpy: name_size is strlen(name)+1 */
memcpy(value, name, name_size);
return name_size;
}
/*
* Get the list and pass to each layer to find out whether
* to send the data or not
*/
ssize_t v9fs_list_xattr(FsContext *ctx, const char *path,
void *value, size_t vsize)
{
ssize_t size = 0;
char buffer[PATH_MAX];
void *ovalue = value;
XattrOperations *xops;
char *orig_value, *orig_value_start;
ssize_t xattr_len, parsed_len = 0, attr_len;
/* Get the actual len */
xattr_len = llistxattr(rpath(ctx, path, buffer), value, 0);
if (xattr_len <= 0) {
return xattr_len;
}
/* Now fetch the xattr and find the actual size */
orig_value = g_malloc(xattr_len);
xattr_len = llistxattr(rpath(ctx, path, buffer), orig_value, xattr_len);
/* store the orig pointer */
orig_value_start = orig_value;
while (xattr_len > parsed_len) {
xops = get_xattr_operations(ctx->xops, orig_value);
if (!xops) {
goto next_entry;
}
if (!value) {
size += xops->listxattr(ctx, path, orig_value, value, vsize);
} else {
size = xops->listxattr(ctx, path, orig_value, value, vsize);
if (size < 0) {
goto err_out;
}
value += size;
vsize -= size;
}
next_entry:
/* Got the next entry */
attr_len = strlen(orig_value) + 1;
parsed_len += attr_len;
orig_value += attr_len;
}
if (value) {
size = value - ovalue;
}
err_out:
g_free(orig_value_start);
return size;
}
int v9fs_set_xattr(FsContext *ctx, const char *path, const char *name,
void *value, size_t size, int flags)
{
XattrOperations *xops = get_xattr_operations(ctx->xops, name);
if (xops) {
return xops->setxattr(ctx, path, name, value, size, flags);
}
errno = -EOPNOTSUPP;
return -1;
}
int v9fs_remove_xattr(FsContext *ctx,
const char *path, const char *name)
{
XattrOperations *xops = get_xattr_operations(ctx->xops, name);
if (xops) {
return xops->removexattr(ctx, path, name);
}
errno = -EOPNOTSUPP;
return -1;
}
XattrOperations *mapped_xattr_ops[] = {
&mapped_user_xattr,
&mapped_pacl_xattr,
&mapped_dacl_xattr,
NULL,
};
XattrOperations *passthrough_xattr_ops[] = {
&passthrough_user_xattr,
&passthrough_acl_xattr,
NULL,
};
/* for .user none model should be same as passthrough */
XattrOperations *none_xattr_ops[] = {
&passthrough_user_xattr,
&none_acl_xattr,
NULL,
};

View File

@@ -1,405 +1,30 @@
#ifndef _QEMU_VIRTIO_9P_H
#define _QEMU_VIRTIO_9P_H
#include <sys/types.h>
#include <dirent.h>
#include <sys/time.h>
#include <utime.h>
#include <sys/resource.h>
#include "standard-headers/linux/virtio_9p.h"
#include "hw/virtio.h"
#include "fsdev/file-op-9p.h"
#include "fsdev/virtio-9p-marshal.h"
#include "qemu/thread.h"
#include "block/coroutine.h"
#include "9p.h"
/* The feature bitmap for virtio 9P */
/* The mount point is specified in a config variable */
#define VIRTIO_9P_MOUNT_TAG 0
enum {
P9_TLERROR = 6,
P9_RLERROR,
P9_TSTATFS = 8,
P9_RSTATFS,
P9_TLOPEN = 12,
P9_RLOPEN,
P9_TLCREATE = 14,
P9_RLCREATE,
P9_TSYMLINK = 16,
P9_RSYMLINK,
P9_TMKNOD = 18,
P9_RMKNOD,
P9_TRENAME = 20,
P9_RRENAME,
P9_TREADLINK = 22,
P9_RREADLINK,
P9_TGETATTR = 24,
P9_RGETATTR,
P9_TSETATTR = 26,
P9_RSETATTR,
P9_TXATTRWALK = 30,
P9_RXATTRWALK,
P9_TXATTRCREATE = 32,
P9_RXATTRCREATE,
P9_TREADDIR = 40,
P9_RREADDIR,
P9_TFSYNC = 50,
P9_RFSYNC,
P9_TLOCK = 52,
P9_RLOCK,
P9_TGETLOCK = 54,
P9_RGETLOCK,
P9_TLINK = 70,
P9_RLINK,
P9_TMKDIR = 72,
P9_RMKDIR,
P9_TRENAMEAT = 74,
P9_RRENAMEAT,
P9_TUNLINKAT = 76,
P9_RUNLINKAT,
P9_TVERSION = 100,
P9_RVERSION,
P9_TAUTH = 102,
P9_RAUTH,
P9_TATTACH = 104,
P9_RATTACH,
P9_TERROR = 106,
P9_RERROR,
P9_TFLUSH = 108,
P9_RFLUSH,
P9_TWALK = 110,
P9_RWALK,
P9_TOPEN = 112,
P9_ROPEN,
P9_TCREATE = 114,
P9_RCREATE,
P9_TREAD = 116,
P9_RREAD,
P9_TWRITE = 118,
P9_RWRITE,
P9_TCLUNK = 120,
P9_RCLUNK,
P9_TREMOVE = 122,
P9_RREMOVE,
P9_TSTAT = 124,
P9_RSTAT,
P9_TWSTAT = 126,
P9_RWSTAT,
};
/* qid.types */
enum {
P9_QTDIR = 0x80,
P9_QTAPPEND = 0x40,
P9_QTEXCL = 0x20,
P9_QTMOUNT = 0x10,
P9_QTAUTH = 0x08,
P9_QTTMP = 0x04,
P9_QTSYMLINK = 0x02,
P9_QTLINK = 0x01,
P9_QTFILE = 0x00,
};
enum p9_proto_version {
V9FS_PROTO_2000U = 0x01,
V9FS_PROTO_2000L = 0x02,
};
#define P9_NOTAG (u16)(~0)
#define P9_NOFID (u32)(~0)
#define P9_MAXWELEM 16
#define FID_REFERENCED 0x1
#define FID_NON_RECLAIMABLE 0x2
static inline const char *rpath(FsContext *ctx, const char *path, char *buffer)
typedef struct V9fsVirtioState
{
snprintf(buffer, PATH_MAX, "%s/%s", ctx->fs_root, path);
return buffer;
}
/*
* ample room for Twrite/Rread header
* size[4] Tread/Twrite tag[2] fid[4] offset[8] count[4]
*/
#define P9_IOHDRSZ 24
typedef struct V9fsPDU V9fsPDU;
struct V9fsState;
struct V9fsPDU
{
uint32_t size;
uint16_t tag;
uint8_t id;
uint8_t cancelled;
CoQueue complete;
VirtQueueElement elem;
struct V9fsState *s;
QLIST_ENTRY(V9fsPDU) next;
};
/* FIXME
* 1) change user needs to set groups and stuff
*/
/* from Linux's linux/virtio_9p.h */
/* The ID for virtio console */
#define VIRTIO_ID_9P 9
#define MAX_REQ 128
#define MAX_TAG_LEN 32
#define BUG_ON(cond) assert(!(cond))
typedef struct V9fsFidState V9fsFidState;
enum {
P9_FID_NONE = 0,
P9_FID_FILE,
P9_FID_DIR,
P9_FID_XATTR,
};
typedef struct V9fsXattr
{
int64_t copied_len;
int64_t len;
void *value;
V9fsString name;
int flags;
} V9fsXattr;
/*
* Filled by fs driver on open and other
* calls.
*/
union V9fsFidOpenState {
int fd;
DIR *dir;
V9fsXattr xattr;
/*
* private pointer for fs drivers, that
* have its own internal representation of
* open files.
*/
void *private;
};
struct V9fsFidState
{
int fid_type;
int32_t fid;
V9fsPath path;
V9fsFidOpenState fs;
V9fsFidOpenState fs_reclaim;
int flags;
int open_flags;
uid_t uid;
int ref;
int clunked;
V9fsFidState *next;
V9fsFidState *rclm_lst;
};
typedef struct V9fsState
{
VirtIODevice vdev;
VirtIODevice parent_obj;
VirtQueue *vq;
V9fsPDU pdus[MAX_REQ];
QLIST_HEAD(, V9fsPDU) free_list;
QLIST_HEAD(, V9fsPDU) active_list;
V9fsFidState *fid_list;
FileOperations *ops;
FsContext ctx;
char *tag;
size_t config_size;
enum p9_proto_version proto_version;
int32_t msize;
/*
* lock ensuring atomic path update
* on rename.
*/
CoRwlock rename_lock;
int32_t root_fid;
Error *migration_blocker;
} V9fsState;
VirtQueueElement elems[MAX_REQ];
V9fsState state;
} V9fsVirtioState;
typedef struct V9fsStatState {
V9fsPDU *pdu;
size_t offset;
V9fsStat v9stat;
V9fsFidState *fidp;
struct stat stbuf;
} V9fsStatState;
extern void virtio_9p_push_and_notify(V9fsPDU *pdu);
typedef struct V9fsOpenState {
V9fsPDU *pdu;
size_t offset;
int32_t mode;
V9fsFidState *fidp;
V9fsQID qid;
struct stat stbuf;
int iounit;
} V9fsOpenState;
ssize_t virtio_pdu_vmarshal(V9fsPDU *pdu, size_t offset,
const char *fmt, va_list ap);
ssize_t virtio_pdu_vunmarshal(V9fsPDU *pdu, size_t offset,
const char *fmt, va_list ap);
void virtio_init_iov_from_pdu(V9fsPDU *pdu, struct iovec **piov,
unsigned int *pniov, bool is_write);
typedef struct V9fsReadState {
V9fsPDU *pdu;
size_t offset;
int32_t count;
int32_t total;
int64_t off;
V9fsFidState *fidp;
struct iovec iov[128]; /* FIXME: bad, bad, bad */
struct iovec *sg;
off_t dir_pos;
struct dirent *dent;
struct stat stbuf;
V9fsString name;
V9fsStat v9stat;
int32_t len;
int32_t cnt;
int32_t max_count;
} V9fsReadState;
typedef struct V9fsWriteState {
V9fsPDU *pdu;
size_t offset;
int32_t len;
int32_t count;
int32_t total;
int64_t off;
V9fsFidState *fidp;
struct iovec iov[128]; /* FIXME: bad, bad, bad */
struct iovec *sg;
int cnt;
} V9fsWriteState;
struct virtio_9p_config
{
/* number of characters in tag */
uint16_t tag_len;
/* Variable size tag name */
uint8_t tag[0];
} QEMU_PACKED;
typedef struct V9fsMkState {
V9fsPDU *pdu;
size_t offset;
V9fsQID qid;
struct stat stbuf;
V9fsString name;
V9fsString fullname;
} V9fsMkState;
/* 9p2000.L open flags */
#define P9_DOTL_RDONLY 00000000
#define P9_DOTL_WRONLY 00000001
#define P9_DOTL_RDWR 00000002
#define P9_DOTL_NOACCESS 00000003
#define P9_DOTL_CREATE 00000100
#define P9_DOTL_EXCL 00000200
#define P9_DOTL_NOCTTY 00000400
#define P9_DOTL_TRUNC 00001000
#define P9_DOTL_APPEND 00002000
#define P9_DOTL_NONBLOCK 00004000
#define P9_DOTL_DSYNC 00010000
#define P9_DOTL_FASYNC 00020000
#define P9_DOTL_DIRECT 00040000
#define P9_DOTL_LARGEFILE 00100000
#define P9_DOTL_DIRECTORY 00200000
#define P9_DOTL_NOFOLLOW 00400000
#define P9_DOTL_NOATIME 01000000
#define P9_DOTL_CLOEXEC 02000000
#define P9_DOTL_SYNC 04000000
/* 9p2000.L at flags */
#define P9_DOTL_AT_REMOVEDIR 0x200
/* 9P2000.L lock type */
#define P9_LOCK_TYPE_RDLCK 0
#define P9_LOCK_TYPE_WRLCK 1
#define P9_LOCK_TYPE_UNLCK 2
#define P9_LOCK_SUCCESS 0
#define P9_LOCK_BLOCKED 1
#define P9_LOCK_ERROR 2
#define P9_LOCK_GRACE 3
#define P9_LOCK_FLAGS_BLOCK 1
#define P9_LOCK_FLAGS_RECLAIM 2
typedef struct V9fsFlock
{
uint8_t type;
uint32_t flags;
uint64_t start; /* absolute offset */
uint64_t length;
uint32_t proc_id;
V9fsString client_id;
} V9fsFlock;
typedef struct V9fsGetlock
{
uint8_t type;
uint64_t start; /* absolute offset */
uint64_t length;
uint32_t proc_id;
V9fsString client_id;
} V9fsGetlock;
extern int open_fd_hw;
extern int total_open_fd;
size_t pdu_packunpack(void *addr, struct iovec *sg, int sg_count,
size_t offset, size_t size, int pack);
static inline size_t do_pdu_unpack(void *dst, struct iovec *sg, int sg_count,
size_t offset, size_t size)
{
return pdu_packunpack(dst, sg, sg_count, offset, size, 0);
}
static inline void v9fs_path_write_lock(V9fsState *s)
{
if (s->ctx.export_flags & V9FS_PATHNAME_FSCONTEXT) {
qemu_co_rwlock_wrlock(&s->rename_lock);
}
}
static inline void v9fs_path_read_lock(V9fsState *s)
{
if (s->ctx.export_flags & V9FS_PATHNAME_FSCONTEXT) {
qemu_co_rwlock_rdlock(&s->rename_lock);
}
}
static inline void v9fs_path_unlock(V9fsState *s)
{
if (s->ctx.export_flags & V9FS_PATHNAME_FSCONTEXT) {
qemu_co_rwlock_unlock(&s->rename_lock);
}
}
static inline uint8_t v9fs_request_cancelled(V9fsPDU *pdu)
{
return pdu->cancelled;
}
extern void handle_9p_output(VirtIODevice *vdev, VirtQueue *vq);
extern void virtio_9p_set_fd_limit(void);
extern void v9fs_reclaim_fd(V9fsPDU *pdu);
extern void v9fs_path_init(V9fsPath *path);
extern void v9fs_path_free(V9fsPath *path);
extern void v9fs_path_copy(V9fsPath *lhs, V9fsPath *rhs);
extern int v9fs_name_to_path(V9fsState *s, V9fsPath *dirpath,
const char *name, V9fsPath *path);
#define pdu_marshal(pdu, offset, fmt, args...) \
v9fs_marshal(pdu->elem.in_sg, pdu->elem.in_num, offset, 1, fmt, ##args)
#define pdu_unmarshal(pdu, offset, fmt, args...) \
v9fs_unmarshal(pdu->elem.out_sg, pdu->elem.out_num, offset, 1, fmt, ##args)
#define TYPE_VIRTIO_9P "virtio-9p-device"
#define VIRTIO_9P(obj) \
OBJECT_CHECK(V9fsVirtioState, (obj), TYPE_VIRTIO_9P)
#endif

View File

@@ -472,8 +472,9 @@ static const MemoryRegionOps acpi_pm_cnt_ops = {
.endianness = DEVICE_LITTLE_ENDIAN,
};
void acpi_pm1_cnt_init(ACPIREGS *ar, MemoryRegion *parent)
void acpi_pm1_cnt_init(ACPIREGS *ar, MemoryRegion *parent, uint8_t s4_val)
{
ar->pm1.cnt.s4_val = s4_val;
ar->wakeup.notify = acpi_notify_wakeup;
qemu_register_wakeup_notifier(&ar->wakeup);
memory_region_init_io(&ar->pm1.cnt.io, &acpi_pm_cnt_ops, ar, "acpi-cnt", 2);

View File

@@ -142,7 +142,7 @@ void acpi_pm1_evt_init(ACPIREGS *ar, acpi_update_sci_fn update_sci,
MemoryRegion *parent);
/* PM1a_CNT: piix and ich9 don't implement PM1b CNT. */
void acpi_pm1_cnt_init(ACPIREGS *ar, MemoryRegion *parent);
void acpi_pm1_cnt_init(ACPIREGS *ar, MemoryRegion *parent, uint8_t s4_val);
void acpi_pm1_cnt_update(ACPIREGS *ar,
bool sci_enable, bool sci_disable);
void acpi_pm1_cnt_reset(ACPIREGS *ar);

View File

@@ -212,7 +212,7 @@ void ich9_pm_init(PCIDevice *lpc_pci, ICH9LPCPMRegs *pm,
acpi_pm_tmr_init(&pm->acpi_regs, ich9_pm_update_sci_fn, &pm->io);
acpi_pm1_evt_init(&pm->acpi_regs, ich9_pm_update_sci_fn, &pm->io);
acpi_pm1_cnt_init(&pm->acpi_regs, &pm->io);
acpi_pm1_cnt_init(&pm->acpi_regs, &pm->io, 2);
acpi_gpe_init(&pm->acpi_regs, ICH9_PMIO_GPE0_LEN);
memory_region_init_io(&pm->io_gpe, &ich9_gpe_ops, pm, "apci-gpe0",

View File

@@ -266,7 +266,7 @@ static int acpi_load_old(QEMUFile *f, void *opaque, int version_id)
static const VMStateDescription vmstate_acpi = {
.name = "piix4_pm",
.version_id = 3,
.minimum_version_id = 3,
.minimum_version_id = 2, /* qemu-kvm */
.minimum_version_id_old = 1,
.load_state_old = acpi_load_old,
.post_load = vmstate_acpi_post_load,
@@ -418,7 +418,7 @@ static int piix4_pm_initfn(PCIDevice *dev)
acpi_pm_tmr_init(&s->ar, pm_tmr_timer, &s->io);
acpi_pm1_evt_init(&s->ar, pm_tmr_timer, &s->io);
acpi_pm1_cnt_init(&s->ar, &s->io);
acpi_pm1_cnt_init(&s->ar, &s->io, s->s4_val);
acpi_gpe_init(&s->ar, GPE_LEN);
s->powerdown_notifier.notify = piix4_pm_powerdown_req;

View File

@@ -430,7 +430,7 @@ static int baum_eat_packet(BaumDriverState *baum, const uint8_t *buf, int len)
}
/* The other end is writing some data. Store it and try to interpret */
static int baum_write(CharDriverState *chr, const uint8_t *buf, int len)
static size_t baum_write(CharDriverState *chr, const uint8_t *buf, size_t len)
{
BaumDriverState *baum = chr->opaque;
int tocopy, cur, eaten, orig_len = len;

View File

@@ -294,8 +294,8 @@ static int csrhci_data_len(const uint8_t *pkt)
exit(-1);
}
static int csrhci_write(struct CharDriverState *chr,
const uint8_t *buf, int len)
static size_t csrhci_write(struct CharDriverState *chr,
const uint8_t *buf, size_t len)
{
struct csrhci_s *s = (struct csrhci_s *) chr->opaque;
int plen = s->in_len;

View File

@@ -227,7 +227,7 @@ static void uart_parameters_setup(UartState *s)
qemu_chr_fe_ioctl(s->chr, CHR_IOCTL_SERIAL_SET_PARAMS, &ssp);
}
static int uart_can_receive(void *opaque)
static size_t uart_can_receive(void *opaque)
{
UartState *s = (UartState *)opaque;

View File

@@ -36,6 +36,10 @@
#include "monitor/monitor.h"
#include "hw/ccid.h"
#ifdef CONFIG_SECCOMP
#include "sysemu/seccomp.h"
#endif
#define DPRINTF(card, lvl, fmt, ...) \
do {\
if (lvl <= card->debug) {\
@@ -234,6 +238,10 @@ static void *handle_apdu_thread(void* arg)
VReaderStatus reader_status;
EmulEvent *event;
#ifdef CONFIG_SECCOMP
seccomp_start(!!0);
#endif
while (1) {
qemu_mutex_lock(&card->handle_apdu_mutex);
qemu_cond_wait(&card->handle_apdu_cond, &card->handle_apdu_mutex);
@@ -281,6 +289,10 @@ static void *event_thread(void *arg)
VEvent *event = NULL;
EmulatedState *card = arg;
#ifdef CONFIG_SECCOMP
seccomp_start(!!0);
#endif
while (1) {
const char *reader_name;

View File

@@ -104,7 +104,7 @@ static void ccid_card_vscard_send_init(PassthruState *s)
(uint8_t *)&msg, sizeof(msg));
}
static int ccid_card_vscard_can_read(void *opaque)
static size_t ccid_card_vscard_can_read(void * opaque)
{
PassthruState *card = opaque;
@@ -203,14 +203,14 @@ static void ccid_card_vscard_drop_connection(PassthruState *card)
card->vscard_in_pos = card->vscard_in_hdr = 0;
}
static void ccid_card_vscard_read(void *opaque, const uint8_t *buf, int size)
static void ccid_card_vscard_read(void *opaque, const uint8_t *buf, size_t size)
{
PassthruState *card = opaque;
VSCMsgHeader *hdr;
if (card->vscard_in_pos + size > VSCARD_IN_SIZE) {
error_report(
"no room for data: pos %d + size %d > %d. dropping connection.",
"no room for data: pos %d + size %zu > %d. dropping connection.",
card->vscard_in_pos, size, VSCARD_IN_SIZE);
ccid_card_vscard_drop_connection(card);
return;

View File

@@ -172,27 +172,14 @@
#define CIRRUS_PNPMMIO_SIZE 0x1000
#define BLTUNSAFE(s) \
( \
( /* check dst is within bounds */ \
(s)->cirrus_blt_height * ABS((s)->cirrus_blt_dstpitch) \
+ ((s)->cirrus_blt_dstaddr & (s)->cirrus_addr_mask) > \
(s)->vga.vram_size \
) || \
( /* check src is within bounds */ \
(s)->cirrus_blt_height * ABS((s)->cirrus_blt_srcpitch) \
+ ((s)->cirrus_blt_srcaddr & (s)->cirrus_addr_mask) > \
(s)->vga.vram_size \
) \
)
struct CirrusVGAState;
typedef void (*cirrus_bitblt_rop_t) (struct CirrusVGAState *s,
uint8_t * dst, const uint8_t * src,
uint32_t dstaddr, uint32_t srcaddr,
int dstpitch, int srcpitch,
int bltwidth, int bltheight);
typedef void (*cirrus_fill_t)(struct CirrusVGAState *s,
uint8_t *dst, int dst_pitch, int width, int height);
uint32_t dstaddr, int dst_pitch,
int width, int height);
typedef struct CirrusVGAState {
VGACommonState vga;
@@ -273,19 +260,108 @@ static void cirrus_update_memory_access(CirrusVGAState *s);
*
***************************************/
static bool blit_region_is_unsafe(struct CirrusVGAState *s,
int32_t pitch, int32_t addr)
{
if (!pitch) {
return true;
}
if (pitch < 0) {
int64_t min = addr
+ ((int64_t)s->cirrus_blt_height - 1) * pitch
- s->cirrus_blt_width;
if (min < -1 || addr >= s->vga.vram_size) {
return true;
}
} else {
int64_t max = addr
+ ((int64_t)s->cirrus_blt_height-1) * pitch
+ s->cirrus_blt_width;
if (max > s->vga.vram_size) {
return true;
}
}
return false;
}
static bool blit_is_unsafe(struct CirrusVGAState *s, bool dst_only)
{
/* should be the case, see cirrus_bitblt_start */
assert(s->cirrus_blt_width > 0);
assert(s->cirrus_blt_height > 0);
if (s->cirrus_blt_width > CIRRUS_BLTBUFSIZE) {
return true;
}
if (blit_region_is_unsafe(s, s->cirrus_blt_dstpitch,
s->cirrus_blt_dstaddr)) {
return true;
}
if (dst_only) {
return false;
}
if (blit_region_is_unsafe(s, s->cirrus_blt_srcpitch,
s->cirrus_blt_srcaddr)) {
return true;
}
return false;
}
static void cirrus_bitblt_rop_nop(CirrusVGAState *s,
uint8_t *dst,const uint8_t *src,
uint32_t dstaddr, uint32_t srcaddr,
int dstpitch,int srcpitch,
int bltwidth,int bltheight)
{
}
static void cirrus_bitblt_fill_nop(CirrusVGAState *s,
uint8_t *dst,
uint32_t dstaddr,
int dstpitch, int bltwidth,int bltheight)
{
}
static inline uint8_t cirrus_src(CirrusVGAState *s, uint32_t srcaddr)
{
if (s->cirrus_srccounter) {
/* cputovideo */
return s->cirrus_bltbuf[srcaddr & (CIRRUS_BLTBUFSIZE - 1)];
} else {
/* videotovideo */
return s->vga.vram_ptr[srcaddr & s->cirrus_addr_mask];
}
}
static inline uint16_t cirrus_src16(CirrusVGAState *s, uint32_t srcaddr)
{
uint16_t *src;
if (s->cirrus_srccounter) {
/* cputovideo */
src = (void *)&s->cirrus_bltbuf[srcaddr & (CIRRUS_BLTBUFSIZE - 1) & ~1];
} else {
/* videotovideo */
src = (void *)&s->vga.vram_ptr[srcaddr & s->cirrus_addr_mask & ~1];
}
return *src;
}
static inline uint32_t cirrus_src32(CirrusVGAState *s, uint32_t srcaddr)
{
uint32_t *src;
if (s->cirrus_srccounter) {
/* cputovideo */
src = (void *)&s->cirrus_bltbuf[srcaddr & (CIRRUS_BLTBUFSIZE - 1) & ~3];
} else {
/* videotovideo */
src = (void *)&s->vga.vram_ptr[srcaddr & s->cirrus_addr_mask & ~3];
}
return *src;
}
#define ROP_NAME 0
#define ROP_FN(d, s) 0
#include "cirrus_vga_rop.h"
@@ -615,25 +691,54 @@ static void cirrus_invalidate_region(CirrusVGAState * s, int off_begin,
int off_cur;
int off_cur_end;
if (off_pitch < 0) {
off_begin -= bytesperline - 1;
}
for (y = 0; y < lines; y++) {
off_cur = off_begin;
off_cur_end = (off_cur + bytesperline) & s->cirrus_addr_mask;
assert(off_cur_end >= off_cur);
memory_region_set_dirty(&s->vga.vram, off_cur, off_cur_end - off_cur);
off_begin += off_pitch;
}
}
static int cirrus_bitblt_common_patterncopy(CirrusVGAState * s,
const uint8_t * src)
static int cirrus_bitblt_common_patterncopy(CirrusVGAState *s, bool videosrc)
{
uint8_t *dst;
uint32_t patternsize;
uint8_t *src;
dst = s->vga.vram_ptr + (s->cirrus_blt_dstaddr & s->cirrus_addr_mask);
if (videosrc) {
switch (s->vga.get_bpp(&s->vga)) {
case 8:
patternsize = 64;
break;
case 15:
case 16:
patternsize = 128;
break;
case 24:
case 32:
default:
patternsize = 256;
break;
}
s->cirrus_blt_srcaddr &= ~(patternsize - 1);
if (s->cirrus_blt_srcaddr + patternsize > s->vga.vram_size) {
return 0;
}
src = s->vga.vram_ptr + s->cirrus_blt_srcaddr;
} else {
src = s->cirrus_bltbuf;
}
if (BLTUNSAFE(s))
if (blit_is_unsafe(s, true)) {
return 0;
}
(*s->cirrus_rop) (s, dst, src,
(*s->cirrus_rop) (s, s->cirrus_blt_dstaddr,
videosrc ? s->cirrus_blt_srcaddr : 0,
s->cirrus_blt_dstpitch, 0,
s->cirrus_blt_width, s->cirrus_blt_height);
cirrus_invalidate_region(s, s->cirrus_blt_dstaddr,
@@ -648,10 +753,11 @@ static int cirrus_bitblt_solidfill(CirrusVGAState *s, int blt_rop)
{
cirrus_fill_t rop_func;
if (BLTUNSAFE(s))
if (blit_is_unsafe(s, true)) {
return 0;
}
rop_func = cirrus_fill[rop_to_index[blt_rop]][s->cirrus_blt_pixelwidth - 1];
rop_func(s, s->vga.vram_ptr + (s->cirrus_blt_dstaddr & s->cirrus_addr_mask),
rop_func(s, s->cirrus_blt_dstaddr,
s->cirrus_blt_dstpitch,
s->cirrus_blt_width, s->cirrus_blt_height);
cirrus_invalidate_region(s, s->cirrus_blt_dstaddr,
@@ -669,12 +775,10 @@ static int cirrus_bitblt_solidfill(CirrusVGAState *s, int blt_rop)
static int cirrus_bitblt_videotovideo_patterncopy(CirrusVGAState * s)
{
return cirrus_bitblt_common_patterncopy(s,
s->vga.vram_ptr + ((s->cirrus_blt_srcaddr & ~7) &
s->cirrus_addr_mask));
return cirrus_bitblt_common_patterncopy(s, true);
}
static void cirrus_do_copy(CirrusVGAState *s, int dst, int src, int w, int h)
static int cirrus_do_copy(CirrusVGAState *s, int dst, int src, int w, int h)
{
int sx = 0, sy = 0;
int dx = 0, dy = 0;
@@ -688,6 +792,9 @@ static void cirrus_do_copy(CirrusVGAState *s, int dst, int src, int w, int h)
int width, height;
depth = s->vga.get_bpp(&s->vga) / 8;
if (!depth) {
return 0;
}
s->vga.get_resolution(&s->vga, &width, &height);
/* extra x, y */
@@ -717,42 +824,35 @@ static void cirrus_do_copy(CirrusVGAState *s, int dst, int src, int w, int h)
}
}
/* we have to flush all pending changes so that the copy
is generated at the appropriate moment in time */
if (notify)
vga_hw_update();
(*s->cirrus_rop) (s, s->vga.vram_ptr +
(s->cirrus_blt_dstaddr & s->cirrus_addr_mask),
s->vga.vram_ptr +
(s->cirrus_blt_srcaddr & s->cirrus_addr_mask),
(*s->cirrus_rop) (s, s->cirrus_blt_dstaddr,
s->cirrus_blt_srcaddr,
s->cirrus_blt_dstpitch, s->cirrus_blt_srcpitch,
s->cirrus_blt_width, s->cirrus_blt_height);
if (notify)
qemu_console_copy(s->vga.ds,
sx, sy, dx, dy,
dpy_gfx_update(s->vga.ds, dx, dy,
s->cirrus_blt_width / depth,
s->cirrus_blt_height);
/* we don't have to notify the display that this portion has
changed since qemu_console_copy implies this */
changed since dpy_gfx_update implies this */
cirrus_invalidate_region(s, s->cirrus_blt_dstaddr,
s->cirrus_blt_dstpitch, s->cirrus_blt_width,
s->cirrus_blt_height);
return 1;
}
static int cirrus_bitblt_videotovideo_copy(CirrusVGAState * s)
{
if (BLTUNSAFE(s))
if (blit_is_unsafe(s, false)) {
return 0;
}
cirrus_do_copy(s, s->cirrus_blt_dstaddr - s->vga.start_addr,
return cirrus_do_copy(s, s->cirrus_blt_dstaddr - s->vga.start_addr,
s->cirrus_blt_srcaddr - s->vga.start_addr,
s->cirrus_blt_width, s->cirrus_blt_height);
return 1;
}
/***************************************
@@ -768,16 +868,15 @@ static void cirrus_bitblt_cputovideo_next(CirrusVGAState * s)
if (s->cirrus_srccounter > 0) {
if (s->cirrus_blt_mode & CIRRUS_BLTMODE_PATTERNCOPY) {
cirrus_bitblt_common_patterncopy(s, s->cirrus_bltbuf);
cirrus_bitblt_common_patterncopy(s, false);
the_end:
s->cirrus_srccounter = 0;
cirrus_bitblt_reset(s);
} else {
/* at least one scan line */
do {
(*s->cirrus_rop)(s, s->vga.vram_ptr +
(s->cirrus_blt_dstaddr & s->cirrus_addr_mask),
s->cirrus_bltbuf, 0, 0, s->cirrus_blt_width, 1);
(*s->cirrus_rop)(s, s->cirrus_blt_dstaddr,
0, 0, 0, s->cirrus_blt_width, 1);
cirrus_invalidate_region(s, s->cirrus_blt_dstaddr, 0,
s->cirrus_blt_width, 1);
s->cirrus_blt_dstaddr += s->cirrus_blt_dstpitch;
@@ -823,6 +922,10 @@ static int cirrus_bitblt_cputovideo(CirrusVGAState * s)
{
int w;
if (blit_is_unsafe(s, true)) {
return 0;
}
s->cirrus_blt_mode &= ~CIRRUS_BLTMODE_MEMSYSSRC;
s->cirrus_srcptr = &s->cirrus_bltbuf[0];
s->cirrus_srcptr_end = &s->cirrus_bltbuf[0];
@@ -848,6 +951,10 @@ static int cirrus_bitblt_cputovideo(CirrusVGAState * s)
}
s->cirrus_srccounter = s->cirrus_blt_srcpitch * s->cirrus_blt_height;
}
/* the blit_is_unsafe call above should catch this */
assert(s->cirrus_blt_srcpitch <= CIRRUS_BLTBUFSIZE);
s->cirrus_srcptr = s->cirrus_bltbuf;
s->cirrus_srcptr_end = s->cirrus_bltbuf + s->cirrus_blt_srcpitch;
cirrus_update_memory_access(s);
@@ -895,6 +1002,9 @@ static void cirrus_bitblt_start(CirrusVGAState * s)
s->cirrus_blt_modeext = s->vga.gr[0x33];
blt_rop = s->vga.gr[0x32];
s->cirrus_blt_dstaddr &= s->cirrus_addr_mask;
s->cirrus_blt_srcaddr &= s->cirrus_addr_mask;
#ifdef DEBUG_BITBLT
printf("rop=0x%02x mode=0x%02x modeext=0x%02x w=%d h=%d dpitch=%d spitch=%d daddr=0x%08x saddr=0x%08x writemask=0x%02x\n",
blt_rop,
@@ -1907,15 +2017,14 @@ static void cirrus_mem_writeb_mode4and5_8bpp(CirrusVGAState * s,
unsigned val = mem_value;
uint8_t *dst;
dst = s->vga.vram_ptr + (offset &= s->cirrus_addr_mask);
for (x = 0; x < 8; x++) {
dst = s->vga.vram_ptr + ((offset + x) & s->cirrus_addr_mask);
if (val & 0x80) {
*dst = s->cirrus_shadow_gr1;
} else if (mode == 5) {
*dst = s->cirrus_shadow_gr0;
}
val <<= 1;
dst++;
}
memory_region_set_dirty(&s->vga.vram, offset, 8);
}
@@ -1929,8 +2038,8 @@ static void cirrus_mem_writeb_mode4and5_16bpp(CirrusVGAState * s,
unsigned val = mem_value;
uint8_t *dst;
dst = s->vga.vram_ptr + (offset &= s->cirrus_addr_mask);
for (x = 0; x < 8; x++) {
dst = s->vga.vram_ptr + ((offset + 2 * x) & s->cirrus_addr_mask & ~1);
if (val & 0x80) {
*dst = s->cirrus_shadow_gr1;
*(dst + 1) = s->vga.gr[0x11];
@@ -1939,7 +2048,6 @@ static void cirrus_mem_writeb_mode4and5_16bpp(CirrusVGAState * s,
*(dst + 1) = s->vga.gr[0x10];
}
val <<= 1;
dst += 2;
}
memory_region_set_dirty(&s->vga.vram, offset, 16);
}

View File

@@ -22,31 +22,65 @@
* THE SOFTWARE.
*/
static inline void glue(rop_8_,ROP_NAME)(uint8_t *dst, uint8_t src)
static inline void glue(rop_8_, ROP_NAME)(CirrusVGAState *s,
uint32_t dstaddr, uint8_t src)
{
uint8_t *dst = &s->vga.vram_ptr[dstaddr & s->cirrus_addr_mask];
*dst = ROP_FN(*dst, src);
}
static inline void glue(rop_16_,ROP_NAME)(uint16_t *dst, uint16_t src)
static inline void glue(rop_tr_8_, ROP_NAME)(CirrusVGAState *s,
uint32_t dstaddr, uint8_t src,
uint8_t transp)
{
uint8_t *dst = &s->vga.vram_ptr[dstaddr & s->cirrus_addr_mask];
uint8_t pixel = ROP_FN(*dst, src);
if (pixel != transp) {
*dst = pixel;
}
}
static inline void glue(rop_16_, ROP_NAME)(CirrusVGAState *s,
uint32_t dstaddr, uint16_t src)
{
uint16_t *dst = (uint16_t *)
(&s->vga.vram_ptr[dstaddr & s->cirrus_addr_mask & ~1]);
*dst = ROP_FN(*dst, src);
}
static inline void glue(rop_32_,ROP_NAME)(uint32_t *dst, uint32_t src)
static inline void glue(rop_tr_16_, ROP_NAME)(CirrusVGAState *s,
uint32_t dstaddr, uint16_t src,
uint16_t transp)
{
uint16_t *dst = (uint16_t *)
(&s->vga.vram_ptr[dstaddr & s->cirrus_addr_mask & ~1]);
uint16_t pixel = ROP_FN(*dst, src);
if (pixel != transp) {
*dst = pixel;
}
}
static inline void glue(rop_32_, ROP_NAME)(CirrusVGAState *s,
uint32_t dstaddr, uint32_t src)
{
uint32_t *dst = (uint32_t *)
(&s->vga.vram_ptr[dstaddr & s->cirrus_addr_mask & ~3]);
*dst = ROP_FN(*dst, src);
}
#define ROP_OP(d, s) glue(rop_8_,ROP_NAME)(d, s)
#define ROP_OP_16(d, s) glue(rop_16_,ROP_NAME)(d, s)
#define ROP_OP_32(d, s) glue(rop_32_,ROP_NAME)(d, s)
#define ROP_OP(st, d, s) glue(rop_8_, ROP_NAME)(st, d, s)
#define ROP_OP_TR(st, d, s, t) glue(rop_tr_8_, ROP_NAME)(st, d, s, t)
#define ROP_OP_16(st, d, s) glue(rop_16_, ROP_NAME)(st, d, s)
#define ROP_OP_TR_16(st, d, s, t) glue(rop_tr_16_, ROP_NAME)(st, d, s, t)
#define ROP_OP_32(st, d, s) glue(rop_32_, ROP_NAME)(st, d, s)
#undef ROP_FN
static void
glue(cirrus_bitblt_rop_fwd_, ROP_NAME)(CirrusVGAState *s,
uint8_t *dst,const uint8_t *src,
int dstpitch,int srcpitch,
int bltwidth,int bltheight)
uint32_t dstaddr,
uint32_t srcaddr,
int dstpitch, int srcpitch,
int bltwidth, int bltheight)
{
int x,y;
dstpitch -= bltwidth;
@@ -59,134 +93,139 @@ glue(cirrus_bitblt_rop_fwd_, ROP_NAME)(CirrusVGAState *s,
for (y = 0; y < bltheight; y++) {
for (x = 0; x < bltwidth; x++) {
ROP_OP(dst, *src);
dst++;
src++;
ROP_OP(s, dstaddr, cirrus_src(s, srcaddr));
dstaddr++;
srcaddr++;
}
dst += dstpitch;
src += srcpitch;
dstaddr += dstpitch;
srcaddr += srcpitch;
}
}
static void
glue(cirrus_bitblt_rop_bkwd_, ROP_NAME)(CirrusVGAState *s,
uint8_t *dst,const uint8_t *src,
int dstpitch,int srcpitch,
int bltwidth,int bltheight)
uint32_t dstaddr,
uint32_t srcaddr,
int dstpitch, int srcpitch,
int bltwidth, int bltheight)
{
int x,y;
dstpitch += bltwidth;
srcpitch += bltwidth;
for (y = 0; y < bltheight; y++) {
for (x = 0; x < bltwidth; x++) {
ROP_OP(dst, *src);
dst--;
src--;
ROP_OP(s, dstaddr, cirrus_src(s, srcaddr));
dstaddr--;
srcaddr--;
}
dst += dstpitch;
src += srcpitch;
dstaddr += dstpitch;
srcaddr += srcpitch;
}
}
static void
glue(glue(cirrus_bitblt_rop_fwd_transp_, ROP_NAME),_8)(CirrusVGAState *s,
uint8_t *dst,const uint8_t *src,
int dstpitch,int srcpitch,
int bltwidth,int bltheight)
uint32_t dstaddr,
uint32_t srcaddr,
int dstpitch,
int srcpitch,
int bltwidth,
int bltheight)
{
int x,y;
uint8_t p;
uint8_t transp = s->vga.gr[0x34];
dstpitch -= bltwidth;
srcpitch -= bltwidth;
if (bltheight > 1 && (dstpitch < 0 || srcpitch < 0)) {
return;
}
for (y = 0; y < bltheight; y++) {
for (x = 0; x < bltwidth; x++) {
p = *dst;
ROP_OP(&p, *src);
if (p != s->vga.gr[0x34]) *dst = p;
dst++;
src++;
ROP_OP_TR(s, dstaddr, cirrus_src(s, srcaddr), transp);
dstaddr++;
srcaddr++;
}
dst += dstpitch;
src += srcpitch;
dstaddr += dstpitch;
srcaddr += srcpitch;
}
}
static void
glue(glue(cirrus_bitblt_rop_bkwd_transp_, ROP_NAME),_8)(CirrusVGAState *s,
uint8_t *dst,const uint8_t *src,
int dstpitch,int srcpitch,
int bltwidth,int bltheight)
uint32_t dstaddr,
uint32_t srcaddr,
int dstpitch,
int srcpitch,
int bltwidth,
int bltheight)
{
int x,y;
uint8_t p;
uint8_t transp = s->vga.gr[0x34];
dstpitch += bltwidth;
srcpitch += bltwidth;
for (y = 0; y < bltheight; y++) {
for (x = 0; x < bltwidth; x++) {
p = *dst;
ROP_OP(&p, *src);
if (p != s->vga.gr[0x34]) *dst = p;
dst--;
src--;
ROP_OP_TR(s, dstaddr, cirrus_src(s, srcaddr), transp);
dstaddr--;
srcaddr--;
}
dst += dstpitch;
src += srcpitch;
dstaddr += dstpitch;
srcaddr += srcpitch;
}
}
static void
glue(glue(cirrus_bitblt_rop_fwd_transp_, ROP_NAME),_16)(CirrusVGAState *s,
uint8_t *dst,const uint8_t *src,
int dstpitch,int srcpitch,
int bltwidth,int bltheight)
uint32_t dstaddr,
uint32_t srcaddr,
int dstpitch,
int srcpitch,
int bltwidth,
int bltheight)
{
int x,y;
uint8_t p1, p2;
uint16_t transp = s->vga.gr[0x34] | (uint16_t)s->vga.gr[0x35] << 8;
dstpitch -= bltwidth;
srcpitch -= bltwidth;
if (bltheight > 1 && (dstpitch < 0 || srcpitch < 0)) {
return;
}
for (y = 0; y < bltheight; y++) {
for (x = 0; x < bltwidth; x+=2) {
p1 = *dst;
p2 = *(dst+1);
ROP_OP(&p1, *src);
ROP_OP(&p2, *(src + 1));
if ((p1 != s->vga.gr[0x34]) || (p2 != s->vga.gr[0x35])) {
*dst = p1;
*(dst+1) = p2;
}
dst+=2;
src+=2;
ROP_OP_TR_16(s, dstaddr, cirrus_src16(s, srcaddr), transp);
dstaddr += 2;
srcaddr += 2;
}
dst += dstpitch;
src += srcpitch;
dstaddr += dstpitch;
srcaddr += srcpitch;
}
}
static void
glue(glue(cirrus_bitblt_rop_bkwd_transp_, ROP_NAME),_16)(CirrusVGAState *s,
uint8_t *dst,const uint8_t *src,
int dstpitch,int srcpitch,
int bltwidth,int bltheight)
uint32_t dstaddr,
uint32_t srcaddr,
int dstpitch,
int srcpitch,
int bltwidth,
int bltheight)
{
int x,y;
uint8_t p1, p2;
uint16_t transp = s->vga.gr[0x34] | (uint16_t)s->vga.gr[0x35] << 8;
dstpitch += bltwidth;
srcpitch += bltwidth;
for (y = 0; y < bltheight; y++) {
for (x = 0; x < bltwidth; x+=2) {
p1 = *(dst-1);
p2 = *dst;
ROP_OP(&p1, *(src - 1));
ROP_OP(&p2, *src);
if ((p1 != s->vga.gr[0x34]) || (p2 != s->vga.gr[0x35])) {
*(dst-1) = p1;
*dst = p2;
}
dst-=2;
src-=2;
ROP_OP_TR_16(s, dstaddr - 1, cirrus_src16(s, srcaddr - 1), transp);
dstaddr -= 2;
srcaddr -= 2;
}
dst += dstpitch;
src += srcpitch;
dstaddr += dstpitch;
srcaddr += srcpitch;
}
}

View File

@@ -23,30 +23,32 @@
*/
#if DEPTH == 8
#define PUTPIXEL() ROP_OP(&d[0], col)
#define PUTPIXEL(s, a, c) ROP_OP(s, a, c)
#elif DEPTH == 16
#define PUTPIXEL() ROP_OP_16((uint16_t *)&d[0], col)
#define PUTPIXEL(s, a, c) ROP_OP_16(s, a, c)
#elif DEPTH == 24
#define PUTPIXEL() ROP_OP(&d[0], col); \
ROP_OP(&d[1], (col >> 8)); \
ROP_OP(&d[2], (col >> 16))
#define PUTPIXEL(s, a, c) do { \
ROP_OP(s, a, c); \
ROP_OP(s, a + 1, (col >> 8)); \
ROP_OP(s, a + 2, (col >> 16)); \
} while (0)
#elif DEPTH == 32
#define PUTPIXEL() ROP_OP_32(((uint32_t *)&d[0]), col)
#define PUTPIXEL(s, a, c) ROP_OP_32(s, a, c)
#else
#error unsupported DEPTH
#endif
static void
glue(glue(glue(cirrus_patternfill_, ROP_NAME), _),DEPTH)
(CirrusVGAState * s, uint8_t * dst,
const uint8_t * src,
(CirrusVGAState *s, uint32_t dstaddr,
uint32_t srcaddr,
int dstpitch, int srcpitch,
int bltwidth, int bltheight)
{
uint8_t *d;
uint32_t addr;
int x, y, pattern_y, pattern_pitch, pattern_x;
unsigned int col;
const uint8_t *src1;
uint32_t src1addr;
#if DEPTH == 24
int skipleft = s->vga.gr[0x2f] & 0x1f;
#else
@@ -63,42 +65,44 @@ glue(glue(glue(cirrus_patternfill_, ROP_NAME), _),DEPTH)
pattern_y = s->cirrus_blt_srcaddr & 7;
for(y = 0; y < bltheight; y++) {
pattern_x = skipleft;
d = dst + skipleft;
src1 = src + pattern_y * pattern_pitch;
addr = dstaddr + skipleft;
src1addr = srcaddr + pattern_y * pattern_pitch;
for (x = skipleft; x < bltwidth; x += (DEPTH / 8)) {
#if DEPTH == 8
col = src1[pattern_x];
col = cirrus_src(s, src1addr + pattern_x);
pattern_x = (pattern_x + 1) & 7;
#elif DEPTH == 16
col = ((uint16_t *)(src1 + pattern_x))[0];
col = cirrus_src16(s, src1addr + pattern_x);
pattern_x = (pattern_x + 2) & 15;
#elif DEPTH == 24
{
const uint8_t *src2 = src1 + pattern_x * 3;
col = src2[0] | (src2[1] << 8) | (src2[2] << 16);
uint32_t src2addr = src1addr + pattern_x * 3;
col = cirrus_src(s, src2addr) |
(cirrus_src(s, src2addr + 1) << 8) |
(cirrus_src(s, src2addr + 2) << 16);
pattern_x = (pattern_x + 1) & 7;
}
#else
col = ((uint32_t *)(src1 + pattern_x))[0];
col = cirrus_src32(s, src1addr + pattern_x);
pattern_x = (pattern_x + 4) & 31;
#endif
PUTPIXEL();
d += (DEPTH / 8);
PUTPIXEL(s, addr, col);
addr += (DEPTH / 8);
}
pattern_y = (pattern_y + 1) & 7;
dst += dstpitch;
dstaddr += dstpitch;
}
}
/* NOTE: srcpitch is ignored */
static void
glue(glue(glue(cirrus_colorexpand_transp_, ROP_NAME), _),DEPTH)
(CirrusVGAState * s, uint8_t * dst,
const uint8_t * src,
(CirrusVGAState *s, uint32_t dstaddr,
uint32_t srcaddr,
int dstpitch, int srcpitch,
int bltwidth, int bltheight)
{
uint8_t *d;
uint32_t addr;
int x, y;
unsigned bits, bits_xor;
unsigned int col;
@@ -122,33 +126,33 @@ glue(glue(glue(cirrus_colorexpand_transp_, ROP_NAME), _),DEPTH)
for(y = 0; y < bltheight; y++) {
bitmask = 0x80 >> srcskipleft;
bits = *src++ ^ bits_xor;
d = dst + dstskipleft;
bits = cirrus_src(s, srcaddr++) ^ bits_xor;
addr = dstaddr + dstskipleft;
for (x = dstskipleft; x < bltwidth; x += (DEPTH / 8)) {
if ((bitmask & 0xff) == 0) {
bitmask = 0x80;
bits = *src++ ^ bits_xor;
bits = cirrus_src(s, srcaddr++) ^ bits_xor;
}
index = (bits & bitmask);
if (index) {
PUTPIXEL();
PUTPIXEL(s, addr, col);
}
d += (DEPTH / 8);
addr += (DEPTH / 8);
bitmask >>= 1;
}
dst += dstpitch;
dstaddr += dstpitch;
}
}
static void
glue(glue(glue(cirrus_colorexpand_, ROP_NAME), _),DEPTH)
(CirrusVGAState * s, uint8_t * dst,
const uint8_t * src,
(CirrusVGAState *s, uint32_t dstaddr,
uint32_t srcaddr,
int dstpitch, int srcpitch,
int bltwidth, int bltheight)
{
uint32_t colors[2];
uint8_t *d;
uint32_t addr;
int x, y;
unsigned bits;
unsigned int col;
@@ -160,30 +164,30 @@ glue(glue(glue(cirrus_colorexpand_, ROP_NAME), _),DEPTH)
colors[1] = s->cirrus_blt_fgcol;
for(y = 0; y < bltheight; y++) {
bitmask = 0x80 >> srcskipleft;
bits = *src++;
d = dst + dstskipleft;
bits = cirrus_src(s, srcaddr++);
addr = dstaddr + dstskipleft;
for (x = dstskipleft; x < bltwidth; x += (DEPTH / 8)) {
if ((bitmask & 0xff) == 0) {
bitmask = 0x80;
bits = *src++;
bits = cirrus_src(s, srcaddr++);
}
col = colors[!!(bits & bitmask)];
PUTPIXEL();
d += (DEPTH / 8);
PUTPIXEL(s, addr, col);
addr += (DEPTH / 8);
bitmask >>= 1;
}
dst += dstpitch;
dstaddr += dstpitch;
}
}
static void
glue(glue(glue(cirrus_colorexpand_pattern_transp_, ROP_NAME), _),DEPTH)
(CirrusVGAState * s, uint8_t * dst,
const uint8_t * src,
(CirrusVGAState *s, uint32_t dstaddr,
uint32_t srcaddr,
int dstpitch, int srcpitch,
int bltwidth, int bltheight)
{
uint8_t *d;
uint32_t addr;
int x, y, bitpos, pattern_y;
unsigned int bits, bits_xor;
unsigned int col;
@@ -205,30 +209,30 @@ glue(glue(glue(cirrus_colorexpand_pattern_transp_, ROP_NAME), _),DEPTH)
pattern_y = s->cirrus_blt_srcaddr & 7;
for(y = 0; y < bltheight; y++) {
bits = src[pattern_y] ^ bits_xor;
bits = cirrus_src(s, srcaddr + pattern_y) ^ bits_xor;
bitpos = 7 - srcskipleft;
d = dst + dstskipleft;
addr = dstaddr + dstskipleft;
for (x = dstskipleft; x < bltwidth; x += (DEPTH / 8)) {
if ((bits >> bitpos) & 1) {
PUTPIXEL();
PUTPIXEL(s, addr, col);
}
d += (DEPTH / 8);
addr += (DEPTH / 8);
bitpos = (bitpos - 1) & 7;
}
pattern_y = (pattern_y + 1) & 7;
dst += dstpitch;
dstaddr += dstpitch;
}
}
static void
glue(glue(glue(cirrus_colorexpand_pattern_, ROP_NAME), _),DEPTH)
(CirrusVGAState * s, uint8_t * dst,
const uint8_t * src,
(CirrusVGAState *s, uint32_t dstaddr,
uint32_t srcaddr,
int dstpitch, int srcpitch,
int bltwidth, int bltheight)
{
uint32_t colors[2];
uint8_t *d;
uint32_t addr;
int x, y, bitpos, pattern_y;
unsigned int bits;
unsigned int col;
@@ -240,40 +244,39 @@ glue(glue(glue(cirrus_colorexpand_pattern_, ROP_NAME), _),DEPTH)
pattern_y = s->cirrus_blt_srcaddr & 7;
for(y = 0; y < bltheight; y++) {
bits = src[pattern_y];
bits = cirrus_src(s, srcaddr + pattern_y);
bitpos = 7 - srcskipleft;
d = dst + dstskipleft;
addr = dstaddr + dstskipleft;
for (x = dstskipleft; x < bltwidth; x += (DEPTH / 8)) {
col = colors[(bits >> bitpos) & 1];
PUTPIXEL();
d += (DEPTH / 8);
PUTPIXEL(s, addr, col);
addr += (DEPTH / 8);
bitpos = (bitpos - 1) & 7;
}
pattern_y = (pattern_y + 1) & 7;
dst += dstpitch;
dstaddr += dstpitch;
}
}
static void
glue(glue(glue(cirrus_fill_, ROP_NAME), _),DEPTH)
(CirrusVGAState *s,
uint8_t *dst, int dst_pitch,
uint32_t dstaddr, int dst_pitch,
int width, int height)
{
uint8_t *d, *d1;
uint32_t addr;
uint32_t col;
int x, y;
col = s->cirrus_blt_fgcol;
d1 = dst;
for(y = 0; y < height; y++) {
d = d1;
addr = dstaddr;
for(x = 0; x < width; x += (DEPTH / 8)) {
PUTPIXEL();
d += (DEPTH / 8);
PUTPIXEL(s, addr, col);
addr += (DEPTH / 8);
}
d1 += dst_pitch;
dstaddr += dst_pitch;
}
}

View File

@@ -22,6 +22,10 @@
#include "hw/virtio-blk.h"
#include "hw/dataplane/virtio-blk.h"
#ifdef CONFIG_SECCOMP
#include "sysemu/seccomp.h"
#endif
enum {
SEG_MAX = 126, /* maximum number of I/O segments */
VRING_MAX = SEG_MAX + 2, /* maximum number of vring descriptors */
@@ -356,6 +360,10 @@ static void *data_plane_thread(void *opaque)
{
VirtIOBlockDataPlane *s = opaque;
#ifdef CONFIG_SECCOMP
seccomp_start(!!0);
#endif
do {
event_poll(&s->event_poll);
} while (!s->stopping || s->num_reqs > 0);

View File

@@ -17,7 +17,7 @@
#ifndef VRING_H
#define VRING_H
#include <linux/virtio_ring.h>
#include "standard-headers/linux/virtio_ring.h"
#include "qemu-common.h"
#include "hw/dataplane/hostmem.h"
#include "hw/virtio.h"

View File

@@ -594,6 +594,9 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
msh = hdr + tp->mss;
do {
bytes = split_size;
if (tp->size >= msh) {
goto eop;
}
if (tp->size + bytes > msh)
bytes = msh - tp->size;
@@ -608,7 +611,8 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
memmove(tp->data, tp->header, hdr);
tp->size = hdr;
}
} while (split_size -= bytes);
split_size -= bytes;
} while (bytes && split_size);
} else if (!tp->tse && tp->cptse) {
// context descriptor TSE is not set, while data descriptor TSE is set
DBGOUT(TXERR, "TCP segmentation error\n");
@@ -618,6 +622,7 @@ process_tx_desc(E1000State *s, struct e1000_tx_desc *dp)
tp->size += split_size;
}
eop:
if (!(txd_lower & E1000_TXD_CMD_EOP))
return;
if (!(tp->tse && tp->cptse && tp->size < hdr))
@@ -683,7 +688,8 @@ start_xmit(E1000State *s)
* bogus values to TDT/TDLEN.
* there's nothing too intelligent we could do about this.
*/
if (s->mac_reg[TDH] == tdh_start) {
if (s->mac_reg[TDH] == tdh_start ||
tdh_start >= s->mac_reg[TDLEN] / sizeof(desc)) {
DBGOUT(TXERR, "TDH wraparound @%x, TDT %x, TDLEN %x\n",
tdh_start, s->mac_reg[TDT], s->mac_reg[TDLEN]);
break;
@@ -888,7 +894,8 @@ e1000_receive(NetClientState *nc, const uint8_t *buf, size_t size)
if (++s->mac_reg[RDH] * sizeof(desc) >= s->mac_reg[RDLEN])
s->mac_reg[RDH] = 0;
/* see comment in start_xmit; same here */
if (s->mac_reg[RDH] == rdh_start) {
if (s->mac_reg[RDH] == rdh_start ||
rdh_start >= s->mac_reg[RDLEN] / sizeof(desc)) {
DBGOUT(RXERR, "RDH wraparound @%x, RDT %x, RDLEN %x\n",
rdh_start, s->mac_reg[RDT], s->mac_reg[RDLEN]);
set_ics(s, 0, E1000_ICS_RXO);

View File

@@ -774,6 +774,11 @@ static void tx_command(EEPRO100State *s)
#if 0
uint16_t tx_buffer_el = lduw_le_pci_dma(&s->dev, tbd_address + 6);
#endif
if (tx_buffer_size == 0) {
/* Prevent an endless loop. */
logout("loop in %s:%u\n", __FILE__, __LINE__);
break;
}
tbd_address += 8;
TRACE(RXTX, logout
("TBD (simplified mode): buffer address 0x%08x, size 0x%04x\n",
@@ -855,6 +860,10 @@ static void set_multicast_list(EEPRO100State *s)
static void action_command(EEPRO100State *s)
{
/* The loop below won't stop if it gets special handcrafted data.
Therefore we limit the number of iterations. */
unsigned max_loop_count = 16;
for (;;) {
bool bit_el;
bool bit_s;
@@ -870,6 +879,13 @@ static void action_command(EEPRO100State *s)
#if 0
bool bit_sf = ((s->tx.command & COMMAND_SF) != 0);
#endif
if (max_loop_count-- == 0) {
/* Prevent an endless loop. */
logout("loop in %s:%u\n", __FILE__, __LINE__);
break;
}
s->cu_offset = s->tx.link;
TRACE(OTHER,
logout("val=(cu start), status=0x%04x, command=0x%04x, link=0x%08x\n",
@@ -1848,6 +1864,7 @@ static void pci_nic_uninit(PCIDevice *pci_dev)
memory_region_destroy(&s->io_bar);
memory_region_destroy(&s->flash_bar);
vmstate_unregister(&pci_dev->qdev, s->vmstate, s);
g_free(s->vmstate);
eeprom93xx_free(&pci_dev->qdev, s->eeprom);
qemu_del_nic(s->nic);
}

View File

@@ -788,6 +788,9 @@ static void es1370_transfer_audio (ES1370State *s, struct chan *d, int loop_sel,
int csc_bytes = (csc + 1) << d->shift;
int cnt = d->frame_cnt >> 16;
int size = d->frame_cnt & 0xffff;
if (size < cnt) {
return;
}
int left = ((size - cnt + 1) << 2) + d->leftover;
int transferred = 0;
int temp = audio_MIN (max, audio_MIN (left, csc_bytes));
@@ -796,7 +799,7 @@ static void es1370_transfer_audio (ES1370State *s, struct chan *d, int loop_sel,
addr += (cnt << 2) + d->leftover;
if (index == ADC_CHANNEL) {
while (temp) {
while (temp > 0) {
int acquired, to_copy;
to_copy = audio_MIN ((size_t) temp, sizeof (tmpbuf));
@@ -814,7 +817,7 @@ static void es1370_transfer_audio (ES1370State *s, struct chan *d, int loop_sel,
else {
SWVoiceOut *voice = s->dac_voice[index];
while (temp) {
while (temp > 0) {
int copied, to_copy;
to_copy = audio_MIN ((size_t) temp, sizeof (tmpbuf));

View File

@@ -208,7 +208,7 @@ struct SerialState {
#define R_EXTINT 15
static void handle_kbd_command(ChannelState *s, int val);
static int serial_can_receive(void *opaque);
static size_t serial_can_receive(void *opaque);
static void serial_receive_byte(ChannelState *s, int ch);
static void clear_queue(void *opaque)
@@ -610,10 +610,10 @@ static const MemoryRegionOps escc_mem_ops = {
},
};
static int serial_can_receive(void *opaque)
static size_t serial_can_receive(void *opaque)
{
ChannelState *s = opaque;
int ret;
size_t ret;
if (((s->wregs[W_RXCTRL] & RXCTRL_RXEN) == 0) // Rx not enabled
|| ((s->rregs[R_STATUS] & STATUS_RXAV) == STATUS_RXAV))

View File

@@ -80,7 +80,7 @@ void esp_request_cancelled(SCSIRequest *req)
}
}
static uint32_t get_cmd(ESPState *s, uint8_t *buf)
static uint32_t get_cmd(ESPState *s, uint8_t *buf, uint8_t buflen)
{
uint32_t dmalen;
int target;
@@ -90,6 +90,9 @@ static uint32_t get_cmd(ESPState *s, uint8_t *buf)
dmalen = s->rregs[ESP_TCLO];
dmalen |= s->rregs[ESP_TCMID] << 8;
dmalen |= s->rregs[ESP_TCHI] << 16;
if (dmalen > buflen) {
return 0;
}
s->dma_memory_read(s->dma_opaque, buf, dmalen);
} else {
dmalen = s->ti_size;
@@ -164,7 +167,7 @@ static void handle_satn(ESPState *s)
s->dma_cb = handle_satn;
return;
}
len = get_cmd(s, buf);
len = get_cmd(s, buf, sizeof(buf));
if (len)
do_cmd(s, buf);
}
@@ -178,7 +181,7 @@ static void handle_s_without_atn(ESPState *s)
s->dma_cb = handle_s_without_atn;
return;
}
len = get_cmd(s, buf);
len = get_cmd(s, buf, sizeof(buf));
if (len) {
do_busid_cmd(s, buf, 0);
}
@@ -190,7 +193,7 @@ static void handle_satn_stop(ESPState *s)
s->dma_cb = handle_satn_stop;
return;
}
s->cmdlen = get_cmd(s, s->cmdbuf);
s->cmdlen = get_cmd(s, s->cmdbuf, sizeof(s->cmdbuf));
if (s->cmdlen) {
trace_esp_handle_satn_stop(s->cmdlen);
s->do_cmd = 1;
@@ -214,7 +217,7 @@ static void write_response(ESPState *s)
} else {
s->ti_size = 2;
s->ti_rptr = 0;
s->ti_wptr = 0;
s->ti_wptr = 2;
s->rregs[ESP_RFLAGS] = 2;
}
esp_raise_irq(s);
@@ -395,19 +398,17 @@ uint64_t esp_reg_read(ESPState *s, uint32_t saddr)
trace_esp_mem_readb(saddr, s->rregs[saddr]);
switch (saddr) {
case ESP_FIFO:
if (s->ti_size > 0) {
if ((s->rregs[ESP_RSTAT] & STAT_PIO_MASK) == 0) {
/* Data out. */
qemu_log_mask(LOG_UNIMP, "esp: PIO data read not implemented\n");
s->rregs[ESP_FIFO] = 0;
esp_raise_irq(s);
} else if (s->ti_rptr < s->ti_wptr) {
s->ti_size--;
if ((s->rregs[ESP_RSTAT] & STAT_PIO_MASK) == 0) {
/* Data out. */
qemu_log_mask(LOG_UNIMP,
"esp: PIO data read not implemented\n");
s->rregs[ESP_FIFO] = 0;
} else {
s->rregs[ESP_FIFO] = s->ti_buf[s->ti_rptr++];
}
s->rregs[ESP_FIFO] = s->ti_buf[s->ti_rptr++];
esp_raise_irq(s);
}
if (s->ti_size == 0) {
if (s->ti_rptr == s->ti_wptr) {
s->ti_rptr = 0;
s->ti_wptr = 0;
}
@@ -439,8 +440,12 @@ void esp_reg_write(ESPState *s, uint32_t saddr, uint64_t val)
break;
case ESP_FIFO:
if (s->do_cmd) {
s->cmdbuf[s->cmdlen++] = val & 0xff;
} else if (s->ti_size == TI_BUFSZ - 1) {
if (s->cmdlen < TI_BUFSZ) {
s->cmdbuf[s->cmdlen++] = val & 0xff;
} else {
trace_esp_error_fifo_overrun();
}
} else if (s->ti_wptr == TI_BUFSZ - 1) {
trace_esp_error_fifo_overrun();
} else {
s->ti_size++;

View File

@@ -156,7 +156,7 @@ static const MemoryRegionOps ser_ops = {
}
};
static void serial_receive(void *opaque, const uint8_t *buf, int size)
static void serial_receive(void *opaque, const uint8_t *buf, size_t size)
{
struct etrax_serial *s = opaque;
int i;
@@ -177,10 +177,10 @@ static void serial_receive(void *opaque, const uint8_t *buf, int size)
ser_update_irq(s);
}
static int serial_can_receive(void *opaque)
static size_t serial_can_receive(void *opaque)
{
struct etrax_serial *s = opaque;
int r;
size_t r;
/* Is the receiver enabled? */
if (!(s->regs[RW_REC_CTRL] & (1 << 3))) {

View File

@@ -484,7 +484,7 @@ static const MemoryRegionOps exynos4210_uart_ops = {
},
};
static int exynos4210_uart_can_receive(void *opaque)
static size_t exynos4210_uart_can_receive(void *opaque)
{
Exynos4210UartState *s = (Exynos4210UartState *)opaque;

View File

@@ -1432,7 +1432,7 @@ static uint32_t fdctrl_read_data(FDCtrl *fdctrl)
{
FDrive *cur_drv;
uint32_t retval = 0;
int pos;
uint32_t pos;
cur_drv = get_cur_drv(fdctrl);
fdctrl->dsr &= ~FD_DSR_PWRDOWN;
@@ -1441,8 +1441,8 @@ static uint32_t fdctrl_read_data(FDCtrl *fdctrl)
return 0;
}
pos = fdctrl->data_pos;
pos %= FD_SECTOR_LEN;
if (fdctrl->msr & FD_MSR_NONDMA) {
pos %= FD_SECTOR_LEN;
if (pos == 0) {
if (fdctrl->data_pos != 0)
if (!fdctrl_seek_to_next_sect(fdctrl, cur_drv)) {
@@ -1786,10 +1786,13 @@ static void fdctrl_handle_option(FDCtrl *fdctrl, int direction)
static void fdctrl_handle_drive_specification_command(FDCtrl *fdctrl, int direction)
{
FDrive *cur_drv = get_cur_drv(fdctrl);
uint32_t pos;
if (fdctrl->fifo[fdctrl->data_pos - 1] & 0x80) {
pos = fdctrl->data_pos - 1;
pos %= FD_SECTOR_LEN;
if (fdctrl->fifo[pos] & 0x80) {
/* Command parameters done */
if (fdctrl->fifo[fdctrl->data_pos - 1] & 0x40) {
if (fdctrl->fifo[pos] & 0x40) {
fdctrl->fifo[0] = fdctrl->fifo[1];
fdctrl->fifo[2] = 0;
fdctrl->fifo[3] = 0;
@@ -1889,7 +1892,7 @@ static uint8_t command_to_handler[256];
static void fdctrl_write_data(FDCtrl *fdctrl, uint32_t value)
{
FDrive *cur_drv;
int pos;
uint32_t pos;
/* Reset mode */
if (!(fdctrl->dor & FD_DOR_nRESET)) {
@@ -1937,7 +1940,9 @@ static void fdctrl_write_data(FDCtrl *fdctrl, uint32_t value)
}
FLOPPY_DPRINTF("%s: %02x\n", __func__, value);
fdctrl->fifo[fdctrl->data_pos++] = value;
pos = fdctrl->data_pos++;
pos %= FD_SECTOR_LEN;
fdctrl->fifo[pos] = value;
if (fdctrl->data_pos == fdctrl->data_len) {
/* We now have all parameters
* and will be able to treat the command

View File

@@ -203,12 +203,15 @@ static void fw_cfg_reboot(FWCfgState *s)
static void fw_cfg_write(FWCfgState *s, uint8_t value)
{
int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
FWCfgEntry *e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
FWCfgEntry *e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
&s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
trace_fw_cfg_write(s, value);
if (s->cur_entry & FW_CFG_WRITE_CHANNEL && e->callback &&
s->cur_offset < e->len) {
if (s->cur_entry & FW_CFG_WRITE_CHANNEL
&& e != NULL
&& e->callback
&& s->cur_offset < e->len) {
e->data[s->cur_offset++] = value;
if (s->cur_offset == e->len) {
e->callback(e->callback_opaque, e->data);
@@ -237,7 +240,8 @@ static int fw_cfg_select(FWCfgState *s, uint16_t key)
static uint8_t fw_cfg_read(FWCfgState *s)
{
int arch = !!(s->cur_entry & FW_CFG_ARCH_LOCAL);
FWCfgEntry *e = &s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
FWCfgEntry *e = (s->cur_entry == FW_CFG_INVALID) ? NULL :
&s->entries[arch][s->cur_entry & FW_CFG_ENTRY_MASK];
uint8_t ret;
if (s->cur_entry == FW_CFG_INVALID || !e->data || s->cur_offset >= e->len)

View File

@@ -125,14 +125,14 @@ static void uart_add_to_fifo(UART *uart,
uart->len += length;
}
static int grlib_apbuart_can_receive(void *opaque)
static size_t grlib_apbuart_can_receive(void *opaque)
{
UART *uart = opaque;
return FIFO_LENGTH - uart->len;
}
static void grlib_apbuart_receive(void *opaque, const uint8_t *buf, int size)
static void grlib_apbuart_receive(void *opaque, const uint8_t *buf, size_t size)
{
UART *uart = opaque;

View File

@@ -222,6 +222,18 @@ static int hpet_pre_load(void *opaque)
return 0;
}
static bool hpet_validate_num_timers(void *opaque, int version_id)
{
HPETState *s = opaque;
if (s->num_timers < HPET_MIN_TIMERS) {
return false;
} else if (s->num_timers > HPET_MAX_TIMERS) {
return false;
}
return true;
}
static int hpet_post_load(void *opaque, int version_id)
{
HPETState *s = opaque;
@@ -290,6 +302,7 @@ static const VMStateDescription vmstate_hpet = {
VMSTATE_UINT64(isr, HPETState),
VMSTATE_UINT64(hpet_counter, HPETState),
VMSTATE_UINT8_V(num_timers, HPETState, 2),
VMSTATE_VALIDATE("num_timers in range", hpet_validate_num_timers),
VMSTATE_STRUCT_VARRAY_UINT8(timer, HPETState, num_timers, 0,
vmstate_hpet_timer, HPETTimer),
VMSTATE_END_OF_LIST()

View File

@@ -187,6 +187,12 @@ static uint64_t pit_ioport_read(void *opaque, hwaddr addr,
PITChannelState *s;
addr &= 3;
if (addr == 3) {
/* Mode/Command register is write only, read is ignored */
return 0;
}
s = &pit->channels[addr];
if (s->status_latched) {
s->status_latched = 0;

View File

@@ -266,6 +266,12 @@ static int pit_dispatch_post_load(void *opaque, int version_id)
return 0;
}
static bool is_qemu_kvm(void *opaque, int version_id)
{
/* HACK: We ignore incoming migration from upstream qemu */
return version_id < 3;
}
static const VMStateDescription vmstate_pit_common = {
.name = "i8254",
.version_id = 3,
@@ -275,6 +281,7 @@ static const VMStateDescription vmstate_pit_common = {
.pre_save = pit_dispatch_pre_save,
.post_load = pit_dispatch_post_load,
.fields = (VMStateField[]) {
VMSTATE_UNUSED_TEST(is_qemu_kvm, 4),
VMSTATE_UINT32_V(channels[0].irq_disabled, PITCommonState, 3),
VMSTATE_STRUCT_ARRAY(channels, PITCommonState, 3, 2,
vmstate_pit_channel, PITChannelState),

View File

@@ -636,7 +636,8 @@ static void ahci_write_fis_d2h(AHCIDevice *ad, uint8_t *cmd_fis)
}
}
static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist, int offset)
static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist,
int32_t offset)
{
AHCICmdHdr *cmd = ad->cur_cmd;
uint32_t opts = le32_to_cpu(cmd->opts);
@@ -647,11 +648,19 @@ static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist, int offset)
uint8_t *prdt;
int i;
int r = 0;
int sum = 0;
uint64_t sum = 0;
int off_idx = -1;
int off_pos = -1;
int64_t off_pos = -1;
int tbl_entry_size;
/*
* Note: AHCI PRDT can describe up to 256GiB. SATA/ATA only support
* transactions of up to 32MiB as of ATA8-ACS3 rev 1b, assuming a
* 512 byte sector size. We limit the PRDT in this implementation to
* a reasonably large 2GiB, which can accommodate the maximum transfer
* request for sector sizes up to 32K.
*/
if (!sglist_alloc_hint) {
DPRINTF(ad->port_no, "no sg list given by guest: 0x%08x\n", opts);
return -1;
@@ -686,7 +695,7 @@ static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist, int offset)
}
if ((off_idx == -1) || (off_pos < 0) || (off_pos > tbl_entry_size)) {
DPRINTF(ad->port_no, "%s: Incorrect offset! "
"off_idx: %d, off_pos: %d\n",
"off_idx: %d, off_pos: %"PRId64"\n",
__func__, off_idx, off_pos);
r = -1;
goto out;
@@ -700,6 +709,13 @@ static int ahci_populate_sglist(AHCIDevice *ad, QEMUSGList *sglist, int offset)
/* flags_size is zero-based */
qemu_sglist_add(sglist, le64_to_cpu(tbl[i].addr),
le32_to_cpu(tbl[i].flags_size) + 1);
if (sglist->size > INT32_MAX) {
error_report("AHCI Physical Region Descriptor Table describes "
"more than 2 GiB.\n");
qemu_sglist_destroy(sglist);
r = -1;
goto out;
}
}
}
@@ -709,6 +725,16 @@ out:
return r;
}
static void ncq_err(NCQTransferState *ncq_tfs)
{
IDEState *ide_state = &ncq_tfs->drive->port.ifs[0];
ide_state->error = ABRT_ERR;
ide_state->status = READY_STAT | ERR_STAT;
ncq_tfs->drive->port_regs.scr_err |= (1 << ncq_tfs->tag);
ncq_tfs->used = 0;
}
static void ncq_cb(void *opaque, int ret)
{
NCQTransferState *ncq_tfs = (NCQTransferState *)opaque;
@@ -718,10 +744,7 @@ static void ncq_cb(void *opaque, int ret)
ncq_tfs->drive->port_regs.scr_act &= ~(1 << ncq_tfs->tag);
if (ret < 0) {
/* error */
ide_state->error = ABRT_ERR;
ide_state->status = READY_STAT | ERR_STAT;
ncq_tfs->drive->port_regs.scr_err |= (1 << ncq_tfs->tag);
ncq_err(ncq_tfs);
} else {
ide_state->status = READY_STAT | SEEK_STAT;
}
@@ -1043,16 +1066,19 @@ static void ahci_start_dma(IDEDMA *dma, IDEState *s,
dma_cb(s, 0);
}
static int ahci_dma_prepare_buf(IDEDMA *dma, int is_write)
static int32_t ahci_dma_prepare_buf(IDEDMA *dma, int is_write)
{
AHCIDevice *ad = DO_UPCAST(AHCIDevice, dma, dma);
IDEState *s = &ad->port.ifs[0];
ahci_populate_sglist(ad, &s->sg, 0);
if (ahci_populate_sglist(ad, &s->sg, s->io_buffer_offset) == -1) {
DPRINTF(ad->port_no, "ahci_dma_prepare_buf failed.\n");
return -1;
}
s->io_buffer_size = s->sg.size;
DPRINTF(ad->port_no, "len=%#x\n", s->io_buffer_size);
return s->io_buffer_size != 0;
return s->io_buffer_size;
}
static int ahci_dma_rw_buf(IDEDMA *dma, int is_write)
@@ -1104,10 +1130,15 @@ static int ahci_dma_add_status(IDEDMA *dma, int status)
}
static int ahci_dma_set_inactive(IDEDMA *dma)
{
return 0;
}
static int ahci_async_cmd_done(IDEDMA *dma)
{
AHCIDevice *ad = DO_UPCAST(AHCIDevice, dma, dma);
DPRINTF(ad->port_no, "dma done\n");
DPRINTF(ad->port_no, "async cmd done\n");
/* update d2h status */
ahci_write_fis_d2h(ad, NULL);
@@ -1142,6 +1173,7 @@ static const IDEDMAOps ahci_dma_ops = {
.set_unit = ahci_dma_set_unit,
.add_status = ahci_dma_add_status,
.set_inactive = ahci_dma_set_inactive,
.async_cmd_done = ahci_async_cmd_done,
.restart_cb = ahci_dma_restart_cb,
.reset = ahci_dma_reset,
};
@@ -1176,6 +1208,18 @@ void ahci_init(AHCIState *s, DeviceState *qdev, DMAContext *dma, int ports)
void ahci_uninit(AHCIState *s)
{
int i, j;
for (i = 0; i < s->ports; i++) {
AHCIDevice *ad = &s->dev[i];
for (j = 0; j < 2; j++) {
IDEState *s = &ad->port.ifs[j];
ide_exit(s);
}
}
memory_region_destroy(&s->mem);
memory_region_destroy(&s->idp);
g_free(s->dev);
@@ -1270,7 +1314,7 @@ const VMStateDescription vmstate_ahci = {
VMSTATE_UINT32(control_regs.impl, AHCIState),
VMSTATE_UINT32(control_regs.version, AHCIState),
VMSTATE_UINT32(idp_index, AHCIState),
VMSTATE_INT32(ports, AHCIState),
VMSTATE_INT32_EQUAL(ports, AHCIState),
VMSTATE_END_OF_LIST()
},
};

View File

@@ -231,6 +231,8 @@ void ide_atapi_cmd_reply_end(IDEState *s)
s->packet_transfer_size -= size;
s->elementary_transfer_size -= size;
s->io_buffer_index += size;
assert(size <= s->io_buffer_total_len);
assert(s->io_buffer_index <= s->io_buffer_total_len);
ide_transfer_start(s, s->io_buffer + s->io_buffer_index - size,
size, ide_atapi_cmd_reply_end);
ide_set_irq(s->bus);
@@ -879,6 +881,7 @@ static void cmd_start_stop_unit(IDEState *s, uint8_t* buf)
if (pwrcnd) {
/* eject/load only happens for power condition == 0 */
ide_atapi_cmd_ok(s);
return;
}

View File

@@ -568,10 +568,18 @@ static void dma_buf_commit(IDEState *s)
qemu_sglist_destroy(&s->sg);
}
static void ide_async_cmd_done(IDEState *s)
{
if (s->bus->dma->ops->async_cmd_done) {
s->bus->dma->ops->async_cmd_done(s->bus->dma);
}
}
void ide_set_inactive(IDEState *s)
{
s->bus->dma->aiocb = NULL;
s->bus->dma->ops->set_inactive(s->bus->dma);
ide_async_cmd_done(s);
}
void ide_dma_error(IDEState *s)
@@ -651,10 +659,11 @@ void ide_dma_cb(void *opaque, int ret)
n = s->nsector;
s->io_buffer_index = 0;
s->io_buffer_size = n * 512;
if (s->bus->dma->ops->prepare_buf(s->bus->dma, ide_cmd_is_read(s)) == 0) {
if (s->bus->dma->ops->prepare_buf(s->bus->dma, ide_cmd_is_read(s)) < 512) {
/* The PRDs were too short. Reset the Active bit, but don't raise an
* interrupt. */
s->status = READY_STAT | SEEK_STAT;
dma_buf_commit(s);
goto eot;
}
@@ -804,6 +813,7 @@ static void ide_flush_cb(void *opaque, int ret)
bdrv_acct_done(s->bs, &s->acct);
s->status = READY_STAT | SEEK_STAT;
ide_async_cmd_done(s);
ide_set_irq(s->bus);
}
@@ -814,6 +824,7 @@ void ide_flush_cache(IDEState *s)
return;
}
s->status |= BUSY_STAT;
bdrv_acct_start(s->bs, &s->acct, 0, BDRV_ACCT_FLUSH);
bdrv_aio_flush(s->bs, ide_flush_cb, s);
}
@@ -1013,11 +1024,11 @@ void ide_ioport_write(void *opaque, uint32_t addr, uint32_t val)
static const uint8_t ide_cmd_table[0x100] = {
/* NOP not implemented, mandatory for CD */
[CFA_REQ_EXT_ERROR_CODE] = CFA_OK,
[WIN_DSM] = ALL_OK,
[WIN_DSM] = HD_CFA_OK,
[WIN_DEVICE_RESET] = CD_OK,
[WIN_RECAL] = HD_CFA_OK,
[WIN_READ] = ALL_OK,
[WIN_READ_ONCE] = ALL_OK,
[WIN_READ_ONCE] = HD_CFA_OK,
[WIN_READ_EXT] = HD_CFA_OK,
[WIN_READDMA_EXT] = HD_CFA_OK,
[WIN_READ_NATIVE_MAX_EXT] = HD_CFA_OK,
@@ -1036,12 +1047,12 @@ static const uint8_t ide_cmd_table[0x100] = {
[CFA_TRANSLATE_SECTOR] = CFA_OK,
[WIN_DIAGNOSE] = ALL_OK,
[WIN_SPECIFY] = HD_CFA_OK,
[WIN_STANDBYNOW2] = ALL_OK,
[WIN_IDLEIMMEDIATE2] = ALL_OK,
[WIN_STANDBY2] = ALL_OK,
[WIN_SETIDLE2] = ALL_OK,
[WIN_CHECKPOWERMODE2] = ALL_OK,
[WIN_SLEEPNOW2] = ALL_OK,
[WIN_STANDBYNOW2] = HD_CFA_OK,
[WIN_IDLEIMMEDIATE2] = HD_CFA_OK,
[WIN_STANDBY2] = HD_CFA_OK,
[WIN_SETIDLE2] = HD_CFA_OK,
[WIN_CHECKPOWERMODE2] = HD_CFA_OK,
[WIN_SLEEPNOW2] = HD_CFA_OK,
[WIN_PACKETCMD] = CD_OK,
[WIN_PIDENTIFY] = CD_OK,
[WIN_SMART] = HD_CFA_OK,
@@ -1055,19 +1066,19 @@ static const uint8_t ide_cmd_table[0x100] = {
[WIN_WRITEDMA] = HD_CFA_OK,
[WIN_WRITEDMA_ONCE] = HD_CFA_OK,
[CFA_WRITE_MULTI_WO_ERASE] = CFA_OK,
[WIN_STANDBYNOW1] = ALL_OK,
[WIN_IDLEIMMEDIATE] = ALL_OK,
[WIN_STANDBY] = ALL_OK,
[WIN_SETIDLE1] = ALL_OK,
[WIN_CHECKPOWERMODE1] = ALL_OK,
[WIN_SLEEPNOW1] = ALL_OK,
[WIN_STANDBYNOW1] = HD_CFA_OK,
[WIN_IDLEIMMEDIATE] = HD_CFA_OK,
[WIN_STANDBY] = HD_CFA_OK,
[WIN_SETIDLE1] = HD_CFA_OK,
[WIN_CHECKPOWERMODE1] = HD_CFA_OK,
[WIN_SLEEPNOW1] = HD_CFA_OK,
[WIN_FLUSH_CACHE] = ALL_OK,
[WIN_FLUSH_CACHE_EXT] = HD_CFA_OK,
[WIN_IDENTIFY] = ALL_OK,
[WIN_SETFEATURES] = ALL_OK,
[IBM_SENSE_CONDITION] = CFA_OK,
[CFA_WEAR_LEVEL] = HD_CFA_OK,
[WIN_READ_NATIVE_MAX] = ALL_OK,
[WIN_READ_NATIVE_MAX] = HD_CFA_OK,
};
static bool ide_cmd_permitted(IDEState *s, uint32_t cmd)
@@ -1603,7 +1614,7 @@ void ide_exec_cmd(IDEBus *bus, uint32_t val)
case 2: /* extended self test */
s->smart_selftest_count++;
if(s->smart_selftest_count > 21)
s->smart_selftest_count = 0;
s->smart_selftest_count = 1;
n = 2 + (s->smart_selftest_count - 1) * 24;
s->smart_selftest_data[n] = s->sector;
s->smart_selftest_data[n+1] = 0x00; /* OK and finished */
@@ -1790,11 +1801,17 @@ void ide_data_writew(void *opaque, uint32_t addr, uint32_t val)
}
p = s->data_ptr;
if (p + 2 > s->data_end) {
return;
}
*(uint16_t *)p = le16_to_cpu(val);
p += 2;
s->data_ptr = p;
if (p >= s->data_end)
if (p >= s->data_end) {
s->status &= ~DRQ_STAT;
s->end_transfer_func(s);
}
}
uint32_t ide_data_readw(void *opaque, uint32_t addr)
@@ -1811,11 +1828,17 @@ uint32_t ide_data_readw(void *opaque, uint32_t addr)
}
p = s->data_ptr;
if (p + 2 > s->data_end) {
return 0;
}
ret = cpu_to_le16(*(uint16_t *)p);
p += 2;
s->data_ptr = p;
if (p >= s->data_end)
if (p >= s->data_end) {
s->status &= ~DRQ_STAT;
s->end_transfer_func(s);
}
return ret;
}
@@ -1832,11 +1855,17 @@ void ide_data_writel(void *opaque, uint32_t addr, uint32_t val)
}
p = s->data_ptr;
if (p + 4 > s->data_end) {
return;
}
*(uint32_t *)p = le32_to_cpu(val);
p += 4;
s->data_ptr = p;
if (p >= s->data_end)
if (p >= s->data_end) {
s->status &= ~DRQ_STAT;
s->end_transfer_func(s);
}
}
uint32_t ide_data_readl(void *opaque, uint32_t addr)
@@ -1853,11 +1882,17 @@ uint32_t ide_data_readl(void *opaque, uint32_t addr)
}
p = s->data_ptr;
if (p + 4 > s->data_end) {
return 0;
}
ret = cpu_to_le32(*(uint32_t *)p);
p += 4;
s->data_ptr = p;
if (p >= s->data_end)
if (p >= s->data_end) {
s->status &= ~DRQ_STAT;
s->end_transfer_func(s);
}
return ret;
}
@@ -2072,6 +2107,11 @@ static int ide_nop_int(IDEDMA *dma, int x)
return 0;
}
static int32_t ide_nop_int32(IDEDMA *dma, int x)
{
return 0;
}
static void ide_nop_restart(void *opaque, int x, RunState y)
{
}
@@ -2079,7 +2119,7 @@ static void ide_nop_restart(void *opaque, int x, RunState y)
static const IDEDMAOps ide_dma_nop_ops = {
.start_dma = ide_nop_start,
.start_transfer = ide_nop,
.prepare_buf = ide_nop_int,
.prepare_buf = ide_nop_int32,
.rw_buf = ide_nop_int,
.set_unit = ide_nop_int,
.add_status = ide_nop_int,
@@ -2154,6 +2194,14 @@ void ide_init2_with_non_qdev_drives(IDEBus *bus, DriveInfo *hd0,
bus->dma = &ide_dma_nop;
}
void ide_exit(IDEState *s)
{
qemu_del_timer(s->sector_write_timer);
qemu_free_timer(s->sector_write_timer);
qemu_vfree(s->smart_selftest_data);
qemu_vfree(s->io_buffer);
}
static const MemoryRegionPortio ide_portio_list[] = {
{ 0, 8, 1, .read = ide_ioport_read, .write = ide_ioport_write },
{ 0, 2, 2, .read = ide_data_readw, .write = ide_data_writew },

Some files were not shown because too many files have changed in this diff Show More