Compare commits


542 Commits

Author SHA1 Message Date
Alexander Bulekov
2a28f75130 memory: prevent dma-reentrancy issues
Git-commit: a2e1753b80
References: bsc#1190011 (CVE-2021-3750)

Add a flag to the DeviceState, when a device is engaged in PIO/MMIO/DMA.
This flag is set/checked prior to calling a device's MemoryRegion
handlers, and set when device code initiates DMA.  The purpose of this
flag is to prevent two types of DMA-based reentrancy issues:

1.) mmio -> dma -> mmio case
2.) bh -> dma write -> mmio case

These issues have led to problems such as stack-exhaustion and
use-after-frees.

Summary of the problem from Peter Maydell:
https://lore.kernel.org/qemu-devel/CAFEAcA_23vc7hE3iaM-JVA6W38LK4hJoWae5KcknhPRD5fPBZA@mail.gmail.com

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/62
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/540
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/541
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/556
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/557
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/827
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/1282
Resolves: CVE-2023-0330

Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20230427211013.2994127-2-alxndr@bu.edu>
[thuth: Replace warn_report() with warn_report_once()]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-10-20 14:57:22 +02:00
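
A rough illustration of the guard described in the commit above; the structure and helper below are hypothetical stand-ins, not QEMU's actual DeviceState API:

    /* Sketch of a per-device reentrancy guard (illustrative names only). */
    #include <stdbool.h>

    typedef struct DeviceSketch {
        bool engaged_in_io;            /* set while handling PIO/MMIO/DMA */
    } DeviceSketch;

    static bool mmio_dispatch(DeviceSketch *dev,
                              void (*handler)(DeviceSketch *))
    {
        if (dev->engaged_in_io) {
            /* mmio -> dma -> mmio (or bh -> dma -> mmio): refuse re-entry */
            return false;
        }
        dev->engaged_in_io = true;
        handler(dev);                  /* handler may itself start DMA */
        dev->engaged_in_io = false;
        return true;
    }
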
Daniel P. Berrangé
80c4ffe74c io: remove io watch if TLS channel is closed during handshake
Git-commit: 10be627d2b
References: bsc#1212850 (CVE-2023-3354)

The TLS handshake may take some time to complete, during which time an
I/O watch might be registered with the main loop. If the owner of the
I/O channel invokes qio_channel_close() while the handshake is waiting
to continue the I/O watch must be removed. Failing to remove it will
later trigger the completion callback which the owner is not expecting
to receive. In the case of the VNC server, this results in a SEGV as
vnc_disconnect_start() tries to shutdown a client connection that is
already gone / NULL.

CVE-2023-3354
Reported-by: jiangyegen <jiangyegen@huawei.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
[DB: Open-code g_clear_handle_id(), as it's not available in glib
     until version 2.56.]
Signed-off-by: Brahmajit Das <brahmajit.das@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-10-20 14:19:10 +02:00
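
The backport note above mentions open-coding g_clear_handle_id() for GLib older than 2.56; a minimal sketch of that open-coded pattern (the watch_id parameter is an illustrative stand-in for the channel's I/O watch tag):

    /* Open-coded equivalent of g_clear_handle_id(&id, g_source_remove). */
    #include <glib.h>

    static void remove_io_watch(guint *watch_id)
    {
        if (*watch_id != 0) {
            g_source_remove(*watch_id);   /* drop the main-loop I/O watch */
            *watch_id = 0;                /* so later close paths are a no-op */
        }
    }
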
Christian Schoenebeck
6a54dc762b 9pfs: prevent opening special files (CVE-2023-2861)
Git-commit: f6b0de53fb
References: bsc#1212968 CVE-2023-2861

The 9p protocol does not specifically define how a server shall behave when
a client tries to open a special file; however, from a security POV it does
make sense for the 9p server to prohibit opening any special file on the host
side in general. A sane Linux 9p client, for instance, would never attempt to
open a special file on the host side; it would always handle those exclusively
on its guest side. A malicious client, however, could potentially escape
from the exported 9p tree by creating and opening a device file on the host
side.

With QEMU this could only be exploited in the following unsafe setups:

  - Running QEMU binary as root AND 9p 'local' fs driver AND 'passthrough'
    security model.

or

  - Using 9p 'proxy' fs driver (which is running its helper daemon as
    root).

These setups were already discouraged for safety reasons before,
however for obvious reasons we are now tightening behaviour on this.

Fixes: CVE-2023-2861
Reported-by: Yanwu Shen <ywsPlz@gmail.com>
Reported-by: Jietao Xiao <shawtao1125@gmail.com>
Reported-by: Jinku Li <jkli@xidian.edu.cn>
Reported-by: Wenbo Shen <shenwenbo@zju.edu.cn>
Signed-off-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <E1q6w7r-0000Q0-NM@lizzy.crudebyte.com>
Signed-off-by: Jaroslav Jindrak <jjindrak@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-10-20 14:19:01 +02:00
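
A self-contained sketch of the kind of host-side check described above (the helper name and exact flags are illustrative; this is not the actual 9pfs code): stat the path first and only open regular files or directories.

    #include <sys/stat.h>
    #include <fcntl.h>
    #include <errno.h>

    /* Refuse to open anything that is not a regular file or directory. */
    static int open_if_not_special(int dirfd, const char *name, int flags)
    {
        struct stat st;

        if (fstatat(dirfd, name, &st, AT_SYMLINK_NOFOLLOW) < 0) {
            return -errno;
        }
        if (!S_ISREG(st.st_mode) && !S_ISDIR(st.st_mode)) {
            return -EINVAL;            /* device node, fifo, socket: reject */
        }
        return openat(dirfd, name, flags | O_NOFOLLOW);
    }
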
Philippe Mathieu-Daudé
8dcbc9d5e0 softmmu/physmem: Introduce MemTxAttrs::memory field and MEMTX_ACCESS_ERROR
Git-commit: 3ab6fdc91b
References: bsc#1190258, CVE-2021-3750

Add the 'memory' bit to the memory attributes to restrict bus
controller accesses to memories.

Introduce flatview_access_allowed() to check bus permission
before running any bus transaction.

Have read/write accessors return MEMTX_ACCESS_ERROR if an access is
restricted.

There is no change for the default case where 'memory' is not set.

Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20211215182421.418374-4-philmd@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[thuth: Replaced MEMTX_BUS_ERROR with MEMTX_ACCESS_ERROR, remove
"inline"]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jaroslav Jindrak <jjindrak@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-10-20 14:18:43 +02:00
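
An illustrative sketch of the permission gate described above; the types are simplified stand-ins for QEMU's MemTxAttrs and MemoryRegion, not the real definitions:

    #include <stdbool.h>

    typedef struct { bool memory; } TxAttrsSketch;   /* the new 'memory' bit */
    typedef struct { bool is_ram;  } RegionSketch;

    enum { TX_OK = 0, TX_ACCESS_ERROR = 1 };

    /* Reject transactions that carry the 'memory' attribute but target
     * something that is not plain RAM. */
    static int access_allowed(const RegionSketch *mr, TxAttrsSketch attrs)
    {
        if (attrs.memory && !mr->is_ram) {
            return TX_ACCESS_ERROR;
        }
        return TX_OK;
    }
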
Mauro Matteo Cascella
75eaae13cf scsi/lsi53c895a: really fix use-after-free in lsi_do_msgout (CVE-2022-0216)
Git-commit: 4367a20cc4
References: bsc#1198038, CVE-2022-0216

Set current_req to NULL, not current_req->req, to prevent reusing a freed
buffer in case of repeated SCSI cancel requests.  Also apply the fix to
CLEAR QUEUE and BUS DEVICE RESET messages, since they also cancel
the request.

Thanks to Alexander Bulekov for providing a reproducer.

Fixes: CVE-2022-0216
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/972
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20220711123316.421279-1-mcascell@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-04-20 16:18:30 +02:00
6e21fac0ae hw/nvme: fix CVE-2021-3929
Git-commit: 736b01642d
References: bsc#1193880, CVE-2021-3929

This fixes CVE-2021-3929 "locally" by denying DMA to the iomem of the
device itself. This still allows DMA to MMIO regions of other devices
(e.g. doing P2P DMA to the controller memory buffer of another NVMe
device).

Fixes: CVE-2021-3929
Reported-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Reviewed-by: Keith Busch <kbusch@kernel.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Klaus Jensen <k.jensen@samsung.com>
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-04-20 16:05:53 +02:00
Mauro Matteo Cascella
824b3bdd12 display/qxl-render: fix race condition in qxl_cursor (CVE-2021-4207)
Git-commit: 9569f5cb5b
References: bsc#1198035, CVE-2021-4207

Avoid fetching 'width' and 'height' a second time to prevent possible
race condition. Refer to security advisory
https://starlabs.sg/advisories/22-4207/ for more information.

Fixes: CVE-2021-4207
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220407081106.343235-1-mcascell@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
2023-04-20 15:03:50 +02:00
Mauro Matteo Cascella
5ccea051df ui/cursor: fix integer overflow in cursor_alloc (CVE-2021-4206)
Git-commit: fa892e9abb
References: bsc#1198035, CVE-2021-4206

Prevent potential integer overflow by limiting 'width' and 'height' to
512x512. Also change 'datasize' type to size_t. Refer to security
advisory https://starlabs.sg/advisories/22-4206/ for more information.

Fixes: CVE-2021-4206
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20220407081712.345609-1-mcascell@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Dario Faggioli <dfaggioli@suse.com>
Signed-off-by: Lin Ma <lma@suse.com>
2022-10-27 22:52:29 +08:00
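
A minimal sketch of the overflow guard described above (the limits follow the commit message; the helper itself is illustrative): validate the dimensions before multiplying and keep the size arithmetic in size_t.

    #include <stddef.h>
    #include <stdint.h>
    #include <stdlib.h>

    static void *cursor_data_alloc(int width, int height)
    {
        size_t datasize;

        if (width <= 0 || height <= 0 || width > 512 || height > 512) {
            return NULL;                          /* outside the allowed range */
        }
        datasize = (size_t)width * (size_t)height * sizeof(uint32_t);
        return calloc(1, datasize);
    }
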
Jessica Clarke
b225ecaa3e Partially revert "build: -no-pie is no functional linker flag"
Git-commit: ffd205ef29
References: bsc#1192463

This partially reverts commit bbd2d5a812.

This commit was misguided and broke using --disable-pie on any distro
that enables PIE by default in their compiler driver, including Debian
and its derivatives. Whilst -no-pie is not a linker flag, it is a
compiler driver flag that ensures -pie is not automatically passed by it
to the linker. Without it, all compile_prog checks will fail as any code
built with the explicit -fno-pie will fail to link with the implicit
default -pie due to trying to use position-dependent relocations. The
only bug that needed fixing was LDFLAGS_NOPIE being used as a flag for
the linker itself in pc-bios/optionrom/Makefile.

Note this does not reinstate exporting LDFLAGS_NOPIE, as it is unused,
since the only previous use was the one that should not have existed. I
have also updated the comment for the -fno-pie and -no-pie checks to
reflect what they're actually needed for.

Fixes: bbd2d5a812
Cc: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jessica Clarke <jrtc27@jrtc27.com>
Message-Id: <20210805192545.38279-1-jrtc27@jrtc27.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Liang Yan <lyan@suse.com>
2021-12-19 00:55:38 -05:00
Christian Ehrhardt
ce51c62486 build: -no-pie is no functional linker flag
Git-commit: bbd2d5a812
References: bsc#1192463

Recent binutils changes dropping unsupported options [1] caused a build
issue in regard to the optionroms.

  ld -m elf_i386 -T /<<PKGBUILDDIR>>/pc-bios/optionrom//flat.lds -no-pie \
    -s -o multiboot.img multiboot.o
  ld.bfd: Error: unable to disambiguate: -no-pie (did you mean --no-pie ?)

This isn't really a regression in ld.bfd; filing the bug upstream
revealed that this never worked as an ld flag [2] (in fact, it seems we
were accidentally setting --nmagic).

Since it never had the wanted effect, this usage of LDFLAGS_NOPIE should be
droppable without any effect. This is also the only use-case of LDFLAGS_NOPIE
in .mak, so we can also remove it from being added there.

[1]: https://sourceware.org/git/?p=binutils-gdb.git;a=commit;h=983d925d
[2]: https://sourceware.org/bugzilla/show_bug.cgi?id=27050#c5

Signed-off-by: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Message-Id: <20201214150938.1297512-1-christian.ehrhardt@canonical.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Liang Yan <lyan@suse.com>
2021-12-19 00:55:28 -05:00
Jason Wang
74c6626d3e virtio-net: fix use after unmap/free for sg
Git-commit: bedd7e93d0
References: bsc#1189938 CVE-2021-3748

When mergeable buffers are enabled, we try to set num_buffers after
the virtqueue elem has been unmapped. This will lead to several issues,
e.g. a use-after-free when the descriptor has an address which belongs
to a non-direct-access region. In this case we use a bounce buffer
that is allocated during address_space_map() and freed during
address_space_unmap().

Fix this by storing the elems temporarily in an array and delaying the
unmap until after we set num_buffers.

This addresses CVE-2021-3748.

Reported-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: fbe78f4f55 ("virtio-net support")
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-09-30 12:10:15 -06:00
Greg Kurz
fe21a6e681 virtio-net: handle virtio_net_receive() errors
Git-commit: ba10b9c003
References: bsc#1189938 CVE-2021-3748

All these errors are caused by a buggy guest: let's switch the device to
the broken state instead of terminating QEMU. Also we detach the element
from the virtqueue and free it.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
[JRZ: tweaked patch to bring the min. necessary]
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-09-30 11:47:44 -06:00
Gerd Hoffmann
39fc6d642a uas: add stream number sanity checks.
Git-commit: 13b250b12a
References: bsc#1189702 CVE-2021-3713

The device uses the guest-supplied stream number unchecked, which can
lead to guest-triggered out-of-bounds access to the UASDevice->data3 and
UASDevice->status3 fields.  Add the missing checks.

Fixes: CVE-2021-3713
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reported-by: Chen Zhe <chenzhe@huawei.com>
Reported-by: Tan Jingguo <tanjingguo@huawei.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210818120505.1258262-2-kraxel@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-09-30 08:11:41 -06:00
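
The missing check amounts to validating the guest-supplied stream index before it is used to index the per-stream arrays; a minimal sketch (the constant and helper are hypothetical, not the device's real definitions):

    #include <stdbool.h>

    #define MAX_STREAMS 32                 /* hypothetical array bound */

    static bool stream_in_range(unsigned int stream)
    {
        /* stream 0 means "no stream"; valid stream IDs are 1..MAX_STREAMS */
        return stream <= MAX_STREAMS;
    }
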
Gerd Hoffmann
86749a5093 usbredir: fix free call
Git-commit: 5e796671e6
References: bsc#1189145 CVE-2021-3682

data might point into the middle of a larger buffer; there is a separate
free_on_destroy pointer passed into bufp_alloc() to handle that.  It is
only used in the normal workflow though, not when dropping packets due
to the queue being full.  Fix that.

Resolves: https://gitlab.com/qemu-project/qemu/-/issues/491
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20210722072756.647673-1-kraxel@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-08-06 09:42:16 -06:00
Mauro Matteo Cascella
1382b12a0b hw/scsi/megasas: check for NULL frame in megasas_command_cancelled()
Git-commit: 0000000000000000000000000000000000000000
References: bsc#1180432, CVE-2020-35503

Ensure that 'cmd->frame' is not NULL before accessing the 'header' field.
This check prevents a potential NULL pointer dereference issue.

RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1910346
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reported-by: Cheolwoo Myung <cwmyung@snu.ac.kr>
Acked-By: Jose R Ziviani <jose.ziviani@suse.com>
2021-08-02 14:57:21 -06:00
Mark Cave-Ayland
ed3c190dd6 esp: ensure cmdfifo is not empty and current_dev is non-NULL
Git-commit: 9954575173
References: bsc#1180433, CVE-2020-35504
            bsc#1180434, CVE-2020-35505
            bsc#1180435, CVE-2020-35506

When about to execute a SCSI command, ensure that cmdfifo is not empty and
current_dev is non-NULL. This can happen if the guest tries to execute a TI
(Transfer Information) command without issuing one of the select commands
first.

Buglink: https://bugs.launchpad.net/qemu/+bug/1910723
Buglink: https://bugs.launchpad.net/qemu/+bug/1909247
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20210407195801.685-7-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-08-02 14:57:20 -06:00
Mark Cave-Ayland
82ab880bc9 esp: always check current_req is not NULL before use in DMA callbacks
Git-commit: 0db895361b
References: bsc#1180433, CVE-2020-35504
            bsc#1180434, CVE-2020-35505
            bsc#1180435, CVE-2020-35506

After issuing a SCSI command the SCSI layer can call the SCSIBusInfo .cancel
callback which resets both current_req and current_dev to NULL. If any data
is left in the transfer buffer (async_len != 0) then the next TI (Transfer
Information) command will attempt to reference the NULL pointer causing a
segfault.

Buglink: https://bugs.launchpad.net/qemu/+bug/1910723
Buglink: https://bugs.launchpad.net/qemu/+bug/1909247
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Tested-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20210407195801.685-2-mark.cave-ayland@ilande.co.uk>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-08-02 14:57:20 -06:00
Gerd Hoffmann
d17d1a9453 usb: limit combined packets to 1 MiB (CVE-2021-3527)
Git-commit: 05a40b172e
References: bsc#1186012, CVE-2021-3527

usb-host and usb-redirect try to batch bulk transfers by combining many
small usb packets into a single, large transfer request, to reduce the
overhead and improve performance.

This patch adds a size limit of 1 MiB for those combined packets to
restrict the host resources the guest can bind that way.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20210503132915.2335822-6-kraxel@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-08-02 14:57:20 -06:00
Gerd Hoffmann
15dbc2d070 usb/mtp: avoid dynamic stack allocation
Git-commit: 06aa50c06c
References: bsc#1186012, CVE-2021-3527

Use autofree heap allocation instead.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210503132915.2335822-4-kraxel@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-08-02 14:57:13 -06:00
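
The change replaces a variable-length stack array with an automatically freed heap allocation; a small GLib sketch of that pattern (the function name is illustrative):

    #include <glib.h>

    static void fill_buffer(gsize len)
    {
        /* g_autofree (GLib >= 2.44): freed automatically at end of scope,
         * instead of a guest-sized VLA on the stack. */
        g_autofree guint8 *buf = g_malloc0(len);

        if (len) {
            buf[len - 1] = 0xff;           /* example use of the buffer */
        }
    }                                      /* buf is g_free()d here */
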
Gerd Hoffmann
787ae80eca usb/redir: avoid dynamic stack allocation (CVE-2021-3527)
Git-commit: 7ec54f9eb6
References: bsc#1186012, CVE-2021-3527

Use autofree heap allocation instead.

Fixes: 4f4321c11f ("usb: use iovecs in USBPacket")
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210503132915.2335822-3-kraxel@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-08-02 13:02:43 -06:00
Gerd Hoffmann
52f7c65298 usb/hid: avoid dynamic stack allocation
Git-commit: 3f67e2e7f1
References: bsc#1186012, CVE-2021-3527

Use autofree heap allocation instead.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Tested-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20210503132915.2335822-2-kraxel@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-08-02 13:02:43 -06:00
Philippe Mathieu-Daudé
de54b40586 hw/usb/host-stub: Remove unused header
Git-commit: 1081607bfa
References: bsc#1186012, CVE-2021-3527

Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20210424224110.3442424-2-f4bug@amsat.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-08-02 13:02:43 -06:00
Jose R Ziviani
3353eac1f1 net: eepro100: validate various address values
Git-commit: 0000000000000000000000000000000000000000
References: bsc#1182651, CVE-2021-20255

Patch based on discussion:
https://lists.gnu.org/archive/html/qemu-devel/2021-02/msg06098.html

While processing controller commands, the eepro100 emulator gets the
command unit (CU) base address, the receive unit (RU) base address, or
the command block (CB) address from the guest. If these values are not
checked, they may lead to infinite-loop issues. Add checks
to avoid that.

Reported-by: Ruhr-University Bochum <bugs-syssec@rub.de>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Acked-By: Jose R Ziviani <jose.ziviani@suse.com>
2021-07-30 08:17:02 -06:00
Jose R Ziviani
7733abd6d0 libslirp: tftp: check tftp_input buffer size
Git-commit: 3f17948137155f025f7809fdc38576d5d2451c3d
References: bsc#1187366, CVE-2021-3595

Fixes: CVE-2021-3595
Fixes: https://gitlab.freedesktop.org/slirp/libslirp/-/issues/46

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-07-08 07:51:00 -06:00
Jose R Ziviani
0a9f991fd8 libslirp: udp6: check udp6_input buffer size
Git-commit: de71c15de66ba9350bf62c45b05f8fbff166517b
References: bsc#1187365, CVE-2021-3593

Fixes: CVE-2021-3593
Fixes: https://gitlab.freedesktop.org/slirp/libslirp/-/issues/45

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-07-07 13:34:09 -06:00
Jose R Ziviani
fb2fb42144 libslirp: dhcp: Always send DHCP_OPT_LEN bytes in options
Git-commit: d7fb54218424c3b2517aee5b391ced0f75386a5d
References: bsc#1187364, CVE-2021-3592

RFC2131 suggests that the options field may be at least 312 bytes.
Some DHCP clients seem to assume that it has to be at least 312 bytes.

Fixes #51
Fixes: f13cad45b25d92760bb0ad67bec0300a4d7d5275 ("bootp: limit
vendor-specific area to input packet memory buffer")

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-07-07 13:32:00 -06:00
Jose R Ziviani
54114c84b6 libslirp: udp: check udp_input buffer size
Git-commit: 74572be49247c8c5feae7c6e0b50c4f569ca9824
References: bsc#1187367, CVE-2021-3594

Fixes: CVE-2021-3594
Fixes: https://gitlab.freedesktop.org/slirp/libslirp/-/issues/47

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-07-07 11:08:32 -06:00
Jose R Ziviani
261907bacd libslirp: bootp: check bootp_input buffer size
Git-commit: 2eca0838eee1da96204545e22cdaed860d9d7c6c
References: bsc#1187364, CVE-2021-3592

Fixes: CVE-2021-3592
Fixes: https://gitlab.freedesktop.org/slirp/libslirp/-/issues/44

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-07-07 11:08:14 -06:00
Jose R Ziviani
91bff80092 libslirp: bootp: limit vendor-specific area to input packet memory buffer
Git-commit: f13cad45b25d92760bb0ad67bec0300a4d7d5275
References: bsc#1187364, CVE-2021-3592

sizeof(bootp_t) currently includes DHCP_OPT_LEN. Remove this optional field
from the structure to help the following patch check for the
minimal header size. Modify the bootp_reply() function to take the
buffer boundaries and avoid a potential buffer overflow.

Related to CVE-2021-3592.

https://gitlab.freedesktop.org/slirp/libslirp/-/issues/44

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-07-07 11:07:59 -06:00
Jose R Ziviani
43ba839b8a libslirp: slirp: Add sanity check for str option length
Git-commit: 6fd057f7960bc7b3a69f3c53de5c9d0d5d34a79c
References: bsc#1187364, CVE-2021-3592

When the user provides a long domainname or hostname that doesn't fit in the
DHCP packet, we mustn't overflow the response packet buffer. Instead,
report errors, following the g_warning() in the slirp->vdnssearch
branch.

Also check the strlen against 256 when initializing slirp; this limit
also comes from the protocol, where one byte represents the string length.
This gives an early error before the warning, which is harder to notice
or diagnose.

Reported-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-07-07 11:07:41 -06:00
Jose R Ziviani
61ebe00d01 libslirp: Add mtod_check()
Git-commit: 93e645e72a056ec0b2c16e0299fc5c6b94e4ca17
References: bsc#1187364, CVE-2021-3592
            bsc#1187367, CVE-2021-3594

Recent security issues demonstrate the lack of safety care when casting
a mbuf to a particular structure type. At least, it should check that
the buffer is large enough. The following patches will make use of this
function.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-07-07 11:07:23 -06:00
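
The helper's job is simply to verify that an mbuf really contains the number of bytes the caller is about to interpret as a protocol header; a self-contained sketch (the struct layout is illustrative, not libslirp's):

    #include <stddef.h>

    struct mbuf_sketch {
        size_t m_len;                      /* bytes of valid data */
        char  *m_data;                     /* start of the data */
    };

    /* Return the data pointer only if at least 'size' bytes are present. */
    static void *mtod_check_sketch(const struct mbuf_sketch *m, size_t size)
    {
        return (m->m_len >= size) ? m->m_data : NULL;
    }
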
Cho, Yu-Chen
880f9f107d qom: code hardening - have bound checking while looping with integer value
Git-commit: 1bf8b88f14
References: bsc#1187529 CVE-2021-3611

Object property insertion code iterates over an integer to get an unused
index that can be used as an unique name for an object property. This loop
increments the integer value indefinitely. Although very unlikely, this can
still cause an integer overflow.
In this change, we fix the above code by checking against INT16_MAX and making
sure that the interger index does not overflow beyond that value. If no
available index is found, the code would cause an assertion failure. This
assertion failure is necessary because the callers of the function do not check
the return value for NULL.

Signed-off-by: Ani Sinha <ani@anisinha.ca>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20200921093325.25617-1-ani@anisinha.ca>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Cho, Yu-Chen <acho@suse.com>
2021-07-05 18:28:08 +08:00
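
A sketch of the bounded-index pattern described above (the property-name format and the lookup callback are illustrative): stop at INT16_MAX and assert, since the callers do not check for a NULL result.

    #include <assert.h>
    #include <stdbool.h>
    #include <stdint.h>
    #include <stdio.h>

    static int find_free_index(bool (*name_in_use)(const char *name))
    {
        char name[32];

        for (int i = 0; i <= INT16_MAX; i++) {
            snprintf(name, sizeof(name), "prop[%d]", i);
            if (!name_in_use(name)) {
                return i;                  /* first unused index */
            }
        }
        assert(!"no free property index below INT16_MAX");
        return -1;
    }
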
Philippe Mathieu-Daudé
81797f1232 hw/char/bcm2835_aux: Allow less than 32-bit accesses
Git-commit: 3059344f01
References: bsc#1172382 CVE-2020-13754

The "BCM2835 ARM Peripherals" datasheet [*] chapter 2
("Auxiliaries: UART1 & SPI1, SPI2"), list the register
sizes as 3/8/16/32 bits. We assume this means this
peripheral allows 8-bit accesses.

This was not an issue until commit 5d971f9e67 which reverted
("memory: accept mismatching sizes in memory_region_access_valid").

The model is implemented as 32-bit accesses (see commit 97398d900c,
all registers are 32-bit), so replace MemoryRegionOps.valid with
MemoryRegionOps.impl, and re-introduce MemoryRegionOps.valid
with an 8/32-bit range.

[*] https://www.raspberrypi.org/app/uploads/2012/02/BCM2835-ARM-Peripherals.pdf

Fixes: 97398d900c ("bcm2835_aux: add emulation of BCM2835 AUX (aka UART1) block")
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20201002181032.1899463-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-05-26 16:00:02 -06:00
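
The mechanism here is QEMU's split between .impl (the access widths the device model implements) and .valid (the access widths the bus may issue); a hedged fragment showing how such an ops table is typically declared (assumes QEMU's MemoryRegionOps definition; the callbacks are placeholders, not the real device code):

    static const MemoryRegionOps aux_ops_sketch = {
        .read = aux_read,                  /* placeholder callbacks */
        .write = aux_write,
        .endianness = DEVICE_NATIVE_ENDIAN,
        .impl.min_access_size = 4,         /* model works on 32-bit registers */
        .impl.max_access_size = 4,
        .valid.min_access_size = 1,        /* but accept 8-bit ...            */
        .valid.max_access_size = 4,        /* ... up to 32-bit bus accesses   */
    };
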
Michael Tokarev
8f0b4c34c2 acpi: accept byte and word access to core ACPI registers
Git-commit: dba04c3488
References: bsc#1172382 CVE-2020-13754

All ISA registers should be accessible as bytes, words or dwords
(if wide enough).  Fix the access constraints for acpi-pm-evt,
acpi-pm-tmr & acpi-cnt registers.

Fixes: 5d971f9e67 (memory: Revert "memory: accept mismatching sizes in memory_region_access_valid")
Fixes: afafe4bbe0 (apci: switch cnt to memory api)
Fixes: 77d58b1e47 (apci: switch timer to memory api)
Fixes: b5a7c024d2 (apci: switch evt to memory api)
Buglink: https://lore.kernel.org/xen-devel/20200630170913.123646-1-anthony.perard@citrix.com/T/
Buglink: https://bugs.debian.org/964793
BugLink: https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=964247
BugLink: https://bugs.launchpad.net/bugs/1886318
Reported-By: Simon John <git@the-jedi.co.uk>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20200720160627.15491-1-mjt@msgid.tls.msk.ru>
Cc: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-05-11 13:15:40 -06:00
Laurent Vivier
fc37d3df8f xhci: fix valid.max_access_size to access address registers
Git-commit: 8e67fda2dd
References: bsc#1172382 CVE-2020-13754

QEMU XHCI advertises AC64 (64-bit addressing) but doesn't allow
64-bit mode access in "runtime" and "operational" MemoryRegionOps.

Set the max_access_size based on sizeof(dma_addr_t) as AC64 is set.

XHCI specs:
"If the xHC supports 64-bit addressing (AC64 = ‘1’), then software
should write 64-bit registers using only Qword accesses.  If a
system is incapable of issuing Qword accesses, then writes to the
64-bit address fields shall be performed using 2 Dword accesses;
low Dword-first, high-Dword second.  If the xHC supports 32-bit
addressing (AC64 = ‘0’), then the high Dword of registers containing
64-bit address fields are unused and software should write addresses
using only Dword accesses"

The problem has been detected with SLOF, as the Linux kernel always accesses
registers using 32-bit accesses even if AC64 is set, and was revealed by
5d971f9e67 ("memory: Revert "memory: accept mismatching sizes in memory_region_access_valid"")

Suggested-by: Alexey Kardashevskiy <aik@au1.ibm.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-id: 20200721083322.90651-1-lvivier@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-05-11 13:15:38 -06:00
Jose R Ziviani
d75bbcc367 memory: Revert "memory: accept mismatching sizes in memory_region_access_valid"
Git-commit: 5d971f9e67
References: bsc#1172382 CVE-2020-13754

Memory API documentation documents valid .min_access_size and .max_access_size
fields and explains that any access outside these boundaries is blocked.

This is what devices seem to assume.

However this is not what the implementation does: it simply
ignores the boundaries unless there's an "accepts" callback.

Naturally, this breaks a bunch of devices.

Revert to the documented behaviour.

Devices that want to allow any access can just drop the valid field,
or add the impl field to have accesses converted to appropriate
length.

Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
Fixes: CVE-2020-13754
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1842363
Fixes: a014ed07bd ("memory: accept mismatching sizes in memory_region_access_valid")
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20200610134731.1514409-1-mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jose R. Ziviani <jose.ziviani@suse.com>
2021-05-11 13:15:35 -06:00
Jose R Ziviani
e0d111d8a7 libqos: usb-hcd-ehci: use 32-bit write for config register
Git-commit: 89ed83d8b2
References: bsc#1172382 CVE-2020-13754

The memory region ops have min_access_size == 4 so obey it.

Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-05-11 13:15:32 -06:00
Jose R Ziviani
bf0df9a2ce libqos: pci-pc: use 32-bit write for EJ register
Git-commit: 4b7c06837a
References: bsc#1172382 CVE-2020-13754

The memory region ops have min_access_size == 4 so obey it.

Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-05-11 13:15:28 -06:00
Jose R Ziviani
0658952eaa ppc: add host-serial and host-model machine attributes (CVE-2019-8934)
Git-commit: 27461d69a0
Git-commit: 0a794529bd
References: bsc#1126455 CVE-2019-8934

On ppc hosts, hypervisor shares following system attributes

  - /proc/device-tree/system-id
  - /proc/device-tree/model

with a guest. This could lead to information leakage and misuse.[*]
Add machine attributes to control such system information exposure
to a guest.

[*] https://wiki.openstack.org/wiki/OSSN/OSSN-0028

Reported-by: Daniel P. Berrangé <berrange@redhat.com>
Fix-suggested-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20190218181349.23885-1-ppandit@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-05-07 22:14:53 -06:00
Ralf Haferkamp
f15ee33a43 Drop bogus IPv6 messages
Git-commit: c7ede54cbd2e2b25385325600958ba0124e31cc0
References: bsc#1172380 CVE-2020-10756

Drop IPv6 messages shorter than what's mentioned in the payload
length header (+ the size of the IPv6 header). They're invalid and could
lead to data leakage in icmp6_send_echoreply().

Signed-off-by: Jose R Ziviani <jose.ziviani@suse.com>
2021-05-07 15:00:35 -06:00
Philippe Mathieu-Daudé
f011921419 hw/intc/arm_gic: Fix interrupt ID in GICD_SGIR register
Git-commit: edfe2eb436
References: bsc#1181933 CVE-2021-20221

Per the ARM Generic Interrupt Controller Architecture specification
(document "ARM IHI 0048B.b (ID072613)"), the SGIINTID field is 4 bit,
not 10:

  - 4.3 Distributor register descriptions
  - 4.3.15 Software Generated Interrupt Register, GICD_SGIR

    - Table 4-21 GICD_SGIR bit assignments

    The Interrupt ID of the SGI to forward to the specified CPU
    interfaces. The value of this field is the Interrupt ID, in
    the range 0-15, for example a value of 0b0011 specifies
    Interrupt ID 3.

Correct the irq mask to fix an undefined behavior (which eventually
leads to a heap-buffer-overflow, see [Buglink]):

   $ echo 'writel 0x8000f00 0xff4affb0' | qemu-system-aarch64 -M virt,accel=qtest -qtest stdio
   [I 1612088147.116987] OPENED
  [R +0.278293] writel 0x8000f00 0xff4affb0
  ../hw/intc/arm_gic.c:1498:13: runtime error: index 944 out of bounds for type 'uint8_t [16][8]'
  SUMMARY: UndefinedBehaviorSanitizer: undefined-behavior ../hw/intc/arm_gic.c:1498:13

This fixes a security issue when running with KVM on Arm with
kernel-irqchip=off. (The default is kernel-irqchip=on, which is
unaffected, and which is also the correct choice for performance.)

Cc: qemu-stable@nongnu.org
Fixes: CVE-2021-20221
Fixes: 9ee6e8bb85 ("ARMv7 support.")
Buglink: https://bugs.launchpad.net/qemu/+bug/1913916
Buglink: https://bugs.launchpad.net/qemu/+bug/1913917
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20210131103401.217160-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-05-07 14:59:31 -06:00
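
The fix essentially masks the guest-written GICD_SGIR value down to the architected 4-bit SGIINTID field before it is used as an array index; a one-function sketch:

    #include <stdint.h>

    static unsigned int sgir_intid(uint32_t value)
    {
        return value & 0xf;                /* SGI interrupt IDs are 0..15 */
    }
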
Gerd Hoffmann
2f05127737 usb: fix setup_len init (CVE-2020-14364)
Git-commit: b946434f26
References: bsc#1175441, CVE-2020-14364

Store calculated setup_len in a local variable, verify it, and only
write it to the struct (USBDevice->setup_len) in case it passed the
sanity checks.

This prevents other code (do_token_{in,out} functions specifically)
from working with invalid USBDevice->setup_len values and overrunning
the USBDevice->setup_buf[] buffer.

Fixes: CVE-2020-14364
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Message-id: 20200825053636.29648-1-kraxel@redhat.com
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-04-06 16:54:16 -06:00
Mauro Matteo Cascella
99c9f5668b hw/net/net_tx_pkt: fix assertion failure in net_tx_pkt_add_raw_fragment()
Git-commit: 035e69b063
References: bsc#1174641, CVE-2020-16092

An assertion failure issue was found in the code that processes network packets
while adding data fragments into the packet context. It could be abused by a
malicious guest to abort the QEMU process on the host. This patch replaces the
affected assert() with a conditional statement, returning false if the current
data fragment exceeds max_raw_frags.

Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Ziming Zhang <ezrakiez@gmail.com>
Reviewed-by: Dmitry Fleytman <dmitry.fleytman@gmail.com>
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-04-06 16:54:16 -06:00
Mauro Matteo Cascella
ba07baa93a hw/net/xgmac: Fix buffer overflow in xgmac_enet_send()
Git-commit: 5519724a13
References: bsc#1174386 CVE-2020-15863

A buffer overflow issue was reported by Mr. Ziming Zhang, CC'd here. It
occurs while sending an Ethernet frame due to missing break statements
and improper checking of the buffer size.

Reported-by: Ziming Zhang <ezrakiez@gmail.com>
Signed-off-by: Mauro Matteo Cascella <mcascell@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-04-06 16:54:16 -06:00
Prasad J Pandit
90b147ec11 net: vmxnet3: validate configuration values during activate (CVE-2021-20203)
Git-commit: 0000000000000000000000000000000000000000
References: bsc#1181639, CVE-2021-20203

While activating the device in vmxnet3_activate_device(), it does not
validate guest-supplied configuration values against predefined
minimum/maximum limits. This may lead to integer overflow or
OOB access issues. Add checks to avoid that.

Fixes: CVE-2021-20203
Buglink: https://bugs.launchpad.net/qemu/+bug/1913873
Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-04-06 16:54:16 -06:00
Chen Qun
47662a39e6 block/iscsi:fix heap-buffer-overflow in iscsi_aio_ioctl_cb
Git-commit: ff0507c239
References: bsc#1180523, CVE-2020-11947

There is an overflow: the source 'datain.data[2]' is 100 bytes,
but 'ss' is 252 bytes. This may cause a security issue because
we can access a lot of unrelated memory data.

The len for the sbp copy should take the minimum of mx_sb_len and
sb_len_wr, not the maximum.

If we use iscsi device for VM backend storage, ASAN show stack:

READ of size 252 at 0xfffd149dcfc4 thread T0
    #0 0xaaad433d0d34 in __asan_memcpy (aarch64-softmmu/qemu-system-aarch64+0x2cb0d34)
    #1 0xaaad45f9d6d0 in iscsi_aio_ioctl_cb /qemu/block/iscsi.c:996:9
    #2 0xfffd1af0e2dc  (/usr/lib64/iscsi/libiscsi.so.8+0xe2dc)
    #3 0xfffd1af0d174  (/usr/lib64/iscsi/libiscsi.so.8+0xd174)
    #4 0xfffd1af19fac  (/usr/lib64/iscsi/libiscsi.so.8+0x19fac)
    #5 0xaaad45f9acc8 in iscsi_process_read /qemu/block/iscsi.c:403:5
    #6 0xaaad4623733c in aio_dispatch_handler /qemu/util/aio-posix.c:467:9
    #7 0xaaad4622f350 in aio_dispatch_handlers /qemu/util/aio-posix.c:510:20
    #8 0xaaad4622f350 in aio_dispatch /qemu/util/aio-posix.c:520
    #9 0xaaad46215944 in aio_ctx_dispatch /qemu/util/async.c:298:5
    #10 0xfffd1bed12f4 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x512f4)
    #11 0xaaad46227de0 in glib_pollfds_poll /qemu/util/main-loop.c:219:9
    #12 0xaaad46227de0 in os_host_main_loop_wait /qemu/util/main-loop.c:242
    #13 0xaaad46227de0 in main_loop_wait /qemu/util/main-loop.c:518
    #14 0xaaad43d9d60c in qemu_main_loop /qemu/softmmu/vl.c:1662:9
    #15 0xaaad4607a5b0 in main /qemu/softmmu/main.c:49:5
    #16 0xfffd1a460b9c in __libc_start_main (/lib64/libc.so.6+0x20b9c)
    #17 0xaaad43320740 in _start (aarch64-softmmu/qemu-system-aarch64+0x2c00740)

0xfffd149dcfc4 is located 0 bytes to the right of 100-byte region [0xfffd149dcf60,0xfffd149dcfc4)
allocated by thread T0 here:
    #0 0xaaad433d1e70 in __interceptor_malloc (aarch64-softmmu/qemu-system-aarch64+0x2cb1e70)
    #1 0xfffd1af0e254  (/usr/lib64/iscsi/libiscsi.so.8+0xe254)
    #2 0xfffd1af0d174  (/usr/lib64/iscsi/libiscsi.so.8+0xd174)
    #3 0xfffd1af19fac  (/usr/lib64/iscsi/libiscsi.so.8+0x19fac)
    #4 0xaaad45f9acc8 in iscsi_process_read /qemu/block/iscsi.c:403:5
    #5 0xaaad4623733c in aio_dispatch_handler /qemu/util/aio-posix.c:467:9
    #6 0xaaad4622f350 in aio_dispatch_handlers /qemu/util/aio-posix.c:510:20
    #7 0xaaad4622f350 in aio_dispatch /qemu/util/aio-posix.c:520
    #8 0xaaad46215944 in aio_ctx_dispatch /qemu/util/async.c:298:5
    #9 0xfffd1bed12f4 in g_main_context_dispatch (/lib64/libglib-2.0.so.0+0x512f4)
    #10 0xaaad46227de0 in glib_pollfds_poll /qemu/util/main-loop.c:219:9
    #11 0xaaad46227de0 in os_host_main_loop_wait /qemu/util/main-loop.c:242
    #12 0xaaad46227de0 in main_loop_wait /qemu/util/main-loop.c:518
    #13 0xaaad43d9d60c in qemu_main_loop /qemu/softmmu/vl.c:1662:9
    #14 0xaaad4607a5b0 in main /qemu/softmmu/main.c:49:5
    #15 0xfffd1a460b9c in __libc_start_main (/lib64/libc.so.6+0x20b9c)
    #16 0xaaad43320740 in _start (aarch64-softmmu/qemu-system-aarch64+0x2c00740)

Reported-by: Euler Robot <euler.robot@huawei.com>
Signed-off-by: Chen Qun <kuhn.chenqun@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20200418062602.10776-1-kuhn.chenqun@huawei.com
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-04-06 16:54:16 -06:00
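
The fix boils down to copying min(mx_sb_len, sb_len_wr) bytes of sense data instead of the larger of the two; a short sketch with illustrative parameter names:

    #include <stddef.h>
    #include <string.h>

    #define MIN_SZ(a, b) ((a) < (b) ? (a) : (b))

    static size_t copy_sense(void *dst, size_t mx_sb_len,
                             const void *src, size_t sb_len_wr)
    {
        size_t n = MIN_SZ(mx_sb_len, sb_len_wr);   /* never exceed either bound */
        memcpy(dst, src, n);
        return n;
    }
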
Prasad J Pandit
55a6bfcab6 exec: set map length to zero when returning NULL
Git-commit: 77f55eac6c
References: bsc#1172386, CVE-2020-13659

When mapping physical memory into the host's virtual address space,
'address_space_map' may return NULL if BounceBuffer is in_use.
Set and return '*plen = 0' to avoid a later NULL pointer dereference.

Reported-by: Alexander Bulekov <alxndr@bu.edu>
Fixes: https://bugs.launchpad.net/qemu/+bug/1878259
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20200526111743.428367-1-ppandit@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-04-06 16:54:16 -06:00
Prasad J Pandit
609d4a653b slirp: check pkt_len before reading protocol header
Git-commit: 2e1dcbc0c2af64fcb17009eaf2ceedd81be2b27f
References: bsc#1179467

While processing ARP/NCSI packets in 'arp_input' or 'ncsi_input'
routines, ensure that pkt_len is large enough to accommodate the
respective protocol headers, lest it should do an OOB access.
Add check to avoid it.

CVE-2020-29129 CVE-2020-29130
  QEMU: slirp: out-of-bounds access while processing ARP/NCSI packets
 -> https://www.openwall.com/lists/oss-security/2020/11/27/1

Reported-by: Qiuhao Li <Qiuhao.Li@outlook.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20201126135706.273950-1-ppandit@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
[BR: no NCSI code, so only ARP patched]
2021-04-06 16:53:50 -06:00
Jason Wang
873913cba4 dp8393x: switch to use qemu_receive_packet() for loopback packet
Git-commit: 331d2ac9ea

This patch switches to use qemu_receive_packet() which can detect
reentrancy and return early.

This is intended to address CVE-2021-3416.

Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:27:20 -06:00
Jason Wang
fb535e054a e1000: switch to use qemu_receive_packet() for loopback
Git-commit: 1caff0340f

This patch switches to use qemu_receive_packet() which can detect
reentrancy and return early.

This is intended to address CVE-2021-3416.

Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:27:20 -06:00
Alexander Bulekov
56c468ee59 rtl8139: switch to use qemu_receive_packet() for loopback
Git-commit: 5311fb805a
References: bsc#1182968, CVE-2021-3416

This patch switches to use qemu_receive_packet() which can detect
reentrancy and return early.

This is intended to address CVE-2021-3416.

Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Buglink: https://bugs.launchpad.net/qemu/+bug/1910826
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Signed-off-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:27:20 -06:00
Jason Wang
51f8c5a1f7 net: introduce qemu_receive_packet()
Git-commit: 705df5466c
References: bsc#1182968, CVE-2021-3416

Some NICs support loopback mode, and this is done by calling
nc->info->receive() directly, which in fact bypasses the
reentrancy check that is done in qemu_net_queue_send().

Unfortunately we can't use qemu_net_queue_send() here since for
loopback there's no sender as peer, so this patch introduces
qemu_receive_packet(), which is used for implementing loopback mode
for a NIC with this check.

NIC that supports loopback mode will be converted to this helper.

This is intended to address CVE-2021-3416.

Cc: Prasad J Pandit <ppandit@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:27:20 -06:00
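
An illustrative sketch of the reentrancy guard that the new helper relies on (simplified types, not the real net-queue code): mark the queue as delivering while the packet is handed to the receiver, and bail out instead of recursing if delivery is already in progress.

    #include <stdbool.h>
    #include <stddef.h>
    #include <sys/types.h>

    struct queue_sketch {
        bool delivering;
    };

    static ssize_t deliver_packet(struct queue_sketch *q,
                                  ssize_t (*receive)(const void *, size_t),
                                  const void *buf, size_t len)
    {
        if (q->delivering) {
            return 0;                      /* re-entered from the receiver: defer */
        }
        q->delivering = true;
        ssize_t ret = receive(buf, len);
        q->delivering = false;
        return ret;
    }
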
Jason Wang
8013ae0d0a e1000: fail early for evil descriptor
Git-commit: 3de46e6fc4
References: bsc#1182577, CVE-2021-20257

During process_tx_desc(), the driver can try to chain a data descriptor with
a legacy descriptor, which will lead to an underflow in the following
calculation of bytes in process_tx_desc():

            if (tp->size + bytes > msh)
                bytes = msh - tp->size;

This will lead to an infinite loop. So check and fail early if tp->size is
greater than or equal to msh.

Reported-by: Alexander Bulekov <alxndr@bu.edu>
Reported-by: Cheolwoo Myung <cwmyung@snu.ac.kr>
Reported-by: Ruhr-University Bochum <bugs-syssec@rub.de>
Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:27:20 -06:00
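
The early-fail check described above, in isolation (a sketch; the real code operates on the e1000's tx state): if the accumulated size has already reached msh, the 'bytes = msh - tp->size' computation would underflow.

    #include <stdbool.h>
    #include <stddef.h>

    static bool tx_size_ok(size_t tp_size, size_t msh)
    {
        return tp_size < msh;              /* fail early when tp->size >= msh */
    }
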
Prasad J Pandit
947d9d0bb6 spapr_pci: add spapr msi read method
Git-commit: 921604e175
References: bsc#1173612, CVE-2020-15469

Add spapr msi mmio read method to avoid NULL pointer dereference
issue.

Reported-by: Lei Sun <slei.casper@gmail.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20200811114133.672647-7-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:27:20 -06:00
Prasad J Pandit
ca71857357 vfio: add quirk device write method
Git-commit: 24202d2b56
References: bsc#1173612, CVE-2020-15469

Add vfio quirk device mmio write method to avoid NULL pointer
dereference issue.

Reported-by: Lei Sun <slei.casper@gmail.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20200811114133.672647-4-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:27:20 -06:00
Prasad J Pandit
07ade86129 hw/pci-host: add pci-intack write method
Git-commit: 520f26fc6d
References: bsc#1173612, CVE-2020-15469

Add pci-intack mmio write method to avoid NULL pointer dereference
issue.

Reported-by: Lei Sun <slei.casper@gmail.com>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20200811114133.672647-2-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:27:20 -06:00
Paolo Bonzini
bbd9603155 ide: atapi: assert that the buffer pointer is in range
Git-commit: 8132122889
References: bsc#1181108, CVE-2020-29443

A case was reported where s->io_buffer_index can be out of range.
The report skimped on the details but it seems to be triggered
by s->lba == -1 on the READ/READ CD paths (e.g. by sending an
ATAPI command with LBA = 0xFFFFFFFF).  For now paper over it
with assertions.  The first one ensures that there is no overflow
when incrementing s->io_buffer_index, the second checks for the
buffer overrun.

Note that the buffer overrun is only a read, so I am not sure
if the assertion failure is actually less harmful than the overrun.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20201201120926.56559-1-pbonzini@redhat.com
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:27:20 -06:00
Prasad J Pandit
e26fdc7a90 net: remove an assert call in eth_get_gso_type
Git-commit: 7564bf7701
References: bsc#1178174, CVE-2020-27617

The eth_get_gso_type() routine returns the segmentation offload type based on
the L3 protocol type. It calls g_assert_not_reached if the L3 protocol is
unknown, making the following return statement unreachable. Remove the
g_assert call, as it may be triggered by a guest user.

Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:27:20 -06:00
Prasad J Pandit
ab7e2695ad hw: usb: hcd-ohci: check for processed TD before retire
Git-commit: 1be90ebecc
References: bsc#1176684, CVE-2020-25625

While servicing OHCI transfer descriptors (TD), ohci_service_iso_td
retires a TD if it has passed its time frame. It does not check whether
the TD was already processed once and holds an error code in TD_CC.
This may happen if the TD list has a loop. Add a check to avoid an
infinite loop condition.

Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Message-id: 20200915182259.68522-3-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:27:20 -06:00
Prasad J Pandit
d5a2a0b50a hw: usb: hcd-ohci: check len and frame_number variables
Git-commit: 1328fe0c32
References: bsc#1176682, CVE-2020-25624

While servicing the OHCI transfer descriptors (TD), the OHCI host
controller derives variables 'start_addr', 'end_addr', 'len',
etc. from values supplied by the host controller driver.
The host controller driver may supply values such that using the
above variables leads to out-of-bounds access issues.
Add checks to avoid them.

AddressSanitizer: stack-buffer-overflow on address 0x7ffd53af76a0
  READ of size 2 at 0x7ffd53af76a0 thread T0
  #0 ohci_service_iso_td ../hw/usb/hcd-ohci.c:734
  #1 ohci_service_ed_list ../hw/usb/hcd-ohci.c:1180
  #2 ohci_process_lists ../hw/usb/hcd-ohci.c:1214
  #3 ohci_frame_boundary ../hw/usb/hcd-ohci.c:1257
  #4 timerlist_run_timers ../util/qemu-timer.c:572
  #5 qemu_clock_run_timers ../util/qemu-timer.c:586
  #6 qemu_clock_run_all_timers ../util/qemu-timer.c:672
  #7 main_loop_wait ../util/main-loop.c:527
  #8 qemu_main_loop ../softmmu/vl.c:1676
  #9 main ../softmmu/main.c:50

Reported-by: Gaoning Pan <pgn@zju.edu.cn>
Reported-by: Yongkang Jia <j_kangel@163.com>
Reported-by: Yi Ren <yunye.ry@alibaba-inc.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20200915182259.68522-2-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:27:20 -06:00
Li Qiang
b406e8afc7 hw: ehci: check return value of 'usb_packet_map'
Git-commit: 2fdb42d840
References: bsc#1178934, CVE-2020-25723

If 'usb_packet_map' fails, we should stop processing the usb
request.

Signed-off-by: Li Qiang <liq3ea@163.com>
Message-Id: <20200812161727.29412-1-liq3ea@163.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:27:20 -06:00
Li Qiang
1d8dfccdf9 hw: xhci: check return value of 'usb_packet_map'
Git-commit: 21bc31524e
References: bsc#1176673, CVE-2020-25084

Currently we don't check the return value of 'usb_packet_map';
this will cause a UAF issue. This is LP#1891341.
The following is the reproducer provided in:
-->https://bugs.launchpad.net/qemu/+bug/1891341

cat << EOF | ./i386-softmmu/qemu-system-i386 -device nec-usb-xhci \
-trace usb\* -device usb-audio -device usb-storage,drive=mydrive \
-drive id=mydrive,file=null-co://,size=2M,format=raw,if=none \
-nodefaults -nographic -qtest stdio
outl 0xcf8 0x80001016
outl 0xcfc 0x3c009f0d
outl 0xcf8 0x80001004
outl 0xcfc 0xc77695e
writel 0x9f0d000000000040 0xffff3655
writeq 0x9f0d000000002000 0xff2f9e0000000000
write 0x1d 0x1 0x27
write 0x2d 0x1 0x2e
write 0x17232 0x1 0x03
write 0x17254 0x1 0x06
write 0x17278 0x1 0x34
write 0x3d 0x1 0x27
write 0x40 0x1 0x2e
write 0x41 0x1 0x72
write 0x42 0x1 0x01
write 0x4d 0x1 0x2e
write 0x4f 0x1 0x01
writeq 0x9f0d000000002000 0x5c051a0100000000
write 0x34001d 0x1 0x13
write 0x340026 0x1 0x30
write 0x340028 0x1 0x08
write 0x34002c 0x1 0xfe
write 0x34002d 0x1 0x08
write 0x340037 0x1 0x5e
write 0x34003a 0x1 0x05
write 0x34003d 0x1 0x05
write 0x34004d 0x1 0x13
writeq 0x9f0d000000002000 0xff00010100400009
EOF

This patch fixes this.

Buglink: https://bugs.launchpad.net/qemu/+bug/1891341
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Li Qiang <liq3ea@163.com>
Message-id: 20200812153139.15146-1-liq3ea@163.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-30 19:27:20 -06:00
Prasad J Pandit
7e496d88ec megasas: use unsigned type for reply_queue_head and check index
Git-commit: f50ab86a26
References: bsc#1172383, CVE-2020-13362

A guest user may set 'reply_queue_head' field of MegasasState to
a negative value. Later in 'megasas_lookup_frame' it is used to
index into s->frames[] array. Use unsigned type to avoid OOB
access issue.

Also check that 'index' value stays within s->frames[] bounds
through the while() loop in 'megasas_lookup_frame' to avoid OOB
access.

Reported-by: Ren Ding <rding@gatech.edu>
Reported-by: Hanqing Zhao <hanqing@gatech.edu>
Reported-by: Alexander Bulekov <alxndr@bu.edu>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Acked-by: Alexander Bulekov <alxndr@bu.edu>
Message-Id: <20200513192540.1583887-2-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:31:41 -06:00
Bruce Rogers
272680ca58 sm501: add qemu/log.h include
References: bsc#1172385, CVE-2020-12829

This is in place of commit 4a1f253adb which added the log.h
reference, but did other changes we're not interested in.

Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:31:41 -06:00
BALATON Zoltan
20a95fdd64 sm501: Replace hand written implementation with pixman where possible
Git-commit: b15a22bbcb
References: bsc#1172385, CVE-2020-12829

Besides being faster, this should also prevent malicious guests from
abusing the 2D engine to overwrite data or cause a crash.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-id: 58666389b6cae256e4e972a32c05cf8aa51bffc0.1590089984.git.balaton@eik.bme.hu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
(cherry picked from commit 6bed160dd4e4f68a85e032b08d40d6bd5449274f)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:31:41 -06:00
BALATON Zoltan
984aae14e6 sm501: Clean up local variables in sm501_2d_operation
Git-commit: 3d0b096298
References: bsc#1172385, CVE-2020-12829

Make variables local to the block they are used in to make it clearer
which operation they are needed for.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: ae59f8138afe7f6a5a4a82539d0f61496a906b06.1590089984.git.balaton@eik.bme.hu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
(cherry picked from commit e8a152c862db6fdd316345d9cd932e4602e9b29e)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:31:41 -06:00
BALATON Zoltan
f315bde2fc sm501: Use BIT(x) macro to shorten constant
Git-commit: 2824809b7f
References: bsc#1172385, CVE-2020-12829

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 124bf5de8d7cf503b32b377d0445029a76bfbd49.1590089984.git.balaton@eik.bme.hu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
(cherry picked from commit 6f7231dc2939aca80adc5080464696e7d5f0ca45)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:31:41 -06:00
BALATON Zoltan
9792df9cfb sm501: Shorten long variable names in sm501_2d_operation
Git-commit: 6f8183b5dc
References: bsc#1172385, CVE-2020-12829

This increases readability and cleans up some confusing naming.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-id: b9b67b94c46e945252a73c77dfd117132c63c4fb.1590089984.git.balaton@eik.bme.hu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:31:41 -06:00
BALATON Zoltan
ba8402f5c6 sm501: Convert printf + abort to qemu_log_mask
Git-commit: e29da77e5f
References: bsc#1172385, CVE-2020-12829

Some places already use qemu_log_mask() to log unimplemented features
or errors but some others have printf() then abort(). Convert these to
qemu_log_mask() and avoid aborting to prevent guests from easily causing
a denial of service.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-id: 305af87f59d81e92f2aaff09eb8a3603b8baa322.1590089984.git.balaton@eik.bme.hu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:31:41 -06:00
Marcus Comstedt
07453fba3f sm501: Adjust endianness of pixel value in rectangle fill
Git-commit: f3a60058c9
References: bsc#1172385, CVE-2020-12829

The value from twoD_foreground (which is in host endian format) must
be converted to the endianness of the framebuffer (currently always
little endian) before it can be used to perform the fill operation.
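
A hedged one-line sketch of the conversion, assuming QEMU's bswap helpers
(a real fill also depends on the configured pixel format):

    #include "qemu/osdep.h"
    #include "qemu/bswap.h"

    /* The fill colour is kept in host byte order; the framebuffer is
     * little endian, so swap on big-endian hosts before replicating it. */
    static uint32_t fill_pixel_le(uint32_t host_colour)
    {
        return cpu_to_le32(host_colour);   /* no-op on little-endian hosts */
    }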

Signed-off-by: Marcus Comstedt <marcus@mc.pp.se>
Reviewed-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:31:41 -06:00
BALATON Zoltan
0595f29cbb sm501: Set updated region dirty after 2D operation
Git-commit: eb76243c9d
References: bsc#1172385, CVE-2020-12829

Set the changed memory region dirty after performing a 2D operation to
ensure that the screen is updated properly.
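
An illustrative sketch, assuming QEMU's memory API (the region and offset
names are stand-ins, not code from this patch):

    #include "qemu/osdep.h"
    #include "exec/memory.h"

    /* Mark the bytes touched by the blit as dirty so the next display
     * refresh picks them up. */
    static void mark_2d_dirty(MemoryRegion *local_mem, hwaddr dst_offset,
                              hwaddr bytes_written)
    {
        memory_region_set_dirty(local_mem, dst_offset, bytes_written);
    }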

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:31:41 -06:00
BALATON Zoltan
2485829699 sm501: Fix support for non-zero frame buffer start address
Git-commit: 33159dd7ce
References: bsc#1172385, CVE-2020-12829

Display updates and hardware cursor drawing did not work when the frame
buffer address was non-zero. Fix this by taking the frame buffer
address into account in these cases. This fixes screen dragging on
AmigaOS. Based on a patch by Sebastian Bauer.

Signed-off-by: Sebastian Bauer <mail@sebastianbauer.info>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:31:41 -06:00
Sebastian Bauer
3c7f4b1e83 sm501: Log unimplemented raster operation modes
Git-commit: 06cb926aaa
References: bsc#1172385, CVE-2020-12829

The sm501 currently implements only a very limited set of raster operation
modes. After this change, unknown raster operation modes are logged so
these can be easily spotted.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:31:41 -06:00
Sebastian Bauer
756840d6f3 sm501: Implement negated destination raster operation mode
Git-commit: debc7e7dad
References: bsc#1172385, CVE-2020-12829

Add support for the negated destination operation mode. This is used e.g.
by AmigaOS for the INVERSEVID drawing mode. With this change, the cursor
in the shell and non-immediate window adjustment are working now.
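
A hedged per-pixel sketch of the mode (stand-in code, not the device
implementation):

    #include <stdint.h>

    /* "Negated destination": the result is the bitwise inverse of what is
     * already in the framebuffer; the source operand is ignored. */
    static uint32_t rop_not_dst(uint32_t dst)
    {
        return ~dst;
    }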

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:31:41 -06:00
Sebastian Bauer
be3d6f79f3 sm501: Use values from the pitch register for 2D operations
Git-commit: 54b2a4339c
References: bsc#1172385, CVE-2020-12829

Before, crt_h_total was used for src_width and dst_width. This is a
property of the current display setting and not relevant to the 2D
operation, which can also be done off-screen. The pitch register's
purpose is to describe the line pitch relevant to the 2D operation.

Signed-off-by: Sebastian Bauer <mail@sebastianbauer.info>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:31:41 -06:00
BALATON Zoltan
0a10c33429 sm501: Add some more unimplemented registers
Git-commit: 5690d9ecef
References: bsc#1172385, CVE-2020-12829

These are not really implemented (they just return zero or default
values), but add them so guests accessing them can run.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:31:41 -06:00
BALATON Zoltan
e0e40a7287 sm501: Add support for panel layer
Git-commit: 1ae5e6eb42
References: bsc#1172385, CVE-2020-12829

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 2029a276362c0c3a14c78acb56baa9466848dd51.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:30:37 -06:00
BALATON Zoltan
5ce169e8d0 sm501: Fix hardware cursor
Git-commit: 6a2a5aae02
References: bsc#1172385, CVE-2020-12829

Rework HWC handling to simplify it and fix the cursor not updating on
screen as needed. Previously the cursor was not updated because checking
for changes in a line overrode the update flag set for the cursor, but
fixing this is not enough: the cursor should also be updated if its
shape or location changes. Introduce a hwc_invalidate() function to
handle that, similar to other display controller models.

Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 6970a5e9868b7246656c1d02038dc5d5fa369507.1492787889.git.balaton@eik.bme.hu
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-29 14:29:27 -06:00
Thomas Huth
91323648f1 hw/core/loader: Fix possible crash in rom_copy()
Git-commit: e423455c4f
References: bsc#1172478, CVE-2020-13765

Both, "rom->addr" and "addr" are derived from the binary image
that can be loaded with the "-kernel" paramer. The code in
rom_copy() then calculates:

    d = dest + (rom->addr - addr);

and uses "d" as destination in a memcpy() some lines later. Now with
bad kernel images, it is possible that rom->addr is smaller than addr,
thus "rom->addr - addr" gets negative and the memcpy() then tries to
copy contents from the image to a bad memory location. This could
maybe be used to inject code from a kernel image into the QEMU binary,
so we better fix it with an additional sanity check here.
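
A minimal sketch of such a sanity check, reusing the names from the
snippet above (illustrative stand-in code, not the actual patch):

    #include <stddef.h>
    #include <stdint.h>

    /* Only compute the destination if the offset cannot go negative. */
    static uint8_t *rom_copy_dest(uint8_t *dest, uintptr_t rom_addr,
                                  uintptr_t addr)
    {
        if (rom_addr < addr) {
            return NULL;                    /* would point before 'dest' */
        }
        return dest + (rom_addr - addr);
    }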

Cc: qemu-stable@nongnu.org
Reported-by: Guangming Liu
Buglink: https://bugs.launchpad.net/qemu/+bug/1844635
Message-Id: <20190925130331.27825-1-thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Prasad J Pandit
88d76734fc es1370: check total frame count against current frame
Git-commit: 369ff955a8
References: bsc#1172384, CVE-2020-13361

A guest user may set the channel frame count via es1370_write()
such that, in es1370_transfer_audio(), the total frame count
'size' is less than the number of frames already processed,
'cnt'.

    int cnt = d->frame_cnt >> 16;
    int size = d->frame_cnt & 0xffff;

If (size < cnt), this results in incorrect calculations leading
to OOB access issues. Add a check to avoid it.
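
A hedged sketch of the kind of consistency check described here
(stand-in code, not the actual device patch):

    #include <stdbool.h>
    #include <stdint.h>

    /* Reject guest-programmed values where the current frame index (high
     * 16 bits) exceeds the total frame count (low 16 bits). */
    static bool frame_cnt_is_sane(uint32_t frame_cnt)
    {
        int cnt  = frame_cnt >> 16;       /* frames already processed */
        int size = frame_cnt & 0xffff;    /* total frame count */
        return size >= cnt;
    }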

Reported-by: Ren Ding <rding@gatech.edu>
Reported-by: Hanqing Zhao <hanqing@gatech.edu>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20200514200608.1744203-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Paolo Bonzini
d70565786d scsi: lsi: exit infinite loop while executing script (CVE-2019-12068)
When executing script in lsi_execute_script(), the LSI scsi adapter
emulator advances 's->dsp' index to read next opcode. This can lead
to an infinite loop if the next opcode is empty. Move the existing
loop exit after 10k iterations so that it covers no-op opcodes as
well.
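
A self-contained sketch of the loop-bound idea (made-up names, nwords
assumed non-zero; not the LSI emulator itself):

    enum { MAX_SCRIPT_INSNS = 10000 };

    static int run_script(const unsigned int *script, unsigned int nwords)
    {
        unsigned int dsp = 0;
        int insn_processed = 0;

        for (;;) {
            unsigned int insn;

            if (++insn_processed > MAX_SCRIPT_INSNS) {
                return -1;                 /* script looks stuck: give up */
            }
            insn = script[dsp % nwords];   /* fetch the next "opcode" */
            dsp++;
            if (insn == 0) {
                continue;                  /* no-op: previously spun forever */
            }
            /* ... decode and execute 'insn', updating dsp as needed ... */
            return 0;
        }
    }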

Reported-by: Bugs SysSec <bugs-syssec@rub.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit de594e4765)
[BR: BSC#1146873 CVE-2019-12068 - minor trace related tweak applied]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Sven Schnelle
a4072b59b9 lsi: use enum type for s->waiting
This makes the code easier to read - no functional change.

Signed-off-by: Sven Schnelle <svens@stackframe.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20190305195519.24303-3-svens@stackframe.org>
(cherry picked from commit f08ec2b82a)
[BR: BSC#1146873 CVE-2019-12068 - small tweak made related to trace code]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Felipe Franciosi
1ce9aef928 iscsi: Cap block count from GET LBA STATUS (CVE-2020-1711)
Git-commit: 693fd2acdf
References: bsc#1166240

When querying an iSCSI server for the provisioning status of blocks (via
GET LBA STATUS), Qemu only validates that the response descriptor zero's
LBA matches the one requested. Given the SCSI spec allows servers to
respond with the status of blocks beyond the end of the LUN, Qemu may
have its heap corrupted by clearing/setting too many bits at the end of
its allocmap for the LUN.

A malicious guest in control of the iSCSI server could carefully program
Qemu's heap (by selectively setting the bitmap) and then smash it.

This limits the number of bits that iscsi_co_block_status() will try to
update in the allocmap so it can't overflow the bitmap.
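
A hedged sketch of the clamping idea with stand-in names (not the actual
iscsi driver code):

    #include <stdint.h>

    /* Never let a descriptor returned by the target describe blocks past
     * the end of the LUN that the allocmap was sized for. */
    static uint64_t clamp_status_blocks(uint64_t lba, uint64_t nb_blocks,
                                        uint64_t lun_blocks)
    {
        if (lba >= lun_blocks) {
            return 0;                          /* entirely out of range */
        }
        if (nb_blocks > lun_blocks - lba) {
            nb_blocks = lun_blocks - lba;      /* cap at the end of the LUN */
        }
        return nb_blocks;
    }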

Fixes: CVE-2020-1711
Cc: qemu-stable@nongnu.org
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Peter Turschmid <peter.turschm@nutanix.com>
Signed-off-by: Raphael Norwitz <raphael.norwitz@nutanix.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
[BR: Converted patch from byte based to sector based]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Marc-André Lureau
d857bcd0eb Fix use-after-free in ip_reass() (CVE-2020-1983)
Git-commit: 2faae0f778f818fadc873308f983289df697eb93
References: bsc#1170940, CVE-2020-1983

The q pointer is updated when the mbuf data is moved from m_dat to
m_ext.

m_ext buffer may also be realloc()'ed and moved during m_cat():
q should also be updated in this case.

Reported-by: Aviv Sasson <asasson@paloaltonetworks.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit 9bd6c5913271eabcb7768a58197ed3301fe19f2d)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Marc-André Lureau
9502e32fc9 tcp_emu: fix unsafe snprintf() usages
Git-commit: 68ccb8021a838066f0951d4b2817eb6b6f10a843
References: bsc#1163018, CVE-2020-8608

Various calls to snprintf() assume that snprintf() returns "only" the
number of bytes written (excluding terminating NUL).

https://pubs.opengroup.org/onlinepubs/9699919799/functions/snprintf.html#tag_16_159_04

"Upon successful completion, the snprintf() function shall return the
number of bytes that would be written to s had n been sufficiently
large excluding the terminating null byte."

Before patch ce131029, if there isn't enough room in "m_data" for the
"DCC ..." message, we overflow "m_data".

After the patch, if there isn't enough room for the same, we don't
overflow "m_data", but we set "m_len" out-of-bounds. The next time an
access is bounded by "m_len", we'll have a buffer overflow then.

Use slirp_fmt*() to fix potential OOB memory access.

Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20200127092414.169796-7-marcandre.lureau@redhat.com>
[Since porting from a now quite different libslirp, quite a bit changed]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Prasad J Pandit
a82e97d8b1 slirp: use correct size while emulating commands
Git-commit: 82ebe9c370a0e2970fb5695aa19aa5214a6a1c80
References: bsc#1161066, CVE-2020-7039, bsc#1163018, CVE-2020-8608

While emulating services in tcp_emu(), the code uses the 'mbuf' size
'm->m_size' to write commands via snprintf(3). Use the M_FREEROOM(m)
size to avoid possible OOB access.

Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20200109094228.79764-3-ppandit@redhat.com>
[Since porting from a now quite different libslirp, quite a bit changed]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Prasad J Pandit
127fa97a85 slirp: use correct size while emulating IRC commands
Git-commit: ce131029d6d4a405cb7d3ac6716d03e58fb4a5d9
References: bsc#1161066, CVE-2020-7039, bsc#1163018, CVE-2020-8608

While emulating IRC DCC commands, tcp_emu() uses the 'mbuf' size
'm->m_size' to write DCC commands via snprintf(3). This may
lead to an OOB write, because 'bptr' points somewhere in
the middle of the 'mbuf' buffer, not at the start. Use the M_FREEROOM(m)
size to avoid OOB access.
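
A hedged sketch of the general pattern (stand-in code, not the slirp
macro itself): the room passed to snprintf() is measured from the write
pointer, not from the start of the buffer.

    #include <stdio.h>

    /* 'wptr' points somewhere inside buf[0..buf_size); only the tail of
     * the buffer is available for the formatted command. */
    static int append_dcc_cmd(char *buf, size_t buf_size, char *wptr,
                              const char *arg)
    {
        size_t room = buf_size - (size_t)(wptr - buf);
        return snprintf(wptr, room, "DCC CHAT chat %s\r\n", arg);
    }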

Reported-by: Vishnu Dev TJ <vishnudevtj@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20200109094228.79764-2-ppandit@redhat.com>
[Since porting from a now quite different libslirp, quite a bit changed]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Samuel Thibault
1fbab5aa56 tcp_emu: Fix oob access
Git-commit: 2655fffed7a9e765bcb4701dd876e9dab975f289
References: bsc#1161066, CVE-2020-7039

The main loop only checks for one available byte, while we sometimes
need two bytes.

Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Marc-André Lureau
0a284f55e5 util: add slirp_fmt() helpers
Git-commit: 30648c03b27fb8d9611b723184216cd3174b6775
References: bsc#1163018, CVE-2020-8608

Various calls to snprintf() in libslirp assume that snprintf() returns
"only" the number of bytes written (excluding terminating NUL).

https://pubs.opengroup.org/onlinepubs/9699919799/functions/snprintf.html#tag_16_159_04

"Upon successful completion, the snprintf() function shall return the
number of bytes that would be written to s had n been sufficiently
large excluding the terminating null byte."

Introduce slirp_fmt() that handles several pathological cases the
way libslirp usually expects:

- treat error as fatal (instead of silently returning -1)

- fmt0() will always \0 end

- return the number of bytes actually written (instead of what would
have been written, which would usually result in OOB later), including
the ending \0 for fmt0()

- warn if truncation happened (instead of ignoring)

Other less common cases can still be handled with strcpy/snprintf() etc.
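
A hedged sketch of a wrapper with these properties (stand-in name and
semantics, not libslirp's implementation):

    #include <stdarg.h>
    #include <stdio.h>

    /* Always NUL-terminate and return the number of bytes actually
     * written, never the "would have been written" count. */
    static int fmt_clamped(char *str, size_t size, const char *format, ...)
    {
        va_list args;
        int rv;

        if (size == 0) {
            return 0;
        }
        va_start(args, format);
        rv = vsnprintf(str, size, format, args);
        va_end(args);

        if (rv < 0) {
            str[0] = '\0';            /* treat encoding errors as "wrote nothing" */
            rv = 0;
        } else if ((size_t)rv >= size) {
            rv = (int)(size - 1);     /* truncated: report what actually fits */
        }
        return rv;
    }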

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-Id: <20200127092414.169796-2-marcandre.lureau@redhat.com>
[Since porting from a now quite different libslirp, quite a bit changed]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Marc-André Lureau
e3a8de2d60 slirp: don't manipulate so_rcv in tcp_emu()
For some reason, EMU_IDENT is not like other "emulated" protocols and
tries to reconstitute the original buffer, if it came in multiple
packets. Unfortunately, it does so wrongly, as it doesn't respect the
sbuf circular buffer appending rules, nor does it maintain some of the
invariants (rptr is incremented without bounds, etc): this leads to
further memory corruption revealed by ASAN or various malloc
errors. Furthermore, the so_rcv buffer is regularly flushed, so there
is no guarantee that buffer reconstruction will do what is expected.

Instead, do what the function comment says: "XXX Assumes the whole
command came in one packet", and don't touch so_rcv.

Related to: https://bugzilla.redhat.com/show_bug.cgi?id=1664205

Cc: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2021-03-17 09:45:53 -06:00
Marc-André Lureau
10615bcff3 slirp: ensure there is enough space in mbuf to null-terminate
Prevents buffer overflows.
Related to: https://bugzilla.redhat.com/show_bug.cgi?id=1664205

Cc: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[BFR: In support of BSC#1123156]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Samuel Thibault
f5aa696bcb slirp: fix big/little endian conversion in ident protocol
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>

---
Based-on: <1551476756-25749-1-git-send-email-will@wbowling.info>

(cherry picked from commit 1fd71067da)
[BFR: In support of BSC#1123156]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Michael Roth
91743bf223 slirp: ip_reass: Fix use after free
Using ip_deq after m_free might read pointers from a reused allocation.

This would be difficult to exploit, but it is still related to
CVE-2019-14378, which generates fragmented IP packets that would
trigger this issue and at least produce a DoS.
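
A hedged sketch of the generic pattern behind the fix (stand-in list
type, not slirp's queue machinery): read the successor before freeing
the current element.

    #include <stdlib.h>

    struct frag { struct frag *next; };

    static void free_frag_chain(struct frag *f)
    {
        while (f) {
            struct frag *next = f->next;   /* fetch next BEFORE free() */
            free(f);
            f = next;
        }
    }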

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(from libslirp.git commit c59279437eda91841b9d26079c70b8a540d41204)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[BR: BSC#1149811 CVE-2019-15890]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Samuel Thibault
1e7a8db05d Fix heap overflow in ip_reass on big packet input
When the first fragment does not fit in the preallocated buffer, q will
already be pointing to the ext buffer, so we mustn't try to update it.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit 126c04acbabd7ad32c2b018fe10dfac2a3bc1210)
[LY: CVE-2019-14378 BSC#1143794]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:53 -06:00
Liang Yan
0a276ee7e6 qemu-bridge-helper: restrict interface name
The interface names in qemu-bridge-helper are defined to be
of size IFNAMSIZ (= 16), including the terminating null ('\0') byte.
The same is applied to interface names read from the 'bridge.conf'
file to form ACL rules. If the user-supplied '--br=bridge' name
is not restricted to the same length, it could lead to an ACL bypass.
Restrict the bridge name to IFNAMSIZ, including the null byte.
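
A hedged sketch of the length check (not the helper's code; assumes the
Linux IFNAMSIZ definition from <net/if.h>):

    #include <net/if.h>        /* IFNAMSIZ */
    #include <stdbool.h>
    #include <string.h>

    /* Reject any user-supplied bridge name that would not fit, NUL byte
     * included, in a kernel interface-name field. */
    static bool bridge_name_ok(const char *name)
    {
        return strlen(name) < IFNAMSIZ;
    }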

Reported-by: Riccardo Schirone <rschiron@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[LY: BSC#1140402 CVE-2019-13164]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:53 -06:00
Prasad J Pandit
d7daf6a2d4 qxl: check release info object
When releasing SPICE resources in the release_resource() routine,
if the release info object 'ext.info' is null, it leads to a null
pointer dereference. Add a check to avoid it.

Reported-by: Bugs SysSec <bugs-syssec@rub.de>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20190425063534.32747-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit d52680fc93)
[LY: BSC#1135902 CVE-2019-12155]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:53 -06:00
Paolo Bonzini
412eae5ac2 target/i386: define md-clear bit
md-clear is a new CPUID bit which is set when microcode provides the
mechanism to invoke a flush of various exploitable CPU buffers by invoking
the VERW instruction.  Add the new feature.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[BR: BSC#1111331 CVE-2018-12126 CVE-2018-12127 CVE-2018-12130
CVE-2019-11091]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Peter Maydell
303af45a7b device_tree.c: Don't use load_image()
The load_image() function is deprecated, as it does not let the
caller specify how large the buffer to read the file into is.
Instead use load_image_size().
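
An illustrative usage sketch, assuming QEMU's hw/loader.h API (the
wrapper name is made up):

    #include "qemu/osdep.h"
    #include "hw/loader.h"

    /* Pass the destination size so the loader cannot write past 'buf'. */
    static bool load_blob_bounded(const char *filename, void *buf,
                                  size_t bufsize)
    {
        return load_image_size(filename, buf, bufsize) >= 0;
    }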

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20181130151712.2312-9-peter.maydell@linaro.org
(cherry picked from commit da885fe1ee)
[LM: BSC#1130675 CVE-2018-20815]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-17 09:45:53 -06:00
William Bowling
5e3bfbde20 slirp: check sscanf result when emulating ident
When emulating ident in tcp_emu, if the strchr checks passed but the
sscanf check failed, two uninitialized variables would be copied and
sent in the reply, so move this code inside the if(sscanf()) clause.

Signed-off-by: William Bowling <will@wbowling.info>
Cc: qemu-stable@nongnu.org
Cc: secalert@redhat.com
Message-Id: <1551476756-25749-1-git-send-email-will@wbowling.info>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Philippe Mathieu-Daudé <philmd@redhat.com>
(cherry picked from commit d3222975c7)
[LM: BSC#1129622 CVE-2019-9824 To pass our checkpatch check, I changed
the patch to use spaces, not tabs, as in the initially proposed]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-17 09:45:53 -06:00
Richard W.M. Jones
a357de9672 virtio-scsi: Add virtqueue_size parameter allowing virtqueue size to be set.
Since Linux switched to blk-mq as the default in Linux commit
5c279bd9e406 ("scsi: default to scsi-mq"), virtio-scsi LUNs consume
about 10x as much guest kernel memory.

This commit allows you to choose the virtqueue size for each
virtio-scsi-pci controller like this:

  -device virtio-scsi-pci,id=scsi,virtqueue_size=16

The default is still 128 as before.  Using smaller virtqueue_size
allows many more disks to be added to small memory virtual machines.
For a 1 vCPU, 500 MB, no swap VM I observed:

  With scsi-mq enabled (upstream kernel):              175 disks
    -"- ditto -"-   virtqueue_size=64:                 318 disks
    -"- ditto -"-   virtqueue_size=16:                 775 disks
  With scsi-mq disabled (kernel before 5c279bd9e406): 1755 disks

Note that to have any effect, this requires a kernel patch:

  https://lkml.org/lkml/2017/8/10/689

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Message-Id: <20170810165255.20865-1-rjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 5c0919d020)
[BR: FATE#327255]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Gerd Hoffmann
ebcc1754d4 vga: catch depth 0
depth == 0 is used to indicate 256 color modes.  Our region calculation
goes wrong in that case.  So detect that and just take the safe code
path we already have for the wraparound case.

While being at it also catch depth == 15 (where our region size
calculation goes wrong too).  And make the comment more verbose,
explaining what is going on here.

Without this, a Windows guest install might trigger an assert due to
trying to check the dirty bitmap outside the snapshot region.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1575541
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180514103117.21059-1-kraxel@redhat.com
(cherry picked from commit a89fe6c329)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Gerd Hoffmann
3110d05e80 vga: fix region calculation
Typically the scanline length and the line offset are identical.  But
in case they are not, our calculation for region_end is incorrect.  Using
line_offset is fine for all scanlines, except the last one where we have
to use the actual scanline length.
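
A hedged arithmetic sketch with stand-in names (not the vga code
itself): all scanlines but the last advance by line_offset; the last one
only contributes the visible scanline length (height assumed >= 1).

    #include <stdint.h>

    static uint64_t region_end(uint64_t region_start, uint64_t line_offset,
                               uint64_t scanline_len, unsigned height)
    {
        return region_start + line_offset * (height - 1) + scanline_len;
    }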

Fixes: CVE-2018-7550
Reported-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Tested-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Message-id: 20180309143704.13420-1-kraxel@redhat.com
(cherry picked from commit 7cdc61becd)
[BR: BSC#1084604 CVE-2018-7858 (it appears the above CVE ref. is wrong)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Gerd Hoffmann
7fe982f74c vga: fix region checks in wraparound case
Cc: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 20171030102830.4469-1-kraxel@redhat.com
(cherry picked from commit 115788d7a7)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Gerd Hoffmann
a029598ffc vga: add ram_addr_t cast
Reported by Coverity.

Fixes: CID 1381409
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20171010141323.14049-4-kraxel@redhat.com
(cherry picked from commit b0898b42ef)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Gerd Hoffmann
b4635392f3 vga: handle cirrus vbe mode wraparounds.
Commit "3d90c62548 vga: stop passing pointers to vga_draw_line*
functions" is incomplete.  It doesn't handle the case that the vga
rendering code tries to create a shared surface, i.e. a pixman image
backed by vga video memory.  That can not work in case the guest display
wraps from end of video memory to the start.  So force shadowing in that
case.  Also adjust the snapshot region calculation.

This can trigger with cirrus only; when programming vbe modes using the
bochs api (stdvga, also qxl and virtio-vga in vga compat mode),
wraparounds can't happen.

Fixes: CVE-2017-13672
Fixes: 3d90c62548
Cc: P J P <ppandit@redhat.com>
Reported-by: David Buchanan <d@vidbuchanan.co.uk>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20171010141323.14049-3-kraxel@redhat.com
(cherry picked from commit 28f77de26a)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Gerd Hoffmann
1e7d6c8033 vga: drop line_offset variable
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 362f811793)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Gerd Hoffmann
2ac84d9594 vga: fix display update region calculation (split screen)
vga display update mis-calculated the region for the dirty bitmap
snapshot in case split screen mode is used.  This can trigger an
assert in cpu_physical_memory_snapshot_get_dirty().

Impact:  DoS for privileged guest users.

Fixes: CVE-2017-13673
Fixes: fec5e8c92b
Cc: P J P <ppandit@redhat.com>
Reported-by: David Buchanan <d@vidbuchanan.co.uk>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170828123307.15392-1-kraxel@redhat.com
(cherry picked from commit e65294157d)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Marc-André Lureau
66d854d96e vga: use DIV_ROUND_UP
I used the clang-tidy qemu-round check to generate the fix:
https://github.com/elmarco/clang-tools-extra

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
(cherry picked from commit 2c23ce22c6)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Gerd Hoffmann
3ca3b35eb5 vga: fix display update region calculation
vga display update mis-calculated the region for the dirty bitmap
snapshot in case the scanlines are padded.  This can trigger an
assert in cpu_physical_memory_snapshot_get_dirty().

Fixes: fec5e8c92b
Reported-by: Kevin Wolf <kwolf@redhat.com>
Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170509104839.19415-1-kraxel@redhat.com
(cherry picked from commit bfc56535f7)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Gerd Hoffmann
4670327c5d vga: make display updates thread safe.
The vga code clears the dirty bits *after* reading the framebuffer
memory.  So if a guest framebuffer update hits the race window
between vga reading the framebuffer and vga clearing the dirty bits,
vga will miss that update.

Fix it by using the new memory_region_snapshot_and_clear_dirty() and
memory_region_snapshot_get_dirty() functions.  That way we clear the
dirty bitmap before reading the framebuffer.  Any guest display
updates happening in parallel will be properly tracked in the
dirty bitmap then and the next display refresh will pick them up.

Problem triggers with mttcg only.  Before mttcg was merged tcg
never ran in parallel to vga emulation.  Using kvm will hide the
problem too, due to qemu operating on a userspace copy of the
kernel's dirty bitmap.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-5-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit fec5e8c92b)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Gerd Hoffmann
ac4fc4fe48 vga: add vga_scanline_invalidated helper
Add a vga_scanline_invalidated helper to check whether a scanline was
invalidated.  Add a sanity check to fix OOB read access for display
heights larger than 2048.

Only cirrus uses this, for hardware cursor rendering, so having this
work properly for the first 2048 scanlines only shouldn't be a problem
as the cirrus can't handle large resolutions anyway.  Also changing the
invalidated_y_table size would break live migration.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-4-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit f3289f6f0f)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Gerd Hoffmann
ae22970d8b memory: add support getting and using a dirty bitmap copy.
This patch adds support for getting and using a local copy of the dirty
bitmap.

memory_region_snapshot_and_clear_dirty() will create a snapshot of the
dirty bitmap for the specified range, clear the dirty bitmap and return
the copy.  The returned bitmap can be a bit larger than requested, the
range is expanded so the code can copy unsigned longs from the bitmap
and avoid atomic bit update operations.

memory_region_snapshot_get_dirty() will return the dirty status of
pages, pretty much like memory_region_get_dirty(), but using the copy
returned by memory_region_snapshot_and_clear_dirty().
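
An illustrative usage sketch, assuming the memory API described above
(the function and variable names around it are stand-ins, not code from
this patch):

    #include "qemu/osdep.h"
    #include "exec/memory.h"

    /* Snapshot the VGA dirty log for a framebuffer once, then query it
     * per scanline without racing against concurrent guest writes. */
    static bool scanline_was_dirty(MemoryRegion *vram, hwaddr fb_offset,
                                   hwaddr fb_size, hwaddr line_offset,
                                   hwaddr pitch)
    {
        DirtyBitmapSnapshot *snap;
        bool dirty;

        snap = memory_region_snapshot_and_clear_dirty(vram, fb_offset,
                                                      fb_size,
                                                      DIRTY_MEMORY_VGA);
        dirty = memory_region_snapshot_get_dirty(vram, snap,
                                                 fb_offset + line_offset,
                                                 pitch);
        g_free(snap);
        return dirty;
    }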

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-3-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 8deaf12ca1)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Gerd Hoffmann
62872f3c60 bitmap: add bitmap_copy_and_clear_atomic
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170421091632.30900-2-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit d6eb141392)
[BR: Support for BSC#1084604 (and other useful vga fixes)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Greg Kurz
f5ddcc4018 9p: take write lock on fid path updates (CVE-2018-19364)
Recent commit 5b76ef50f6 fixed a race where v9fs_co_open2() could
possibly overwrite a fid path with v9fs_path_copy() while it is being
accessed by some other thread, i.e., a use-after-free that can be
detected by ASAN with a custom 9p client.

It turns out that the same can happen at several locations where
v9fs_path_copy() is used to set the fid path. The fix is again to
take the write lock.

Fixes CVE-2018-19364.

Cc: P J P <ppandit@redhat.com>
Reported-by: zhibin hu <noirfate@gmail.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 5b3c77aa58)
[BR: BSC#1116717 CVE-2018-19364]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Greg Kurz
d091b06df0 9p: write lock path in v9fs_co_open2()
The assumption that the fid cannot be used by any other operation is
wrong. At least, nothing prevents a misbehaving client from creating a
file with a given fid and passing this fid to some other operation
at the same time (i.e., without waiting for the response to the creation
request). The call to v9fs_path_copy() performed by the worker thread
after the file was created can race with any access to the fid path
performed by some other thread. This causes use-after-free issues that
can be detected by ASAN with a custom 9p client.

Unlike other operations that only read the fid path, v9fs_co_open2()
does modify it. It should hence take the write lock.

Cc: P J P <ppandit@redhat.com>
Reported-by: zhibin hu <noirfate@gmail.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 5b76ef50f6)
[BR: BSC#1116717 CVE-2018-19364]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Greg Kurz
af95059831 9p: fix QEMU crash when renaming files
When using the 9P2000.u version of the protocol, the following shell
command line in the guest can cause QEMU to crash:

    while true; do rm -rf aa; mkdir -p a/b & touch a/b/c & mv a aa; done

With 9P2000.u, file renaming is handled by the WSTAT command. The
v9fs_wstat() function calls v9fs_complete_rename(), which calls
v9fs_fix_path() for every fid whose path is affected by the change.
The involved calls to v9fs_path_copy() may race with any other access
to the fid path performed by some worker thread, causing a crash like
shown below:

Thread 12 "qemu-system-x86" received signal SIGSEGV, Segmentation fault.
0x0000555555a25da2 in local_open_nofollow (fs_ctx=0x555557d958b8, path=0x0,
 flags=65536, mode=0) at hw/9pfs/9p-local.c:59
59          while (*path && fd != -1) {
(gdb) bt
 path=0x0, flags=65536, mode=0) at hw/9pfs/9p-local.c:59
 path=0x0) at hw/9pfs/9p-local.c:92
 fs_path=0x555556b56858, stbuf=0x7fff84830ef0) at hw/9pfs/9p-local.c:185
 path=0x555556b56858, stbuf=0x7fff84830ef0) at hw/9pfs/cofile.c:53
 at hw/9pfs/9p.c:1083
 at util/coroutine-ucontext.c:116
(gdb)

The fix is to take the path write lock when calling v9fs_complete_rename(),
like in v9fs_rename().

Impact:  DoS triggered by unprivileged guest users.

Fixes: CVE-2018-19489
Cc: P J P <ppandit@redhat.com>
Reported-by: zhibin hu <noirfate@gmail.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 1d20398694)
[BR: BSC#1117275]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Gerd Hoffmann
1b264d42a4 usb-mtp: use O_NOFOLLOW and O_CLOEXEC.
Open files and directories with O_NOFOLLOW to avoid symlink attacks.
While being at it, also add O_CLOEXEC.
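
A hedged sketch of the open() pattern (not the usb-mtp code):

    #include <fcntl.h>

    /* Refuse to follow symlinks and keep the descriptor from leaking
     * into child processes. */
    static int open_no_symlink(const char *path)
    {
        return open(path, O_RDONLY | O_NOFOLLOW | O_CLOEXEC);
    }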

usb-mtp only handles regular files and directories and ignores
everything else, so users should not see a difference.

Because qemu ignores symlinks, carrying out a successful symlink attack
requires swapping an existing file or directory below rootdir for a
symlink and winning the race against the inotify notification to qemu.

Fixes: CVE-2018-16872
Cc: Prasad J Pandit <ppandit@redhat.com>
Cc: Bandan Das <bsd@redhat.com>
Reported-by: Michael Hanselmann <public@hansmi.ch>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Michael Hanselmann <public@hansmi.ch>
Message-id: 20181213122511.13853-1-kraxel@redhat.com
(cherry picked from commit bab9df35ce)
[BR: BSC#1119493]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Prasad J Pandit
269c99e859 slirp: check data length while emulating ident function
While emulating the identification protocol, tcp_emu() does not check
the available space in the 'sc_rcv->sb_data' buffer. This could lead to
a heap buffer overflow. Add a check to avoid it.

Reported-by: Kira <864786842@qq.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit a7104eda7d)
[BR: BSC#1123156 CVE-2019-6778,  modify patch to use spaces instead of tabs]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Jim Somerville
3ff2bdad2a kvmclock: use the updated system_timer_msr
Fixes e2b6c17 (kvmclock: update system_time_msr address forcibly)
which makes a call to get the latest value of the address
stored in system_timer_msr, but then uses the old address anyway.

Signed-off-by: Jim Somerville <Jim.Somerville@windriver.com>
Message-Id: <59b67db0bd15a46ab47c3aa657c81a4c11f168ea.1506702472.git.Jim.Somerville@windriver.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 346b1215b1)
[BR: BSC#1113231]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Denis Plotnikov
5d0c639cd5 kvmclock: update system_time_msr address forcibly
Do an update of system_time_msr address every time before reading
the value of tsc_timestamp from guest's kvmclock page.

There are no other code paths which ensure that qemu has an up-to-date
value of system_time_msr. So, force this update on the guest's
tsc_timestamp reading.

This bug causes effect on those nested setups which turn off TPR access
interception for L2 guests and that access being intercepted by L0 doesn't
show up in L1.
Linux's bootstrap initiates kvmclock before APIC initialization, causing
TPR access. That's why on L1 guests which have TPR interception turned on
for L2, the effect of the bug is not revealed.

This patch fixes this problem by making sure it knows the correct
system_time_msr address every time it is needed.

Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
Message-Id: <1496054944-25623-1-git-send-email-dplotnikov@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e2b6c1712e)
[BR: BSC#1113231]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Paolo Bonzini
12df9d3c76 lsi53c895a: check message length value is valid
While writing a message in 'lsi_do_msgin', message length value
in 'msg_len' could be invalid due to an invalid migration stream.
Add an assertion to avoid an out of bounds access, and reject
the incoming migration data if it contains an invalid message
length.

Discovered by Deja vu Security. Reported by Oracle.

Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20181026194314.18663-1-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e58ccf0396)
[BR: BSC#1114422 CVE-2018-18849]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:53 -06:00
Jason Wang
cf993afaee net: ignore packet size greater than INT_MAX
There should not be a reason for passing a packet size greater than
INT_MAX. It's usually a hint of a bug somewhere, so ignore packet sizes
greater than INT_MAX in qemu_deliver_packet_iov().

CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 1592a99470)
[LD: BSC#1111013 CVE-2018-17963]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-17 09:45:53 -06:00
Jason Wang
19ce0883bc pcnet: fix possible buffer overflow
In pcnet_receive(), we try to assign size_ to size, which converts from
size_t to int. This causes trouble when size_ is greater than INT_MAX:
size becomes negative and can then pass the check of size < MIN_BUF_SIZE,
which may lead to out-of-bounds access for both buf and buf1.

Fix this by converting the type of size to size_t.
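
A hedged sketch of the fixed pattern (stand-in code; the same pattern
applies to the rtl8139 and ne2000 fixes below, and the MIN_BUF_SIZE
value here is only illustrative):

    #include <stdbool.h>
    #include <stddef.h>

    enum { MIN_BUF_SIZE = 60 };

    /* Keeping the length as size_t means a huge value can never turn
     * into a negative int and slip past the minimum-size check. */
    static bool needs_padding(size_t size)
    {
        return size < MIN_BUF_SIZE;
    }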

CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit b1d80d12c5)
[LD: BSC#1111010 CVE-2018-17962]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-17 09:45:53 -06:00
Jason Wang
aa64cb16b4 rtl8139: fix possible out of bound access
In rtl8139_do_receive(), we try to assign size_ to size, which converts
from size_t to int. This causes trouble when size_ is greater than
INT_MAX: size becomes negative and can then pass the check of
size < MIN_BUF_SIZE, which may lead to out-of-bounds access for both
buf and buf1.

Fix this by converting the type of size to size_t.

CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 1a326646fe)
[LD: BSC#1111006 CVE-2018-17958]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-17 09:45:53 -06:00
Jason Wang
518b247d0f ne2000: fix possible out of bound access in ne2000_receive
In ne2000_receive(), we try to assign size_ to size, which converts
from size_t to int. This causes trouble when size_ is greater than
INT_MAX: size becomes negative and can then pass the check of
size < MIN_BUF_SIZE, which may lead to out-of-bounds access for both
buf and buf1.

Fix this by converting the type of size to size_t.

CC: qemu-stable@nongnu.org
Reported-by: Daniel Shapira <daniel@twistlock.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit fdc89e90fa)
[LD: BSC#1110910 CVE-2018-10839]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-17 09:45:53 -06:00
Larry Dewey
fd499d23c8 Adding uuid header file to files using UUID_FMT
The following files have received the header qemu/uuid.h because
they are calling UUID_FMT, which it defines:

- block/vdi.c
- hw/xenpv/xen_domainbuild.c
- hw/ppc/spapr.c
- qmp.c

[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-17 09:45:53 -06:00
Marc-André Lureau
698375bff5 seccomp: check TSYNC host capability
Remove the -sandbox option if the host is not capable of TSYNC, since the
sandbox will fail at setup time otherwise. This will help libvirt, for
example, to figure out if -sandbox will work.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Eduardo Otubo <otubo@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 5780760f5e)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-17 09:45:53 -06:00
Yi Min Zhao
4ef1eea613 sandbox: disable -sandbox if CONFIG_SECCOMP undefined
If CONFIG_SECCOMP is undefined, the option 'elevatedprivileges' remains
compiled. This would make libvirt set the corresponding capability and
then trigger failure during guest startup. This patch moves the code
regarding seccomp command line options to qemu-seccomp.c file and
wraps qemu_opts_foreach finding sandbox option with CONFIG_SECCOMP.
Because parse_sandbox() is moved into qemu-seccomp.c file, change
seccomp_start() to static function.

Signed-off-by: Yi Min Zhao <zyimin@linux.ibm.com>
Reviewed-by: Ján Tomko <jtomko@redhat.com>
Tested-by: Ján Tomko <jtomko@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 9d0fdecbad)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-17 09:45:53 -06:00
Marc-André Lureau
e3b87f202b seccomp: set the seccomp filter to all threads
When using "-seccomp on", the seccomp policy is only applied to the
main thread, the vcpu worker thread and other worker threads created
after seccomp policy is applied; the seccomp policy is not applied to
e.g. the RCU thread because it is created before the seccomp policy is
applied and SECCOMP_FILTER_FLAG_TSYNC isn't used.

This can be verified with
for task in /proc/`pidof qemu`/task/*; do cat $task/status | grep Secc ; done
Seccomp:	2
Seccomp:	0
Seccomp:	0
Seccomp:	2
Seccomp:	2
Seccomp:	2

Starting with libseccomp 2.2.0 and kernel >= 3.17, we can use
seccomp_attr_set(ctx, SCMP_FLTATR_CTL_TSYNC, 1) to update the policy
on all threads.

libseccomp requirement was bumped to 2.2.0 in previous patch.
libseccomp should fail to set the filter if it can't honour
SCMP_FLTATR_CTL_TSYNC (untested), and thus -sandbox will now fail on
kernel < 3.17.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 70dfabeaa7)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-17 09:45:53 -06:00
Marc-André Lureau
d82fe448b5 configure: require libseccomp 2.2.0
The following patch is going to require TSYNC, which is only available
since libseccomp 2.2.0.

libseccomp 2.2.0 was released February 12, 2015.

According to repology, libseccomp version in different distros:

  RHEL-7: 2.3.1
  Debian (Stretch): 2.3.1
  OpenSUSE Leap 15: 2.3.2
  Ubuntu (Xenial):  2.3.1

This will drop support for -sandbox on:

  Debian (Jessie): 2.1.1 (but 2.2.3 in backports)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit d0699bd37c)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-17 09:45:52 -06:00
Marc-André Lureau
bf6f9e0506 seccomp: prefer SCMP_ACT_KILL_PROCESS if available
The upcoming libseccomp release should have SCMP_ACT_KILL_PROCESS
action (https://github.com/seccomp/libseccomp/issues/96).

SCMP_ACT_KILL_PROCESS is preferable, as it immediately terminates the
offending process rather than running the SIGSYS handler.

Use SECCOMP_GET_ACTION_AVAIL to check availability of kernel support,
as libseccomp will fallback on SCMP_ACT_KILL otherwise, and we still
prefer SCMP_ACT_TRAP.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit bda08a5764)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-17 09:45:52 -06:00
Marc-André Lureau
a4590761b5 seccomp: allow sched_setscheduler() with SCHED_IDLE policy
Current and upcoming mesa releases rely on a shader disk cache. It uses
a thread job queue with low priority, set with
sched_setscheduler(SCHED_IDLE). However, that syscall is rejected by
the "resourcecontrol" seccomp qemu filter.

Since it should be safe to allow lowering thread priority, let's allow
setting a thread's scheduling policy to idle.

Related to:
https://bugzilla.redhat.com/show_bug.cgi?id=1594456

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 056de1e894)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-17 09:45:52 -06:00
Eduardo Otubo
3fa309f8c1 seccomp: add resourcecontrol argument to command line
This patch adds [,resourcecontrol=deny] to the `-sandbox on' option. It
blacklists all process affinity and scheduler priority system calls, so
the process cannot raise its own scheduling priority or change its CPU
affinity.

Signed-off-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 24f8cdc572)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-17 09:45:52 -06:00
Eduardo Otubo
1cec67ddf2 seccomp: add spawn argument to command line
This patch adds the [,spawn=deny] argument to the `-sandbox on' option.
It blacklists the fork and execve system calls, preventing QEMU from
spawning new threads or processes.

Signed-off-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 995a226f88)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-17 09:45:52 -06:00
Eduardo Otubo
852236161b seccomp: add elevateprivileges argument to command line
This patch introduces the new argument
[,elevateprivileges=allow|deny|children] to the `-sandbox on'. It allows
or denies the QEMU process the ability to elevate its privileges by
blacklisting all set*uid|gid system calls. The 'children' option will
let forks and execves run unprivileged.

Signed-off-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 73a1e64725)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-17 09:45:52 -06:00
Eduardo Otubo
8bb3f9bc58 seccomp: add obsolete argument to command line
This patch introduces the argument [,obsolete=allow] to the `-sandbox on'
option. It allows QEMU to run safely on old systems that still rely on
old system calls.

Signed-off-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 2b716fa6d6)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-17 09:45:52 -06:00
Eduardo Otubo
69ac434719 seccomp: changing from whitelist to blacklist
This patch changes the default behavior of the seccomp filter from
whitelist to blacklist. By default all system calls are now allowed, and
a small blacklist of definitely forbidden ones was created.

Signed-off-by: Eduardo Otubo <otubo@redhat.com>
(cherry picked from commit 1bd6152ae2)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-17 09:45:52 -06:00
Fam Zheng
bac8819931 util: Add UUID API
A number of different places across the code base use CONFIG_UUID. Some
of them are soft dependencies, some are not built if libuuid is not
available, some come with a dummy fallback, and some throw a runtime
error.

It is hard to maintain, and hard for users to reason about.

Since UUID is a simple standard with only a small number of operations,
it is cleaner to have a central support in libqemuutil. This patch adds
qemu_uuid_* functions that all uuid users in the code base can
rely on. Except for qemu_uuid_generate which is new code, all other
functions are just copy from existing fallbacks from other files.

Note that qemu_uuid_parse is moved without updating the function
signature to use QemuUUID, to keep this patch simple.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-Id: <1474432046-325-2-git-send-email-famz@redhat.com>
(cherry picked from commit cea25275a3)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-17 09:45:52 -06:00
Denis V. Lunev
3165f3027f trace: move qemu_trace_opts to trace/control.c
The patch also creates a trace_opt_parse() helper in trace/control.c to
reuse this code in the next patches for qemu-nbd and qemu-io.

The patch also makes trace_init_events() static, as this call is not used
outside the module anymore.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1466174654-30130-4-git-send-email-den@openvz.org
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e9e0bb2af2)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-17 09:45:52 -06:00
Sergey Fedorov
b9a7258ed2 include/qemu/osdep.h: Add macros for pointer alignment
These macros provide a convenient way to n-byte align pointers up and
down and check if a pointer is n-byte aligned.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1461341333-19646-3-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
(cherry picked from commit 6b587d3cda)
[LD: BSC#1106222 CVE-2018-15746]
Signed-off-by: Larry Dewey <ldewey@suse.com>
2021-03-17 09:45:52 -06:00
Konrad Rzeszutek Wilk
cffb152b96 i386: define the AMD 'virt-ssbd' CPUID feature bit (CVE-2018-3639)
AMD Zen exposes the Intel equivalent to Speculative Store Bypass Disable
via the 0x80000008_EBX[25] CPUID feature bit.

This needs to be exposed to guest OS to allow them to protect
against CVE-2018-3639.

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20180521215424.13520-3-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 403503b162)
[BR: BSC#1092885 CVE-2018-3639]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:52 -06:00
Konrad Rzeszutek Wilk
4afd44894b i386: Define the Virt SSBD MSR and handling of it (CVE-2018-3639)
"Some AMD processors only support a non-architectural means of enabling
speculative store bypass disable (SSBD).  To allow a simplified view of
this to a guest, an architectural definition has been created through a new
CPUID bit, 0x80000008_EBX[25], and a new MSR, 0xc001011f.  With this, a
hypervisor can virtualize the existence of this definition and provide an
architectural method for using SSBD to a guest.

Add the new CPUID feature, the new MSR and update the existing SSBD
support to use this MSR when present." (from x86/speculation: Add virtualized
speculative store bypass disable support in Linux).

Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20180521215424.13520-4-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit cfeea0c021)
[BR: BSC#1092885 CVE-2018-3639]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:52 -06:00
Prasad J Pandit
7fb5525b8d qga: check bytes count read by guest-file-read
While reading file content via the 'guest-file-read' command, the
'qmp_guest_file_read' routine allocates a buffer of count+1 bytes.
It could overflow for large values of 'count'.
Add a check to avoid it.

Reported-by: Fakhri Zulkifli <mohdfakhrizulkifli@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
(cherry picked from commit 141b197408)
[FL: BSC#1098735 CVE-2018-12617]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-17 09:45:52 -06:00
Prasad J Pandit
0c2e5749e4 slirp: correct size computation while concatenating mbuf
While reassembling incoming fragmented datagrams, the 'm_cat' routine
extends the 'mbuf' buffer if it has insufficient room. It computes
a wrong buffer size, which leads to overwriting the adjacent heap
area. Correct this size computation in m_cat.

Reported-by: ZDI Disclosures <zdi-disclosures@trendmicro.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit 864036e251)
[FL: BSC#1096223 CVE-2018-11806]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-17 09:45:52 -06:00
Bruce Rogers
2d08b68466 xen: add block resize support for xen disks
Provide monitor naming of xen disks, and plumb guest driver
notification through xenstore of resizing instigated via the
monitor.

[BR: FATE#325467]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:52 -06:00
Marc-André Lureau
3b9aa00611 tpm: lookup cancel path under tpm device class
Since Linux commit 313d21eeab9282e, tpm devices have their own device
class "tpm" and the cancel path must be looked up under
/sys/class/tpm/ instead of /sys/class/misc/.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
(cherry picked from commit 05b71fb207)
[LY: BSC#1079405]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
Daniel P. Berrangé
d78adee7e2 i386: define the 'ssbd' CPUID feature bit (CVE-2018-3639)
New microcode introduces the "Speculative Store Bypass Disable"
CPUID feature bit. This needs to be exposed to guest OS to allow
them to protect against CVE-2018-3639.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Message-Id: <20180521215424.13520-2-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit d19d1f9659)
[BR: BSC#1092885 CVE-2018-3639]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:52 -06:00
Bruce Rogers
3fff8f2982 migration: warn about inconsistent spec_ctrl state
As an attempt to help the user do the right thing, warn if we
detect spec_ctrl data in the migration stream but the defined cpu
doesn't have the feature. This would indicate that the
migration is from the quick and dirty qemu produced in January
2018 to handle Spectre v2. That qemu version exposed the IBRS
cpu feature to all vcpu types, which helped in the short term
but wasn't a well designed approach.
Warn the user that the now migrated guest needs to be restarted
as soon as possible, using the spec_ctrl cpu feature flag or a
*-IBRS vcpu model specified as appropriate.

Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:52 -06:00
Eduardo Habkost
328b7e164a i386: Add new -IBRS versions of Intel CPU models
The new IA32_SPEC_CTRL MSR was introduced by a recent Intel
microcode update and can be used by OSes to mitigate
CVE-2017-5715.  Unfortunately we can't change the existing CPU
models without breaking existing setups, so users need to
explicitly update their VM configuration to use the new *-IBRS
CPU model if they want to expose IBRS to guests.

The new CPU models are simple copies of the existing CPU models,
with just CPUID_7_0_EDX_SPEC_CTRL added and model_id updated.

Cc: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-6-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit ac96c41354)
[BR: BSC#1068032 CVE-2017-5715 CPU models are reduced as appropriate
to match this QEMU version]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:52 -06:00
Eduardo Habkost
5208f5893f i386: Add FEAT_8000_0008_EBX CPUID feature word
Add the new feature word and the "ibpb" feature flag.

Based on a patch by Paolo Bonzini.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-5-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 1b3420e1c4)
[BR: BSC#1068032 CVE-2017-5715 change to match current code's feat_name
type]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:52 -06:00
Eduardo Habkost
b17ccd84fb i386: Add spec-ctrl CPUID bit
Add the feature name and a CPUID_7_0_EDX_SPEC_CTRL macro.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-4-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit a2381f0934)
[BR: BSC#1068032 CVE-2017-5715 modify to match current code's feat_name
type]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:52 -06:00
Luwei Kang
3d0a6bf124 x86: add infrastructure for 7_0_EDX features
The spec can be found in Intel Software Developer Manual or in
Instruction Set Extensions Programming Reference.

Signed-off-by: Piotr Luc <piotr.luc@intel.com>
Signed-off-by: Luwei Kang <luwei.kang@intel.com>
Message-Id: <1477902446-5932-1-git-send-email-he.chen@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 95ea69fb46)
[BR: BSC#1068032 CVE-2017-5715 - orig patch modified to not provide new
feature, just infrastructure, and change to match current code's feat_name
type]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:52 -06:00
Bruce Rogers
57c63613b3 x86: cpu: increase model_id array size from 48 to 49
This is perhaps the easiest way to handle the longer model names
introduced with the -IBRS models. This is in lieu of backporting
commit 807e9869b8 and the other commits
that would require.
[BR: BSC#1068032 CVE-2017-5715]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:52 -06:00
Daniel P. Berrange
e53d09c251 ui: place a hard cap on VNC server output buffer size
The previous patches fix problems with throttling of forced framebuffer updates
and audio data capture that would cause the QEMU output buffer size to grow
without bound. Those fixes are graceful in that once the client catches up with
reading data from the server, everything continues operating normally.

There is some data which the server sends to the client that is impractical to
throttle. Specifically there are various pseudo framebuffer update encodings to
inform the client of things like desktop resizes, pointer changes, audio
playback start/stop, LED state and so on. These generally only involve sending
a very small amount of data to the client, but a malicious guest might be able
to do things that trigger these changes at a very high rate. Throttling them is
not practical as missed or delayed events would cause broken behaviour for the
client.

This patch thus takes a more forceful approach of setting an absolute upper
bound on the amount of data we permit to be present in the output buffer at
any time. The previous patch set a threshold for throttling the output buffer
by allowing an amount of data equivalent to one complete framebuffer update and
one second's worth of audio data. On top of this it allowed for one further
forced framebuffer update to be queued.

To be conservative, we thus take that throttling threshold and multiply it by
5 to form an absolute upper bound. If this bound is hit during vnc_write() we
forcibly disconnect the client, refusing to queue further data. This limit is
high enough that it should never be hit unless a malicious client is trying to
exploit the server, or the network is completely saturated, preventing any data
from being sent on the socket.

This completes the fix for CVE-2017-15124 started in the previous patches.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-12-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit f887cf165d)
[LY: BSC#1073489 CVE-2017-15124]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
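A minimal sketch of the hard-cap idea described above, assuming a
simplified client structure; client_can_queue, throttle_limit and the 5x
factor mirror the description but are not the actual ui/vnc.c code:

#include <stdbool.h>
#include <stddef.h>

/* Simplified client state: only what the bound check needs. */
struct client {
    size_t output_len;     /* bytes currently queued for the socket  */
    size_t throttle_limit; /* threshold used for regular throttling  */
    bool   disconnected;
};

/* Refuse to queue more data once the output buffer exceeds an absolute
 * cap (here: 5x the throttling threshold, as described above). */
static bool client_can_queue(struct client *c, size_t len)
{
    size_t hard_cap = c->throttle_limit * 5;

    if (c->output_len + len > hard_cap) {
        c->disconnected = true;  /* force-disconnect instead of growing */
        return false;
    }
    return true;
}
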
Daniel P. Berrange
41227e1b8f ui: fix VNC client throttling when forced update is requested
The VNC server must throttle data sent to the client to prevent the 'output'
buffer size growing without bound, if the client stops reading data off the
socket (either maliciously or due to stalled/slow network connection).

The current throttling is very crude because it simply checks whether the
output buffer offset is zero. This check is disabled if the client has requested
a forced update, because we want to send these as soon as possible.

As a result, the VNC client can cause QEMU to allocate arbitrary amounts of RAM.
It can first start something in the guest that triggers lots of framebuffer
updates, e.g. playing a YouTube video, then repeatedly send full framebuffer update
requests, but never read data back from the server. This can easily make QEMU's
VNC server send buffer consume 100MB of RAM per second, until the OOM killer
starts reaping processes (hopefully the rogue QEMU process, but it might pick
others...).

To address this we make the throttling more intelligent, so we can throttle
full updates. When we get a forced update request, we keep track of exactly how
much data we put on the output buffer. We will not process a subsequent forced
update request until this data has been fully sent on the wire. We always allow
one forced update request to be in flight, regardless of what data is queued
for incremental updates or audio data. The slight complication is that we do
not initially know how much data an update will send, as this is done in the
background by the VNC job thread. So we must track the fact that the job thread
has an update pending, and not process any further updates until this job
has completed and put its data on the output buffer.

This unbounded memory growth affects all VNC server configurations supported by
QEMU, with no workaround possible. The mitigating factor is that it can only be
triggered by a client that has authenticated with the VNC server, and who is
able to trigger a large quantity of framebuffer updates or audio samples from
the guest OS. Mostly they'll just succeed in getting the OOM killer to kill
their own QEMU process, but it's possible that other processes get taken out as
collateral damage.

This is a more general variant of the similar unbounded memory usage flaw in
the websockets server, that was previously assigned CVE-2017-15268, and fixed
in 2.11 by:

  commit a7b20a8efa
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Mon Oct 9 14:43:42 2017 +0100

    io: monitor encoutput buffer size from websocket GSource

This new general memory usage flaw has been assigned CVE-2017-15124, and is
partially fixed by this patch.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-11-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit ada8d2e436)
[LY: BSC#1073489 CVE-2017-15124]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
Daniel P. Berrange
4b74619181 ui: fix VNC client throttling when audio capture is active
The VNC server must throttle data sent to the client to prevent the 'output'
buffer size growing without bound, if the client stops reading data off the
socket (either maliciously or due to stalled/slow network connection).

The current throttling is very crude because it simply checks whether the
output buffer offset is zero. This check must be disabled if audio capture is
enabled, because when streaming audio the output buffer offset will rarely be
zero due to queued audio data, and so this would starve framebuffer updates.

As a result, the VNC client can cause QEMU to allocate arbitrary amounts of RAM.
It can first start something in the guest that triggers lots of framebuffer
updates, e.g. playing a YouTube video, then enable audio capture and simply never
read data back from the server. This can easily make QEMU's VNC server send
buffer consume 100MB of RAM per second, until the OOM killer starts reaping
processes (hopefully the rogue QEMU process, but it might pick others...).

To address this we make the throttling more intelligent, so we can throttle
when audio capture is active too. To determine how to throttle incremental
updates or audio data, we calculate a size threshold. Normally the threshold is
the approximate number of bytes associated with a single complete framebuffer
update, i.e. width * height * bytes per pixel. We'll send incremental updates
until we hit this threshold, at which point we'll stop sending updates until
data has been written to the wire, causing the output buffer offset to fall
back below the threshold.

If audio capture is enabled, we increase the size of the threshold to also
allow for up to 1 second's worth of audio data samples, i.e. nchannels * bytes
per sample * frequency. This allows the output buffer to have a mixture of
incremental framebuffer updates and audio data queued, but once the threshold
is exceeded, audio data will be dropped and incremental updates will be
throttled.

This unbounded memory growth affects all VNC server configurations supported by
QEMU, with no workaround possible. The mitigating factor is that it can only be
triggered by a client that has authenticated with the VNC server, and who is
able to trigger a large quantity of framebuffer updates or audio samples from
the guest OS. Mostly they'll just succeed in getting the OOM killer to kill
their own QEMU process, but it's possible that other processes get taken out as
collateral damage.

This is a more general variant of the similar unbounded memory usage flaw in
the websockets server, that was previously assigned CVE-2017-15268, and fixed
in 2.11 by:

  commit a7b20a8efa
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Mon Oct 9 14:43:42 2017 +0100

    io: monitor encoutput buffer size from websocket GSource

This new general memory usage flaw has been assigned CVE-2017-15124, and is
partially fixed by this patch.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-10-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit e2b72cb6e0)
[LY: BSC#1073489 CVE-2017-15124]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
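A small sketch of the threshold arithmetic described above (one complete
framebuffer update, plus one second of audio when capture is active); all
names and parameters are illustrative, not the actual VNC server code:

#include <stdbool.h>
#include <stddef.h>

/* Throttling threshold: roughly one full framebuffer update, plus one
 * second of audio samples when audio capture is active. */
static size_t throttle_threshold(size_t width, size_t height,
                                 size_t bytes_per_pixel,
                                 bool audio_capture,
                                 size_t nchannels,
                                 size_t bytes_per_sample,
                                 size_t frequency)
{
    size_t threshold = width * height * bytes_per_pixel;

    if (audio_capture) {
        threshold += nchannels * bytes_per_sample * frequency;
    }
    return threshold;
}

For example, a 1024x768 display at 4 bytes per pixel gives about 3 MiB,
plus roughly 172 KiB for 2-channel, 16-bit, 44100 Hz audio.
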
Daniel P. Berrange
5062d58a76 ui: refactor code for determining if an update should be sent to the client
The logic for determining if it is possible to send an update to the client
will become more complicated shortly, so pull it out into a separate method
for easier extension later.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-9-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 0bad834228)
[LY: BSC#1073489]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
Daniel P. Berrange
1198e67998 ui: correctly reset framebuffer update state after processing dirty regions
According to the RFB protocol, a client sends one or more framebuffer update
requests to the server. The server can reply with a single framebuffer update
response, that covers all previously received requests. Once the client has
read this update from the server, it may send further framebuffer update
requests to monitor future changes. The client is free to delay sending the
framebuffer update request if it needs to throttle the amount of data it is
reading from the server.

The QEMU VNC server, however, has never correctly handled the framebuffer
update requests. Once QEMU has received an update request, it will continue to
send client updates forever, even if the client hasn't asked for further
updates. This prevents the client from throttling back data it gets from the
server. This change fixes the flawed logic such that after a set of updates are
sent out, QEMU waits for a further update request before sending more data.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-8-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 728a7ac954)
[LY: BSC#1073489]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
Daniel P. Berrange
ad7c59727a ui: introduce enum to track VNC client framebuffer update request state
Currently the VNC server tracks whether a client has requested an incremental
or forced update with two boolean flags. There are only really 3 distinct
states to track, so create an enum to more accurately reflect permitted states.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-7-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit fef1bbadfb)
[LY: BSC#1073489]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
Daniel P. Berrange
29752006de ui: track how much decoded data we consumed when doing SASL encoding
When we encode data for writing with SASL, we encode the entire pending output
buffer. The subsequent write, however, may not be able to send the full encoded
data in one go, particularly with a slow network. So we delay setting the
output buffer offset back to zero until all the SASL encoded data is sent.

Between encoding the data and completing sending of the SASL encoded data,
however, more data might have been placed on the pending output buffer. So it
is not valid to set offset back to zero. Instead we must keep track of how much
data we consumed during encoding and subtract only that amount.

With the current bug we would be throwing away some pending data without having
sent it at all. By sheer luck this did not previously cause any serious problem
because appending data to the send buffer is always an atomic action, so we
only ever throw away complete RFB protocol messages. In the case of frame buffer
updates we'd catch up fairly quickly, so no obvious problem was visible.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-6-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 8f61f1c5a6)
[LY: BSC#1073489]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
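A simplified sketch of the bookkeeping described above: remember how many
pending bytes were handed to the encoder, and subtract only that amount once
the encoded output has hit the wire. The struct and function names are
illustrative, not QEMU's SASL code:

#include <stddef.h>
#include <string.h>

/* Simplified output state: a pending plaintext buffer plus a record of
 * how many of its bytes were consumed by the last encoding pass. */
struct output {
    char   buf[65536];
    size_t offset;    /* bytes of pending plaintext            */
    size_t consumed;  /* bytes handed to the encoder last time */
};

/* Called when an encoding pass starts: remember what we consumed. */
static void output_encoded(struct output *o)
{
    o->consumed = o->offset;
}

/* Called once the encoded data has been fully written to the socket.
 * Subtract only what was consumed; data queued in the meantime stays. */
static void output_flushed(struct output *o)
{
    memmove(o->buf, o->buf + o->consumed, o->offset - o->consumed);
    o->offset -= o->consumed;
    o->consumed = 0;
}
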
Daniel P. Berrange
c72ebe880c ui: avoid pointless VNC updates if framebuffer isn't dirty
The vnc_update_client() method checks the 'has_dirty' flag to see if there are
dirty regions that are pending to send to the client. Regardless of this flag,
if a forced update is requested, updates must be sent. For unknown reasons
though, the code also tries to sent updates if audio capture is enabled. This
makes no sense as audio capture state does not impact framebuffer contents, so
this check is removed.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-5-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 3541b08475)
[LY: BSC#1073489]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
Daniel P. Berrange
98a38f4f4d ui: remove redundant indentation in vnc_client_update
Now that previous dead / unreachable code has been removed, we can simplify
the indentation in the vnc_client_update method.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-4-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit b939eb89b6)
[LY: BSC#1073489]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
Daniel P. Berrange
351a422042 ui: remove unreachable code in vnc_update_client
A previous commit:

  commit 5a8be0f73d
  Author: Gerd Hoffmann <kraxel@redhat.com>
  Date:   Wed Jul 13 12:21:20 2016 +0200

    vnc: make sure we finish disconnect

Added a check for vs->disconnecting at the very start of the
vnc_update_client method. This means that the very next "if"
statement's check for !vs->disconnecting always evaluates to true,
and is thus redundant. This in turn means the vs->disconnecting
check at the very end of the method never evaluates to true, and
is thus unreachable code.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-3-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit c53df96161)
[LY: BSC#1073489]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
Daniel P. Berrange
dbb6bb85f5 ui: remove 'sync' parameter from vnc_update_client
There is only one caller of vnc_update_client and that always passes false
for the 'sync' parameter.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-2-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 6af998db05)
[LY: BSC#1073489]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
Marc-André Lureau
036433f416 vnc: fix debug spelling
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171220140618.12701-1-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 090fdc83b0)
[LY: BSC#1073489]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
Prasad J Pandit
7b3f15ef17 multiboot: check mh_load_end_addr address field
While loading kernel via multiboot-v1 image, (flags & 0x00010000)
indicates that multiboot header contains valid addresses to load
the kernel image. In that, end of the data segment address
'mh_load_end_addr' should be less than the bss segment address
'mh_bss_end_addr'. Add check to validate that.

Reported-by: CERT CC <cert.cc@orange.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[LY: BSC#1083291 CVE-2018-7550]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
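One way the check described above might look, heavily simplified; the
helper below is illustrative only and ignores the other multiboot header
fields the real loader also validates:

#include <stdbool.h>
#include <stdint.h>

/* Per the description above, the end of the data segment must not
 * exceed the end of the bss segment. */
static bool multiboot_load_end_ok(uint32_t mh_load_end_addr,
                                  uint32_t mh_bss_end_addr)
{
    return mh_load_end_addr <= mh_bss_end_addr;
}
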
linzhecheng
bb93ef0c02 vga: check the validation of memory addr when draw text
Start a VM with qemu-kvm -enable-kvm -vnc :66 -smp 1 -m 1024 -hda
redhat_5.11.qcow2 -device pcnet -vga cirrus, then use a VNC client to
connect to the VM; executing the code below in the guest OS will lead
to a qemu crash:

#include <stdlib.h>
#include <time.h>
#include <sys/io.h>

int main(void)
{
    int a, b;

    iopl(3);                          /* gain raw I/O port access (needs root) */
    srand(time(NULL));
    while (1) {
        a = rand() % 0x100;           /* random value to write         */
        b = 0x3c0 + (rand() % 0x20);  /* random VGA register I/O port  */
        outb(a, b);
    }
    return 0;
}

The above code writes the VGA registers randomly.
We can write VGA CRT controller register index 0x0C or 0x0D
(which is the start address register) to modify the
display memory address of the upper-left pixel
or character of the screen. That address may be out of the
range of the vga ram, so we should check the validity of the
memory address when reading or writing it to avoid a segfault.

Signed-off-by: linzhecheng <linzhecheng@huawei.com>
Message-id: 20180111132724.13744-1-linzhecheng@huawei.com
Fixes: CVE-2018-5683
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 191f59dc17)
[LY: BSC#1076114 CVE-2018-5683]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
Eric Blake
b66e379ef5 osdep: Fix ROUND_UP(64-bit, 32-bit)
When using bit-wise operations that exploit the power-of-two
nature of the second argument of ROUND_UP(), we still need to
ensure that the mask is as wide as the first argument (done
by using a ternary to force proper arithmetic promotion).
Unpatched, ROUND_UP(2ULL*1024*1024*1024*1024, 512U) produces 0,
instead of the intended 2TiB, because negation of an unsigned
32-bit quantity followed by widening to 64-bits does not
sign-extend the mask.

Broken since its introduction in commit 292c8e50 (v1.5.0).
Callers that passed the same width type to both macro parameters,
or that had other code to ensure the first parameter's maximum
runtime value did not exceed the second parameter's width, are
unaffected, but I did not audit to see which (if any) existing
clients of the macro could trigger incorrect behavior (I found
the bug while adding a new use of the macro).

While preparing the patch, checkpatch complained about poor
spacing, so I also fixed that here and in the nearby DIV_ROUND_UP.

CC: qemu-trivial@nongnu.org
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 2098b073f3)
[LY: BSC#1076775 CVE-2017-18043]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
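The bug and the ternary-promotion idea are easy to demonstrate standalone.
The macros below sketch the before/after shapes (the fixed form is close to,
but not guaranteed to be character-for-character identical with, QEMU's
osdep.h definition):

#include <inttypes.h>
#include <stdio.h>

/* Broken variant: -(d) is computed at 32-bit width, so the mask is not
 * sign-extended when the first argument is 64-bit. */
#define ROUND_UP_BROKEN(n, d) (((n) + (d) - 1) & -(d))

/* Fixed idea: the dead ternary (0 ? (n) : (d)) forces the usual arithmetic
 * conversions, so the negated mask is as wide as the wider argument. */
#define ROUND_UP_FIXED(n, d) (((n) + (d) - 1) & -(0 ? (n) : (d)))

int main(void)
{
    uint64_t two_tib = 2ULL * 1024 * 1024 * 1024 * 1024;

    printf("broken: %" PRIu64 "\n", ROUND_UP_BROKEN(two_tib, 512U)); /* 0 */
    printf("fixed:  %" PRIu64 "\n", ROUND_UP_FIXED(two_tib, 512U));  /* 2199023255552, i.e. 2TiB */
    return 0;
}
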
Eric Blake
4c3e368d9d nbd/server: CVE-2017-15119 Reject options larger than 32M
The NBD spec gives us permission to abruptly disconnect on clients
that send outrageously large option requests, rather than having
to spend the time reading to the end of the option.  No real
option request requires that much data anyway; and meanwhile, we
already have the practice of abruptly dropping the connection on
any client that sends NBD_CMD_WRITE with a payload larger than 32M.

For comparison, nbdkit drops the connection on any request with
more than 4096 bytes; however, that limit is probably too low
(as the NBD spec states an export name can theoretically be up
to 4096 bytes, which means a valid NBD_OPT_INFO could be even
longer) - even if qemu doesn't permit export names longer than 256
bytes.

It could be argued that a malicious client trying to get us to
read nearly 4G of data on a bad request is a form of denial of
service.  In particular, if the server requires TLS, but a client
that does not know the TLS credentials sends any option (other
than NBD_OPT_STARTTLS or NBD_OPT_EXPORT_NAME) with a stated
payload of nearly 4G, then the server was keeping the connection
alive trying to read all the payload, tying up resources that it
would rather be spending on a client that can get past the TLS
handshake.  Hence, this warranted a CVE.

Present since at least 2.5 when handling known options, and made
worse in 2.6 when fixing support for NBD_FLAG_C_FIXED_NEWSTYLE
to handle unknown options.

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit fdad35ef6c)
[LY: BSC#1070144 CVE-2017-15119]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
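A trivial sketch of the cap described above; the constant mirrors the 32M
write-payload limit mentioned in the message, and the function name is
illustrative, not the actual nbd/server code:

#include <stdbool.h>
#include <stdint.h>

/* Mirror the existing 32M cap on NBD_CMD_WRITE payloads. */
#define MAX_OPTION_BYTES (32 * 1024 * 1024)

/* Abruptly dropping the connection (returning false here) is cheaper
 * and safer than reading up to 4G of payload from an untrusted client. */
static bool nbd_option_length_ok(uint32_t length)
{
    return length <= MAX_OPTION_BYTES;
}
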
Prasad J Pandit
0b7a0d71ae ps2: check PS2Queue pointers in post_load routine
During Qemu guest migration, the destination process invokes the ps2
post_load function. There, invalid 'rptr' and 'count' values could
lead to an OOB access or an infinite loop.
Add a check to avoid it.

Reported-by: Cyrille Chatras <cyrille.chatras@orange.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20171116075155.22378-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 802cbcb730)
[LY: bsc#1068613 CVE-2017-16845]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
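A sketch of bounds-checking queue indices restored from a migration stream;
PS2_QUEUE_SIZE and the struct below are simplified stand-ins, not the actual
hw/input/ps2.c state:

#include <stdint.h>

#define PS2_QUEUE_SIZE 256   /* illustrative queue size */

struct ps2_queue_state {
    int32_t rptr;   /* read pointer restored from the stream  */
    int32_t wptr;   /* write pointer restored from the stream */
    int32_t count;  /* element count restored from the stream */
};

/* Sanitize values coming from an untrusted migration stream so that
 * later queue processing can neither index out of bounds nor loop
 * forever on an impossible count. */
static void ps2_post_load_sanitize(struct ps2_queue_state *q)
{
    if (q->rptr < 0 || q->rptr >= PS2_QUEUE_SIZE) {
        q->rptr = 0;
    }
    if (q->count < 0 || q->count > PS2_QUEUE_SIZE) {
        q->count = 0;
    }
    /* keep wptr consistent with rptr and count */
    q->wptr = (q->rptr + q->count) % PS2_QUEUE_SIZE;
}
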
Prasad J Pandit
ea8282c788 virtio: check VirtQueue Vring object is set
A guest could attempt to use an uninitialised VirtQueue object
or an unset Vring.align, leading to an arithmetic exception. Add a
check to avoid it.

Reported-by: Zhangboxian <zhangboxian@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
(cherry picked from commit 758ead31c7)
[LY: bsc#1071228 CVE-2017-17381]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
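A minimal sketch of the guard the message describes: refuse to operate on a
virtqueue whose ring was never set up, since the alignment is later used as
a divisor. The struct and helper names are illustrative, not QEMU's
virtio code:

#include <stdbool.h>
#include <stdint.h>

/* Minimal stand-in for the ring state the check cares about. */
struct vring_state {
    uint64_t desc;   /* guest physical address of the descriptor table */
    uint32_t align;  /* ring alignment, later used as a divisor        */
};

/* Refuse to touch a queue the guest never set up: a zero descriptor
 * address means "unconfigured", and a zero alignment would cause a
 * division (or modulo) by zero further down. */
static bool vring_ready(const struct vring_state *vr)
{
    return vr->desc != 0 && vr->align != 0;
}
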
Christian Borntraeger
ae43e903c0 s390x/kvm: Handle bpb feature
We need to handle the bpb control on reset and migration. Normally
stfle.82 is transparent (and the normal guest part works without
hypervisor activity). To prevent any issues we require full
host kernel support for this feature.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
(cherry picked from commit b073c87517)
[LY: BSC#1076814]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
Christian Borntraeger
a200878f62 header sync
replace with proper header sync

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
(cherry picked from commit 9cbb636270)
[LY: BSC#1076814]
Signed-off-by: Liang Yan <lyan@suse.com>
2021-03-17 09:45:52 -06:00
Paolo Bonzini
86314707d7 i386: Add support for SPEC_CTRL MSR
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-3-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit a33a2cfe2f)
[BR: BSC#1068032 CVE-2017-5715]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:52 -06:00
Anthony PERARD
e9ba26fac3 exec: Add lock parameter to qemu_ram_ptr_length
Commit 04bf2526ce (exec: use
qemu_ram_ptr_length to access guest ram) started using qemu_ram_ptr_length
instead of qemu_map_ram_ptr, but when used with Xen, the behavior of
the two functions differs. They both call xen_map_cache, but one with
"lock", meaning the mapping of guest memory is never released
implicitly, and the second one without, which means the mapping can be
released later, when needed.

In the context of address_space_{read,write}_continue, the ptr to those
mappings should not be locked because it is used immediately and never
used again.

The lock parameter makes it explicit in which context qemu_ram_ptr_length
is called.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20170726165326.10327-1-anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit f5aa69bdc3)
[BR: BSC#1048902 BSC#1069178 CVE-2017-11334 (additional fix needed for orig
issue)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:52 -06:00
Stefano Stabellini
1ff1caa796 xen/mapcache: store dma information in revmapcache entries for debugging
The Xen mapcache is able to create long term mappings, they are called
"locked" mappings. The third parameter of the xen_map_cache call
specifies if a mapping is a "locked" mapping.

From the QEMU point of view there are two kinds of long term mappings:

[a] device memory mappings, such as option roms and video memory
[b] dma mappings, created by dma_memory_map & friends

After certain operations, ballooning a VM in particular, Xen asks QEMU
kindly to destroy all mappings. However, [a] mappings are certainly
present and cannot be removed. That's not a problem as they are not
affected by ballooning. The *real* problem is that if there are any
mappings of type [b], any outstanding dma operations could fail. This is
a known shortcoming. In other words, when Xen asks QEMU to destroy all
mappings, it is an error if any [b] mappings exist.

However today we have no way of distinguishing [a] from [b]. Because of
that, we cannot even print a decent warning.

This patch introduces a new "dma" bool field to MapCacheRev entries, to
remember if a given mapping is for dma or is a long term device memory
mapping. When xen_invalidate_map_cache is called, we print a warning if
any [b] mappings exist. We ignore [a] mappings.

Mappings created by qemu_map_ram_ptr are assumed to be [a], while
mappings created by address_space_map->qemu_ram_ptr_length are assumed
to be [b].

The goal of the patch is to make debugging and system understanding
easier.

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
(cherry picked from commit 1ff7c5986a)
[BR: BSC#1048902 BSC#1069178 CVE-2017-11334 (additional fix needed for orig
issue)]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:52 -06:00
Mark Cave-Ayland
6ecdd22d33 scsi-disk: fix reads from scsi-disk devices
Commit fcaafb1001 accidentally broke reads from
scsi-disk devices when being updated from its original form to use the new
byte-based block functions. Add the extra missing sector to offset conversion
in order to restore read functionality.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1464931021-25117-1-git-send-email-mark.cave-ayland@ilande.co.uk
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 890e48d7fc)
[BR: BSC#1043176 BSC#1067824]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:52 -06:00
Prasad J Pandit
093bed0cba 9pfs: use g_malloc0 to allocate space for xattr
The 9p back-end first queries the size of an extended attribute,
allocates space for it via g_malloc() and then retrieves its
value into the allocated buffer. A race between querying the attribute
size and retrieving its value could lead to a disclosure of memory
bytes. Use g_malloc0() to avoid it.

Reported-by: Tuomas Tynkkynen <tuomas.tynkkynen@iki.fi>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 7bd9275630)
[BR: BSC#1062069 CVE-2017-15038]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:52 -06:00
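A sketch of the query-then-read pattern and why zero-initialising matters.
lgetxattr() and g_malloc0() are the real libc/GLib calls; the wrapper
function itself is illustrative, not the actual 9pfs backend code:

#include <glib.h>
#include <sys/xattr.h>

/* Query-then-read pattern for an extended attribute.  If the attribute
 * shrinks between the two calls, the tail of the buffer is never
 * written; allocating with g_malloc0() keeps those bytes from leaking
 * old heap contents to the client. */
static ssize_t read_xattr(const char *path, const char *name,
                          void **out_buf)
{
    ssize_t size = lgetxattr(path, name, NULL, 0);   /* query size */
    if (size <= 0) {
        *out_buf = NULL;
        return size;
    }
    *out_buf = g_malloc0(size);                      /* zero-filled */
    return lgetxattr(path, name, *out_buf, size);    /* read value  */
}
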
Gerd Hoffmann
42ff2fcb36 cirrus: fix oob access in mode4and5 write functions
Move dst calculation into the loop, so we apply the mask on each
iteration and will not overflow vga memory.

Cc: Prasad J Pandit <pjp@fedoraproject.org>
Reported-by: Niu Guoxiang <niuguoxiang@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20171011084314.21752-1-kraxel@redhat.com
[BR: BSC#1063122 CVE-2017-15289]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:52 -06:00
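A generic sketch of the pattern the fix uses: recompute the masked
destination offset on every iteration instead of once before the loop. The
names below are illustrative, not the actual cirrus_vga code:

#include <stddef.h>
#include <stdint.h>

/* Illustrative blit loop: applying the address mask inside the loop
 * keeps every write inside the vga memory array, even if offset + i
 * would run past the end. */
static void pattern_fill(uint8_t *vga_mem, size_t vga_mask,
                         size_t offset, uint8_t val, size_t count)
{
    for (size_t i = 0; i < count; i++) {
        size_t dst = (offset + i) & vga_mask;   /* re-masked each iteration */
        vga_mem[dst] = val;
    }
}
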
Daniel P. Berrange
bb9d845da2 io: monitor encoutput buffer size from websocket GSource
The websocket GSource is monitoring the size of the rawoutput
buffer to determine if the channel can accepts more writes.
The rawoutput buffer, however, is merely a temporary staging
buffer before data is copied into the encoutput buffer. Thus
its size will always be zero when the GSource runs.

This flaw causes the encoutput buffer to grow without bound
if the other end of the underlying data channel doesn't
read data being sent. This can be seen with VNC if a client
is on a slow WAN link and the guest OS is sending many screen
updates. A malicious VNC client can act like it is on a slow
link by playing a video in the guest and then reading data
very slowly, causing QEMU host memory to expand arbitrarily.

This issue is assigned CVE-2017-15268, publicly reported in

  https://bugs.launchpad.net/qemu/+bug/1718964

(cherry picked from commit a7b20a8efa)

Reviewed-by: Eric Blake <eblake@redhat.com>

[Dan: Added extra checks to deal with code refactored in master but
 not stable 2.10]

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
[BR: BSC#1062942 CVE-2017-15268]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:52 -06:00
Gerd Hoffmann
6fe4081163 vga: stop passing pointers to vga_draw_line* functions
Instead pass around the address (aka offset into vga memory).
Add vga_read_* helper functions which apply vbe_size_mask to
the address, to make sure the address stays within the valid
range, similar to the cirrus blitter fixes (commits ffaf857778
and 026aeffcb4).

Impact:  DoS for privileged guest users.  qemu crashes with
a segfault, when hitting the guard page after vga memory
allocation, while reading vga memory for display updates.

Fixes: CVE-2017-13672
Cc: P J P <ppandit@redhat.com>
Reported-by: David Buchanan <d@vidbuchanan.co.uk>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170828122906.18993-1-kraxel@redhat.com
(cherry picked from commit 3d90c62548)
[FL: BSC#1056334 CVE-2017-13672]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-17 09:45:52 -06:00
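A sketch of address-based read helpers that apply a size mask, per the
description above; the struct and helpers echo the commit but are simplified
stand-ins, not the actual hw/display/vga.c code:

#include <stdint.h>

struct vga_mem {
    uint8_t  *vram;
    uint32_t  vbe_size_mask;  /* vram size rounded to a power of two, minus 1 */
};

/* Read helpers take an offset instead of a raw pointer and mask it so
 * display updates can never read past the end of the allocation. */
static uint8_t vga_read_byte(const struct vga_mem *s, uint32_t addr)
{
    return s->vram[addr & s->vbe_size_mask];
}

static uint16_t vga_read_word_le(const struct vga_mem *s, uint32_t addr)
{
    return vga_read_byte(s, addr) | (uint16_t)vga_read_byte(s, addr + 1) << 8;
}
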
Prasad J Pandit
fc2176bb35 exec: use qemu_ram_ptr_length to access guest ram
When accessing a guest's ram block during a DMA operation, use
'qemu_ram_ptr_length' to get the ram block pointer. It ensures
that a DMA operation of the given length is possible and avoids
any OOB memory access situations.

Reported-by: Alex <broscutamaker@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20170712123840.29328-1-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 04bf2526ce)
[FL: BSC#1048902 CVE-2017-11334]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-17 09:45:52 -06:00
Prasad J Pandit
7d3f326b5c slirp: check len against dhcp options array end
While parsing the dhcp options string in 'dhcp_decode', if an option's
length 'len' appears towards the end of the 'bp_vend' array, the ensuing
read could lead to an OOB memory access. Add a check to avoid it.

This is CVE-2017-11434.

Reported-by: Reno Robert <renorobert@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
(cherry picked from commit 413d463f43)
[FL: BSC#1049381 CVE-2017-11434]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-17 09:45:52 -06:00
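A generic sketch of the bounds check the fix adds, not the actual slirp
dhcp_decode code; the TLV layout (tag byte, length byte, value bytes, 0x00
padding, 0xff terminator) follows the DHCP options format:

#include <stdbool.h>
#include <stdint.h>

/* Walk DHCP-style TLV options: tag byte, length byte, 'len' value bytes.
 * Stop if the declared length would run past the end of the buffer. */
static bool parse_options(const uint8_t *p, const uint8_t *p_end)
{
    while (p < p_end && *p != 0xff) {       /* 0xff terminates the options */
        uint8_t tag = *p++;
        if (tag == 0) {                     /* padding byte, no length */
            continue;
        }
        if (p >= p_end) {
            return false;                   /* length byte missing */
        }
        uint8_t len = *p++;
        if (p + len > p_end) {
            return false;                   /* option overruns the buffer */
        }
        /* ... handle option 'tag' with payload [p, p + len) here ... */
        p += len;
    }
    return true;
}
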
Prasad J Pandit
3a78277e58 multiboot: validate multiboot header address values
While loading kernel via multiboot-v1 image, (flags & 0x00010000)
indicates that multiboot header contains valid addresses to load
the kernel image. These addresses are used to compute kernel
size and kernel text offset in the OS image. Validate these
address values to avoid an OOB access issue.

This is CVE-2017-14167.

Reported-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20170907063256.7418-1-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit ed4f86e8b6)
[FL: BSC#1057585 CVE-2017-14167]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-17 09:45:52 -06:00
Gerd Hoffmann
768021e2f8 usb-redir: fix stack overflow in usbredir_log_data
Don't reinvent a broken wheel, just use the hexdump function we have.

Impact: low, broken code doesn't run unless you have debug logging
enabled.

Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170509110128.27261-1-kraxel@redhat.com
(cherry picked from commit bd4a683505)
[FL: BSC#1047674 CVE-2017-10806]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-17 09:45:52 -06:00
Max Reitz
d47b8910c0 qemu-nbd: Ignore SIGPIPE
qemu proper has done so for 13 years
(8a7ddc38a6), qemu-img and qemu-io have
done so for four years (526eda14a6).
Ignoring this signal is especially important in qemu-nbd because
otherwise a client can easily take down the qemu-nbd server by dropping
the connection when the server wants to send something, for example:

$ qemu-nbd -x foo -f raw -t null-co:// &
[1] 12726
$ qemu-io -c quit nbd://localhost/bar
can't open device nbd://localhost/bar: No export with name 'bar' available
[1]  + 12726 broken pipe  qemu-nbd -x foo -f raw -t null-co://

In this case, the client sends an NBD_OPT_ABORT and closes the
connection (because it is not required to wait for a reply), but the
server replies with an NBD_REP_ACK (because it is required to reply).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20170611123714.31292-1-mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 041e32b8d9)
[FL: BSC#1046636 CVE-2017-10664]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-17 09:45:52 -06:00
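The actual qemu-nbd change routes this through QEMU's own signal-handling
helpers; the standalone sketch below just shows the effect of ignoring
SIGPIPE at startup:

#include <signal.h>
#include <stdio.h>
#include <string.h>

int main(void)
{
    struct sigaction sa;

    /* Ignore SIGPIPE so a client dropping the connection while we are
     * sending a reply surfaces as an EPIPE write error instead of
     * killing the whole server process. */
    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = SIG_IGN;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGPIPE, &sa, NULL) < 0) {
        perror("sigaction");
        return 1;
    }

    /* ... accept and serve NBD clients here ... */
    return 0;
}
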
Eric Blake
1fad365147 nbd: Fix regression on resiliency to port scan
Back in qemu 2.5, qemu-nbd was immune to port probes (a transient
server would not quit, regardless of how many probe connections
came and went, until a connection actually negotiated).  But we
broke that in commit ee7d7aa when removing the return value to
nbd_client_new(), although that patch also introduced a bug causing
an assertion failure on a client that fails negotiation.  We then
made it worse during refactoring in commit 1a6245a (a segfault
before we could even assert); the (masked) assertion was cleaned
up in d3780c2 (still in 2.6), and just recently we finally fixed
the segfault ("nbd: Fully intialize client in case of failed
negotiation").  But that still means that ever since we added
TLS support to qemu-nbd, we have been vulnerable to an ill-timed
port-scan being able to cause a denial of service by taking down
qemu-nbd before a real client has a chance to connect.

Since negotiation is now handled asynchronously via coroutines,
we no longer have a synchronous point of return by re-adding a
return value to nbd_client_new().  So this patch instead wires
things up to pass the negotiation status through the close_fn
callback function.

Simple test across two terminals:
$ qemu-nbd -f raw -p 30001 file
$ nmap 127.0.0.1 -p 30001 && \
  qemu-io -c 'r 0 512' -f raw nbd://localhost:30001

Note that this patch does not change what constitutes successful
negotiation (thus, a client must enter transmission phase before
that client can be considered as a reason to terminate the server
when the connection ends).  Perhaps we may want to tweak things
in a later patch to also treat a client that uses NBD_OPT_ABORT
as being a 'successful' negotiation (the client correctly talked
the NBD protocol, and informed us it was not going to use our
export after all), but that's a discussion for another day.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1451614

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170608222617.20376-1-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 0c9390d978)
[FL: BSC#1043808 CVE-2017-9524]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-17 09:45:52 -06:00
Eric Blake
02340f63ce nbd: Fully initialize client in case of failed negotiation
If a non-NBD client connects to qemu-nbd, we would end up with
a SIGSEGV in nbd_client_put() because we were trying to
unregister the client's association to the export, even though
we skipped inserting the client into that list.  Easy trigger
in two terminals:

$ qemu-nbd -p 30001 --format=raw file
$ nmap 127.0.0.1 -p 30001

nmap claims that it thinks it connected to a pago-services1
server (which probably means nmap could be updated to learn the
NBD protocol and give a more accurate diagnosis of the open
port - but that's not our problem), then terminates immediately,
so our call to nbd_negotiate() fails.  The fix is to reorder
nbd_co_client_start() to ensure that all initialization occurs
before we ever try talking to a client in nbd_negotiate(), so
that the teardown sequence on negotiation failure doesn't fault
while dereferencing a half-initialized object.

While debugging this, I also noticed that nbd_update_server_watch()
called by nbd_client_closed() was still adding a channel to accept
the next client, even when the state was no longer RUNNING.  That
is fixed by making nbd_can_accept() pay attention to the current
state.

Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1451614

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170527030421.28366-1-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit df8ad9f128)
[FL: BSC#1043808 CVE-2017-9524]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-17 09:45:52 -06:00
Stefan Hajnoczi
c53fc3bfcc IDE: Do not flush empty CDROM drives
The block backend changed in a way that flushing empty CDROM drives now
crashes.  Amend IDE to avoid doing so until the root problem can be
addressed for 2.11.

Original patch by John Snow <jsnow@redhat.com>.

Reported-by: Kieron Shorrock <kshorrock@paloaltonetworks.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170809160212.29976-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 4da97120d5)
[FL: BSC#1054724 CVE-2017-12809]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-17 09:45:52 -06:00
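A minimal sketch of the workaround described above: treat a flush on a
drive with no medium as an immediate success instead of handing it to the
block backend. The struct and function are illustrative, not the actual
hw/ide code:

#include <stdbool.h>

struct ide_drive {
    bool media_present;   /* e.g. a CD-ROM tray may be empty */
};

/* Treat a FLUSH CACHE on a drive with no medium as an immediate
 * success instead of passing it to the block backend. */
static int ide_flush(struct ide_drive *d)
{
    if (!d->media_present) {
        return 0;   /* nothing to flush */
    }
    /* ... submit the real flush to the block backend ... */
    return 0;
}
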
Stefano Stabellini
41e3c570bd xen/disk: don't leak stack data via response ring
Rather than constructing a local structure instance on the stack, fill
the fields directly on the shared ring, just like other (Linux)
backends do. Build on the fact that all response structure flavors are
actually identical (aside from alignment and padding at the end).

This is XSA-216.

Reported by: Anthony Perard <anthony.perard@citrix.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
(cherry picked from commit b0ac694fdb)
[FL: BSC#1057378 CVE-2017-10911]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-17 09:45:52 -06:00
Alexander Graf
2f050eb24f vnc: Set default kbd delay to 10ms
The current VNC default keyboard delay is 1ms. With that we're constantly
typing faster than the guest receives keyboard events from an XHCI attached
USB HID device.

The default keyboard delay time in the input layer however is 10ms. I don't know
how that number came to be, but empirical tests on some OpenQA driven ARM
systems show that 10ms really is a reasonable default number for the delay.

This patch moves the VNC delay also to 10ms. That way our default is much
safer (good!) and also consistent with the input layer default (also good!).

Signed-off-by: Alexander Graf <agraf@suse.de>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1499863425-103133-1-git-send-email-agraf@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit d3b0db6dfe)
[FL: BSC#1059369]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-17 09:45:52 -06:00
Alexander Graf
9c1627f632 hid: Reset kbd modifiers on reset
When resetting the keyboard, we need to reset not just the pending keystrokes,
but also any pending modifiers. Otherwise there's a race when we're getting
reset while running an escape sequence (modifier 0x100).

Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
Message-id: 1498117295-162030-1-git-send-email-agraf@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 51dbea77a2)
[FL: BSC#1059369]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-17 09:45:52 -06:00
Alexander Graf
f9ed2f6db0 input: Decrement queue count on kbd delay
Delays in the input layer are special cased input events. Every input
event is accounted for in a global input queue count. The special-cased
delays, however, did not get removed from the queue, leading to queue overruns
and thus silent key drops after typing quite a few characters.

Signed-off-by: Alexander Graf <agraf@suse.de>
Message-id: 1498117318-162102-1-git-send-email-agraf@suse.de
Fixes: be1a7176 ("input: add support for kbd delays")
Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 77b0359bf4)
[FL: BSC#1059369]
Signed-off-by: Fei Li <fli@suse.com>
2021-03-17 09:45:52 -06:00
Paolo Bonzini
605de85f8d scsi-block: fix direction of BYTCHK test for VERIFY commands
The direction is wrong; scsi_block_is_passthrough returns
false for commands that *can* use sglists.

Reported-by: Zhang Qian <zhangqian@sangfor.com.cn>
Fixes: 8fdc7839e4
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 1f8af0d186)
[LM: BSC#1043176]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-17 09:45:52 -06:00
Paolo Bonzini
8466c16902 scsi-block: always use SG_IO
Using pread/pwrite or io_submit has the advantage of eliminating the
bounce buffer, but drops the SCSI status.  This keeps the guest from
seeing unit attention codes, as well as statuses such as RESERVATION
CONFLICT.  Because we know scsi-block operates on an SBC device we can
still use the DMA helpers with SG_IO; just remember to patch the CDBs
if the transfer is split into multiple segments.

This means that scsi-block will always use the thread-pool unfortunately,
instead of respecting aio=native.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 8fdc7839e4)
[LM: BSC#1043176]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-17 09:45:52 -06:00
Paolo Bonzini
ad4ea5d00e scsi-disk: introduce scsi_disk_req_check_error
Commonize all the checks for canceled requests and errors.  The next patch
will add another case to check for, in order to handle passthrough commands.

There is no semantic change here; the only nontrivial modification is in
scsi_write_do_fua, where cancellation has been checked earlier by both
callers.  Thus, the check is replaced with an assertion.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 5b956f415a)
[LM: BSC#1043176]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-17 09:45:52 -06:00
Paolo Bonzini
7996a67ea9 scsi-disk: add need_fua_emulation to SCSIDiskClass
scsi-block will be able to do FUA just by passing the request through
to the LUN (which is also more efficient); there is no need to emulate
it like we do for scsi-disk.

Add a new method to distinguish this.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 94f8ba1125)
[LM: BSC#1043176]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-17 09:45:52 -06:00
Paolo Bonzini
1fa8c51cbe scsi-disk: introduce dma_readv and dma_writev
These are replacements for blk_aio_readv and blk_aio_writev that allow
customization of the data path.  They reuse the DMA helpers' DMAIOFunc
callback type, so that the same function can be used in either the
QEMUSGList or the bounce-buffered case.

This customization will be needed in the next patch to do zero-copy
SG_IO on scsi-block.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit fcaafb1001)
[LM: BSC#1043176]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-17 09:45:52 -06:00
Paolo Bonzini
67e80c2fc4 scsi-disk: introduce a common base class
This will be the place to add DMAIOFuncs in the next patch.  There
are also a couple DeviceClass members that can be moved to the
abstract class's initialization function.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 993935f315)
[LM: BSC#1043176]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-17 09:45:52 -06:00
Paolo Bonzini
1eaa9c3175 dma-helpers: change BlockBackend to opaque value in DMAIOFunc
Callers of dma_blk_io have no way to pass extra data to the DMAIOFunc,
because the original callback and opaque are gone by the time DMAIOFunc
is called.  On the other hand, the BlockBackend is usually derived
from those extra data that you could pass to the DMAIOFunc (in the
next patch, that would be the SCSIRequest).

So change DMAIOFunc's prototype, decoupling it from blk_aio_readv
and blk_aio_writev's.  The new prototype loses the BlockBackend
and gains an extra opaque value which, in the case of dma_blk_readv
and dma_blk_writev, is of course used for the BlockBackend.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 8a8e63ebdd)
[LM: BSC#1043176]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-17 09:45:52 -06:00
Paolo Bonzini
9c4f20b07f dma-helpers: change interface to byte-based
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit cbe0ed6247)
[LM: BSC#1043176]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-17 09:45:52 -06:00
Eric Blake
ffa84a8166 ide: Switch to byte-based aio block access
Sector-based blk_aio_readv() and blk_aio_writev() should die; switch
to byte-based blk_aio_preadv() and blk_aio_pwritev() instead.

The patch had to touch multiple files at once, because dma_blk_io()
takes pointers to the functions, and ide_issue_trim() piggybacks on
the same interface (while ignoring offset under the hood).

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit d4f510eb3f)
[LM: BSC#1043176]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-17 09:45:52 -06:00
Eric Blake
2c7459c1bd block: Introduce byte-based aio read/write
blk_aio_readv() and blk_aio_writev() are annoying in that they
can't access sub-sector granularity, and cannot pass flags.
Also, they require the caller to pass redundant information
about the size of the I/O (qiov->size in bytes must match
nb_sectors in sectors).

Add new blk_aio_preadv() and blk_aio_pwritev() functions to fix
the flaws. The next few patches will upgrade callers, then
finally delete the old interfaces.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 60cb2fa7eb)
[LM: BSC#1043176]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-17 09:45:52 -06:00
Eric Blake
72b9fe861a block: Switch blk_*write_zeroes() to byte interface
Sector-based blk_write() should die; convert the one-off
variant blk_write_zeroes() to use an offset/count interface
instead.  Likewise for blk_co_write_zeroes() and
blk_aio_write_zeroes().

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 983a160050)
[LM: BSC#1043176]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-17 09:45:52 -06:00
Eric Blake
3a20a85994 block: Switch blk_read_unthrottled() to byte interface
Sector-based blk_read() should die; convert the one-off
variant blk_read_unthrottled().

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b7d17f9fa4)
[LM: BSC#1043176]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-17 09:45:51 -06:00
Eric Blake
bf887ef59d block: Allow BDRV_REQ_FUA through blk_pwrite()
We have several block drivers that understand BDRV_REQ_FUA,
and emulate it in the block layer for the rest by a full flush.
But without a way to actually request BDRV_REQ_FUA during a
pass-through blk_pwrite(), FUA-aware block drivers like NBD are
forced to repeat the emulation logic of a full flush regardless
of whether the backend they are writing to could do it more
efficiently.

This patch just wires up a flags argument; followup patches
will actually make use of it in the NBD driver and in qemu-io.

Signed-off-by: Eric Blake <eblake@redhat.com>
Acked-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 8341f00dc2)
[LM: BSC#1043176]
Signed-off-by: Lin Ma <lma@suse.com>
2021-03-17 09:45:51 -06:00
Bruce Rogers
2e19ef7893 9pfs: local: remove: use correct path component
Commit a0e640a8 introduced a path processing error.
Pass fstatat the dirpath based path component instead
of the entire path.

[BR: BSC#1045035]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Paolo Bonzini
3a767be9c3 megasas: always store SCSIRequest* into MegasasCmd
This ensures that the request is unref'ed properly, and avoids a
segmentation fault in the new qtest testcase that is added.

Reported-by: Zhangyanyu <zyy4013@stu.ouc.edu.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[BR: BSC#1043296 CVE-2017-9503, dropped testcase from patch]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Paolo Bonzini
9b5aeff4b2 megasas: do not read DCMD opcode more than once from frame
Avoid TOCTOU bugs by storing the DCMD opcode in the MegasasCmd.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[BR: BSC#1043296 CVE-2017-9503]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Li Qiang
26aab24071 serial: fix memory leak in serial exit
The serial_exit_core function doesn't free some resources.
This can lead to a memory leak on hotplug and unplug. This
patch avoids this.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-Id: <586cb5ab.f31d9d0a.38ac3.acf2@mx.google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 8409dc884a)
[BR: BSC#1021741 CVE-2017-5579]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
bed3603d0f 9pfs: local: fix unlink of alien files in mapped-file mode
When trying to remove a file from a directory, both created in non-mapped
mode, the file remains and EBADF is returned to the guest.

This is a regression introduced by commit "df4938a6651b 9pfs: local:
unlinkat: don't follow symlinks" when fixing CVE-2016-9602. It changed the
way we unlink the metadata file from

    ret = remove("$dir/.virtfs_metadata/$name");
    if (ret < 0 && errno != ENOENT) {
         /* Error out */
    }
    /* Ignore absence of metadata */

to

    fd = openat("$dir/.virtfs_metadata")
    unlinkat(fd, "$name")
    if (ret < 0 && errno != ENOENT) {
         /* Error out */
    }
    /* Ignore absence of metadata */

If $dir was created in non-mapped mode, openat() fails with ENOENT and
we pass -1 to unlinkat(), which fails in turn with EBADF.

We just need to check the return of openat() and ignore ENOENT, in order
to restore the behaviour we had with remove().

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
[groug: rewrote the comments as suggested by Eric]
(cherry picked from commit 6a87e7929f)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
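A sketch of the corrected sequence described above, using the real openat()
and unlinkat() calls; the wrapper function itself is illustrative and the
real 9pfs local backend does more:

#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Remove "$dir/.virtfs_metadata/$name" if the metadata directory exists;
 * its absence (the file was created in non-mapped mode) is not an error. */
static int remove_meta_file(int dirfd, const char *name)
{
    int fd = openat(dirfd, ".virtfs_metadata", O_RDONLY | O_DIRECTORY);
    if (fd < 0) {
        return errno == ENOENT ? 0 : -1;   /* ignore missing metadata dir */
    }
    if (unlinkat(fd, name, 0) < 0 && errno != ENOENT) {
        close(fd);
        return -1;
    }
    close(fd);
    return 0;
}
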
Greg Kurz
a55f87c8ca 9pfs: local: forbid client access to metadata (CVE-2017-7493)
When using the mapped-file security mode, we shouldn't let the client mess
with the metadata. The current code already tries to hide the metadata dir
from the client by skipping it in local_readdir(). But the client can still
access or modify it through several other operations. This can be used to
escalate privileges in the guest.

Affected backend operations are:
- local_mknod()
- local_mkdir()
- local_open2()
- local_symlink()
- local_link()
- local_unlinkat()
- local_renameat()
- local_rename()
- local_name_to_path()

Other operations are safe because they are only passed a fid path, which
is computed internally in local_name_to_path().

This patch converts all the functions listed above to fail and return
EINVAL when being passed the name of the metadata dir. This may look
like a poor choice for errno, but there's no such thing as an illegal
path name on Linux and I could not think of anything better.

This fixes CVE-2017-7493.

Reported-by: Leo Gaspard <leo@gaspard.io>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit 7a95434e0c)
[BR: BSC#1039495]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Prasad J Pandit
d12f514ad3 scsi: avoid an off-by-one error in megasas_mmio_write
While reading the magic sequence (MFI_SEQ) in megasas_mmio_write,
an off-by-one error could occur as the 's->adp_reset' index is not
reset after reading the last sequence.

Reported-by: YY Z <bigbird475958471@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20170424120634.12268-1-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 24dfa9fa2f)
[BR: BSC#1037336 CVE-2017-8380]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Gerd Hoffmann
e7331ac37a audio: release capture buffers
AUD_add_capture() allocates two buffers which are never released.
Add the missing calls to AUD_del_capture().

Impact: Allows vnc clients to exhaust host memory by repeatedly
starting and stopping audio capture.

Fixes: CVE-2017-8309
Cc: P J P <ppandit@redhat.com>
Cc: Huawei PSIRT <PSIRT@huawei.com>
Reported-by: "Jiangxin (hunter, SCC)" <jiangxin1@huawei.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20170428075612.9997-1-kraxel@redhat.com
(cherry picked from commit 3268a845f4)
[BR: BSC#1037242]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Gerd Hoffmann
8d85137c18 input: limit kbd queue depth
Apply a limit to the number of items we accept into the keyboard queue.

Impact: Without this limit vnc clients can exhaust host memory by
sending keyboard events faster than qemu feeds them to the guest.

Fixes: CVE-2017-8379
Cc: P J P <ppandit@redhat.com>
Cc: Huawei PSIRT <PSIRT@huawei.com>
Reported-by: jiangxin1@huawei.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170428084237.23960-1-kraxel@redhat.com
(cherry picked from commit fa18f36a46)
[BR: BSC#1037334]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Li Qiang
96533d4bf7 usb: ohci: fix error return code in servicing iso td
It should return 1 if an error occurs when reading an iso TD.
This avoids an infinite loop issue in ohci_service_ed_list.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 5899ac3e.1033240a.944d5.9a2d@mx.google.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 26f670a244)
[BR: BSC#1042159 CVE-2017-9330]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Li Qiang
fbc95afd59 ide: ahci: call cleanup function in ahci unit
This avoids a memory leak when hot-unplugging the AHCI device.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 1488449293-80280-4-git-send-email-liqiang6-s@360.cn
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit d68f0f778e)
[BSC#1042801 CVE-2017-9373]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Li Qiang
495349cb85 ide: core: add cleanup function
As the PCI AHCI device can be hotplugged and unplugged, its unrealize
function should free all the resources that were allocated in the
realize function. This patch adds ide_exit to free those resources.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 1488449293-80280-3-git-send-email-liqiang6-s@360.cn
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit c9f086418a)
[BSC#1042801 CVE-2017-9373]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Li Qiang
1268011fc0 usb: ehci: fix memory leak in ehci
The usb_ehci_init function initializes 's->ipacket', but there is no
corresponding function to free it. As the EHCI device can be hotplugged
and unplugged, this leaks host memory. In order to keep the hierarchy
clean, we should add an EHCI PCI finalize function and call the cleanup
function of the EHCI device from there.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 589a85b8.3c2b9d0a.b8e6.1434@mx.google.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit d710e1e7bd)
[BR: BSC#1043073 CVE-2017-9374]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Gerd Hoffmann
c8f2dc2fbe xhci: guard xhci_kick_epctx against recursive calls
Track xhci_kick_epctx processing being active in a variable.  Check the
variable before calling xhci_kick_epctx from xhci_kick_ep.  Add an
assert to make sure we don't call recursively into xhci_kick_epctx.
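
A minimal sketch of a guard of this shape, with illustrative names rather
than the real xhci structures:

    #include <assert.h>
    #include <stdbool.h>

    typedef struct EPContext {
        bool kick_active;            /* set while the context is being processed */
        /* ... */
    } EPContext;

    static void kick_epctx(EPContext *epctx)
    {
        assert(!epctx->kick_active); /* never re-enter the processing loop */
        epctx->kick_active = true;
        /* ... walk and submit the transfer rings ... */
        epctx->kick_active = false;
    }

    static void kick_ep(EPContext *epctx)
    {
        if (epctx->kick_active) {
            return;                  /* processing already in progress */
        }
        kick_epctx(epctx);
    }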

Cc: 1653384@bugs.launchpad.net
Fixes: 94b037f2a4
Reported-by: Fabian Lesniak <fabian@lesniak-it.de>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1486035372-3621-1-git-send-email-kraxel@redhat.com
Message-id: 1485790607-31399-5-git-send-email-kraxel@redhat.com
(cherry picked from commit 96d87bdda3)
[BR: BSC#1042800 CVE-2017-9375]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
P J P
f05e4039c5 vmw_pvscsi: check message ring page count at initialisation
The VMware Paravirtual SCSI emulation uses command descriptors to
process SCSI commands. These come with their message ring buffers.
A guest could set the message ring page count to an arbitrary value,
resulting in an infinite loop. Add a check to avoid it.
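
A small sketch of such a bound check; the constant and its value are
assumptions for illustration, not the device's actual limit.

    #include <stdint.h>

    #define MSG_RING_MAX_PAGES 8     /* assumed upper bound for the sketch */

    static int msg_ring_pages_valid(uint32_t req_num_pages)
    {
        /* Refuse 0 or absurdly large guest-supplied values instead of
         * looping over them later. */
        return req_num_pages >= 1 && req_num_pages <= MSG_RING_MAX_PAGES;
    }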

Reported-by: YY Z <bigbird475958471@gmail.com>
Signed-off-by: P J P <ppandit@redhat.com>
(cherry picked from commit f68826989c)
[BR: BSC#1036211 CVE-2017-8112]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Gerd Hoffmann
8f37797636 cirrus: fix off-by-one in cirrus_bitblt_rop_bkwd_transp_*_16
The switch from pointers to addresses (commit
026aeffcb4 and
ffaf857778) added
an off-by-one bug to 16-bit backward blits.  Fix it.

Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 1489735296-19047-1-git-send-email-kraxel@redhat.com
(cherry picked from commit f019722cbb)
[BR: BSC#1035406 CVE-2017-7980]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Gerd Hoffmann
0c063d43d7 cirrus: stop passing around src pointers in the blitter
Does basically the same as "cirrus: stop passing around dst pointers in
the blitter", just for the src pointer instead of the dst pointer.

For the src we have to care about cputovideo blits though and fetch the
data from s->cirrus_bltbuf instead of vga memory.  The cirrus_src*()
helper functions handle that.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1489584487-3489-1-git-send-email-kraxel@redhat.com
(cherry picked from commit ffaf857778)
[BR: BSC#1035406 CVE-2017-7980]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Gerd Hoffmann
58c2dc899b cirrus: stop passing around dst pointers in the blitter
Instead pass around the address (aka offset into vga memory).  Calculate
the pointer in the rop_* functions, after applying the mask to the
address, to make sure the address stays within the valid range.
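
A tiny sketch of the pattern (field names illustrative, and assuming the
vram size is a power of two so the mask wraps offsets into range):

    #include <stdint.h>

    static inline uint8_t *vram_ptr(uint8_t *vram, uint32_t addr_mask,
                                    uint32_t addr)
    {
        /* Mask first, then convert to a pointer: the result can never
         * point outside the vram buffer. */
        return vram + (addr & addr_mask);
    }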

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1489574872-8679-1-git-send-email-kraxel@redhat.com
(cherry picked from commit 026aeffcb4)
[BR: BSC#1035406 CVE-2017-7980]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
hangaohuai
9ad661ce48 cirrus_vga: fix OOB read causing a qemu segmentation fault
Check the validity of parameters in cirrus_bitblt_rop_fwd_transp_xxx
and cirrus_bitblt_rop_fwd_xxx to avoid the OOB read which causes a qemu
segmentation fault.

After the fix, we will touch the assert in
cirrus_invalidate_region:
assert(off_cur_end >= off_cur);

Signed-off-by: fangying <fangying1@huawei.com>
Signed-off-by: hangaohuai <hangaohuai@huawei.com>
Message-id: 20170314063919.16200-1-hangaohuai@huawei.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 215902d7b6)
[BR: BSC#1034908 CVE-2017-7718]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Pranith Kumar
0d9e95e380 tcg/i386: Check the size of instruction being translated
This fixes the bug: 'user-to-root privesc inside VM via bad translation
caching' reported by Jann Horn here:
https://bugs.chromium.org/p/project-zero/issues/detail?id=1122

Reviewed-by: Richard Henderson <rth@twiddle.net>
CC: Peter Maydell <peter.maydell@linaro.org>
CC: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Jann Horn <jannh@google.com>
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20170323175851.14342-1-bobby.prani@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 30663fd26c)
[BR: BSC#1030624]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Gerd Hoffmann
f630f353c7 cirrus/vnc: zap bitblit support from console code.
There is a special code path (dpy_gfx_copy) to allow graphic emulation
notify user interface code about bitblit operations carryed out by
guests.  It is supported by cirrus and vnc server.  The intended purpose
is to optimize display scrolls and just send over the scroll op instead
of a full display update.

This is rarely used these days though because modern guests simply don't
use the cirrus blitter any more.  Any linux guest using the cirrus drm
driver doesn't.  Any windows guest newer than winxp doesn't ship with a
cirrus driver any more and thus uses the cirrus as a simple framebuffer.

So this code tends to bitrot and bugs can go unnoticed for a long time.
See for example commit "3e10c3e vnc: fix qemu crash because of SIGSEGV"
which fixes a bug lingering in the code for almost a year, added by
commit "c7628bf vnc: only alloc server surface with clients connected".

Also the vnc server will throttle the frame rate in case it figures the
network can't keep up (send buffers are full).  This doesn't work with
dpy_gfx_copy, for any copy operation sent to the vnc client we have to
send all outstanding updates beforehand, otherwise the vnc client might
run the client side blit on outdated data and thereby corrupt the
display.  So this dpy_gfx_copy "optimization" might even make things
worse on slow network links.

Let's kill it once and for all.

Oh, and one more reason: Turns out (after writing the patch) we have a
security bug in that code path ...

Fixes: CVE-2016-9603
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1489494419-14340-1-git-send-email-kraxel@redhat.com
(cherry picked from commit 50628d3479)
[BR: BSC#1028656]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Li Qiang
e8b997b694 usb: ohci: limit the number of link eds
The guest may build an infinite loop with linked EDs. This patch
limits the number of linked EDs to avoid this.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 5899a02e.45ca240a.6c373.93c1@mx.google.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 95ed56939e)
[BR: BSC#1028184 CVE-2017-6505]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Prasad J Pandit
3b50b0f111 sd: sdhci: check transfer mode register in multi block transfer
In the SDHCI protocol, the transfer mode register value is used
during a multi-block transfer to check whether the block count
register is enabled and should be updated. The transfer mode register
could be set such that the block count register would not be updated,
thus leading to an infinite loop. Add a check to avoid it.
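
A hedged sketch of the loop guard; the register bit name and semantics are
simplified for illustration.

    #include <stdbool.h>
    #include <stdint.h>

    #define TRNS_BLK_CNT_EN 0x0002   /* assumed block-count-enable bit */

    /* Decide whether the multi-block loop may continue; without the
     * transfer-mode check, a guest clearing the enable bit would keep
     * the block count frozen and the loop spinning forever. */
    static bool multi_block_continue(uint16_t trnmod, uint16_t *blkcnt)
    {
        if (!(trnmod & TRNS_BLK_CNT_EN) || *blkcnt == 0) {
            return false;
        }
        (*blkcnt)--;
        return *blkcnt != 0;
    }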

Reported-by: Wjjzhang <wjjzhang@tencent.com>
Reported-by: Jiang Xin <jiangxin1@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20170214185225.7994-3-ppandit@redhat.com
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 6e86d90352)
[BR: BSC#1025311 CVE-2017-5987]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Gerd Hoffmann
cb78f61a32 xhci: apply limits to loops
Limits should be big enough that normal guest should not hit it.
Add a tracepoint to log them, just in case.  Also, while being
at it, log the existing link trb limit too.

Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1486383669-6421-1-git-send-email-kraxel@redhat.com
(cherry picked from commit f89b60f6e5)
[BR: BSC#1025109 CVE-2017-5973]
Signed-off-by: Bruce Rogers <brogers@suse.com>

Conflicts:
	hw/usb/hcd-xhci.c
	hw/usb/trace-events
2021-03-17 09:45:51 -06:00
Greg Kurz
6d78bfd014 9pfs: local: set the path of the export root to "."
The local backend was recently converted to using "at*()" syscalls in order
to ensure all accesses happen below the shared directory. This requires that
we only pass relative paths, otherwise the dirfd argument to the "at*()"
syscalls is ignored and the path is treated as an absolute path in the host.
This is actually the case for paths in all fids, with the notable exception
of the root fid, whose path is "/". This causes the following backend ops to
act on the "/" directory of the host instead of the virtfs shared directory
when the export root is involved:
- lstat
- chmod
- chown
- utimensat

ie, chmod /9p_mount_point in the guest will be converted to chmod / in the
host for example. This could cause security issues with a privileged QEMU.

All "*at()" syscalls are being passed an open file descriptor. In the case
of the export root, this file descriptor points to the path in the host that
was passed to -fsdev.

The fix is thus as simple as changing the path of the export root fid to be
"." instead of "/".

This is CVE-2017-7471.
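
A small illustration of why the relative path matters for the *at() syscalls
(paths and error handling are illustrative):

    #include <fcntl.h>
    #include <sys/stat.h>
    #include <unistd.h>

    int main(void)
    {
        struct stat st;
        /* Assumed export root for the example */
        int dirfd = open("/srv/9p-export", O_RDONLY | O_DIRECTORY);

        /* "/" is absolute: dirfd is ignored and the host's root is stat'ed */
        fstatat(dirfd, "/", &st, AT_SYMLINK_NOFOLLOW);

        /* "." is relative: this stats the export root itself, as intended */
        fstatat(dirfd, ".", &st, AT_SYMLINK_NOFOLLOW);

        close(dirfd);
        return 0;
    }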

Cc: qemu-stable@nongnu.org
Reported-by: Léo Gaspard <leo@gaspard.io>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 9c6b899f7a)
[BR: BSC#1034866 CVE-2017-7471]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
9ecdf7758a 9pfs: introduce v9fs_path_sprintf() helper
This helper is similar to v9fs_string_sprintf(), but it includes the
terminating NUL character in the size field.

This is to avoid doing v9fs_string_sprintf((V9fsString *) &path) and
then bumping the size.

Affected users are changed to use this new helper.
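
A hedged, self-contained sketch of the helper's contract, with simplified
types and no error handling: format into a buffer, but record the size
including the terminating NUL.

    #include <stdarg.h>
    #include <stdio.h>
    #include <stdlib.h>

    typedef struct {
        char *data;
        size_t size;                 /* includes the terminating NUL */
    } PathBuf;

    static void path_sprintf(PathBuf *p, const char *fmt, ...)
    {
        va_list ap, ap2;
        int len;

        va_start(ap, fmt);
        va_copy(ap2, ap);
        len = vsnprintf(NULL, 0, fmt, ap);   /* measure the formatted length */
        va_end(ap);

        free(p->data);
        p->data = malloc((size_t)len + 1);
        vsnprintf(p->data, (size_t)len + 1, fmt, ap2);
        va_end(ap2);

        p->size = (size_t)len + 1;   /* NUL included, unlike string_sprintf() */
    }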

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
(cherry picked from commit e3e83f2e21)
[BR: support patch for BSC#1034866]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Li Qiang
77f8bc56c1 9pfs: xattr: fix memory leak in v9fs_list_xattr
Free 'orig_value' in error path.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 4ffcdef427)
[BR: BSC#1035950 CVE-2017-8086]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Li Qiang
b018997f2a 9pfs: fix file descriptor leak
The v9fs_create() and v9fs_lcreate() functions are used to create a file
on the backend and to associate it to a fid. The fid shouldn't be already
in-use, otherwise both functions may silently leak a file descriptor or
allocated memory. The current code doesn't check that.

This patch ensures that the fid isn't already associated to anything
before using it.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
(reworded the changelog, Greg Kurz)
Signed-off-by: Greg Kurz <groug@kaod.org>

(cherry picked from commit d63fb193e7)
[BR: BSC#1032075 CVE-2017-7377]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
808589057b 9pfs: don't try to flush self and avoid QEMU hang on reset
According to the 9P spec [*], when a client wants to cancel a pending I/O
request identified by a given tag (uint16), it must send a Tflush message
and wait for the server to respond with a Rflush message before reusing this
tag for another I/O. The server may still send a completion message for the
I/O if it wasn't actually cancelled but the Rflush message must arrive after
that.

QEMU hence waits for the flushed PDU to complete before sending the Rflush
message back to the client.

If a client sends 'Tflush tag oldtag' and tag == oldtag, QEMU will then
allocate a PDU identified by tag, find it in the PDU list and wait for
this same PDU to complete... i.e. wait for a completion that will never
happen. This causes a tag and ring slot leak in the guest, and a PDU
leak in QEMU, all of them limited by the maximal number of PDUs (128).
But, worse, this causes QEMU to hang on device reset since v9fs_reset()
wants to drain all pending I/O.

This insane behavior is likely to denote a bug in the client, and it would
deserve an Rerror message to be sent back. Unfortunately, the protocol
allows it and requires all flush requests to succeed (only an Rflush response
is expected).

The only option is to detect when we have to handle a self-referencing
flush request and report success to the client right away.

[*] http://man.cat-v.org/plan_9/5/flush

Reported-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit d5f2af7b95)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
27d5e959c8 9pfs: fix vulnerability in openat_dir() and local_unlinkat_common()
We should pass O_NOFOLLOW otherwise openat() will follow symlinks and make
QEMU vulnerable.

While here, we also fix local_unlinkat_common() to use openat_dir() for
the same reasons (it was a leftover in the original patchset actually).

This fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit b003fc0d8a)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
e7d3f1f37c 9pfs: fix O_PATH build break with older glibc versions
When O_PATH is used with O_DIRECTORY, it only acts as an optimization: the
openat() syscall simply finds the name in the VFS, and doesn't trigger the
underlying filesystem.

On systems that don't define O_PATH, because they have glibc version 2.13
or older for example, we can safely omit it. We don't want to deactivate
O_PATH globally though, in case it is used without O_DIRECTORY. This is done
with a dedicated macro.

Systems without O_PATH may thus fail to resolve names that involve
unreadable directories, compared to newer systems succeeding, but such
corner case failure is our only option on those older systems to avoid
the security hole of chasing symlinks inappropriately.
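
A sketch of such a dedicated macro (the name is illustrative):

    #include <fcntl.h>

    /* Only used together with O_DIRECTORY, where O_PATH is a pure
     * optimization; on old glibc without O_PATH we simply drop it. */
    #ifdef O_PATH
    #define O_PATH_9P_UTIL O_PATH
    #else
    #define O_PATH_9P_UTIL 0
    #endif

    /* e.g.: openat(dirfd, name,
     *              O_DIRECTORY | O_RDONLY | O_NOFOLLOW | O_PATH_9P_UTIL); */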

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
(added last paragraph to changelog as suggested by Eric Blake)
Signed-off-by: Greg Kurz <groug@kaod.org>

(cherry picked from commit 918112c02a)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
1464f6eeb6 9pfs: don't use AT_EMPTY_PATH in local_set_cred_passthrough()
The name argument can never be an empty string, and dirfd always points to
the containing directory of the file name. AT_EMPTY_PATH is hence useless
here. It also breaks the build with glibc version 2.13 and older.

It is actually an oversight of a previous tentative patch to implement this
function. We can safely drop it.

Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Greg Kurz <groug@kaod.org>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit b314f6a077)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
1f09c8f11a 9pfs: fail local_statfs() earlier
If we cannot open the given path, we can return right away instead of
passing -1 to fstatfs() and close(). This will make Coverity happy.

(Coverity issue CID1371729)

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Daniel P. berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
(cherry picked from commit 23da0145cc)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
4b95efd175 9pfs: fix fd leak in local_opendir()
Coverity issue CID1371731

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
(cherry picked from commit faab207f11)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
28ca796f78 9pfs: fix bogus fd check in local_remove()
This was spotted by Coverity as a fd leak. This is certainly true, but also
local_remove() would always return without doing anything, unless the fd is
zero, which is very unlikely.

(Coverity issue CID1371732)

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit b7361d46e7)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
b36499bf99 9pfs: local: drop unused code
Now that all the callbacks have been converted to use "at" syscalls, we
can drop this code.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit c23d5f1d5b)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
6c67cd9f5f 9pfs: local: open2: don't follow symlinks
The local_open2() callback is vulnerable to symlink attacks because it
calls:

(1) open() which follows symbolic links for all path elements but the
    rightmost one
(2) local_set_xattr()->setxattr() which follows symbolic links for all
    path elements
(3) local_set_mapped_file_attr() which calls in turn local_fopen() and
    mkdir(), both functions following symbolic links for all path
    elements but the rightmost one
(4) local_post_create_passthrough() which calls in turn lchown() and
    chmod(), both functions also following symbolic links

This patch converts local_open2() to rely on opendir_nofollow() and
mkdirat() to fix (1), as well as local_set_xattrat(),
local_set_mapped_file_attrat() and local_set_cred_passthrough() to
fix (2), (3) and (4) respectively. Since local_open2() already opens
a descriptor to the target file, local_set_cred_passthrough() is
modified to reuse it instead of opening a new one.

The mapped and mapped-file security modes are supposed to be identical,
except for the place where credentials and file modes are stored. While
here, we also make that explicit by sharing the call to openat().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit a565fea565)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
597cd1543a 9pfs: local: mkdir: don't follow symlinks
The local_mkdir() callback is vulnerable to symlink attacks because it
calls:

(1) mkdir() which follows symbolic links for all path elements but the
    rightmost one
(2) local_set_xattr()->setxattr() which follows symbolic links for all
    path elements
(3) local_set_mapped_file_attr() which calls in turn local_fopen() and
    mkdir(), both functions following symbolic links for all path
    elements but the rightmost one
(4) local_post_create_passthrough() which calls in turn lchown() and
    chmod(), both functions also following symbolic links

This patch converts local_mkdir() to rely on opendir_nofollow() and
mkdirat() to fix (1), as well as local_set_xattrat(),
local_set_mapped_file_attrat() and local_set_cred_passthrough() to
fix (2), (3) and (4) respectively.

The mapped and mapped-file security modes are supposed to be identical,
except for the place where credentials and file modes are stored. While
here, we also make that explicit by sharing the call to mkdirat().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 3f3a16990b)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
0d0aa6459a 9pfs: local: mknod: don't follow symlinks
The local_mknod() callback is vulnerable to symlink attacks because it
calls:

(1) mknod() which follows symbolic links for all path elements but the
    rightmost one
(2) local_set_xattr()->setxattr() which follows symbolic links for all
    path elements
(3) local_set_mapped_file_attr() which calls in turn local_fopen() and
    mkdir(), both functions following symbolic links for all path
    elements but the rightmost one
(4) local_post_create_passthrough() which calls in turn lchown() and
    chmod(), both functions also following symbolic links

This patch converts local_mknod() to rely on opendir_nofollow() and
mknodat() to fix (1), as well as local_set_xattrat() and
local_set_mapped_file_attrat() to fix (2) and (3) respectively.

A new local_set_cred_passthrough() helper based on fchownat() and
fchmodat_nofollow() is introduced as a replacement to
local_post_create_passthrough() to fix (4).

The mapped and mapped-file security modes are supposed to be identical,
except for the place where credentials and file modes are stored. While
here, we also make that explicit by sharing the call to mknodat().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit d815e72190)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
7b4fe67373 9pfs: local: symlink: don't follow symlinks
The local_symlink() callback is vulnerable to symlink attacks because it
calls:

(1) symlink() which follows symbolic links for all path elements but the
    rightmost one
(2) open(O_NOFOLLOW) which follows symbolic links for all path elements but
    the rightmost one
(3) local_set_xattr()->setxattr() which follows symbolic links for all
    path elements
(4) local_set_mapped_file_attr() which calls in turn local_fopen() and
    mkdir(), both functions following symbolic links for all path
    elements but the rightmost one

This patch converts local_symlink() to rely on opendir_nofollow() and
symlinkat() to fix (1), openat(O_NOFOLLOW) to fix (2), as well as
local_set_xattrat() and local_set_mapped_file_attrat() to fix (3) and
(4) respectively.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 38771613ea)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
f1c80d9c94 9pfs: local: chown: don't follow symlinks
The local_chown() callback is vulnerable to symlink attacks because it
calls:

(1) lchown() which follows symbolic links for all path elements but the
    rightmost one
(2) local_set_xattr()->setxattr() which follows symbolic links for all
    path elements
(3) local_set_mapped_file_attr() which calls in turn local_fopen() and
    mkdir(), both functions following symbolic links for all path
    elements but the rightmost one

This patch converts local_chown() to rely on open_nofollow() and
fchownat() to fix (1), as well as local_set_xattrat() and
local_set_mapped_file_attrat() to fix (2) and (3) respectively.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit d369f20763)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
470f505dae 9pfs: local: chmod: don't follow symlinks
The local_chmod() callback is vulnerable to symlink attacks because it
calls:

(1) chmod() which follows symbolic links for all path elements
(2) local_set_xattr()->setxattr() which follows symbolic links for all
    path elements
(3) local_set_mapped_file_attr() which calls in turn local_fopen() and
    mkdir(), both functions following symbolic links for all path
    elements but the rightmost one

We would need fchmodat() to implement AT_SYMLINK_NOFOLLOW to fix (1). This
isn't the case on linux unfortunately: the kernel doesn't even have a flags
argument to the syscall :-\ It is impossible to fix it in userspace in
a race-free manner. This patch hence converts local_chmod() to rely on
open_nofollow() and fchmod(). This fixes the vulnerability but introduces
a limitation: the target file must be readable and/or writable for the call
to openat() to succeed.
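
A minimal sketch of that approach, given the limitation just described; the
flag set and the read-then-write fallback order are assumptions for the
sketch.

    #include <errno.h>
    #include <fcntl.h>
    #include <sys/stat.h>
    #include <sys/types.h>
    #include <unistd.h>

    static int chmod_nofollow(int dirfd, const char *name, mode_t mode)
    {
        int ret, saved_errno;
        /* Try read access first, then write access; both refuse to
         * follow a symlink in the rightmost element. */
        int fd = openat(dirfd, name,
                        O_RDONLY | O_NOFOLLOW | O_NONBLOCK | O_NOCTTY);

        if (fd == -1) {
            fd = openat(dirfd, name,
                        O_WRONLY | O_NOFOLLOW | O_NONBLOCK | O_NOCTTY);
        }
        if (fd == -1) {
            return -1;               /* neither readable nor writable */
        }
        ret = fchmod(fd, mode);
        saved_errno = errno;
        close(fd);
        errno = saved_errno;
        return ret;
    }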

It introduces a local_set_xattrat() replacement to local_set_xattr()
based on fsetxattrat() to fix (2), and a local_set_mapped_file_attrat()
replacement to local_set_mapped_file_attr() based on local_fopenat()
and mkdirat() to fix (3). No effort is made to factor out code because
both local_set_xattr() and local_set_mapped_file_attr() will be dropped
when all users have been converted to use the "at" versions.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e3187a45dd)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
4ef872c7a0 9pfs: local: link: don't follow symlinks
The local_link() callback is vulnerable to symlink attacks because it calls:

(1) link() which follows symbolic links for all path elements but the
    rightmost one
(2) local_create_mapped_attr_dir()->mkdir() which follows symbolic links
    for all path elements but the rightmost one

This patch converts local_link() to rely on opendir_nofollow() and linkat()
to fix (1), mkdirat() to fix (2).

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ad0b46e6ac)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
740fb09307 9pfs: local: improve error handling in link op
When using the mapped-file security model, we also have to create a link
for the metadata file if it exists. In case of failure, we should rollback.

That's what this patch does.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 6dd4b1f1d0)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
266ca3b6dd 9pfs: local: rename: use renameat
The local_rename() callback is vulnerable to symlink attacks because it
uses rename() which follows symbolic links in all path elements but the
rightmost one.

This patch simply transforms local_rename() into a wrapper around
local_renameat() which is symlink-attack safe.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit d2767edec5)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
fafed95829 9pfs: local: renameat: don't follow symlinks
The local_renameat() callback is currently a wrapper around local_rename()
which is vulnerable to symlink attacks.

This patch rewrites local_renameat() to have its own implementation, based
on local_opendir_nofollow() and renameat().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 99f2cf4b2d)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
38fe01eac3 9pfs: local: lstat: don't follow symlinks
The local_lstat() callback is vulnerable to symlink attacks because it
calls:

(1) lstat() which follows symbolic links in all path elements but the
    rightmost one
(2) getxattr() which follows symbolic links in all path elements
(3) local_mapped_file_attr()->local_fopen()->openat(O_NOFOLLOW) which
    follows symbolic links in all path elements but the rightmost
    one

This patch converts local_lstat() to rely on opendir_nofollow() and
fstatat(AT_SYMLINK_NOFOLLOW) to fix (1), fgetxattrat_nofollow() to
fix (2).

A new local_fopenat() helper is introduced as a replacement to
local_fopen() to fix (3). No effort is made to factor out code
because local_fopen() will be dropped when all users have been
converted to call local_fopenat().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit f9aef99b3e)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
54e9633381 9pfs: local: readlink: don't follow symlinks
The local_readlink() callback is vulnerable to symlink attacks because it
calls:

(1) open(O_NOFOLLOW) which follows symbolic links for all path elements but
    the rightmost one
(2) readlink() which follows symbolic links for all path elements but the
    rightmost one

This patch converts local_readlink() to rely on open_nofollow() to fix (1)
and opendir_nofollow(), readlinkat() to fix (2).

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit bec1e9546e)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
377a01fd3e 9pfs: local: truncate: don't follow symlinks
The local_truncate() callback is vulnerable to symlink attacks because
it calls truncate() which follows symbolic links in all path elements.

This patch converts local_truncate() to rely on open_nofollow() and
ftruncate() instead.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit ac125d993b)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
73f1f209e9 9pfs: local: statfs: don't follow symlinks
The local_statfs() callback is vulnerable to symlink attacks because it
calls statfs() which follows symbolic links in all path elements.

This patch converts local_statfs() to rely on open_nofollow() and fstatfs()
instead.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 31e51d1c15)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
e8e540294d 9pfs: local: utimensat: don't follow symlinks
The local_utimensat() callback is vulnerable to symlink attacks because it
calls qemu_utimens()->utimensat(AT_SYMLINK_NOFOLLOW), which follows symbolic
links in all path elements but the rightmost one, or qemu_utimens()->utimes(),
which follows symbolic links for all path elements.

This patch converts local_utimensat() to rely on opendir_nofollow() and
utimensat(AT_SYMLINK_NOFOLLOW) directly instead of using qemu_utimens().
It is hence assumed that the OS supports utimensat(), i.e. has glibc 2.6
or higher and linux 2.6.22 or higher, which seems reasonable nowadays.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit a33eda0dd9)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
89ecc2ee5b 9pfs: local: remove: don't follow symlinks
The local_remove() callback is vulnerable to symlink attacks because it
calls:

(1) lstat() which follows symbolic links in all path elements but the
    rightmost one
(2) remove() which follows symbolic links in all path elements but the
    rightmost one

This patch converts local_remove() to rely on opendir_nofollow(),
fstatat(AT_SYMLINK_NOFOLLOW) to fix (1) and unlinkat() to fix (2).

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit a0e640a872)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
bc882d1a0f 9pfs: local: unlinkat: don't follow symlinks
The local_unlinkat() callback is vulnerable to symlink attacks because it
calls remove() which follows symbolic links in all path elements but the
rightmost one.

This patch converts local_unlinkat() to rely on opendir_nofollow() and
unlinkat() instead.

Most of the code is moved to a separate local_unlinkat_common() helper
which will be reused in a subsequent patch to fix the same issue in
local_remove().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit df4938a665)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
56350836b1 9pfs: local: lremovexattr: don't follow symlinks
The local_lremovexattr() callback is vulnerable to symlink attacks because
it calls lremovexattr() which follows symbolic links in all path elements
but the rightmost one.

This patch introduces a helper to emulate the non-existing fremovexattrat()
function: it is implemented with /proc/self/fd which provides a trusted
path that can be safely passed to lremovexattr().

local_lremovexattr() is converted to use this helper and opendir_nofollow().
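
A hedged sketch of the /proc/self/fd trick (names are close to, but not
necessarily identical to, the real helper):

    #define _GNU_SOURCE
    #include <stdio.h>
    #include <stdlib.h>
    #include <sys/xattr.h>

    static int fremovexattrat_nofollow(int dirfd, const char *filename,
                                       const char *name)
    {
        char *proc_path;
        int ret;

        /* dirfd is a trusted descriptor to the containing directory, so
         * the /proc path cannot be redirected by a guest-controlled
         * symlink component. */
        if (asprintf(&proc_path, "/proc/self/fd/%d/%s", dirfd, filename) < 0) {
            return -1;
        }
        ret = lremovexattr(proc_path, name);  /* 'l' variant: leaf not followed */
        free(proc_path);
        return ret;
    }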

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 72f0d0bf51)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
b928af70ed 9pfs: local: lsetxattr: don't follow symlinks
The local_lsetxattr() callback is vulnerable to symlink attacks because
it calls lsetxattr() which follows symbolic links in all path elements but
the rightmost one.

This patch introduces a helper to emulate the non-existing fsetxattrat()
function: it is implemented with /proc/self/fd which provides a trusted
path that can be safely passed to lsetxattr().

local_lsetxattr() is converted to use this helper and opendir_nofollow().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 3e36aba757)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
56809c99e0 9pfs: local: llistxattr: don't follow symlinks
The local_llistxattr() callback is vulnerable to symlink attacks because
it calls llistxattr() which follows symbolic links in all path elements but
the rightmost one.

This patch introduces a helper to emulate the non-existing flistxattrat()
function: it is implemented with /proc/self/fd which provides a trusted
path that can be safely passed to llistxattr().

local_llistxattr() is converted to use this helper and opendir_nofollow().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 5507904e36)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
f9e47c99e4 9pfs: local: lgetxattr: don't follow symlinks
The local_lgetxattr() callback is vulnerable to symlink attacks because
it calls lgetxattr() which follows symbolic links in all path elements but
the rightmost one.

This patch introduces a helper to emulate the non-existing fgetxattrat()
function: it is implemented with /proc/self/fd which provides a trusted
path that can be safely passed to lgetxattr().

local_lgetxattr() is converted to use this helper and opendir_nofollow().

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 56ad3e54da)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
78720c0545 9pfs: local: open/opendir: don't follow symlinks
The local_open() and local_opendir() callbacks are vulnerable to symlink
attacks because they call:

(1) open(O_NOFOLLOW) which follows symbolic links in all path elements but
    the rightmost one
(2) opendir() which follows symbolic links in all path elements

This patch converts both callbacks to use new helpers based on
openat_nofollow() to only open files and directories if they are
below the virtfs shared folder.

This partly fixes CVE-2016-9602.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 996a0d76d7)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
a6f75fb375 9pfs: local: keep a file descriptor on the shared folder
This patch opens the shared folder and caches the file descriptor, so that
it can be used to do symlink-safe path walk.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 0e35a37829)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
e45f9e0258 9pfs: introduce relative_openat_nofollow() helper
When using the passthrough security mode, symbolic links created by the
guest are actual symbolic links on the host file system.

Since the resolution of symbolic links during path walk is supposed to
occur on the client side, the server should never receive any path
pointing to an actual symbolic link. This isn't guaranteed by the protocol
though, and malicious code in the guest can trick the server to issue
various syscalls on paths whose one or more elements are symbolic links.
In the case of the "local" backend using the "passthrough" or "none"
security modes, the guest can directly create symbolic links to arbitrary
locations on the host (as per spec). The "mapped-xattr" and "mapped-file"
security modes are also affected to a lesser extent as they require some
help from an external entity to create actual symbolic links on the host,
i.e. another guest using "passthrough" mode for example.

The current code hence relies on O_NOFOLLOW and "l*()" variants of system
calls. Unfortunately, this only applies to the rightmost path component.
A guest could maliciously replace any component in a trusted path with a
symbolic link. This could allow any guest to escape a virtfs shared folder.

This patch introduces a variant of the openat() syscall that successively
opens each path element with O_NOFOLLOW. When passing a file descriptor
pointing to a trusted directory, one is guaranteed to be returned a
file descriptor pointing to a path which is beneath the trusted directory.
This will be used by subsequent patches to implement symlink-safe path walk
for any access to the backend.

Symbolic links aren't the only threats actually: a malicious guest could
change a path element to point to other types of file with undesirable
effects:
- a named pipe or any other thing that would cause openat() to block
- a terminal device which would become QEMU's controlling terminal

These issues can be addressed with O_NONBLOCK and O_NOCTTY.

Two helpers are introduced: one to open intermediate path elements and one
to open the rightmost path element.
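
A simplified, self-contained sketch of the walk, assuming a relative path
with no empty components; the real helper differs in details.

    #include <fcntl.h>
    #include <limits.h>
    #include <string.h>
    #include <sys/types.h>
    #include <unistd.h>

    static int relative_openat_nofollow(int dirfd, const char *path,
                                        int flags, mode_t mode)
    {
        int fd = dup(dirfd);                  /* never close the caller's fd */

        while (fd != -1 && *path) {
            const char *slash = strchr(path, '/');
            size_t len = slash ? (size_t)(slash - path) : strlen(path);
            char comp[NAME_MAX + 1];
            int next_fd;

            if (len == 0 || len > NAME_MAX) {
                close(fd);
                return -1;
            }
            memcpy(comp, path, len);
            comp[len] = '\0';

            if (slash) {
                /* Intermediate element: must be a directory, never a symlink,
                 * and must not block or become a controlling terminal. */
                next_fd = openat(fd, comp, O_DIRECTORY | O_RDONLY | O_NOFOLLOW |
                                           O_NONBLOCK | O_NOCTTY);
                path = slash + 1;
            } else {
                /* Rightmost element: caller-chosen flags, still O_NOFOLLOW */
                next_fd = openat(fd, comp,
                                 flags | O_NOFOLLOW | O_NONBLOCK | O_NOCTTY,
                                 mode);
                path += len;
            }
            close(fd);
            fd = next_fd;
        }
        return fd;   /* if >= 0, guaranteed to be below the original dirfd */
    }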

Suggested-by: Jann Horn <jannh@google.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(renamed openat_nofollow() to relative_openat_nofollow(),
 assert path is relative and doesn't contain '//',
 fixed side-effect in assert, Greg Kurz)
Signed-off-by: Greg Kurz <groug@kaod.org>

(cherry picked from commit 6482a96163)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
0d35fc3652 9pfs: remove side-effects in local_open() and local_opendir()
If these functions fail, they should not change *fs. Let's use local
variables to fix this.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 21328e1e57)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
4d8b5b1cb1 9p: introduce the V9fsDir type
If we are to switch back to readdir(), we need a more complex type than
DIR * to be able to serialize concurrent accesses to the directory stream.

This patch introduces a placeholder type and fixes all users.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
(cherry picked from commit f314ea4e30)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
25afc77978 9pfs: remove side-effects in local_init()
If this function fails, it should not modify *ctx.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 00c90bd1c2)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
669ae0aa9e 9pfs: local: move xattr security ops to 9p-xattr.c
These functions are always called indirectly. It really doesn't make sense
for them to sit in a header file.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 56fc494bdc)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
abdbbdaa32 9pfs: fix off-by-one error in PDU free list
The server can handle MAX_REQ - 1 PDUs at a time and the virtio-9p
device has a MAX_REQ sized virtqueue. If the client manages to fill
up the virtqueue, pdu_alloc() will fail and the request won't be
processed without any notice to the client (it actually causes the
linux 9p client to hang).

This has been there since the beginning (commit 9f10751365 "virtio-9p:
Add a virtio 9p device to qemu"), but it needs an aggressive workload to
run in the guest to show up.

We actually allocate MAX_REQ PDUs and I see no reason not to link them
all into the free list, so let's fix the init loop.
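
A self-contained sketch of the point being made: the seeding loop must cover
all MAX_REQ pre-allocated PDUs (the list shape and names are simplified).

    #include <stddef.h>

    #define MAX_REQ 128              /* number of pre-allocated PDUs */

    typedef struct PDU {
        struct PDU *next;
        /* ... */
    } PDU;

    static PDU pdus[MAX_REQ];
    static PDU *free_list;

    static void init_free_list(void)
    {
        free_list = NULL;
        /* Seed the free list with every pre-allocated PDU; stopping one
         * short is exactly the off-by-one described above. */
        for (int i = 0; i < MAX_REQ; i++) {
            pdus[i].next = free_list;
            free_list = &pdus[i];
        }
    }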

Reported-by: Tuomas Tynkkynen <tuomas@tuxera.com>
Suggested-by: Al Viro <viro@ZenIV.linux.org.uk>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 0d78289c3d)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Stefano Stabellini
a4a4c3bebe 9pfs: move pdus to V9fsState
pdus are initialized and used in 9pfs common code. Move the array from
V9fsVirtioState to V9fsState.

Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 583f21f8b9)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Greg Kurz
d654bba5a5 9pfs: fix crash when fsdev is missing
If the user passes -device virtio-9p without the corresponding -fsdev, QEMU
dereferences a NULL pointer and crashes.

This is a 2.8 regression introduced by commit 702dbcc274.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
(cherry picked from commit f2b58c4375)
[BR: Fix and/or infrastructure for BSC#1020427 CVE-2016-9602]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Alexander Graf
4089e0e088 i386: Allow cpuid bit override
KVM has a feature bitmap of CPUID bits that it knows work for guests.
QEMU removes bits that are not part of that bitmap automatically on VM
start.

However, sometimes we just don't list features in that bitmap because
they don't make sense for normal scenarios, but they may be useful in
specific, targeted workloads.

For that purpose, add a new =force option to all CPUID feature flags in
the CPU property. With that we can override the accel filtering and give
users full control over the CPUID feature bits exposed into guests.

Signed-off-by: Alexander Graf <agraf@suse.de>
(backported from upstream submission)
2021-03-17 09:45:51 -06:00
Gerd Hoffmann
979f0b2f77 virtio-gpu: fix resource leak in virgl_cmd_resource_unref
When the guest sends VIRTIO_GPU_CMD_RESOURCE_UNREF without detaching the
backing storage beforehand (VIRTIO_GPU_CMD_RESOURCE_DETACH_BACKING)
we'll leak memory.

This patch fixes it for 3d mode, similar to the 2d mode fix in commit
"b8e2392 virtio-gpu: call cleanup mapping function in resource destroy".

Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1485167210-4757-1-git-send-email-kraxel@redhat.com
(cherry picked from commit 5e8e3c4c75)
[BR: CVE-2017-5857 BSC#1023073]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Prasad J Pandit
a3e8da458c usb: ccid: check ccid apdu length
The CCID device emulator uses Application Protocol Data Units (APDU)
to exchange commands and responses with the host.
The length of these units cannot be greater than 65536. Add a
check to ensure this. It also avoids a potential integer
overflow in emulated_apdu_from_guest.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20170202192228.10847-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit c7dfbf3225)
[BR: CVE-2017-5898 BSC#1023907]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Gerd Hoffmann
e003f22e31 cirrus: add blit_is_unsafe call to cirrus_bitblt_cputovideo
CIRRUS_BLTMODE_MEMSYSSRC blits do NOT check blit destination
and blit width, at all.  Oops.  Fix it.

Security impact: high.

The missing blit destination check allows writing to host memory.
Basically same as CVE-2014-8106 for the other blit variants.

The missing blit width check allows overflowing cirrus_bltbuf,
with the attractive target cirrus_srcptr (current cirrus_bltbuf write
position) being located right after cirrus_bltbuf in CirrusVGAState.

Due to cirrus emulation writing cirrus_bltbuf bytewise the attacker
hasn't full control over cirrus_srcptr though, only one byte can be
changed.  Once the first byte has been modified further writes land
elsewhere.

[ This is CVE-2017-2620 / XSA-209  - Ian Jackson ]

Fixed compilation by removing extra parameter to blit_is_unsafe. -iwj

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Ian Jackson <ian.jackson@eu.citrix.com>
[BR: BSC#1024972]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Gerd Hoffmann
800f228f94 cirrus: fix patterncopy checks
The blit_region_is_unsafe checks don't work correctly for the
patterncopy source.  It's a fixed-sized region, which doesn't
depend on cirrus_blt_{width,height}.  So go do the check in
cirrus_bitblt_common_patterncopy instead, then tell blit_is_unsafe that
it doesn't need to verify the source.  Also handle the case where we
blit from cirrus_bitbuf correctly.

This patch replaces 5858dd1801.

Security impact: I think we mostly err on the safe side this
time, refusing blits which should have been allowed.

Only exception is placing the blit source at the end of the video ram,
so cirrus_blt_srcaddr + 256 goes beyond the end of video memory.  But
even in that case I'm not fully sure this actually allows read access to
host memory.  To trick the commit 5858dd18 security checks one has to
pick very small cirrus_blt_{width,height} values, which in turn implies
only a fraction of the blit source will actually be used.

Cc: Wolfgang Bumiller <w.bumiller@proxmox.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-id: 1486645341-5010-1-git-send-email-kraxel@redhat.com
(cherry picked from commit 95280c31cd)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Paolo Bonzini
b8174ff4de megasas: fix guest-triggered memory leak
If the guest sets the sglist size to a value >=2GB, megasas_handle_dcmd
will return MFI_STAT_MEMORY_NOT_AVAILABLE without freeing the memory.
Avoid this by returning only the status from map_dcmd, and loading
cmd->iov_size in the caller.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 765a707000)
[BR: CVE-2017-5856 BSC#1023053]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Li Qiang
5b04bc0caa cirrus: fix oob access issue (CVE-2017-2615)
When doing a bitblt copy in backward mode, we should subtract the
blt width first, just like we add it in the forward mode. This
avoids an OOB access before the start of the vga's vram.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>

[ kraxel: with backward blits (negative pitch) addr is the topmost
          address, so check it as-is against vram size ]

Cc: qemu-stable@nongnu.org
Cc: P J P <ppandit@redhat.com>
Cc: Laszlo Ersek <lersek@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Wolfgang Bumiller <w.bumiller@proxmox.com>
Fixes: d3532a0db0 (CVE-2014-8106)
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1485938101-26602-1-git-send-email-kraxel@redhat.com
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
(cherry picked from commit 62d4c6bd52)
[BR: CVE-2017-2615 BSC#1023004]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Gerd Hoffmann
762566f838 cirrus: fix blit address mask handling
Apply the cirrus_addr_mask to cirrus_blt_dstaddr and cirrus_blt_srcaddr
right after assigning them, in cirrus_bitblt_start(), instead of having
this all over the place in the cirrus code, and missing a few places.

Reported-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1485338996-17095-1-git-send-email-kraxel@redhat.com
(cherry picked from commit 60cd23e851)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Wolfgang Bumiller
93f7d131d1 cirrus: handle negative pitch in cirrus_invalidate_region()
cirrus_invalidate_region() calls memory_region_set_dirty()
on a per-line basis, always ranging from off_begin to
off_begin+bytesperline. With a negative pitch off_begin
marks the topmost used address and thus we need to do an
initial shift backwards by a line for negative pitches of
backward blits, otherwise the first iteration covers the
line going from the start offset forwards instead of
backwards.
Additionally since the start address is inclusive, if we
shift by a full `bytesperline` we move to the first address
*not* included in the blit, so we only shift by one less
than bytesperline.

Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Message-id: 1485352137-29367-1-git-send-email-w.bumiller@proxmox.com

[ kraxel: codestyle fixes ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit f153b563f8)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Christian Borntraeger
8c82736af5 s390x/kvm: fix small race reboot vs. cmma
Right now we reset all devices before we reset the cmma states.  This
can result in the host kernel discarding guest pages that were
previously in the unused state but already contain a bios or a -kernel
file before the cmma reset has finished.  This race results in random
guest crashes or hangs during very early reboot.

Fixes: 1cd4e0f6f0 ("s390x/cmma: clean up cmma reset")
Cc: qemu-stable@nongnu.org
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
(cherry picked from commit 1a0e4c8b02)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Halil Pasic
98b8cfafc4 virtio: fix vq->inuse recalc after migr
Correct recalculation of vq->inuse after migration for the corner case
where the avail_idx has already wrapped but used_idx not yet.

Also change the type of the VirtQueue.inuse to unsigned int. This is
done to be consistent with other members representing sizes (VRing.num),
and because C99 guarantees max ring size < UINT_MAX but does not
guarantee max ring size < INT_MAX.
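
A hedged sketch of the wrap-safe arithmetic with simplified names: both
indices are free-running 16-bit counters, so the in-flight count is their
difference taken modulo 2^16.

    #include <stdint.h>

    static unsigned int vq_recalc_inuse(uint16_t last_avail_idx,
                                        uint16_t used_idx)
    {
        /* Unsigned 16-bit subtraction wraps correctly even when
         * last_avail_idx has wrapped past 0 and used_idx has not. */
        return (uint16_t)(last_avail_idx - used_idx);
    }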

Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Fixes: bccdef6b ("virtio: recalculate vq->inuse after migration")
CC: qemu-stable@nongnu.org
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit e66bcc4081)
[BR: BSC#1020928]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
P J P
2b433ca6f5 sd: sdhci: check data length during dma_memory_read
While doing a multi-block SDMA transfer in the routine
'sdhci_sdma_transfer_multi_blocks', the 's->fifo_buffer' starting
index 'begin' and the data length 's->data_count' could end up being
the same. This could lead to an OOB access issue. Correct the transfer
data length to avoid it.

Reported-by: Jiang Xin <jiangxin1@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[BR: CVE-2017-5667 BSC#1022541]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Li Qiang
bce1d2d825 audio: ac97: add exit function
Currently the ac97 device emulation doesn't have an exit function,
so hot unplugging this device will leak some memory. Add an exit
function to avoid this.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 58520052.4825ed0a.27a71.6cae@mx.google.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 12351a91da)
[BR: CVE-2017-5525 BSC#1020491]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Li Qiang
675e6f0918 audio: es1370: add exit function
Currently the es1370 device emulation doesn't have an exit function,
so hot unplugging this device will leak some memory. Add an exit
function to avoid this.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 585200c9.a968ca0a.1ab80.4c98@mx.google.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 069eb7b2b8)
[BR: CVE-2017-5526 BSC#1020589]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Li Qiang
362081fef5 virtio-gpu: fix memory leak in resource attach backing
In the resource attach backing function, 'res->iov' is allocated
every time the command is processed, which can lead to a memory
leak. This patch avoids this.

Signed-off-by: Li Qiang <liq3ea@gmail.com>
Message-id: 1483003721-65360-1-git-send-email-liq3ea@gmail.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 204f01b309)
[BR: CVE-2017-5578 BSC#1021481]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Li Qiang
f2b1fd78db virtio-gpu-3d: fix memory leak in resource attach backing
If the virgl_renderer_resource_attach_iov function fails, the
'res_iovs' will be leaked. Add a check of the return value and
free 'res_iovs' on failure.

Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1482999086-59795-1-git-send-email-liq3ea@gmail.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 33243031da)
[BR: CVE-2017-5552 BSC#1021195]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Li Qiang
fdd0da69cf watchdog: 6300esb: add exit function
When the Intel 6300ESB watchdog is hot unplugged, the timer allocated
in realize isn't freed, thus leaking memory. This patch avoids this
by adding an exit function.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-Id: <583cde9c.3223ed0a.7f0c2.886e@mx.google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit eb7a20a361)
[BR: CVE-2016-10155 BSC#1021129]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
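The last three commits all fix the same hot-unplug pattern; here is a minimal standalone sketch of it, with purely illustrative names and a plain malloc'ed buffer standing in for the timers and buffers the real devices allocate (this is not the QEMU device model):

    #include <stdlib.h>

    struct toy_device {
        void *dma_bounce_buf;   /* stands in for timers, buffers, etc. */
    };

    int toy_realize(struct toy_device *dev)
    {
        dev->dma_bounce_buf = malloc(4096);
        return dev->dma_bounce_buf ? 0 : -1;
    }

    /* The missing piece: without an exit/unrealize hook like this, every
     * hot-unplug leaks whatever realize allocated. */
    void toy_exit(struct toy_device *dev)
    {
        free(dev->dma_bounce_buf);
        dev->dma_bounce_buf = NULL;
    }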
Prasad J Pandit
c414074474 display: virtio-gpu-3d: check virgl capabilities max_size
While processing the 'VIRTIO_GPU_CMD_GET_CAPSET' command, the virtio
GPU device retrieves the maximum capabilities size to fill in the
response object. It continues to fill in capabilities even if the
retrieved 'max_size' is zero, thus resulting in an OOB access.
Add a check to avoid it.

Reported-by: Zhenhao Hong <zhenhaohong@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20161214070156.23368-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit abd7f08b23)
[BR: CVE-2016-10028 BSC#1017084 BSC#1016503]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Marc-André Lureau
c627235938 virtio-gpu: use VIRTIO_GPU_MAX_SCANOUTS
The value is defined in virtio_gpu.h already (changing from 4 to 16).

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1463653560-26958-6-git-send-email-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit acfc484650)
[BR: CVE-2016-10029 BSC#1017081 BSC#1016504]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Marc-André Lureau
3bb7554693 virtio-gpu: check max_outputs only
The scanout id should not be above the configured num_scanouts.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1463653560-26958-5-git-send-email-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 2fe760554e)
[BR: CVE-2016-10029 BSC#1017081 BSC#1016504]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:51 -06:00
Marc-André Lureau
b0561a36b7 virtio-gpu: check max_outputs value
The value must be less than VIRTIO_GPU_MAX_SCANOUT.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1463653560-26958-4-git-send-email-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2021-03-17 09:45:51 -06:00
Marc-André Lureau
fe30b844e0 virtio-gpu: check early scanout id
Check the scanout id before accessing the g->scanout array, in order
to avoid a potential out-of-bounds access.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1463653560-26958-2-git-send-email-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2021-03-17 09:45:50 -06:00
Bruce Rogers
6d0418f0b8 display: cirrus: ignore source pitch value as needed in blit_is_unsafe
Commit 4299b90 added a check which is too broad, given that the source
pitch value is not required to be initialized for solid fill operations.
This patch refines the blit_is_unsafe() check to ignore source pitch in
that case. After applying the above commit as a security patch, we
noticed the SLES 11 SP4 guest GUI failed to initialize properly.

Signed-off-by: Bruce Rogers <brogers@suse.com>
Message-id: 20170109203520.5619-1-brogers@suse.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 913a87885f)
[BR: BSC#1016779]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Prasad J Pandit
8b202eccff display: cirrus: check vga bits per pixel(bpp) value
In the Cirrus CLGD 54xx VGA emulator, if the cirrus graphics mode is VGA,
'cirrus_get_bpp' returns zero, which could lead to a divide by zero
error while copying pixel data. The same could occur via blit pitch
values. Add checks to avoid it.

Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1476776717-24807-1-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 4299b90e9b)
[BR: CVE-2016-9921 CVE-2016-9922 BSC#1014702 BSC#1015169]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Li Qiang
cbd8e4a39d virtio-gpu: fix information leak in capset get dispatch
The virgl_cmd_get_capset function uses g_malloc to allocate a
response struct for the guest. As the 'resp' struct hasn't been fully
initialized, it will leak the 'resp->padding' field to the guest.
Use g_malloc0 to avoid this.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
[BR: CVE-2016-9908 BSC#1014514]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
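A minimal standalone sketch of the information-leak pattern fixed above, using plain libc calloc() in place of g_malloc0(); the struct layout and names are illustrative, not the virtio-gpu protocol definitions:

    #include <stdint.h>
    #include <stdlib.h>
    #include <string.h>

    struct resp_capset {
        uint32_t capset_id;
        uint32_t padding;       /* never written explicitly */
        uint8_t  data[64];      /* may be only partially filled */
    };

    struct resp_capset *make_resp(uint32_t id, const void *caps, size_t n)
    {
        /* calloc() instead of malloc(): padding and any unwritten tail of
         * data[] are zero instead of stale heap bytes that would otherwise
         * be copied out to the guest. */
        struct resp_capset *resp = calloc(1, sizeof(*resp));
        if (!resp) {
            return NULL;
        }
        resp->capset_id = id;
        memcpy(resp->data, caps, n < sizeof(resp->data) ? n : sizeof(resp->data));
        return resp;
    }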
Li Qiang
0f945a9739 usbredir: free vm_change_state_handler in usbredir destroy dispatch
The usbredir destroy dispatch function doesn't free the VM change
state handler registered in the usbredir_realize function. This
leads to a memory leak. This patch avoids this.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 58216976.d0236b0a.77b99.bcd6@mx.google.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 07b026fd82)
[BR: CVE-2016-9907 BSC#1014109]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Li Qiang
64df184421 usb: ehci: fix memory leak in ehci_init_transfer
In the ehci_init_transfer function, if 'cpage' is bigger than 4,
the previously allocated 'p->sgl' isn't freed, thus leading to
a memory leak. This patch avoids this.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 5821c0f4.091c6b0a.e0c92.e811@mx.google.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 791f97758e)
[BR: CVE-2016-9911 BSC#1014111]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Andreas Färber
8c08aa438e tests: Add scsi-disk test
Test scsi-{disk,hd,cd} wwn properties for correct 64-bit parsing.

For now piggyback on virtio-scsi.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 09:45:50 -06:00
Andreas Färber
0705dbdeea tests: Add QOM property unit tests
Add a test for parsing and setting a uint64 property.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 09:45:50 -06:00
Andreas Färber
efade2087d test-string-input-visitor: Add uint64 test
Test parsing of decimal and hexadecimal uint64 numbers with most
significant bit set.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 09:45:50 -06:00
Andreas Färber
99e15fccd8 test-string-input-visitor: Add int test case
In addition to -42 also parse the maximum int64.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 09:45:50 -06:00
Andreas Färber
9ad861fe31 string-input-visitor: Fix uint64 parsing
All integers would get parsed by strtoll(), not handling the case of
UINT64 properties with the most significant bit set.

Implement a .type_uint64 visitor callback, reusing the existing
parse_str() code through a new argument, using strtoull().

As this is a bug fix, it intentionally ignores checkpatch warnings to
prefer the use of qemu_strto[u]ll() over strto[u]ll().

Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 09:45:50 -06:00
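A minimal standalone sketch of the libc behaviour behind this fix: values with the most significant bit set need strtoull(), since strtoll() clamps them to INT64_MAX. The parse_u64() helper is illustrative, not the QEMU visitor API:

    #include <errno.h>
    #include <inttypes.h>
    #include <stdio.h>
    #include <stdlib.h>

    int parse_u64(const char *s, uint64_t *out)
    {
        char *end;
        errno = 0;
        unsigned long long v = strtoull(s, &end, 0);   /* base 0: decimal or 0x hex */
        if (errno != 0 || end == s || *end != '\0') {
            return -1;
        }
        *out = v;
        return 0;
    }

    int main(void)
    {
        uint64_t v;
        if (parse_u64("0xfedcba9876543210", &v) == 0) {
            printf("%" PRIu64 "\n", v);   /* MSB set, still parsed correctly */
        }
        return 0;
    }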
Li Qiang
d5faa5014b virtio-gpu: call cleanup mapping function in resource destroy
If the guest destroys the resource before detaching the backing, the
'iov' and 'addrs' fields in the resource are not freed, thus leading
to a memory leak. This patch avoids this.

Signed-off-by: Li Qiang <liq3ea@gmail.com>
[BR: CVE-2016-9912 BSC#1014112]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Li Qiang
4193b2c4c4 9pfs: add cleanup operation for proxy backend driver
In the init operation of the proxy backend driver, it allocates a
V9fsProxy struct and some other resources. We should free these
resources when the 9pfs device is unrealized. This is what this
patch does.

Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 898ae90a44)
[BR: CVE-2016-9913 BSC#1014110]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Li Qiang
3d3ba2c657 9pfs: add cleanup operation for handle backend driver
In the init operation of the handle backend driver, it allocates a
handle_data struct and opens a mount file. We should free these
resources when the 9pfs device is unrealized. This is what this
patch does.

Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 971f406b77)
[BR: CVE-2016-9913 BSC#1014110]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Li Qiang
2513d4d946 9pfs: add cleanup operation in FileOperations
Currently, the backend of VirtFS doesn't have a cleanup
function. This will lead to resource leaks if the backend
driver allocates resources. This patch addresses this issue.

Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 702dbcc274)
[BR: CVE-2016-9913 BSC#1014110]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Li Qiang
ea0f6c7f2f 9pfs: adjust the order of resource cleanup in device unrealize
Unrealize should undo things that were set during realize, in
reverse order. So should the error path in realize.

Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 4774718e5c)
[BR: CVE-2016-9913 BSC#1014110]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Li Qiang
32fc43881a virtio-gpu: fix information leak in getting capset info dispatch
In the virgl_cmd_get_capset_info dispatch function, 'resp' isn't
fully initialized before being written to the guest. This will leak
the 'resp.padding' and 'resp.hdr.padding' fields to the guest. This
patch fixes this issue.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 5818661e.0860240a.77264.7a56@mx.google.com
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 42a8dadc74)
[BR: CVE-2016-9845 BSC#1013767]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Li Qiang
c6cbb0f717 virtio-gpu: fix memory leak in update_cursor_data_virgl
In the update_cursor_data_virgl function, if the 'width'/'height'
is not equal to the current cursor's width/height, it will return
without freeing the previously allocated 'data'. This leads to
a memory leak. This patch fixes this issue.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 58187760.41d71c0a.cca75.4cb9@mx.google.com
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 2d1cd6c7a9)
[BR: CVE-2016-9846 BSC#1013764]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Prasad J Pandit
037f8c1b57 net: mcf: check receive buffer size register value
The ColdFire Fast Ethernet Controller uses a receive buffer size
register (EMRBR) to hold the maximum size of all receive buffers.
It is set by a user before any operation. If it is set to zero, the
ColdFire emulator goes into an infinite loop while receiving data
in mcf_fec_receive. Add a check to avoid it.

Reported-by: Wjjzhang <wjjzhang@tencent.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 77d54985b8)
[BR: CVE-2016-9776 BSC#1013285]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Li Qiang
994f208dda 9pfs: fix memory leak in v9fs_xattrcreate
The code handling the 'fs.xattr.value' field in the V9fsFidState object
doesn't consider that this field may have been allocated previously; it
is simply allocated again each time. This leads to a host memory leak
if the client sends another Txattrcreate message with the same fid
number before the fid from the previous time got clunked.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
[groug, updated the changelog to indicate how the leak can occur]
Signed-off-by: Greg Kurz <groug@kaod.org>

(cherry picked from commit ff55e94d23)
[BR: CVE-2016-9102 BSC#1007450 BSC#1014256]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Li Qiang
8b7d62d7b8 9pfs: fix information leak in xattr read
9pfs uses g_malloc() to allocate the xattr memory space. If the guest
reads this memory before writing to it, this will leak host heap memory
to the guest. This patch avoids this.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit eb68760285)
[BR: CVE-2016-9103 BSC#1007454]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Peter Xu
76ca5e2f74 intel_iommu: fix incorrect device invalidate
"mask" needs to be inverted before use.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 6cb99acc28)
[BR: BSC#1013341]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Zhuang Yanying
a61d5bf04b ivshmem: Fix 64 bit memory bar configuration
Device ivshmem property use64=0 is designed to make the device
expose a 32 bit shared memory BAR instead of 64 bit one.  The
default is a 64 bit BAR, except pc-1.2 and older retain a 32 bit
BAR.  A 32 bit BAR can support only up to 1 GiB of shared memory.

This worked as designed until commit 5400c02 accidentally flipped
its sense: since then, we misinterpret use64=0 as use64=1 and vice
versa.  Worse, the default got flipped as well.  Devices
ivshmem-plain and ivshmem-doorbell are not affected.

Fix by restoring the test of IVShmemState member not_legacy_32bit
that got messed up in commit 5400c02.  Also update its
initialization for devices ivhsmem-plain and ivshmem-doorbell.
Without that, they'd regress to 32 bit BARs.

Cc: qemu-stable@nongnu.org
Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit be4e0d7375)
[BR: BSC#1013341]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Greg Kurz
0f43d4c080 vhost: adapt vhost_verify_ring_mappings() to virtio 1 ring layout
With virtio 1, the vring layout is split in 3 separate regions of
contiguous memory for the descriptor table, the available ring and the
used ring, as opposed with legacy virtio which uses a single region.

In case of memory re-mapping, the code ensures it doesn't affect the
vring mapping. This is done in vhost_verify_ring_mappings() which assumes
the device is legacy.

This patch changes vhost_verify_ring_mappings() to check the mappings of
each part of the vring separately.

This works for legacy mappings as well.

Cc: qemu-stable@nongnu.org
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit f1f9e6c596)
[BR: BSC#1013341]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Michael S. Tsirkin
82cf146f1e virtio-net: mark VIRTIO_NET_F_GSO as legacy
virtio 1.0 spec says this is a legacy feature bit,
hide it from guests in modern mode.

Note: for cross-version migration compatibility,
we keep the bit set in host_features.
The result will be that a guest migrating cross-version
will see host features change under it.
As guests only seem to read it once, this should
not be an issue. Meanwhile, will work to fix guests to
ignore this bit in virtio1 mode, too.

Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 2a083ffd2e)
[BR: BSC#1013341]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Michael S. Tsirkin
85160ca1a0 virtio: allow per-device-class legacy features
Legacy features are those that transitional devices only
expose on the legacy interface.
Allow different ones per device class.

Cc: qemu-stable@nongnu.org # dependency for the next patch
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 9b706dbbbb)
[BR: BSC#1013341 - Added variable vdc to virtio_device_class_init]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Max Reitz
2841b22ef3 block/curl: Do not wait for data beyond EOF
libcurl will only give us as much data as there is, not more. The block
layer will deny requests beyond the end of file for us; but since this
block driver is still using a sector-based interface, we can still get
in trouble if the file size is not a multiple of 512.

While we have already made sure not to attempt transfers beyond the end
of the file, we are currently still trying to receive data from there if
the original request exceeds the file size. This patch fixes this issue
and invokes qemu_iovec_memset() on the iovec's tail.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20161025025431.24714-5-mreitz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit 4e504535c1)
[BR: BSC#1013341]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Max Reitz
8f20f1971c block/curl: Remember all sockets
For some connection types (like FTP, generally), more than one socket
may be used (in FTP's case: control vs. data stream). As of commit
838ef60249 ("curl: Eliminate unnecessary
use of curl_multi_socket_all"), we have to remember all of the sockets
used by libcurl, but in fact we only did that for a single one. Since
one libcurl connection may use multiple sockets, however, we have to
remember them all.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20161025025431.24714-4-mreitz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
[BR: BSC#1013341 - also fixed adjacent line during conflict resolution]
(cherry picked from commit ff5ca1664a)
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Max Reitz
b22decf0f9 block/curl: Fix return value from curl_read_cb
While commit 38bbc0a580 is correct in that
the callback is supposed to return the number of bytes handled, what it
does not mention is that libcurl will throw an error if the callback did
not "handle" all of the data passed to it.

Therefore, if the callback receives some data that it cannot handle
(either because the receive buffer has not been set up yet or because it
would not fit into the receive buffer) and we have to ignore it, we
still have to report that the data has been handled.

Obviously, this should not happen normally. But it does happen at least
for FTP connections where some data (that we do not expect) may be
generated when the connection is established.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20161025025431.24714-3-mreitz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit 4e7676571b)
[BR: BSC#1013341]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Max Reitz
feefe1153d block/curl: Use BDRV_SECTOR_SIZE
Currently, curl defines its own constant SECTOR_SIZE. There is no
advantage over using the global BDRV_SECTOR_SIZE, so drop it.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20161025025431.24714-2-mreitz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit 9054d9f6b0)
[BR: BSC#1013341]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Paolo Bonzini
0bc7795218 mirror: use bdrv_drained_begin/bdrv_drained_end
Ensure that there are no changes between the last check to
bdrv_get_dirty_count and the switch to the target.

There is already a bdrv_drained_end call, we only need to ensure
that bdrv_drained_begin is not called twice.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1477565348-5458-4-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
(cherry picked from commit 9a0cec664e)
[BR: BSC#1013341]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Paolo Bonzini
f0ba116aec rbd: shift byte count as a 64-bit value
Otherwise, reads of more than 2GB fail.  Until commit
7bbca9e290, reads of 2^41
bytes succeeded at least theoretically.

In fact, pdiscard ought to receive a 64-bit integer as the
count for the same reason.

Reported by Coverity.

Fixes: 7bbca9e290
Cc: qemu-stable@nongnu.org
Cc: kwolf@redhat.com
Cc: eblake@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e948f663e9)
[BR: BSC#1013341]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
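A minimal standalone sketch of the overflow described above; SECTOR_BITS here is a local constant standing in for QEMU's BDRV_SECTOR_BITS, and the numbers are illustrative:

    #include <stdint.h>
    #include <stdio.h>

    #define SECTOR_BITS 9   /* 512-byte sectors */

    int main(void)
    {
        int nb_sectors = 8 * 1024 * 1024;   /* 2^23 sectors = 4 GiB of data */

        /* 'nb_sectors << SECTOR_BITS' would be evaluated in 32-bit int and
         * overflow; widening the count first keeps the full 64-bit byte count. */
        int64_t bytes = (int64_t)nb_sectors << SECTOR_BITS;

        printf("%lld\n", (long long)bytes);   /* 4294967296 */
        return 0;
    }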
Daniel P. Berrange
09935ed615 char: fix missing return in error path for chardev TLS init
If the qio_channel_tls_new_(server|client) methods fail,
we disconnect the client. Unfortunately a missing return
means we then go on to try and run the TLS handshake on
a NULL I/O channel. This gives predictably segfaulty
results.

The main way to trigger this is to request a bogus TLS
priority string for the TLS credentials. e.g.

  -object tls-creds-x509,id=tls0,priority=wibble,...

Most other ways appear impossible to trigger except
perhaps if OOM conditions cause gnutls initialization
to fail.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 660a2d83e0)
[BR: BSC#1013341]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
John Snow
bd94998ff9 block-backend: remove blk_flush_all
We can teach Xen to drain and flush each device as it needs to, instead
of trying to flush ALL devices. This removes the last user of
blk_flush_all.

The function is therefore removed under the premise that any new uses
of blk_flush_all would be the wrong paradigm: either flush the single
device that requires flushing, or use an appropriate flush_all mechanism
from outside of the BlkBackend layer.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 49137bf684)
[BR: BSC#1013341 - remove also from place we have patched]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
John Snow
8371a62192 qemu: use bdrv_flush_all for vm_stop et al
Reimplement bdrv_flush_all for vm_stop. In contrast to blk_flush_all,
bdrv_flush_all does not have device model restrictions. This allows
us to flush and halt unconditionally without error.

This allows us to do things like migrate when we have a device with
an open tray, but has a node that may need to be flushed, or nodes
that aren't currently attached to any device and need to be flushed.

Specifically, this allows us to migrate when we have a CDROM with
an open tray.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 22af08eacf)
[BR: BSC#1013341]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
John Snow
8f0b2c4c77 block: reintroduce bdrv_flush_all
Commit fe1a9cbc moved the flush_all routine from the bdrv layer to the
block-backend layer. In doing so, however, the semantics of the routine
changed slightly such that flush_all now used blk_flush instead of
bdrv_flush.

blk_flush can fail if the attached device model reports that it is not
"available," (i.e. the tray is open.) This changed the semantics of
flush_all such that it can now fail for e.g. open CDROM drives.

Reintroduce bdrv_flush_all to regain the old semantics without having to
alter the behavior of blk_flush or blk_flush_all, which are already
'doing the right thing.'

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 4085f5c7a2)
[BR: BSC#1013341 - slight tweak needed in backport]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Fam Zheng
6dab5279d9 virtio-blk: Remove stale comment about draining
This is stale after commit 6e40b3bf (virtio-blk: Use blk_drain() to
drain IO requests), remove it.

Suggested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 1470278654-13525-3-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 27d1b87688)
[BR: BSC#1013341]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Fam Zheng
b126341373 virtio-blk: Release s->rq queue at system_reset
At system_reset, there is no point in retrying the queued request,
because the driver that issued the request won't be around any more.

Analyzed-by: Laszlo Ersek <lersek@redhat.com>
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 1470278654-13525-2-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 26307f6aa4)
[BR: BSC#1013341]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Eric Blake
07ec4cdc26 rbd: Switch rbd_start_aio() to byte-based
The internal function converts to byte-based before calling into
RBD code; hoist the conversion to the callers so that callers
can then be switched to byte-based themselves.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1468624988-423-8-git-send-email-eblake@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 7bbca9e290)
[BR: BSC#1013341 - precursor to cherry picked e948f66]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Fam Zheng
8b60fe613a macio: Use blk_drain instead of blk_drain_all
We only care about the associated backend, so blk_drain is more
appropriate here.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20160612065603.21911-1-famz@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 0d0437aac6)
[BR: BSC#1013341]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Jan Beulich
641353d7d2 xen: fix ioreq handling
Avoid double fetches and bounds check size to avoid overflowing
internal variables.

This is CVE-2016-9381 / XSA-197.

Reported-by: yanghongke <yanghongke@huawei.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
(cherry picked from commit b85f9dfdb1)
[BR: BSC#1009109]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Alberto Garcia
6d03424206 gtk: don't leak the GtkBorder with VTE 0.36
When gtk_widget_style_get() is used to get the "inner-border" style
property, it returns a copy of the GtkBorder which must be freed by
the caller.

This patch also fixes a warning about the unused 'padding' structure
with VTE 0.36.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 1463127654-5171-1-git-send-email-berto@igalia.com
Cc: Cole Robinson <crobinso@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>

[ kraxel: adapted to changes in ui patch queue ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 6978dc4adc)
[BR: BSC#1008519]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Cole Robinson
bacc2ba8ac ui: gtk: Fix a runtime warning on vte >= 0.37
inner-border was dropped in vte API 2.91, in favor of the standard
padding style

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Message-id: 60a6cdc337d611d902f53907e66a8f37ea374d65.1462557436.git.crobinso@redhat.com

[ kraxel: Fix warning with old vte version. ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 84e2dc4bf3)
[BR: BSC#1008519]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Denis V. Lunev
68ed0f8c67 migration: fix inability to save VM after snapshot
The following sequence of operations fails:
    virsh start vm
    virsh snapshot-create vm
    virsh save vm --file file
with the following error
    error: Failed to save domain vm to file
    error: internal error: unable to execute QEMU command 'migrate':
    There's a migration process in progress

The problem is that qemu_savevm_state() calls migrate_init(), which sets
the migration state to MIGRATION_STATUS_SETUP and never cleans it up.
This patch does the job.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Amit Shah <amit.shah@redhat.com>
Message-Id: <1466003203-26263-1-git-send-email-den@openvz.org>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
(cherry picked from commit 6dcf66681a)
Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 09:45:50 -06:00
Alexander Graf
a32520ff4e target-i386: present virtual L3 cache info for vcpus
Some software algorithms are based on the hardware's cache info. For example,
in the x86 Linux kernel, when cpu1 wants to wake up a task on cpu2, cpu1 will
trigger a resched IPI and tell cpu2 to do the wakeup if they don't share a low
level cache. Otherwise, cpu1 will access cpu2's runqueue directly if they share
the LLC. The relevant linux-kernel code is shown below:

	static void ttwu_queue(struct task_struct *p, int cpu)
	{
		struct rq *rq = cpu_rq(cpu);
		......
		if (... && !cpus_share_cache(smp_processor_id(), cpu)) {
			......
			ttwu_queue_remote(p, cpu); /* will trigger RES IPI */
			return;
		}
		......
		ttwu_do_activate(rq, p, 0); /* access target's rq directly */
		......
	}

In real hardware, the cpus on the same socket share the L3 cache, so one won't
trigger resched IPIs when waking up a task on another. But QEMU doesn't present
virtual L3 cache info for the VM, so the Linux guest will trigger lots of RES
IPIs under some workloads, even if the virtual cpus belong to the same virtual
socket.

For KVM, there will be lots of vmexits due to the guest sending IPIs.
The workload is SAP HANA's testsuite; we run it for one round (about 40
minutes) and observe the amount of RES IPIs triggered in the (Suse11sp3)
guest during the period:
        No-L3           With-L3(applied this patch)
cpu0:	363890		44582
cpu1:	373405		43109
cpu2:	340783		43797
cpu3:	333854		43409
cpu4:	327170		40038
cpu5:	325491		39922
cpu6:	319129		42391
cpu7:	306480		41035
cpu8:	161139		32188
cpu9:	164649		31024
cpu10:	149823		30398
cpu11:	149823		32455
cpu12:	164830		35143
cpu13:	172269		35805
cpu14:	179979		33898
cpu15:	194505		32754
avg:	268963.6	40129.8

The VM's topology is "1*socket 8*cores 2*threads".
After presenting virtual L3 cache info for the VM, the amount of RES IPIs in
the guest is reduced by 85%.

For KVM, vcpus sending IPIs cause vmexits, which are expensive, so this can
cause severe performance degradation. We tested the overall system performance
with vcpus actually running on separate physical sockets. With the L3 cache,
the performance improves by 7.2%~33.1% (avg: 15.7%).

Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
[agraf: cherry picked from 14c985cffa]
[agraf: disabled by default]
Signed-off-by: Alexander Graf <agraf@suse.de>

Conflicts:
	include/hw/i386/pc.h
	target-i386/cpu.c
	target-i386/cpu.h

Conflicts:
	target-i386/cpu-qom.h
	target-i386/cpu.c
2021-03-17 09:45:50 -06:00
Radim Krčmář
d54dcb36bb target-i386: Implement CPUID[0xB] (Extended Topology Enumeration)
I looked at a dozen Intel CPUs that have this CPUID and all of them
always had the Core offset as 1 (a wasted bit when hyperthreading is
disabled) and the Package offset at least 4 (wasted bits at <= 4 cores).

QEMU uses more compact IDs and it doesn't make much sense to change it
now.  I keep the SMT and Core sub-leaves even if there is just one
thread/core;  it makes the code simpler and there should be no harm.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
(cherry picked from commit 5232d00a04)
[backported and disabled by default]
Signed-off-by: Alexander Graf <agraf@suse.de>

Conflicts:
	include/hw/i386/pc.h
	target-i386/cpu.h
2021-03-17 09:45:50 -06:00
P J P
69f9cb16de net: imx: limit buffer descriptor count
The i.MX Fast Ethernet Controller uses buffer descriptors to manage
data flow to/from the receive & transmit queues. While transmitting
packets, it could continue to read buffer descriptors if a buffer
descriptor has a length of zero and crafted values in bd.flags.
Set an upper limit on the number of buffer descriptors.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[BR: CVE-2016-7907 BSC#1002549]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
P J P
d0a3c9b8a2 dma: rc4030: limit interval timer reload value
The JAZZ RC4030 chipset emulator has a periodic timer and
associated interval reload register. The reload value is used
as divider when computing timer's next tick value. If reload
value is large, it could lead to divide by zero error. Limit
the interval reload value to avoid it.

Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[BR: CVE-2016-8667 BSC#1004702]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Li Qiang
c420acdb9f 9pfs: fix integer overflow issue in xattr read/write
The v9fs_xattr_read() and v9fs_xattr_write() are passed a guest
originated offset: they must ensure this offset does not go beyond
the size of the extended attribute that was set in v9fs_xattrcreate().
Unfortunately, the current code implement these checks with unsafe
calculations on 32 and 64 bit values, which may allow a malicious
guest to cause OOB access anyway.

Fix this by comparing the offset and the xattr size, which are
both uint64_t, before trying to compute the effective number of bytes
to read or write.

Suggested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-By: Guido Günther <agx@sigxcpu.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 7e55d65c56)
[BR: CVE-2016-9104 BSC#1007493]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
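A minimal standalone sketch of the safe length calculation described above: compare the guest-supplied offset against the xattr size in 64-bit arithmetic first, then derive the transfer length. The helper name and types are illustrative, not the 9pfs code:

    #include <stddef.h>
    #include <stdint.h>

    size_t xattr_copy_len(uint64_t xattr_size, uint64_t offset,
                          size_t requested)
    {
        if (offset >= xattr_size) {
            return 0;                         /* nothing left to read/write */
        }
        uint64_t remaining = xattr_size - offset;   /* cannot underflow now */
        return remaining < requested ? (size_t)remaining : requested;
    }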
Li Qiang
03941954b3 virtio-gpu: fix memory leak in virtio_gpu_resource_create_2d
In the virtio gpu resource create dispatch, if the pixman format is zero,
the previously allocated resource object isn't freed, thus leading to
a host memory leak. This patch avoids this.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 57df486e.8379240a.c3620.ff81@mx.google.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit cb3a0522b6)
[BR: CVE-2016-7994 BSC#1003613]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Prasad J Pandit
3784c99588 audio: intel-hda: check stream entry count during transfer
The Intel HDA emulator uses a stream of buffers during DMA data
transfers. Each entry has a buffer length and a buffer pointer
position, which are used to derive the number of bytes to 'copy'.
If this length and the buffer pointer were to be the same, 'copy'
could be set to zero, leading to an infinite loop. Add a check to
avoid it.

Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1476949224-6865-1-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 0c0fc2b5fd)
[BR: CVE-2016-8909 BSC#1006536]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Prasad J Pandit
d35c02c7ef net: rtl8139: limit processing of ring descriptors
The RTL8139 ethernet controller in C+ mode supports multiple
descriptor rings, each with a maximum of 64 descriptors. While
processing the transmit descriptor ring in 'rtl8139_cplus_transmit',
it does not limit the descriptor count and can run forever. Add a
check to avoid it.

Reported-by: Andrew Henderson <hendersa@icculus.org>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit c7c3591669)
[BR: CVE-2016-8910 BSC#1006538]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Prasad J Pandit
02b597b4e9 net: rocker: set limit to DMA buffer size
Rocker network switch emulator has test registers to help debug
DMA operations. While testing host DMA access, a buffer address
is written to register 'TEST_DMA_ADDR' and its size is written to
register 'TEST_DMA_SIZE'. When performing TEST_DMA_CTRL_INVERT
test, if DMA buffer size was greater than 'INT_MAX', it leads to
an invalid buffer access. Limit the DMA buffer size to avoid it.

Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 8caed3d564)
[BR: CVE-2016-8668 BSC#1004706]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Li Qiang
f5632e96e6 net: eepro100: fix memory leak in device uninit
The exit dispatch of the eepro100 network card device doesn't free
the 's->vmstate' field which was allocated in device realize, thus
leading to a host memory leak. This patch avoids this.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 2634ab7fe2)
[BR: CVE-2016-9101 BSC#1007391]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Prasad J Pandit
b70aa3477d net: pcnet: check rx/tx descriptor ring length
The AMD PC-Net II emulator has a set of control and status (CSR)
registers. Of these, CSR76 and CSR78 hold the receive and transmit
descriptor ring length respectively. This ring length could range
from 1 to 65535. Setting the ring length to zero leads to an infinite
loop in pcnet_rdra_addr() or pcnet_transmit(). Add a check to avoid it.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 34e29ce754)
[BR: CVE-2016-7909 BSC#1002557]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Prasad J Pandit
77165a6be4 char: serial: check divider value against baud base
The 16550A UART device uses an oscillator to generate a frequency
(the baud base), which decides the communication speed. This speed can
be changed by dividing it by a divider. If the divider is greater
than the baud base, the speed is set to zero, leading to a divide by
zero error. Add a check to avoid it.

Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1476251888-20238-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 3592fe0c91)
[BR: CVE-2016-8669 BSC#1004707]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
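A minimal standalone sketch of the guard described above: derive the line speed only when the divider is sane, so a later use of 'speed' as a divisor can never divide by zero. The helper name and parameters are illustrative, not the QEMU serial code:

    /* Reject a divider of zero or one larger than the baud base, either of
     * which would make 'speed' end up as zero. */
    int compute_speed(unsigned int baud_base, unsigned int divider,
                      unsigned int *speed)
    {
        if (divider == 0 || divider > baud_base) {
            return -1;
        }
        *speed = baud_base / divider;   /* always >= 1 here */
        return 0;
    }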
Li Qiang
5adda91422 9pfs: fix memory leak in v9fs_write
If an error occurs when marshalling the transfer length to the guest, the
v9fs_write() function doesn't free an IO vector, thus leading to a memory
leak. This patch fixes the issue.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
[groug, rephrased the changelog]
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit fdfcc9aeea)
[BR: CVE-2016-9106 BSC#1007495]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Li Qiang
a26c9d2717 9pfs: fix potential host memory leak in v9fs_read
The 9pfs read dispatch function doesn't free two QEMUIOVector
objects, thus causing a potential memory leak. This patch avoids this.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit e95c9a493a)
[BR: CVE-2016-8577 BSC#1003893]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Li Qiang
fce73a18ee 9pfs: fix memory leak in v9fs_link
The v9fs_link() function keeps a reference on the source fid object. This
causes a memory leak since the reference never goes down to 0. This patch
fixes the issue.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
[groug, rephrased the changelog]
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit 4c1586787f)
[BR: CVE-2016-9105 BSC#1007494]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Li Qiang
4709e66290 9pfs: allocate space for guest originated empty strings
If a guest sends an empty string parameter to any 9P operation, the current
code unmarshals it into a V9fsString equal to { .size = 0, .data = NULL }.

This is unfortunate because it can cause NULL pointer dereference to happen
at various locations in the 9pfs code. And we don't want to check str->data
everywhere we pass it to strcmp() or any other function which expects a
dereferenceable pointer.

This patch enforces the allocation of genuine C empty strings instead, so
callers don't have to bother.

Out of all v9fs_iov_vunmarshal() users, only v9fs_xattrwalk() checks if
the returned string is empty. It now uses v9fs_string_size() since
name.data cannot be NULL anymore.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
[groug, rewritten title and changelog,
 fix empty string check in v9fs_xattrwalk()]
Signed-off-by: Greg Kurz <groug@kaod.org>
(cherry picked from commit ba42ebb863)
[BR: CVE-2016-8578 BSC#1003894]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
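A minimal standalone sketch of the convention this commit enforces, using plain libc instead of the V9fsString helpers; the struct and function names are illustrative:

    #include <stdlib.h>
    #include <string.h>

    struct str { size_t size; char *data; };

    int unmarshal_string(struct str *out, const char *src, size_t len)
    {
        out->data = malloc(len + 1);          /* always allocate, even for len 0 */
        if (!out->data) {
            return -1;
        }
        memcpy(out->data, src, len);
        out->data[len] = '\0';                /* genuine empty string when len == 0 */
        out->size = len;
        return 0;
    }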
Gerd Hoffmann
68a6f6102c xhci: limit the number of link trbs we are willing to process
Needed to avoid running in circles forever in case the guest builds
an endless loop with link trbs.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Tested-by: P J P <ppandit@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1476096382-7981-1-git-send-email-kraxel@redhat.com
(cherry picked from commit 05f43d44e4)
[BR: CVE-2016-8576 BSC#1003878]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
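A minimal standalone sketch of the bounded walk described above; the limit value, struct, and names are illustrative, not the xHCI code:

    #include <stdbool.h>
    #include <stddef.h>

    #define LINK_LIMIT 32   /* arbitrary safety cap for chained link elements */

    struct trb {
        bool is_link;
        const struct trb *link;   /* next segment when is_link is true */
    };

    const struct trb *follow_links(const struct trb *t)
    {
        for (int hops = 0; t && t->is_link; hops++) {
            if (hops >= LINK_LIMIT) {
                return NULL;        /* give up instead of looping forever */
            }
            t = t->link;
        }
        return t;
    }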
Li Qiang
1c6f55c3d7 usb: ehci: fix memory leak in ehci_process_itd
While processing isochronous transfer descriptors (iTD), if the page
select (PG) field value is out of bounds, it will return. In this
situation the ehci's sg list is not freed, thus leading to a memory
leak. This patch avoids this.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit b16c129daf)
[BR: CVE-2016-7995 BSC#1003612]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Prasad J Pandit
375e22f2ec net: mcf: limit buffer descriptor count
The ColdFire Fast Ethernet Controller uses buffer descriptors to manage
data flow to/from the receive & transmit queues. While transmitting
packets, it could continue to read buffer descriptors if a buffer
descriptor has a length of zero and crafted values in bd.flags.
Set an upper limit on the number of buffer descriptors.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 070c4b92b8)
[BR: CVE-2016-7908 BSC#1002550]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Prasad J Pandit
71a1b6cd7b virtio: add check for descriptor's mapped address
The virtio back end uses a set of buffers to facilitate I/O operations.
If a buffer's size is too large, 'cpu_physical_memory_map' could return
a null address. This would result in a null dereference while un-mapping
descriptors. Add a check to avoid it.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
(cherry picked from commit 973e7170dd)
[BR: CVE-2016-7422 BSC#1000346]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Li Qiang
4724b56b0b usb:xhci:fix memory leak in usb_xhci_exit
If the xhci uses msix, it doesn't free the corresponding
memory, thus leading to a memory leak. This patch avoids this.

Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 57d7d2e0.d4301c0a.d13e9.9a55@mx.google.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit b53dd4495c)
[BR: CVE-2016-7466 BSC#1000345]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Prasad J Pandit
77436bbff3 vmsvga: correct bitmap and pixmap size checks
When processing svga command DEFINE_CURSOR in vmsvga_fifo_run,
the computed BITMAP and PIXMAP size are checked against the
'cursor.mask[]' and 'cursor.image[]' array sizes in bytes.
Correct these checks to avoid OOB memory access.

Reported-by: Qinghao Tang <luodalongde@gmail.com>
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1473338754-15430-1-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 167d97a3de)
[BR: CVE-2016-7170 BSC#998516]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
chaojianhu
52d90475ae hw/net: Fix a heap overflow in xlnx.xps-ethernetlite
The .receive callback of xlnx.xps-ethernetlite doesn't check the length
of the data before calling memcpy. As a result, the NetClientState object
on the heap will be overflowed. All versions of qemu with
xlnx.xps-ethernetlite are affected.

Reported-by: chaojianhu <chaojianhu@hotmail.com>
Signed-off-by: chaojianhu <chaojianhu@hotmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit a0d1cbdacf)
[BR: CVE-2016-7161 BSC#1001151]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
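A minimal standalone sketch of the missing length check described above: validate the incoming frame length before memcpy'ing it into a fixed-size receive buffer. The buffer size and names are illustrative, not the device model's layout:

    #include <stddef.h>
    #include <stdint.h>
    #include <string.h>

    #define RXBUF_SIZE 0x800   /* illustrative fixed receive buffer size */

    struct rx_state {
        uint8_t rxbuf[RXBUF_SIZE];
        size_t  rxlen;
    };

    int rx_frame(struct rx_state *s, const uint8_t *frame, size_t len)
    {
        if (len > RXBUF_SIZE) {
            return -1;          /* drop oversized frames instead of overflowing */
        }
        memcpy(s->rxbuf, frame, len);
        s->rxlen = len;
        return 0;
    }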
Alexander Graf
7cbcd176dd ARM: KVM: Enable in-kernel timers with user space gic
When running with KVM enabled, you can choose between emulating the
gic in kernel or user space. If the kernel supports in-kernel virtualization
of the interrupt controller, it will default to that. If not, it will
default to user space emulation.

Unfortunately when running in user mode gic emulation, we miss out on
timer events which are only available from kernel space. This patch leverages
the new kernel/user space notification mechanism for those timer events.

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 09:45:50 -06:00
Prasad J Pandit
d0a9c75825 scsi: pvscsi: avoid infinite loop while building SG list
In PVSCSI paravirtual SCSI bus, pvscsi_convert_sglist can take a very
long time or go into an infinite loop due to two different bugs:

1) the request descriptor data length is defined to be 64 bit. While
building SG list from a request descriptor, it gets truncated to 32bit
in routine 'pvscsi_convert_sglist'. This could lead to an infinite loop
situation for large 'dataLen' values, when data_length is cast to uint32_t
and chunk_size becomes always zero.  Fix this by removing the incorrect
cast.

2) pvscsi_get_next_sg_elem can be called arbitrarily many times if the
element has a zero length.  Get out of the loop early when this happens,
by introducing an upper limit on the number of SG list elements.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[BR: CVE-2016-7156 BSC#997859]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Li Qiang
8605f6b617 net: vmxnet: initialise local tx descriptor
In the Vmxnet3 device emulator, while processing the transmit (tx) queue,
when it reaches the end of a packet it calls vmxnet3_complete_packet.
There, the local 'txcq_descr' object is not initialised, which could
leak host memory bytes to a guest.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[BR: CVE-2016-6836 BSC#994760]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:50 -06:00
Li Qiang
0caf75af72 net: vmxnet3: check for device_active before write
The Vmxnet3 device emulator does not check if the device is active
before using it for a write. This leads to a use after free if the
vmxnet3_io_bar0_write routine is called after the device is
deactivated. Add a check to avoid it.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Acked-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 6c352ca9b4)
[BR: CVE-2016-6888 BSC#994771]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Prasad J Pandit
9aa39fe506 virtio: check vring descriptor buffer length
The virtio back end uses a set of buffers to facilitate I/O operations.
An infinite loop unfolds in virtqueue_pop() if a buffer is of zero
size. Add a check to avoid it.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 1e7aed7014)
[BR: CVE-2016-6490 BSC#991466]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Olaf Hering
afe3c5e6bb xen_platform: unplug also SCSI disks
References: bnc#953518 - disks added via SCSI controller are visible twice on HVM XEN guest systems

Using 'vdev=sd[a-o]' will create an emulated LSI controller, which can
be used by the emulated BIOS to boot from disk. If the HVM domU has also
PV driver the disk may appear twice in the guest. To avoid this an
unplug of the emulated hardware is needed, similar to what is done for
IDE and NIC drivers already.

Since the SCSI controller provides only disks the entire controller can
be unplugged at once.

Impact of the change for classic and pvops based guest kernels:

 vdev=sda:disk0
before: pvops:   disk0=pv xvda + emulated sda
        classic: disk0=pv sda  + emulated sdq
after:  pvops:   disk0=pv xvda
        classic: disk0=pv sda

 vdev=hda:disk0, vdev=sda:disk1
before: pvops:   disk0=pv xvda
                 disk1=emulated sda
        classic: disk0=pv hda
                 disk1=pv sda  + emulated sdq
after:  pvops:   disk0=pv xvda
                 disk1=not accessible by blkfront, index hda==index sda
        classic: disk0=pv hda
                 disk1=pv sda

 vdev=hda:disk0, vdev=sda:disk1, vdev=sdb:disk2
before: pvops:   disk0=pv xvda
                 disk1=emulated sda
                 disk2=pv xvdb + emulated sdb
        classic: disk0=pv hda
                 disk1=pv sda  + emulated sdq
                 disk2=pv sdb  + emulated sdr
after:  pvops:   disk0=pv xvda
                 disk1=not accessible by blkfront, index hda==index sda
                 disk2=pv xvdb
        classic: disk0=pv hda
                 disk1=pv sda
                 disk2=pv sda

Signed-off-by: Olaf Hering <olaf@aepfle.de>
2021-03-17 09:45:49 -06:00
Juergen Gross
211e334e0e xen: use a common function for pv and hvm guest backend register calls
Instead of calling xen_be_register() for each supported backend type
for hvm and pv guests in their machine init functions use a common
function in order not to have to add new backends twice.

This at once fixes the error that hvm domains couldn't use the qusb
backend.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Message-id: 1470119552-16170-1-git-send-email-jgross@suse.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 0e39bb022b)
[BR: BSC#991785]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Bruce Rogers
31d186d6e3 qemu-bridge-helper: reduce security profile
Change from using glib alloc and free routines to those
from libc. Also, as a safety measure, drop privileges to the
user if configured with no-caps.
[BR: BOO#988279]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Denis V. Lunev
828a800a9e qcow2: avoid extra flushes in qcow2
The problem with excessive flushing was found by a couple of performance
tests:
  - parallel directory tree creation (from 2 processes)
  - 32 cached writes + fsync at the end in a loop

For the first one results improved from 2.6 loops/sec to 3.5 loops/sec.
Each loop creates 10^3 directories with 10 files in each.

For the second one results improved from ~600 fsync/sec to ~1100
fsync/sec. Though, it was run on SSD so it probably won't show such
performance gain on rotational media.

qcow2_cache_flush() calls bdrv_flush() unconditionally after writing
cache entries of a particular cache. This can lead to as many as
2 additional fdatasyncs inside bdrv_flush.

We can simply skip all fdatasync calls inside qcow2_co_flush_to_os
as bdrv_flush for sure will do the job. These flushes are necessary to
keep the right order of writes to the different caches. Though this is
not necessary in the current code base as this ordering is ensured through
the flush in qcow2_cache_flush_dependency().

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Pavel Borzenkov <pborzenkov@virtuozzo.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit f3c3b87dae)
[BR: BSC#991296]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Juergen Gross
cb5a204f5f xen: drain submit queue in xen-usb before removing device
When unplugging a device in the Xen pvusb backend drain the submit
queue before deallocation of the control structures. Otherwise there
will be bogus memory accesses when I/O contracts are finished.

Correlated to this issue is the handling of cancel requests: a cancelled
packet will still lead to a call of complete, so add a flag to the request
indicating it should just be dropped on complete.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Juergen Gross
ba8a703b6e xen: when removing a backend don't remove many of them
When a Xenstore watch fires indicating a backend has to be removed
don't remove all backends for that domain with the specified device
index, but just the one which has the correct type.

The easiest way to achieve this is to use the already determined
xendev as parameter for xen_be_del_xendev() instead of only the domid
and device index.

This at once removes the open coded QTAILQ_FOREACH_SAVE() in
xen_be_del_xendev() as there is no need to search for the correct
xendev any longer.

Signed-off-by: Juergen Gross <jgross@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Ard Biesheuvel
14e0c8b0d0 hw/arm/virt: mark the PCIe host controller as DMA coherent in the DT
Since QEMU performs cacheable accesses to guest memory when doing DMA
as part of the implementation of emulated PCI devices, guest drivers
should use cacheable accesses as well when running under KVM. Since this
essentially means that emulated PCI devices are DMA coherent, set the
'dma-coherent' DT property on the PCIe host controller DT node.
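
As an illustration only (a sketch, not necessarily the exact patch),
setting such an empty boolean property from QEMU's flattened-DT builder
could look like this, assuming the qemu_fdt_setprop() helper and a node
path/variable of the kind used by the virt machine:

    /* Mark the PCIe host controller node as DMA coherent; an empty
     * ("boolean") DT property is written with a zero-length value. */
    qemu_fdt_setprop(fdt, nodename, "dma-coherent", NULL, 0);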

This brings the DT description into line with the ACPI description,
which already marks the PCI bridge as cache coherent (see commit
bc64b96c98).

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 1467134090-5099-1-git-send-email-ard.biesheuvel@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 5d636e21c4)
Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 09:45:49 -06:00
Paolo Bonzini
b11fef76b5 scsi: esp: fix migration
Commit 926cde5 ("scsi: esp: make cmdbuf big enough for maximum CDB size",
2016-06-16) changed the size of a migrated field.  Split it in two
parts, and only migrate the second part in a new vmstate version.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit cc96677469)
[BR: CVE-2016-6351 BSC#990835]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Cole Robinson
c6190f8067 configure: support vte-2.91
vte >= 0.37 exposes API version 2.91, which is where all the active
development is. qemu builds and runs fine with that version, so use it
if it's available.

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Message-id: b4f0375647f7b368d3dbd3834aee58cb0253566a.1462557436.git.crobinso@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit c6feff9e09)
[BR: BSC#988855]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Cole Robinson
5449860df5 configure: add echo_version helper
Simplifies printing library versions, depending on whether the library
was even found.

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Message-id: 3c9ab16123e06bb4109771ef6ee8acd82d449ba0.1462557436.git.crobinso@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 02d34f62fd)
[BR: BSC#988855]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Olaf Hering
897e4d7d70 xen: SUSE xenlinux unplug for emulated PCI
Implement SUSE specific unplug protocol for emulated PCI devices
in PVonHVM guests
(bsc#953339, bsc#953362, bsc#953518, bsc#984981)

Signed-off-by: Olaf Hering <ohering@suse.de>
2021-03-17 09:45:49 -06:00
Gerd Hoffmann
b5e76f284f vnc: add configurable keyboard delay
Limits the rate at which kbd events from the vnc server are forwarded to
the guest, so input devices which are typically low-bandwidth can keep
up even with bulky input.

v2: update documentation too.
v3: spell fixes.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Yang Hongyang <hongyang.yang@easystack.cn>
Message-id: 1464762150-25817-1-git-send-email-kraxel@redhat.com
(cherry picked from commit c5ce833344)
[BR: BSC#974914]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Juergen Gross
c176cbcc15 xen: move xen_sysdev to xen_backend.c
Commit 9432e53a5b added xen_sysdev as a
system device to serve as an anchor for removable virtual buses. This
introduced a build failure for non-x86 builds with CONFIG_XEN_BACKEND
set, as xen_sysdev was defined in an x86-specific file while being
consumed in an architecture independent source.

Move the xen_sysdev definition and initialization to xen_backend.c to
avoid the build failure.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
2021-03-17 09:45:49 -06:00
Juergen Gross
2db9f0f054 xen: add pvUSB backend
Add a backend for para-virtualized USB devices for xen domains.

The backend is using host-libusb to forward USB requests from a
domain via libusb to the real device(s) passed through.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Message-id: 1463062421-613-4-git-send-email-jgross@suse.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 816ac92ef7)
[BR: FATE#316612]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Juergen Gross
c4aed61f4f xen: write information about supported backends
Add a Xenstore directory for each supported pv backend. This will allow
Xen tools to decide which backend type to use in case there are
multiple possibilities.

The information is added under
/local/domain/<backend-domid>/device-model/<domid>/backends
before the "running" state is written to Xenstore. Using a directory
for each backend enables us to add parameters for specific backends
in the future.

This interface is documented in the Xen source repository in the file
docs/misc/qemu-backends.txt

In order to reuse the Xenstore directory creation already present in
hw/xen/xen_devconfig.c move the related functions to
hw/xen/xen_backend.c where they fit better.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Message-id: 1463062421-613-3-git-send-email-jgross@suse.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 637c53ffcb)
[BR: FATE#316612]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Juergen Gross
a149fc97b8 xen: introduce dummy system device
Introduce a new dummy system device serving as parent for virtual
buses. This will enable new pv backends to introduce virtual buses
which are removable again opposed to system buses which are meant
to stay once added.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Message-id: 1463062421-613-2-git-send-email-jgross@suse.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 9432e53a5b)
[BR: FATE#316612]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Chunyan Liu
45dfcfa2ee fix xen hvm direct kernel boot
Since commit a1666142 ("acpi-build: make ROMs RAM blocks resizeable"),
xen HVM direct kernel boot has been failing. Xen HVM direct kernel boot
inserts a linuxboot.bin or multiboot.bin into /genroms. Before this
commit, acpi_setup only needed 0x20000 bytes for the
linuxboot.bin/multiboot.bin rom; after the commit, it reserves 16x that
size for resizing, that is 0x200000 bytes. This causes xen_ram_alloc to
fail because it runs out of memory.

To resolve it, either:
1. keep using the original rom size instead of the max size, i.e. don't
   reserve 16x the size.
2. increase guest maxmem. (commit c1d322e6 "xen-hvm: increase maxmem
   before calling xc_domain_populate_physmap" solved the problem for a
   time, by accident. But it was then reverted in commit ffffbb369 due to
   another problem.)

For 2, more discussion is needed about how to do it. So this patch tries 1,
using an unresizable rom size in the xen case in rom_set_mr.

[CYL: BSC#970791]

Signed-off-by: Chunyan Liu <cyliu@suse.com>
2021-03-17 09:45:49 -06:00
Olaf Hering
8c2f8971bf Split large discard requests from block frontend
Large discard requests lead to sign expansion errors in qemu.
Since there is no API to tell a guest about the limitations, qemu
has to split a large request itself.
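
Schematically, the splitting boils down to a loop like the following
sketch (illustration only; the chunk limit and helper name are made up):

    /* Split one large discard into chunks the backend can accept.
     * MAX_DISCARD_SECTORS and do_discard() are illustrative placeholders. */
    while (nb_sectors > 0) {
        int n = MIN(nb_sectors, MAX_DISCARD_SECTORS);

        if (do_discard(sector_num, n) < 0) {
            break;              /* propagate the error for this chunk */
        }
        sector_num += n;
        nb_sectors -= n;
    }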

[bsc#964427]

Signed-off-by: Olaf Hering <olaf@aepfle.de>
2021-03-17 09:45:49 -06:00
Bruce Rogers
86e2a3e4ce xen_disk: Add suse specific flush disable handling and map to QEMU equiv
Add code to read the suse specific suse-diskcache-disable-flush flag out
of xenstore, and set the equivalent flag within QEMU.

Patch taken from Xen's patch queue, Olaf Hering being the original author.
See https://bugzilla.novell.com/show_bug.cgi?id=879425

Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Chunyan Liu
481ff5facd Fix tigervnc long press issue
Using xen tools 'xl vncviewer' with tigervnc (the default on SLE-12),
we found that the guest display is not as expected while keeping a key
pressed. We expect the same character to be printed multiple times, but
it is printed only once. This happens on a PV guest in text mode.

After debugging, we found that tigervnc sends repeated key down events
in this case, to differentiate this from the user pressing the same key
many times. The VNC server only prints the character when it finally
receives the key up event.

To solve this issue, this patch tries to add additional key up event
before the next repeated key down event (if the key is not a control
key).

[CYL: BSC#882405]
Signed-off-by: Chunyan Liu <cyliu@suse.com>
2021-03-17 09:45:49 -06:00
Alexander Graf
89a3a660c2 dictzip: Fix on big endian systems
The dictzip code in SLE11 received some treatment over time to support
running on big endian hosts. Somewhere in the transition to SLE12 this
support got lost. Add it back in again from the SLE11 code base.

Furthermore while at it, fix up the debug prints to not emit warnings.

[AG: BSC#937572]
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 09:45:49 -06:00
Bruce Rogers
f3c6be491e qtest: Increase socket timeout to accommodate unpredictable build service
After encountering make check errors for some time, it appears that the
problem is the occasional socket timeout, which is probably due to a
combination of a slow and overloaded build host in the build service.

Increase the timeout from 5 to 15 seconds and hope we don't see the
problem again.

Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Andreas Färber
4a35aee408 acpi_piix4: Fix migration from SLE11 SP2
qemu-kvm 0.15 uses the same GPE format as qemu 1.4, but as version 2
rather than 3.

Addresses part of BNC#812836.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 09:45:49 -06:00
Andreas Färber
1541f1a92f i8254: Fix migration from SLE11 SP2
qemu-kvm 0.15 had a VMSTATE_UINT32(flags, PITState) field that
qemu 1.4 does not have.

Addresses part of BNC#812836.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 09:45:49 -06:00
Andreas Färber
a6b493a209 vga: Raise VRAM to 16 MiB for pc-0.15 and below
qemu-kvm.git commit a7fe0297840908a4fd65a1cf742481ccd45960eb
(Extend vram size to 16MB) deviated from qemu.git since kvm-61, and only
in commit 9e56edcf8d (vga: raise default
vgamem size) did qemu.git adjust the VRAM size for v1.2.

Add compatibility properties so that up to and including pc-0.15 we
maintain migration compatibility with qemu-kvm rather than QEMU and
from pc-1.0 on with QEMU (last qemu-kvm release was 1.2).

Addresses part of BNC#812836.

Signed-off-by: Andreas Färber <afaerber@suse.de>
[BR: adjust comma position in list in macro for v2.5.0 compat]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Alexander Graf
60f455bd98 vnc: provide fake color map
Our current VNC code does not handle color maps (aka non-true-color) at all
and aborts if a client requests them. There are 2 major issues with this:

 1) A VNC viewer on an 8-bit X11 system may request color maps
 2) RealVNC _always_ starts requesting color maps, then moves on to full color

In order to support these 2 use cases, let's just create a fake color map
that covers exactly our normal true color 8 bit color space. That way we don't
lose anything over a client that wants true color.

Reported-by: Sascha Wehnert <swehnert@suse.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 09:45:49 -06:00
Bruce Rogers
8e918405d0 increase x86_64 physical bits to 42
Allow for guests with higher amounts of ram. The current thought
is that 2TB specified on qemu commandline would be an appropriate
limit. Note that this requires the next higher bit value since
the highest address is actually more than 2TB due to the pci
memory hole.

Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 09:45:49 -06:00
Andreas Färber
6257bea8ea Raise soft address space limit to hard limit
For SLES we want users to be able to use large memory configurations
with KVM without fiddling with ulimit -Sv.
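
The mechanism amounts to raising the RLIMIT_AS soft limit to the hard
limit with the standard POSIX calls; a minimal sketch (not the exact
patch):

    #include <sys/resource.h>

    static void raise_address_space_limit(void)
    {
        struct rlimit rl;

        if (getrlimit(RLIMIT_AS, &rl) == 0 && rl.rlim_cur != rl.rlim_max) {
            rl.rlim_cur = rl.rlim_max;  /* soft limit up to the hard limit */
            setrlimit(RLIMIT_AS, &rl);
        }
    }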

Signed-off-by: Andreas Färber <afaerber@suse.de>
[BR: add include for sys/resource.h]
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Dinar Valeev
842fa6fe35 configure: Enable PIE for ppc and ppc64 hosts
Signed-off-by: Dinar Valeev <dvaleev@suse.com>
[AF: Rebased for v1.7]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 09:45:49 -06:00
Bruce Rogers
c266e6c105 virtfs-proxy-helper: Provide __u64 for broken sys/capability.h
Fixes the build on SLE 11 SP2.

[AF: Extend to ppc64]
2021-03-17 09:45:49 -06:00
Alexander Graf
8d7eaa9d50 linux-user: lseek: explicitly cast non-set offsets to signed
When doing lseek, SEEK_SET indicates that the offset is an unsigned variable.
Other seek types have parameters that can be negative.

When converting from 32bit to 64bit parameters, we need to take this into
account and enable SEEK_END and SEEK_CUR to be negative, while SEEK_SET stays
absolutely positioned, which we need to keep unsigned.
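
A sketch of the distinction (illustrative only; variable names are made
up):

    /* 32bit guest offset arriving in a 64bit syscall argument. */
    off_t host_offset;

    if (whence == SEEK_SET) {
        host_offset = (uint32_t)guest_offset;  /* absolute: keep unsigned */
    } else {
        host_offset = (int32_t)guest_offset;   /* SEEK_CUR/SEEK_END: signed */
    }
    ret = lseek(fd, host_offset, whence);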

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 09:45:49 -06:00
Alexander Graf
65b5791770 Make char muxer more robust wrt small FIFOs
Virtio-Console can only process one character at a time. Using it on S390
gave me strange "lags" where, on each key press, I got the character I had
pressed before. So I typed in "abc" and only received "a", then pressed "d"
but the guest received "b" and so on.

While the stdio driver calls a poll function that simply processes its
queue in case virtio-console can't take multiple characters at once, the
muxer does not have such callbacks, so it can't empty its queue.

To work around that limitation, I introduced a new timer that only gets
active when the guest can not receive any more characters. In that case
it polls again after a while to check if the guest is now receiving input.

This patch fixes input when using -nographic on s390 for me.
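
A rough sketch of such a retry timer using QEMU's timer API (the mux
helper names and type are placeholders, not the actual functions):

    /* Poll again later once the guest could not accept more characters. */
    static void mux_kick_cb(void *opaque)
    {
        MuxDriver *d = opaque;       /* placeholder type */
        mux_try_flush_queue(d);      /* placeholder: retry the queued bytes */
    }

    /* created once:
     *   d->kick_timer = timer_new_ms(QEMU_CLOCK_REALTIME, mux_kick_cb, d);
     * armed whenever the frontend refuses input (can_read() == 0): */
    timer_mod(d->kick_timer, qemu_clock_get_ms(QEMU_CLOCK_REALTIME) + 10);
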
2021-03-17 09:45:49 -06:00
Alexander Graf
86406951b4 console: add question-mark escape operator
Some termcaps (found using SLES11SP1) use [? sequences. According to man
console_codes (http://linux.die.net/man/4/console_codes) the question mark
is a nop and should simply be ignored.

This patch does exactly that, rendering screen output readable when
outputting guest serial consoles to the graphical console emulator.

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 09:45:49 -06:00
Alexander Graf
6409545618 Legacy Patch kvm-qemu-preXX-dictzip3.patch 2021-03-17 09:45:49 -06:00
Alexander Graf
046e48c18c block: Add tar container format
Tar is a very widely used format to store data in. Sometimes people even put
virtual machine images in there.

So it makes sense for qemu to be able to read from tar files. I implemented
a reader written from scratch that also knows about the GNU sparse format,
which is what pigz creates.

This version checks for filenames that end on well-known extensions. The logic
could be changed to search for filenames given on the command line, but that
would require changes to more parts of qemu.

The tar reader in conjunction with dzip gives us the chance to download
tar'ed up virtual machine images (even via http) and instantly make use of
them.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Bruce Rogers <brogers@novell.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
[TH: Use bdrv_open options instead of filename]
Signed-off-by: Tim Hardeck <thardeck@suse.de>
[AF: bdrv_file_open got an Error **errp argument, bdrv_delete -> brd_unref]
[AF: qemu_opts_create_nofail() -> qemu_opts_create(),
     bdrv_file_open() -> bdrv_open(), based on work by brogers]
[AF: error_is_set() dropped for v2.1.0-rc0]
[AF: BlockDriverAIOCB -> BlockAIOCB,
     BlockDriverCompletionFunc -> BlockCompletionFunc,
     qemu_aio_release() -> qemu_aio_unref(),
     drop tar_aio_cancel()]
[AF: common-obj-y -> block-obj-y, drop probe hook (bsc#945778)]
[AF: Drop bdrv_open() drv parameter for 2.5]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Alexander Graf
ba1034c180 block: Add support for DictZip enabled gzip files
DictZip is an extension to the gzip format that allows random seeks in gzip
compressed files by cutting the file into pieces and storing the piece offsets
in the "extra" header of the gzip format.

Thanks to that extension, we can use gzip compressed files as block backend,
though only in read mode.

This makes a lot of sense when stacked with tar files that can then be shipped
to VM users. If a VM image is inside a tar file that is inside a DictZip
enabled gzip file, the user can run the tar.gz file as is without having to
extract the image first.

Tar patch follows.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Bruce Rogers <brogers@novell.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
[TH: Use bdrv_open options instead of filename]
Signed-off-by: Tim Hardeck <thardeck@suse.de>
[AF: Error **errp added for bdrv_file_open, bdrv_delete -> bdrv_unref]
[AF: qemu_opts_create_nofail() -> qemu_opts_create(),
     bdrv_file_open() -> bdrv_open(), based on work by brogers]
[AF: error_is_set() dropped for v2.1.0-rc0]
[AF: BlockDriverAIOCB -> BlockAIOCB,
     BlockDriverCompletionFunc -> BlockCompletionFunc,
     qemu_aio_release() -> qemu_aio_unref(),
     drop dictzip_aio_cancel()]
[AF: common-obj-y -> block-obj-y, drop probe hook (bsc#945778)]
[AF: Drop bdrv_open() drv parameter for 2.5]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Alexander Graf
32f99db7b2 linux-user: use target_ulong
Linux syscalls pass pointers or data length or other information of that sort
to the kernel. This is all stuff you don't want to have sign extended.
Otherwise a 32bit size parameter widened into a 64bit host variable gets
sign extended to a negative number, breaking lseek for example.

Pass syscall arguments as ulong always.

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 09:45:49 -06:00
Alexander Graf
4500a4fad0 linux-user: add more blk ioctls
Implement a few more ioctls that operate on block devices.

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 09:45:49 -06:00
Andreas Färber
2134d394c0 vnc: password-file= and incoming-connections=
TBD (from SUSE Studio team)
2021-03-17 09:45:49 -06:00
Andreas Färber
68c13126c3 slirp: -nooutgoing
TBD (from SUSE Studio team)
2021-03-17 09:45:49 -06:00
Alexander Graf
01a2b0191c linux-user: XXX disable fiemap
agraf: fiemap breaks in libarchive. Disable it for now.
2021-03-17 09:45:49 -06:00
Alexander Graf
4c8ed9e947 linux-user: implement FS_IOC_SETFLAGS ioctl
Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 09:45:49 -06:00
Alexander Graf
eb53b0d156 linux-user: implement FS_IOC_GETFLAGS ioctl
Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 09:45:49 -06:00
Alexander Graf
af27d226a7 linux-user: Fake /proc/cpuinfo
Fedora 17 for ARM reads /proc/cpuinfo and fails if it doesn't contain
ARM related contents. This patch implements a quick hack to expose real
/proc/cpuinfo data taken from a real world machine.

The real fix would be to generate at least the flags automatically based
on the selected CPU. Please do not submit this patch upstream until this
has happened.

Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Rebased for v1.6 and v1.7]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 09:45:49 -06:00
Alexander Graf
7ff1746c34 linux-user: lock tb flushing too
Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Rebased onto exec.c/translate-all.c split for 1.4]
[AF: Rebased onto tb_alloc() changes for v2.5.0-rc0]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Alexander Graf
0a8dafe97c linux-user: Run multi-threaded code on a single core
Running multi-threaded code can easily expose some of the fundamental
breakages in QEMU's design. It's just not a well supported scenario.

So if we pin the whole process to a single host CPU, we guarantee that
we will never have concurrent memory access actually happen. We can still
get scheduled away at any time, so it's not a complete guarantee, but apparently
it reduces the odds well enough to get my test cases to pass.

This gets Java 1.7 working for me again on my test box.
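
Pinning the whole process to one host CPU uses the standard Linux
affinity API; a minimal sketch (illustration only, pinning to CPU 0):

    #define _GNU_SOURCE
    #include <sched.h>

    static void pin_to_single_cpu(void)
    {
        cpu_set_t set;

        CPU_ZERO(&set);
        CPU_SET(0, &set);                         /* everything on CPU 0 */
        sched_setaffinity(0, sizeof(set), &set);  /* 0 == this process */
    }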

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 09:45:49 -06:00
Alexander Graf
6659286640 linux-user: lock tcg
The tcg code generator is not thread safe. Lock its generation between
different threads.

Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Rebased onto exec.c/translate-all.c split for 1.4]
[AF: Rebased for v2.1.0-rc0]
[AF: Rebased onto tcg_gen_code_common() drop for v2.5.0-rc0]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Alexander Graf
2b2a11358f linux-user: Ignore broken loop ioctl
During invocations of losetup, we run into an ioctl that doesn't
exist. However, because of that we output an error, which then
screws up the kiwi logic around that call.

So let's silently ignore that bogus ioctl.

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 09:45:49 -06:00
Alexander Graf
1e3cbd8e20 linux-user: binfmt: support host binaries
When we have a working host binary equivalent for the guest binary we're
trying to run, let's just use that instead as it will be a lot faster.

Signed-off-by: Alexander Graf <agraf@suse.de>
2021-03-17 09:45:49 -06:00
Alexander Graf
6385e157df linux-user: fix segfault deadlock
When entering the guest we take a lock to ensure that nobody else messes
with our TB chaining while we're doing it. If we get a segfault inside that
code, we manage to carry on, but will not unlock the lock.

This patch forces unlocking of that lock in the segv handler. I'm not sure
this is the right approach though. Maybe we should rather make sure we don't
segfault in the code? I would greatly appreciate it if someone more
knowledgeable than me could look at this :).

Example code to trigger this is at: http://csgraf.de/tmp/conftest.c

Reported-by: Fabio Erculiani <lxnay@sabayon.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Drop spinlock_safe_unlock() and switch to tb_lock_reset() (bonzini)]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Alexander Graf
47f014896e PPC: KVM: Disable mmu notifier check
When using hugetlbfs (which is required for HV mode KVM on 970), we
check for MMU notifiers that on 970 can not be implemented properly.

So disable the check for mmu notifiers on PowerPC guests, making
KVM guests work there, even if possibly racy in some odd circumstances.
2021-03-17 09:45:49 -06:00
Alexander Graf
65912e18af linux-user: add binfmt wrapper for argv[0] handling
When using qemu's linux-user binaries through binfmt, argv[0] gets lost
along the way because qemu only gets passed the full file name of the
executable, while argv[0] can be something completely different.

This breaks in some subtle situations, such as the grep and make test
suites.

This patch adds a wrapper binary called qemu-$TARGET-binfmt that can be
used with binfmt's P flag which passes the full path _and_ argv[0] to
the binfmt handler.

The binary would be smart enough to be versatile and only exist in the
system once, creating the qemu binary path names from its own argv[0].
However, this seemed like it didn't fit the make system too well, so
we're currently creating a new binary for each target architecture.
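
Conceptually the wrapper is tiny; a sketch (not the actual SUSE
implementation), assuming the binfmt 'P' flag passes the full path as
argv[1] and the original argv[0] as argv[2], and assuming qemu-user's
'-0 argv0' option:

    #include <stdio.h>
    #include <unistd.h>

    int main(int argc, char **argv)
    {
        if (argc < 3) {
            fprintf(stderr, "meant to be run via binfmt_misc with the P flag\n");
            return 1;
        }

        /* binfmt 'P' flag: argv[1] = full path, argv[2] = original argv[0] */
        char *new_argv[argc + 2];
        int i;

        new_argv[0] = "qemu-arm";        /* hypothetical hardcoded target */
        new_argv[1] = "-0";
        new_argv[2] = argv[2];           /* hand the guest its real argv[0] */
        new_argv[3] = argv[1];           /* the guest executable itself */
        for (i = 3; i < argc; i++) {
            new_argv[i + 1] = argv[i];   /* remaining guest arguments */
        }
        new_argv[argc + 1] = NULL;

        execvp(new_argv[0], new_argv);
        perror("execvp");
        return 1;
    }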

CC: Reinhard Max <max@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Rebased onto new Makefile infrastructure, twice]
[AF: Updated for aarch64 for v2.0.0-rc1]
[AF: Rebased onto Makefile changes for v2.1.0-rc0]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 09:45:49 -06:00
Ulrich Hecht
8b1ea3b0c2 block/vmdk: Support creation of SCSI VMDK images in qemu-img
Signed-off-by: Ulrich Hecht <uli@suse.de>
[AF: Changed BLOCK_FLAG_SCSI from 8 to 16 for v1.2]
[AF: Rebased onto upstream VMDK SCSI support]
[AF: Rebased onto skipping of image creation in v1.7]
[AF: Simplified in preparation for v1.7.1/v2.0]
[BR: Rebased onto v2.0]
[BR: Rebased onto v2.1]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Alexander Graf
4c4428b5c9 qemu-cvs-ioctl_nodirection
The direction given in the ioctl should be correct, so we can assume the
communication is uni-directional. The ALSA developers did not like this
concept though, and declared ioctls IOC_R and IOC_W even though they were
IOC_RW.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Ulrich Hecht <uli@suse.de>
2021-03-17 09:45:49 -06:00
Alexander Graf
a42e09c897 qemu-cvs-ioctl_debug
Extends unsupported ioctl debug output.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Ulrich Hecht <uli@suse.de>
2021-03-17 09:45:49 -06:00
Ulrich Hecht
4e29370bdc qemu-cvs-gettimeofday
No clue what this is for.
2021-03-17 09:45:49 -06:00
Alexander Graf
1ff55e263e qemu-cvs-alsa_mmap
Hack to prevent ALSA from using mmap() interface to simplify emulation.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Ulrich Hecht <uli@suse.de>
2021-03-17 09:45:49 -06:00
Alexander Graf
0828842b10 qemu-cvs-alsa_ioctl
Implements ALSA ioctls on PPC hosts.

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Ulrich Hecht <uli@suse.de>
2021-03-17 09:45:49 -06:00
Alexander Graf
c6a5f960cb qemu-cvs-alsa_bitfield
Implements TYPE_INTBITFIELD partially. (required for ALSA support)

Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Ulrich Hecht <uli@suse.de>
2021-03-17 09:45:49 -06:00
Ulrich Hecht
17fe1e6495 qemu-0.9.0.cvs-binfmt
Fixes binfmt_misc setup script:
- x86_64 is i386-compatible
- m68k signature fixed
- path to QEMU

Signed-off-by: Ulrich Hecht <uli@suse.de>
[AF: Update path for qemu-aarch64 for v2.0.0-rc1]
Signed-off-by: Andreas Färber <afaerber@suse.de>
2021-03-17 09:45:49 -06:00
Alexander Graf
f4cf842ef8 XXX dont dump core on sigabort 2021-03-17 09:45:49 -06:00
Greg Kurz
0be5309f8d 9pfs: Fully restart unreclaim loop (CVE-2021-20181)
Git-commit: 89fbea8737
References: bsc#1182137

Depending on the client activity, the server can be asked to open a huge
number of file descriptors and eventually hit RLIMIT_NOFILE. This is
currently mitigated using a reclaim logic: the server closes the file
descriptors of idle fids, based on the assumption that it will be able
to re-open them later. This assumption doesn't hold of course if the
client requests the file to be unlinked. In this case, we loop on the
entire fid list and mark all related fids as unreclaimable (the reclaim
logic will just ignore them) and, of course, we open or re-open their
file descriptors if needed since we're about to unlink the file.

This is the purpose of v9fs_mark_fids_unreclaim(). Since the actual
opening of a file can cause the coroutine to yield, another client
request could possibly add a new fid that we may want to mark as
non-reclaimable as well. The loop is thus restarted if the re-open
request was actually transmitted to the backend. This is achieved
by keeping a reference on the first fid (head) before traversing
the list.

This is wrong in several ways:
- a potential clunk request from the client could tear the first
  fid down and cause the reference to be stale. This leads to a
  use-after-free error that can be detected with ASAN, using a
  custom 9p client
- fids are added at the head of the list: restarting from the
  previous head will always miss fids added by some other
  potential request

All these problems could be avoided if fids were being added at the
end of the list. This can be achieved with a QSIMPLEQ, but this is
probably too much change for a bug fix. For now let's keep it
simple and just restart the loop from the current head.
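
Schematically (a simplified sketch, not the exact QEMU code; helper
names and field layout are approximate), the change amounts to
restarting from whatever the list head is after a possible yield:

    bool restart;

    do {
        restart = false;
        for (fidp = s->fid_list; fidp; fidp = fidp->next) {
            if ((fidp->flags & FID_NON_RECLAIMABLE) ||
                !fid_has_path(fidp, path)) {     /* placeholder check */
                continue;
            }
            fidp->flags |= FID_NON_RECLAIMABLE;
            if (reopen_fid(pdu, fidp)) {         /* may yield the coroutine */
                restart = true;                  /* list may have changed: */
                break;                           /* restart from current head */
            }
        }
    } while (restart);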

Fixes: CVE-2021-20181
Buglink: https://bugs.launchpad.net/qemu/+bug/1911666
Reported-by: Zero Day Initiative <zdi-disclosures@trendmicro.com>
Reviewed-by: Christian Schoenebeck <qemu_oss@crudebyte.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Message-Id: <161064025265.1838153.15185571283519390907.stgit@bahia.lan>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Bruce Rogers <brogers@suse.com>
2021-03-17 09:45:49 -06:00
Michael Roth
529d45e151 Update version for 2.6.2 release
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-29 14:57:09 -05:00
Cornelia Huck
1b5520a3fd s390x/css: handle cssid 255 correctly
The cssid 255 is reserved but still valid from an architectural
point of view. However, feeding a bogus schid of 0xffffffff into
the virtio hypercall will lead to a crash:

Stack trace of thread 138363:
        #0  0x00000000100d168c css_find_subch (qemu-system-s390x)
        #1  0x00000000100d3290 virtio_ccw_hcall_notify
        #2  0x00000000100cbf60 s390_virtio_hypercall
        #3  0x000000001010ff7a handle_hypercall
        #4  0x0000000010079ed4 kvm_cpu_exec (qemu-system-s390x)
        #5  0x00000000100609b4 qemu_kvm_cpu_thread_fn
        #6  0x000003ff8b887bb4 start_thread (libpthread.so.0)
        #7  0x000003ff8b78df0a thread_start (libc.so.6)

This is because the css array was only allocated for 0..254
instead of 0..255.

Let's fix this by bumping MAX_CSSID to 255 and fencing off the
reserved cssid of 255 during css image allocation.

Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit 882b3b9769)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-28 15:39:49 -05:00
John Snow
9ea7a46e26 ahci: clear aiocb in ncq_cb
Similar to existing fixes for IDE (87ac25fd) and ATAPI (7f951b2d), the
AIOCB must be cleared in the callback. Otherwise, we may accidentally
try to reset a dangling pointer in bdrv_aio_cancel() from a port reset.
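
The fix pattern matches the earlier IDE/ATAPI ones: the completion
callback clears the pointer before anything else, roughly (sketch):

    static void ncq_cb(void *opaque, int ret)
    {
        NCQTransferState *ncq_tfs = opaque;

        /* The AIOCB is done; a later port reset must not try to
         * bdrv_aio_cancel() this now-dangling pointer. */
        ncq_tfs->aiocb = NULL;
        /* ... rest of the NCQ completion handling ... */
    }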

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1474575040-32079-2-git-send-email-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-28 11:54:06 -05:00
Gonglei
1c57ced0c4 vnc: fix incorrect checking condition when updating client
vs->disconnecting is set to TRUE and vs->ioc is closed, but
vs->ioc isn't set to NULL, so vnc_disconnect_finish() isn't
invoked when the client is updated in vnc_update_client() after
vnc_disconnect_start() has been invoked. Let's change the checking
condition to avoid the resource leak.

Signed-off-by: Haibin Wang <wanghaibin.wang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1467949056-81208-1-git-send-email-arei.gonglei@huawei.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 5a693efda8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-28 10:50:25 -05:00
Herongguang (Stephen)
98b81297bf vnc-enc-tight: fix off-by-one bug
In tight_encode_indexed_rect32, the size of buf (or src) is count. In the
for loop, i is supposed to be an index into src, so i should be
incremented whenever src is incremented.

This is broken when src is incremented but i is not before the while loop,
resulting in an off-by-one bug in the while loop.

Signed-off-by: He Rongguang <herongguang.he@huawei.com>
Message-id: 5784B8EB.7010008@huawei.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 3f7e51bca3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-28 10:48:17 -05:00
Gerd Hoffmann
8ef7abeccf vnc: make sure we finish disconnect
It may happen that vnc connections linger in disconnecting state forever
because VncState happens to be in a state where vnc_update_client()
exits early and never reaches the vnc_disconnect_finish() call at the
bottom of the function.  Fix that by doing an additional check at the
start of the function.

https://bugzilla.redhat.com/show_bug.cgi?id=1352799

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1468405280-2571-1-git-send-email-kraxel@redhat.com
(cherry picked from commit 5a8be0f73d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-28 10:48:08 -05:00
Daniel P. Berrange
14713d60cb vnc: don't crash getting server info if lsock is NULL
When VNC is started with '-vnc none' there will be no
listener socket present. When we try to populate the
VncServerInfo we'll crash accessing a NULL 'lsock'
field.

 #0  qio_channel_socket_get_local_address (ioc=0x0, errp=errp@entry=0x7ffd5b8aa0f0) at io/channel-socket.c:33
 #1  0x00007f4b9a297d6f in vnc_init_basic_info_from_server_addr (errp=0x7ffd5b8aa0f0, info=0x7f4b9d425460, ioc=<optimized out>)  at ui/vnc.c:146
 #2  vnc_server_info_get (vd=0x7f4b9e858000) at ui/vnc.c:223
 #3  0x00007f4b9a29d318 in vnc_qmp_event (vs=0x7f4b9ef82000, vs=0x7f4b9ef82000, event=QAPI_EVENT_VNC_CONNECTED) at ui/vnc.c:279
 #4  vnc_connect (vd=vd@entry=0x7f4b9e858000, sioc=sioc@entry=0x7f4b9e8b3a20, skipauth=skipauth@entry=true, websocket=websocket@entry=false) at ui/vnc.c:2994
 #5  0x00007f4b9a29e8c8 in vnc_display_add_client (id=<optimized out>, csock=<optimized out>, skipauth=<optimized out>) at ui/vnc.c:3825
 #6  0x00007f4b9a18d8a1 in qmp_marshal_add_client (args=<optimized out>, ret=<optimized out>, errp=0x7ffd5b8aa230) at qmp-marshal.c:123
 #7  0x00007f4b9a0b53f5 in handle_qmp_command (parser=<optimized out>, tokens=<optimized out>) at /usr/src/debug/qemu-2.6.0/monitor.c:3922
 #8  0x00007f4b9a348580 in json_message_process_token (lexer=0x7f4b9c78dfe8, input=0x7f4b9c7350e0, type=JSON_RCURLY, x=111, y=59) at qobject/json-streamer.c:94
 #9  0x00007f4b9a35cfeb in json_lexer_feed_char (lexer=lexer@entry=0x7f4b9c78dfe8, ch=125 '}', flush=flush@entry=false) at qobject/json-lexer.c:310
 #10 0x00007f4b9a35d0ae in json_lexer_feed (lexer=0x7f4b9c78dfe8, buffer=<optimized out>, size=<optimized out>) at qobject/json-lexer.c:360
 #11 0x00007f4b9a348679 in json_message_parser_feed (parser=<optimized out>, buffer=<optimized out>, size=<optimized out>) at qobject/json-streamer.c:114
 #12 0x00007f4b9a0b3a1b in monitor_qmp_read (opaque=<optimized out>, buf=<optimized out>, size=<optimized out>) at /usr/src/debug/qemu-2.6.0/monitor.c:3938
 #13 0x00007f4b9a186751 in tcp_chr_read (chan=<optimized out>, cond=<optimized out>, opaque=0x7f4b9c7add40) at qemu-char.c:2895
 #14 0x00007f4b92b5c79a in g_main_context_dispatch () from /lib64/libglib-2.0.so.0
 #15 0x00007f4b9a2bb0c0 in glib_pollfds_poll () at main-loop.c:213
 #16 os_host_main_loop_wait (timeout=<optimized out>) at main-loop.c:258
 #17 main_loop_wait (nonblocking=<optimized out>) at main-loop.c:506
 #18 0x00007f4b9a0835cf in main_loop () at vl.c:1934
 #19 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4667

Do an upfront check for a NULL lsock and report an error to
the caller, which matches behaviour from before

  commit 04d2529da2
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Fri Feb 27 16:20:57 2015 +0000

    ui: convert VNC server to use QIOChannelSocket

where getsockname() would be given a FD value -1 and thus report
an error to the caller.
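
Conceptually the guard is a single NULL check before the address lookup;
a sketch (exact placement and error wording may differ from the patch):

    if (!vd->lsock) {
        error_setg(errp, "No listener socket available");
        return NULL;
    }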

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1470134726-15697-2-git-send-email-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 624cdd46d7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-28 10:47:36 -05:00
Daniel P. Berrange
c5518b3a26 vnc: ensure connection sharing/limits is always configured
The connection sharing / limits are only set in the
vnc_display_open() method and so missed when VNC is running
with '-vnc none'. This in turn prevents clients being added
to the VNC server with the QMP "add_client" command.

This was introduced in

  commit e5f34cdd2d
  Author: Gerd Hoffmann <kraxel@redhat.com>
  Date:   Thu Oct 2 12:09:34 2014 +0200

      vnc: track & limit connections

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1470134726-15697-4-git-send-email-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 12e29b1682)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-28 10:46:48 -05:00
Daniel P. Berrange
6be9ee1620 vnc: fix crash when vnc_server_info_get has an error
The vnc_server_info_get will allocate the VncServerInfo
struct and then call vnc_init_basic_info_from_server_addr
to populate the basic fields. If this returns an error
though, the qapi_free_VncServerInfo call will then crash
because the VncServerInfo struct instance was not properly
NULL-initialized and thus contains random stack garbage.

 #0  0x00007f1987c8e6f5 in raise () at /lib64/libc.so.6
 #1  0x00007f1987c902fa in abort () at /lib64/libc.so.6
 #2  0x00007f1987ccf600 in __libc_message () at /lib64/libc.so.6
 #3  0x00007f1987cd7d4a in _int_free () at /lib64/libc.so.6
 #4  0x00007f1987cdb2ac in free () at /lib64/libc.so.6
 #5  0x00007f198b654f6e in g_free () at /lib64/libglib-2.0.so.0
 #6  0x0000559193cdcf54 in visit_type_str (v=v@entry=
     0x5591972f14b0, name=name@entry=0x559193de1e29 "host", obj=obj@entry=0x5591961dbfa0, errp=errp@entry=0x7fffd7899d80)
     at qapi/qapi-visit-core.c:255
 #7  0x0000559193cca8f3 in visit_type_VncBasicInfo_members (v=v@entry=
     0x5591972f14b0, obj=obj@entry=0x5591961dbfa0, errp=errp@entry=0x7fffd7899dc0) at qapi-visit.c:12307
 #8  0x0000559193ccb523 in visit_type_VncServerInfo_members (v=v@entry=
     0x5591972f14b0, obj=0x5591961dbfa0, errp=errp@entry=0x7fffd7899e00) at qapi-visit.c:12632
 #9  0x0000559193ccb60b in visit_type_VncServerInfo (v=v@entry=
     0x5591972f14b0, name=name@entry=0x0, obj=obj@entry=0x7fffd7899e48, errp=errp@entry=0x0) at qapi-visit.c:12658
 #10 0x0000559193cb53d8 in qapi_free_VncServerInfo (obj=<optimized out>) at qapi-types.c:3970
 #11 0x0000559193c1e6ba in vnc_server_info_get (vd=0x7f1951498010) at ui/vnc.c:233
 #12 0x0000559193c24275 in vnc_connect (vs=0x559197b2f200, vs=0x559197b2f200, event=QAPI_EVENT_VNC_CONNECTED) at ui/vnc.c:284
 #13 0x0000559193c24275 in vnc_connect (vd=vd@entry=0x7f1951498010, sioc=sioc@entry=0x559196bf9c00, skipauth=skipauth@entry=true, websocket=websocket@entry=false) at ui/vnc.c:3039
 #14 0x0000559193c25806 in vnc_display_add_client (id=<optimized out>, csock=<optimized out>, skipauth=<optimized out>)
     at ui/vnc.c:3877
 #15 0x0000559193a90c28 in qmp_marshal_add_client (args=<optimized out>, ret=<optimized out>, errp=0x7fffd7899f90)
     at qmp-marshal.c:105
 #16 0x000055919399c2b7 in handle_qmp_command (parser=<optimized out>, tokens=<optimized out>)
     at /home/berrange/src/virt/qemu/monitor.c:3971
 #17 0x0000559193ce3307 in json_message_process_token (lexer=0x559194ab0838, input=0x559194a6d940, type=JSON_RCURLY, x=111, y=12) at qobject/json-streamer.c:105
 #18 0x0000559193cfa90d in json_lexer_feed_char (lexer=lexer@entry=0x559194ab0838, ch=125 '}', flush=flush@entry=false)
     at qobject/json-lexer.c:319
 #19 0x0000559193cfaa1e in json_lexer_feed (lexer=0x559194ab0838, buffer=<optimized out>, size=<optimized out>)
     at qobject/json-lexer.c:369
 #20 0x0000559193ce33c9 in json_message_parser_feed (parser=<optimized out>, buffer=<optimized out>, size=<optimized out>)
     at qobject/json-streamer.c:124
 #21 0x000055919399a85b in monitor_qmp_read (opaque=<optimized out>, buf=<optimized out>, size=<optimized out>)
     at /home/berrange/src/virt/qemu/monitor.c:3987
 #22 0x0000559193a87d00 in tcp_chr_read (chan=<optimized out>, cond=<optimized out>, opaque=0x559194a7d900)
     at qemu-char.c:2895
 #23 0x00007f198b64f703 in g_main_context_dispatch () at /lib64/libglib-2.0.so.0
 #24 0x0000559193c484b3 in main_loop_wait () at main-loop.c:213
 #25 0x0000559193c484b3 in main_loop_wait (timeout=<optimized out>) at main-loop.c:258
 #26 0x0000559193c484b3 in main_loop_wait (nonblocking=<optimized out>) at main-loop.c:506
 #27 0x0000559193964c55 in main () at vl.c:1908
 #28 0x0000559193964c55 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4603

This was introduced in

  commit 98481bfcd6
  Author: Eric Blake <eblake@redhat.com>
  Date:   Mon Oct 26 16:34:45 2015 -0600

    vnc: Hoist allocation of VncBasicInfo to callers

which added error reporting for vnc_init_basic_info_from_server_addr
but didn't change the g_malloc calls to g_malloc0.
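
The essence of the fix is zero-initialising the allocation so that a
partially populated struct is safe to free on the error path (sketch):

    VncServerInfo *info;

    /* g_malloc0() zeroes all fields, so qapi_free_VncServerInfo() sees
     * NULL pointers instead of stack garbage if population fails. */
    info = g_malloc0(sizeof(*info));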

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1470134726-15697-3-git-send-email-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 3e7f136d8b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-28 10:45:38 -05:00
Daniel P. Berrange
2b13613e02 ui: avoid crash if vnc client disconnects with writes pending
The vnc_client_read() function is called from the vnc_client_io()
event handler callback when there is incoming data to process.
If it detects that the client has disconnected, then it will
trigger cleanup and free'ing of the VncState client struct at
a safe time.

Unfortunately, the vnc_client_io() event handler will also call
vnc_client_write() to handle any outgoing data writes. So if
vnc_client_io() was invoked with both G_IO_IN and G_IO_OUT
events set, and the client disconnects, we may try to write to
a client which has just been freed.

https://bugs.launchpad.net/qemu/+bug/1594861

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1467042529-3372-1-git-send-email-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit ea69744988)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-28 10:44:14 -05:00
Fam Zheng
6e184753b3 virtio-scsi: Don't abort when media is ejected
With an ejected block backend, blk_get_aio_context() would return
qemu_aio_context. In this case don't assert.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1473848224-24809-3-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 2a2d69f490)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-20 11:39:39 -05:00
Fam Zheng
469513b38a scsi-disk: Cleaning up around tray open state
Even if tray is not open, it can be empty (blk_is_inserted() == false).
Handle both cases correctly by replacing the s->tray_open checks with
blk_is_available(), which is an AND of the two.

Also simplify successive checks of them into blk_is_available(), in a
couple cases.
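
In terms of the block-backend helpers, the simplification replaces the
paired conditions with a single call (sketch; the error-path helper is a
placeholder):

    /* before: two conditions that have to be kept in sync */
    if (s->tray_open || !blk_is_inserted(s->qdev.conf.blk)) {
        return handle_no_medium(s);
    }

    /* after: blk_is_available() == inserted && tray closed */
    if (!blk_is_available(s->qdev.conf.blk)) {
        return handle_no_medium(s);
    }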

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1473848224-24809-2-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit cd723b8560)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-20 11:39:32 -05:00
Fam Zheng
12664c5a1c iothread: Stop threads before main() quits
Right after main_loop ends, we release various things but keep the
iothread alive. The latter is not prepared for the sudden change of
resources.

Specifically, after bdrv_close_all(), virtio-scsi dataplane gets a
surprise at the empty BlockBackend:

(gdb) bt
    at /usr/src/debug/qemu-2.6.0/hw/scsi/virtio-scsi.c:543
    at /usr/src/debug/qemu-2.6.0/hw/scsi/virtio-scsi.c:577

This is because d->conf.blk->root is set to NULL, so
blk_get_aio_context() returns qemu_aio_context, whereas s->ctx is still
pointing to the iothread:

    hw/scsi/virtio-scsi.c:543:

    if (s->dataplane_started) {
        assert(blk_get_aio_context(d->conf.blk) == s->ctx);
    }

To fix this, let's stop iothreads before doing bdrv_close_all().

Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1473326931-9699-1-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit dce8921b2b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-20 11:38:26 -05:00
Daniel P. Berrange
d90481343f crypto: ensure XTS is only used with ciphers with 16 byte blocks
The XTS cipher mode needs to be used with a cipher which has
a block size of 16 bytes. If a mis-matching block size is used,
the code will either corrupt memory beyond the IV array, or
not fully encrypt/decrypt the IV.
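
The guard is a block-size check at cipher creation time; roughly
(sketch, variable name and error wording illustrative):

    if (mode == QCRYPTO_CIPHER_MODE_XTS && block_size != 16) {
        error_setg(errp,
                   "Cipher block size %zu is not compatible with XTS",
                   block_size);
        return NULL;
    }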

This fixes a memory corruption crash when attempting to use
cast5-128 with xts, since the former has an 8 byte block size.

A test case is added to ensure the cipher creation fails with
such an invalid combination.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit a5d2f44d0d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-20 11:08:52 -05:00
Paolo Bonzini
0751a606c2 scsi: mptconfig: fix misuse of MPTSAS_CONFIG_PACK
These issues cause respectively a QEMU crash and a leak of 2 bytes of
stack.  They were discovered by VictorV of 360 Marvel Team.

Reported-by: Tom Victor <i-tangtianwen@360.cm>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 65a8e1f641)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-20 11:03:33 -05:00
Prasad J Pandit
5e39560941 scsi: mptconfig: fix an assert expression
When LSI SAS1068 Host Bus emulator builds configuration page
headers, mptsas_config_pack() should assert that the size
fits in a byte.  However, the size is expressed in 32-bit
units, so up to 1020 bytes fit.  The assertion was only
allowing replies up to 252 bytes, so fix it.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1472645167-30765-2-git-send-email-ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit cf2bce203a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-20 11:03:33 -05:00
Prasad J Pandit
da99530e41 vmw_pvscsi: check page count while initialising descriptor rings
Vmware Paravirtual SCSI emulation uses command descriptors to
process SCSI commands. These descriptors come with their ring
buffers. A guest could set the page count for these rings to
an arbitrary value, leading to an infinite loop or OOB access.
Add a check to avoid it.
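
The added validation is essentially a bounds check on the guest-supplied
page count before the rings are set up (sketch; the limit name is a
placeholder):

    if (num_pages == 0 || num_pages > PVSCSI_MAX_RING_PAGES) {
        return -1;    /* refuse to initialise the descriptor ring */
    }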

Reported-by: Tom Victor <vv474172261@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1472626169-12989-1-git-send-email-ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7f61f4690d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-20 11:03:33 -05:00
Rony Weng
7aa7c25186 scsi-disk: change disk serial length from 20 to 36
Openstack Cinder assigns a volume a 36-character uuid as serial.
QEMU shrinks the uuid to 20 characters, which does not match
the original uuid.

Note that there is no limit to the length of the serial number in
the SCSI spec.  20 was copy-pasted from virtio-blk which in turn was
copy-pasted from ATA; 36 is even more arbitrary.  However, bumping it
up too much might cause issues (e.g. 252 seems to make sense because
then the maximum amount of returned data is 256; but who knows there's
no off-by-one somewhere for such a nicely rounded number).

Signed-off-by: Rony Weng <ronyweng@synology.com>
Message-Id: <1472457138-23386-1-git-send-email-ronyweng@synology.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 48b6206305)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-20 11:03:32 -05:00
Lin Ma
b79239a41b qemu-char: avoid segfault if user lacks permission on a given logfile
Function qemu_chr_alloc returns NULL if it fails to open the logfile for any
reason, say no write permission. The tty, stdio and msmouse backends need to
check this return value to avoid a segfault in this case.
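
The callers simply need a NULL check on the allocation result; a sketch
(argument names are illustrative):

    CharDriverState *chr;

    chr = qemu_chr_alloc(common, errp);
    if (!chr) {
        return NULL;    /* e.g. the logfile could not be opened */
    }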

Signed-off-by: Lin Ma <lma@suse.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Message-Id: <20160914062250.22226-1-lma@suse.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 71200fb966)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-20 11:03:32 -05:00
Prasad J Pandit
c5b64fb79c scsi: pvscsi: limit process IO loop to ring size
The Vmware Paravirtual SCSI emulator, while processing IO requests,
could run into an infinite loop if 'pvscsi_ring_pop_req_descr'
always returned a positive value. Limit the IO loop to the ring size.
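
The loop bound is simply the ring size; schematically (sketch, types and
helper names approximate):

    for (count = 0; count < ring_size; count++) {
        descr = pvscsi_ring_pop_req_descr(&s->rings);
        if (!descr) {
            break;                 /* ring is empty */
        }
        /* ... process one request descriptor ... */
    }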

Cc: qemu-stable@nongnu.org
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1473845952-30785-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d251157ac1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-20 11:03:32 -05:00
Li Qiang
12be5cfe1c scsi: mptsas: use g_new0 to allocate MPTSASRequest object
When processing an IO request in mptsas, it uses g_new to allocate
a 'req' object. If an error occurs before 'req->sreq' is
allocated, it could lead to an OOB write in the mptsas_free_request
function. Use g_new0 to avoid it.
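
The change itself is a one-liner; a zeroed allocation keeps the error
path safe (sketch):

    MPTSASRequest *req = g_new0(MPTSASRequest, 1);

    /* every field, including req->sreq, starts out NULL/zero, so
     * mptsas_free_request() never sees an uninitialized pointer */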

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1473684251-17476-1-git-send-email-ppandit@redhat.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 670e56d3ed)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-20 11:03:32 -05:00
Greg Kurz
5e2c6fe7cc 9pfs: fix potential segfault during walk
If the call to fid_to_qid() returns an error, we will call v9fs_path_free()
on uninitialized paths.

It is a regression introduced by the following commit:

56f101ecce 9pfs: handle walk of ".." in the root directory

Let's fix this by initializing dpath and path before calling fid_to_qid().

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[groug: updated the changelog to indicate this is a regression and to provide
        the offending commit SHA1]
Signed-off-by: Greg Kurz <groug@kaod.org>

(cherry picked from commit 13fd08e631)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-20 10:25:06 -05:00
Gonglei
b9ab2f6671 vnc: fix qemu crash because of SIGSEGV
The backtrace is:

0x00007f0b75cdf880 in pixman_image_get_stride () from /lib64/libpixman-1.so.0
0x00007f0b77bcb3cf in vnc_server_fb_stride (vd=0x7f0b7a1a2bb0) at ui/vnc.c:680
vnc_dpy_copy (dcl=0x7f0b7a1a2c00, src_x=224, src_y=263, dst_x=319, dst_y=363, w=1, h=1) at ui/vnc.c:915
0x00007f0b77bbcc35 in dpy_gfx_copy (con=0x7f0b7a146210, src_x=src_x@entry=224, src_y=src_y@entry=263, dst_x=dst_x@entry=319,
dst_y=dst_y@entry=363, w=1, h=1) at ui/console.c:1575
0x00007f0b77bbda4e in qemu_console_copy (con=<optimized out>, src_x=src_x@entry=224, src_y=src_y@entry=263, dst_x=dst_x@entry=319,
dst_y=dst_y@entry=363, w=<optimized out>, h=<optimized out>) at ui/console.c:2111
0x00007f0b77ac0980 in cirrus_do_copy (h=<optimized out>, w=<optimized out>, src=<optimized out>, dst=<optimized out>, s=0x7f0b7b086090) at hw/display/cirrus_vga.c:774
cirrus_bitblt_videotovideo_copy (s=0x7f0b7b086090) at hw/display/cirrus_vga.c:793
cirrus_bitblt_videotovideo (s=0x7f0b7b086090) at hw/display/cirrus_vga.c:915
cirrus_bitblt_start (s=0x7f0b7b086090) at hw/display/cirrus_vga.c:1056
0x00007f0b77965cfb in memory_region_write_accessor (mr=0x7f0b7b096e40, addr=320, value=<optimized out>, size=1, shift=<optimized out>,mask=<optimized out>, attrs=...) at /root/rpmbuild/BUILD/master/qemu/memory.c:525
0x00007f0b77963f59 in access_with_adjusted_size (addr=addr@entry=320, value=value@entry=0x7f0b69a268d8, size=size@entry=4,
access_size_min=<optimized out>, access_size_max=<optimized out>, access=access@entry=0x7f0b77965c80 <memory_region_write_accessor>,
mr=mr@entry=0x7f0b7b096e40, attrs=attrs@entry=...) at /root/rpmbuild/BUILD/master/qemu/memory.c:591
0x00007f0b77968315 in memory_region_dispatch_write (mr=mr@entry=0x7f0b7b096e40, addr=addr@entry=320, data=18446744073709551362,
size=size@entry=4, attrs=attrs@entry=...) at /root/rpmbuild/BUILD/master/qemu/memory.c:1262
0x00007f0b779256a9 in address_space_write_continue (mr=0x7f0b7b096e40, l=4, addr1=320, len=4, buf=0x7f0b77713028 "\002\377\377\377",
attrs=..., addr=4273930560, as=0x7f0b7827d280 <address_space_memory>) at /root/rpmbuild/BUILD/master/qemu/exec.c:2544
address_space_write (as=<optimized out>, addr=<optimized out>, attrs=..., buf=<optimized out>, len=<optimized out>) at /root/rpmbuild/BUILD/master/qemu/exec.c:2601
0x00007f0b77925c1d in address_space_rw (as=<optimized out>, addr=<optimized out>, attrs=..., attrs@entry=...,
buf=buf@entry=0x7f0b77713028 "\002\377\377\377", len=<optimized out>, is_write=<optimized out>) at /root/rpmbuild/BUILD/master/qemu/exec.c:2703
0x00007f0b77962f53 in kvm_cpu_exec (cpu=cpu@entry=0x7f0b79fcc2d0) at /root/rpmbuild/BUILD/master/qemu/kvm-all.c:1965
0x00007f0b77950cc6 in qemu_kvm_cpu_thread_fn (arg=0x7f0b79fcc2d0) at /root/rpmbuild/BUILD/master/qemu/cpus.c:1078
0x00007f0b744b3dc5 in start_thread (arg=0x7f0b69a27700) at pthread_create.c:308
0x00007f0b70d3d66d in clone () from /lib64/libc.so.6

The code path while meeting segfault:
 vnc_dpy_copy
   vnc_update_client
     vnc_disconnect_finish [while vnc_disconnect_start() is invoked because something is wrong]
       vnc_update_server_surface
         vd->server = NULL;
   vnc_server_fb_stride
     pixman_image_get_stride(vd->server)

Let's add a non-NULL check before calling vnc_server_fb_stride() to avoid segmentation fault.

Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Daniel P. Berrange <berrange@redhat.com>
Reported-by: Yanying Zhuang <ann.zhuangyanying@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1472788698-120964-1-git-send-email-arei.gonglei@huawei.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 3e10c3ecfc)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-20 09:47:56 -05:00
Ladi Prosek
44d28f22bc virtio-balloon: discard virtqueue element on reset
The one pending element is being freed but not discarded on device
reset, which causes svq->inuse to creep up, eventually hitting the
"Virtqueue size exceeded" error.

Properly discarding the element on device reset makes sure that its
buffers are unmapped and the inuse counter stays balanced.
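
Roughly, the device reset handler now discards and frees the pending
element (sketch based on the description above; field names may differ
slightly):

    if (s->stats_vq_elem != NULL) {
        virtqueue_discard(s->svq, s->stats_vq_elem, 0);  /* unmap, fix inuse */
        g_free(s->stats_vq_elem);
        s->stats_vq_elem = NULL;
    }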

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Roman Kagan <rkagan@virtuozzo.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 104e70cae7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-14 20:50:31 -05:00
Stefan Hajnoczi
1af2c3fcb8 virtio: zero vq->inuse in virtio_reset()
vq->inuse must be zeroed upon device reset like most other virtqueue
fields.

In theory, virtio_reset() just needs assert(vq->inuse == 0) since
devices must clean up in-flight requests during reset (requests cannot
be leaked!).

In practice, it is difficult to achieve vq->inuse == 0 across reset
because balloon, blk, 9p, etc implement various different strategies for
cleaning up requests.  Most devices call g_free(elem) directly without
telling virtio.c that the VirtQueueElement is cleaned up.  Therefore
vq->inuse is not decremented during reset.

This patch zeroes vq->inuse and trusts that devices are not leaking
VirtQueueElements across reset.
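
The change itself is a one-liner; a sketch of where it lands in the
per-queue loop of virtio_reset():

    for (i = 0; i < VIRTIO_QUEUE_MAX; i++) {
        vdev->vq[i].last_avail_idx = 0;
        vdev->vq[i].inuse = 0;   /* trust devices not to leak elements */
        /* other per-queue fields are cleared here as well */
    }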

I will send a follow-up series that refactors request life-cycle across
all devices and converts vq->inuse = 0 into assert(vq->inuse == 0) but
this more invasive approach is not appropriate for stable trees.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Cc: qemu-stable <qemu-stable@nongnu.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Ladi Prosek <lprosek@redhat.com>
(cherry picked from commit 4b7f91ed02)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-14 20:50:24 -05:00
Greg Kurz
85d0a53c58 9pfs: handle walk of ".." in the root directory
The 9P spec at http://man.cat-v.org/plan_9/5/intro says:

All directories must support walks to the directory .. (dot-dot) meaning
parent directory, although by convention directories contain no explicit
entry for .. or . (dot).  The parent of the root directory of a server's
tree is itself.

This means that a client cannot walk further than the root directory
exported by the server. In other words, if the client wants to walk
"/.." or "/foo/../..", the server should answer like the request was
to walk "/".

This patch just does that:
- we cache the QID of the root directory at attach time
- during the walk we compare the QID of each path component with the root
  QID to detect if we're in a "/.." situation
- if so, we skip the current component and go to the next one
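
A rough sketch of the comparison inside the walk loop (field and
variable names are illustrative, not necessarily the upstream ones):

    if (!strcmp(wnames[name_idx].data, "..") &&
        qid.type == s->root_qid.type &&
        qid.version == s->root_qid.version &&
        qid.path == s->root_qid.path) {
        /* ".." at the exported root: stay at the root directory */
        continue;
    }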

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 56f101ecce)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-08 15:47:04 -05:00
Greg Kurz
b5191b2df7 9pfs: forbid . and .. in file names
According to the 9P spec http://man.cat-v.org/plan_9/5/open about the
create request:

The names . and .. are special; it is illegal to create files with these
names.

This patch causes the create and lcreate requests to fail with EINVAL if
the file name is either "." or "..".

Even if it isn't explicitly written in the spec, this patch extends the
checking to all requests that may cause a directory entry to be created:

    - mknod
    - rename
    - renameat
    - mkdir
    - link
    - symlink

The unlinkat request also gets patched for consistency (even if
rmdir("foo/..") is expected to fail according to POSIX.1-2001).

The various error values come from the linux manual pages.
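
A minimal sketch of the check added to those request handlers (variable
and label names are illustrative):

    if (!strcmp(name.data, ".") || !strcmp(name.data, "..")) {
        err = -EINVAL;
        goto out_nofid;
    }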

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 805b5d98c6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-08 15:46:58 -05:00
Greg Kurz
917e9a9816 9pfs: forbid illegal path names
Empty path components don't make sense for most commands and may cause
undefined behavior, depending on the backend.

Also, the walk request described in the 9P spec [1] clearly shows that
the client is supposed to send individual path components: the official
linux client never sends portions of path containing the / character for
example.

Moreover, the 9P spec [2] also states that a system can decide to restrict
the set of supported characters used in path components, with an explicit
mention "to remove slashes from name components".

This patch introduces a new name_is_illegal() helper that checks the
names sent by the client are not empty and don't contain unwanted chars.
Since 9pfs is only supported on linux hosts, only the / character is
checked at the moment. When support for other hosts (e.g. win32) is added,
other chars may need to be blacklisted as well.

If a client sends an illegal path component, the request will fail and
ENOENT is returned to the client.

[1] http://man.cat-v.org/plan_9/5/walk
[2] http://man.cat-v.org/plan_9/5/intro
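
A minimal sketch of such a helper (the upstream version may differ in
detail):

    static bool name_is_illegal(const char *name)
    {
        return !*name || strchr(name, '/') != NULL;
    }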

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit fff39a7ad0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-08 15:46:53 -05:00
Li Qiang
cb3677cd50 net: vmxnet: use g_new for pkt initialisation
When vmxnet transport abstraction layer initialises pkt,
the maximum fragmentation count is not checked. This could lead
to an integer overflow causing a NULL pointer dereference.
Replace g_malloc() with g_new() to catch the multiplication overflow.
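
A hypothetical illustration of the difference (not the actual vmxnet
code):

    struct iovec *frags;

    frags = g_malloc(max_frags * sizeof(*frags)); /* count * size can overflow */
    frags = g_new(struct iovec, max_frags);       /* overflow-checked allocation */

g_new() performs the multiplication with overflow checking and aborts
instead of silently wrapping, so a bogus fragment count cannot turn into
an undersized allocation.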

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Acked-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-08 15:29:37 -05:00
Li Qiang
93060258ae net: vmxnet: check IP header length
The Vmxnet3 device emulator does not check the IP header length when
parsing packet headers. This could lead to an OOB access when reading
further packet data. Add a check to avoid it.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-08 15:23:02 -05:00
Vadim Rozenfeld
f216833ac0 iscsi: pass SCSI status back for SG_IO
Signed-off-by: Vadim Rozenfeld <vrozenfe@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 644c6869d3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-08 14:48:33 -05:00
Stefan Hajnoczi
ca86c04717 virtio: decrement vq->inuse in virtqueue_discard()
virtqueue_discard() moves vq->last_avail_idx back so the element can be
popped again.  It's necessary to decrement vq->inuse to avoid "leaking"
the element count.
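
The essence of the change inside virtqueue_discard() (sketch):

    vq->last_avail_idx--;
    vq->inuse--;    /* balance the increment done in virtqueue_pop() */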

Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 58a83c6149)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-08 14:46:16 -05:00
Stefan Hajnoczi
8e44714fe9 virtio: recalculate vq->inuse after migration
The vq->inuse field is not migrated.  Many devices don't hold
VirtQueueElements across migration so it doesn't matter that vq->inuse
starts at 0 on the destination QEMU.

At least virtio-serial, virtio-blk, and virtio-balloon migrate while
holding VirtQueueElements.  For these devices we need to recalculate
vq->inuse upon load so the value is correct.
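
A sketch of the recalculation in virtio_load(), assuming the ring
indices have already been restored:

    vdev->vq[i].inuse = vdev->vq[i].last_avail_idx - vdev->vq[i].used_idx;
    if (vdev->vq[i].inuse > vdev->vq[i].vring.num) {
        error_report("VQ %d: more in-flight requests than ring entries", i);
        return -1;
    }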

Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit bccdef6b1a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-08 14:46:02 -05:00
Daniel P. Berrange
d00d89a685 ui: fix refresh of VNC server surface
In previous commit

  commit c7628bff41
  Author: Gerd Hoffmann <kraxel@redhat.com>
  Date:   Fri Oct 30 12:10:09 2015 +0100

    vnc: only alloc server surface with clients connected

the VNC server was changed so that the 'vd->server' pixman
image was only allocated when a client is connected.

Since then if a client disconnects and then reconnects to
the VNC server all they will see is a black screen until
they do something that triggers a refresh. On a graphical
desktop this is not often noticed since there's many things
going on which cause a refresh. On a plain text console it
is really obvious since nothing refreshes frequently.

The problem is that the VNC server didn't update the guest
dirty bitmap, so still believes its server image is in sync
with the guest contents.

To fix this we must explicitly mark the entire guest desktop
as dirty after re-creating the server surface. Move this
logic into vnc_update_server_surface() so it is guaranteed
to be called in all code paths that re-create the surface
instead of only in vnc_dpy_switch().
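
A sketch of the tail of vnc_update_server_surface() after this change
(assuming width/height hold the new surface size):

    memset(vd->guest.dirty, 0x00, sizeof(vd->guest.dirty));
    vnc_set_area_dirty(vd->guest.dirty, vd, 0, 0, width, height);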

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Tested-by: Peter Lieven <pl@kamp.de>
Message-id: 1471365032-18096-1-git-send-email-berrange@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit b69a553b4a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-08 14:43:52 -05:00
Prasad J Pandit
3b9717a62b net: check fragment length during fragmentation
The network transport abstraction layer supports packet fragmentation.
While fragmenting a packet, it decides whether more fragments remain
from the packet length and the current fragment length. It is
susceptible to an infinite loop if the current fragment length is zero.
Add a check to avoid it.
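
Roughly the shape of the fix (hypothetical variable names):

    more_frags = (fragment_offset + fragment_len < pkt->payload_len) &&
                 fragment_len != 0;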

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
CC: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit ead315e43e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-09-08 14:42:28 -05:00
Michael Roth
fcf75ad007 Update version for 2.6.1 release
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-17 10:24:53 -05:00
Gonglei
5125bef25a timer: set vm_clock disabled default
(commit 80dcfb8532)
Upon migration, the code uses a timer based on vm_clock, set 1ns in the
future from post_load, to send the event in case host_connected differs
between the migration source and target.

However, it's not guaranteed that the APIC is ready to inject IRQs into
the guest, and the IRQ line remains high, so any future interrupts go
unnoticed by the guest as well.

That's because 1) the migration coroutine is not blocked when it gets
EAGAIN while reading the QEMUFile, and 2) vm_clock is currently enabled
by default and doesn't rely on vm_start() being called, which means
vm_clock timers can run before the VCPUs are running.

So, let's disable vm_clock by default, keeping the initial design
intention for vm_clock timers.

Meanwhile, change the test-aio use case to use QEMU_CLOCK_REALTIME
instead of QEMU_CLOCK_VIRTUAL, as the block code does.

CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: qemu-stable@nongnu.org
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <1470728955-90600-1-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 3fdd0ee393)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-15 08:57:36 -05:00
Bruce Rogers
beeff749f6 Xen PCI passthrough: fix passthrough failure when no interrupt pin
Commit 5a11d0f7 mistakenly converted a log message into an error
condition when no pin interrupt is found for the pci device being
passed through. Revert that part of the commit.

Signed-off-by: Bruce Rogers <brogers@suse.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
(cherry picked from commit 0968c91ce0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-15 08:57:24 -05:00
Laurent Vivier
1f1b96a1df ppc64: fix compressed dump with pseries kernel
If we don't provide the page size in target-ppc:cpu_get_dump_info(),
the default one (TARGET_PAGE_SIZE, 4KB) is used to create
the compressed dump. It works fine with Macintosh, but not with
pseries as the kernel default page size is 64KB.

Without this patch, if we generate a compressed dump in the QEMU monitor:

    (qemu) dump-guest-memory -z qemu.dump

This dump cannot be read by crash:

    # crash vmlinux qemu.dump
    ...
    WARNING: cannot translate vmemmap kernel virtual addresses:
             commands requiring page structure contents will fail
    ...

The page_size value is used to determine the dump file's block size. The
block size needs to be at least the page size, but a multiple of the
page size works fine too. For PPC64, Linux supports either a 4KB or a
64KB software page size, so we define page_size to 64KB.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 760d88d1d0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-15 08:55:58 -05:00
Prasad J Pandit
236039b89d scsi: esp: check TI buffer index before read/write
The 53C9X Fast SCSI Controller (FSC) comes with internal 16-byte
FIFO buffers. One is used to handle commands and the other is for
information transfer. Three control variables 'ti_rptr',
'ti_wptr' and 'ti_size' are used to control r/w access to the
information transfer buffer ti_buf[TI_BUFSZ=16], where:

'ti_rptr' is used as read index, where read occurs.
'ti_wptr' is a write index, where write would occur.
'ti_size' indicates total bytes to be read from the buffer.

While reading/writing to this buffer, index could exceed its
size. Add check to avoid OOB r/w access.
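
A sketch of the kind of bounds check this adds on the FIFO read path
(the write path gets a matching 'ti_wptr < TI_BUFSZ' guard):

    if (s->ti_size > 0 && s->ti_rptr < TI_BUFSZ) {
        s->ti_size--;
        s->rregs[ESP_FIFO] = s->ti_buf[s->ti_rptr++];
    }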

Reported-by: Huawei PSIRT <psirt@huawei.com>
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1465230883-22303-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit ff589551c8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-09 16:01:08 -05:00
Prasad J Pandit
407fb6fce9 scsi: megasas: null terminate bios version buffer
While reading information via 'megasas_ctrl_get_info' routine,
a local bios version buffer isn't null terminated. Add the
terminating null byte to avoid any OOB access.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 844864fbae)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-09 14:31:33 -05:00
Prasad J Pandit
27fa5e735a scsi: esp: make cmdbuf big enough for maximum CDB size
While doing DMA read into ESP command buffer 's->cmdbuf', it could
write past the 's->cmdbuf' area, if it was transferring more than 16
bytes.  Increase the command buffer size to 32, which is maximum when
's->do_cmd' is set, and add a check on 'len' to avoid OOB access.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 926cde5f3e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-09 14:31:04 -05:00
Paolo Bonzini
8c04a291c9 scsi: esp: clean up handle_ti/esp_do_dma if s->do_cmd
Avoid duplicated code between esp_do_dma and handle_ti.  esp_do_dma
has the same code that handle_ti contains after the call to esp_do_dma;
but the code in handle_ti is never reached because it is in an "else if".
Remove the else and also the pointless return.

esp_do_dma also has a partially dead assignment of the to_device
variable.  Sink it to the point where it's actually used.

Finally, assert that the other caller of esp_do_dma (esp_transfer_data)
only transfers data and not a command.  This is true because get_cmd
cancels the old request synchronously before its caller handle_satn_stop
sets do_cmd to 1.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7f0b6e114a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-09 14:30:50 -05:00
Paolo Bonzini
aa6905db96 scsi: esp: respect FIFO invariant after message phase
The FIFO contains two bytes; hence the write ptr should be two bytes ahead
of the read pointer.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d020aa504c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-09 14:30:39 -05:00
Prasad J Pandit
e5c4e642be scsi: esp: check buffer length before reading scsi command
The 53C9X Fast SCSI Controller(FSC) comes with an internal 16-byte
FIFO buffer. It is used to handle command and data transfer.
In non-DMA mode, routine get_cmd() uses 'ti_size' to read the SCSI
command into a buffer. Add a check to validate the command length
against the buffer size to avoid any overrun.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1464717207-7549-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d3cdc49138)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-09 14:30:23 -05:00
Prasad J Pandit
80eb9b8c44 scsi: megasas: check 'read_queue_head' index value
While doing MegaRAID SAS controller command frame lookup, routine
'megasas_lookup_frame' uses 'read_queue_head' value as an index
into 'frames[MEGASAS_MAX_FRAMES=2048]' array. Limit its value
within array bounds to avoid any OOB access.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1464179110-18593-1-git-send-email-ppandit@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b60bdd1f1e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-09 14:30:08 -05:00
Prasad J Pandit
19dcd481ac scsi: megasas: initialise local configuration data buffer
When reading MegaRAID SAS controller configuration via MegaRAID
Firmware Interface(MFI) commands, routine megasas_dcmd_cfg_read
uses an uninitialised local data buffer. Initialise this buffer
to avoid stack information leakage.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1464178304-12831-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d37af74073)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-09 14:29:51 -05:00
Prasad J Pandit
1467b936d9 scsi: megasas: use appropriate property buffer size
When setting MegaRAID SAS controller properties via MegaRAID
Firmware Interface(MFI) commands, a user supplied size parameter
is used to set property value. Use appropriate size value to avoid
OOB access issues.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1464172291-2856-2-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 1b85898025)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-09 14:28:56 -05:00
Prasad J Pandit
7a2c32ec06 net: mipsnet: check packet length against buffer
When receiving packets over the MIPSnet network device, it uses a
receive buffer of 1514 bytes. In case the controller accepts large
(MTU-sized) packets, this could lead to memory corruption. Add a check
to avoid it.

Reported by: Oleksandr Bazhaniuk <oleksandr.bazhaniuk@intel.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>

(cherry picked from commit 3af9187fc6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-09 14:28:45 -05:00
Cole Robinson
780d8317c8 hw/arm/virt: Reject gic-version=host for non-KVM
If you try to gic-version=host with TCG on a KVM aarch64 host,
qemu segfaults, since host requires KVM APIs.

Explicitly reject gic-version=host if KVM is not enabled

https://bugzilla.redhat.com/show_bug.cgi?id=1339977
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Message-id: b1b3b0dd143b7995a7f4062966b80a2cf3e3c71e.1464273085.git.crobinso@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 0bf8039dca)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-09 14:28:21 -05:00
Cole Robinson
c5ba71b6b9 ui: spice: Exit if gl=on EGL init fails
The user explicitly requested spice GL, so if we know it isn't
going to work we should exit

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: e3789e35b16f9e3cc6f2652f91c52d88ba6d6936.1463588606.git.crobinso@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit daafc661cc)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-09 14:27:59 -05:00
Gerd Hoffmann
84da2c6701 sdl2: skip init without outputs
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Cole Robinson <crobinso@redhat.com>
Message-id: 1464790116-32405-1-git-send-email-kraxel@redhat.com
(cherry picked from commit 8efa5f29f8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-09 14:27:37 -05:00
Cole Robinson
ccecdf758d ui: sdl2: Release grab before opening console window
sdl 2.0.4 currently has a bug which causes our UI shortcuts to fire
rapidly in succession:

  https://bugzilla.libsdl.org/show_bug.cgi?id=3287

It's a toss up whether ctrl+alt+f or ctrl+alt+2 will fire an
odd or even number of times, thus determining whether the action
succeeds or fails.

Opening monitor/serial windows is doubly broken, since it will often
lock the UI trying to grab the pointer:

  0x00007fffef3720a5 in SDL_Delay_REAL () at /lib64/libSDL2-2.0.so.0
  0x00007fffef3688ba in X11_SetWindowGrab () at /lib64/libSDL2-2.0.so.0
  0x00007fffef2f2da7 in SDL_SendWindowEvent () at /lib64/libSDL2-2.0.so.0
  0x00007fffef2f080b in SDL_SetKeyboardFocus () at /lib64/libSDL2-2.0.so.0
  0x00007fffef35d784 in X11_DispatchFocusIn.isra.8 () at /lib64/libSDL2-2.0.so.0
  0x00007fffef35dbce in X11_DispatchEvent () at /lib64/libSDL2-2.0.so.0
  0x00007fffef35ee4a in X11_PumpEvents () at /lib64/libSDL2-2.0.so.0
  0x00007fffef2eea6a in SDL_PumpEvents_REAL () at /lib64/libSDL2-2.0.so.0
  0x00007fffef2eeab5 in SDL_WaitEventTimeout_REAL () at /lib64/libSDL2-2.0.so.0
  0x000055555597eed0 in sdl2_poll_events (scon=0x55555876f928) at ui/sdl2.c:593

We can work around that hang by ungrabbing the pointer before launching
a new window. This roughly matches what our sdl1 code does.

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Message-id: 31c9ab6540b031f7a614c59edcecea9877685612.1462557436.git.crobinso@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 56f289f383)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-09 14:25:57 -05:00
Cole Robinson
0f9745afaa ui: gtk: fix crash when terminal inner-border is NULL
VTE terminal inner-border can be NULL. The vte-0.36 (API 2.90)
code checks for the condition too, so I assume it's not just a bug.

Fixes a crash on Fedora 24 with gtk 3.20

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Message-id: 2b2e85d403e8760ea53afd735a170500d5c17716.1462557436.git.crobinso@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 4fd811a6bd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-09 14:25:53 -05:00
Marc-André Lureau
94c8340d93 ahci: free irqs array
Each irq is referenced by the IDEBus in ide_init2(), thus we can free
the no longer used array.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Acked-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 9d324b0e67)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-09 14:25:31 -05:00
Marc-André Lureau
3d34297e9c ahci: fix sglist leak on retry
ahci-test /x86_64/ahci/io/dma/lba28/retry triggers the following leak:

Direct leak of 16 byte(s) in 1 object(s) allocated from:
    #0 0x7fc4b2a25e20 in malloc (/lib64/libasan.so.3+0xc6e20)
    #1 0x7fc4993bce58 in g_malloc (/lib64/libglib-2.0.so.0+0x4ee58)
    #2 0x556a187d4b34 in ahci_populate_sglist hw/ide/ahci.c:896
    #3 0x556a187d8237 in ahci_dma_prepare_buf hw/ide/ahci.c:1367
    #4 0x556a187b5a1a in ide_dma_cb hw/ide/core.c:844
    #5 0x556a187d7eec in ahci_start_dma hw/ide/ahci.c:1333
    #6 0x556a187b650b in ide_start_dma hw/ide/core.c:921
    #7 0x556a187b61e6 in ide_sector_start_dma hw/ide/core.c:911
    #8 0x556a187b9e26 in cmd_write_dma hw/ide/core.c:1486
    #9 0x556a187bd519 in ide_exec_cmd hw/ide/core.c:2027
    #10 0x556a187d71c5 in handle_reg_h2d_fis hw/ide/ahci.c:1204
    #11 0x556a187d7681 in handle_cmd hw/ide/ahci.c:1254
    #12 0x556a187d168a in check_cmd hw/ide/ahci.c:510
    #13 0x556a187d0afc in ahci_port_write hw/ide/ahci.c:314
    #14 0x556a187d105d in ahci_mem_write hw/ide/ahci.c:435
    #15 0x556a1831d959 in memory_region_write_accessor /home/elmarco/src/qemu/memory.c:525
    #16 0x556a1831dc35 in access_with_adjusted_size /home/elmarco/src/qemu/memory.c:591
    #17 0x556a18323ce3 in memory_region_dispatch_write /home/elmarco/src/qemu/memory.c:1262
    #18 0x556a1828cf67 in address_space_write_continue /home/elmarco/src/qemu/exec.c:2578
    #19 0x556a1828d20b in address_space_write /home/elmarco/src/qemu/exec.c:2635
    #20 0x556a1828d92b in address_space_rw /home/elmarco/src/qemu/exec.c:2737
    #21 0x556a1828daf7 in cpu_physical_memory_rw /home/elmarco/src/qemu/exec.c:2746
    #22 0x556a183068d3 in cpu_physical_memory_write /home/elmarco/src/qemu/include/exec/cpu-common.h:72
    #23 0x556a18308194 in qtest_process_command /home/elmarco/src/qemu/qtest.c:382
    #24 0x556a18309999 in qtest_process_inbuf /home/elmarco/src/qemu/qtest.c:573
    #25 0x556a18309a4a in qtest_read /home/elmarco/src/qemu/qtest.c:585
    #26 0x556a18598b85 in qemu_chr_be_write_impl /home/elmarco/src/qemu/qemu-char.c:387
    #27 0x556a18598c52 in qemu_chr_be_write /home/elmarco/src/qemu/qemu-char.c:399
    #28 0x556a185a2afa in tcp_chr_read /home/elmarco/src/qemu/qemu-char.c:2902
    #29 0x556a18cbaf52 in qio_channel_fd_source_dispatch io/channel-watch.c:84

Follow John Snow's recommendation:
  Everywhere else ncq_err is used, it is accompanied by a list cleanup
  except for ncq_cb, which is the case you are fixing here.

  Move the sglist destruction inside of ncq_err and then delete it from
  the other two locations to keep it tidy.

  Call dma_buf_commit in ide_dma_cb after the early return. Though, this
  is also a little wonky because this routine does more than clear the
  list, but it is at the moment the centralized "we're done with the
  sglist" function and none of the other side effects that occur in
  dma_buf_commit will interfere with the reset that occurs from
  ide_restart_bh, I think

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 5839df7b71)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-09 14:25:23 -05:00
Mark Cave-Ayland
ff71767e06 macio: set res_count value to 0 after non-block ATAPI DMA transfers
res_count should be set to the number of outstanding bytes after a DBDMA
request. Unfortunately this wasn't being set to zero by the non-block
transfer codepath meaning drivers that checked the descriptor result for
such requests (e.g. reading the CDROM TOC) would assume from a non-zero result
that the transfer had failed.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

(cherry picked from commit 16275edb34)
Conflicts:
	hw/ide/macio.c

* removed context dependency on ddd495e5

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-09 14:24:25 -05:00
John Snow
ec211e7426 atapi: fix halted DMA reset
Followup to 87ac25fd, this time for ATAPI DMA.

Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1470164128-28158-1-git-send-email-jsnow@redhat.com
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 7f951b2d77)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-09 14:16:18 -05:00
John Snow
16a87c4a5d ide: fix halted IO segfault at reset
If one attempts to perform a system_reset after a failed IO request
that causes the VM to enter a paused state, QEMU will segfault trying
to free up the pending IO requests.

These requests have already been completed and freed, though, so all
we need to do is NULL them before we enter the paused state.

Existing AHCI tests verify that halted requests are still resumed
successfully after a STOP event.

Analyzed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1469635201-11918-2-git-send-email-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 87ac25fd1f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 16:59:20 -05:00
Stefan Hajnoczi
86cc089aa7 virtio: error out if guest exceeds virtqueue size
A broken or malicious guest can submit more requests than the virtqueue
size permits, causing unbounded memory allocation in QEMU.

The guest can submit requests without bothering to wait for completion
and is therefore not bound by virtqueue size.  This requires reusing
vring descriptors in more than one request, which is not allowed by the
VIRTIO 1.0 specification.

In "3.2.1 Supplying Buffers to The Device", the VIRTIO 1.0 specification
says:

  1. The driver places the buffer into free descriptor(s) in the
     descriptor table, chaining as necessary

and

  Note that the above code does not take precautions against the
  available ring buffer wrapping around: this is not possible since the
  ring buffer is the same size as the descriptor table, so step (1) will
  prevent such a condition.

This implies that placing more buffers into the virtqueue than the
descriptor table size is not allowed.

QEMU is missing the check to prevent this case.  Processing a request
allocates a VirtQueueElement leading to unbounded memory allocation
controlled by the guest.

Exit with an error if the guest provides more requests than the
virtqueue size permits.  This bounds memory allocation and makes the
buggy guest visible to the user.
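
The check, roughly, at the top of virtqueue_pop() (sketch):

    if (vq->inuse >= vq->vring.num) {
        error_report("Virtqueue size exceeded");
        exit(1);
    }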

This patch fixes CVE-2016-5403 and was reported by Zhenhao Hong from 360
Marvel Team, China.

Reported-by: Zhenhao Hong <hongzhenhao@360.cn>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit afd9096eb1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 16:56:53 -05:00
Dave Hansen
502c8e86ea target-i386: fix typo in xsetbv implementation
QEMU 2.6 added support for the XSAVE family of instructions, which
includes the XSETBV instruction which allows setting the XCR0
register.

But, when booting Linux kernels with XSAVE support enabled, I was
getting very early crashes where the instruction pointer was set
to 0x3.  I tracked it down to a jump instruction generated by this:

        gen_jmp_im(s->pc - pc_start);

where s->pc is pointing to the instruction after XSETBV and pc_start
is pointing _at_ XSETBV.  Subtract the two and you get 0x3.  Whoops.

The fix is to replace this typo with the pattern found everywhere
else in the file when folks want to end the translation buffer.
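
The fix, roughly, is the usual end-of-translation pattern from
target-i386/translate.c:

    gen_jmp_im(s->pc - s->cs_base);
    gen_eob(s);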

Richard Henderson confirmed that this is a bug and that this is the
correct fix.

Signed-off-by: Dave Hansen <dave.hansen@linux.intel.com>
Cc: qemu-stable@nongnu.org
Cc: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit ba03584f4f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 16:56:11 -05:00
Michael S. Tsirkin
a87cef825a pcie: fix link active status bit migration
We changed link status register in pci express endpoint capability
over time. Specifically,

commit b2101eae63 ("pcie: Set the "link
active" in the link status register") set data link layer link active
bit in this register without adding compatibility to old machine types.

When migrating from qemu 2.3 and older this affects xhci devices which
under machine type 2.0 and older have a pci express endpoint capability
even if they are on a pci bus.

Add compatibility flags to make this bit value match what it was under
2.3.

Additionally, to avoid breaking migration from qemu 2.3 and up,
suppress checking link status during migration: this seems sane
since hardware can change link status at any time.

https://bugzilla.redhat.com/show_bug.cgi?id=1352860

Reported-by: Gerd Hoffmann <kraxel@redhat.com>
Fixes: b2101eae63
    ("pcie: Set the "link active" in the link status register")
Cc: qemu-stable@nongnu.org
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

(cherry picked from commit 6b4495401b)
Conflicts:
	hw/pci/pcie.c

* removed functional dependency on 6383292

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 16:45:19 -05:00
Eric Blake
97b5a97f2f nbd: Limit nbdflags to 16 bits
Rather than asserting that nbdflags is within range, just give
it the correct type to begin with :)  nbdflags corresponds to
the per-export portion of NBD Protocol "transmission flags", which
is 16 bits in response to NBD_OPT_EXPORT_NAME and NBD_OPT_GO.

Furthermore, upstream NBD has never passed the global flags to
the kernel via ioctl(NBD_SET_FLAGS) (the ioctl was first
introduced in NBD 2.9.22; then a latent bug in NBD 3.1 actually
tried to OR the global flags with the transmission flags, with
the disaster that the addition of NBD_FLAG_NO_ZEROES in 3.9
caused all earlier NBD 3.x clients to treat every export as
read-only; NBD 3.10 and later intentionally clip things to 16
bits to pass only transmission flags).  Qemu should follow suit,
since the current two global flags (NBD_FLAG_FIXED_NEWSTYLE
and NBD_FLAG_NO_ZEROES) have no impact on the kernel's behavior
during transmission.

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>

Message-Id: <1469129688-22848-3-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7423f41782)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 16:19:20 -05:00
Peter Maydell
2317b328bc nbd: Don't use *_to_cpup() functions
The *_to_cpup() functions are not very useful, as they simply do
a pointer dereference and then a *_to_cpu(). Instead use either:
 * ld*_*_p(), if the data is at an address that might not be
   correctly aligned for the load
 * a local dereference and *_to_cpu(), if the pointer is
   the correct type and known to be correctly aligned

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1465570836-22211-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 773dce3c72)
* context prereq for 7423f417
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 16:19:04 -05:00
Eric Blake
ce00e529bc nbd: More debug typo fixes, use correct formats
Clean up some debug message oddities missed earlier; this includes
some typos, and recognizing that %d is not necessarily compatible
with uint32_t. Also add a couple messages that I found useful
while debugging things.

Signed-off-by: Eric Blake <eblake@redhat.com>

Message-Id: <1463006384-7734-3-git-send-email-eblake@redhat.com>
[Do not use PRIx16, clang complains. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

(cherry picked from commit 2cb347493c)
* context prereq for 7423f41
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 16:15:11 -05:00
Stefan Weil
28eae0af65 Fix some typos found by codespell
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit cb8d4c8f54)
* context prereq for 2cb34749
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 16:14:47 -05:00
Peter Lieven
5634eb8ffb block/iscsi: fix rounding in iscsi_allocationmap_set
When setting clusters as allocated, the boundaries have
to be expanded. As Paolo pointed out, the calculation of
the number of clusters is wrong:

Suppose cluster_sectors is 2, sector_num = 1, nb_sectors = 6:

In the "mark allocated" case, you want to set 0..8, i.e.
cluster_num=0, nb_clusters=4.

   0--.--2--.--4--.--6--.--8
   <--|_________________|-->  (<--> = expanded)

Instead you are setting nb_clusters=3, so that 6..8 is not marked.

   0--.--2--.--4--.--6--.--8
   <--|______________|!!!     (! = wrong)
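
A sketch of the corrected rounding (using the cluster_sectors
granularity described above):

    cluster_num = sector_num / iscsilun->cluster_sectors;
    nb_clusters = DIV_ROUND_UP(sector_num + nb_sectors,
                               iscsilun->cluster_sectors) - cluster_num;

With the example above (cluster_sectors=2, sector_num=1, nb_sectors=6)
this yields cluster_num = 0 and nb_clusters = DIV_ROUND_UP(7, 2) = 4,
covering the full 0..8 range as required.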

Cc: qemu-stable@nongnu.org
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <1468831940-15556-2-git-send-email-pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit eb36b953e0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 16:03:41 -05:00
Fam Zheng
b6ece2c6f3 util: Fix MIN_NON_ZERO
MIN_NON_ZERO(1, 0) is evaluated to 0. Rewrite the macro to fix it.
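
A fixed version of the macro might look like this (sketch):

    #define MIN_NON_ZERO(a, b) ((a) == 0 ? (b) : \
                                    ((b) == 0 ? (a) : (MIN(a, b))))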

Reported-by: Miroslav Rezanina <mrezanin@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1468306113-847-1-git-send-email-famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d27ba624aa)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 16:03:16 -05:00
Alberto Garcia
8d7d7764d5 qemu-iotests: Test naming of throttling groups
Throttling groups are named using the 'group' parameter of the
block_set_io_throttle command and the throttling.group command-line
option. If that parameter is unspecified the groups get the name of
the block device.

This patch adds a new test to check the naming of throttling groups.

Signed-off-by: Alberto Garcia <berto@igalia.com>
* backport of 435d5ee
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 16:01:24 -05:00
Alberto Garcia
704ab2fce4 blockdev: Fix regression with the default naming of throttling groups
When I/O limits are set for a block device, the name of the throttling
group is taken from the BlockBackend if the user doesn't specify one.

Commit efaa7c4eeb moved the naming of the BlockBackend in
blockdev_init() to the end of the function, after I/O limits are set.
The consequence is that the throttling group gets an empty name.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: qemu-stable@nongnu.org
* backport of ff356ee
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 16:01:21 -05:00
David Hildenbrand
025c4e39f4 s390x/ipl: fix reboots for migration from different bios
When migrating from a different QEMU version, the start_address and
bios_start_address may differ. During migration these values are migrated
and overwrite the values that were detected by QEMU itself.

On a reboot, QEMU will reload its own BIOS, but use the migrated start
addresses, which does not work if the values differ.

Fix this by not relying on the migrated values anymore, but still
provide them during migration, so existing QEMUs continue to work.

Signed-off-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
(cherry picked from commit bb0995468a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 15:26:37 -05:00
Michael S. Tsirkin
82c8516779 Revert "virtio-net: unbreak self announcement and guest offloads after migration"
This reverts commit 1f8828ef57.

Cc: qemu-stable@nongnu.org
Reported-by: Robin Geuze <robing@transip.nl>
Tested-by: Robin Geuze <robing@transip.nl>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 6c6668232e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 15:23:40 -05:00
Michael S. Tsirkin
909d87d347 virtio: set low features early on load
virtio migrates the low 32 feature bits twice, the first copy is there
for compatibility but ever since
019a3edbb2: ("virtio: make features 64bit
wide") it's ignored on load. This is wrong since virtio_net_load tests
self announcement and guest offloads before the second copy including
high feature bits is loaded.  This means that self announcement, control
vq and guest offloads are all broken after migration.

Fix it up by loading low feature bits: somewhat ugly since high and low
bits become out of sync temporarily, but seems unavoidable for
compatibility.  The right thing to do for new features is probably to
test the host features, anyway.

Fixes: 019a3edbb2
    ("virtio: make features 64bit wide")
Cc: qemu-stable@nongnu.org
Reported-by: Robin Geuze <robing@transip.nl>
Tested-by: Robin Geuze <robing@transip.nl>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

(cherry picked from commit 62cee1a28a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 15:23:27 -05:00
Artyom Tarasenko
9566ceeef4 target-sparc: fix register corruption in ldstub if there is no write permission
Signed-off-by: Artyom Tarasenko <atar4qemu@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
(cherry picked from commit b64d2e57e7)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 15:10:07 -05:00
Eric Blake
44152ece75 scsi: Advertise limits by blocksize, not 512
s->blocksize may be larger than 512, in which case our
tweaks to max_xfer_len and opt_xfer_len must be scaled
appropriately.

CC: qemu-stable@nongnu.org
Reported-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit efaf4781a9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 15:09:13 -05:00
Fam Zheng
c9fb07ba56 scsi-generic: Merge block max xfer len in INQUIRY response
The rationale is similar to the above mode sense response interception:
this is practically the only channel to communicate constraints imposed
elsewhere, such as by the host and the block driver.

The scsi bus we attach onto can have a larger max xfer len than what is
accepted by the host file system (guarding between the host scsi LUN and
QEMU), in which case the SG_IO we generate would get -EINVAL.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1464243305-10661-3-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 063143d5b1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 15:08:54 -05:00
Eric Blake
ab2aac59e8 nbd: Allow larger requests
The NBD layer was breaking up request at a limit of 2040 sectors
(just under 1M) to cater to old qemu-nbd. But the server limit
was raised to 32M in commit 2d8214885 to match the kernel, more
than three years ago; and the upstream NBD Protocol is proposing
documentation that without any explicit communication to state
otherwise, a client should be able to safely assume that a 32M
transaction will work.  It is time to rely on the larger sizing,
and any downstream distro that cares about maximum
interoperability to older qemu-nbd servers can just tweak the
value of #define NBD_MAX_SECTORS.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>

(cherry picked from commit 476b923c32)
Conflicts:
	include/block/nbd.h

* removed context dependency on 943cec86

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 14:34:45 -05:00
Alex Williamson
e19b9ad27c vfio/pci: Fix VGA quirks
Commit 2d82f8a3cd ("vfio/pci: Convert all MemoryRegion to dynamic
alloc and consistent functions") converted VFIOPCIDevice.vga to be
dynamically allocted, negating the need for VFIOPCIDevice.has_vga.
Unfortunately not all of the has_vga users were converted, nor was
the field removed from the structure.  Correct these oversights.

Reported-by: Peter Maloney <peter.maloney@brockmann-consult.de>
Tested-by: Peter Maloney <peter.maloney@brockmann-consult.de>
Fixes: 2d82f8a3cd ("vfio/pci: Convert all MemoryRegion to dynamic alloc and consistent functions")
Fixes: https://bugs.launchpad.net/qemu/+bug/1591628
Cc: qemu-stable@nongnu.org
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
(cherry picked from commit 4d3fc4fdc6)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 14:30:32 -05:00
Lin Ma
4f696c8533 pci-assign: Move "Invalid ROM" error message to pci-assign-load-rom.c
In pci_assign_dev_load_option_rom(), for PCI devices that don't have a
'rom' file under sysfs, or when the ROM is loaded from an external file,
the function returns NULL and won't set the passed 'size' variable.

In these two cases, QEMU still reports an "Invalid ROM" error message,
which may confuse users.

Signed-off-by: Lin Ma <lma@suse.com>
Message-Id: <1466010327-22368-1-git-send-email-lma@suse.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit be968c721e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 14:29:49 -05:00
Eric Blake
a50bb5fd5f qapi: Fix crash on missing alternate member of QAPI struct
If a QAPI struct has a mandatory alternate member which is not
present on input, the input visitor reports an error for the
missing alternate without setting the discriminator, but the
cleanup code for the struct still tries to use the dealloc
visitor to clean up the alternate.

Commit dbf11922 changed visit_start_alternate to set *obj to NULL
when an error occurs, where it was previously left untouched.
Thus, before the patch, the dealloc visitor is blindly trying to
cleanup whatever branch corresponds to (*obj)->type == 0 (that is,
QTYPE_NONE, because *obj still pointed to zeroed memory), which
selects the default branch of the switch and sets an error, but
this second error is ignored by the way the dealloc visitor is
used; but after the patch, the attempt to switch dereferences NULL.

When cleaning up after a partial object parse, we specifically
check for !*obj after visit_start_struct() (see gen_visit_object());
doing the same for alternates fixes the crash. Enhance the testsuite
to give coverage for both missing struct and missing alternate
members.

Also add an abort - we expect visit_start_alternate() to either set an
error or to set (*obj)->type to a valid QType that corresponds to
actual user input, and QTYPE_NONE should never be reachable from valid
input.  Had the abort() been in place earlier, we might have noticed
the dealloc visitor dereferencing bogus zeroed memory prior to when
commit dbf11922 forced our hand by setting *obj to NULL and causing a
fault.

Test case:

{'execute':'blockdev-add', 'arguments':{'options':{'driver':'raw'}}}

The choice of 'driver':'raw' selects a BlockdevOptionsGenericFormat
struct, which has a mandatory 'file':'BlockdevRef' in QAPI.  Since
'file' is missing as a sibling of 'driver', this should report a
graceful error rather than fault.  After this patch, we are back to:

{"error": {"class": "GenericError", "desc": "Parameter 'file' is missing"}}

Generated code in qapi-visit.c changes as:

|@@ -2444,6 +2444,9 @@ void visit_type_BlockdevRef(Visitor *v,
|     if (err) {
|         goto out;
|     }
|+    if (!*obj) {
|+        goto out_obj;
|+    }
|     switch ((*obj)->type) {
|     case QTYPE_QDICT:
|         visit_start_struct(v, name, NULL, 0, &err);
|@@ -2459,10 +2462,13 @@ void visit_type_BlockdevRef(Visitor *v,
|     case QTYPE_QSTRING:
|         visit_type_str(v, name, &(*obj)->u.reference, &err);
|         break;
|+    case QTYPE_NONE:
|+        abort();
|     default:
|         error_setg(&err, QERR_INVALID_PARAMETER_TYPE, name ? name : "null",
|                    "BlockdevRef");
|     }
|+out_obj:
|     visit_end_alternate(v);

Reported by Kashyap Chamarthy <kchamart@redhat.com>
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1466012271-5204-1-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
[Commit message tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>

(cherry picked from commit 9b4e38fe6a)
Conflicts:
	tests/test-qmp-input-visitor.c

* removed contextual/functional dependencies on 68ab47e

Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 14:28:18 -05:00
Max Reitz
4bfe16ba7b qcow2: Avoid making the L1 table too big
We refuse to open images whose L1 table we deem "too big". Consequently,
we should not produce such images ourselves.

Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20160615153630.2116-3-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
[mreitz: Added QEMU_BUILD_BUG_ON()]
Signed-off-by: Max Reitz <mreitz@redhat.com>

(cherry picked from commit 84c26520d3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 13:55:23 -05:00
Kevin Wolf
683c1c5ea5 backup: Don't leak BackupBlockJob in error path
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
(cherry picked from commit 91ab688379)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 13:54:51 -05:00
Peter Lieven
45f4e4be09 net: fix qemu_announce_self not emitting packets
commit fefe2a78 accidentally dropped the code path for injecting
raw packets. This feature is needed for sending gratuitous ARPs
after an incoming migration has completed. The result is increased
network downtime for vservers where the network card is not virtio-net
with the VIRTIO_NET_F_GUEST_ANNOUNCE feature.

Fixes: fefe2a78ab
Cc: qemu-stable@nongnu.org
Cc: hongyang.yang@easystack.cn
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit ca1ee3d6b5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 13:45:18 -05:00
Daniel P. Berrange
d1911a6fa7 ui: fix regression in printing VNC host/port on startup
If VNC is chosen as the compile time default display backend,
QEMU will print the host/port it listens on at startup.
Previously this would look like

  VNC server running on '::1:5900'

but in 04d2529da2 the ':' was
accidentally replaced with a ';'. This puts the ':' back.

Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1465382576-25552-1-git-send-email-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 83cf07b0b5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 13:44:54 -05:00
Daniel P. Berrange
510531ea44 io: remove mistaken call to object_ref on QTask
The QTask struct is just a standalone struct, not a QOM Object,
so calling object_ref() on it is not appropriate. This results
in mangling the 'destroy' field in the QTask struct, causing
the later call to qtask_free() to try to call the function
at address 0x1, with predictably segfault-happy results.

There is in fact no need for ref counting with QTask, as the
call to qtask_abort() or qtask_complete() will automatically
free associated memory.

This fixes the crash shown in

  https://bugs.launchpad.net/qemu/+bug/1589923

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
(cherry picked from commit bc35d51077)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 13:44:29 -05:00
Gerd Hoffmann
d59d37dea4 vmsvga: don't process more than 1024 fifo commands at once
vmsvga_fifo_run is called in regular intervals (on each display update)
and will resume where it left off.  So we can simply exit the loop,
without having to worry about how processing will continue.

Fixes: CVE-2016-4453
Cc: qemu-stable@nongnu.org
Cc: P J P <ppandit@redhat.com>
Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1464592161-18348-5-git-send-email-kraxel@redhat.com
(cherry picked from commit 4e68a0ee17)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 13:37:49 -05:00
Gerd Hoffmann
71798fda8b vmsvga: shadow fifo registers
The fifo is normal ram.  So kvm vcpu threads and qemu iothread can
access the fifo in parallel without synchronization.  This in turn
implies we can't use the fifo pointers in-place because the guest
can try changing them underneath us.  So add shadows for them, to
make sure the guest can't modify them after we've applied sanity
checks.

Fixes: CVE-2016-4454
Cc: qemu-stable@nongnu.org
Cc: P J P <ppandit@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1464592161-18348-4-git-send-email-kraxel@redhat.com
(cherry picked from commit 7e486f7577)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 13:37:40 -05:00
Gerd Hoffmann
3141be668f vmsvga: add more fifo checks
Make sure all fifo ptrs are within range.

Fixes: CVE-2016-4454
Cc: qemu-stable@nongnu.org
Cc: P J P <ppandit@redhat.com>
Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1464592161-18348-3-git-send-email-kraxel@redhat.com
(cherry picked from commit c2e3c54d39)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 13:37:21 -05:00
Gerd Hoffmann
394647d711 vmsvga: move fifo sanity checks to vmsvga_fifo_length
Sanity checks are applied when the fifo is enabled by the guest
(SVGA_REG_CONFIG_DONE write).  Which doesn't help much if the guest
changes the fifo registers afterwards.  Move the checks to
vmsvga_fifo_length so they are done each time qemu is about to read
from the fifo.

Fixes: CVE-2016-4454
Cc: qemu-stable@nongnu.org
Cc: P J P <ppandit@redhat.com>
Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1464592161-18348-2-git-send-email-kraxel@redhat.com
(cherry picked from commit 5213602678)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 13:37:08 -05:00
Fam Zheng
63a396d151 block: Drop bdrv_ioctl_bh_cb
Similar to the "!drv || !drv->bdrv_aio_ioctl" case above, here it is
okay to set co.ret and return. As pointed out by Paolo, a BH will be
created as necessary by the caller (bdrv_co_maybe_schedule_bh).
Besides, as pointed out by Kevin, "data" was leaked before.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20160601015223.19277-1-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit c8a9fd8071)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 13:33:39 -05:00
Prasad J Pandit
f882993a8c scsi: mptsas: infinite loop while fetching requests
The LSI SAS1068 Host Bus Adapter emulator in QEMU periodically
looks for requests and fetches them. A loop doing that in
mptsas_fetch_requests() could run infinitely if 's->state' is
not operational. Move the check to avoid such a loop.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Cc: qemu-stable@nongnu.org
Message-Id: <1464077264-25473-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 06630554cc)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 13:32:04 -05:00
Prasad J Pandit
8b95d8e1d5 scsi: pvscsi: check command descriptor ring buffer size (CVE-2016-4952)
Vmware Paravirtual SCSI emulation uses command descriptors to
process SCSI commands. These descriptors come with their ring
buffers. A guest could set the ring buffer size to an arbitrary
value leading to OOB access issue. Add check to avoid it.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Cc: qemu-stable@nongnu.org
Message-Id: <1464000485-27041-1-git-send-email-ppandit@redhat.com>
Reviewed-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 3e831b40e0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 13:31:47 -05:00
Steven Luo
54eb4cf5fc Fix configure test for PBKDF2 in nettle
On my Debian jessie system, including nettle/pbkdf2.h does not cause
NULL to be defined, which causes the test to fail to compile.  Include
stddef.h to bring in a definition of NULL.
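
The probe configure compiles is a tiny C program along these lines
(an approximation, not necessarily the exact snippet); without
<stddef.h>, the NULL arguments fail to compile on such systems:

    #include <stddef.h>          /* provides NULL */
    #include <nettle/pbkdf2.h>

    int main(void)
    {
        /* Compile/link test only; this program is not meant to be run. */
        pbkdf2_hmac_sha256(8, NULL, 1000, 8, NULL, 8, NULL);
        return 0;
    }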

Cc: qemu-trivial@nongnu.org
Cc: qemu-stable@nongnu.org
Signed-off-by: Steven Luo <steven+qemu@steven676.net>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 9e87a691bd)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 13:30:59 -05:00
Greg Kurz
e81a24a748 savevm: fail if migration blockers are present
QEMU currently has two ways to prevent migration from occurring:
- migration blocker when it depends on runtime state
- VMStateDescription.unmigratable when migration is not supported at all

This patch gathers all the logic into a single function to be called from
both the savevm and the migrate paths.

This fixes a bug with 9p, at least, where savevm would succeed and the
following would happen in the guest after loadvm:

$ ls /host
ls: cannot access /host: Protocol error

With this patch:

(qemu) savevm foo
Migration is disabled when VirtFS export path '/' is mounted in the guest
using mount_tag 'host'

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <146239057139.11271.9011797645454781543.stgit@bahia.huguette.org>

[Update subject according to Paolo's suggestion - Amit]

Signed-off-by: Amit Shah <amit.shah@redhat.com>
(cherry picked from commit 24f3902b08)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 13:29:25 -05:00
Eric Blake
fb26337641 nbd: Don't trim unrequested bytes
Similar to commit df7b97ff, we are mishandling clients that
give an unaligned NBD_CMD_TRIM request, and potentially
trimming bytes that occur before their request, which in turn
can cause unintended data loss (unlikely in
practice, since most clients are sane and issue aligned trim
requests).  However, while we fixed read and write by switching
to the byte interfaces of blk_, we don't yet have a byte
interface for discard.  On the other hand, trim is advisory, so
rounding the user's request to simply ignore the first and last
unaligned sectors (or the entire request, if it is sub-sector
in length) is just fine.
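
In other words, the request is rounded inward to sector boundaries,
and a request that covers no whole sector becomes a no-op. A minimal
sketch of that rounding (sector size assumed to be 512 bytes):

    #include <stdint.h>

    #define SECTOR_SIZE 512ULL

    /* Shrink an unaligned trim to the aligned region fully inside it, so
     * no byte outside the client's request can ever be discarded; a
     * resulting length of 0 means there is nothing to trim. */
    static void clamp_trim(uint64_t offset, uint64_t length,
                           uint64_t *out_offset, uint64_t *out_length)
    {
        uint64_t start = (offset + SECTOR_SIZE - 1) / SECTOR_SIZE * SECTOR_SIZE;
        uint64_t end   = (offset + length) / SECTOR_SIZE * SECTOR_SIZE;

        *out_offset = start;
        *out_length = end > start ? end - start : 0;
    }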

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1464173965-9694-1-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 353ab96973)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 13:28:06 -05:00
Peter Lieven
509e13298f block/iscsi: avoid potential overflow of acb->task->cdb
At least in the path via virtio-blk the maximum size is not
restricted.
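
The shape of the fix is a length check against the fixed-size CDB
field before copying; a sketch using a stand-in struct rather than
the real libiscsi scsi_task:

    #include <stdint.h>
    #include <stdio.h>
    #include <string.h>

    struct task_sketch {
        unsigned char cdb[16];   /* stand-in for the fixed-size CDB array */
    };

    /* Refuse CDBs larger than the destination instead of memcpy()ing a
     * guest-influenced length into a fixed-size buffer. */
    static int copy_cdb(struct task_sketch *task, const uint8_t *cmd, size_t cmd_len)
    {
        if (cmd_len > sizeof(task->cdb)) {
            fprintf(stderr, "CDB of %zu bytes is too large, rejecting\n", cmd_len);
            return -1;
        }
        memcpy(task->cdb, cmd, cmd_len);
        return 0;
    }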

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <1464080368-29584-1-git-send-email-pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit a6b3167fa0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 13:24:41 -05:00
Gavin Shan
6e7ee9862b vfio: Fix broken EEH
vfio_eeh_container_op() is the backend that communicates with the
host kernel to support EEH functionality in QEMU. However, the
function should return the value from the host kernel instead of
returning 0 unconditionally.

dwg: Specifically the problem occurs for the handful of EEH
sub-operations which can return a non-zero, non-error result.
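
Conceptually the change is a one-liner: propagate the ioctl result
instead of discarding it. A hedged sketch with an invented request
number (the real code issues the VFIO EEH operation on the container
file descriptor):

    #include <errno.h>
    #include <sys/ioctl.h>

    #define SKETCH_EEH_PE_OP 0x1234   /* hypothetical ioctl request number */

    static int eeh_container_op(int container_fd, unsigned long *op)
    {
        int ret = ioctl(container_fd, SKETCH_EEH_PE_OP, op);

        if (ret < 0) {
            return -errno;
        }
        return ret;   /* return the kernel's value, not an unconditional 0 */
    }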

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
[dwg: clarification to commit message]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

(cherry picked from commit d917e88d85)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-05 13:23:19 -05:00
Gerd Hoffmann
7ff5dc445d vga: add sr_vbe register set
Commit "fd3c136 vga: make sure vga register setup for vbe stays intact
(CVE-2016-3712)." causes a regression.  The win7 installer is unhappy
because it can't freely modify vga registers any more while in vbe mode.

This patch introduces a new sr_vbe register set.  The vbe_update_vgaregs
will fill sr_vbe[] instead of sr[].  Normal vga register reads and
writes go to sr[].  Any sr register read access happens through a new
sr() helper function which will read from sr_vbe[] with vbe active and
from sr[] otherwise.

This way we can allow guests to update sr[] registers as they want,
without allowing them to disrupt vbe video modes that way.
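
A minimal sketch of the helper described above (field names invented,
layout simplified):

    #include <stdbool.h>
    #include <stdint.h>

    typedef struct {
        uint8_t sr[8];       /* sequencer registers as written by the guest */
        uint8_t sr_vbe[8];   /* values established by the VBE mode setup    */
        bool vbe_active;
    } VgaSketch;

    /* Every internal read of a sequencer register goes through this, so a
     * guest writing sr[] cannot disturb an active VBE mode. */
    static uint8_t sr(const VgaSketch *s, int idx)
    {
        return s->vbe_active ? s->sr_vbe[idx] : s->sr[idx];
    }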

Cc: qemu-stable@nongnu.org
Reported-by: Thomas Lamprecht <thomas@lamprecht.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1463475294-14119-1-git-send-email-kraxel@redhat.com
(cherry picked from commit 94ef4f337f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-04 16:46:46 -05:00
Thomas Huth
a1f006fe93 usb/ohci: Fix crash when specifying too many num-ports
QEMU currently crashes when an OHCI controller is instantiated with
too many ports, e.g. "-device pci-ohci,num-ports=100,masterbus=1".
Thus add a proper check in usb_ohci_init() to make sure that we
do not use more than OHCI_MAX_PORTS = 15 ports here.
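
The added check amounts to rejecting the property value at init time;
a hedged sketch:

    #include <stdio.h>

    #define OHCI_MAX_PORTS 15

    /* Fail device initialisation instead of sizing internal port arrays
     * from an unchecked, user-supplied num-ports value. */
    static int check_num_ports(unsigned int num_ports)
    {
        if (num_ports > OHCI_MAX_PORTS) {
            fprintf(stderr, "num-ports=%u is too large (maximum is %d)\n",
                    num_ports, OHCI_MAX_PORTS);
            return -1;
        }
        return 0;
    }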

Ticket: https://bugs.launchpad.net/qemu/+bug/1581308
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1463995387-11710-1-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit d400fc018b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-04 16:43:40 -05:00
Peter Lieven
cba9a8042f block/nfs: refuse readahead if cache.direct is on
If we open an NFS export with the cache disabled, we should refuse
the readahead feature, as it would cache data inside libnfs.

If an export was opened with readahead enabled, it should further
not be allowed to disable the cache while running.
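
Schematically, the open path rejects the combination of a direct
cache mode with a non-zero readahead setting; a sketch with invented
parameter names:

    #include <stdbool.h>
    #include <stdio.h>

    /* Called from the open path: readahead keeps data cached inside
     * libnfs, which contradicts a cache.direct=on request. */
    static int check_nfs_cache_options(bool cache_direct, unsigned readahead_bytes)
    {
        if (cache_direct && readahead_bytes > 0) {
            fprintf(stderr, "readahead is not supported with cache.direct=on\n");
            return -1;
        }
        return 0;
    }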

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1463662083-20814-2-git-send-email-pl@kamp.de
Signed-off-by: Jeff Cody <jcody@redhat.com>
(cherry picked from commit 38f8d5e025)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-04 16:42:32 -05:00
Prasad J Pandit
9b28a7fc88 esp: check dma length before reading scsi command(CVE-2016-4441)
The 53C9X Fast SCSI Controller (FSC) comes with an internal 16-byte
FIFO buffer. It is used to handle command and data transfers.
Routine get_cmd() uses DMA to read SCSI commands into this buffer.
Add a check to validate the DMA length against the buffer size to
avoid any overrun.
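
The added validation is essentially a bounds check on the
guest-programmed DMA length before the command is read; a sketch
with a stubbed DMA helper:

    #include <stdint.h>
    #include <string.h>

    #define CMD_FIFO_SIZE 16   /* the chip's internal FIFO size */

    static void dma_read_stub(uint8_t *dst, uint32_t len)
    {
        memset(dst, 0, len);   /* stands in for the real DMA read */
    }

    /* Refuse lengths that do not fit the 16-byte buffer instead of letting
     * the DMA read overrun it. Returns the number of bytes read. */
    static uint32_t get_cmd_sketch(uint8_t *buf, uint32_t dmalen)
    {
        if (dmalen == 0 || dmalen > CMD_FIFO_SIZE) {
            return 0;
        }
        dma_read_stub(buf, dmalen);
        return dmalen;
    }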

Fixes CVE-2016-4441.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Cc: qemu-stable@nongnu.org
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1463654371-11169-3-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 6c1fef6b59)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-04 16:42:05 -05:00
Prasad J Pandit
0a5e3685ea esp: check command buffer length before write(CVE-2016-4439)
The 53C9X Fast SCSI Controller (FSC) comes with an internal 16-byte
FIFO buffer. It is used to handle command and data transfers. While
writing to this command buffer 's->cmdbuf[TI_BUFSZ=16]', a check
was missing to validate the input length. Add a check to avoid an
OOB write access.
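
The missing check is a classic bounded append; a sketch of the
pattern (simplified state, not the actual esp.c code):

    #include <stdint.h>
    #include <string.h>

    #define TI_BUFSZ 16

    typedef struct {
        uint8_t cmdbuf[TI_BUFSZ];
        uint32_t cmdlen;
    } EspSketch;

    /* Validate the length against the remaining space before appending;
     * dropping an oversized write is enough to prevent the OOB access. */
    static void cmdbuf_append(EspSketch *s, const uint8_t *buf, uint32_t len)
    {
        if (len > TI_BUFSZ - s->cmdlen) {
            return;
        }
        memcpy(s->cmdbuf + s->cmdlen, buf, len);
        s->cmdlen += len;
    }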

Fixes CVE-2016-4439.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Cc: qemu-stable@nongnu.org
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1463654371-11169-2-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit c98c6c105f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-04 16:41:59 -05:00
Paolo Bonzini
2522f0fcd1 json-streamer: fix double-free on exiting during a parse
Now that json-streamer tries not to leak tokens on incomplete parse,
the tokens can be freed twice if QEMU destroys the json-streamer
object during the parser->emit call.  To fix this, create the new
empty GQueue earlier, so that it is already in place when the old
one is passed to parser->emit.

Reported-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1467636059-12557-1-git-send-email-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit a942d8fa01)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-04 16:34:29 -05:00
Eric Blake
ebe0376e8c json-streamer: Don't leak tokens on incomplete parse
Valgrind complained about a number of leaks in
tests/check-qobject-json:

==12657==    definitely lost: 17,247 bytes in 1,234 blocks

All of which had the same root cause: on an incomplete parse,
we were abandoning the token queue without cleaning up the
allocated data within each queue element.  Introduced in
commit 95385fe, when we switched from QList (which recursively
frees contents) to g_queue (which does not).

We don't yet require glib 2.32 with its g_queue_free_full(),
so open-code it instead.
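
Open-coding g_queue_free_full() boils down to popping and freeing
each element before freeing the queue; a sketch that assumes the
elements are single g_malloc()ed allocations (the real tokens carry
more state):

    #include <glib.h>

    static void free_token_queue(GQueue *tokens)
    {
        gpointer token;

        /* Equivalent of g_queue_free_full(tokens, g_free) for glib < 2.32. */
        while ((token = g_queue_pop_head(tokens)) != NULL) {
            g_free(token);
        }
        g_queue_free(tokens);
    }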

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1463608012-12760-1-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit ba4dba5434)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-04 16:34:10 -05:00
Greg Kurz
9520c6cb1f migration: regain control of images when migration fails to complete
We currently have an error path during migration that can cause
the source QEMU to abort:

migration_thread()
  migration_completion()
    runstate_is_running() ----------------> true if guest is running
    bdrv_inactivate_all() ----------------> inactivate images
    qemu_savevm_state_complete_precopy()
     ... qemu_fflush()
           socket_writev_buffer() --------> error because destination fails
         qemu_fflush() -------------------> set error on migration stream
  migration_completion() -----------------> set migrate state to FAILED
migration_thread() -----------------------> break migration loop
  vm_start() -----------------------------> restart guest with inactive
                                            images

and you get:

qemu-system-ppc64: socket_writev_buffer: Got err=104 for (32768/18446744073709551615)
qemu-system-ppc64: /home/greg/Work/qemu/qemu-master/block/io.c:1342:bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed.
Aborted (core dumped)

If we try postcopy with a similar scenario, we also get the writev error
message but QEMU leaves the guest paused because entered_postcopy is true.

We could possibly do the same with precopy and leave the guest paused.
But since the historical default for migration errors is to restart the
source, this patch adds a call to bdrv_invalidate_cache_all() instead.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Message-Id: <146357896785.6003.11983081732454362715.stgit@bahia.huguette.org>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
(cherry picked from commit fe904ea824)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-04 16:31:03 -05:00
Stefan Weil
dbbadeb48c configure: Allow builds with extra warnings
The clang compiler supports a useful compiler option -Weverything,
and GCC also has other warnings not enabled by -Wall.

If glib header files trigger a warning, however, testing glib with
-Werror will always fail. A size mismatch is also detected without
-Werror, so simply remove it.

Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <1461879221-13338-1-git-send-email-sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 5919e0328b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-04 16:30:30 -05:00
Paolo Bonzini
bd5d278668 target-i386: key sfence availability on CPUID_SSE, not CPUID_SSE2
sfence was introduced before lfence and mfence.  This fixes Linux
2.4's measurement of checksumming speeds for the pIII_sse
algorithm:

md: linear personality registered as nr 1
md: raid0 personality registered as nr 2
md: raid1 personality registered as nr 3
md: raid5 personality registered as nr 4
raid5: measuring checksumming speed
   8regs     :   384.400 MB/sec
   32regs    :   259.200 MB/sec
invalid operand: 0000
CPU:    0
EIP:    0010:[<c0240b2a>]    Not tainted
EFLAGS: 00000246
eax: c15d8000   ebx: 00000000   ecx: 00000000   edx: c15d5000
esi: 8005003b   edi: 00000004   ebp: 00000000   esp: c15bdf50
ds: 0018   es: 0018   ss: 0018
Process swapper (pid: 1, stackpage=c15bd000)
Stack: 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000
       00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000
       00000000 00000206 c0241c6c 00001000 c15d4000 c15d7000 c15d4000
c15d4000
Call Trace:    [<c0241c6c>] [<c0105000>] [<c0241db4>] [<c010503b>]
[<c0105000>]
  [<c0107416>] [<c0105030>]

Code: 0f ae f8 0f 10 04 24 0f 10 4c 24 10 0f 10 54 24 20 0f 10 5c
 <0>Kernel panic: Attempted to kill init!

Reported-by: Stefan Weil <sw@weilnetz.de>
Fixes: 121f315788
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 14cb949a3e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-04 16:29:06 -05:00
Aurelien Jarno
a525decfaa target-mips: fix call to memset in soft reset code
Recent versions of GCC report the following error when compiling
target-mips/helper.c:

  qemu/target-mips/helper.c:542:9: warning: ‘memset’ used with length
  equal to number of elements without multiplication by element size
  [-Wmemset-elt-size]

This is indeed correct and due to a wrong usage of sizeof(). Fix that.
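
The pattern the warning catches, and the fix, look like this (a
generic sketch, not the exact helper.c hunk):

    #include <stdint.h>
    #include <string.h>

    static void clear_entries(uint32_t *entries, size_t nb_entries)
    {
        /* Wrong: clears only nb_entries bytes, i.e. a quarter of the array.
         * memset(entries, 0, nb_entries);
         */

        /* Right: scale by the element size. */
        memset(entries, 0, nb_entries * sizeof(*entries));
    }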

Cc: Stefan Weil <sw@weilnetz.de>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Cc: qemu-stable@nongnu.org
LP: https://bugs.launchpad.net/qemu/+bug/1577841
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
(cherry picked from commit 9d989c732b)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-04 16:27:40 -05:00
Roman Kagan
2cf1a1223b usb:xhci: no DMA on HC reset
This patch is a rough fix to a memory corruption we are observing when
running VMs with xhci USB controller and OVMF firmware.

Specifically, on the following call chain

xhci_reset
  xhci_disable_slot
    xhci_disable_ep
      xhci_set_ep_state

QEMU overwrites guest memory using stale guest addresses.

This doesn't happen when the guest (firmware) driver sets up xhci for
the first time, as there are no slots configured yet.  However, when
the firmware hands over control to the OS, some slots and endpoints
are already set up with their context in guest RAM.  Now the OS's
driver resets the controller again, and xhci_set_ep_state then reads
and writes that memory, which is now owned by the OS.

As a quick fix, skip calling xhci_set_ep_state in xhci_disable_ep if the
device context base address array pointer is zero (indicating we're in
the HC reset and no DMA is possible).
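
A hedged sketch of that guard, with invented field names for the
device context base address array pointer:

    #include <stdbool.h>
    #include <stdint.h>

    typedef struct {
        uint64_t dcbaa_ptr;   /* 0 while the host controller is in reset */
    } XhciSketch;

    /* If the pointer is zero we are inside the HC reset and must not read
     * or write any slot/endpoint context in guest memory. */
    static bool xhci_dma_allowed(const XhciSketch *x)
    {
        return x->dcbaa_ptr != 0;
    }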

Cc: qemu-stable@nongnu.org
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-id: 1462384435-1034-1-git-send-email-rkagan@virtuozzo.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 491d68d938)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-04 16:26:45 -05:00
Dominik Dingel
ea819be42b exec.c: Ensure right alignment also for file backed ram
While in the anonymous RAM case we already take care of the right
alignment, such an alignment guarantee does not exist for file-backed
RAM allocation.

Instead, pagesize is used for alignment. On s390 this is not enough for gmap,
as we need to satisfy an alignment up to segments.

Reported-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>

Message-Id: <1461585338-45863-1-git-send-email-dingel@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit d2f39add72)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-04 16:24:41 -05:00
Hemant Kumar
5a908cb1a8 tools: kvm_stat: Powerpc related fixes
kvm_stat script is failing to execute on powerpc :
 # ./kvm_stat
Traceback (most recent call last):
  File "./kvm_stat", line 825, in <module>
    main()
  File "./kvm_stat", line 813, in main
    providers = get_providers(options)
  File "./kvm_stat", line 778, in get_providers
    providers.append(TracepointProvider())
  File "./kvm_stat", line 416, in __init__
    self.filters = get_filters()
  File "./kvm_stat", line 315, in get_filters
    if ARCH.exit_reasons:
AttributeError: 'ArchPPC' object has no attribute 'exit_reasons'

This is because it is trying to access an attribute that is not defined.

Also, the IOCTL number of RESET is incorrect for powerpc. The correct
number has been added.

Signed-off-by: Hemant Kumar <hemant@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
* cherry-picked from linux commit c7d4fb5a
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-04 16:23:35 -05:00
Li Zhijian
07a3a482c3 vl: change runstate only if new state is different from current state
Previously, QEMU would abort in the following scenario:
(qemu) stop
(qemu) system_reset
(qemu) system_reset
(qemu) 2016-04-13T20:54:38.979158Z qemu-system-x86_64: invalid runstate transition: 'prelaunch' -> 'prelaunch'

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1460604352-18630-1-git-send-email-lizhijian@cn.fujitsu.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e92a2d9cb3)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-04 16:01:36 -05:00
Gerd Hoffmann
5b6c12e245 spice/gl: add & use qemu_spice_gl_monitor_config
Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
(cherry picked from commit 39414ef4e9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-04 16:00:18 -05:00
Prasad J Pandit
d00ba3fa9b i386: kvmvapic: initialise imm32 variable
When processing a Task Priority Register (TPR) access, the automatic
stack variable 'imm32' in patch_instruction() could be leaked.
Initialise the variable to avoid it.
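
The shape of the fix, in a generic sketch (the copy destination here
is hypothetical):

    #include <stdint.h>
    #include <string.h>

    static void patch_sketch(uint8_t *out)
    {
        uint32_t imm32 = 0;   /* previously left uninitialised */

        /* ... only some instruction forms assign imm32 ... */

        /* Without the initialisation, paths that skip the assignment would
         * copy stale stack contents out here. */
        memcpy(out, &imm32, sizeof(imm32));
    }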

Reported by: Donghai Zdh <donghai.zdh@alibaba-inc.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1460013608-16670-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>

(cherry picked from commit 691a02e2ce)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-08-04 15:52:54 -05:00
4073 changed files with 192020 additions and 553641 deletions


@@ -1,15 +0,0 @@
# http://editorconfig.org
root = true
[*]
end_of_line = lf
insert_final_newline = true
charset = utf-8
[Makefile*]
indent_style = tab
indent_size = 8
[*.{c,h}]
indent_style = space
indent_size = 4


@@ -1,8 +0,0 @@
# GDB may have ./.gdbinit loading disabled by default. In that case you can
# follow the instructions it prints. They boil down to adding the following to
# your home directory's ~/.gdbinit file:
#
# add-auto-load-safe-path /path/to/qemu/.gdbinit
# Load QEMU-specific sub-commands and settings
source scripts/qemu-gdb.py

.gitignore (vendored)

@@ -5,17 +5,20 @@
/config-target.*
/config.status
/config-temp
/trace-events-all
/trace/generated-tracers.h
/trace/generated-tracers.c
/trace/generated-tracers-dtrace.h
/trace/generated-tracers.dtrace
/trace/generated-events.h
/trace/generated-events.c
/trace/generated-helpers-wrappers.h
/trace/generated-helpers.h
/trace/generated-helpers.c
/trace/generated-tcg-tracers.h
/trace/generated-ust-provider.h
/trace/generated-ust.c
/ui/shader/texture-blit-frag.h
/ui/shader/texture-blit-vert.h
/ui/shader/texture-blit-flip-vert.h
/ui/input-keymap-*.c
*-timestamp
/*-softmmu
/*-darwin-user
@@ -35,8 +38,9 @@
/qmp-introspect.[ch]
/qmp-marshal.c
/qemu-doc.html
/qemu-tech.html
/qemu-doc.info
/qemu-doc.txt
/qemu-tech.info
/qemu-img
/qemu-nbd
/qemu-options.def
@@ -46,21 +50,16 @@
/qemu-io
/qemu-ga
/qemu-bridge-helper
/qemu-keymap
/qemu-monitor.texi
/qemu-monitor-info.texi
/qemu-version.h
/qemu-version.h.tmp
/module_block.h
/scsi/qemu-pr-helper
/vhost-user-scsi
/vhost-user-blk
/qmp-commands.txt
/vscclient
/fsdev/virtfs-proxy-helper
*.tmp
*.[1-9]
*.a
*.aux
*.cp
*.dvi
*.exe
*.msi
*.dll
@@ -82,6 +81,10 @@
*.d
!/scripts/qemu-guest-agent/fsfreeze-hook.d
*.o
*.lo
*.la
*.pc
.libs
.sdk
*.gcda
*.gcno
@@ -91,10 +94,6 @@
/pc-bios/optionrom/linuxboot.bin
/pc-bios/optionrom/linuxboot.raw
/pc-bios/optionrom/linuxboot.img
/pc-bios/optionrom/linuxboot_dma.asm
/pc-bios/optionrom/linuxboot_dma.bin
/pc-bios/optionrom/linuxboot_dma.raw
/pc-bios/optionrom/linuxboot_dma.img
/pc-bios/optionrom/multiboot.asm
/pc-bios/optionrom/multiboot.bin
/pc-bios/optionrom/multiboot.raw
@@ -105,38 +104,8 @@
/pc-bios/optionrom/kvmvapic.img
/pc-bios/s390-ccw/s390-ccw.elf
/pc-bios/s390-ccw/s390-ccw.img
/docs/interop/qemu-ga-qapi.texi
/docs/interop/qemu-ga-ref.html
/docs/interop/qemu-ga-ref.info*
/docs/interop/qemu-ga-ref.txt
/docs/interop/qemu-qmp-qapi.texi
/docs/interop/qemu-qmp-ref.html
/docs/interop/qemu-qmp-ref.info*
/docs/interop/qemu-qmp-ref.txt
/docs/version.texi
*.tps
.stgit-*
.git-submodule-status
cscope.*
tags
TAGS
docker-src.*
*~
*.ast_raw
*.depend_raw
trace.h
trace.c
trace-ust.h
trace-ust.h
trace-dtrace.h
trace-dtrace.dtrace
trace-root.h
trace-root.c
trace-ust-root.h
trace-ust-root.h
trace-ust-all.h
trace-ust-all.c
trace-dtrace-root.h
trace-dtrace-root.dtrace
trace-ust-all.h
trace-ust-all.c

.gitmodules (vendored)

@@ -22,24 +22,12 @@
[submodule "roms/sgabios"]
path = roms/sgabios
url = git://git.qemu-project.org/sgabios.git
[submodule "pixman"]
path = pixman
url = git://anongit.freedesktop.org/pixman
[submodule "dtc"]
path = dtc
url = git://git.qemu-project.org/dtc.git
[submodule "roms/u-boot"]
path = roms/u-boot
url = git://git.qemu-project.org/u-boot.git
[submodule "roms/skiboot"]
path = roms/skiboot
url = git://git.qemu.org/skiboot.git
[submodule "roms/QemuMacDrivers"]
path = roms/QemuMacDrivers
url = git://git.qemu.org/QemuMacDrivers.git
[submodule "ui/keycodemapdb"]
path = ui/keycodemapdb
url = git://git.qemu.org/keycodemapdb.git
[submodule "capstone"]
path = capstone
url = git://git.qemu.org/capstone.git
[submodule "roms/seabios-hppa"]
path = roms/seabios-hppa
url = git://github.com/hdeller/seabios-hppa.git


@@ -8,17 +8,10 @@ Aurelien Jarno <aurelien@aurel32.net> aurel32 <aurel32@c046a42c-6fe2-441c-8c8c-7
Blue Swirl <blauwirbel@gmail.com> blueswir1 <blueswir1@c046a42c-6fe2-441c-8c8c-71466251a162>
Edgar E. Iglesias <edgar.iglesias@gmail.com> edgar_igl <edgar_igl@c046a42c-6fe2-441c-8c8c-71466251a162>
Fabrice Bellard <fabrice@bellard.org> bellard <bellard@c046a42c-6fe2-441c-8c8c-71466251a162>
James Hogan <jhogan@kernel.org> <james.hogan@imgtec.com>
Jocelyn Mayer <l_indien@magic.fr> j_mayer <j_mayer@c046a42c-6fe2-441c-8c8c-71466251a162>
Paul Brook <paul@codesourcery.com> pbrook <pbrook@c046a42c-6fe2-441c-8c8c-71466251a162>
Paul Burton <paul.burton@mips.com> <paul.burton@imgtec.com>
Paul Burton <paul.burton@mips.com> <paul@archlinuxmips.org>
Thiemo Seufer <ths@networkno.de> ths <ths@c046a42c-6fe2-441c-8c8c-71466251a162>
malc <av1474@comtv.ru> malc <malc@c046a42c-6fe2-441c-8c8c-71466251a162>
# There is also a:
# (no author) <(no author)@c046a42c-6fe2-441c-8c8c-71466251a162>
# for the cvs2svn initialization commit e63c3dc74bf.
#
# Also list preferred name forms where people have changed their
# git author config
Daniel P. Berrangé <berrange@redhat.com>


@@ -1,49 +0,0 @@
language: c
git:
submodules: false
env:
global:
- LC_ALL=C
matrix:
- IMAGE=debian-amd64
TARGET_LIST=x86_64-softmmu,x86_64-linux-user
- IMAGE=debian-win32-cross
TARGET_LIST=arm-softmmu,i386-softmmu,lm32-softmmu
- IMAGE=debian-win64-cross
TARGET_LIST=aarch64-softmmu,sparc64-softmmu,x86_64-softmmu
- IMAGE=debian-armel-cross
TARGET_LIST=arm-softmmu,arm-linux-user,armeb-linux-user
- IMAGE=debian-armhf-cross
TARGET_LIST=arm-softmmu,arm-linux-user,armeb-linux-user
- IMAGE=debian-arm64-cross
TARGET_LIST=aarch64-softmmu,aarch64-linux-user
- IMAGE=debian-s390x-cross
TARGET_LIST=s390x-softmmu,s390x-linux-user
- IMAGE=debian-mips-cross
TARGET_LIST=mips-softmmu,mipsel-linux-user
- IMAGE=debian-mips64el-cross
TARGET_LIST=mips64el-softmmu,mips64el-linux-user
- IMAGE=debian-powerpc-cross
TARGET_LIST=ppc-softmmu,ppcemb-softmmu,ppc-linux-user
- IMAGE=debian-ppc64el-cross
TARGET_LIST=ppc64-softmmu,ppc64-linux-user,ppc64abi32-linux-user
build:
pre_ci:
- make docker-image-${IMAGE} V=1
pre_ci_boot:
image_name: qemu
image_tag: ${IMAGE}
pull: false
options: "-e HOME=/root"
ci:
- unset CC
# some targets require newer up to date packages, for example TARGET_LIST matching
# aarch64*-softmmu|arm*-softmmu|ppc*-softmmu|microblaze*-softmmu|mips64el-softmmu)
# see the configure script:
# error_exit "DTC (libfdt) version >= 1.4.2 not present. Your options:"
# " (1) Preferred: Install the DTC (libfdt) devel package"
# " (2) Fetch the DTC submodule, using:"
# " git submodule update --init dtc"
- dpkg --compare-versions `dpkg-query --showformat='${Version}' --show libfdt-dev` ge 1.4.2 || git submodule update --init dtc
- ./configure ${QEMU_CONFIGURE_OPTS} --target-list=${TARGET_LIST}
- make -j$(($(getconf _NPROCESSORS_ONLN) + 1))


@@ -1,25 +1,23 @@
sudo: false
language: c
python:
- "2.6"
- "2.4"
compiler:
- gcc
- clang
cache: ccache
addons:
apt:
packages:
# Build dependencies
- libaio-dev
- libattr1-dev
- libbrlapi-dev
- libcap-ng-dev
- libgcc-4.8-dev
- libgnutls-dev
- libgtk-3-dev
- libiscsi-dev
- liblttng-ust-dev
- libncurses5-dev
- libnfs-dev
- libnss3-dev
- libpixman-1-dev
- libpng12-dev
@@ -35,26 +33,22 @@ addons:
- sparse
- uuid-dev
# The channel name "irc.oftc.net#qemu" is encrypted against qemu/qemu
# to prevent IRC notifications from forks. This was created using:
# $ travis encrypt -r "qemu/qemu" "irc.oftc.net#qemu"
notifications:
irc:
channels:
- secure: "F7GDRgjuOo5IUyRLqSkmDL7kvdU4UcH3Lm/W2db2JnDHTGCqgEdaYEYKciyCLZ57vOTsTsOgesN8iUT7hNHBd1KWKjZe9KDTZWppWRYVwAwQMzVeSOsbbU4tRoJ6Pp+3qhH1Z0eGYR9ZgKYAoTumDFgSAYRp4IscKS8jkoedOqM="
- "irc.oftc.net#qemu"
on_success: change
on_failure: always
env:
global:
- TEST_CMD="make check"
- MAKEFLAGS="-j3"
matrix:
- CONFIG=""
- CONFIG="--enable-debug --enable-debug-tcg --enable-trace-backends=log"
- CONFIG="--disable-linux-aio --disable-cap-ng --disable-attr --disable-brlapi --disable-uuid --disable-libusb"
- CONFIG="--enable-modules --disable-linux-user"
- CONFIG="--with-coroutine=ucontext --disable-linux-user"
- CONFIG="--with-coroutine=sigaltstack --disable-linux-user"
- CONFIG="--enable-modules"
- CONFIG="--with-coroutine=ucontext"
- CONFIG="--with-coroutine=sigaltstack"
git:
# we want to do this ourselves
submodules: false
@@ -66,12 +60,12 @@ before_install:
before_script:
- ./configure ${CONFIG}
script:
- make ${MAKEFLAGS} && ${TEST_CMD}
- make -j3 && ${TEST_CMD}
matrix:
include:
# Test with CLang for compile portability
- env: CONFIG=""
compiler: clang
# Sparse is GCC only
- env: CONFIG="--enable-sparse"
compiler: gcc
# gprof/gcov are GCC features
- env: CONFIG="--enable-gprof --enable-gcov --disable-pie"
compiler: gcc
@@ -88,116 +82,9 @@ matrix:
- env: CONFIG="--enable-trace-backends=ust"
TEST_CMD=""
compiler: gcc
- env: CONFIG="--disable-tcg"
- env: CONFIG="--with-coroutine=gthread"
TEST_CMD=""
compiler: gcc
- env: CONFIG=""
os: osx
compiler: clang
# Plain Trusty System Build
- env: CONFIG="--disable-linux-user"
sudo: required
addons:
dist: trusty
compiler: gcc
before_install:
- sudo apt-get update -qq
- sudo apt-get build-dep -qq qemu
- wget -O - http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar -xvJ
- git submodule update --init --recursive
# Plain Trusty Linux User Build
- env: CONFIG="--disable-system"
sudo: required
addons:
dist: trusty
compiler: gcc
before_install:
- sudo apt-get update -qq
- sudo apt-get build-dep -qq qemu
- wget -O - http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar -xvJ
- git submodule update --init --recursive
# Trusty System build with latest stable clang & python 3.0
- sudo: required
addons:
dist: trusty
language: generic
compiler: none
python:
- "3.0"
env:
- COMPILER_NAME=clang CXX=clang++-3.9 CC=clang-3.9
- CONFIG="--disable-linux-user --cc=clang-3.9 --cxx=clang++-3.9 --python=/usr/bin/python3"
before_install:
- wget -nv -O - http://llvm.org/apt/llvm-snapshot.gpg.key | sudo apt-key add -
- sudo apt-add-repository -y 'deb http://llvm.org/apt/trusty llvm-toolchain-trusty-3.9 main'
- sudo apt-get update -qq
- sudo apt-get install -qq -y clang-3.9
- sudo apt-get build-dep -qq qemu
- wget -O - http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar -xvJ
- git submodule update --init --recursive
before_script:
- ./configure ${CONFIG} || cat config.log
# Trusty Linux User build with latest stable clang & python 3.6
- sudo: required
addons:
dist: trusty
language: generic
compiler: none
python:
- "3.6"
env:
- COMPILER_NAME=clang CXX=clang++-3.9 CC=clang-3.9
- CONFIG="--disable-system --cc=clang-3.9 --cxx=clang++-3.9 --python=/usr/bin/python3"
before_install:
- wget -nv -O - http://llvm.org/apt/llvm-snapshot.gpg.key | sudo apt-key add -
- sudo apt-add-repository -y 'deb http://llvm.org/apt/trusty llvm-toolchain-trusty-3.9 main'
- sudo apt-get update -qq
- sudo apt-get install -qq -y clang-3.9
- sudo apt-get build-dep -qq qemu
- wget -O - http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar -xvJ
- git submodule update --init --recursive
before_script:
- ./configure ${CONFIG} || cat config.log
# Using newer GCC with sanitizers
- addons:
apt:
sources:
# PPAs for newer toolchains
- ubuntu-toolchain-r-test
packages:
# Extra toolchains
- gcc-5
- g++-5
# Build dependencies
- libaio-dev
- libattr1-dev
- libbrlapi-dev
- libcap-ng-dev
- libgnutls-dev
- libgtk-3-dev
- libiscsi-dev
- liblttng-ust-dev
- libnfs-dev
- libncurses5-dev
- libnss3-dev
- libpixman-1-dev
- libpng12-dev
- librados-dev
- libsdl1.2-dev
- libseccomp-dev
- libspice-protocol-dev
- libspice-server-dev
- libssh2-1-dev
- liburcu-dev
- libusb-1.0-0-dev
- libvte-2.90-dev
- sparse
- uuid-dev
language: generic
compiler: none
env:
- COMPILER_NAME=gcc CXX=g++-5 CC=gcc-5
- CONFIG="--cc=gcc-5 --cxx=g++-5 --disable-pie --disable-linux-user"
- TEST_CMD=""
before_script:
- ./configure ${CONFIG} --extra-cflags="-g3 -O0 -fsanitize=thread -fuse-ld=gold" || cat config.log


@@ -9,7 +9,7 @@ patches before submitting.
Of course, the most important aspect in any coding style is whitespace.
Crusty old coders who have trouble spotting the glasses on their noses
can tell the difference between a tab and eight spaces from a distance
of approximately fifteen parsecs. Many a flamewar has been fought and
of approximately fifteen parsecs. Many a flamewar have been fought and
lost on this issue.
QEMU indents are four spaces. Tabs are never used, except in Makefiles
@@ -31,11 +31,7 @@ Do not leave whitespace dangling off the ends of lines.
2. Line width
Lines should be 80 characters; try not to make them longer.
Sometimes it is hard to do, especially when dealing with QEMU subsystems
that use long function or symbol names. Even in that case, do not make
lines much longer than 80 characters.
Lines are 80 characters; not longer.
Rationale:
- Some people like to tile their 24" screens with a 6x4 matrix of 80x24
@@ -43,8 +39,6 @@ Rationale:
let them keep doing it.
- Code and especially patches is much more readable if limited to a sane
line length. Eighty is traditional.
- The four-space indentation makes the most common excuse ("But look
at all that white space on the left!") moot.
- It is the QEMU coding style.
3. Naming
@@ -116,45 +110,3 @@ if (a == 1) {
Rationale: Yoda conditions (as in 'if (1 == a)') are awkward to read.
Besides, good compilers already warn users when '==' is mis-typed as '=',
even when the constant is on the right.
7. Comment style
We use traditional C-style /* */ comments and avoid // comments.
Rationale: The // form is valid in C99, so this is purely a matter of
consistency of style. The checkpatch script will warn you about this.
8. trace-events style
8.1 0x prefix
In trace-events files, use a '0x' prefix to specify hex numbers, as in:
some_trace(unsigned x, uint64_t y) "x 0x%x y 0x" PRIx64
An exception is made for groups of numbers that are hexadecimal by
convention and separated by the symbols '.', '/', ':', or ' ' (such as
PCI bus id):
another_trace(int cssid, int ssid, int dev_num) "bus id: %x.%x.%04x"
However, you can use '0x' for such groups if you want. Anyway, be sure that
it is obvious that numbers are in hex, ex.:
data_dump(uint8_t c1, uint8_t c2, uint8_t c3) "bytes (in hex): %02x %02x %02x"
Rationale: hex numbers are hard to read in logs when there is no 0x prefix,
especially when (occasionally) the representation doesn't contain any letters
and especially in one line with other decimal numbers. Number groups are allowed
to not use '0x' because for some things notations like %x.%x.%x are used not
only in Qemu. Also dumping raw data bytes with '0x' is less readable.
8.2 '#' printf flag
Do not use printf flag '#', like '%#x'.
Rationale: there are two ways to add a '0x' prefix to printed number: '0x%...'
and '%#...'. For consistency the only one way should be used. Arguments for
'0x%' are:
- it is more popular
- '%#' omits the 0x for the value 0 which makes output inconsistent


@@ -1,270 +0,0 @@
A. HISTORY OF THE SOFTWARE
==========================
Python was created in the early 1990s by Guido van Rossum at Stichting
Mathematisch Centrum (CWI, see http://www.cwi.nl) in the Netherlands
as a successor of a language called ABC. Guido remains Python's
principal author, although it includes many contributions from others.
In 1995, Guido continued his work on Python at the Corporation for
National Research Initiatives (CNRI, see http://www.cnri.reston.va.us)
in Reston, Virginia where he released several versions of the
software.
In May 2000, Guido and the Python core development team moved to
BeOpen.com to form the BeOpen PythonLabs team. In October of the same
year, the PythonLabs team moved to Digital Creations (now Zope
Corporation, see http://www.zope.com). In 2001, the Python Software
Foundation (PSF, see http://www.python.org/psf/) was formed, a
non-profit organization created specifically to own Python-related
Intellectual Property. Zope Corporation is a sponsoring member of
the PSF.
All Python releases are Open Source (see http://www.opensource.org for
the Open Source Definition). Historically, most, but not all, Python
releases have also been GPL-compatible; the table below summarizes
the various releases.
Release Derived Year Owner GPL-
from compatible? (1)
0.9.0 thru 1.2 1991-1995 CWI yes
1.3 thru 1.5.2 1.2 1995-1999 CNRI yes
1.6 1.5.2 2000 CNRI no
2.0 1.6 2000 BeOpen.com no
1.6.1 1.6 2001 CNRI yes (2)
2.1 2.0+1.6.1 2001 PSF no
2.0.1 2.0+1.6.1 2001 PSF yes
2.1.1 2.1+2.0.1 2001 PSF yes
2.2 2.1.1 2001 PSF yes
2.1.2 2.1.1 2002 PSF yes
2.1.3 2.1.2 2002 PSF yes
2.2.1 2.2 2002 PSF yes
2.2.2 2.2.1 2002 PSF yes
2.2.3 2.2.2 2003 PSF yes
2.3 2.2.2 2002-2003 PSF yes
2.3.1 2.3 2002-2003 PSF yes
2.3.2 2.3.1 2002-2003 PSF yes
2.3.3 2.3.2 2002-2003 PSF yes
2.3.4 2.3.3 2004 PSF yes
2.3.5 2.3.4 2005 PSF yes
2.4 2.3 2004 PSF yes
2.4.1 2.4 2005 PSF yes
2.4.2 2.4.1 2005 PSF yes
2.4.3 2.4.2 2006 PSF yes
2.5 2.4 2006 PSF yes
2.7 2.6 2010 PSF yes
Footnotes:
(1) GPL-compatible doesn't mean that we're distributing Python under
the GPL. All Python licenses, unlike the GPL, let you distribute
a modified version without making your changes open source. The
GPL-compatible licenses make it possible to combine Python with
other software that is released under the GPL; the others don't.
(2) According to Richard Stallman, 1.6.1 is not GPL-compatible,
because its license has a choice of law clause. According to
CNRI, however, Stallman's lawyer has told CNRI's lawyer that 1.6.1
is "not incompatible" with the GPL.
Thanks to the many outside volunteers who have worked under Guido's
direction to make these releases possible.
B. TERMS AND CONDITIONS FOR ACCESSING OR OTHERWISE USING PYTHON
===============================================================
PYTHON SOFTWARE FOUNDATION LICENSE VERSION 2
--------------------------------------------
1. This LICENSE AGREEMENT is between the Python Software Foundation
("PSF"), and the Individual or Organization ("Licensee") accessing and
otherwise using this software ("Python") in source or binary form and
its associated documentation.
2. Subject to the terms and conditions of this License Agreement, PSF
hereby grants Licensee a nonexclusive, royalty-free, world-wide
license to reproduce, analyze, test, perform and/or display publicly,
prepare derivative works, distribute, and otherwise use Python
alone or in any derivative version, provided, however, that PSF's
License Agreement and PSF's notice of copyright, i.e., "Copyright (c)
2001, 2002, 2003, 2004, 2005, 2006 Python Software Foundation; All Rights
Reserved" are retained in Python alone or in any derivative version
prepared by Licensee.
3. In the event Licensee prepares a derivative work that is based on
or incorporates Python or any part thereof, and wants to make
the derivative work available to others as provided herein, then
Licensee hereby agrees to include in any such work a brief summary of
the changes made to Python.
4. PSF is making Python available to Licensee on an "AS IS"
basis. PSF MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, PSF MAKES NO AND
DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON WILL NOT
INFRINGE ANY THIRD PARTY RIGHTS.
5. PSF SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON
FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS
A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON,
OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
6. This License Agreement will automatically terminate upon a material
breach of its terms and conditions.
7. Nothing in this License Agreement shall be deemed to create any
relationship of agency, partnership, or joint venture between PSF and
Licensee. This License Agreement does not grant permission to use PSF
trademarks or trade name in a trademark sense to endorse or promote
products or services of Licensee, or any third party.
8. By copying, installing or otherwise using Python, Licensee
agrees to be bound by the terms and conditions of this License
Agreement.
BEOPEN.COM LICENSE AGREEMENT FOR PYTHON 2.0
-------------------------------------------
BEOPEN PYTHON OPEN SOURCE LICENSE AGREEMENT VERSION 1
1. This LICENSE AGREEMENT is between BeOpen.com ("BeOpen"), having an
office at 160 Saratoga Avenue, Santa Clara, CA 95051, and the
Individual or Organization ("Licensee") accessing and otherwise using
this software in source or binary form and its associated
documentation ("the Software").
2. Subject to the terms and conditions of this BeOpen Python License
Agreement, BeOpen hereby grants Licensee a non-exclusive,
royalty-free, world-wide license to reproduce, analyze, test, perform
and/or display publicly, prepare derivative works, distribute, and
otherwise use the Software alone or in any derivative version,
provided, however, that the BeOpen Python License is retained in the
Software, alone or in any derivative version prepared by Licensee.
3. BeOpen is making the Software available to Licensee on an "AS IS"
basis. BEOPEN MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, BEOPEN MAKES NO AND
DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF THE SOFTWARE WILL NOT
INFRINGE ANY THIRD PARTY RIGHTS.
4. BEOPEN SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF THE
SOFTWARE FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS
AS A RESULT OF USING, MODIFYING OR DISTRIBUTING THE SOFTWARE, OR ANY
DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
5. This License Agreement will automatically terminate upon a material
breach of its terms and conditions.
6. This License Agreement shall be governed by and interpreted in all
respects by the law of the State of California, excluding conflict of
law provisions. Nothing in this License Agreement shall be deemed to
create any relationship of agency, partnership, or joint venture
between BeOpen and Licensee. This License Agreement does not grant
permission to use BeOpen trademarks or trade names in a trademark
sense to endorse or promote products or services of Licensee, or any
third party. As an exception, the "BeOpen Python" logos available at
http://www.pythonlabs.com/logos.html may be used according to the
permissions granted on that web page.
7. By copying, installing or otherwise using the software, Licensee
agrees to be bound by the terms and conditions of this License
Agreement.
CNRI LICENSE AGREEMENT FOR PYTHON 1.6.1
---------------------------------------
1. This LICENSE AGREEMENT is between the Corporation for National
Research Initiatives, having an office at 1895 Preston White Drive,
Reston, VA 20191 ("CNRI"), and the Individual or Organization
("Licensee") accessing and otherwise using Python 1.6.1 software in
source or binary form and its associated documentation.
2. Subject to the terms and conditions of this License Agreement, CNRI
hereby grants Licensee a nonexclusive, royalty-free, world-wide
license to reproduce, analyze, test, perform and/or display publicly,
prepare derivative works, distribute, and otherwise use Python 1.6.1
alone or in any derivative version, provided, however, that CNRI's
License Agreement and CNRI's notice of copyright, i.e., "Copyright (c)
1995-2001 Corporation for National Research Initiatives; All Rights
Reserved" are retained in Python 1.6.1 alone or in any derivative
version prepared by Licensee. Alternately, in lieu of CNRI's License
Agreement, Licensee may substitute the following text (omitting the
quotes): "Python 1.6.1 is made available subject to the terms and
conditions in CNRI's License Agreement. This Agreement together with
Python 1.6.1 may be located on the Internet using the following
unique, persistent identifier (known as a handle): 1895.22/1013. This
Agreement may also be obtained from a proxy server on the Internet
using the following URL: http://hdl.handle.net/1895.22/1013".
3. In the event Licensee prepares a derivative work that is based on
or incorporates Python 1.6.1 or any part thereof, and wants to make
the derivative work available to others as provided herein, then
Licensee hereby agrees to include in any such work a brief summary of
the changes made to Python 1.6.1.
4. CNRI is making Python 1.6.1 available to Licensee on an "AS IS"
basis. CNRI MAKES NO REPRESENTATIONS OR WARRANTIES, EXPRESS OR
IMPLIED. BY WAY OF EXAMPLE, BUT NOT LIMITATION, CNRI MAKES NO AND
DISCLAIMS ANY REPRESENTATION OR WARRANTY OF MERCHANTABILITY OR FITNESS
FOR ANY PARTICULAR PURPOSE OR THAT THE USE OF PYTHON 1.6.1 WILL NOT
INFRINGE ANY THIRD PARTY RIGHTS.
5. CNRI SHALL NOT BE LIABLE TO LICENSEE OR ANY OTHER USERS OF PYTHON
1.6.1 FOR ANY INCIDENTAL, SPECIAL, OR CONSEQUENTIAL DAMAGES OR LOSS AS
A RESULT OF MODIFYING, DISTRIBUTING, OR OTHERWISE USING PYTHON 1.6.1,
OR ANY DERIVATIVE THEREOF, EVEN IF ADVISED OF THE POSSIBILITY THEREOF.
6. This License Agreement will automatically terminate upon a material
breach of its terms and conditions.
7. This License Agreement shall be governed by the federal
intellectual property law of the United States, including without
limitation the federal copyright law, and, to the extent such
U.S. federal law does not apply, by the law of the Commonwealth of
Virginia, excluding Virginia's conflict of law provisions.
Notwithstanding the foregoing, with regard to derivative works based
on Python 1.6.1 that incorporate non-separable material that was
previously distributed under the GNU General Public License (GPL), the
law of the Commonwealth of Virginia shall govern this License
Agreement only as to issues arising under or with respect to
Paragraphs 4, 5, and 7 of this License Agreement. Nothing in this
License Agreement shall be deemed to create any relationship of
agency, partnership, or joint venture between CNRI and Licensee. This
License Agreement does not grant permission to use CNRI trademarks or
trade name in a trademark sense to endorse or promote products or
services of Licensee, or any third party.
8. By clicking on the "ACCEPT" button where indicated, or by copying,
installing or otherwise using Python 1.6.1, Licensee agrees to be
bound by the terms and conditions of this License Agreement.
ACCEPT
CWI LICENSE AGREEMENT FOR PYTHON 0.9.0 THROUGH 1.2
--------------------------------------------------
Copyright (c) 1991 - 1995, Stichting Mathematisch Centrum Amsterdam,
The Netherlands. All rights reserved.
Permission to use, copy, modify, and distribute this software and its
documentation for any purpose and without fee is hereby granted,
provided that the above copyright notice appear in all copies and that
both that copyright notice and this permission notice appear in
supporting documentation, and that the name of Stichting Mathematisch
Centrum or CWI not be used in advertising or publicity pertaining to
distribution of the software without specific, written prior
permission.
STICHTING MATHEMATISCH CENTRUM DISCLAIMS ALL WARRANTIES WITH REGARD TO
THIS SOFTWARE, INCLUDING ALL IMPLIED WARRANTIES OF MERCHANTABILITY AND
FITNESS, IN NO EVENT SHALL STICHTING MATHEMATISCH CENTRUM BE LIABLE
FOR ANY SPECIAL, INDIRECT OR CONSEQUENTIAL DAMAGES OR ANY DAMAGES
WHATSOEVER RESULTING FROM LOSS OF USE, DATA OR PROFITS, WHETHER IN AN
ACTION OF CONTRACT, NEGLIGENCE OR OTHER TORTIOUS ACTION, ARISING OUT
OF OR IN CONNECTION WITH THE USE OR PERFORMANCE OF THIS SOFTWARE.


@@ -1,6 +1,6 @@
This file documents changes for QEMU releases 0.12 and earlier.
For changelog information for later releases, see
https://wiki.qemu.org/ChangeLog or look at the git history for
http://wiki.qemu-project.org/ChangeLog or look at the git history for
more detailed information.

HACKING

@@ -1,28 +1,10 @@
1. Preprocessor
1.1. Variadic macros
For variadic macros, stick with this C99-like syntax:
#define DPRINTF(fmt, ...) \
do { printf("IRQ: " fmt, ## __VA_ARGS__); } while (0)
1.2. Include directives
Order include directives as follows:
#include "qemu/osdep.h" /* Always first... */
#include <...> /* then system headers... */
#include "..." /* and finally QEMU headers. */
The "qemu/osdep.h" header contains preprocessor macros that affect the behavior
of core system headers like <stdint.h>. It must be the first include so that
core system headers included by external libraries get the preprocessor macros
that QEMU depends on.
Do not include "qemu/osdep.h" from header files since the .c file will have
already included it.
2. C types
It should be common sense to use the right type, but we have collected
@@ -176,10 +158,6 @@ painful. These are:
* you may assume that right shift of a signed integer duplicates
the sign bit (ie it is an arithmetic shift, not a logical shift)
In addition, QEMU assumes that the compiler does not use the latitude
given in C99 and C11 to treat aspects of signed '<<' as undefined, as
documented in the GNU Compiler Collection manual starting at version 4.0.
7. Error handling and reporting
7.1 Reporting errors to the human user

File diff suppressed because it is too large.

Makefile

@@ -6,13 +6,7 @@ BUILD_DIR=$(CURDIR)
# Before including a proper config-host.mak, assume we are in the source tree
SRC_PATH=.
UNCHECKED_GOALS := %clean TAGS cscope ctags dist \
html info pdf txt \
help check-help print-% \
docker docker-% vm-test vm-build-%
print-%:
@echo '$*=$($*)'
UNCHECKED_GOALS := %clean TAGS cscope ctags
# All following code might depend on configuration variables
ifneq ($(wildcard config-host.mak),)
@@ -20,56 +14,24 @@ ifneq ($(wildcard config-host.mak),)
all:
include config-host.mak
PYTHON_UTF8 = LC_ALL= LANG=C LC_CTYPE=en_US.UTF-8 $(PYTHON)
git-submodule-update:
.PHONY: git-submodule-update
git_module_status := $(shell \
cd '$(SRC_PATH)' && \
GIT="$(GIT)" ./scripts/git-submodule.sh status $(GIT_SUBMODULES); \
echo $$?; \
)
ifeq (1,$(git_module_status))
ifeq (no,$(GIT_UPDATE))
git-submodule-update:
$(call quiet-command, \
echo && \
echo "GIT submodule checkout is out of date. Please run" && \
echo " scripts/git-submodule.sh update $(GIT_SUBMODULES)" && \
echo "from the source directory checkout $(SRC_PATH)" && \
echo && \
exit 1)
else
git-submodule-update:
$(call quiet-command, \
(cd $(SRC_PATH) && GIT="$(GIT)" ./scripts/git-submodule.sh update $(GIT_SUBMODULES)), \
"GIT","$(GIT_SUBMODULES)")
endif
endif
.git-submodule-status: git-submodule-update config-host.mak
# Check that we're not trying to do an out-of-tree build from
# a tree that's been used for an in-tree build.
ifneq ($(realpath $(SRC_PATH)),$(realpath .))
ifneq ($(wildcard $(SRC_PATH)/config-host.mak),)
$(error This is an out of tree build but your source tree ($(SRC_PATH)) \
seems to have been used for an in-tree build. You can fix this by running \
"$(MAKE) distclean && rm -rf *-linux-user *-softmmu" in your source tree)
"make distclean && rm -rf *-linux-user *-softmmu" in your source tree)
endif
endif
CONFIG_SOFTMMU := $(if $(filter %-softmmu,$(TARGET_DIRS)),y)
CONFIG_USER_ONLY := $(if $(filter %-user,$(TARGET_DIRS)),y)
CONFIG_XEN := $(CONFIG_XEN_BACKEND)
CONFIG_ALL=y
-include config-all-devices.mak
-include config-all-disas.mak
config-host.mak: $(SRC_PATH)/configure $(SRC_PATH)/pc-bios
include $(SRC_PATH)/rules.mak
config-host.mak: $(SRC_PATH)/configure
@echo $@ is out-of-date, running configure
@# TODO: The next lines include code which supports a smooth
@# transition from old configurations without config.status.
@@ -87,193 +49,39 @@ ifneq ($(filter-out $(UNCHECKED_GOALS),$(MAKECMDGOALS)),$(if $(MAKECMDGOALS),,fa
endif
endif
include $(SRC_PATH)/rules.mak
GENERATED_HEADERS = config-host.h qemu-options.def
GENERATED_HEADERS += qmp-commands.h qapi-types.h qapi-visit.h qapi-event.h
GENERATED_SOURCES += qmp-marshal.c qapi-types.c qapi-visit.c qapi-event.c
GENERATED_HEADERS += qmp-introspect.h
GENERATED_SOURCES += qmp-introspect.c
GENERATED_FILES = qemu-version.h config-host.h qemu-options.def
GENERATED_FILES += qmp-commands.h qapi-types.h qapi-visit.h qapi-event.h
GENERATED_FILES += qmp-marshal.c qapi-types.c qapi-visit.c qapi-event.c
GENERATED_FILES += qmp-introspect.h
GENERATED_FILES += qmp-introspect.c
GENERATED_HEADERS += trace/generated-events.h
GENERATED_SOURCES += trace/generated-events.c
GENERATED_FILES += trace/generated-tcg-tracers.h
GENERATED_FILES += trace/generated-helpers-wrappers.h
GENERATED_FILES += trace/generated-helpers.h
GENERATED_FILES += trace/generated-helpers.c
ifdef CONFIG_TRACE_UST
GENERATED_FILES += trace-ust-all.h
GENERATED_FILES += trace-ust-all.c
GENERATED_HEADERS += trace/generated-tracers.h
ifeq ($(findstring dtrace,$(TRACE_BACKENDS)),dtrace)
GENERATED_HEADERS += trace/generated-tracers-dtrace.h
endif
GENERATED_SOURCES += trace/generated-tracers.c
GENERATED_FILES += module_block.h
GENERATED_HEADERS += trace/generated-tcg-tracers.h
TRACE_HEADERS = trace-root.h $(trace-events-subdirs:%=%/trace.h)
TRACE_SOURCES = trace-root.c $(trace-events-subdirs:%=%/trace.c)
TRACE_DTRACE =
ifdef CONFIG_TRACE_DTRACE
TRACE_HEADERS += trace-dtrace-root.h $(trace-events-subdirs:%=%/trace-dtrace.h)
TRACE_DTRACE += trace-dtrace-root.dtrace $(trace-events-subdirs:%=%/trace-dtrace.dtrace)
GENERATED_HEADERS += trace/generated-helpers-wrappers.h
GENERATED_HEADERS += trace/generated-helpers.h
GENERATED_SOURCES += trace/generated-helpers.c
ifeq ($(findstring ust,$(TRACE_BACKENDS)),ust)
GENERATED_HEADERS += trace/generated-ust-provider.h
GENERATED_SOURCES += trace/generated-ust.c
endif
ifdef CONFIG_TRACE_UST
TRACE_HEADERS += trace-ust-root.h $(trace-events-subdirs:%=%/trace-ust.h)
endif
GENERATED_FILES += $(TRACE_HEADERS)
GENERATED_FILES += $(TRACE_SOURCES)
GENERATED_FILES += $(BUILD_DIR)/trace-events-all
GENERATED_FILES += .git-submodule-status
trace-group-name = $(shell dirname $1 | sed -e 's/[^a-zA-Z0-9]/_/g')
tracetool-y = $(SRC_PATH)/scripts/tracetool.py
tracetool-y += $(shell find $(SRC_PATH)/scripts/tracetool -name "*.py")
%/trace.h: %/trace.h-timestamp
@cmp $< $@ >/dev/null 2>&1 || cp $< $@
%/trace.h-timestamp: $(SRC_PATH)/%/trace-events $(tracetool-y)
$(call quiet-command,$(TRACETOOL) \
--group=$(call trace-group-name,$@) \
--format=h \
--backends=$(TRACE_BACKENDS) \
$< > $@,"GEN","$(@:%-timestamp=%)")
%/trace.c: %/trace.c-timestamp
@cmp $< $@ >/dev/null 2>&1 || cp $< $@
%/trace.c-timestamp: $(SRC_PATH)/%/trace-events $(tracetool-y)
$(call quiet-command,$(TRACETOOL) \
--group=$(call trace-group-name,$@) \
--format=c \
--backends=$(TRACE_BACKENDS) \
$< > $@,"GEN","$(@:%-timestamp=%)")
%/trace-ust.h: %/trace-ust.h-timestamp
@cmp $< $@ >/dev/null 2>&1 || cp $< $@
%/trace-ust.h-timestamp: $(SRC_PATH)/%/trace-events $(tracetool-y)
$(call quiet-command,$(TRACETOOL) \
--group=$(call trace-group-name,$@) \
--format=ust-events-h \
--backends=$(TRACE_BACKENDS) \
$< > $@,"GEN","$(@:%-timestamp=%)")
%/trace-dtrace.dtrace: %/trace-dtrace.dtrace-timestamp
@cmp $< $@ >/dev/null 2>&1 || cp $< $@
%/trace-dtrace.dtrace-timestamp: $(SRC_PATH)/%/trace-events $(BUILD_DIR)/config-host.mak $(tracetool-y)
$(call quiet-command,$(TRACETOOL) \
--group=$(call trace-group-name,$@) \
--format=d \
--backends=$(TRACE_BACKENDS) \
$< > $@,"GEN","$(@:%-timestamp=%)")
%/trace-dtrace.h: %/trace-dtrace.dtrace $(tracetool-y)
$(call quiet-command,dtrace -o $@ -h -s $<, "GEN","$@")
%/trace-dtrace.o: %/trace-dtrace.dtrace $(tracetool-y)
trace-root.h: trace-root.h-timestamp
@cmp $< $@ >/dev/null 2>&1 || cp $< $@
trace-root.h-timestamp: $(SRC_PATH)/trace-events $(tracetool-y)
$(call quiet-command,$(TRACETOOL) \
--group=root \
--format=h \
--backends=$(TRACE_BACKENDS) \
$< > $@,"GEN","$(@:%-timestamp=%)")
trace-root.c: trace-root.c-timestamp
@cmp $< $@ >/dev/null 2>&1 || cp $< $@
trace-root.c-timestamp: $(SRC_PATH)/trace-events $(tracetool-y)
$(call quiet-command,$(TRACETOOL) \
--group=root \
--format=c \
--backends=$(TRACE_BACKENDS) \
$< > $@,"GEN","$(@:%-timestamp=%)")
trace-ust-root.h: trace-ust-root.h-timestamp
@cmp $< $@ >/dev/null 2>&1 || cp $< $@
trace-ust-root.h-timestamp: $(SRC_PATH)/trace-events $(tracetool-y)
$(call quiet-command,$(TRACETOOL) \
--group=root \
--format=ust-events-h \
--backends=$(TRACE_BACKENDS) \
$< > $@,"GEN","$(@:%-timestamp=%)")
trace-ust-all.h: trace-ust-all.h-timestamp
@cmp $< $@ >/dev/null 2>&1 || cp $< $@
trace-ust-all.h-timestamp: $(trace-events-files) $(tracetool-y)
$(call quiet-command,$(TRACETOOL) \
--group=all \
--format=ust-events-h \
--backends=$(TRACE_BACKENDS) \
$(trace-events-files) > $@,"GEN","$(@:%-timestamp=%)")
trace-ust-all.c: trace-ust-all.c-timestamp
@cmp $< $@ >/dev/null 2>&1 || cp $< $@
trace-ust-all.c-timestamp: $(trace-events-files) $(tracetool-y)
$(call quiet-command,$(TRACETOOL) \
--group=all \
--format=ust-events-c \
--backends=$(TRACE_BACKENDS) \
$(trace-events-files) > $@,"GEN","$(@:%-timestamp=%)")
trace-dtrace-root.dtrace: trace-dtrace-root.dtrace-timestamp
@cmp $< $@ >/dev/null 2>&1 || cp $< $@
trace-dtrace-root.dtrace-timestamp: $(SRC_PATH)/trace-events $(BUILD_DIR)/config-host.mak $(tracetool-y)
$(call quiet-command,$(TRACETOOL) \
--group=root \
--format=d \
--backends=$(TRACE_BACKENDS) \
$< > $@,"GEN","$(@:%-timestamp=%)")
trace-dtrace-root.h: trace-dtrace-root.dtrace
$(call quiet-command,dtrace -o $@ -h -s $<, "GEN","$@")
trace-dtrace-root.o: trace-dtrace-root.dtrace
KEYCODEMAP_GEN = $(SRC_PATH)/ui/keycodemapdb/tools/keymap-gen
KEYCODEMAP_CSV = $(SRC_PATH)/ui/keycodemapdb/data/keymaps.csv
KEYCODEMAP_FILES = \
ui/input-keymap-atset1-to-qcode.c \
ui/input-keymap-linux-to-qcode.c \
ui/input-keymap-qcode-to-atset1.c \
ui/input-keymap-qcode-to-atset2.c \
ui/input-keymap-qcode-to-atset3.c \
ui/input-keymap-qcode-to-linux.c \
ui/input-keymap-qcode-to-qnum.c \
ui/input-keymap-qcode-to-sun.c \
ui/input-keymap-qnum-to-qcode.c \
ui/input-keymap-usb-to-qcode.c \
ui/input-keymap-win32-to-qcode.c \
ui/input-keymap-x11-to-qcode.c \
ui/input-keymap-xorgevdev-to-qcode.c \
ui/input-keymap-xorgkbd-to-qcode.c \
ui/input-keymap-xorgxquartz-to-qcode.c \
ui/input-keymap-xorgxwin-to-qcode.c \
$(NULL)
GENERATED_FILES += $(KEYCODEMAP_FILES)
ui/input-keymap-%.c: $(KEYCODEMAP_GEN) $(KEYCODEMAP_CSV) $(SRC_PATH)/ui/Makefile.objs
$(call quiet-command,\
stem=$* && src=$${stem%-to-*} dst=$${stem#*-to-} && \
test -e $(KEYCODEMAP_GEN) && \
$(PYTHON) $(KEYCODEMAP_GEN) \
--lang glib2 \
--varname qemu_input_map_$${src}_to_$${dst} \
code-map $(KEYCODEMAP_CSV) $${src} $${dst} \
> $@ || rm -f $@, "GEN", "$@")
$(KEYCODEMAP_GEN): .git-submodule-status
$(KEYCODEMAP_CSV): .git-submodule-status
# Don't try to regenerate Makefile or configure
# We don't generate any of them
Makefile: ;
configure: ;
.PHONY: all clean cscope distclean html info install install-doc \
pdf txt recurse-all dist msi FORCE
.PHONY: all clean cscope distclean dvi html info install install-doc \
pdf recurse-all speed test dist msi
$(call set-vpath, $(SRC_PATH))
@@ -282,10 +90,11 @@ LIBS+=-lz $(LIBS_TOOLS)
HELPERS-$(CONFIG_LINUX) = qemu-bridge-helper$(EXESUF)
ifdef BUILD_DOCS
DOCS=qemu-doc.html qemu-doc.txt qemu.1 qemu-img.1 qemu-nbd.8 qemu-ga.8
DOCS+=docs/interop/qemu-qmp-ref.html docs/interop/qemu-qmp-ref.txt docs/interop/qemu-qmp-ref.7
DOCS+=docs/interop/qemu-ga-ref.html docs/interop/qemu-ga-ref.txt docs/interop/qemu-ga-ref.7
DOCS+=docs/qemu-block-drivers.7
DOCS=qemu-doc.html qemu-tech.html qemu.1 qemu-img.1 qemu-nbd.8 qemu-ga.8
DOCS+=qmp-commands.txt
ifdef CONFIG_LINUX
DOCS+=kvm_stat.1
endif
ifdef CONFIG_VIRTFS
DOCS+=fsdev/virtfs-proxy-helper.1
endif
@@ -293,26 +102,26 @@ else
DOCS=
endif
SUBDIR_MAKEFLAGS=$(if $(V),,--no-print-directory --quiet) BUILD_DIR=$(BUILD_DIR)
SUBDIR_MAKEFLAGS=$(if $(V),,--no-print-directory) BUILD_DIR=$(BUILD_DIR)
SUBDIR_DEVICES_MAK=$(patsubst %, %/config-devices.mak, $(TARGET_DIRS))
SUBDIR_DEVICES_MAK_DEP=$(patsubst %, %-config-devices.mak.d, $(TARGET_DIRS))
ifeq ($(SUBDIR_DEVICES_MAK),)
config-all-devices.mak:
$(call quiet-command,echo '# no devices' > $@,"GEN","$@")
$(call quiet-command,echo '# no devices' > $@," GEN $@")
else
config-all-devices.mak: $(SUBDIR_DEVICES_MAK)
$(call quiet-command, sed -n \
's|^\([^=]*\)=\(.*\)$$|\1:=$$(findstring y,$$(\1)\2)|p' \
$(SUBDIR_DEVICES_MAK) | sort -u > $@, \
"GEN","$@")
" GEN $@")
endif
-include $(SUBDIR_DEVICES_MAK_DEP)
%/config-devices.mak: default-configs/%.mak $(SRC_PATH)/scripts/make_device_config.sh
%/config-devices.mak: default-configs/%.mak
$(call quiet-command, \
$(SHELL) $(SRC_PATH)/scripts/make_device_config.sh $< $*-config-devices.mak.d $@ > $@.tmp,"GEN","$@.tmp")
$(SHELL) $(SRC_PATH)/scripts/make_device_config.sh $< $*-config-devices.mak.d $@ > $@.tmp, " GEN $@.tmp")
$(call quiet-command, if test -f $@; then \
if cmp -s $@.old $@; then \
mv $@.tmp $@; \
@@ -323,13 +132,13 @@ endif
else \
echo "WARNING: $@ out of date.";\
fi; \
echo "Run \"$(MAKE) defconfig\" to regenerate."; \
echo "Run \"make defconfig\" to regenerate."; \
rm $@.tmp; \
fi; \
else \
mv $@.tmp $@; \
cp -p $@ $@.old; \
fi,"GEN","$@");
fi, " GEN $@");
defconfig:
rm -f config-all-devices.mak $(SUBDIR_DEVICES_MAK)
@@ -340,14 +149,10 @@ endif
dummy := $(call unnest-vars,, \
stub-obj-y \
chardev-obj-y \
util-obj-y \
qga-obj-y \
ivshmem-client-obj-y \
ivshmem-server-obj-y \
libvhost-user-obj-y \
vhost-user-scsi-obj-y \
vhost-user-blk-obj-y \
qga-vss-dll-obj-y \
block-obj-y \
block-obj-m \
@@ -356,41 +161,18 @@ dummy := $(call unnest-vars,, \
qom-obj-y \
io-obj-y \
common-obj-y \
common-obj-m \
trace-obj-y)
common-obj-m)
include $(SRC_PATH)/tests/Makefile.include
ifneq ($(wildcard config-host.mak),)
include $(SRC_PATH)/tests/Makefile
endif
all: $(DOCS) $(TOOLS) $(HELPERS-y) recurse-all modules
qemu-version.h: FORCE
$(call quiet-command, \
(cd $(SRC_PATH); \
printf '#define QEMU_PKGVERSION '; \
if test -n "$(PKGVERSION)"; then \
printf '"$(PKGVERSION)"\n'; \
else \
if test -d .git; then \
printf '" ('; \
git describe --match 'v*' 2>/dev/null | tr -d '\n'; \
if ! git diff-index --quiet HEAD &>/dev/null; then \
printf -- '-dirty'; \
fi; \
printf ')"\n'; \
else \
printf '""\n'; \
fi; \
fi) > $@.tmp)
$(call quiet-command, if ! cmp -s $@ $@.tmp; then \
mv $@.tmp $@; \
else \
rm $@.tmp; \
fi)
config-host.h: config-host.h-timestamp
config-host.h-timestamp: config-host.mak
qemu-options.def: $(SRC_PATH)/qemu-options.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@,"GEN","$@")
qemu-options.def: $(SRC_PATH)/qemu-options.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@," GEN $@")
SUBDIR_RULES=$(patsubst %,subdir-%, $(TARGET_DIRS))
SOFTMMU_SUBDIR_RULES=$(filter %-softmmu,$(SUBDIR_RULES))
@@ -403,150 +185,126 @@ $(SOFTMMU_SUBDIR_RULES): config-all-devices.mak
subdir-%:
$(call quiet-command,$(MAKE) $(SUBDIR_MAKEFLAGS) -C $* V="$(V)" TARGET_DIR="$*/" all,)
subdir-pixman: pixman/Makefile
$(call quiet-command,$(MAKE) $(SUBDIR_MAKEFLAGS) -C pixman V="$(V)" all,)
pixman/Makefile: $(SRC_PATH)/pixman/configure
(cd pixman; CFLAGS="$(CFLAGS) -fPIC $(extra_cflags) $(extra_ldflags)" $(SRC_PATH)/pixman/configure $(AUTOCONF_HOST) --disable-gtk --disable-shared --enable-static)
$(SRC_PATH)/pixman/configure:
(cd $(SRC_PATH)/pixman; autoreconf -v --install)
DTC_MAKE_ARGS=-I$(SRC_PATH)/dtc VPATH=$(SRC_PATH)/dtc -C dtc V="$(V)" LIBFDT_srcdir=$(SRC_PATH)/dtc/libfdt
DTC_CFLAGS=$(CFLAGS) $(QEMU_CFLAGS)
DTC_CPPFLAGS=-I$(BUILD_DIR)/dtc -I$(SRC_PATH)/dtc -I$(SRC_PATH)/dtc/libfdt
subdir-dtc: .git-submodule-status dtc/libfdt dtc/tests
subdir-dtc:dtc/libfdt dtc/tests
$(call quiet-command,$(MAKE) $(DTC_MAKE_ARGS) CPPFLAGS="$(DTC_CPPFLAGS)" CFLAGS="$(DTC_CFLAGS)" LDFLAGS="$(LDFLAGS)" ARFLAGS="$(ARFLAGS)" CC="$(CC)" AR="$(AR)" LD="$(LD)" $(SUBDIR_MAKEFLAGS) libfdt/libfdt.a,)
dtc/%: .git-submodule-status
dtc/%:
mkdir -p $@
# Overriding CFLAGS causes us to lose defines added in the sub-makefile.
# Not overriding CFLAGS leads to mis-matches between compilation modes.
# Therefore we replicate some of the logic in the sub-makefile.
# Remove all the extra -Warning flags that QEMU uses that Capstone doesn't;
# no need to annoy QEMU developers with such things.
CAP_CFLAGS = $(patsubst -W%,,$(CFLAGS) $(QEMU_CFLAGS))
CAP_CFLAGS += -DCAPSTONE_USE_SYS_DYN_MEM
CAP_CFLAGS += -DCAPSTONE_HAS_ARM
CAP_CFLAGS += -DCAPSTONE_HAS_ARM64
CAP_CFLAGS += -DCAPSTONE_HAS_POWERPC
CAP_CFLAGS += -DCAPSTONE_HAS_X86
subdir-capstone: .git-submodule-status
$(call quiet-command,$(MAKE) -C $(SRC_PATH)/capstone CAPSTONE_SHARED=no BUILDDIR="$(BUILD_DIR)/capstone" CC="$(CC)" AR="$(AR)" LD="$(LD)" RANLIB="$(RANLIB)" CFLAGS="$(CAP_CFLAGS)" $(SUBDIR_MAKEFLAGS) $(BUILD_DIR)/capstone/$(LIBCAPSTONE))
$(SUBDIR_RULES): libqemuutil.a $(common-obj-y) $(chardev-obj-y) \
$(qom-obj-y) $(crypto-aes-obj-$(CONFIG_USER_ONLY))
$(SUBDIR_RULES): libqemuutil.a libqemustub.a $(common-obj-y) $(qom-obj-y) $(crypto-aes-obj-$(CONFIG_USER_ONLY))
ROMSUBDIR_RULES=$(patsubst %,romsubdir-%, $(ROMS))
# Only keep -O and -g cflags
romsubdir-%:
$(call quiet-command,$(MAKE) $(SUBDIR_MAKEFLAGS) -C pc-bios/$* V="$(V)" TARGET_DIR="$*/" CFLAGS="$(filter -O% -g%,$(CFLAGS))",)
$(call quiet-command,$(MAKE) $(SUBDIR_MAKEFLAGS) -C pc-bios/$* V="$(V)" TARGET_DIR="$*/",)
ALL_SUBDIRS=$(TARGET_DIRS) $(patsubst %,pc-bios/%, $(ROMS))
recurse-all: $(SUBDIR_RULES) $(ROMSUBDIR_RULES)
$(BUILD_DIR)/version.o: $(SRC_PATH)/version.rc config-host.h
$(call quiet-command,$(WINDRES) -I$(BUILD_DIR) -o $@ $<,"RC","version.o")
$(BUILD_DIR)/version.o: $(SRC_PATH)/version.rc config-host.h | $(BUILD_DIR)/version.lo
$(call quiet-command,$(WINDRES) -I$(BUILD_DIR) -o $@ $<," RC version.o")
$(BUILD_DIR)/version.lo: $(SRC_PATH)/version.rc config-host.h
$(call quiet-command,$(WINDRES) -I$(BUILD_DIR) -o $@ $<," RC version.lo")
Makefile: $(version-obj-y)
Makefile: $(version-obj-y) $(version-lobj-y)
######################################################################
# Build libraries
libqemuutil.a: $(util-obj-y) $(trace-obj-y) $(stub-obj-y)
libvhost-user.a: $(libvhost-user-obj-y)
libqemustub.a: $(stub-obj-y)
libqemuutil.a: $(util-obj-y)
block-modules = $(foreach o,$(block-obj-m),"$(basename $(subst /,-,$o))",) NULL
util/module.o-cflags = -D'CONFIG_BLOCK_MODULES=$(block-modules)'
######################################################################
COMMON_LDADDS = libqemuutil.a
qemu-img.o: qemu-img-cmds.h
qemu-img$(EXESUF): qemu-img.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) $(COMMON_LDADDS)
qemu-nbd$(EXESUF): qemu-nbd.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) $(COMMON_LDADDS)
qemu-io$(EXESUF): qemu-io.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) $(COMMON_LDADDS)
qemu-img$(EXESUF): qemu-img.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) libqemuutil.a libqemustub.a
qemu-nbd$(EXESUF): qemu-nbd.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) libqemuutil.a libqemustub.a
qemu-io$(EXESUF): qemu-io.o $(block-obj-y) $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) libqemuutil.a libqemustub.a
qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o $(COMMON_LDADDS)
qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o libqemuutil.a libqemustub.a
qemu-keymap$(EXESUF): qemu-keymap.o ui/input-keymap.o $(COMMON_LDADDS)
fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o fsdev/9p-marshal.o fsdev/9p-iov-marshal.o $(COMMON_LDADDS)
fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o fsdev/9p-marshal.o fsdev/9p-iov-marshal.o libqemuutil.a libqemustub.a
fsdev/virtfs-proxy-helper$(EXESUF): LIBS += -lcap
scsi/qemu-pr-helper$(EXESUF): scsi/qemu-pr-helper.o scsi/utils.o $(crypto-obj-y) $(io-obj-y) $(qom-obj-y) $(COMMON_LDADDS)
ifdef CONFIG_MPATH
scsi/qemu-pr-helper$(EXESUF): LIBS += -ludev -lmultipath -lmpathpersist
endif
qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@,"GEN","$@")
qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@," GEN $@")
qemu-ga$(EXESUF): LIBS = $(LIBS_QGA)
qemu-ga$(EXESUF): QEMU_CFLAGS += -I qga/qapi-generated
qemu-keymap$(EXESUF): LIBS += $(XKBCOMMON_LIBS)
qemu-keymap$(EXESUF): QEMU_CFLAGS += $(XKBCOMMON_CFLAGS)
gen-out-type = $(subst .,-,$(suffix $@))
qapi-py = $(SRC_PATH)/scripts/qapi.py $(SRC_PATH)/scripts/ordereddict.py
qga/qapi-generated/qga-qapi-types.c qga/qapi-generated/qga-qapi-types.h :\
$(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
$(call quiet-command,$(PYTHON_UTF8) $(SRC_PATH)/scripts/qapi-types.py \
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py \
$(gen-out-type) -o qga/qapi-generated -p "qga-" $<, \
"GEN","$@")
" GEN $@")
qga/qapi-generated/qga-qapi-visit.c qga/qapi-generated/qga-qapi-visit.h :\
$(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
$(call quiet-command,$(PYTHON_UTF8) $(SRC_PATH)/scripts/qapi-visit.py \
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-visit.py \
$(gen-out-type) -o qga/qapi-generated -p "qga-" $<, \
"GEN","$@")
" GEN $@")
qga/qapi-generated/qga-qmp-commands.h qga/qapi-generated/qga-qmp-marshal.c :\
$(SRC_PATH)/qga/qapi-schema.json $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
$(call quiet-command,$(PYTHON_UTF8) $(SRC_PATH)/scripts/qapi-commands.py \
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py \
$(gen-out-type) -o qga/qapi-generated -p "qga-" $<, \
"GEN","$@")
" GEN $@")
qapi-modules = $(SRC_PATH)/qapi-schema.json $(SRC_PATH)/qapi/common.json \
$(SRC_PATH)/qapi/block.json $(SRC_PATH)/qapi/block-core.json \
$(SRC_PATH)/qapi/char.json \
$(SRC_PATH)/qapi/crypto.json \
$(SRC_PATH)/qapi/introspect.json \
$(SRC_PATH)/qapi/migration.json \
$(SRC_PATH)/qapi/net.json \
$(SRC_PATH)/qapi/rocker.json \
$(SRC_PATH)/qapi/run-state.json \
$(SRC_PATH)/qapi/sockets.json \
$(SRC_PATH)/qapi/tpm.json \
$(SRC_PATH)/qapi/trace.json \
$(SRC_PATH)/qapi/transaction.json \
$(SRC_PATH)/qapi/ui.json
$(SRC_PATH)/qapi/event.json $(SRC_PATH)/qapi/introspect.json \
$(SRC_PATH)/qapi/crypto.json $(SRC_PATH)/qapi/rocker.json \
$(SRC_PATH)/qapi/trace.json
qapi-types.c qapi-types.h :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-types.py $(qapi-py)
$(call quiet-command,$(PYTHON_UTF8) $(SRC_PATH)/scripts/qapi-types.py \
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-types.py \
$(gen-out-type) -o "." -b $<, \
"GEN","$@")
" GEN $@")
qapi-visit.c qapi-visit.h :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-visit.py $(qapi-py)
$(call quiet-command,$(PYTHON_UTF8) $(SRC_PATH)/scripts/qapi-visit.py \
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-visit.py \
$(gen-out-type) -o "." -b $<, \
"GEN","$@")
" GEN $@")
qapi-event.c qapi-event.h :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-event.py $(qapi-py)
$(call quiet-command,$(PYTHON_UTF8) $(SRC_PATH)/scripts/qapi-event.py \
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-event.py \
$(gen-out-type) -o "." $<, \
"GEN","$@")
" GEN $@")
qmp-commands.h qmp-marshal.c :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-commands.py $(qapi-py)
$(call quiet-command,$(PYTHON_UTF8) $(SRC_PATH)/scripts/qapi-commands.py \
$(gen-out-type) -o "." $<, \
"GEN","$@")
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-commands.py \
$(gen-out-type) -o "." -m $<, \
" GEN $@")
qmp-introspect.h qmp-introspect.c :\
$(qapi-modules) $(SRC_PATH)/scripts/qapi-introspect.py $(qapi-py)
$(call quiet-command,$(PYTHON_UTF8) $(SRC_PATH)/scripts/qapi-introspect.py \
$(call quiet-command,$(PYTHON) $(SRC_PATH)/scripts/qapi-introspect.py \
$(gen-out-type) -o "." $<, \
"GEN","$@")
" GEN $@")
QGALIB_GEN=$(addprefix qga/qapi-generated/, qga-qapi-types.h qga-qapi-visit.h qga-qmp-commands.h)
$(qga-obj-y): $(QGALIB_GEN)
$(qga-obj-y) qemu-ga.o: $(QGALIB_GEN)
qemu-ga$(EXESUF): $(qga-obj-y) $(COMMON_LDADDS)
qemu-ga$(EXESUF): $(qga-obj-y) libqemuutil.a libqemustub.a
$(call LINK, $^)
ifdef QEMU_GA_MSI_ENABLED
@@ -560,7 +318,7 @@ $(QEMU_GA_MSI): config-host.mak
$(QEMU_GA_MSI): $(SRC_PATH)/qga/installer/qemu-ga.wxs
$(call quiet-command,QEMU_GA_VERSION="$(QEMU_GA_VERSION)" QEMU_GA_MANUFACTURER="$(QEMU_GA_MANUFACTURER)" QEMU_GA_DISTRO="$(QEMU_GA_DISTRO)" BUILD_DIR="$(BUILD_DIR)" \
wixl -o $@ $(QEMU_GA_MSI_ARCH) $(QEMU_GA_MSI_WITH_VSS) $(QEMU_GA_MSI_MINGW_DLL_PATH) $<,"WIXL","$@")
wixl -o $@ $(QEMU_GA_MSI_ARCH) $(QEMU_GA_MSI_WITH_VSS) $(QEMU_GA_MSI_MINGW_DLL_PATH) $<, " WIXL $@")
else
msi:
@echo "MSI build not configured or dependency resolution failed (reconfigure with --enable-guest-agent-msi option)"
@@ -571,43 +329,33 @@ ifneq ($(EXESUF),)
qemu-ga: qemu-ga$(EXESUF) $(QGA_VSS_PROVIDER) $(QEMU_GA_MSI)
endif
ifdef CONFIG_IVSHMEM
ivshmem-client$(EXESUF): $(ivshmem-client-obj-y) $(COMMON_LDADDS)
ivshmem-client$(EXESUF): $(ivshmem-client-obj-y) libqemuutil.a libqemustub.a
$(call LINK, $^)
ivshmem-server$(EXESUF): $(ivshmem-server-obj-y) $(COMMON_LDADDS)
ivshmem-server$(EXESUF): $(ivshmem-server-obj-y) libqemuutil.a libqemustub.a
$(call LINK, $^)
endif
vhost-user-scsi$(EXESUF): $(vhost-user-scsi-obj-y) libvhost-user.a
$(call LINK, $^)
vhost-user-blk$(EXESUF): $(vhost-user-blk-obj-y) libvhost-user.a
$(call LINK, $^)
module_block.h: $(SRC_PATH)/scripts/modules/module_block.py config-host.mak
$(call quiet-command,$(PYTHON) $< $@ \
$(addprefix $(SRC_PATH)/,$(patsubst %.mo,%.c,$(block-obj-m))), \
"GEN","$@")
clean:
# avoid old build problems by removing potentially incorrect old files
rm -f config.mak op-i386.h opc-i386.h gen-op-i386.h op-arm.h opc-arm.h gen-op-arm.h
rm -f qemu-options.def
rm -f *.msi
find . \( -name '*.so' -o -name '*.dll' -o -name '*.mo' -o -name '*.[oda]' \) -type f -exec rm {} +
find . \( -name '*.l[oa]' -o -name '*.so' -o -name '*.dll' -o -name '*.mo' -o -name '*.[oda]' \) -type f -exec rm {} +
rm -f $(filter-out %.tlb,$(TOOLS)) $(HELPERS-y) qemu-ga TAGS cscope.* *.pod *~ */*~
rm -f fsdev/*.pod scsi/*.pod
rm -f fsdev/*.pod
rm -rf .libs */.libs
rm -f qemu-img-cmds.h
rm -f ui/shader/*-vert.h ui/shader/*-frag.h
@# May not be present in GENERATED_FILES
@# May not be present in GENERATED_HEADERS
rm -f trace/generated-tracers-dtrace.dtrace*
rm -f trace/generated-tracers-dtrace.h*
rm -f $(foreach f,$(GENERATED_FILES),$(f) $(f)-timestamp)
rm -f $(foreach f,$(GENERATED_HEADERS),$(f) $(f)-timestamp)
rm -f $(foreach f,$(GENERATED_SOURCES),$(f) $(f)-timestamp)
rm -rf qapi-generated
rm -rf qga/qapi-generated
for d in $(ALL_SUBDIRS); do \
if test -d $$d; then $(MAKE) -C $$d $@ || exit 1; fi; \
rm -f $$d/qemu-options.def; \
done
rm -f $(SUBDIR_DEVICES_MAK) config-all-devices.mak
VERSION ?= $(shell cat VERSION)
@@ -621,23 +369,18 @@ distclean: clean
rm -f config-all-devices.mak config-all-disas.mak config.status
rm -f po/*.mo tests/qemu-iotests/common.env
rm -f roms/seabios/config.mak roms/vgabios/config.mak
rm -f qemu-doc.info qemu-doc.aux qemu-doc.cp qemu-doc.cps
rm -f qemu-doc.info qemu-doc.aux qemu-doc.cp qemu-doc.cps qemu-doc.dvi
rm -f qemu-doc.fn qemu-doc.fns qemu-doc.info qemu-doc.ky qemu-doc.kys
rm -f qemu-doc.log qemu-doc.pdf qemu-doc.pg qemu-doc.toc qemu-doc.tp
rm -f qemu-doc.vr qemu-doc.txt
rm -f qemu-doc.vr
rm -f config.log
rm -f linux-headers/asm
rm -f docs/version.texi
rm -f docs/interop/qemu-ga-qapi.texi docs/interop/qemu-qmp-qapi.texi
rm -f docs/interop/qemu-qmp-ref.7 docs/interop/qemu-ga-ref.7
rm -f docs/interop/qemu-qmp-ref.txt docs/interop/qemu-ga-ref.txt
rm -f docs/interop/qemu-qmp-ref.pdf docs/interop/qemu-ga-ref.pdf
rm -f docs/interop/qemu-qmp-ref.html docs/interop/qemu-ga-ref.html
rm -f docs/qemu-block-drivers.7
rm -f qemu-tech.info qemu-tech.aux qemu-tech.cp qemu-tech.dvi qemu-tech.fn qemu-tech.info qemu-tech.ky qemu-tech.log qemu-tech.pdf qemu-tech.pg qemu-tech.toc qemu-tech.tp qemu-tech.vr
for d in $(TARGET_DIRS); do \
rm -rf $$d || exit 1 ; \
done
rm -Rf .sdk
if test -f pixman/config.log; then $(MAKE) -C pixman distclean; fi
if test -f dtc/version_gen.h; then $(MAKE) $(DTC_MAKE_ARGS) clean; fi
KEYMAPS=da en-gb et fr fr-ch is lt modifiers no pt-br sv \
@@ -654,32 +397,24 @@ pxe-e1000.rom pxe-eepro100.rom pxe-ne2k_pci.rom \
pxe-pcnet.rom pxe-rtl8139.rom pxe-virtio.rom \
efi-e1000.rom efi-eepro100.rom efi-ne2k_pci.rom \
efi-pcnet.rom efi-rtl8139.rom efi-virtio.rom \
efi-e1000e.rom efi-vmxnet3.rom \
qemu-icon.bmp qemu_logo_no_text.svg \
bamboo.dtb petalogix-s3adsp1800.dtb petalogix-ml605.dtb \
multiboot.bin linuxboot.bin linuxboot_dma.bin kvmvapic.bin \
s390-ccw.img s390-netboot.img \
spapr-rtas.bin slof.bin skiboot.lid \
multiboot.bin linuxboot.bin kvmvapic.bin \
s390-ccw.img \
spapr-rtas.bin slof.bin \
palcode-clipper \
u-boot.e500 \
qemu_vga.ndrv \
hppa-firmware.img
u-boot.e500
else
BLOBS=
endif
install-doc: $(DOCS)
$(INSTALL_DIR) "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) qemu-doc.html "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) qemu-doc.txt "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) docs/interop/qemu-qmp-ref.html "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) docs/interop/qemu-qmp-ref.txt "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) qemu-doc.html qemu-tech.html "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) qmp-commands.txt "$(DESTDIR)$(qemu_docdir)"
ifdef CONFIG_POSIX
$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DATA) qemu.1 "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man7"
$(INSTALL_DATA) docs/interop/qemu-qmp-ref.7 "$(DESTDIR)$(mandir)/man7"
$(INSTALL_DATA) docs/qemu-block-drivers.7 "$(DESTDIR)$(mandir)/man7"
ifneq ($(TOOLS),)
$(INSTALL_DATA) qemu-img.1 "$(DESTDIR)$(mandir)/man1"
$(INSTALL_DIR) "$(DESTDIR)$(mandir)/man8"
@@ -687,9 +422,6 @@ ifneq ($(TOOLS),)
endif
ifneq (,$(findstring qemu-ga,$(TOOLS)))
$(INSTALL_DATA) qemu-ga.8 "$(DESTDIR)$(mandir)/man8"
$(INSTALL_DATA) docs/interop/qemu-ga-ref.html "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) docs/interop/qemu-ga-ref.txt "$(DESTDIR)$(qemu_docdir)"
$(INSTALL_DATA) docs/interop/qemu-ga-ref.7 "$(DESTDIR)$(mandir)/man7"
endif
endif
ifdef CONFIG_VIRTFS
@@ -708,7 +440,8 @@ endif
endif
install: all $(if $(BUILD_DOCS),install-doc) install-datadir install-localstatedir
install: all $(if $(BUILD_DOCS),install-doc) \
install-datadir install-localstatedir
ifneq ($(TOOLS),)
$(call install-prog,$(subst qemu-ga,qemu-ga$(EXESUF),$(TOOLS)),$(DESTDIR)$(bindir))
endif
@@ -735,19 +468,23 @@ endif
set -e; for x in $(KEYMAPS); do \
$(INSTALL_DATA) $(SRC_PATH)/pc-bios/keymaps/$$x "$(DESTDIR)$(qemu_datadir)/keymaps"; \
done
$(INSTALL_DATA) $(BUILD_DIR)/trace-events-all "$(DESTDIR)$(qemu_datadir)/trace-events-all"
$(INSTALL_DATA) $(SRC_PATH)/trace-events "$(DESTDIR)$(qemu_datadir)/trace-events"
for d in $(TARGET_DIRS); do \
$(MAKE) $(SUBDIR_MAKEFLAGS) TARGET_DIR=$$d/ -C $$d $@ || exit 1 ; \
done
# various test targets
test speed: all
$(MAKE) -C tests/tcg $@
.PHONY: ctags
ctags:
rm -f tags
rm -f $@
find "$(SRC_PATH)" -name '*.[hc]' -exec ctags --append {} +
.PHONY: TAGS
TAGS:
rm -f TAGS
rm -f $@
find "$(SRC_PATH)" -name '*.[hc]' -exec etags --append {} +
cscope:
@@ -760,91 +497,94 @@ ui/shader/%-vert.h: $(SRC_PATH)/ui/shader/%.vert $(SRC_PATH)/scripts/shaderinclu
@mkdir -p $(dir $@)
$(call quiet-command,\
perl $(SRC_PATH)/scripts/shaderinclude.pl $< > $@,\
"VERT","$@")
" VERT $@")
ui/shader/%-frag.h: $(SRC_PATH)/ui/shader/%.frag $(SRC_PATH)/scripts/shaderinclude.pl
@mkdir -p $(dir $@)
$(call quiet-command,\
perl $(SRC_PATH)/scripts/shaderinclude.pl $< > $@,\
"FRAG","$@")
" FRAG $@")
ui/shader.o: $(SRC_PATH)/ui/shader.c \
ui/shader/texture-blit-vert.h \
ui/shader/texture-blit-flip-vert.h \
ui/shader/texture-blit-frag.h
ui/console-gl.o: $(SRC_PATH)/ui/console-gl.c \
ui/shader/texture-blit-vert.h ui/shader/texture-blit-frag.h
# documentation
MAKEINFO=makeinfo
MAKEINFOINCLUDES= -I docs -I $(<D) -I $(@D)
MAKEINFOFLAGS=--no-split --number-sections $(MAKEINFOINCLUDES)
TEXI2PODFLAGS=$(MAKEINFOINCLUDES) "-DVERSION=$(VERSION)"
TEXI2PDFFLAGS=$(if $(V),,--quiet) -I $(SRC_PATH) $(MAKEINFOINCLUDES)
MAKEINFOFLAGS=--no-headers --no-split --number-sections
TEXIFLAG=$(if $(V),,--quiet)
%.dvi: %.texi
$(call quiet-command,texi2dvi $(TEXIFLAG) -I . $<," GEN $@")
docs/version.texi: $(SRC_PATH)/VERSION
$(call quiet-command,echo "@set VERSION $(VERSION)" > $@,"GEN","$@")
%.html: %.texi
$(call quiet-command,LC_ALL=C $(MAKEINFO) $(MAKEINFOFLAGS) --html $< -o $@, \
" GEN $@")
%.html: %.texi docs/version.texi
$(call quiet-command,LC_ALL=C $(MAKEINFO) $(MAKEINFOFLAGS) --no-headers \
--html $< -o $@,"GEN","$@")
%.info: %.texi
$(call quiet-command,$(MAKEINFO) $< -o $@," GEN $@")
%.info: %.texi docs/version.texi
$(call quiet-command,$(MAKEINFO) $(MAKEINFOFLAGS) $< -o $@,"GEN","$@")
%.pdf: %.texi
$(call quiet-command,texi2pdf $(TEXIFLAG) -I . $<," GEN $@")
%.txt: %.texi docs/version.texi
$(call quiet-command,LC_ALL=C $(MAKEINFO) $(MAKEINFOFLAGS) --no-headers \
--plaintext $< -o $@,"GEN","$@")
qemu-options.texi: $(SRC_PATH)/qemu-options.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@," GEN $@")
%.pdf: %.texi docs/version.texi
$(call quiet-command,texi2pdf $(TEXI2PDFFLAGS) $< -o $@,"GEN","$@")
qemu-monitor.texi: $(SRC_PATH)/hmp-commands.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@," GEN $@")
qemu-options.texi: $(SRC_PATH)/qemu-options.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@,"GEN","$@")
qemu-monitor-info.texi: $(SRC_PATH)/hmp-commands-info.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@," GEN $@")
qemu-monitor.texi: $(SRC_PATH)/hmp-commands.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@,"GEN","$@")
qmp-commands.txt: $(SRC_PATH)/qmp-commands.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -q < $< > $@," GEN $@")
qemu-monitor-info.texi: $(SRC_PATH)/hmp-commands-info.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@,"GEN","$@")
qemu-img-cmds.texi: $(SRC_PATH)/qemu-img-cmds.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@,"GEN","$@")
docs/interop/qemu-qmp-qapi.texi docs/interop/qemu-ga-qapi.texi: $(SRC_PATH)/scripts/qapi2texi.py $(qapi-py)
docs/interop/qemu-qmp-qapi.texi: $(qapi-modules)
$(call quiet-command,$(PYTHON_UTF8) $(SRC_PATH)/scripts/qapi2texi.py $< > $@,"GEN","$@")
docs/interop/qemu-ga-qapi.texi: $(SRC_PATH)/qga/qapi-schema.json
$(call quiet-command,$(PYTHON_UTF8) $(SRC_PATH)/scripts/qapi2texi.py $< > $@,"GEN","$@")
qemu-img-cmds.texi: $(SRC_PATH)/qemu-img-cmds.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@," GEN $@")
qemu.1: qemu-doc.texi qemu-options.texi qemu-monitor.texi qemu-monitor-info.texi
qemu.1: qemu-option-trace.texi
qemu-img.1: qemu-img.texi qemu-option-trace.texi qemu-img-cmds.texi
$(call quiet-command, \
perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< qemu.pod && \
$(POD2MAN) --section=1 --center=" " --release=" " qemu.pod > $@, \
" GEN $@")
qemu-img.1: qemu-img.texi qemu-img-cmds.texi
$(call quiet-command, \
perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< qemu-img.pod && \
$(POD2MAN) --section=1 --center=" " --release=" " qemu-img.pod > $@, \
" GEN $@")
fsdev/virtfs-proxy-helper.1: fsdev/virtfs-proxy-helper.texi
qemu-nbd.8: qemu-nbd.texi qemu-option-trace.texi
$(call quiet-command, \
perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< fsdev/virtfs-proxy-helper.pod && \
$(POD2MAN) --section=1 --center=" " --release=" " fsdev/virtfs-proxy-helper.pod > $@, \
" GEN $@")
qemu-nbd.8: qemu-nbd.texi
$(call quiet-command, \
perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< qemu-nbd.pod && \
$(POD2MAN) --section=8 --center=" " --release=" " qemu-nbd.pod > $@, \
" GEN $@")
qemu-ga.8: qemu-ga.texi
docs/qemu-block-drivers.7: docs/qemu-block-drivers.texi
$(call quiet-command, \
perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< qemu-ga.pod && \
$(POD2MAN) --section=8 --center=" " --release=" " qemu-ga.pod > $@, \
" GEN $@")
html: qemu-doc.html docs/interop/qemu-qmp-ref.html docs/interop/qemu-ga-ref.html
info: qemu-doc.info docs/interop/qemu-qmp-ref.info docs/interop/qemu-ga-ref.info
pdf: qemu-doc.pdf docs/interop/qemu-qmp-ref.pdf docs/interop/qemu-ga-ref.pdf
txt: qemu-doc.txt docs/interop/qemu-qmp-ref.txt docs/interop/qemu-ga-ref.txt
kvm_stat.1: scripts/kvm/kvm_stat.texi
$(call quiet-command, \
perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< kvm_stat.pod && \
$(POD2MAN) --section=1 --center=" " --release=" " kvm_stat.pod > $@, \
" GEN $@")
qemu-doc.html qemu-doc.info qemu-doc.pdf qemu-doc.txt: \
qemu-img.texi qemu-nbd.texi qemu-options.texi qemu-option-trace.texi \
dvi: qemu-doc.dvi qemu-tech.dvi
html: qemu-doc.html qemu-tech.html
info: qemu-doc.info qemu-tech.info
pdf: qemu-doc.pdf qemu-tech.pdf
qemu-doc.dvi qemu-doc.html qemu-doc.info qemu-doc.pdf: \
qemu-img.texi qemu-nbd.texi qemu-options.texi \
qemu-monitor.texi qemu-img-cmds.texi qemu-ga.texi \
qemu-monitor-info.texi docs/qemu-block-drivers.texi
docs/interop/qemu-ga-ref.dvi docs/interop/qemu-ga-ref.html \
docs/interop/qemu-ga-ref.info docs/interop/qemu-ga-ref.pdf \
docs/interop/qemu-ga-ref.txt docs/interop/qemu-ga-ref.7: \
docs/interop/qemu-ga-ref.texi docs/interop/qemu-ga-qapi.texi
docs/interop/qemu-qmp-ref.dvi docs/interop/qemu-qmp-ref.html \
docs/interop/qemu-qmp-ref.info docs/interop/qemu-qmp-ref.pdf \
docs/interop/qemu-qmp-ref.txt docs/interop/qemu-qmp-ref.7: \
docs/interop/qemu-qmp-ref.texi docs/interop/qemu-qmp-qapi.texi
qemu-monitor-info.texi
ifdef CONFIG_WIN32
@@ -904,58 +644,10 @@ endif # CONFIG_WIN
# Add a dependency on the generated files, so that they are always
# rebuilt before other object files
ifneq ($(wildcard config-host.mak),)
ifneq ($(filter-out $(UNCHECKED_GOALS),$(MAKECMDGOALS)),$(if $(MAKECMDGOALS),,fail))
Makefile: $(GENERATED_FILES)
Makefile: $(GENERATED_HEADERS)
endif
endif
.SECONDARY: $(TRACE_HEADERS) $(TRACE_HEADERS:%=%-timestamp) \
$(TRACE_SOURCES) $(TRACE_SOURCES:%=%-timestamp) \
$(TRACE_DTRACE) $(TRACE_DTRACE:%=%-timestamp)
# Include automatically generated dependency files
# Dependencies in Makefile.objs files come from our recursive subdir rules
-include $(wildcard *.d tests/*.d)
include $(SRC_PATH)/tests/docker/Makefile.include
include $(SRC_PATH)/tests/vm/Makefile.include
.PHONY: help
help:
@echo 'Generic targets:'
@echo ' all - Build all'
@echo ' dir/file.o - Build specified target only'
@echo ' install - Install QEMU, documentation and tools'
@echo ' ctags/TAGS - Generate tags file for editors'
@echo ' cscope - Generate cscope index'
@echo ''
@$(if $(TARGET_DIRS), \
echo 'Architecture specific targets:'; \
$(foreach t, $(TARGET_DIRS), \
printf " %-30s - Build for %s\\n" $(patsubst %,subdir-%,$(t)) $(t);) \
echo '')
@echo 'Cleaning targets:'
@echo ' clean - Remove most generated files but keep the config'
@echo ' distclean - Remove all generated files'
@echo ' dist - Build a distributable tarball'
@echo ''
@echo 'Test targets:'
@echo ' check - Run all tests (check-help for details)'
@echo ' docker - Help about targets running tests inside Docker containers'
@echo ' vm-test - Help about targets running tests inside VM'
@echo ''
@echo 'Documentation targets:'
@echo ' html info pdf txt'
@echo ' - Build documentation in specified format'
@echo ''
ifdef CONFIG_WIN32
@echo 'Windows targets:'
@echo ' installer - Build NSIS-based installer for QEMU'
ifdef QEMU_GA_MSI_ENABLED
@echo ' msi - Build MSI-based installer for qemu-ga'
endif
@echo ''
endif
@echo ' $(MAKE) [targets] (quiet build, default)'
@echo ' $(MAKE) V=1 [targets] (verbose build)'


@@ -4,16 +4,17 @@ stub-obj-y = stubs/ crypto/
util-obj-y = util/ qobject/ qapi/
util-obj-y += qmp-introspect.o qapi-types.o qapi-visit.o qapi-event.o
chardev-obj-y = chardev/
#######################################################################
# block-obj-y is code used by both qemu system emulation and qemu-img
block-obj-y = async.o thread-pool.o
block-obj-y += nbd/
block-obj-y += block.o blockjob.o
block-obj-y += block/ scsi/
block-obj-y += main-loop.o iohandler.o qemu-timer.o
block-obj-$(CONFIG_POSIX) += aio-posix.o
block-obj-$(CONFIG_WIN32) += aio-win32.o
block-obj-y += block/
block-obj-y += qemu-io-cmds.o
block-obj-$(CONFIG_REPLICATION) += replication.o
block-obj-m = block/
@@ -40,7 +41,7 @@ io-obj-y = io/
ifeq ($(CONFIG_SOFTMMU),y)
common-obj-y = blockdev.o blockdev-nbd.o block/
common-obj-y += bootdevice.o iothread.o
common-obj-y += iothread.o
common-obj-y += net/
common-obj-y += qdev-monitor.o device-hotplug.o
common-obj-$(CONFIG_WIN32) += os-win32.o
@@ -49,9 +50,15 @@ common-obj-$(CONFIG_POSIX) += os-posix.o
common-obj-$(CONFIG_LINUX) += fsdev/
common-obj-y += migration/
common-obj-y += qemu-char.o #aio.o
common-obj-y += page_cache.o
common-obj-y += qjson.o
common-obj-$(CONFIG_SPICE) += spice-qemu-char.o
common-obj-y += audio/
common-obj-y += hw/
common-obj-y += accel.o
common-obj-y += replay/
@@ -62,16 +69,13 @@ bt-host.o-cflags := $(BLUEZ_CFLAGS)
common-obj-y += dma-helpers.o
common-obj-y += vl.o
vl.o-cflags := $(GPROF_CFLAGS) $(SDL_CFLAGS)
common-obj-$(CONFIG_TPM) += tpm.o
common-obj-y += tpm.o
common-obj-$(CONFIG_SLIRP) += slirp/
common-obj-y += backends/
common-obj-y += chardev/
common-obj-$(CONFIG_SECCOMP) += qemu-seccomp.o
qemu-seccomp.o-cflags := $(SECCOMP_CFLAGS)
qemu-seccomp.o-libs := $(SECCOMP_LIBS)
common-obj-$(CONFIG_FDT) += device_tree.o
@@ -85,7 +89,7 @@ endif
#######################################################################
# Target-independent parts used in system and user emulation
common-obj-y += cpus-common.o
common-obj-y += tcg-runtime.o
common-obj-y += hw/
common-obj-y += qom/
common-obj-y += disas/
@@ -93,6 +97,7 @@ common-obj-y += disas/
######################################################################
# Resource file for Windows executables
version-obj-$(CONFIG_WIN32) += $(BUILD_DIR)/version.o
version-lobj-$(CONFIG_WIN32) += $(BUILD_DIR)/version.lo
######################################################################
# tracing
@@ -109,80 +114,5 @@ qga-vss-dll-obj-y = qga/
######################################################################
# contrib
ivshmem-client-obj-$(CONFIG_IVSHMEM) = contrib/ivshmem-client/
ivshmem-server-obj-$(CONFIG_IVSHMEM) = contrib/ivshmem-server/
libvhost-user-obj-y = contrib/libvhost-user/
vhost-user-scsi.o-cflags := $(LIBISCSI_CFLAGS)
vhost-user-scsi.o-libs := $(LIBISCSI_LIBS)
vhost-user-scsi-obj-y = contrib/vhost-user-scsi/
vhost-user-blk-obj-y = contrib/vhost-user-blk/
######################################################################
trace-events-subdirs =
trace-events-subdirs += util
trace-events-subdirs += crypto
trace-events-subdirs += io
trace-events-subdirs += migration
trace-events-subdirs += block
trace-events-subdirs += chardev
trace-events-subdirs += hw/block
trace-events-subdirs += hw/block/dataplane
trace-events-subdirs += hw/char
trace-events-subdirs += hw/intc
trace-events-subdirs += hw/net
trace-events-subdirs += hw/rdma
trace-events-subdirs += hw/rdma/vmw
trace-events-subdirs += hw/virtio
trace-events-subdirs += hw/audio
trace-events-subdirs += hw/misc
trace-events-subdirs += hw/misc/macio
trace-events-subdirs += hw/usb
trace-events-subdirs += hw/scsi
trace-events-subdirs += hw/nvram
trace-events-subdirs += hw/display
trace-events-subdirs += hw/input
trace-events-subdirs += hw/timer
trace-events-subdirs += hw/dma
trace-events-subdirs += hw/sparc
trace-events-subdirs += hw/sparc64
trace-events-subdirs += hw/sd
trace-events-subdirs += hw/isa
trace-events-subdirs += hw/mem
trace-events-subdirs += hw/i386
trace-events-subdirs += hw/i386/xen
trace-events-subdirs += hw/9pfs
trace-events-subdirs += hw/ppc
trace-events-subdirs += hw/pci
trace-events-subdirs += hw/pci-host
trace-events-subdirs += hw/s390x
trace-events-subdirs += hw/vfio
trace-events-subdirs += hw/acpi
trace-events-subdirs += hw/arm
trace-events-subdirs += hw/alpha
trace-events-subdirs += hw/hppa
trace-events-subdirs += hw/xen
trace-events-subdirs += hw/ide
trace-events-subdirs += ui
trace-events-subdirs += audio
trace-events-subdirs += net
trace-events-subdirs += target/arm
trace-events-subdirs += target/i386
trace-events-subdirs += target/mips
trace-events-subdirs += target/sparc
trace-events-subdirs += target/s390x
trace-events-subdirs += target/ppc
trace-events-subdirs += qom
trace-events-subdirs += linux-user
trace-events-subdirs += qapi
trace-events-subdirs += accel/tcg
trace-events-subdirs += accel/kvm
trace-events-subdirs += nbd
trace-events-subdirs += scsi
trace-events-files = $(SRC_PATH)/trace-events $(trace-events-subdirs:%=$(SRC_PATH)/%/trace-events)
trace-obj-y = trace-root.o
trace-obj-y += $(trace-events-subdirs:%=%/trace.o)
trace-obj-$(CONFIG_TRACE_UST) += trace-ust-all.o
trace-obj-$(CONFIG_TRACE_DTRACE) += trace-dtrace-root.o
trace-obj-$(CONFIG_TRACE_DTRACE) += $(trace-events-subdirs:%=%/trace-dtrace.o)
ivshmem-client-obj-y = contrib/ivshmem-client/
ivshmem-server-obj-y = contrib/ivshmem-server/


@@ -11,7 +11,7 @@ $(call set-vpath, $(SRC_PATH):$(BUILD_DIR))
ifdef CONFIG_LINUX
QEMU_CFLAGS += -I../linux-headers
endif
QEMU_CFLAGS += -I.. -I$(SRC_PATH)/target/$(TARGET_BASE_ARCH) -DNEED_CPU_H
QEMU_CFLAGS += -I.. -I$(SRC_PATH)/target-$(TARGET_BASE_ARCH) -DNEED_CPU_H
QEMU_CFLAGS+=-I$(SRC_PATH)/include
@@ -22,11 +22,11 @@ QEMU_PROG_BUILD = $(QEMU_PROG)
else
# system emulator name
QEMU_PROG=qemu-system-$(TARGET_NAME)$(EXESUF)
ifneq (,$(findstring -mwindows,$(SDL_LIBS)))
ifneq (,$(findstring -mwindows,$(libs_softmmu)))
# Terminate program name with a 'w' because the linker builds a windows executable.
QEMU_PROGW=qemu-system-$(TARGET_NAME)w$(EXESUF)
$(QEMU_PROG): $(QEMU_PROGW)
$(call quiet-command,$(OBJCOPY) --subsystem console $(QEMU_PROGW) $(QEMU_PROG),"GEN","$(TARGET_DIR)$(QEMU_PROG)")
$(call quiet-command,$(OBJCOPY) --subsystem console $(QEMU_PROGW) $(QEMU_PROG)," GEN $(TARGET_DIR)$(QEMU_PROG)")
QEMU_PROG_BUILD = $(QEMU_PROGW)
else
QEMU_PROG_BUILD = $(QEMU_PROG)
@@ -36,6 +36,10 @@ endif
PROGS=$(QEMU_PROG) $(QEMU_PROGW)
STPFILES=
ifdef CONFIG_LINUX_USER
PROGS+=$(QEMU_PROG)-binfmt
endif
config-target.h: config-target.h-timestamp
config-target.h-timestamp: config-target.mak
@@ -48,41 +52,34 @@ else
TARGET_TYPE=system
endif
tracetool-y = $(SRC_PATH)/scripts/tracetool.py
tracetool-y += $(shell find $(SRC_PATH)/scripts/tracetool -name "*.py")
$(QEMU_PROG).stp-installed: $(BUILD_DIR)/trace-events-all $(tracetool-y)
$(QEMU_PROG).stp-installed: $(SRC_PATH)/trace-events
$(call quiet-command,$(TRACETOOL) \
--group=all \
--format=stap \
--backends=$(TRACE_BACKENDS) \
--binary=$(bindir)/$(QEMU_PROG) \
--target-name=$(TARGET_NAME) \
--target-type=$(TARGET_TYPE) \
$< > $@,"GEN","$(TARGET_DIR)$(QEMU_PROG).stp-installed")
< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG).stp-installed")
$(QEMU_PROG).stp: $(BUILD_DIR)/trace-events-all $(tracetool-y)
$(QEMU_PROG).stp: $(SRC_PATH)/trace-events
$(call quiet-command,$(TRACETOOL) \
--group=all \
--format=stap \
--backends=$(TRACE_BACKENDS) \
--binary=$(realpath .)/$(QEMU_PROG) \
--target-name=$(TARGET_NAME) \
--target-type=$(TARGET_TYPE) \
$< > $@,"GEN","$(TARGET_DIR)$(QEMU_PROG).stp")
< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG).stp")
$(QEMU_PROG)-simpletrace.stp: $(BUILD_DIR)/trace-events-all $(tracetool-y)
$(QEMU_PROG)-simpletrace.stp: $(SRC_PATH)/trace-events
$(call quiet-command,$(TRACETOOL) \
--group=all \
--format=simpletrace-stap \
--backends=$(TRACE_BACKENDS) \
--probe-prefix=qemu.$(TARGET_TYPE).$(TARGET_NAME) \
$< > $@,"GEN","$(TARGET_DIR)$(QEMU_PROG)-simpletrace.stp")
< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG)-simpletrace.stp")
else
stap:
endif
.PHONY: stap
all: $(PROGS) stap
@@ -91,28 +88,36 @@ all: $(PROGS) stap
#########################################################
# cpu emulator library
obj-y += exec.o
obj-y += accel/
obj-$(CONFIG_TCG) += tcg/tcg.o tcg/tcg-op.o tcg/tcg-op-vec.o tcg/tcg-op-gvec.o
obj-$(CONFIG_TCG) += tcg/tcg-common.o tcg/optimize.o
obj-$(CONFIG_TCG_INTERPRETER) += tcg/tci.o
obj-y = exec.o translate-all.o cpu-exec.o
obj-y += translate-common.o
obj-y += cpu-exec-common.o
obj-y += tcg/tcg.o tcg/tcg-op.o tcg/optimize.o
obj-$(CONFIG_TCG_INTERPRETER) += tci.o
obj-y += tcg/tcg-common.o
obj-$(CONFIG_TCG_INTERPRETER) += disas/tci.o
obj-y += fpu/softfloat.o
obj-y += target/$(TARGET_BASE_ARCH)/
obj-y += target-$(TARGET_BASE_ARCH)/
obj-y += disas.o
obj-$(call notempty,$(TARGET_XML_FILES)) += gdbstub-xml.o
obj-$(call lnot,$(CONFIG_KVM)) += kvm-stub.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/decContext.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/decNumber.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal32.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal64.o
obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal128.o
#########################################################
# Linux user emulator target
ifdef CONFIG_LINUX_USER
QEMU_CFLAGS+=-I$(SRC_PATH)/linux-user/$(TARGET_ABI_DIR) \
-I$(SRC_PATH)/linux-user/host/$(ARCH) \
-I$(SRC_PATH)/linux-user
QEMU_CFLAGS+=-I$(SRC_PATH)/linux-user/$(TARGET_ABI_DIR) -I$(SRC_PATH)/linux-user
obj-y += linux-user/
obj-y += gdbstub.o thunk.o
obj-y += gdbstub.o thunk.o user-exec.o
obj-binfmt-y += linux-user/
endif #CONFIG_LINUX_USER
@@ -125,7 +130,7 @@ QEMU_CFLAGS+=-I$(SRC_PATH)/bsd-user -I$(SRC_PATH)/bsd-user/$(TARGET_ABI_DIR) \
-I$(SRC_PATH)/bsd-user/$(HOST_VARIANT_DIR)
obj-y += bsd-user/
obj-y += gdbstub.o
obj-y += gdbstub.o user-exec.o
endif #CONFIG_BSD_USER
@@ -133,14 +138,21 @@ endif #CONFIG_BSD_USER
# System emulator target
ifdef CONFIG_SOFTMMU
obj-y += arch_init.o cpus.o monitor.o gdbstub.o balloon.o ioport.o numa.o
obj-y += qtest.o
obj-y += qtest.o bootdevice.o
obj-y += hw/
obj-y += memory.o
obj-$(CONFIG_KVM) += kvm-all.o
obj-y += memory.o cputlb.o
obj-y += memory_mapping.o
obj-y += dump.o
obj-y += migration/ram.o
obj-y += migration/ram.o migration/savevm.o
LIBS := $(libs_softmmu) $(LIBS)
# xen support
obj-$(CONFIG_XEN) += xen-common.o
obj-$(CONFIG_XEN_I386) += xen-hvm.o xen-mapcache.o
obj-$(call lnot,$(CONFIG_XEN)) += xen-common-stub.o
obj-$(call lnot,$(CONFIG_XEN_I386)) += xen-hvm-stub.o
# Hardware support
ifeq ($(TARGET_NAME), sparc64)
obj-y += hw/sparc64/
@@ -148,27 +160,29 @@ else
obj-y += hw/$(TARGET_BASE_ARCH)/
endif
GENERATED_FILES += hmp-commands.h hmp-commands-info.h
GENERATED_HEADERS += hmp-commands.h hmp-commands-info.h qmp-commands-old.h
endif # CONFIG_SOFTMMU
# Workaround for http://gcc.gnu.org/PR55489, see configure.
%/translate.o: QEMU_CFLAGS += $(TRANSLATE_OPT_CFLAGS)
ifdef CONFIG_LINUX_USER
dummy := $(call unnest-vars,,obj-y obj-binfmt-y)
else
dummy := $(call unnest-vars,,obj-y)
endif
all-obj-y := $(obj-y)
target-obj-y :=
block-obj-y :=
common-obj-y :=
chardev-obj-y :=
include $(SRC_PATH)/Makefile.objs
dummy := $(call unnest-vars,,target-obj-y)
target-obj-y-save := $(target-obj-y)
dummy := $(call unnest-vars,.., \
block-obj-y \
block-obj-m \
chardev-obj-y \
crypto-obj-y \
crypto-aes-obj-y \
qom-obj-y \
@@ -179,36 +193,40 @@ target-obj-y := $(target-obj-y-save)
all-obj-y += $(common-obj-y)
all-obj-y += $(target-obj-y)
all-obj-y += $(qom-obj-y)
all-obj-$(CONFIG_SOFTMMU) += $(block-obj-y) $(chardev-obj-y)
all-obj-$(CONFIG_SOFTMMU) += $(block-obj-y)
all-obj-$(CONFIG_USER_ONLY) += $(crypto-aes-obj-y)
all-obj-$(CONFIG_SOFTMMU) += $(crypto-obj-y)
all-obj-$(CONFIG_SOFTMMU) += $(io-obj-y)
$(QEMU_PROG_BUILD): config-devices.mak
COMMON_LDADDS = ../libqemuutil.a
# build either PROG or PROGW
$(QEMU_PROG_BUILD): $(all-obj-y) $(COMMON_LDADDS)
$(QEMU_PROG_BUILD): $(all-obj-y) ../libqemuutil.a ../libqemustub.a
$(call LINK, $(filter-out %.mak, $^))
ifdef CONFIG_DARWIN
$(call quiet-command,Rez -append $(SRC_PATH)/pc-bios/qemu.rsrc -o $@,"REZ","$(TARGET_DIR)$@")
$(call quiet-command,SetFile -a C $@,"SETFILE","$(TARGET_DIR)$@")
$(call quiet-command,Rez -append $(SRC_PATH)/pc-bios/qemu.rsrc -o $@," REZ $(TARGET_DIR)$@")
$(call quiet-command,SetFile -a C $@," SETFILE $(TARGET_DIR)$@")
endif
$(QEMU_PROG)-binfmt: $(obj-binfmt-y)
$(call LINK,$^)
gdbstub-xml.c: $(TARGET_XML_FILES) $(SRC_PATH)/scripts/feature_to_c.sh
$(call quiet-command,rm -f $@ && $(SHELL) $(SRC_PATH)/scripts/feature_to_c.sh $@ $(TARGET_XML_FILES),"GEN","$(TARGET_DIR)$@")
$(call quiet-command,rm -f $@ && $(SHELL) $(SRC_PATH)/scripts/feature_to_c.sh $@ $(TARGET_XML_FILES)," GEN $(TARGET_DIR)$@")
hmp-commands.h: $(SRC_PATH)/hmp-commands.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@,"GEN","$(TARGET_DIR)$@")
hmp-commands.h: $(SRC_PATH)/hmp-commands.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@," GEN $(TARGET_DIR)$@")
hmp-commands-info.h: $(SRC_PATH)/hmp-commands-info.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@,"GEN","$(TARGET_DIR)$@")
hmp-commands-info.h: $(SRC_PATH)/hmp-commands-info.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@," GEN $(TARGET_DIR)$@")
clean: clean-target
qmp-commands-old.h: $(SRC_PATH)/qmp-commands.hx
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@," GEN $(TARGET_DIR)$@")
clean:
rm -f *.a *~ $(PROGS)
rm -f $(shell find . -name '*.[od]')
rm -f hmp-commands.h gdbstub-xml.c
rm -f hmp-commands.h qmp-commands-old.h gdbstub-xml.c
ifdef CONFIG_TRACE_SYSTEMTAP
rm -f *.stp
endif
@@ -223,5 +241,5 @@ ifdef CONFIG_TRACE_SYSTEMTAP
$(INSTALL_DATA) $(QEMU_PROG)-simpletrace.stp "$(DESTDIR)$(qemu_datadir)/../systemtap/tapset/$(QEMU_PROG)-simpletrace.stp"
endif
GENERATED_FILES += config-target.h
Makefile: $(GENERATED_FILES)
GENERATED_HEADERS += config-target.h
Makefile: $(GENERATED_HEADERS)

README

@@ -42,11 +42,12 @@ of other UNIX targets. The simple steps to build QEMU are:
../configure
make
Complete details of the process for building and configuring QEMU for
all supported host platforms can be found in the qemu-tech.html file.
Additional information can also be found online via the QEMU website:
https://qemu.org/Hosts/Linux
https://qemu.org/Hosts/Mac
https://qemu.org/Hosts/W32
http://qemu-project.org/Hosts/Linux
http://qemu-project.org/Hosts/W32
Submitting patches
@@ -54,7 +55,7 @@ Submitting patches
The QEMU source code is maintained under the GIT version control system.
git clone git://git.qemu.org/qemu.git
git clone git://git.qemu-project.org/qemu.git
When submitting patches, the preferred approach is to use 'git
format-patch' and/or 'git send-email' to format & send the mail to the
@@ -65,13 +66,9 @@ guidelines set out in the HACKING and CODING_STYLE files.
Additional information on submitting patches can be found online via
the QEMU website
https://qemu.org/Contribute/SubmitAPatch
https://qemu.org/Contribute/TrivialPatches
http://qemu-project.org/Contribute/SubmitAPatch
http://qemu-project.org/Contribute/TrivialPatches
The QEMU website is also maintained under source control.
git clone git://git.qemu.org/qemu-web.git
https://www.qemu.org/2017/02/04/the-new-qemu-website-is-up/
Bug reporting
=============
@@ -89,7 +86,7 @@ reported via launchpad.
For additional information on bug reporting consult:
https://qemu.org/Contribute/ReportABug
http://qemu-project.org/Contribute/ReportABug
Contact
@@ -99,12 +96,12 @@ The QEMU community can be contacted in a number of ways, with the two
main methods being email and IRC
- qemu-devel@nongnu.org
https://lists.nongnu.org/mailman/listinfo/qemu-devel
http://lists.nongnu.org/mailman/listinfo/qemu-devel
- #qemu on irc.oftc.net
Information on additional methods of contacting the community can be
found online via the QEMU website:
https://qemu.org/Contribute/StartHere
http://qemu-project.org/Contribute/StartHere
-- End


@@ -1 +1 @@
2.11.50
2.6.2


@@ -26,14 +26,23 @@
#include "qemu/osdep.h"
#include "sysemu/accel.h"
#include "hw/boards.h"
#include "qemu-common.h"
#include "sysemu/arch_init.h"
#include "sysemu/sysemu.h"
#include "sysemu/kvm.h"
#include "sysemu/qtest.h"
#include "hw/xen/xen.h"
#include "qom/object.h"
#include "qemu/error-report.h"
#include "qemu/option.h"
#include "hw/boards.h"
int tcg_tb_size;
static bool tcg_allowed = true;
static int tcg_init(MachineState *ms)
{
tcg_exec_init(tcg_tb_size * 1024 * 1024);
return 0;
}
static const TypeInfo accel_type = {
.name = TYPE_ACCEL,
@@ -68,22 +77,21 @@ static int accel_init_machine(AccelClass *acc, MachineState *ms)
return ret;
}
void configure_accelerator(MachineState *ms)
int configure_accelerator(MachineState *ms)
{
const char *accel, *p;
const char *p;
char buf[10];
int ret;
bool accel_initialised = false;
bool init_failed = false;
AccelClass *acc = NULL;
accel = qemu_opt_get(qemu_get_machine_opts(), "accel");
if (accel == NULL) {
p = qemu_opt_get(qemu_get_machine_opts(), "accel");
if (p == NULL) {
/* Use the default "accelerator", tcg */
accel = "tcg";
p = "tcg";
}
p = accel;
while (!accel_initialised && *p != '\0') {
if (*p == ':') {
p++;
@@ -91,6 +99,7 @@ void configure_accelerator(MachineState *ms)
p = get_opt_name(buf, sizeof(buf), p, ':');
acc = accel_find(buf);
if (!acc) {
fprintf(stderr, "\"%s\" accelerator not found.\n", buf);
continue;
}
if (acc->available && !acc->available()) {
@@ -101,8 +110,9 @@ void configure_accelerator(MachineState *ms)
ret = accel_init_machine(acc, ms);
if (ret < 0) {
init_failed = true;
error_report("failed to initialize %s: %s",
acc->name, strerror(-ret));
fprintf(stderr, "failed to initialize %s: %s\n",
acc->name,
strerror(-ret));
} else {
accel_initialised = true;
}
@@ -110,25 +120,39 @@ void configure_accelerator(MachineState *ms)
if (!accel_initialised) {
if (!init_failed) {
error_report("-machine accel=%s: No accelerator found", accel);
fprintf(stderr, "No accelerator found!\n");
}
exit(1);
}
if (init_failed) {
error_report("Back to %s accelerator", acc->name);
fprintf(stderr, "Back to %s accelerator.\n", acc->name);
}
return !accel_initialised;
}
void accel_register_compat_props(AccelState *accel)
static void tcg_accel_class_init(ObjectClass *oc, void *data)
{
AccelClass *class = ACCEL_GET_CLASS(accel);
register_compat_props_array(class->global_props);
AccelClass *ac = ACCEL_CLASS(oc);
ac->name = "tcg";
ac->init_machine = tcg_init;
ac->allowed = &tcg_allowed;
}
#define TYPE_TCG_ACCEL ACCEL_CLASS_NAME("tcg")
static const TypeInfo tcg_accel_type = {
.name = TYPE_TCG_ACCEL,
.parent = TYPE_ACCEL,
.class_init = tcg_accel_class_init,
};
static void register_accel_types(void)
{
type_register_static(&accel_type);
type_register_static(&tcg_accel_type);
}
type_init(register_accel_types);


@@ -1,4 +0,0 @@
obj-$(CONFIG_SOFTMMU) += accel.o
obj-y += kvm/
obj-$(CONFIG_TCG) += tcg/
obj-y += stubs/


@@ -1 +0,0 @@
obj-$(CONFIG_KVM) += kvm-all.o


@@ -1,16 +0,0 @@
# Trace events for debugging and performance instrumentation
# kvm-all.c
kvm_ioctl(int type, void *arg) "type 0x%x, arg %p"
kvm_vm_ioctl(int type, void *arg) "type 0x%x, arg %p"
kvm_vcpu_ioctl(int cpu_index, int type, void *arg) "cpu_index %d, type 0x%x, arg %p"
kvm_run_exit(int cpu_index, uint32_t reason) "cpu_index %d, reason %d"
kvm_device_ioctl(int fd, int type, void *arg) "dev fd %d, type 0x%x, arg %p"
kvm_failed_reg_get(uint64_t id, const char *msg) "Warning: Unable to retrieve ONEREG %" PRIu64 " from KVM: %s"
kvm_failed_reg_set(uint64_t id, const char *msg) "Warning: Unable to set ONEREG %" PRIu64 " to KVM: %s"
kvm_irqchip_commit_routes(void) ""
kvm_irqchip_add_msi_route(char *name, int vector, int virq) "dev %s vector %d virq %d"
kvm_irqchip_update_msi_route(int virq) "Updating MSI route virq=%d"
kvm_irqchip_release_virq(int virq) "virq %d"
kvm_set_user_memory(uint32_t slot, uint32_t flags, uint64_t guest_phys_addr, uint64_t memory_size, uint64_t userspace_addr, int ret) "Slot#%d flags=0x%x gpa=0x%"PRIx64 " size=0x%"PRIx64 " ua=0x%"PRIx64 " ret=%d"


@@ -1,5 +0,0 @@
obj-$(call lnot,$(CONFIG_HAX)) += hax-stub.o
obj-$(call lnot,$(CONFIG_HVF)) += hvf-stub.o
obj-$(call lnot,$(CONFIG_WHPX)) += whpx-stub.o
obj-$(call lnot,$(CONFIG_KVM)) += kvm-stub.o
obj-$(call lnot,$(CONFIG_TCG)) += tcg-stub.o


@@ -1,34 +0,0 @@
/*
* QEMU HAXM support
*
* Copyright (c) 2015, Intel Corporation
*
* Copyright 2016 Google, Inc.
*
* This software is licensed under the terms of the GNU General Public
* License version 2, as published by the Free Software Foundation, and
* may be copied, distributed, and modified under those terms.
*
* See the COPYING file in the top-level directory.
*
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "cpu.h"
#include "sysemu/hax.h"
int hax_sync_vcpus(void)
{
return 0;
}
int hax_init_vcpu(CPUState *cpu)
{
return -ENOSYS;
}
int hax_smp_cpu_exec(CPUState *cpu)
{
return -ENOSYS;
}


@@ -1,31 +0,0 @@
/*
* QEMU HVF support
*
* Copyright 2017 Red Hat, Inc.
*
* This software is licensed under the terms of the GNU General Public
* License version 2 or later, as published by the Free Software Foundation,
* and may be copied, distributed, and modified under those terms.
*
* See the COPYING file in the top-level directory.
*
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "cpu.h"
#include "sysemu/hvf.h"
int hvf_init_vcpu(CPUState *cpu)
{
return -ENOSYS;
}
int hvf_vcpu_exec(CPUState *cpu)
{
return -ENOSYS;
}
void hvf_vcpu_destroy(CPUState *cpu)
{
}


@@ -1,30 +0,0 @@
/*
* QEMU TCG accelerator stub
*
* Copyright Red Hat, Inc. 2013
*
* Author: Paolo Bonzini <pbonzini@redhat.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "cpu.h"
#include "tcg/tcg.h"
#include "exec/cpu-common.h"
#include "exec/exec-all.h"
void tb_flush(CPUState *cpu)
{
}
void tb_unlock(void)
{
}
void tlb_set_dirty(CPUState *cpu, target_ulong vaddr)
{
}


@@ -1,48 +0,0 @@
/*
* QEMU Windows Hypervisor Platform accelerator (WHPX) stub
*
* Copyright Microsoft Corp. 2017
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "cpu.h"
#include "sysemu/whpx.h"
int whpx_init_vcpu(CPUState *cpu)
{
return -1;
}
int whpx_vcpu_exec(CPUState *cpu)
{
return -1;
}
void whpx_destroy_vcpu(CPUState *cpu)
{
}
void whpx_vcpu_kick(CPUState *cpu)
{
}
void whpx_cpu_synchronize_state(CPUState *cpu)
{
}
void whpx_cpu_synchronize_post_reset(CPUState *cpu)
{
}
void whpx_cpu_synchronize_post_init(CPUState *cpu)
{
}
void whpx_cpu_synchronize_pre_loadvm(CPUState *cpu)
{
}


@@ -1,8 +0,0 @@
obj-$(CONFIG_SOFTMMU) += tcg-all.o
obj-$(CONFIG_SOFTMMU) += cputlb.o
obj-y += tcg-runtime.o tcg-runtime-gvec.o
obj-y += cpu-exec.o cpu-exec-common.o translate-all.o
obj-y += translator.o
obj-$(CONFIG_USER_ONLY) += user-exec.o
obj-$(call lnot,$(CONFIG_SOFTMMU)) += user-exec-stub.o


@@ -1,245 +0,0 @@
/*
* Atomic helper templates
* Included from tcg-runtime.c and cputlb.c.
*
* Copyright (c) 2016 Red Hat, Inc
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*/
#if DATA_SIZE == 16
# define SUFFIX o
# define DATA_TYPE Int128
# define BSWAP bswap128
#elif DATA_SIZE == 8
# define SUFFIX q
# define DATA_TYPE uint64_t
# define BSWAP bswap64
#elif DATA_SIZE == 4
# define SUFFIX l
# define DATA_TYPE uint32_t
# define BSWAP bswap32
#elif DATA_SIZE == 2
# define SUFFIX w
# define DATA_TYPE uint16_t
# define BSWAP bswap16
#elif DATA_SIZE == 1
# define SUFFIX b
# define DATA_TYPE uint8_t
# define BSWAP
#else
# error unsupported data size
#endif
#if DATA_SIZE >= 4
# define ABI_TYPE DATA_TYPE
#else
# define ABI_TYPE uint32_t
#endif
/* Define host-endian atomic operations. Note that END is used within
the ATOMIC_NAME macro, and redefined below. */
#if DATA_SIZE == 1
# define END
#elif defined(HOST_WORDS_BIGENDIAN)
# define END _be
#else
# define END _le
#endif
ABI_TYPE ATOMIC_NAME(cmpxchg)(CPUArchState *env, target_ulong addr,
ABI_TYPE cmpv, ABI_TYPE newv EXTRA_ARGS)
{
ATOMIC_MMU_DECLS;
DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
DATA_TYPE ret = atomic_cmpxchg__nocheck(haddr, cmpv, newv);
ATOMIC_MMU_CLEANUP;
return ret;
}
#if DATA_SIZE >= 16
ABI_TYPE ATOMIC_NAME(ld)(CPUArchState *env, target_ulong addr EXTRA_ARGS)
{
ATOMIC_MMU_DECLS;
DATA_TYPE val, *haddr = ATOMIC_MMU_LOOKUP;
__atomic_load(haddr, &val, __ATOMIC_RELAXED);
ATOMIC_MMU_CLEANUP;
return val;
}
void ATOMIC_NAME(st)(CPUArchState *env, target_ulong addr,
ABI_TYPE val EXTRA_ARGS)
{
ATOMIC_MMU_DECLS;
DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
__atomic_store(haddr, &val, __ATOMIC_RELAXED);
ATOMIC_MMU_CLEANUP;
}
#else
ABI_TYPE ATOMIC_NAME(xchg)(CPUArchState *env, target_ulong addr,
ABI_TYPE val EXTRA_ARGS)
{
ATOMIC_MMU_DECLS;
DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
DATA_TYPE ret = atomic_xchg__nocheck(haddr, val);
ATOMIC_MMU_CLEANUP;
return ret;
}
#define GEN_ATOMIC_HELPER(X) \
ABI_TYPE ATOMIC_NAME(X)(CPUArchState *env, target_ulong addr, \
ABI_TYPE val EXTRA_ARGS) \
{ \
ATOMIC_MMU_DECLS; \
DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP; \
DATA_TYPE ret = atomic_##X(haddr, val); \
ATOMIC_MMU_CLEANUP; \
return ret; \
}
GEN_ATOMIC_HELPER(fetch_add)
GEN_ATOMIC_HELPER(fetch_and)
GEN_ATOMIC_HELPER(fetch_or)
GEN_ATOMIC_HELPER(fetch_xor)
GEN_ATOMIC_HELPER(add_fetch)
GEN_ATOMIC_HELPER(and_fetch)
GEN_ATOMIC_HELPER(or_fetch)
GEN_ATOMIC_HELPER(xor_fetch)
#undef GEN_ATOMIC_HELPER
#endif /* DATA SIZE >= 16 */
#undef END
#if DATA_SIZE > 1
/* Define reverse-host-endian atomic operations. Note that END is used
within the ATOMIC_NAME macro. */
#ifdef HOST_WORDS_BIGENDIAN
# define END _le
#else
# define END _be
#endif
ABI_TYPE ATOMIC_NAME(cmpxchg)(CPUArchState *env, target_ulong addr,
ABI_TYPE cmpv, ABI_TYPE newv EXTRA_ARGS)
{
ATOMIC_MMU_DECLS;
DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
DATA_TYPE ret = atomic_cmpxchg__nocheck(haddr, BSWAP(cmpv), BSWAP(newv));
ATOMIC_MMU_CLEANUP;
return BSWAP(ret);
}
#if DATA_SIZE >= 16
ABI_TYPE ATOMIC_NAME(ld)(CPUArchState *env, target_ulong addr EXTRA_ARGS)
{
ATOMIC_MMU_DECLS;
DATA_TYPE val, *haddr = ATOMIC_MMU_LOOKUP;
__atomic_load(haddr, &val, __ATOMIC_RELAXED);
ATOMIC_MMU_CLEANUP;
return BSWAP(val);
}
void ATOMIC_NAME(st)(CPUArchState *env, target_ulong addr,
ABI_TYPE val EXTRA_ARGS)
{
ATOMIC_MMU_DECLS;
DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
val = BSWAP(val);
__atomic_store(haddr, &val, __ATOMIC_RELAXED);
ATOMIC_MMU_CLEANUP;
}
#else
ABI_TYPE ATOMIC_NAME(xchg)(CPUArchState *env, target_ulong addr,
ABI_TYPE val EXTRA_ARGS)
{
ATOMIC_MMU_DECLS;
DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
ABI_TYPE ret = atomic_xchg__nocheck(haddr, BSWAP(val));
ATOMIC_MMU_CLEANUP;
return BSWAP(ret);
}
#define GEN_ATOMIC_HELPER(X) \
ABI_TYPE ATOMIC_NAME(X)(CPUArchState *env, target_ulong addr, \
ABI_TYPE val EXTRA_ARGS) \
{ \
ATOMIC_MMU_DECLS; \
DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP; \
DATA_TYPE ret = atomic_##X(haddr, BSWAP(val)); \
ATOMIC_MMU_CLEANUP; \
return BSWAP(ret); \
}
GEN_ATOMIC_HELPER(fetch_and)
GEN_ATOMIC_HELPER(fetch_or)
GEN_ATOMIC_HELPER(fetch_xor)
GEN_ATOMIC_HELPER(and_fetch)
GEN_ATOMIC_HELPER(or_fetch)
GEN_ATOMIC_HELPER(xor_fetch)
#undef GEN_ATOMIC_HELPER
/* Note that for addition, we need to use a separate cmpxchg loop instead
of bswaps for the reverse-host-endian helpers. */
ABI_TYPE ATOMIC_NAME(fetch_add)(CPUArchState *env, target_ulong addr,
ABI_TYPE val EXTRA_ARGS)
{
ATOMIC_MMU_DECLS;
DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
DATA_TYPE ldo, ldn, ret, sto;
ldo = atomic_read__nocheck(haddr);
while (1) {
ret = BSWAP(ldo);
sto = BSWAP(ret + val);
ldn = atomic_cmpxchg__nocheck(haddr, ldo, sto);
if (ldn == ldo) {
ATOMIC_MMU_CLEANUP;
return ret;
}
ldo = ldn;
}
}
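/* Standalone sketch (names and types local to this example, compiles with
 * GCC/clang builtins) illustrating the comment above: carry propagation
 * crosses byte boundaries, so bswap(a) + bswap(b) != bswap(a + b), and an
 * add on a byte-swapped value needs a compare-and-swap loop that swaps to
 * host order, adds, swaps back, and retries on contention. */
#include <stdint.h>
#include <stdio.h>

static uint32_t byteswapped_fetch_add(uint32_t *mem, uint32_t val)
{
    uint32_t ldo = __atomic_load_n(mem, __ATOMIC_RELAXED);
    for (;;) {
        uint32_t ret = __builtin_bswap32(ldo);         /* host-order view */
        uint32_t sto = __builtin_bswap32(ret + val);   /* store back swapped */
        if (__atomic_compare_exchange_n(mem, &ldo, sto, false,
                                        __ATOMIC_RELAXED, __ATOMIC_RELAXED)) {
            return ret;                                /* old value, host order */
        }
        /* on failure, ldo was refreshed with the current contents; retry */
    }
}

int main(void)
{
    uint32_t mem = __builtin_bswap32(255);             /* 255 in swapped order */
    uint32_t prev = byteswapped_fetch_add(&mem, 1);
    printf("prev=%u new=%u\n", prev, __builtin_bswap32(mem));   /* 255 256 */
    return 0;
}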
ABI_TYPE ATOMIC_NAME(add_fetch)(CPUArchState *env, target_ulong addr,
ABI_TYPE val EXTRA_ARGS)
{
ATOMIC_MMU_DECLS;
DATA_TYPE *haddr = ATOMIC_MMU_LOOKUP;
DATA_TYPE ldo, ldn, ret, sto;
ldo = atomic_read__nocheck(haddr);
while (1) {
ret = BSWAP(ldo) + val;
sto = BSWAP(ret);
ldn = atomic_cmpxchg__nocheck(haddr, ldo, sto);
if (ldn == ldo) {
ATOMIC_MMU_CLEANUP;
return ret;
}
ldo = ldn;
}
}
#endif /* DATA_SIZE >= 16 */
#undef END
#endif /* DATA_SIZE > 1 */
#undef BSWAP
#undef ABI_TYPE
#undef DATA_TYPE
#undef SUFFIX
#undef DATA_SIZE
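/* Standalone sketch (assumed shape, not the actual QEMU macros): the template
 * above is included repeatedly with DATA_SIZE/SUFFIX/END set by the including
 * file, and ATOMIC_NAME pastes those tokens into one helper name per size and
 * endianness. The toy macros below reproduce that pasting for SUFFIX = l,
 * END = _le and print the resulting identifiers. */
#include <stdio.h>

#define SUFFIX l
#define END _le
#define GLUE(a, b)      a##b
#define XGLUE(a, b)     GLUE(a, b)
#define STR(x)          #x
#define XSTR(x)         STR(x)
#define ATOMIC_NAME(x)  XGLUE(XGLUE(XGLUE(helper_atomic_, x), SUFFIX), END)

int main(void)
{
    /* The names the helpers above would get for a 32-bit little-endian access. */
    puts(XSTR(ATOMIC_NAME(fetch_add)));   /* helper_atomic_fetch_addl_le */
    puts(XSTR(ATOMIC_NAME(cmpxchg)));     /* helper_atomic_cmpxchgl_le */
    return 0;
}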


@@ -1,743 +0,0 @@
/*
* emulator main execution loop
*
* Copyright (c) 2003-2005 Fabrice Bellard
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*/
#include "qemu/osdep.h"
#include "cpu.h"
#include "trace.h"
#include "disas/disas.h"
#include "exec/exec-all.h"
#include "tcg.h"
#include "qemu/atomic.h"
#include "sysemu/qtest.h"
#include "qemu/timer.h"
#include "exec/address-spaces.h"
#include "qemu/rcu.h"
#include "exec/tb-hash.h"
#include "exec/tb-lookup.h"
#include "exec/log.h"
#include "qemu/main-loop.h"
#if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
#include "hw/i386/apic.h"
#endif
#include "sysemu/cpus.h"
#include "sysemu/replay.h"
/* -icount align implementation. */
typedef struct SyncClocks {
int64_t diff_clk;
int64_t last_cpu_icount;
int64_t realtime_clock;
} SyncClocks;
#if !defined(CONFIG_USER_ONLY)
/* Allow the guest to have a max 3ms advance.
* The difference between the 2 clocks could therefore
* oscillate around 0.
*/
#define VM_CLOCK_ADVANCE 3000000
#define THRESHOLD_REDUCE 1.5
#define MAX_DELAY_PRINT_RATE 2000000000LL
#define MAX_NB_PRINTS 100
static void align_clocks(SyncClocks *sc, const CPUState *cpu)
{
int64_t cpu_icount;
if (!icount_align_option) {
return;
}
cpu_icount = cpu->icount_extra + cpu->icount_decr.u16.low;
sc->diff_clk += cpu_icount_to_ns(sc->last_cpu_icount - cpu_icount);
sc->last_cpu_icount = cpu_icount;
if (sc->diff_clk > VM_CLOCK_ADVANCE) {
#ifndef _WIN32
struct timespec sleep_delay, rem_delay;
sleep_delay.tv_sec = sc->diff_clk / 1000000000LL;
sleep_delay.tv_nsec = sc->diff_clk % 1000000000LL;
if (nanosleep(&sleep_delay, &rem_delay) < 0) {
sc->diff_clk = rem_delay.tv_sec * 1000000000LL + rem_delay.tv_nsec;
} else {
sc->diff_clk = 0;
}
#else
Sleep(sc->diff_clk / SCALE_MS);
sc->diff_clk = 0;
#endif
}
}
static void print_delay(const SyncClocks *sc)
{
static float threshold_delay;
static int64_t last_realtime_clock;
static int nb_prints;
if (icount_align_option &&
sc->realtime_clock - last_realtime_clock >= MAX_DELAY_PRINT_RATE &&
nb_prints < MAX_NB_PRINTS) {
if ((-sc->diff_clk / (float)1000000000LL > threshold_delay) ||
(-sc->diff_clk / (float)1000000000LL <
(threshold_delay - THRESHOLD_REDUCE))) {
threshold_delay = (-sc->diff_clk / 1000000000LL) + 1;
printf("Warning: The guest is now late by %.1f to %.1f seconds\n",
threshold_delay - 1,
threshold_delay);
nb_prints++;
last_realtime_clock = sc->realtime_clock;
}
}
}
static void init_delay_params(SyncClocks *sc,
const CPUState *cpu)
{
if (!icount_align_option) {
return;
}
sc->realtime_clock = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL_RT);
sc->diff_clk = qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) - sc->realtime_clock;
sc->last_cpu_icount = cpu->icount_extra + cpu->icount_decr.u16.low;
if (sc->diff_clk < max_delay) {
max_delay = sc->diff_clk;
}
if (sc->diff_clk > max_advance) {
max_advance = sc->diff_clk;
}
/* Print every 2s max if the guest is late. We limit the number
of printed messages to MAX_NB_PRINTS (currently 100). */
print_delay(sc);
}
#else
static void align_clocks(SyncClocks *sc, const CPUState *cpu)
{
}
static void init_delay_params(SyncClocks *sc, const CPUState *cpu)
{
}
#endif /* CONFIG_USER_ONLY */
/* Execute a TB, and fix up the CPU state afterwards if necessary */
static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, TranslationBlock *itb)
{
CPUArchState *env = cpu->env_ptr;
uintptr_t ret;
TranslationBlock *last_tb;
int tb_exit;
uint8_t *tb_ptr = itb->tc.ptr;
qemu_log_mask_and_addr(CPU_LOG_EXEC, itb->pc,
"Trace %d: %p ["
TARGET_FMT_lx "/" TARGET_FMT_lx "/%#x] %s\n",
cpu->cpu_index, itb->tc.ptr,
itb->cs_base, itb->pc, itb->flags,
lookup_symbol(itb->pc));
#if defined(DEBUG_DISAS)
if (qemu_loglevel_mask(CPU_LOG_TB_CPU)
&& qemu_log_in_addr_range(itb->pc)) {
qemu_log_lock();
#if defined(TARGET_I386)
log_cpu_state(cpu, CPU_DUMP_CCOP);
#else
log_cpu_state(cpu, 0);
#endif
qemu_log_unlock();
}
#endif /* DEBUG_DISAS */
cpu->can_do_io = !use_icount;
ret = tcg_qemu_tb_exec(env, tb_ptr);
cpu->can_do_io = 1;
last_tb = (TranslationBlock *)(ret & ~TB_EXIT_MASK);
tb_exit = ret & TB_EXIT_MASK;
trace_exec_tb_exit(last_tb, tb_exit);
if (tb_exit > TB_EXIT_IDX1) {
/* We didn't start executing this TB (eg because the instruction
* counter hit zero); we must restore the guest PC to the address
* of the start of the TB.
*/
CPUClass *cc = CPU_GET_CLASS(cpu);
qemu_log_mask_and_addr(CPU_LOG_EXEC, last_tb->pc,
"Stopped execution of TB chain before %p ["
TARGET_FMT_lx "] %s\n",
last_tb->tc.ptr, last_tb->pc,
lookup_symbol(last_tb->pc));
if (cc->synchronize_from_tb) {
cc->synchronize_from_tb(cpu, last_tb);
} else {
assert(cc->set_pc);
cc->set_pc(cpu, last_tb->pc);
}
}
return ret;
}
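/* Standalone sketch of the packing used just above: tcg_qemu_tb_exec() returns
 * the last TranslationBlock pointer with the exit index stored in its low
 * bits, and cpu_tb_exec() splits them with ~TB_EXIT_MASK / TB_EXIT_MASK.
 * The example below shows the same tag-in-low-bits trick; the mask name and
 * values are local to the example. */
#include <assert.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

#define EXIT_MASK 0x3u                       /* two low bits carry the reason */

int main(void)
{
    void *block = aligned_alloc(16, 64);     /* aligned => low bits are free */
    unsigned exit_idx = 2;                   /* e.g. "exit requested" */

    uintptr_t packed = (uintptr_t)block | exit_idx;
    void *unpacked   = (void *)(packed & ~(uintptr_t)EXIT_MASK);
    unsigned idx     = packed & EXIT_MASK;

    assert(unpacked == block && idx == exit_idx);
    printf("pointer recovered, exit index %u\n", idx);
    free(block);
    return 0;
}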
#ifndef CONFIG_USER_ONLY
/* Execute the code without caching the generated code. An interpreter
could be used if available. */
static void cpu_exec_nocache(CPUState *cpu, int max_cycles,
TranslationBlock *orig_tb, bool ignore_icount)
{
TranslationBlock *tb;
uint32_t cflags = curr_cflags() | CF_NOCACHE;
if (ignore_icount) {
cflags &= ~CF_USE_ICOUNT;
}
/* Should never happen.
We only end up here when an existing TB is too long. */
cflags |= MIN(max_cycles, CF_COUNT_MASK);
tb_lock();
tb = tb_gen_code(cpu, orig_tb->pc, orig_tb->cs_base,
orig_tb->flags, cflags);
tb->orig_tb = orig_tb;
tb_unlock();
/* execute the generated code */
trace_exec_tb_nocache(tb, tb->pc);
cpu_tb_exec(cpu, tb);
tb_lock();
tb_phys_invalidate(tb, -1);
tb_remove(tb);
tb_unlock();
}
#endif
void cpu_exec_step_atomic(CPUState *cpu)
{
CPUClass *cc = CPU_GET_CLASS(cpu);
TranslationBlock *tb;
target_ulong cs_base, pc;
uint32_t flags;
uint32_t cflags = 1;
uint32_t cf_mask = cflags & CF_HASH_MASK;
/* volatile because we modify it between setjmp and longjmp */
volatile bool in_exclusive_region = false;
if (sigsetjmp(cpu->jmp_env, 0) == 0) {
tb = tb_lookup__cpu_state(cpu, &pc, &cs_base, &flags, cf_mask);
if (tb == NULL) {
mmap_lock();
tb_lock();
tb = tb_htable_lookup(cpu, pc, cs_base, flags, cf_mask);
if (likely(tb == NULL)) {
tb = tb_gen_code(cpu, pc, cs_base, flags, cflags);
}
tb_unlock();
mmap_unlock();
}
start_exclusive();
/* Since we got here, we know that parallel_cpus must be true. */
parallel_cpus = false;
in_exclusive_region = true;
cc->cpu_exec_enter(cpu);
/* execute the generated code */
trace_exec_tb(tb, pc);
cpu_tb_exec(cpu, tb);
cc->cpu_exec_exit(cpu);
} else {
/* We may have exited due to another problem here, so we need
* to reset any tb_locks we may have taken but didn't release.
* The mmap_lock is dropped by tb_gen_code if it runs out of
* memory.
*/
#ifndef CONFIG_SOFTMMU
tcg_debug_assert(!have_mmap_lock());
#endif
tb_lock_reset();
}
if (in_exclusive_region) {
/* We might longjump out of either the codegen or the
* execution, so must make sure we only end the exclusive
* region if we started it.
*/
parallel_cpus = true;
end_exclusive();
}
}
struct tb_desc {
target_ulong pc;
target_ulong cs_base;
CPUArchState *env;
tb_page_addr_t phys_page1;
uint32_t flags;
uint32_t cf_mask;
uint32_t trace_vcpu_dstate;
};
static bool tb_cmp(const void *p, const void *d)
{
const TranslationBlock *tb = p;
const struct tb_desc *desc = d;
if (tb->pc == desc->pc &&
tb->page_addr[0] == desc->phys_page1 &&
tb->cs_base == desc->cs_base &&
tb->flags == desc->flags &&
tb->trace_vcpu_dstate == desc->trace_vcpu_dstate &&
(tb_cflags(tb) & (CF_HASH_MASK | CF_INVALID)) == desc->cf_mask) {
/* check next page if needed */
if (tb->page_addr[1] == -1) {
return true;
} else {
tb_page_addr_t phys_page2;
target_ulong virt_page2;
virt_page2 = (desc->pc & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
phys_page2 = get_page_addr_code(desc->env, virt_page2);
if (tb->page_addr[1] == phys_page2) {
return true;
}
}
}
return false;
}
TranslationBlock *tb_htable_lookup(CPUState *cpu, target_ulong pc,
target_ulong cs_base, uint32_t flags,
uint32_t cf_mask)
{
tb_page_addr_t phys_pc;
struct tb_desc desc;
uint32_t h;
desc.env = (CPUArchState *)cpu->env_ptr;
desc.cs_base = cs_base;
desc.flags = flags;
desc.cf_mask = cf_mask;
desc.trace_vcpu_dstate = *cpu->trace_dstate;
desc.pc = pc;
phys_pc = get_page_addr_code(desc.env, pc);
desc.phys_page1 = phys_pc & TARGET_PAGE_MASK;
h = tb_hash_func(phys_pc, pc, flags, cf_mask, *cpu->trace_dstate);
return qht_lookup(&tb_ctx.htable, tb_cmp, &desc, h);
}
void tb_set_jmp_target(TranslationBlock *tb, int n, uintptr_t addr)
{
if (TCG_TARGET_HAS_direct_jump) {
uintptr_t offset = tb->jmp_target_arg[n];
uintptr_t tc_ptr = (uintptr_t)tb->tc.ptr;
tb_target_set_jmp_target(tc_ptr, tc_ptr + offset, addr);
} else {
tb->jmp_target_arg[n] = addr;
}
}
/* Called with tb_lock held. */
static inline void tb_add_jump(TranslationBlock *tb, int n,
TranslationBlock *tb_next)
{
assert(n < ARRAY_SIZE(tb->jmp_list_next));
if (tb->jmp_list_next[n]) {
/* Another thread has already done this while we were
* outside of the lock; nothing to do in this case */
return;
}
qemu_log_mask_and_addr(CPU_LOG_EXEC, tb->pc,
"Linking TBs %p [" TARGET_FMT_lx
"] index %d -> %p [" TARGET_FMT_lx "]\n",
tb->tc.ptr, tb->pc, n,
tb_next->tc.ptr, tb_next->pc);
/* patch the native jump address */
tb_set_jmp_target(tb, n, (uintptr_t)tb_next->tc.ptr);
/* add in TB jmp circular list */
tb->jmp_list_next[n] = tb_next->jmp_list_first;
tb_next->jmp_list_first = (uintptr_t)tb | n;
}
static inline TranslationBlock *tb_find(CPUState *cpu,
TranslationBlock *last_tb,
int tb_exit, uint32_t cf_mask)
{
TranslationBlock *tb;
target_ulong cs_base, pc;
uint32_t flags;
bool acquired_tb_lock = false;
tb = tb_lookup__cpu_state(cpu, &pc, &cs_base, &flags, cf_mask);
if (tb == NULL) {
/* mmap_lock is needed by tb_gen_code, and mmap_lock must be
* taken outside tb_lock. As system emulation is currently
* single threaded the locks are NOPs.
*/
mmap_lock();
tb_lock();
acquired_tb_lock = true;
/* There's a chance that our desired tb has been translated while
* taking the locks so we check again inside the lock.
*/
tb = tb_htable_lookup(cpu, pc, cs_base, flags, cf_mask);
if (likely(tb == NULL)) {
/* if no translated code available, then translate it now */
tb = tb_gen_code(cpu, pc, cs_base, flags, cf_mask);
}
mmap_unlock();
/* We add the TB in the virtual pc hash table for the fast lookup */
atomic_set(&cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)], tb);
}
#ifndef CONFIG_USER_ONLY
/* We don't take care of direct jumps when address mapping changes in
* system emulation. So it's not safe to make a direct jump to a TB
* spanning two pages because the mapping for the second page can change.
*/
if (tb->page_addr[1] != -1) {
last_tb = NULL;
}
#endif
/* See if we can patch the calling TB. */
if (last_tb && !qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
if (!acquired_tb_lock) {
tb_lock();
acquired_tb_lock = true;
}
if (!(tb->cflags & CF_INVALID)) {
tb_add_jump(last_tb, tb_exit, tb);
}
}
if (acquired_tb_lock) {
tb_unlock();
}
return tb;
}
static inline bool cpu_handle_halt(CPUState *cpu)
{
if (cpu->halted) {
#if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
if ((cpu->interrupt_request & CPU_INTERRUPT_POLL)
&& replay_interrupt()) {
X86CPU *x86_cpu = X86_CPU(cpu);
qemu_mutex_lock_iothread();
apic_poll_irq(x86_cpu->apic_state);
cpu_reset_interrupt(cpu, CPU_INTERRUPT_POLL);
qemu_mutex_unlock_iothread();
}
#endif
if (!cpu_has_work(cpu)) {
return true;
}
cpu->halted = 0;
}
return false;
}
static inline void cpu_handle_debug_exception(CPUState *cpu)
{
CPUClass *cc = CPU_GET_CLASS(cpu);
CPUWatchpoint *wp;
if (!cpu->watchpoint_hit) {
QTAILQ_FOREACH(wp, &cpu->watchpoints, entry) {
wp->flags &= ~BP_WATCHPOINT_HIT;
}
}
cc->debug_excp_handler(cpu);
}
static inline bool cpu_handle_exception(CPUState *cpu, int *ret)
{
if (cpu->exception_index < 0) {
#ifndef CONFIG_USER_ONLY
if (replay_has_exception()
&& cpu->icount_decr.u16.low + cpu->icount_extra == 0) {
/* try to cause an exception pending in the log */
cpu_exec_nocache(cpu, 1, tb_find(cpu, NULL, 0, curr_cflags()), true);
}
#endif
if (cpu->exception_index < 0) {
return false;
}
}
if (cpu->exception_index >= EXCP_INTERRUPT) {
/* exit request from the cpu execution loop */
*ret = cpu->exception_index;
if (*ret == EXCP_DEBUG) {
cpu_handle_debug_exception(cpu);
}
cpu->exception_index = -1;
return true;
} else {
#if defined(CONFIG_USER_ONLY)
/* if user mode only, we simulate a fake exception
which will be handled outside the cpu execution
loop */
#if defined(TARGET_I386)
CPUClass *cc = CPU_GET_CLASS(cpu);
cc->do_interrupt(cpu);
#endif
*ret = cpu->exception_index;
cpu->exception_index = -1;
return true;
#else
if (replay_exception()) {
CPUClass *cc = CPU_GET_CLASS(cpu);
qemu_mutex_lock_iothread();
cc->do_interrupt(cpu);
qemu_mutex_unlock_iothread();
cpu->exception_index = -1;
} else if (!replay_has_interrupt()) {
/* give a chance to iothread in replay mode */
*ret = EXCP_INTERRUPT;
return true;
}
#endif
}
return false;
}
static inline bool cpu_handle_interrupt(CPUState *cpu,
TranslationBlock **last_tb)
{
CPUClass *cc = CPU_GET_CLASS(cpu);
/* Clear the interrupt flag now since we're processing
* cpu->interrupt_request and cpu->exit_request.
* Ensure zeroing happens before reading cpu->exit_request or
* cpu->interrupt_request (see also smp_wmb in cpu_exit())
*/
atomic_mb_set(&cpu->icount_decr.u16.high, 0);
if (unlikely(atomic_read(&cpu->interrupt_request))) {
int interrupt_request;
qemu_mutex_lock_iothread();
interrupt_request = cpu->interrupt_request;
if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
/* Mask out external interrupts for this step. */
interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
}
if (interrupt_request & CPU_INTERRUPT_DEBUG) {
cpu->interrupt_request &= ~CPU_INTERRUPT_DEBUG;
cpu->exception_index = EXCP_DEBUG;
qemu_mutex_unlock_iothread();
return true;
}
if (replay_mode == REPLAY_MODE_PLAY && !replay_has_interrupt()) {
/* Do nothing */
} else if (interrupt_request & CPU_INTERRUPT_HALT) {
replay_interrupt();
cpu->interrupt_request &= ~CPU_INTERRUPT_HALT;
cpu->halted = 1;
cpu->exception_index = EXCP_HLT;
qemu_mutex_unlock_iothread();
return true;
}
#if defined(TARGET_I386)
else if (interrupt_request & CPU_INTERRUPT_INIT) {
X86CPU *x86_cpu = X86_CPU(cpu);
CPUArchState *env = &x86_cpu->env;
replay_interrupt();
cpu_svm_check_intercept_param(env, SVM_EXIT_INIT, 0, 0);
do_cpu_init(x86_cpu);
cpu->exception_index = EXCP_HALTED;
qemu_mutex_unlock_iothread();
return true;
}
#else
else if (interrupt_request & CPU_INTERRUPT_RESET) {
replay_interrupt();
cpu_reset(cpu);
qemu_mutex_unlock_iothread();
return true;
}
#endif
/* The target hook has 3 exit conditions:
False when the interrupt isn't processed,
True when it is, and we should restart on a new TB,
and an exit via longjmp through cpu_loop_exit. */
else {
if (cc->cpu_exec_interrupt(cpu, interrupt_request)) {
replay_interrupt();
*last_tb = NULL;
}
/* The target hook may have updated the 'cpu->interrupt_request';
* reload the 'interrupt_request' value */
interrupt_request = cpu->interrupt_request;
}
if (interrupt_request & CPU_INTERRUPT_EXITTB) {
cpu->interrupt_request &= ~CPU_INTERRUPT_EXITTB;
/* ensure that no TB jump will be modified as
the program flow was changed */
*last_tb = NULL;
}
/* If we exit via cpu_loop_exit/longjmp it is reset in cpu_exec */
qemu_mutex_unlock_iothread();
}
/* Finally, check if we need to exit to the main loop. */
if (unlikely(atomic_read(&cpu->exit_request)
|| (use_icount && cpu->icount_decr.u16.low + cpu->icount_extra == 0))) {
atomic_set(&cpu->exit_request, 0);
cpu->exception_index = EXCP_INTERRUPT;
return true;
}
return false;
}
static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
TranslationBlock **last_tb, int *tb_exit)
{
uintptr_t ret;
int32_t insns_left;
trace_exec_tb(tb, tb->pc);
ret = cpu_tb_exec(cpu, tb);
tb = (TranslationBlock *)(ret & ~TB_EXIT_MASK);
*tb_exit = ret & TB_EXIT_MASK;
if (*tb_exit != TB_EXIT_REQUESTED) {
*last_tb = tb;
return;
}
*last_tb = NULL;
insns_left = atomic_read(&cpu->icount_decr.u32);
if (insns_left < 0) {
/* Something asked us to stop executing chained TBs; just
* continue round the main loop. Whatever requested the exit
* will also have set something else (eg exit_request or
* interrupt_request) which will be handled by
* cpu_handle_interrupt. cpu_handle_interrupt will also
* clear cpu->icount_decr.u16.high.
*/
return;
}
/* Instruction counter expired. */
assert(use_icount);
#ifndef CONFIG_USER_ONLY
/* Ensure global icount has gone forward */
cpu_update_icount(cpu);
/* Refill decrementer and continue execution. */
insns_left = MIN(0xffff, cpu->icount_budget);
cpu->icount_decr.u16.low = insns_left;
cpu->icount_extra = cpu->icount_budget - insns_left;
if (!cpu->icount_extra) {
/* Execute any remaining instructions, then let the main loop
* handle the next event.
*/
if (insns_left > 0) {
cpu_exec_nocache(cpu, insns_left, tb, false);
}
}
#endif
}
/* main execution loop */
int cpu_exec(CPUState *cpu)
{
CPUClass *cc = CPU_GET_CLASS(cpu);
int ret;
SyncClocks sc = { 0 };
/* replay_interrupt may need current_cpu */
current_cpu = cpu;
if (cpu_handle_halt(cpu)) {
return EXCP_HALTED;
}
rcu_read_lock();
cc->cpu_exec_enter(cpu);
/* Calculate difference between guest clock and host clock.
* This delay includes the delay of the last cycle, so
* what we have to do is sleep until it is 0. As for the
* advance/delay we gain here, we try to fix it next time.
*/
init_delay_params(&sc, cpu);
/* prepare setjmp context for exception handling */
if (sigsetjmp(cpu->jmp_env, 0) != 0) {
#if defined(__clang__) || !QEMU_GNUC_PREREQ(4, 6)
/* Some compilers wrongly smash all local variables after
* siglongjmp. There were bug reports for gcc 4.5.0 and clang.
* Reload essential local variables here for those compilers.
* Newer versions of gcc would complain about this code (-Wclobbered). */
cpu = current_cpu;
cc = CPU_GET_CLASS(cpu);
#else /* buggy compiler */
/* Assert that the compiler does not smash local variables. */
g_assert(cpu == current_cpu);
g_assert(cc == CPU_GET_CLASS(cpu));
#endif /* buggy compiler */
cpu->can_do_io = 1;
tb_lock_reset();
if (qemu_mutex_iothread_locked()) {
qemu_mutex_unlock_iothread();
}
}
/* if an exception is pending, we execute it here */
while (!cpu_handle_exception(cpu, &ret)) {
TranslationBlock *last_tb = NULL;
int tb_exit = 0;
while (!cpu_handle_interrupt(cpu, &last_tb)) {
uint32_t cflags = cpu->cflags_next_tb;
TranslationBlock *tb;
/* When requested, use an exact setting for cflags for the next
execution. This is used for icount, precise smc, and stop-
after-access watchpoints. Since this request should never
have CF_INVALID set, -1 is a convenient invalid value that
does not require tcg headers for cpu_common_reset. */
if (cflags == -1) {
cflags = curr_cflags();
} else {
cpu->cflags_next_tb = -1;
}
tb = tb_find(cpu, last_tb, tb_exit, cflags);
cpu_loop_exec_tb(cpu, tb, &last_tb, &tb_exit);
/* Try to align the host and virtual clocks
if the guest is in advance */
align_clocks(&sc, cpu);
}
}
cc->cpu_exec_exit(cpu);
rcu_read_unlock();
return ret;
}
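/* Standalone sketch of the sigsetjmp/siglongjmp pattern cpu_exec() relies on:
 * the outer loop saves a jump buffer, and code deep inside TB execution or
 * exception handling jumps straight back to it instead of returning through
 * every frame. POSIX setjmp.h only; names are local to this example. */
#include <setjmp.h>
#include <stdio.h>

static sigjmp_buf jmp_env;

static void deep_inside_tb(int n)
{
    if (n == 3) {
        siglongjmp(jmp_env, 1);   /* e.g. a guest exception was raised */
    }
    printf("executed step %d\n", n);
}

int main(void)
{
    if (sigsetjmp(jmp_env, 0) != 0) {
        printf("longjmp taken: back in the outer loop\n");
        return 0;
    }
    for (int i = 0; i < 10; i++) {
        deep_inside_tb(i);
    }
    return 0;
}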

File diff suppressed because it is too large


@@ -1,92 +0,0 @@
/*
* QEMU System Emulator, accelerator interfaces
*
* Copyright (c) 2003-2008 Fabrice Bellard
* Copyright (c) 2014 Red Hat Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "sysemu/accel.h"
#include "sysemu/sysemu.h"
#include "qom/object.h"
#include "qemu-common.h"
#include "qom/cpu.h"
#include "sysemu/cpus.h"
#include "qemu/main-loop.h"
unsigned long tcg_tb_size;
#ifndef CONFIG_USER_ONLY
/* mask must never be zero, except for A20 change call */
static void tcg_handle_interrupt(CPUState *cpu, int mask)
{
int old_mask;
g_assert(qemu_mutex_iothread_locked());
old_mask = cpu->interrupt_request;
cpu->interrupt_request |= mask;
/*
* If called from iothread context, wake the target cpu in
* case it's halted.
*/
if (!qemu_cpu_is_self(cpu)) {
qemu_cpu_kick(cpu);
} else {
cpu->icount_decr.u16.high = -1;
if (use_icount &&
!cpu->can_do_io
&& (mask & ~old_mask) != 0) {
cpu_abort(cpu, "Raised interrupt while not in I/O function");
}
}
}
#endif
static int tcg_init(MachineState *ms)
{
tcg_exec_init(tcg_tb_size * 1024 * 1024);
cpu_interrupt_handler = tcg_handle_interrupt;
return 0;
}
static void tcg_accel_class_init(ObjectClass *oc, void *data)
{
AccelClass *ac = ACCEL_CLASS(oc);
ac->name = "tcg";
ac->init_machine = tcg_init;
ac->allowed = &tcg_allowed;
}
#define TYPE_TCG_ACCEL ACCEL_CLASS_NAME("tcg")
static const TypeInfo tcg_accel_type = {
.name = TYPE_TCG_ACCEL,
.parent = TYPE_ACCEL,
.class_init = tcg_accel_class_init,
};
static void register_accel_types(void)
{
type_register_static(&tcg_accel_type);
}
type_init(register_accel_types);


@@ -1,997 +0,0 @@
/*
* Generic vectorized operation runtime
*
* Copyright (c) 2018 Linaro
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*/
#include "qemu/osdep.h"
#include "qemu/host-utils.h"
#include "cpu.h"
#include "exec/helper-proto.h"
#include "tcg-gvec-desc.h"
/* Virtually all hosts support 16-byte vectors. Those that don't can emulate
* them via GCC's generic vector extension. This turns out to be simpler and
* more reliable than getting the compiler to autovectorize.
*
* In tcg-op-gvec.c, we asserted that both the size and alignment of the data
* are multiples of 16.
*
* When the compiler does not support all of the operations we require, the
* loops are written so that we can always fall back on the base types.
*/
#ifdef CONFIG_VECTOR16
typedef uint8_t vec8 __attribute__((vector_size(16)));
typedef uint16_t vec16 __attribute__((vector_size(16)));
typedef uint32_t vec32 __attribute__((vector_size(16)));
typedef uint64_t vec64 __attribute__((vector_size(16)));
typedef int8_t svec8 __attribute__((vector_size(16)));
typedef int16_t svec16 __attribute__((vector_size(16)));
typedef int32_t svec32 __attribute__((vector_size(16)));
typedef int64_t svec64 __attribute__((vector_size(16)));
#define DUP16(X) { X, X, X, X, X, X, X, X, X, X, X, X, X, X, X, X }
#define DUP8(X) { X, X, X, X, X, X, X, X }
#define DUP4(X) { X, X, X, X }
#define DUP2(X) { X, X }
#else
typedef uint8_t vec8;
typedef uint16_t vec16;
typedef uint32_t vec32;
typedef uint64_t vec64;
typedef int8_t svec8;
typedef int16_t svec16;
typedef int32_t svec32;
typedef int64_t svec64;
#define DUP16(X) X
#define DUP8(X) X
#define DUP4(X) X
#define DUP2(X) X
#endif /* CONFIG_VECTOR16 */
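/* Standalone sketch of the GCC/clang generic vector extension used above when
 * CONFIG_VECTOR16 is defined: arithmetic on vector-typed values is applied
 * per element, so a single '+' adds all 16 bytes at once and the helpers can
 * step through their buffers sizeof(vec8) bytes at a time. */
#include <stdint.h>
#include <stdio.h>

typedef uint8_t vec8 __attribute__((vector_size(16)));

int main(void)
{
    vec8 a = { 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15, 16 };
    vec8 b = { 10, 10, 10, 10, 10, 10, 10, 10,
               10, 10, 10, 10, 10, 10, 10, 10 };
    vec8 c = a + b;                          /* elementwise, no explicit loop */

    for (int i = 0; i < 16; i++) {
        printf("%d ", c[i]);                 /* 11 12 ... 26 */
    }
    printf("\n");
    return 0;
}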
static inline void clear_high(void *d, intptr_t oprsz, uint32_t desc)
{
intptr_t maxsz = simd_maxsz(desc);
intptr_t i;
if (unlikely(maxsz > oprsz)) {
for (i = oprsz; i < maxsz; i += sizeof(uint64_t)) {
*(uint64_t *)(d + i) = 0;
}
}
}
void HELPER(gvec_add8)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec8)) {
*(vec8 *)(d + i) = *(vec8 *)(a + i) + *(vec8 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_add16)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec16)) {
*(vec16 *)(d + i) = *(vec16 *)(a + i) + *(vec16 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_add32)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec32)) {
*(vec32 *)(d + i) = *(vec32 *)(a + i) + *(vec32 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_add64)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) + *(vec64 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_adds8)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec8 vecb = (vec8)DUP16(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec8)) {
*(vec8 *)(d + i) = *(vec8 *)(a + i) + vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_adds16)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec16 vecb = (vec16)DUP8(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec16)) {
*(vec16 *)(d + i) = *(vec16 *)(a + i) + vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_adds32)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec32 vecb = (vec32)DUP4(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec32)) {
*(vec32 *)(d + i) = *(vec32 *)(a + i) + vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_adds64)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec64 vecb = (vec64)DUP2(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) + vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sub8)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec8)) {
*(vec8 *)(d + i) = *(vec8 *)(a + i) - *(vec8 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sub16)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec16)) {
*(vec16 *)(d + i) = *(vec16 *)(a + i) - *(vec16 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sub32)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec32)) {
*(vec32 *)(d + i) = *(vec32 *)(a + i) - *(vec32 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sub64)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) - *(vec64 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_subs8)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec8 vecb = (vec8)DUP16(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec8)) {
*(vec8 *)(d + i) = *(vec8 *)(a + i) - vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_subs16)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec16 vecb = (vec16)DUP8(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec16)) {
*(vec16 *)(d + i) = *(vec16 *)(a + i) - vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_subs32)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec32 vecb = (vec32)DUP4(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec32)) {
*(vec32 *)(d + i) = *(vec32 *)(a + i) - vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_subs64)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec64 vecb = (vec64)DUP2(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) - vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_mul8)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec8)) {
*(vec8 *)(d + i) = *(vec8 *)(a + i) * *(vec8 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_mul16)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec16)) {
*(vec16 *)(d + i) = *(vec16 *)(a + i) * *(vec16 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_mul32)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec32)) {
*(vec32 *)(d + i) = *(vec32 *)(a + i) * *(vec32 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_mul64)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) * *(vec64 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_muls8)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec8 vecb = (vec8)DUP16(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec8)) {
*(vec8 *)(d + i) = *(vec8 *)(a + i) * vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_muls16)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec16 vecb = (vec16)DUP8(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec16)) {
*(vec16 *)(d + i) = *(vec16 *)(a + i) * vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_muls32)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec32 vecb = (vec32)DUP4(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec32)) {
*(vec32 *)(d + i) = *(vec32 *)(a + i) * vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_muls64)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec64 vecb = (vec64)DUP2(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) * vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_neg8)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec8)) {
*(vec8 *)(d + i) = -*(vec8 *)(a + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_neg16)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec16)) {
*(vec16 *)(d + i) = -*(vec16 *)(a + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_neg32)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec32)) {
*(vec32 *)(d + i) = -*(vec32 *)(a + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_neg64)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = -*(vec64 *)(a + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_mov)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
memcpy(d, a, oprsz);
clear_high(d, oprsz, desc);
}
void HELPER(gvec_dup64)(void *d, uint32_t desc, uint64_t c)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
if (c == 0) {
oprsz = 0;
} else {
for (i = 0; i < oprsz; i += sizeof(uint64_t)) {
*(uint64_t *)(d + i) = c;
}
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_dup32)(void *d, uint32_t desc, uint32_t c)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
if (c == 0) {
oprsz = 0;
} else {
for (i = 0; i < oprsz; i += sizeof(uint32_t)) {
*(uint32_t *)(d + i) = c;
}
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_dup16)(void *d, uint32_t desc, uint32_t c)
{
HELPER(gvec_dup32)(d, desc, 0x00010001 * (c & 0xffff));
}
void HELPER(gvec_dup8)(void *d, uint32_t desc, uint32_t c)
{
HELPER(gvec_dup32)(d, desc, 0x01010101 * (c & 0xff));
}
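/* Standalone sketch of the replication trick used by gvec_dup8/gvec_dup16 just
 * above: multiplying by 0x01010101 copies one byte into all four byte lanes of
 * a 32-bit word, and 0x00010001 does the same for two 16-bit lanes, so the 8-
 * and 16-bit dups can reuse the 32-bit implementation. */
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t c8  = 0xab;
    uint32_t c16 = 0x1234;
    printf("0x%08x\n", 0x01010101u * (c8 & 0xff));     /* 0xabababab */
    printf("0x%08x\n", 0x00010001u * (c16 & 0xffff));  /* 0x12341234 */
    return 0;
}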
void HELPER(gvec_not)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = ~*(vec64 *)(a + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_and)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) & *(vec64 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_or)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) | *(vec64 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_xor)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) ^ *(vec64 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_andc)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) &~ *(vec64 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_orc)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) |~ *(vec64 *)(b + i);
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_ands)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec64 vecb = (vec64)DUP2(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) & vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_xors)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec64 vecb = (vec64)DUP2(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) ^ vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_ors)(void *d, void *a, uint64_t b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
vec64 vecb = (vec64)DUP2(b);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) | vecb;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_shl8i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec8)) {
*(vec8 *)(d + i) = *(vec8 *)(a + i) << shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_shl16i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec16)) {
*(vec16 *)(d + i) = *(vec16 *)(a + i) << shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_shl32i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec32)) {
*(vec32 *)(d + i) = *(vec32 *)(a + i) << shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_shl64i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) << shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_shr8i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec8)) {
*(vec8 *)(d + i) = *(vec8 *)(a + i) >> shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_shr16i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec16)) {
*(vec16 *)(d + i) = *(vec16 *)(a + i) >> shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_shr32i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec32)) {
*(vec32 *)(d + i) = *(vec32 *)(a + i) >> shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_shr64i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(vec64 *)(d + i) = *(vec64 *)(a + i) >> shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sar8i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec8)) {
*(svec8 *)(d + i) = *(svec8 *)(a + i) >> shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sar16i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec16)) {
*(svec16 *)(d + i) = *(svec16 *)(a + i) >> shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sar32i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec32)) {
*(svec32 *)(d + i) = *(svec32 *)(a + i) >> shift;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sar64i)(void *d, void *a, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
int shift = simd_data(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(vec64)) {
*(svec64 *)(d + i) = *(svec64 *)(a + i) >> shift;
}
clear_high(d, oprsz, desc);
}
/* If vectors are enabled, the compiler fills in -1 for true.
Otherwise, we must take care of this by hand. */
#ifdef CONFIG_VECTOR16
# define DO_CMP0(X) X
#else
# define DO_CMP0(X) -(X)
#endif
#define DO_CMP1(NAME, TYPE, OP) \
void HELPER(NAME)(void *d, void *a, void *b, uint32_t desc) \
{ \
intptr_t oprsz = simd_oprsz(desc); \
intptr_t i; \
for (i = 0; i < oprsz; i += sizeof(vec64)) { \
*(TYPE *)(d + i) = DO_CMP0(*(TYPE *)(a + i) OP *(TYPE *)(b + i)); \
} \
clear_high(d, oprsz, desc); \
}
#define DO_CMP2(SZ) \
DO_CMP1(gvec_eq##SZ, vec##SZ, ==) \
DO_CMP1(gvec_ne##SZ, vec##SZ, !=) \
DO_CMP1(gvec_lt##SZ, svec##SZ, <) \
DO_CMP1(gvec_le##SZ, svec##SZ, <=) \
DO_CMP1(gvec_ltu##SZ, vec##SZ, <) \
DO_CMP1(gvec_leu##SZ, vec##SZ, <=)
DO_CMP2(8)
DO_CMP2(16)
DO_CMP2(32)
DO_CMP2(64)
#undef DO_CMP0
#undef DO_CMP1
#undef DO_CMP2
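/* Standalone sketch of the DO_CMP0 distinction above: with the vector
 * extension a lane-wise comparison already yields 0 or -1 (all bits set) per
 * lane, so the result is usable as a mask directly; a scalar comparison yields
 * 0 or 1 and must be negated by hand to get the same all-ones encoding. */
#include <stdint.h>
#include <stdio.h>

typedef uint8_t vec8 __attribute__((vector_size(16)));
typedef int8_t svec8 __attribute__((vector_size(16)));

int main(void)
{
    vec8 a = { 1, 2, 3, 4 };                 /* remaining lanes default to 0 */
    vec8 b = { 1, 0, 3, 0 };
    svec8 eq = (svec8)(a == b);              /* per lane: -1 if equal, else 0 */
    printf("lane0=%d lane1=%d\n", eq[0], eq[1]);              /* -1 0 */

    uint8_t sa = 3, sb = 3;
    printf("scalar=%d negated=%d\n", sa == sb, -(sa == sb));  /* 1 -1 */
    return 0;
}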
void HELPER(gvec_ssadd8)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(int8_t)) {
int r = *(int8_t *)(a + i) + *(int8_t *)(b + i);
if (r > INT8_MAX) {
r = INT8_MAX;
} else if (r < INT8_MIN) {
r = INT8_MIN;
}
*(int8_t *)(d + i) = r;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_ssadd16)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(int16_t)) {
int r = *(int16_t *)(a + i) + *(int16_t *)(b + i);
if (r > INT16_MAX) {
r = INT16_MAX;
} else if (r < INT16_MIN) {
r = INT16_MIN;
}
*(int16_t *)(d + i) = r;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_ssadd32)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(int32_t)) {
int32_t ai = *(int32_t *)(a + i);
int32_t bi = *(int32_t *)(b + i);
int32_t di = ai + bi;
if (((di ^ ai) &~ (ai ^ bi)) < 0) {
/* Signed overflow. */
di = (di < 0 ? INT32_MAX : INT32_MIN);
}
*(int32_t *)(d + i) = di;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_ssadd64)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(int64_t)) {
int64_t ai = *(int64_t *)(a + i);
int64_t bi = *(int64_t *)(b + i);
int64_t di = ai + bi;
if (((di ^ ai) &~ (ai ^ bi)) < 0) {
/* Signed overflow. */
di = (di < 0 ? INT64_MAX : INT64_MIN);
}
*(int64_t *)(d + i) = di;
}
clear_high(d, oprsz, desc);
}
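/* Standalone check of the overflow test used in gvec_ssadd32/64 above: for
 * di = ai + bi, ((di ^ ai) & ~(ai ^ bi)) < 0 holds exactly when the operands
 * shared a sign that the result lost, i.e. when the signed addition
 * overflowed. Function name is local to this example. */
#include <stdint.h>
#include <stdio.h>

static int32_t saturating_add32(int32_t ai, int32_t bi)
{
    int32_t di = (int32_t)((uint32_t)ai + (uint32_t)bi);   /* wrapping add */
    if (((di ^ ai) & ~(ai ^ bi)) < 0) {
        di = di < 0 ? INT32_MAX : INT32_MIN;               /* clamp on overflow */
    }
    return di;
}

int main(void)
{
    printf("%d\n", saturating_add32(INT32_MAX, 1));    /* 2147483647 */
    printf("%d\n", saturating_add32(INT32_MIN, -1));   /* -2147483648 */
    printf("%d\n", saturating_add32(100, 23));         /* 123 */
    return 0;
}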
void HELPER(gvec_sssub8)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(uint8_t)) {
int r = *(int8_t *)(a + i) - *(int8_t *)(b + i);
if (r > INT8_MAX) {
r = INT8_MAX;
} else if (r < INT8_MIN) {
r = INT8_MIN;
}
*(uint8_t *)(d + i) = r;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sssub16)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(int16_t)) {
int r = *(int16_t *)(a + i) - *(int16_t *)(b + i);
if (r > INT16_MAX) {
r = INT16_MAX;
} else if (r < INT16_MIN) {
r = INT16_MIN;
}
*(int16_t *)(d + i) = r;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sssub32)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(int32_t)) {
int32_t ai = *(int32_t *)(a + i);
int32_t bi = *(int32_t *)(b + i);
int32_t di = ai - bi;
if (((di ^ ai) & (ai ^ bi)) < 0) {
/* Signed overflow. */
di = (di < 0 ? INT32_MAX : INT32_MIN);
}
*(int32_t *)(d + i) = di;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_sssub64)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(int64_t)) {
int64_t ai = *(int64_t *)(a + i);
int64_t bi = *(int64_t *)(b + i);
int64_t di = ai - bi;
if (((di ^ ai) & (ai ^ bi)) < 0) {
/* Signed overflow. */
di = (di < 0 ? INT64_MAX : INT64_MIN);
}
*(int64_t *)(d + i) = di;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_usadd8)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(uint8_t)) {
unsigned r = *(uint8_t *)(a + i) + *(uint8_t *)(b + i);
if (r > UINT8_MAX) {
r = UINT8_MAX;
}
*(uint8_t *)(d + i) = r;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_usadd16)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(uint16_t)) {
unsigned r = *(uint16_t *)(a + i) + *(uint16_t *)(b + i);
if (r > UINT16_MAX) {
r = UINT16_MAX;
}
*(uint16_t *)(d + i) = r;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_usadd32)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(uint32_t)) {
uint32_t ai = *(uint32_t *)(a + i);
uint32_t bi = *(uint32_t *)(b + i);
uint32_t di = ai + bi;
if (di < ai) {
di = UINT32_MAX;
}
*(uint32_t *)(d + i) = di;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_usadd64)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(uint64_t)) {
uint64_t ai = *(uint64_t *)(a + i);
uint64_t bi = *(uint64_t *)(b + i);
uint64_t di = ai + bi;
if (di < ai) {
di = UINT64_MAX;
}
*(uint64_t *)(d + i) = di;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_ussub8)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(uint8_t)) {
int r = *(uint8_t *)(a + i) - *(uint8_t *)(b + i);
if (r < 0) {
r = 0;
}
*(uint8_t *)(d + i) = r;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_ussub16)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(uint16_t)) {
int r = *(uint16_t *)(a + i) - *(uint16_t *)(b + i);
if (r < 0) {
r = 0;
}
*(uint16_t *)(d + i) = r;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_ussub32)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(uint32_t)) {
uint32_t ai = *(uint32_t *)(a + i);
uint32_t bi = *(uint32_t *)(b + i);
uint32_t di = ai - bi;
if (ai < bi) {
di = 0;
}
*(uint32_t *)(d + i) = di;
}
clear_high(d, oprsz, desc);
}
void HELPER(gvec_ussub64)(void *d, void *a, void *b, uint32_t desc)
{
intptr_t oprsz = simd_oprsz(desc);
intptr_t i;
for (i = 0; i < oprsz; i += sizeof(uint64_t)) {
uint64_t ai = *(uint64_t *)(a + i);
uint64_t bi = *(uint64_t *)(b + i);
uint64_t di = ai - bi;
if (ai < bi) {
di = 0;
}
*(uint64_t *)(d + i) = di;
}
clear_high(d, oprsz, desc);
}


@@ -1,254 +0,0 @@
DEF_HELPER_FLAGS_2(div_i32, TCG_CALL_NO_RWG_SE, s32, s32, s32)
DEF_HELPER_FLAGS_2(rem_i32, TCG_CALL_NO_RWG_SE, s32, s32, s32)
DEF_HELPER_FLAGS_2(divu_i32, TCG_CALL_NO_RWG_SE, i32, i32, i32)
DEF_HELPER_FLAGS_2(remu_i32, TCG_CALL_NO_RWG_SE, i32, i32, i32)
DEF_HELPER_FLAGS_2(div_i64, TCG_CALL_NO_RWG_SE, s64, s64, s64)
DEF_HELPER_FLAGS_2(rem_i64, TCG_CALL_NO_RWG_SE, s64, s64, s64)
DEF_HELPER_FLAGS_2(divu_i64, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(remu_i64, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(shl_i64, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(shr_i64, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(sar_i64, TCG_CALL_NO_RWG_SE, s64, s64, s64)
DEF_HELPER_FLAGS_2(mulsh_i64, TCG_CALL_NO_RWG_SE, s64, s64, s64)
DEF_HELPER_FLAGS_2(muluh_i64, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(clz_i32, TCG_CALL_NO_RWG_SE, i32, i32, i32)
DEF_HELPER_FLAGS_2(ctz_i32, TCG_CALL_NO_RWG_SE, i32, i32, i32)
DEF_HELPER_FLAGS_2(clz_i64, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_2(ctz_i64, TCG_CALL_NO_RWG_SE, i64, i64, i64)
DEF_HELPER_FLAGS_1(clrsb_i32, TCG_CALL_NO_RWG_SE, i32, i32)
DEF_HELPER_FLAGS_1(clrsb_i64, TCG_CALL_NO_RWG_SE, i64, i64)
DEF_HELPER_FLAGS_1(ctpop_i32, TCG_CALL_NO_RWG_SE, i32, i32)
DEF_HELPER_FLAGS_1(ctpop_i64, TCG_CALL_NO_RWG_SE, i64, i64)
DEF_HELPER_FLAGS_1(lookup_tb_ptr, TCG_CALL_NO_WG_SE, ptr, env)
DEF_HELPER_FLAGS_1(exit_atomic, TCG_CALL_NO_WG, noreturn, env)
#ifdef CONFIG_SOFTMMU
DEF_HELPER_FLAGS_5(atomic_cmpxchgb, TCG_CALL_NO_WG,
i32, env, tl, i32, i32, i32)
DEF_HELPER_FLAGS_5(atomic_cmpxchgw_be, TCG_CALL_NO_WG,
i32, env, tl, i32, i32, i32)
DEF_HELPER_FLAGS_5(atomic_cmpxchgw_le, TCG_CALL_NO_WG,
i32, env, tl, i32, i32, i32)
DEF_HELPER_FLAGS_5(atomic_cmpxchgl_be, TCG_CALL_NO_WG,
i32, env, tl, i32, i32, i32)
DEF_HELPER_FLAGS_5(atomic_cmpxchgl_le, TCG_CALL_NO_WG,
i32, env, tl, i32, i32, i32)
#ifdef CONFIG_ATOMIC64
DEF_HELPER_FLAGS_5(atomic_cmpxchgq_be, TCG_CALL_NO_WG,
i64, env, tl, i64, i64, i32)
DEF_HELPER_FLAGS_5(atomic_cmpxchgq_le, TCG_CALL_NO_WG,
i64, env, tl, i64, i64, i32)
#endif
#ifdef CONFIG_ATOMIC64
#define GEN_ATOMIC_HELPERS(NAME) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), b), \
TCG_CALL_NO_WG, i32, env, tl, i32, i32) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), w_le), \
TCG_CALL_NO_WG, i32, env, tl, i32, i32) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), w_be), \
TCG_CALL_NO_WG, i32, env, tl, i32, i32) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), l_le), \
TCG_CALL_NO_WG, i32, env, tl, i32, i32) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), l_be), \
TCG_CALL_NO_WG, i32, env, tl, i32, i32) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), q_le), \
TCG_CALL_NO_WG, i64, env, tl, i64, i32) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), q_be), \
TCG_CALL_NO_WG, i64, env, tl, i64, i32)
#else
#define GEN_ATOMIC_HELPERS(NAME) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), b), \
TCG_CALL_NO_WG, i32, env, tl, i32, i32) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), w_le), \
TCG_CALL_NO_WG, i32, env, tl, i32, i32) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), w_be), \
TCG_CALL_NO_WG, i32, env, tl, i32, i32) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), l_le), \
TCG_CALL_NO_WG, i32, env, tl, i32, i32) \
DEF_HELPER_FLAGS_4(glue(glue(atomic_, NAME), l_be), \
TCG_CALL_NO_WG, i32, env, tl, i32, i32)
#endif /* CONFIG_ATOMIC64 */
#else
DEF_HELPER_FLAGS_4(atomic_cmpxchgb, TCG_CALL_NO_WG, i32, env, tl, i32, i32)
DEF_HELPER_FLAGS_4(atomic_cmpxchgw_be, TCG_CALL_NO_WG, i32, env, tl, i32, i32)
DEF_HELPER_FLAGS_4(atomic_cmpxchgw_le, TCG_CALL_NO_WG, i32, env, tl, i32, i32)
DEF_HELPER_FLAGS_4(atomic_cmpxchgl_be, TCG_CALL_NO_WG, i32, env, tl, i32, i32)
DEF_HELPER_FLAGS_4(atomic_cmpxchgl_le, TCG_CALL_NO_WG, i32, env, tl, i32, i32)
#ifdef CONFIG_ATOMIC64
DEF_HELPER_FLAGS_4(atomic_cmpxchgq_be, TCG_CALL_NO_WG, i64, env, tl, i64, i64)
DEF_HELPER_FLAGS_4(atomic_cmpxchgq_le, TCG_CALL_NO_WG, i64, env, tl, i64, i64)
#endif
#ifdef CONFIG_ATOMIC64
#define GEN_ATOMIC_HELPERS(NAME) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), b), \
TCG_CALL_NO_WG, i32, env, tl, i32) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), w_le), \
TCG_CALL_NO_WG, i32, env, tl, i32) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), w_be), \
TCG_CALL_NO_WG, i32, env, tl, i32) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), l_le), \
TCG_CALL_NO_WG, i32, env, tl, i32) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), l_be), \
TCG_CALL_NO_WG, i32, env, tl, i32) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), q_le), \
TCG_CALL_NO_WG, i64, env, tl, i64) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), q_be), \
TCG_CALL_NO_WG, i64, env, tl, i64)
#else
#define GEN_ATOMIC_HELPERS(NAME) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), b), \
TCG_CALL_NO_WG, i32, env, tl, i32) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), w_le), \
TCG_CALL_NO_WG, i32, env, tl, i32) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), w_be), \
TCG_CALL_NO_WG, i32, env, tl, i32) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), l_le), \
TCG_CALL_NO_WG, i32, env, tl, i32) \
DEF_HELPER_FLAGS_3(glue(glue(atomic_, NAME), l_be), \
TCG_CALL_NO_WG, i32, env, tl, i32)
#endif /* CONFIG_ATOMIC64 */
#endif /* CONFIG_SOFTMMU */
GEN_ATOMIC_HELPERS(fetch_add)
GEN_ATOMIC_HELPERS(fetch_and)
GEN_ATOMIC_HELPERS(fetch_or)
GEN_ATOMIC_HELPERS(fetch_xor)
GEN_ATOMIC_HELPERS(add_fetch)
GEN_ATOMIC_HELPERS(and_fetch)
GEN_ATOMIC_HELPERS(or_fetch)
GEN_ATOMIC_HELPERS(xor_fetch)
GEN_ATOMIC_HELPERS(xchg)
#undef GEN_ATOMIC_HELPERS
DEF_HELPER_FLAGS_3(gvec_mov, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_dup8, TCG_CALL_NO_RWG, void, ptr, i32, i32)
DEF_HELPER_FLAGS_3(gvec_dup16, TCG_CALL_NO_RWG, void, ptr, i32, i32)
DEF_HELPER_FLAGS_3(gvec_dup32, TCG_CALL_NO_RWG, void, ptr, i32, i32)
DEF_HELPER_FLAGS_3(gvec_dup64, TCG_CALL_NO_RWG, void, ptr, i32, i64)
DEF_HELPER_FLAGS_4(gvec_add8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_add16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_add32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_add64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_adds8, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_adds16, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_adds32, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_adds64, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_sub8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_sub16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_sub32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_sub64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_subs8, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_subs16, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_subs32, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_subs64, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_mul8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_mul16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_mul32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_mul64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_muls8, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_muls16, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_muls32, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_muls64, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_ssadd8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ssadd16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ssadd32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ssadd64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_sssub8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_sssub16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_sssub32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_sssub64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_usadd8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_usadd16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_usadd32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_usadd64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ussub8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ussub16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ussub32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ussub64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_neg8, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_neg16, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_neg32, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_neg64, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_not, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_and, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_or, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_xor, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_andc, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_orc, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ands, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_xors, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_4(gvec_ors, TCG_CALL_NO_RWG, void, ptr, ptr, i64, i32)
DEF_HELPER_FLAGS_3(gvec_shl8i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_shl16i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_shl32i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_shl64i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_shr8i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_shr16i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_shr32i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_shr64i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_sar8i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_sar16i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_sar32i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_3(gvec_sar64i, TCG_CALL_NO_RWG, void, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_eq8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_eq16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_eq32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_eq64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ne8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ne16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ne32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ne64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_lt8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_lt16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_lt32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_lt64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_le8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_le16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_le32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_le64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ltu8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ltu16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ltu32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_ltu64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_leu8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_leu16, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_leu32, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
DEF_HELPER_FLAGS_4(gvec_leu64, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
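Each of these DEF_HELPER_FLAGS_N lines is a declaration consumed by QEMU's helper.h machinery: it produces the prototype of a helper_<name>() function plus the glue TCG needs to call it, with "ptr" mapping to void * and "i32" to uint32_t. A minimal sketch of the conventional correspondence, using gvec_ussub8 from the list above (the exact expansion details may vary between QEMU versions):

/* DEF_HELPER_FLAGS_4(gvec_ussub8, TCG_CALL_NO_RWG, void, ptr, ptr, ptr, i32)
 * conventionally yields this prototype: */
void helper_gvec_ussub8(void *d, void *a, void *b, uint32_t desc);

/* ...and the definition elsewhere uses the HELPER() wrapper: */
void HELPER(gvec_ussub8)(void *d, void *a, void *b, uint32_t desc)
{
    /* element-wise unsigned saturating subtraction; the operand
     * size and related parameters are packed into desc */
}

TCG_CALL_NO_RWG is a hint to the optimizer that the helper neither reads nor writes TCG globals.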

View File

@@ -1,10 +0,0 @@
# Trace events for debugging and performance instrumentation
# TCG related tracing (mostly disabled by default)
# cpu-exec.c
disable exec_tb(void *tb, uintptr_t pc) "tb:%p pc=0x%"PRIxPTR
disable exec_tb_nocache(void *tb, uintptr_t pc) "tb:%p pc=0x%"PRIxPTR
disable exec_tb_exit(void *last_tb, unsigned int flags) "tb:%p flags=0x%x"
# translate-all.c
translate_block(void *tb, uintptr_t pc, uint8_t *tb_code) "tb:%p, pc:0x%"PRIxPTR", tb_code:%p"
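Each declaration in this trace-events file is turned by the tracetool build step into a trace_<name>() function taking the listed arguments; the "disable" keyword merely compiles the event to a no-op unless it is re-enabled. An illustrative call site (the variable names are placeholders):

/* generated from the exec_tb / exec_tb_exit declarations above */
trace_exec_tb(tb, pc);
trace_exec_tb_exit(last_tb, flags);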

View File

@@ -1,138 +0,0 @@
/*
* Generic intermediate code generation.
*
* Copyright (C) 2016-2017 Lluís Vilanova <vilanova@ac.upc.edu>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "qemu/error-report.h"
#include "cpu.h"
#include "tcg/tcg.h"
#include "tcg/tcg-op.h"
#include "exec/exec-all.h"
#include "exec/gen-icount.h"
#include "exec/log.h"
#include "exec/translator.h"
/* Pairs with tcg_clear_temp_count.
To be called by #TranslatorOps.{translate_insn,tb_stop} if
(1) the target is sufficiently clean to support reporting,
(2) as and when all temporaries are known to be consumed.
For most targets, (2) is at the end of translate_insn. */
void translator_loop_temp_check(DisasContextBase *db)
{
if (tcg_check_temp_count()) {
qemu_log("warning: TCG temporary leaks before "
TARGET_FMT_lx "\n", db->pc_next);
}
}
void translator_loop(const TranslatorOps *ops, DisasContextBase *db,
CPUState *cpu, TranslationBlock *tb)
{
int max_insns;
/* Initialize DisasContext */
db->tb = tb;
db->pc_first = tb->pc;
db->pc_next = db->pc_first;
db->is_jmp = DISAS_NEXT;
db->num_insns = 0;
db->singlestep_enabled = cpu->singlestep_enabled;
/* Instruction counting */
max_insns = tb_cflags(db->tb) & CF_COUNT_MASK;
if (max_insns == 0) {
max_insns = CF_COUNT_MASK;
}
if (max_insns > TCG_MAX_INSNS) {
max_insns = TCG_MAX_INSNS;
}
if (db->singlestep_enabled || singlestep) {
max_insns = 1;
}
max_insns = ops->init_disas_context(db, cpu, max_insns);
tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
/* Reset the temp count so that we can identify leaks */
tcg_clear_temp_count();
/* Start translating. */
gen_tb_start(db->tb);
ops->tb_start(db, cpu);
tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
while (true) {
db->num_insns++;
ops->insn_start(db, cpu);
tcg_debug_assert(db->is_jmp == DISAS_NEXT); /* no early exit */
/* Pass breakpoint hits to target for further processing */
if (unlikely(!QTAILQ_EMPTY(&cpu->breakpoints))) {
CPUBreakpoint *bp;
QTAILQ_FOREACH(bp, &cpu->breakpoints, entry) {
if (bp->pc == db->pc_next) {
if (ops->breakpoint_check(db, cpu, bp)) {
break;
}
}
}
/* The breakpoint_check hook may use DISAS_TOO_MANY to indicate
that only one more instruction is to be executed. Otherwise
it should use DISAS_NORETURN when generating an exception,
but may use a DISAS_TARGET_* value for Something Else. */
if (db->is_jmp > DISAS_TOO_MANY) {
break;
}
}
/* Disassemble one instruction. The translate_insn hook should
update db->pc_next and db->is_jmp to indicate what should be
done next -- either exiting this loop or locating the start of
the next instruction. */
if (db->num_insns == max_insns && (tb_cflags(db->tb) & CF_LAST_IO)) {
/* Accept I/O on the last instruction. */
gen_io_start();
ops->translate_insn(db, cpu);
gen_io_end();
} else {
ops->translate_insn(db, cpu);
}
/* Stop translation if translate_insn so indicated. */
if (db->is_jmp != DISAS_NEXT) {
break;
}
/* Stop translation if the output buffer is full,
or we have executed all of the allowed instructions. */
if (tcg_op_buf_full() || db->num_insns >= max_insns) {
db->is_jmp = DISAS_TOO_MANY;
break;
}
}
/* Emit code to exit the TB, as indicated by db->is_jmp. */
ops->tb_stop(db, cpu);
gen_tb_end(db->tb, db->num_insns);
/* The disas_log hook may use these values rather than recompute. */
db->tb->size = db->pc_next - db->pc_first;
db->tb->icount = db->num_insns;
#ifdef DEBUG_DISAS
if (qemu_loglevel_mask(CPU_LOG_TB_IN_ASM)
&& qemu_log_in_addr_range(db->pc_first)) {
qemu_log_lock();
qemu_log("----------------\n");
ops->disas_log(db, cpu);
qemu_log("\n");
qemu_log_unlock();
}
#endif
}
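translator_loop() above drives everything through the TranslatorOps hooks it is handed: init_disas_context, tb_start, insn_start, breakpoint_check, translate_insn, tb_stop and disas_log. A hedged sketch of how a target front end might wire them up (the my_target_* functions, the DisasContext layout and the gen_intermediate_code() signature are assumptions, not taken from this diff):

static const TranslatorOps my_target_tr_ops = {
    .init_disas_context = my_target_tr_init_disas_context,
    .tb_start           = my_target_tr_tb_start,
    .insn_start         = my_target_tr_insn_start,
    .breakpoint_check   = my_target_tr_breakpoint_check,
    .translate_insn     = my_target_tr_translate_insn,
    .tb_stop            = my_target_tr_tb_stop,
    .disas_log          = my_target_tr_disas_log,
};

void gen_intermediate_code(CPUState *cpu, TranslationBlock *tb)
{
    DisasContext dc;   /* target-specific; embeds a DisasContextBase as dc.base */

    translator_loop(&my_target_tr_ops, &dc.base, cpu, tb);
}

Each hook receives the DisasContextBase back and can downcast to its own DisasContext, which is why the loop itself only touches the base fields (pc_first, pc_next, is_jmp, num_insns).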

View File

@@ -1,34 +0,0 @@
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "qom/cpu.h"
#include "sysemu/replay.h"
void cpu_resume(CPUState *cpu)
{
}
void qemu_init_vcpu(CPUState *cpu)
{
}
/* User mode emulation does not support record/replay yet. */
bool replay_exception(void)
{
return true;
}
bool replay_has_exception(void)
{
return false;
}
bool replay_interrupt(void)
{
return true;
}
bool replay_has_interrupt(void)
{
return false;
}

View File

@@ -16,10 +16,8 @@
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "block/block.h"
#include "qemu/rcu_queue.h"
#include "qemu/queue.h"
#include "qemu/sockets.h"
#include "qemu/cutils.h"
#include "trace.h"
#ifdef CONFIG_EPOLL_CREATE1
#include <sys/epoll.h>
#endif
@@ -29,9 +27,6 @@ struct AioHandler
GPollFD pfd;
IOHandler *io_read;
IOHandler *io_write;
AioPollFn *io_poll;
IOHandler *io_poll_begin;
IOHandler *io_poll_end;
int deleted;
void *opaque;
bool is_external;
@@ -66,7 +61,7 @@ static bool aio_epoll_try_enable(AioContext *ctx)
AioHandler *node;
struct epoll_event event;
QLIST_FOREACH_RCU(node, &ctx->aio_handlers, node) {
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
int r;
if (node->deleted || !node->pfd.events) {
continue;
@@ -86,22 +81,29 @@ static void aio_epoll_update(AioContext *ctx, AioHandler *node, bool is_new)
{
struct epoll_event event;
int r;
int ctl;
if (!ctx->epoll_enabled) {
return;
}
if (!node->pfd.events) {
ctl = EPOLL_CTL_DEL;
r = epoll_ctl(ctx->epollfd, EPOLL_CTL_DEL, node->pfd.fd, &event);
if (r) {
aio_epoll_disable(ctx);
}
} else {
event.data.ptr = node;
event.events = epoll_events_from_pfd(node->pfd.events);
ctl = is_new ? EPOLL_CTL_ADD : EPOLL_CTL_MOD;
}
r = epoll_ctl(ctx->epollfd, ctl, node->pfd.fd, &event);
if (r) {
aio_epoll_disable(ctx);
if (is_new) {
r = epoll_ctl(ctx->epollfd, EPOLL_CTL_ADD, node->pfd.fd, &event);
if (r) {
aio_epoll_disable(ctx);
}
} else {
r = epoll_ctl(ctx->epollfd, EPOLL_CTL_MOD, node->pfd.fd, &event);
if (r) {
aio_epoll_disable(ctx);
}
}
}
}
@@ -119,7 +121,7 @@ static int aio_epoll(AioContext *ctx, GPollFD *pfds,
}
if (timeout <= 0 || ret > 0) {
ret = epoll_wait(ctx->epollfd, events,
ARRAY_SIZE(events),
sizeof(events) / sizeof(events[0]),
timeout);
if (ret <= 0) {
goto out;
@@ -205,68 +207,45 @@ void aio_set_fd_handler(AioContext *ctx,
bool is_external,
IOHandler *io_read,
IOHandler *io_write,
AioPollFn *io_poll,
void *opaque)
{
AioHandler *node;
bool is_new = false;
bool deleted = false;
qemu_lockcnt_lock(&ctx->list_lock);
node = find_aio_handler(ctx, fd);
/* Are we deleting the fd handler? */
if (!io_read && !io_write && !io_poll) {
if (node == NULL) {
qemu_lockcnt_unlock(&ctx->list_lock);
return;
}
/* If the GSource is in the process of being destroyed then
* g_source_remove_poll() causes an assertion failure. Skip
* removal in that case, because glib cleans up its state during
* destruction anyway.
*/
if (!g_source_is_destroyed(&ctx->source)) {
if (!io_read && !io_write) {
if (node) {
g_source_remove_poll(&ctx->source, &node->pfd);
}
/* If the lock is held, just mark the node as deleted */
if (qemu_lockcnt_count(&ctx->list_lock)) {
node->deleted = 1;
node->pfd.revents = 0;
} else {
/* Otherwise, delete it for real. We can't just mark it as
* deleted because deleted nodes are only cleaned up while
* no one is walking the handlers list.
*/
QLIST_REMOVE(node, node);
deleted = true;
}
if (!node->io_poll) {
ctx->poll_disable_cnt--;
/* If the lock is held, just mark the node as deleted */
if (ctx->walking_handlers) {
node->deleted = 1;
node->pfd.revents = 0;
} else {
/* Otherwise, delete it for real. We can't just mark it as
* deleted because deleted nodes are only cleaned up after
* releasing the walking_handlers lock.
*/
QLIST_REMOVE(node, node);
deleted = true;
}
}
} else {
if (node == NULL) {
/* Alloc and insert if it's not already there */
node = g_new0(AioHandler, 1);
node->pfd.fd = fd;
QLIST_INSERT_HEAD_RCU(&ctx->aio_handlers, node, node);
QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node);
g_source_add_poll(&ctx->source, &node->pfd);
is_new = true;
ctx->poll_disable_cnt += !io_poll;
} else {
ctx->poll_disable_cnt += !io_poll - !node->io_poll;
}
/* Update handler with latest information */
node->io_read = io_read;
node->io_write = io_write;
node->io_poll = io_poll;
node->opaque = opaque;
node->is_external = is_external;
@@ -275,127 +254,72 @@ void aio_set_fd_handler(AioContext *ctx,
}
aio_epoll_update(ctx, node, is_new);
qemu_lockcnt_unlock(&ctx->list_lock);
aio_notify(ctx);
if (deleted) {
g_free(node);
}
}
void aio_set_fd_poll(AioContext *ctx, int fd,
IOHandler *io_poll_begin,
IOHandler *io_poll_end)
{
AioHandler *node = find_aio_handler(ctx, fd);
if (!node) {
return;
}
node->io_poll_begin = io_poll_begin;
node->io_poll_end = io_poll_end;
}
void aio_set_event_notifier(AioContext *ctx,
EventNotifier *notifier,
bool is_external,
EventNotifierHandler *io_read,
AioPollFn *io_poll)
EventNotifierHandler *io_read)
{
aio_set_fd_handler(ctx, event_notifier_get_fd(notifier), is_external,
(IOHandler *)io_read, NULL, io_poll, notifier);
aio_set_fd_handler(ctx, event_notifier_get_fd(notifier),
is_external, (IOHandler *)io_read, NULL, notifier);
}
void aio_set_event_notifier_poll(AioContext *ctx,
EventNotifier *notifier,
EventNotifierHandler *io_poll_begin,
EventNotifierHandler *io_poll_end)
{
aio_set_fd_poll(ctx, event_notifier_get_fd(notifier),
(IOHandler *)io_poll_begin,
(IOHandler *)io_poll_end);
}
static void poll_set_started(AioContext *ctx, bool started)
{
AioHandler *node;
if (started == ctx->poll_started) {
return;
}
ctx->poll_started = started;
qemu_lockcnt_inc(&ctx->list_lock);
QLIST_FOREACH_RCU(node, &ctx->aio_handlers, node) {
IOHandler *fn;
if (node->deleted) {
continue;
}
if (started) {
fn = node->io_poll_begin;
} else {
fn = node->io_poll_end;
}
if (fn) {
fn(node->opaque);
}
}
qemu_lockcnt_dec(&ctx->list_lock);
}
bool aio_prepare(AioContext *ctx)
{
/* Poll mode cannot be used with glib's event loop, disable it. */
poll_set_started(ctx, false);
return false;
}
bool aio_pending(AioContext *ctx)
{
AioHandler *node;
bool result = false;
/*
* We have to walk very carefully in case aio_set_fd_handler is
* called while we're walking.
*/
qemu_lockcnt_inc(&ctx->list_lock);
QLIST_FOREACH_RCU(node, &ctx->aio_handlers, node) {
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
int revents;
revents = node->pfd.revents & node->pfd.events;
if (revents & (G_IO_IN | G_IO_HUP | G_IO_ERR) && node->io_read &&
aio_node_check(ctx, node->is_external)) {
result = true;
break;
return true;
}
if (revents & (G_IO_OUT | G_IO_ERR) && node->io_write &&
aio_node_check(ctx, node->is_external)) {
result = true;
break;
return true;
}
}
qemu_lockcnt_dec(&ctx->list_lock);
return result;
return false;
}
static bool aio_dispatch_handlers(AioContext *ctx)
bool aio_dispatch(AioContext *ctx)
{
AioHandler *node, *tmp;
AioHandler *node;
bool progress = false;
QLIST_FOREACH_SAFE_RCU(node, &ctx->aio_handlers, node, tmp) {
/*
* If there are callbacks left that have been queued, we need to call them.
* Do not call select in this case, because it is possible that the caller
* does not need a complete flush (as is the case for aio_poll loops).
*/
if (aio_bh_poll(ctx)) {
progress = true;
}
/*
* We have to walk very carefully in case aio_set_fd_handler is
* called while we're walking.
*/
node = QLIST_FIRST(&ctx->aio_handlers);
while (node) {
AioHandler *tmp;
int revents;
ctx->walking_handlers++;
revents = node->pfd.revents & node->pfd.events;
node->pfd.revents = 0;
@@ -418,28 +342,23 @@ static bool aio_dispatch_handlers(AioContext *ctx)
progress = true;
}
if (node->deleted) {
if (qemu_lockcnt_dec_if_lock(&ctx->list_lock)) {
QLIST_REMOVE(node, node);
g_free(node);
qemu_lockcnt_inc_and_unlock(&ctx->list_lock);
}
tmp = node;
node = QLIST_NEXT(node, node);
ctx->walking_handlers--;
if (!ctx->walking_handlers && tmp->deleted) {
QLIST_REMOVE(tmp, node);
g_free(tmp);
}
}
/* Run our timers */
progress |= timerlistgroup_run_timers(&ctx->tlg);
return progress;
}
void aio_dispatch(AioContext *ctx)
{
qemu_lockcnt_inc(&ctx->list_lock);
aio_bh_poll(ctx);
aio_dispatch_handlers(ctx);
qemu_lockcnt_dec(&ctx->list_lock);
timerlistgroup_run_timers(&ctx->tlg);
}
/* These thread-local variables are used only in a small part of aio_poll
* around the call to the poll() system call. In particular they are not
* used while aio_poll is performing callbacks, which makes it much easier
@@ -486,101 +405,15 @@ static void add_pollfd(AioHandler *node)
npfd++;
}
static bool run_poll_handlers_once(AioContext *ctx)
{
bool progress = false;
AioHandler *node;
QLIST_FOREACH_RCU(node, &ctx->aio_handlers, node) {
if (!node->deleted && node->io_poll &&
aio_node_check(ctx, node->is_external) &&
node->io_poll(node->opaque)) {
progress = true;
}
/* Caller handles freeing deleted nodes. Don't do it here. */
}
return progress;
}
/* run_poll_handlers:
* @ctx: the AioContext
* @max_ns: maximum time to poll for, in nanoseconds
*
* Polls for a given time.
*
* Note that ctx->notify_me must be non-zero so this function can detect
* aio_notify().
*
* Note that the caller must have incremented ctx->list_lock.
*
* Returns: true if progress was made, false otherwise
*/
static bool run_poll_handlers(AioContext *ctx, int64_t max_ns)
{
bool progress;
int64_t end_time;
assert(ctx->notify_me);
assert(qemu_lockcnt_count(&ctx->list_lock) > 0);
assert(ctx->poll_disable_cnt == 0);
trace_run_poll_handlers_begin(ctx, max_ns);
end_time = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) + max_ns;
do {
progress = run_poll_handlers_once(ctx);
} while (!progress && qemu_clock_get_ns(QEMU_CLOCK_REALTIME) < end_time);
trace_run_poll_handlers_end(ctx, progress);
return progress;
}
/* try_poll_mode:
* @ctx: the AioContext
* @blocking: busy polling is only attempted when blocking is true
*
* ctx->notify_me must be non-zero so this function can detect aio_notify().
*
* Note that the caller must have incremented ctx->list_lock.
*
* Returns: true if progress was made, false otherwise
*/
static bool try_poll_mode(AioContext *ctx, bool blocking)
{
if (blocking && ctx->poll_max_ns && ctx->poll_disable_cnt == 0) {
/* See qemu_soonest_timeout() uint64_t hack */
int64_t max_ns = MIN((uint64_t)aio_compute_timeout(ctx),
(uint64_t)ctx->poll_ns);
if (max_ns) {
poll_set_started(ctx, true);
if (run_poll_handlers(ctx, max_ns)) {
return true;
}
}
}
poll_set_started(ctx, false);
/* Even if we don't run busy polling, try polling once in case it can make
* progress and the caller will be able to avoid ppoll(2)/epoll_wait(2).
*/
return run_poll_handlers_once(ctx);
}
bool aio_poll(AioContext *ctx, bool blocking)
{
AioHandler *node;
int i;
int ret = 0;
int i, ret;
bool progress;
int64_t timeout;
int64_t start = 0;
aio_context_acquire(ctx);
progress = false;
/* aio_notify can avoid the expensive event_notifier_set if
* everything (file descriptors, bottom halves, timers) will
@@ -593,86 +426,41 @@ bool aio_poll(AioContext *ctx, bool blocking)
atomic_add(&ctx->notify_me, 2);
}
qemu_lockcnt_inc(&ctx->list_lock);
ctx->walking_handlers++;
if (ctx->poll_max_ns) {
start = qemu_clock_get_ns(QEMU_CLOCK_REALTIME);
}
assert(npfd == 0);
progress = try_poll_mode(ctx, blocking);
if (!progress) {
assert(npfd == 0);
/* fill pollfds */
if (!aio_epoll_enabled(ctx)) {
QLIST_FOREACH_RCU(node, &ctx->aio_handlers, node) {
if (!node->deleted && node->pfd.events
&& aio_node_check(ctx, node->is_external)) {
add_pollfd(node);
}
}
}
timeout = blocking ? aio_compute_timeout(ctx) : 0;
/* wait until next event */
if (aio_epoll_check_poll(ctx, pollfds, npfd, timeout)) {
AioHandler epoll_handler;
epoll_handler.pfd.fd = ctx->epollfd;
epoll_handler.pfd.events = G_IO_IN | G_IO_OUT | G_IO_HUP | G_IO_ERR;
npfd = 0;
add_pollfd(&epoll_handler);
ret = aio_epoll(ctx, pollfds, npfd, timeout);
} else {
ret = qemu_poll_ns(pollfds, npfd, timeout);
/* fill pollfds */
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (!node->deleted && node->pfd.events
&& !aio_epoll_enabled(ctx)
&& aio_node_check(ctx, node->is_external)) {
add_pollfd(node);
}
}
timeout = blocking ? aio_compute_timeout(ctx) : 0;
/* wait until next event */
if (timeout) {
aio_context_release(ctx);
}
if (aio_epoll_check_poll(ctx, pollfds, npfd, timeout)) {
AioHandler epoll_handler;
epoll_handler.pfd.fd = ctx->epollfd;
epoll_handler.pfd.events = G_IO_IN | G_IO_OUT | G_IO_HUP | G_IO_ERR;
npfd = 0;
add_pollfd(&epoll_handler);
ret = aio_epoll(ctx, pollfds, npfd, timeout);
} else {
ret = qemu_poll_ns(pollfds, npfd, timeout);
}
if (blocking) {
atomic_sub(&ctx->notify_me, 2);
}
/* Adjust polling time */
if (ctx->poll_max_ns) {
int64_t block_ns = qemu_clock_get_ns(QEMU_CLOCK_REALTIME) - start;
if (block_ns <= ctx->poll_ns) {
/* This is the sweet spot, no adjustment needed */
} else if (block_ns > ctx->poll_max_ns) {
/* We'd have to poll for too long, poll less */
int64_t old = ctx->poll_ns;
if (ctx->poll_shrink) {
ctx->poll_ns /= ctx->poll_shrink;
} else {
ctx->poll_ns = 0;
}
trace_poll_shrink(ctx, old, ctx->poll_ns);
} else if (ctx->poll_ns < ctx->poll_max_ns &&
block_ns < ctx->poll_max_ns) {
/* There is room to grow, poll longer */
int64_t old = ctx->poll_ns;
int64_t grow = ctx->poll_grow;
if (grow == 0) {
grow = 2;
}
if (ctx->poll_ns) {
ctx->poll_ns *= grow;
} else {
ctx->poll_ns = 4000; /* start polling at 4 microseconds */
}
if (ctx->poll_ns > ctx->poll_max_ns) {
ctx->poll_ns = ctx->poll_max_ns;
}
trace_poll_grow(ctx, old, ctx->poll_ns);
}
if (timeout) {
aio_context_acquire(ctx);
}
aio_notify_accept(ctx);
@@ -685,44 +473,27 @@ bool aio_poll(AioContext *ctx, bool blocking)
}
npfd = 0;
ctx->walking_handlers--;
progress |= aio_bh_poll(ctx);
if (ret > 0) {
progress |= aio_dispatch_handlers(ctx);
/* Run dispatch even if there were no readable fds to run timers */
if (aio_dispatch(ctx)) {
progress = true;
}
qemu_lockcnt_dec(&ctx->list_lock);
progress |= timerlistgroup_run_timers(&ctx->tlg);
aio_context_release(ctx);
return progress;
}
void aio_context_setup(AioContext *ctx)
void aio_context_setup(AioContext *ctx, Error **errp)
{
#ifdef CONFIG_EPOLL_CREATE1
assert(!ctx->epollfd);
ctx->epollfd = epoll_create1(EPOLL_CLOEXEC);
if (ctx->epollfd == -1) {
fprintf(stderr, "Failed to create epoll instance: %s", strerror(errno));
ctx->epoll_available = false;
} else {
ctx->epoll_available = true;
}
#endif
}
void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
int64_t grow, int64_t shrink, Error **errp)
{
/* No thread synchronization here, it doesn't matter if an incorrect value
* is used once.
*/
ctx->poll_max_ns = max_ns;
ctx->poll_ns = 0;
ctx->poll_grow = grow;
ctx->poll_shrink = shrink;
aio_notify(ctx);
}
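The two public entry points in this file are used roughly as follows; a hedged sketch assuming the seven-argument aio_set_fd_handler() variant shown in the hunks above (ctx, fd, is_external, io_read, io_write, io_poll, opaque) and illustrative callback/state names:

/* my_read_ready() and my_state are placeholders */
static void my_read_ready(void *opaque)
{
    /* drain the fd, kick off follow-up work, ... */
}

/* register a read handler; no write or busy-poll handlers */
aio_set_fd_handler(ctx, fd, true /* is_external */,
                   my_read_ready, NULL, NULL, my_state);

/* passing no handlers at all removes the AioHandler again */
aio_set_fd_handler(ctx, fd, true, NULL, NULL, NULL, NULL);

/* opt in to adaptive busy-polling; the numbers are illustrative */
aio_context_set_poll_params(ctx, 32 * 1000 /* 32 us max */, 2, 2, &error_abort);

The poll parameters feed the shrink/grow logic in aio_poll() above: poll_ns grows (multiplied by poll_grow, default 2, starting from 4 us) while blocking times stay under poll_max_ns, and is cut back (divided by poll_shrink, or reset to 0) when a blocking wait exceeds poll_max_ns.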

View File

@@ -20,8 +20,6 @@
#include "block/block.h"
#include "qemu/queue.h"
#include "qemu/sockets.h"
#include "qapi/error.h"
#include "qemu/rcu_queue.h"
struct AioHandler {
EventNotifier *e;
@@ -40,13 +38,11 @@ void aio_set_fd_handler(AioContext *ctx,
bool is_external,
IOHandler *io_read,
IOHandler *io_write,
AioPollFn *io_poll,
void *opaque)
{
/* fd is a SOCKET in our case */
AioHandler *node;
qemu_lockcnt_lock(&ctx->list_lock);
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (node->pfd.fd == fd && !node->deleted) {
break;
@@ -56,14 +52,14 @@ void aio_set_fd_handler(AioContext *ctx,
/* Are we deleting the fd handler? */
if (!io_read && !io_write) {
if (node) {
/* If aio_poll is in progress, just mark the node as deleted */
if (qemu_lockcnt_count(&ctx->list_lock)) {
/* If the lock is held, just mark the node as deleted */
if (ctx->walking_handlers) {
node->deleted = 1;
node->pfd.revents = 0;
} else {
/* Otherwise, delete it for real. We can't just mark it as
* deleted because deleted nodes are only cleaned up after
* releasing the list_lock.
* releasing the walking_handlers lock.
*/
QLIST_REMOVE(node, node);
g_free(node);
@@ -71,13 +67,12 @@ void aio_set_fd_handler(AioContext *ctx,
}
} else {
HANDLE event;
long bitmask = 0;
if (node == NULL) {
/* Alloc and insert if it's not already there */
node = g_new0(AioHandler, 1);
node->pfd.fd = fd;
QLIST_INSERT_HEAD_RCU(&ctx->aio_handlers, node, node);
QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node);
}
node->pfd.events = 0;
@@ -96,38 +91,22 @@ void aio_set_fd_handler(AioContext *ctx,
node->io_write = io_write;
node->is_external = is_external;
if (io_read) {
bitmask |= FD_READ | FD_ACCEPT | FD_CLOSE;
}
if (io_write) {
bitmask |= FD_WRITE | FD_CONNECT;
}
event = event_notifier_get_handle(&ctx->notifier);
WSAEventSelect(node->pfd.fd, event, bitmask);
WSAEventSelect(node->pfd.fd, event,
FD_READ | FD_ACCEPT | FD_CLOSE |
FD_CONNECT | FD_WRITE | FD_OOB);
}
qemu_lockcnt_unlock(&ctx->list_lock);
aio_notify(ctx);
}
void aio_set_fd_poll(AioContext *ctx, int fd,
IOHandler *io_poll_begin,
IOHandler *io_poll_end)
{
/* Not implemented */
}
void aio_set_event_notifier(AioContext *ctx,
EventNotifier *e,
bool is_external,
EventNotifierHandler *io_notify,
AioPollFn *io_poll)
EventNotifierHandler *io_notify)
{
AioHandler *node;
qemu_lockcnt_lock(&ctx->list_lock);
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (node->e == e && !node->deleted) {
break;
@@ -139,14 +118,14 @@ void aio_set_event_notifier(AioContext *ctx,
if (node) {
g_source_remove_poll(&ctx->source, &node->pfd);
/* aio_poll is in progress, just mark the node as deleted */
if (qemu_lockcnt_count(&ctx->list_lock)) {
/* If the lock is held, just mark the node as deleted */
if (ctx->walking_handlers) {
node->deleted = 1;
node->pfd.revents = 0;
} else {
/* Otherwise, delete it for real. We can't just mark it as
* deleted because deleted nodes are only cleaned up after
* releasing the list_lock.
* releasing the walking_handlers lock.
*/
QLIST_REMOVE(node, node);
g_free(node);
@@ -160,7 +139,7 @@ void aio_set_event_notifier(AioContext *ctx,
node->pfd.fd = (uintptr_t)event_notifier_get_handle(e);
node->pfd.events = G_IO_IN;
node->is_external = is_external;
QLIST_INSERT_HEAD_RCU(&ctx->aio_handlers, node, node);
QLIST_INSERT_HEAD(&ctx->aio_handlers, node, node);
g_source_add_poll(&ctx->source, &node->pfd);
}
@@ -168,18 +147,9 @@ void aio_set_event_notifier(AioContext *ctx,
node->io_notify = io_notify;
}
qemu_lockcnt_unlock(&ctx->list_lock);
aio_notify(ctx);
}
void aio_set_event_notifier_poll(AioContext *ctx,
EventNotifier *notifier,
EventNotifierHandler *io_poll_begin,
EventNotifierHandler *io_poll_end)
{
/* Not implemented */
}
bool aio_prepare(AioContext *ctx)
{
static struct timeval tv0;
@@ -187,16 +157,10 @@ bool aio_prepare(AioContext *ctx)
bool have_select_revents = false;
fd_set rfds, wfds;
/*
* We have to walk very carefully in case aio_set_fd_handler is
* called while we're walking.
*/
qemu_lockcnt_inc(&ctx->list_lock);
/* fill fd sets */
FD_ZERO(&rfds);
FD_ZERO(&wfds);
QLIST_FOREACH_RCU(node, &ctx->aio_handlers, node) {
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (node->io_read) {
FD_SET ((SOCKET)node->pfd.fd, &rfds);
}
@@ -206,7 +170,7 @@ bool aio_prepare(AioContext *ctx)
}
if (select(0, &rfds, &wfds, NULL, &tv0) > 0) {
QLIST_FOREACH_RCU(node, &ctx->aio_handlers, node) {
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
node->pfd.revents = 0;
if (FD_ISSET(node->pfd.fd, &rfds)) {
node->pfd.revents |= G_IO_IN;
@@ -220,53 +184,45 @@ bool aio_prepare(AioContext *ctx)
}
}
qemu_lockcnt_dec(&ctx->list_lock);
return have_select_revents;
}
bool aio_pending(AioContext *ctx)
{
AioHandler *node;
bool result = false;
/*
* We have to walk very carefully in case aio_set_fd_handler is
* called while we're walking.
*/
qemu_lockcnt_inc(&ctx->list_lock);
QLIST_FOREACH_RCU(node, &ctx->aio_handlers, node) {
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (node->pfd.revents && node->io_notify) {
result = true;
break;
return true;
}
if ((node->pfd.revents & G_IO_IN) && node->io_read) {
result = true;
break;
return true;
}
if ((node->pfd.revents & G_IO_OUT) && node->io_write) {
result = true;
break;
return true;
}
}
qemu_lockcnt_dec(&ctx->list_lock);
return result;
return false;
}
static bool aio_dispatch_handlers(AioContext *ctx, HANDLE event)
{
AioHandler *node;
bool progress = false;
AioHandler *tmp;
/*
* We have to walk very carefully in case aio_set_fd_handler is
* called while we're walking.
*/
QLIST_FOREACH_SAFE_RCU(node, &ctx->aio_handlers, node, tmp) {
node = QLIST_FIRST(&ctx->aio_handlers);
while (node) {
AioHandler *tmp;
int revents = node->pfd.revents;
ctx->walking_handlers++;
if (!node->deleted &&
(revents || event_notifier_get_handle(node->e) == event) &&
node->io_notify) {
@@ -301,25 +257,28 @@ static bool aio_dispatch_handlers(AioContext *ctx, HANDLE event)
}
}
if (node->deleted) {
if (qemu_lockcnt_dec_if_lock(&ctx->list_lock)) {
QLIST_REMOVE(node, node);
g_free(node);
qemu_lockcnt_inc_and_unlock(&ctx->list_lock);
}
tmp = node;
node = QLIST_NEXT(node, node);
ctx->walking_handlers--;
if (!ctx->walking_handlers && tmp->deleted) {
QLIST_REMOVE(tmp, node);
g_free(tmp);
}
}
return progress;
}
void aio_dispatch(AioContext *ctx)
bool aio_dispatch(AioContext *ctx)
{
qemu_lockcnt_inc(&ctx->list_lock);
aio_bh_poll(ctx);
aio_dispatch_handlers(ctx, INVALID_HANDLE_VALUE);
qemu_lockcnt_dec(&ctx->list_lock);
timerlistgroup_run_timers(&ctx->tlg);
bool progress;
progress = aio_bh_poll(ctx);
progress |= aio_dispatch_handlers(ctx, INVALID_HANDLE_VALUE);
progress |= timerlistgroup_run_timers(&ctx->tlg);
return progress;
}
bool aio_poll(AioContext *ctx, bool blocking)
@@ -330,6 +289,7 @@ bool aio_poll(AioContext *ctx, bool blocking)
int count;
int timeout;
aio_context_acquire(ctx);
progress = false;
/* aio_notify can avoid the expensive event_notifier_set if
@@ -343,18 +303,20 @@ bool aio_poll(AioContext *ctx, bool blocking)
atomic_add(&ctx->notify_me, 2);
}
qemu_lockcnt_inc(&ctx->list_lock);
have_select_revents = aio_prepare(ctx);
ctx->walking_handlers++;
/* fill fd sets */
count = 0;
QLIST_FOREACH_RCU(node, &ctx->aio_handlers, node) {
QLIST_FOREACH(node, &ctx->aio_handlers, node) {
if (!node->deleted && node->io_notify
&& aio_node_check(ctx, node->is_external)) {
events[count++] = event_notifier_get_handle(node->e);
}
}
ctx->walking_handlers--;
first = true;
/* ctx->notifier is always registered. */
@@ -370,11 +332,17 @@ bool aio_poll(AioContext *ctx, bool blocking)
timeout = blocking && !have_select_revents
? qemu_timeout_ns_to_ms(aio_compute_timeout(ctx)) : 0;
if (timeout) {
aio_context_release(ctx);
}
ret = WaitForMultipleObjects(count, events, FALSE, timeout);
if (blocking) {
assert(first);
atomic_sub(&ctx->notify_me, 2);
}
if (timeout) {
aio_context_acquire(ctx);
}
if (first) {
aio_notify_accept(ctx);
@@ -397,18 +365,12 @@ bool aio_poll(AioContext *ctx, bool blocking)
progress |= aio_dispatch_handlers(ctx, event);
} while (count > 0);
qemu_lockcnt_dec(&ctx->list_lock);
progress |= timerlistgroup_run_timers(&ctx->tlg);
aio_context_release(ctx);
return progress;
}
void aio_context_setup(AioContext *ctx)
void aio_context_setup(AioContext *ctx, Error **errp)
{
}
void aio_context_set_poll_params(AioContext *ctx, int64_t max_ns,
int64_t grow, int64_t shrink, Error **errp)
{
error_setg(errp, "AioContext polling is not implemented on Windows");
}

View File

@@ -22,12 +22,11 @@
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "cpu.h"
#include "sysemu/sysemu.h"
#include "sysemu/arch_init.h"
#include "hw/pci/pci.h"
#include "hw/audio/soundhw.h"
#include "hw/audio/audio.h"
#include "hw/smbios/smbios.h"
#include "qemu/config-file.h"
#include "qemu/error-report.h"
#include "qmp-commands.h"
@@ -53,8 +52,6 @@ int graphic_depth = 32;
#define QEMU_ARCH QEMU_ARCH_CRIS
#elif defined(TARGET_I386)
#define QEMU_ARCH QEMU_ARCH_I386
#elif defined(TARGET_HPPA)
#define QEMU_ARCH QEMU_ARCH_HPPA
#elif defined(TARGET_M68K)
#define QEMU_ARCH QEMU_ARCH_M68K
#elif defined(TARGET_LM32)
@@ -65,8 +62,6 @@ int graphic_depth = 32;
#define QEMU_ARCH QEMU_ARCH_MIPS
#elif defined(TARGET_MOXIE)
#define QEMU_ARCH QEMU_ARCH_MOXIE
#elif defined(TARGET_NIOS2)
#define QEMU_ARCH QEMU_ARCH_NIOS2
#elif defined(TARGET_OPENRISC)
#define QEMU_ARCH QEMU_ARCH_OPENRISC
#elif defined(TARGET_PPC)
@@ -87,6 +82,184 @@ int graphic_depth = 32;
const uint32_t arch_type = QEMU_ARCH;
static struct defconfig_file {
const char *filename;
/* Indicates it is a user config file (disabled by -no-user-config) */
bool userconfig;
} default_config_files[] = {
{ CONFIG_QEMU_CONFDIR "/qemu.conf", true },
{ NULL }, /* end of list */
};
int qemu_read_default_config_files(bool userconfig)
{
int ret;
struct defconfig_file *f;
for (f = default_config_files; f->filename; f++) {
if (!userconfig && f->userconfig) {
continue;
}
ret = qemu_read_config_file(f->filename);
if (ret < 0 && ret != -ENOENT) {
return ret;
}
}
return 0;
}
struct soundhw {
const char *name;
const char *descr;
int enabled;
int isa;
union {
int (*init_isa) (ISABus *bus);
int (*init_pci) (PCIBus *bus);
} init;
};
static struct soundhw soundhw[9];
static int soundhw_count;
void isa_register_soundhw(const char *name, const char *descr,
int (*init_isa)(ISABus *bus))
{
assert(soundhw_count < ARRAY_SIZE(soundhw) - 1);
soundhw[soundhw_count].name = name;
soundhw[soundhw_count].descr = descr;
soundhw[soundhw_count].isa = 1;
soundhw[soundhw_count].init.init_isa = init_isa;
soundhw_count++;
}
void pci_register_soundhw(const char *name, const char *descr,
int (*init_pci)(PCIBus *bus))
{
assert(soundhw_count < ARRAY_SIZE(soundhw) - 1);
soundhw[soundhw_count].name = name;
soundhw[soundhw_count].descr = descr;
soundhw[soundhw_count].isa = 0;
soundhw[soundhw_count].init.init_pci = init_pci;
soundhw_count++;
}
void select_soundhw(const char *optarg)
{
struct soundhw *c;
if (is_help_option(optarg)) {
show_valid_cards:
if (soundhw_count) {
printf("Valid sound card names (comma separated):\n");
for (c = soundhw; c->name; ++c) {
printf ("%-11s %s\n", c->name, c->descr);
}
printf("\n-soundhw all will enable all of the above\n");
} else {
printf("Machine has no user-selectable audio hardware "
"(it may or may not have always-present audio hardware).\n");
}
exit(!is_help_option(optarg));
}
else {
size_t l;
const char *p;
char *e;
int bad_card = 0;
if (!strcmp(optarg, "all")) {
for (c = soundhw; c->name; ++c) {
c->enabled = 1;
}
return;
}
p = optarg;
while (*p) {
e = strchr(p, ',');
l = !e ? strlen(p) : (size_t) (e - p);
for (c = soundhw; c->name; ++c) {
if (!strncmp(c->name, p, l) && !c->name[l]) {
c->enabled = 1;
break;
}
}
if (!c->name) {
if (l > 80) {
error_report("Unknown sound card name (too big to show)");
}
else {
error_report("Unknown sound card name `%.*s'",
(int) l, p);
}
bad_card = 1;
}
p += l + (e != NULL);
}
if (bad_card) {
goto show_valid_cards;
}
}
}
void audio_init(void)
{
struct soundhw *c;
ISABus *isa_bus = (ISABus *) object_resolve_path_type("", TYPE_ISA_BUS, NULL);
PCIBus *pci_bus = (PCIBus *) object_resolve_path_type("", TYPE_PCI_BUS, NULL);
for (c = soundhw; c->name; ++c) {
if (c->enabled) {
if (c->isa) {
if (!isa_bus) {
error_report("ISA bus not available for %s", c->name);
exit(1);
}
c->init.init_isa(isa_bus);
} else {
if (!pci_bus) {
error_report("PCI bus not available for %s", c->name);
exit(1);
}
c->init.init_pci(pci_bus);
}
}
}
}
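The soundhw table above is filled at startup by the two register functions and consumed here: -soundhw selects entries by name, select_soundhw() flips their enabled flag, and audio_init() resolves the ISA or PCI bus and calls the matching init function. A hedged sketch of a PCI card registering itself (the ac97 names and init function are illustrative, not taken from this diff):

static int my_ac97_init_pci(PCIBus *bus)
{
    /* create the audio device on the PCI bus; return 0 on success */
    return 0;
}

static void my_ac97_register(void)
{
    pci_register_soundhw("ac97", "Intel 82801AA AC97 Audio",
                         my_ac97_init_pci);
}

With that in place, '-soundhw ac97' marks the entry enabled and audio_init() invokes my_ac97_init_pci() with the resolved PCI bus.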
void do_acpitable_option(const QemuOpts *opts)
{
#ifdef TARGET_I386
Error *err = NULL;
acpi_table_add(opts, &err);
if (err) {
error_reportf_err(err, "Wrong acpi table provided: ");
exit(1);
}
#endif
}
void do_smbios_option(QemuOpts *opts)
{
#ifdef TARGET_I386
smbios_entry_add(opts);
#endif
}
void cpudef_init(void)
{
#if defined(cpudef_setup)
cpudef_setup(); /* parse cpu definitions in target config file */
#endif
}
int kvm_available(void)
{
#ifdef CONFIG_KVM

View File

@@ -1,8 +1,7 @@
/*
* Data plane event loop
* QEMU System Emulator
*
* Copyright (c) 2003-2008 Fabrice Bellard
* Copyright (c) 2009-2017 QEMU contributors
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
@@ -30,9 +29,6 @@
#include "block/thread-pool.h"
#include "qemu/main-loop.h"
#include "qemu/atomic.h"
#include "block/raw-aio.h"
#include "qemu/coroutine_int.h"
#include "trace.h"
/***********************************************************/
/* bottom halves (can be seen as timers which expire ASAP) */
@@ -47,26 +43,6 @@ struct QEMUBH {
bool deleted;
};
void aio_bh_schedule_oneshot(AioContext *ctx, QEMUBHFunc *cb, void *opaque)
{
QEMUBH *bh;
bh = g_new(QEMUBH, 1);
*bh = (QEMUBH){
.ctx = ctx,
.cb = cb,
.opaque = opaque,
};
qemu_lockcnt_lock(&ctx->list_lock);
bh->next = ctx->first_bh;
bh->scheduled = 1;
bh->deleted = 1;
/* Make sure that the members are ready before putting bh into list */
smp_wmb();
ctx->first_bh = bh;
qemu_lockcnt_unlock(&ctx->list_lock);
aio_notify(ctx);
}
QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque)
{
QEMUBH *bh;
@@ -76,12 +52,12 @@ QEMUBH *aio_bh_new(AioContext *ctx, QEMUBHFunc *cb, void *opaque)
.cb = cb,
.opaque = opaque,
};
qemu_lockcnt_lock(&ctx->list_lock);
qemu_mutex_lock(&ctx->bh_lock);
bh->next = ctx->first_bh;
/* Make sure that the members are ready before putting bh into list */
smp_wmb();
ctx->first_bh = bh;
qemu_lockcnt_unlock(&ctx->list_lock);
qemu_mutex_unlock(&ctx->bh_lock);
return bh;
}
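aio_bh_new() only allocates the bottom half and links it onto ctx->first_bh; nothing runs until qemu_bh_schedule() marks it scheduled and the event loop's aio_bh_poll() invokes the callback. A minimal usage sketch (my_bh_cb and my_state are placeholders):

static void my_bh_cb(void *opaque)
{
    /* runs in ctx's event loop on the next aio_bh_poll() pass */
}

QEMUBH *bh = aio_bh_new(ctx, my_bh_cb, my_state);
qemu_bh_schedule(bh);   /* queue it and aio_notify() the loop */
/* ... */
qemu_bh_delete(bh);     /* deferred: the BH is freed on a later aio_bh_poll() sweep */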
@@ -90,56 +66,53 @@ void aio_bh_call(QEMUBH *bh)
bh->cb(bh->opaque);
}
/* Multiple occurrences of aio_bh_poll cannot be called concurrently.
* The count in ctx->list_lock is incremented before the call, and is
* not affected by the call.
*/
/* Multiple occurrences of aio_bh_poll cannot be called concurrently */
int aio_bh_poll(AioContext *ctx)
{
QEMUBH *bh, **bhp, *next;
int ret;
bool deleted = false;
ctx->walking_bh++;
ret = 0;
for (bh = atomic_rcu_read(&ctx->first_bh); bh; bh = next) {
next = atomic_rcu_read(&bh->next);
for (bh = ctx->first_bh; bh; bh = next) {
/* Make sure that fetching bh happens before accessing its members */
smp_read_barrier_depends();
next = bh->next;
/* The atomic_xchg is paired with the one in qemu_bh_schedule. The
* implicit memory barrier ensures that the callback sees all writes
* done by the scheduling thread. It also ensures that the scheduling
* thread sees the zero before bh->cb has run, and thus will call
* aio_notify again if necessary.
*/
if (atomic_xchg(&bh->scheduled, 0)) {
/* Idle BHs don't count as progress */
if (!bh->idle) {
if (!bh->deleted && atomic_xchg(&bh->scheduled, 0)) {
/* Idle BHs and the notify BH don't count as progress */
if (!bh->idle && bh != ctx->notify_dummy_bh) {
ret = 1;
}
bh->idle = 0;
aio_bh_call(bh);
}
if (bh->deleted) {
deleted = true;
}
}
ctx->walking_bh--;
/* remove deleted bhs */
if (!deleted) {
return ret;
}
if (qemu_lockcnt_dec_if_lock(&ctx->list_lock)) {
if (!ctx->walking_bh) {
qemu_mutex_lock(&ctx->bh_lock);
bhp = &ctx->first_bh;
while (*bhp) {
bh = *bhp;
if (bh->deleted && !bh->scheduled) {
if (bh->deleted) {
*bhp = bh->next;
g_free(bh);
} else {
bhp = &bh->next;
}
}
qemu_lockcnt_inc_and_unlock(&ctx->list_lock);
qemu_mutex_unlock(&ctx->bh_lock);
}
return ret;
}
@@ -174,7 +147,7 @@ void qemu_bh_schedule(QEMUBH *bh)
*/
void qemu_bh_cancel(QEMUBH *bh)
{
atomic_mb_set(&bh->scheduled, 0);
bh->scheduled = 0;
}
/* This func is async. The bottom half will do the delete action at the final
@@ -193,9 +166,8 @@ aio_compute_timeout(AioContext *ctx)
int timeout = -1;
QEMUBH *bh;
for (bh = atomic_rcu_read(&ctx->first_bh); bh;
bh = atomic_rcu_read(&bh->next)) {
if (bh->scheduled) {
for (bh = ctx->first_bh; bh; bh = bh->next) {
if (!bh->deleted && bh->scheduled) {
if (bh->idle) {
/* idle bottom halves will be polled at least
* every 10ms */
@@ -243,9 +215,9 @@ aio_ctx_check(GSource *source)
aio_notify_accept(ctx);
for (bh = ctx->first_bh; bh; bh = bh->next) {
if (bh->scheduled) {
if (!bh->deleted && bh->scheduled) {
return true;
}
}
}
return aio_pending(ctx) || (timerlistgroup_deadline_ns(&ctx->tlg) == 0);
}
@@ -267,21 +239,10 @@ aio_ctx_finalize(GSource *source)
{
AioContext *ctx = (AioContext *) source;
qemu_bh_delete(ctx->notify_dummy_bh);
thread_pool_free(ctx->thread_pool);
#ifdef CONFIG_LINUX_AIO
if (ctx->linux_aio) {
laio_detach_aio_context(ctx->linux_aio, ctx);
laio_cleanup(ctx->linux_aio);
ctx->linux_aio = NULL;
}
#endif
assert(QSLIST_EMPTY(&ctx->scheduled_coroutines));
qemu_bh_delete(ctx->co_schedule_bh);
qemu_lockcnt_lock(&ctx->list_lock);
assert(!qemu_lockcnt_count(&ctx->list_lock));
qemu_mutex_lock(&ctx->bh_lock);
while (ctx->first_bh) {
QEMUBH *next = ctx->first_bh->next;
@@ -291,12 +252,12 @@ aio_ctx_finalize(GSource *source)
g_free(ctx->first_bh);
ctx->first_bh = next;
}
qemu_lockcnt_unlock(&ctx->list_lock);
qemu_mutex_unlock(&ctx->bh_lock);
aio_set_event_notifier(ctx, &ctx->notifier, false, NULL, NULL);
aio_set_event_notifier(ctx, &ctx->notifier, false, NULL);
event_notifier_cleanup(&ctx->notifier);
qemu_rec_mutex_destroy(&ctx->lock);
qemu_lockcnt_destroy(&ctx->list_lock);
rfifolock_destroy(&ctx->lock);
qemu_mutex_destroy(&ctx->bh_lock);
timerlistgroup_deinit(&ctx->tlg);
}
@@ -321,17 +282,6 @@ ThreadPool *aio_get_thread_pool(AioContext *ctx)
return ctx->thread_pool;
}
#ifdef CONFIG_LINUX_AIO
LinuxAioState *aio_get_linux_aio(AioContext *ctx)
{
if (!ctx->linux_aio) {
ctx->linux_aio = laio_init();
laio_attach_aio_context(ctx->linux_aio, ctx);
}
return ctx->linux_aio;
}
#endif
void aio_notify(AioContext *ctx)
{
/* Write e.g. bh->scheduled before reading ctx->notify_me. Pairs
@@ -351,86 +301,56 @@ void aio_notify_accept(AioContext *ctx)
}
}
static void aio_timerlist_notify(void *opaque, QEMUClockType type)
static void aio_timerlist_notify(void *opaque)
{
aio_notify(opaque);
}
static void aio_rfifolock_cb(void *opaque)
{
AioContext *ctx = opaque;
/* Kick owner thread in case they are blocked in aio_poll() */
qemu_bh_schedule(ctx->notify_dummy_bh);
}
static void notify_dummy_bh(void *opaque)
{
/* Do nothing, we were invoked just to force the event loop to iterate */
}
static void event_notifier_dummy_cb(EventNotifier *e)
{
}
/* Returns true if aio_notify() was called (e.g. a BH was scheduled) */
static bool event_notifier_poll(void *opaque)
{
EventNotifier *e = opaque;
AioContext *ctx = container_of(e, AioContext, notifier);
return atomic_read(&ctx->notified);
}
static void co_schedule_bh_cb(void *opaque)
{
AioContext *ctx = opaque;
QSLIST_HEAD(, Coroutine) straight, reversed;
QSLIST_MOVE_ATOMIC(&reversed, &ctx->scheduled_coroutines);
QSLIST_INIT(&straight);
while (!QSLIST_EMPTY(&reversed)) {
Coroutine *co = QSLIST_FIRST(&reversed);
QSLIST_REMOVE_HEAD(&reversed, co_scheduled_next);
QSLIST_INSERT_HEAD(&straight, co, co_scheduled_next);
}
while (!QSLIST_EMPTY(&straight)) {
Coroutine *co = QSLIST_FIRST(&straight);
QSLIST_REMOVE_HEAD(&straight, co_scheduled_next);
trace_aio_co_schedule_bh_cb(ctx, co);
aio_context_acquire(ctx);
/* Protected by write barrier in qemu_aio_coroutine_enter */
atomic_set(&co->scheduled, NULL);
qemu_coroutine_enter(co);
aio_context_release(ctx);
}
}
AioContext *aio_context_new(Error **errp)
{
int ret;
AioContext *ctx;
Error *local_err = NULL;
ctx = (AioContext *) g_source_new(&aio_source_funcs, sizeof(AioContext));
aio_context_setup(ctx);
aio_context_setup(ctx, &local_err);
if (local_err) {
error_propagate(errp, local_err);
goto fail;
}
ret = event_notifier_init(&ctx->notifier, false);
if (ret < 0) {
error_setg_errno(errp, -ret, "Failed to initialize event notifier");
goto fail;
}
g_source_set_can_recurse(&ctx->source, true);
qemu_lockcnt_init(&ctx->list_lock);
ctx->co_schedule_bh = aio_bh_new(ctx, co_schedule_bh_cb, ctx);
QSLIST_INIT(&ctx->scheduled_coroutines);
aio_set_event_notifier(ctx, &ctx->notifier,
false,
(EventNotifierHandler *)
event_notifier_dummy_cb,
event_notifier_poll);
#ifdef CONFIG_LINUX_AIO
ctx->linux_aio = NULL;
#endif
event_notifier_dummy_cb);
ctx->thread_pool = NULL;
qemu_rec_mutex_init(&ctx->lock);
qemu_mutex_init(&ctx->bh_lock);
rfifolock_init(&ctx->lock, aio_rfifolock_cb, ctx);
timerlistgroup_init(&ctx->tlg, aio_timerlist_notify, ctx);
ctx->poll_ns = 0;
ctx->poll_max_ns = 0;
ctx->poll_grow = 0;
ctx->poll_shrink = 0;
ctx->notify_dummy_bh = aio_bh_new(ctx, notify_dummy_bh, NULL);
return ctx;
fail:
@@ -438,55 +358,6 @@ fail:
return NULL;
}
void aio_co_schedule(AioContext *ctx, Coroutine *co)
{
trace_aio_co_schedule(ctx, co);
const char *scheduled = atomic_cmpxchg(&co->scheduled, NULL,
__func__);
if (scheduled) {
fprintf(stderr,
"%s: Co-routine was already scheduled in '%s'\n",
__func__, scheduled);
abort();
}
QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines,
co, co_scheduled_next);
qemu_bh_schedule(ctx->co_schedule_bh);
}
void aio_co_wake(struct Coroutine *co)
{
AioContext *ctx;
/* Read coroutine before co->ctx. Matches smp_wmb in
* qemu_coroutine_enter.
*/
smp_read_barrier_depends();
ctx = atomic_read(&co->ctx);
aio_co_enter(ctx, co);
}
void aio_co_enter(AioContext *ctx, struct Coroutine *co)
{
if (ctx != qemu_get_current_aio_context()) {
aio_co_schedule(ctx, co);
return;
}
if (qemu_in_coroutine()) {
Coroutine *self = qemu_coroutine_self();
assert(self != co);
QSIMPLEQ_INSERT_TAIL(&self->co_queue_wakeup, co, co_queue_next);
} else {
aio_context_acquire(ctx);
qemu_aio_coroutine_enter(ctx, co);
aio_context_release(ctx);
}
}
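aio_co_schedule() pushes a coroutine onto the target context's scheduled_coroutines list and arms co_schedule_bh, so the coroutine is entered from that context's event loop; aio_co_wake()/aio_co_enter() then pick the direct or deferred path depending on where the caller is running. A hedged usage sketch (my_co_fn and the surrounding names are placeholders; qemu_coroutine_create() is assumed to be the usual constructor):

static void coroutine_fn my_co_fn(void *opaque)
{
    /* coroutine body; may yield and later be re-entered via aio_co_wake() */
}

Coroutine *co = qemu_coroutine_create(my_co_fn, my_state);
aio_co_schedule(target_ctx, co);   /* entered from target_ctx's loop */

/* once the coroutine has yielded, any thread may wake it: */
aio_co_wake(co);                   /* re-enters it in the context it last ran in */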
void aio_context_ref(AioContext *ctx)
{
g_source_ref(&ctx->source);
@@ -499,10 +370,10 @@ void aio_context_unref(AioContext *ctx)
void aio_context_acquire(AioContext *ctx)
{
qemu_rec_mutex_lock(&ctx->lock);
rfifolock_lock(&ctx->lock);
}
void aio_context_release(AioContext *ctx)
{
qemu_rec_mutex_unlock(&ctx->lock);
rfifolock_unlock(&ctx->lock);
}

View File

@@ -11,9 +11,3 @@ common-obj-$(CONFIG_AUDIO_WIN_INT) += audio_win_int.o
common-obj-y += wavcapture.o
sdlaudio.o-cflags := $(SDL_CFLAGS)
sdlaudio.o-libs := $(SDL_LIBS)
alsaaudio.o-libs := $(ALSA_LIBS)
paaudio.o-libs := $(PULSE_LIBS)
coreaudio.o-libs := $(COREAUDIO_LIBS)
dsoundaudio.o-libs := $(DSOUND_LIBS)
ossaudio.o-libs := $(OSS_LIBS)

View File

@@ -823,7 +823,7 @@ static int alsa_init_out(HWVoiceOut *hw, struct audsettings *as,
audio_pcm_init_info (&hw->info, &obt_as);
hw->samples = obt.samples;
alsa->pcm_buf = audio_calloc(__func__, obt.samples, 1 << hw->info.shift);
alsa->pcm_buf = audio_calloc (AUDIO_FUNC, obt.samples, 1 << hw->info.shift);
if (!alsa->pcm_buf) {
dolog ("Could not allocate DAC buffer (%d samples, each %d bytes)\n",
hw->samples, 1 << hw->info.shift);
@@ -934,7 +934,7 @@ static int alsa_init_in(HWVoiceIn *hw, struct audsettings *as, void *drv_opaque)
audio_pcm_init_info (&hw->info, &obt_as);
hw->samples = obt.samples;
alsa->pcm_buf = audio_calloc(__func__, hw->samples, 1 << hw->info.shift);
alsa->pcm_buf = audio_calloc (AUDIO_FUNC, hw->samples, 1 << hw->info.shift);
if (!alsa->pcm_buf) {
dolog ("Could not allocate ADC buffer (%d samples, each %d bytes)\n",
hw->samples, 1 << hw->info.shift);

View File

@@ -28,7 +28,6 @@
#include "qemu/timer.h"
#include "sysemu/sysemu.h"
#include "qemu/cutils.h"
#include "sysemu/replay.h"
#define AUDIO_CAP "audio"
#include "audio_int.h"
@@ -424,12 +423,12 @@ static void audio_process_options (const char *prefix,
const char qemu_prefix[] = "QEMU_";
size_t preflen, optlen;
if (audio_bug(__func__, !prefix)) {
if (audio_bug (AUDIO_FUNC, !prefix)) {
dolog ("prefix = NULL\n");
return;
}
if (audio_bug(__func__, !opt)) {
if (audio_bug (AUDIO_FUNC, !opt)) {
dolog ("opt = NULL\n");
return;
}
@@ -792,7 +791,7 @@ static int audio_attach_capture (HWVoiceOut *hw)
SWVoiceOut *sw;
HWVoiceOut *hw_cap = &cap->hw;
sc = audio_calloc(__func__, 1, sizeof(*sc));
sc = audio_calloc (AUDIO_FUNC, 1, sizeof (*sc));
if (!sc) {
dolog ("Could not allocate soft capture voice (%zu bytes)\n",
sizeof (*sc));
@@ -848,7 +847,7 @@ static int audio_pcm_hw_find_min_in (HWVoiceIn *hw)
int audio_pcm_hw_get_live_in (HWVoiceIn *hw)
{
int live = hw->total_samples_captured - audio_pcm_hw_find_min_in (hw);
if (audio_bug(__func__, live < 0 || live > hw->samples)) {
if (audio_bug (AUDIO_FUNC, live < 0 || live > hw->samples)) {
dolog ("live=%d hw->samples=%d\n", live, hw->samples);
return 0;
}
@@ -886,7 +885,7 @@ static int audio_pcm_sw_get_rpos_in (SWVoiceIn *sw)
int live = hw->total_samples_captured - sw->total_hw_samples_acquired;
int rpos;
if (audio_bug(__func__, live < 0 || live > hw->samples)) {
if (audio_bug (AUDIO_FUNC, live < 0 || live > hw->samples)) {
dolog ("live=%d hw->samples=%d\n", live, hw->samples);
return 0;
}
@@ -909,7 +908,7 @@ int audio_pcm_sw_read (SWVoiceIn *sw, void *buf, int size)
rpos = audio_pcm_sw_get_rpos_in (sw) % hw->samples;
live = hw->total_samples_captured - sw->total_hw_samples_acquired;
if (audio_bug(__func__, live < 0 || live > hw->samples)) {
if (audio_bug (AUDIO_FUNC, live < 0 || live > hw->samples)) {
dolog ("live_in=%d hw->samples=%d\n", live, hw->samples);
return 0;
}
@@ -935,7 +934,7 @@ int audio_pcm_sw_read (SWVoiceIn *sw, void *buf, int size)
}
osamp = swlim;
if (audio_bug(__func__, osamp < 0)) {
if (audio_bug (AUDIO_FUNC, osamp < 0)) {
dolog ("osamp=%d\n", osamp);
return 0;
}
@@ -990,7 +989,7 @@ static int audio_pcm_hw_get_live_out (HWVoiceOut *hw, int *nb_live)
if (nb_live1) {
int live = smin;
if (audio_bug(__func__, live < 0 || live > hw->samples)) {
if (audio_bug (AUDIO_FUNC, live < 0 || live > hw->samples)) {
dolog ("live=%d hw->samples=%d\n", live, hw->samples);
return 0;
}
@@ -1014,7 +1013,7 @@ int audio_pcm_sw_write (SWVoiceOut *sw, void *buf, int size)
hwsamples = sw->hw->samples;
live = sw->total_hw_samples_mixed;
if (audio_bug(__func__, live < 0 || live > hwsamples)) {
if (audio_bug (AUDIO_FUNC, live < 0 || live > hwsamples)){
dolog ("live=%d hw->samples=%d\n", live, hwsamples);
return 0;
}
@@ -1113,7 +1112,7 @@ static int audio_is_timer_needed (void)
static void audio_reset_timer (AudioState *s)
{
if (audio_is_timer_needed ()) {
timer_mod_anticipate_ns(s->ts,
timer_mod (s->ts,
qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) + conf.period.ticks);
}
else {
@@ -1132,6 +1131,8 @@ static void audio_timer (void *opaque)
*/
int AUD_write (SWVoiceOut *sw, void *buf, int size)
{
int bytes;
if (!sw) {
/* XXX: Consider options */
return size;
@@ -1142,11 +1143,14 @@ int AUD_write (SWVoiceOut *sw, void *buf, int size)
return 0;
}
return sw->hw->pcm_ops->write(sw, buf, size);
bytes = sw->hw->pcm_ops->write (sw, buf, size);
return bytes;
}
int AUD_read (SWVoiceIn *sw, void *buf, int size)
{
int bytes;
if (!sw) {
/* XXX: Consider options */
return size;
@@ -1157,7 +1161,8 @@ int AUD_read (SWVoiceIn *sw, void *buf, int size)
return 0;
}
return sw->hw->pcm_ops->read(sw, buf, size);
bytes = sw->hw->pcm_ops->read (sw, buf, size);
return bytes;
}
int AUD_get_buffer_size_out (SWVoiceOut *sw)
@@ -1263,7 +1268,7 @@ static int audio_get_avail (SWVoiceIn *sw)
}
live = sw->hw->total_samples_captured - sw->total_hw_samples_acquired;
if (audio_bug(__func__, live < 0 || live > sw->hw->samples)) {
if (audio_bug (AUDIO_FUNC, live < 0 || live > sw->hw->samples)) {
dolog ("live=%d sw->hw->samples=%d\n", live, sw->hw->samples);
return 0;
}
@@ -1287,7 +1292,7 @@ static int audio_get_free (SWVoiceOut *sw)
live = sw->total_hw_samples_mixed;
if (audio_bug(__func__, live < 0 || live > sw->hw->samples)) {
if (audio_bug (AUDIO_FUNC, live < 0 || live > sw->hw->samples)) {
dolog ("live=%d sw->hw->samples=%d\n", live, sw->hw->samples);
return 0;
}
@@ -1354,7 +1359,7 @@ static void audio_run_out (AudioState *s)
live = 0;
}
if (audio_bug(__func__, live < 0 || live > hw->samples)) {
if (audio_bug (AUDIO_FUNC, live < 0 || live > hw->samples)) {
dolog ("live=%d hw->samples=%d\n", live, hw->samples);
continue;
}
@@ -1388,8 +1393,7 @@ static void audio_run_out (AudioState *s)
prev_rpos = hw->rpos;
played = hw->pcm_ops->run_out (hw, live);
replay_audio_out(&played);
if (audio_bug(__func__, hw->rpos >= hw->samples)) {
if (audio_bug (AUDIO_FUNC, hw->rpos >= hw->samples)) {
dolog ("hw->rpos=%d hw->samples=%d played=%d\n",
hw->rpos, hw->samples, played);
hw->rpos = 0;
@@ -1410,7 +1414,7 @@ static void audio_run_out (AudioState *s)
continue;
}
if (audio_bug(__func__, played > sw->total_hw_samples_mixed)) {
if (audio_bug (AUDIO_FUNC, played > sw->total_hw_samples_mixed)) {
dolog ("played=%d sw->total_hw_samples_mixed=%d\n",
played, sw->total_hw_samples_mixed);
played = sw->total_hw_samples_mixed;
@@ -1452,12 +1456,9 @@ static void audio_run_in (AudioState *s)
while ((hw = audio_pcm_hw_find_any_enabled_in (hw))) {
SWVoiceIn *sw;
int captured = 0, min;
int captured, min;
if (replay_mode != REPLAY_MODE_PLAY) {
captured = hw->pcm_ops->run_in(hw);
}
replay_audio_in(&captured, hw->conv_buf, &hw->wpos, hw->samples);
captured = hw->pcm_ops->run_in (hw);
min = audio_pcm_hw_find_min_in (hw);
hw->total_samples_captured += captured - min;
@@ -1513,7 +1514,7 @@ static void audio_run_capture (AudioState *s)
continue;
}
if (audio_bug(__func__, captured > sw->total_hw_samples_mixed)) {
if (audio_bug (AUDIO_FUNC, captured > sw->total_hw_samples_mixed)) {
dolog ("captured=%d sw->total_hw_samples_mixed=%d\n",
captured, sw->total_hw_samples_mixed);
captured = sw->total_hw_samples_mixed;
@@ -1744,21 +1745,13 @@ static void audio_vm_change_state_handler (void *opaque, int running,
audio_reset_timer (s);
}
static bool is_cleaning_up;
bool audio_is_cleaning_up(void)
{
return is_cleaning_up;
}
void audio_cleanup(void)
static void audio_atexit (void)
{
AudioState *s = &glob_audio_state;
HWVoiceOut *hwo, *hwon;
HWVoiceIn *hwi, *hwin;
HWVoiceOut *hwo = NULL;
HWVoiceIn *hwi = NULL;
is_cleaning_up = true;
QLIST_FOREACH_SAFE(hwo, &glob_audio_state.hw_head_out, entries, hwon) {
while ((hwo = audio_pcm_hw_find_any_out (hwo))) {
SWVoiceCap *sc;
if (hwo->enabled) {
@@ -1774,20 +1767,17 @@ void audio_cleanup(void)
cb->ops.destroy (cb->opaque);
}
}
QLIST_REMOVE(hwo, entries);
}
QLIST_FOREACH_SAFE(hwi, &glob_audio_state.hw_head_in, entries, hwin) {
while ((hwi = audio_pcm_hw_find_any_in (hwi))) {
if (hwi->enabled) {
hwi->pcm_ops->ctl_in (hwi, VOICE_DISABLE);
}
hwi->pcm_ops->fini_in (hwi);
QLIST_REMOVE(hwi, entries);
}
if (s->drv) {
s->drv->fini (s->drv_opaque);
s->drv = NULL;
}
}
@@ -1815,7 +1805,7 @@ static void audio_init (void)
QLIST_INIT (&s->hw_head_out);
QLIST_INIT (&s->hw_head_in);
QLIST_INIT (&s->cap_head);
atexit(audio_cleanup);
atexit (audio_atexit);
s->ts = timer_new_ns(QEMU_CLOCK_VIRTUAL, audio_timer, s);
@@ -1924,7 +1914,7 @@ CaptureVoiceOut *AUD_add_capture (
goto err0;
}
cb = audio_calloc(__func__, 1, sizeof(*cb));
cb = audio_calloc (AUDIO_FUNC, 1, sizeof (*cb));
if (!cb) {
dolog ("Could not allocate capture callback information, size %zu\n",
sizeof (*cb));
@@ -1942,7 +1932,7 @@ CaptureVoiceOut *AUD_add_capture (
HWVoiceOut *hw;
CaptureVoiceOut *cap;
cap = audio_calloc(__func__, 1, sizeof(*cap));
cap = audio_calloc (AUDIO_FUNC, 1, sizeof (*cap));
if (!cap) {
dolog ("Could not allocate capture voice, size %zu\n",
sizeof (*cap));
@@ -1955,8 +1945,8 @@ CaptureVoiceOut *AUD_add_capture (
/* XXX find a more elegant way */
hw->samples = 4096 * 4;
hw->mix_buf = audio_calloc(__func__, hw->samples,
sizeof(struct st_sample));
hw->mix_buf = audio_calloc (AUDIO_FUNC, hw->samples,
sizeof (struct st_sample));
if (!hw->mix_buf) {
dolog ("Could not allocate capture mix buffer (%d samples)\n",
hw->samples);
@@ -1965,7 +1955,7 @@ CaptureVoiceOut *AUD_add_capture (
audio_pcm_init_info (&hw->info, as);
cap->buf = audio_calloc(__func__, hw->samples, 1 << hw->info.shift);
cap->buf = audio_calloc (AUDIO_FUNC, hw->samples, 1 << hw->info.shift);
if (!cap->buf) {
dolog ("Could not allocate capture buffer "
"(%d samples, each %d bytes)\n",
@@ -1982,7 +1972,8 @@ CaptureVoiceOut *AUD_add_capture (
QLIST_INSERT_HEAD (&s->cap_head, cap, entries);
QLIST_INSERT_HEAD (&cap->cb_head, cb, entries);
QLIST_FOREACH(hw, &glob_audio_state.hw_head_out, entries) {
hw = NULL;
while ((hw = audio_pcm_hw_find_any_out (hw))) {
audio_attach_capture (hw);
}
return cap;

View File

@@ -21,7 +21,6 @@
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#ifndef QEMU_AUDIO_H
#define QEMU_AUDIO_H
@@ -163,12 +162,4 @@ static inline void *advance (void *p, int incr)
int wav_start_capture (CaptureState *s, const char *path, int freq,
int bits, int nchannels);
bool audio_is_cleaning_up(void);
void audio_cleanup(void);
void audio_sample_to_uint64(void *samples, int pos,
uint64_t *left, uint64_t *right);
void audio_sample_from_uint64(void *samples, int pos,
uint64_t left, uint64_t right);
#endif /* QEMU_AUDIO_H */
#endif /* audio.h */

View File

@@ -21,7 +21,6 @@
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#ifndef QEMU_AUDIO_INT_H
#define QEMU_AUDIO_INT_H
@@ -252,4 +251,10 @@ static inline int audio_ring_dist (int dst, int src, int len)
#define AUDIO_STRINGIFY_(n) #n
#define AUDIO_STRINGIFY(n) AUDIO_STRINGIFY_(n)
#endif /* QEMU_AUDIO_INT_H */
#if defined _MSC_VER || defined __GNUC__
#define AUDIO_FUNC __FUNCTION__
#else
#define AUDIO_FUNC __FILE__ ":" AUDIO_STRINGIFY (__LINE__)
#endif
#endif /* audio_int.h */

View File

@@ -31,7 +31,7 @@ int audio_pt_init (struct audio_pt *p, void *(*func) (void *),
err = sigfillset (&set);
if (err) {
logerr(p, errno, "%s(%s): sigfillset failed", cap, __func__);
logerr (p, errno, "%s(%s): sigfillset failed", cap, AUDIO_FUNC);
return -1;
}
@@ -57,8 +57,8 @@ int audio_pt_init (struct audio_pt *p, void *(*func) (void *),
err2 = pthread_sigmask (SIG_SETMASK, &old_set, NULL);
if (err2) {
logerr(p, err2, "%s(%s): pthread_sigmask (restore) failed",
cap, __func__);
logerr (p, err2, "%s(%s): pthread_sigmask (restore) failed",
cap, AUDIO_FUNC);
/* We have failed to restore original signal mask, all bets are off,
so terminate the process */
exit (EXIT_FAILURE);
@@ -74,17 +74,17 @@ int audio_pt_init (struct audio_pt *p, void *(*func) (void *),
err2:
err2 = pthread_cond_destroy (&p->cond);
if (err2) {
logerr(p, err2, "%s(%s): pthread_cond_destroy failed", cap, __func__);
logerr (p, err2, "%s(%s): pthread_cond_destroy failed", cap, AUDIO_FUNC);
}
err1:
err2 = pthread_mutex_destroy (&p->mutex);
if (err2) {
logerr(p, err2, "%s(%s): pthread_mutex_destroy failed", cap, __func__);
logerr (p, err2, "%s(%s): pthread_mutex_destroy failed", cap, AUDIO_FUNC);
}
err0:
logerr(p, err, "%s(%s): %s failed", cap, __func__, efunc);
logerr (p, err, "%s(%s): %s failed", cap, AUDIO_FUNC, efunc);
return -1;
}
@@ -94,13 +94,13 @@ int audio_pt_fini (struct audio_pt *p, const char *cap)
err = pthread_cond_destroy (&p->cond);
if (err) {
logerr(p, err, "%s(%s): pthread_cond_destroy failed", cap, __func__);
logerr (p, err, "%s(%s): pthread_cond_destroy failed", cap, AUDIO_FUNC);
ret = -1;
}
err = pthread_mutex_destroy (&p->mutex);
if (err) {
logerr(p, err, "%s(%s): pthread_mutex_destroy failed", cap, __func__);
logerr (p, err, "%s(%s): pthread_mutex_destroy failed", cap, AUDIO_FUNC);
ret = -1;
}
return ret;
@@ -112,7 +112,7 @@ int audio_pt_lock (struct audio_pt *p, const char *cap)
err = pthread_mutex_lock (&p->mutex);
if (err) {
logerr(p, err, "%s(%s): pthread_mutex_lock failed", cap, __func__);
logerr (p, err, "%s(%s): pthread_mutex_lock failed", cap, AUDIO_FUNC);
return -1;
}
return 0;
@@ -124,7 +124,7 @@ int audio_pt_unlock (struct audio_pt *p, const char *cap)
err = pthread_mutex_unlock (&p->mutex);
if (err) {
logerr(p, err, "%s(%s): pthread_mutex_unlock failed", cap, __func__);
logerr (p, err, "%s(%s): pthread_mutex_unlock failed", cap, AUDIO_FUNC);
return -1;
}
return 0;
@@ -136,7 +136,7 @@ int audio_pt_wait (struct audio_pt *p, const char *cap)
err = pthread_cond_wait (&p->cond, &p->mutex);
if (err) {
logerr(p, err, "%s(%s): pthread_cond_wait failed", cap, __func__);
logerr (p, err, "%s(%s): pthread_cond_wait failed", cap, AUDIO_FUNC);
return -1;
}
return 0;
@@ -148,12 +148,12 @@ int audio_pt_unlock_and_signal (struct audio_pt *p, const char *cap)
err = pthread_mutex_unlock (&p->mutex);
if (err) {
logerr(p, err, "%s(%s): pthread_mutex_unlock failed", cap, __func__);
logerr (p, err, "%s(%s): pthread_mutex_unlock failed", cap, AUDIO_FUNC);
return -1;
}
err = pthread_cond_signal (&p->cond);
if (err) {
logerr(p, err, "%s(%s): pthread_cond_signal failed", cap, __func__);
logerr (p, err, "%s(%s): pthread_cond_signal failed", cap, AUDIO_FUNC);
return -1;
}
return 0;
@@ -166,7 +166,7 @@ int audio_pt_join (struct audio_pt *p, void **arg, const char *cap)
err = pthread_join (p->thread, &ret);
if (err) {
logerr(p, err, "%s(%s): pthread_join failed", cap, __func__);
logerr (p, err, "%s(%s): pthread_join failed", cap, AUDIO_FUNC);
return -1;
}
*arg = ret;
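The audio_pt_* helpers touched in these hunks are thin wrappers around one mutex/condition-variable pair. A reduced sketch of the same lock/wait/signal pattern in plain pthreads, for orientation only (names are illustrative; build with -lpthread):

#include <pthread.h>
#include <stdio.h>

static pthread_mutex_t lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t  cond = PTHREAD_COND_INITIALIZER;
static int ready;

static void *worker(void *arg)
{
    (void) arg;
    pthread_mutex_lock(&lock);                 /* cf. audio_pt_lock() */
    while (!ready) {
        pthread_cond_wait(&cond, &lock);       /* cf. audio_pt_wait() */
    }
    pthread_mutex_unlock(&lock);               /* cf. audio_pt_unlock() */
    puts("worker: signalled");
    return NULL;
}

int main(void)
{
    pthread_t t;

    pthread_create(&t, NULL, worker, NULL);

    pthread_mutex_lock(&lock);
    ready = 1;
    pthread_mutex_unlock(&lock);               /* cf. audio_pt_unlock_and_signal() */
    pthread_cond_signal(&cond);

    pthread_join(t, NULL);                     /* cf. audio_pt_join() */
    return 0;
}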

View File

@@ -19,4 +19,4 @@ int audio_pt_wait (struct audio_pt *, const char *);
int audio_pt_unlock_and_signal (struct audio_pt *, const char *);
int audio_pt_join (struct audio_pt *, void **, const char *);
#endif /* QEMU_AUDIO_PT_INT_H */
#endif /* audio_pt_int.h */

View File

@@ -57,13 +57,13 @@ static void glue (audio_init_nb_voices_, TYPE) (struct audio_driver *drv)
glue (s->nb_hw_voices_, TYPE) = max_voices;
}
if (audio_bug(__func__, !voice_size && max_voices)) {
if (audio_bug (AUDIO_FUNC, !voice_size && max_voices)) {
dolog ("drv=`%s' voice_size=0 max_voices=%d\n",
drv->name, max_voices);
glue (s->nb_hw_voices_, TYPE) = 0;
}
if (audio_bug(__func__, voice_size && !max_voices)) {
if (audio_bug (AUDIO_FUNC, voice_size && !max_voices)) {
dolog ("drv=`%s' voice_size=%d max_voices=0\n",
drv->name, voice_size);
}
@@ -77,7 +77,7 @@ static void glue (audio_pcm_hw_free_resources_, TYPE) (HW *hw)
static int glue (audio_pcm_hw_alloc_resources_, TYPE) (HW *hw)
{
HWBUF = audio_calloc(__func__, hw->samples, sizeof(struct st_sample));
HWBUF = audio_calloc (AUDIO_FUNC, hw->samples, sizeof (struct st_sample));
if (!HWBUF) {
dolog ("Could not allocate " NAME " buffer (%d samples)\n",
hw->samples);
@@ -105,7 +105,7 @@ static int glue (audio_pcm_sw_alloc_resources_, TYPE) (SW *sw)
samples = ((int64_t) sw->hw->samples << 32) / sw->ratio;
sw->buf = audio_calloc(__func__, samples, sizeof(struct st_sample));
sw->buf = audio_calloc (AUDIO_FUNC, samples, sizeof (struct st_sample));
if (!sw->buf) {
dolog ("Could not allocate buffer for `%s' (%d samples)\n",
SW_NAME (sw), samples);
@@ -238,17 +238,17 @@ static HW *glue (audio_pcm_hw_add_new_, TYPE) (struct audsettings *as)
return NULL;
}
if (audio_bug(__func__, !drv)) {
if (audio_bug (AUDIO_FUNC, !drv)) {
dolog ("No host audio driver\n");
return NULL;
}
if (audio_bug(__func__, !drv->pcm_ops)) {
if (audio_bug (AUDIO_FUNC, !drv->pcm_ops)) {
dolog ("Host audio driver without pcm_ops\n");
return NULL;
}
hw = audio_calloc(__func__, 1, glue(drv->voice_size_, TYPE));
hw = audio_calloc (AUDIO_FUNC, 1, glue (drv->voice_size_, TYPE));
if (!hw) {
dolog ("Can not allocate voice `%s' size %d\n",
drv->name, glue (drv->voice_size_, TYPE));
@@ -266,7 +266,7 @@ static HW *glue (audio_pcm_hw_add_new_, TYPE) (struct audsettings *as)
goto err0;
}
if (audio_bug(__func__, hw->samples <= 0)) {
if (audio_bug (AUDIO_FUNC, hw->samples <= 0)) {
dolog ("hw->samples=%d\n", hw->samples);
goto err1;
}
@@ -339,7 +339,7 @@ static SW *glue (audio_pcm_create_voice_pair_, TYPE) (
hw_as = *as;
}
sw = audio_calloc(__func__, 1, sizeof(*sw));
sw = audio_calloc (AUDIO_FUNC, 1, sizeof (*sw));
if (!sw) {
dolog ("Could not allocate soft voice `%s' (%zu bytes)\n",
sw_name ? sw_name : "unknown", sizeof (*sw));
@@ -379,7 +379,7 @@ static void glue (audio_close_, TYPE) (SW *sw)
void glue (AUD_close_, TYPE) (QEMUSoundCard *card, SW *sw)
{
if (sw) {
if (audio_bug(__func__, !card)) {
if (audio_bug (AUDIO_FUNC, !card)) {
dolog ("card=%p\n", card);
return;
}
@@ -399,7 +399,7 @@ SW *glue (AUD_open_, TYPE) (
{
AudioState *s = &glob_audio_state;
if (audio_bug(__func__, !card || !name || !callback_fn || !as)) {
if (audio_bug (AUDIO_FUNC, !card || !name || !callback_fn || !as)) {
dolog ("card=%p name=%p callback_fn=%p as=%p\n",
card, name, callback_fn, as);
goto fail;
@@ -408,12 +408,12 @@ SW *glue (AUD_open_, TYPE) (
ldebug ("open %s, freq %d, nchannels %d, fmt %d\n",
name, as->freq, as->nchannels, as->fmt);
if (audio_bug(__func__, audio_validate_settings(as))) {
if (audio_bug (AUDIO_FUNC, audio_validate_settings (as))) {
audio_print_settings (as);
goto fail;
}
if (audio_bug(__func__, !s->drv)) {
if (audio_bug (AUDIO_FUNC, !s->drv)) {
dolog ("Can not open `%s' (no host audio driver)\n", name);
goto fail;
}
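audio_calloc(), used throughout this template, is a checking calloc wrapper that records which caller made a suspicious request. A rough standalone equivalent is sketched below; checked_calloc() is a hypothetical stand-in, and the real helper additionally routes messages through the audio subsystem's logging.

#include <stdio.h>
#include <stdlib.h>

/* checked_calloc() is a stand-in for audio_calloc(); illustration only. */
static void *checked_calloc(const char *caller, int nmemb, size_t size)
{
    if (nmemb <= 0 || size == 0) {
        fprintf(stderr, "%s: rejected allocation (%d x %zu)\n",
                caller, nmemb, size);
        return NULL;
    }
    return calloc((size_t) nmemb, size);
}

int main(void)
{
    short *buf = checked_calloc(__func__, 1024, sizeof(*buf));

    if (!buf) {
        return 1;
    }
    free(buf);
    return 0;
}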

View File

@@ -36,6 +36,8 @@
#define MAC_OS_X_VERSION_10_6 1060
#endif
static int isAtexit;
typedef struct {
int buffer_frames;
int nbuffers;
@@ -376,6 +378,11 @@ static inline UInt32 isPlaying (AudioDeviceID outputDeviceID)
return result;
}
static void coreaudio_atexit (void)
{
isAtexit = 1;
}
static int coreaudio_lock (coreaudioVoiceOut *core, const char *fn_name)
{
int err;
@@ -623,7 +630,7 @@ static void coreaudio_fini_out (HWVoiceOut *hw)
int err;
coreaudioVoiceOut *core = (coreaudioVoiceOut *) hw;
if (!audio_is_cleaning_up()) {
if (!isAtexit) {
/* stop playback */
if (isPlaying(core->outputDeviceID)) {
status = AudioDeviceStop(core->outputDeviceID, core->ioprocid);
@@ -666,7 +673,7 @@ static int coreaudio_ctl_out (HWVoiceOut *hw, int cmd, ...)
case VOICE_DISABLE:
/* stop playback */
if (!audio_is_cleaning_up()) {
if (!isAtexit) {
if (isPlaying(core->outputDeviceID)) {
status = AudioDeviceStop(core->outputDeviceID,
core->ioprocid);
@@ -690,6 +697,7 @@ static void *coreaudio_audio_init (void)
CoreaudioConf *conf = g_malloc(sizeof(CoreaudioConf));
*conf = glob_conf;
atexit(coreaudio_atexit);
return conf;
}
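One side of the coreaudio hunks above detects shutdown with a global flag set from an atexit() handler instead of calling audio_is_cleaning_up(). A small sketch of that flag-on-exit idiom, for reference only:

#include <stdio.h>
#include <stdlib.h>

static int is_exiting;               /* cf. isAtexit in the hunks above */

static void note_exit(void)
{
    is_exiting = 1;                  /* set once the process starts to exit */
}

static void stop_playback(void)
{
    if (!is_exiting) {
        puts("stopping device cleanly");
    } else {
        puts("process exiting: skip device calls");
    }
}

int main(void)
{
    atexit(note_exit);
    stop_playback();                 /* normal path: device is stopped */
    return 0;                        /* note_exit() runs during exit */
}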

View File

@@ -543,7 +543,7 @@ static int dsound_run_out (HWVoiceOut *hw, int live)
}
}
if (audio_bug(__func__, len < 0 || len > bufsize)) {
if (audio_bug (AUDIO_FUNC, len < 0 || len > bufsize)) {
dolog ("len=%d bufsize=%d old_pos=%ld ppos=%ld\n",
len, bufsize, old_pos, ppos);
return 0;

View File

@@ -24,8 +24,6 @@
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "qemu/bswap.h"
#include "qemu/error-report.h"
#include "audio.h"
#define AUDIO_CAP "mixeng"
@@ -268,37 +266,6 @@ f_sample *mixeng_clip[2][2][2][3] = {
}
};
void audio_sample_to_uint64(void *samples, int pos,
uint64_t *left, uint64_t *right)
{
struct st_sample *sample = samples;
sample += pos;
#ifdef FLOAT_MIXENG
error_report(
"Coreaudio and floating point samples are not supported by replay yet");
abort();
#else
*left = sample->l;
*right = sample->r;
#endif
}
void audio_sample_from_uint64(void *samples, int pos,
uint64_t left, uint64_t right)
{
struct st_sample *sample = samples;
sample += pos;
#ifdef FLOAT_MIXENG
error_report(
"Coreaudio and floating point samples are not supported by replay yet");
abort();
#else
sample->l = left;
sample->r = right;
#endif
}
/*
* August 21, 1998
* Copyright 1998 Fabrice Bellard.
@@ -344,7 +311,7 @@ struct rate {
*/
void *st_rate_start (int inrate, int outrate)
{
struct rate *rate = audio_calloc(__func__, 1, sizeof(*rate));
struct rate *rate = audio_calloc (AUDIO_FUNC, 1, sizeof (*rate));
if (!rate) {
dolog ("Could not allocate resampler (%zu bytes)\n", sizeof (*rate));

View File

@@ -21,7 +21,6 @@
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#ifndef QEMU_MIXENG_H
#define QEMU_MIXENG_H
@@ -49,4 +48,4 @@ void st_rate_stop (void *opaque);
void mixeng_clear (struct st_sample *buf, int len);
void mixeng_volume (struct st_sample *buf, int len, struct mixeng_volume *vol);
#endif /* QEMU_MIXENG_H */
#endif /* mixeng.h */

View File

@@ -23,7 +23,6 @@
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "qemu/host-utils.h"
#include "audio.h"
#include "qemu/timer.h"

View File

@@ -22,6 +22,7 @@
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include <sys/mman.h>
#include <sys/ioctl.h>
#include <sys/soundcard.h>
#include "qemu-common.h"
@@ -582,9 +583,11 @@ static int oss_init_out(HWVoiceOut *hw, struct audsettings *as,
}
if (!oss->mmapped) {
oss->pcm_buf = audio_calloc(__func__,
hw->samples,
1 << hw->info.shift);
oss->pcm_buf = audio_calloc (
AUDIO_FUNC,
hw->samples,
1 << hw->info.shift
);
if (!oss->pcm_buf) {
dolog (
"Could not allocate DAC buffer (%d samples, each %d bytes)\n",
@@ -703,7 +706,7 @@ static int oss_init_in(HWVoiceIn *hw, struct audsettings *as, void *drv_opaque)
}
hw->samples = (obt.nfrags * obt.fragsize) >> hw->info.shift;
oss->pcm_buf = audio_calloc(__func__, hw->samples, 1 << hw->info.shift);
oss->pcm_buf = audio_calloc (AUDIO_FUNC, hw->samples, 1 << hw->info.shift);
if (!oss->pcm_buf) {
dolog ("Could not allocate ADC buffer (%d samples, each %d bytes)\n",
hw->samples, 1 << hw->info.shift);

View File

@@ -89,7 +89,7 @@ static inline int PA_STREAM_IS_GOOD(pa_stream_state_t x)
} \
goto label; \
} \
} while (0)
} while (0);
#define CHECK_DEAD_GOTO(c, stream, rerror, label) \
do { \
@@ -107,7 +107,7 @@ static inline int PA_STREAM_IS_GOOD(pa_stream_state_t x)
} \
goto label; \
} \
} while (0)
} while (0);
static int qpa_simple_read (PAVoiceIn *p, void *data, size_t length, int *rerror)
{
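The two hunks above only toggle a trailing semicolon inside the CHECK_*_GOTO macros, but that detail changes how the macros behave in if/else chains. A standalone illustration with made-up macro names:

#include <stdio.h>

#define LOG_OK(x)   do { printf("ok %d\n", (x)); } while (0)
#define LOG_BAD(x)  do { printf("ok %d\n", (x)); } while (0);  /* extra ';' */

int main(void)
{
    int v = 1;

    /* LOG_OK(v); expands to do { ... } while (0); -- one statement, so the
     * else below still binds to the if. */
    if (v)
        LOG_OK(v);
    else
        LOG_OK(0);

    /* Replacing LOG_OK with LOG_BAD here would not compile: the macro's own
     * ';' ends the if-branch and the caller's ';' becomes an empty statement,
     * leaving the 'else' without a matching 'if'. */
    return 0;
}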
@@ -206,7 +206,7 @@ static void *qpa_thread_out (void *arg)
PAVoiceOut *pa = arg;
HWVoiceOut *hw = &pa->hw;
if (audio_pt_lock(&pa->pt, __func__)) {
if (audio_pt_lock (&pa->pt, AUDIO_FUNC)) {
return NULL;
}
@@ -222,7 +222,7 @@ static void *qpa_thread_out (void *arg)
break;
}
if (audio_pt_wait(&pa->pt, __func__)) {
if (audio_pt_wait (&pa->pt, AUDIO_FUNC)) {
goto exit;
}
}
@@ -230,7 +230,7 @@ static void *qpa_thread_out (void *arg)
decr = to_mix = audio_MIN (pa->live, pa->g->conf.samples >> 2);
rpos = pa->rpos;
if (audio_pt_unlock(&pa->pt, __func__)) {
if (audio_pt_unlock (&pa->pt, AUDIO_FUNC)) {
return NULL;
}
@@ -251,7 +251,7 @@ static void *qpa_thread_out (void *arg)
to_mix -= chunk;
}
if (audio_pt_lock(&pa->pt, __func__)) {
if (audio_pt_lock (&pa->pt, AUDIO_FUNC)) {
return NULL;
}
@@ -261,7 +261,7 @@ static void *qpa_thread_out (void *arg)
}
exit:
audio_pt_unlock(&pa->pt, __func__);
audio_pt_unlock (&pa->pt, AUDIO_FUNC);
return NULL;
}
@@ -270,7 +270,7 @@ static int qpa_run_out (HWVoiceOut *hw, int live)
int decr;
PAVoiceOut *pa = (PAVoiceOut *) hw;
if (audio_pt_lock(&pa->pt, __func__)) {
if (audio_pt_lock (&pa->pt, AUDIO_FUNC)) {
return 0;
}
@@ -279,10 +279,10 @@ static int qpa_run_out (HWVoiceOut *hw, int live)
pa->live = live - decr;
hw->rpos = pa->rpos;
if (pa->live > 0) {
audio_pt_unlock_and_signal(&pa->pt, __func__);
audio_pt_unlock_and_signal (&pa->pt, AUDIO_FUNC);
}
else {
audio_pt_unlock(&pa->pt, __func__);
audio_pt_unlock (&pa->pt, AUDIO_FUNC);
}
return decr;
}
@@ -298,7 +298,7 @@ static void *qpa_thread_in (void *arg)
PAVoiceIn *pa = arg;
HWVoiceIn *hw = &pa->hw;
if (audio_pt_lock(&pa->pt, __func__)) {
if (audio_pt_lock (&pa->pt, AUDIO_FUNC)) {
return NULL;
}
@@ -314,7 +314,7 @@ static void *qpa_thread_in (void *arg)
break;
}
if (audio_pt_wait(&pa->pt, __func__)) {
if (audio_pt_wait (&pa->pt, AUDIO_FUNC)) {
goto exit;
}
}
@@ -322,7 +322,7 @@ static void *qpa_thread_in (void *arg)
incr = to_grab = audio_MIN (pa->dead, pa->g->conf.samples >> 2);
wpos = pa->wpos;
if (audio_pt_unlock(&pa->pt, __func__)) {
if (audio_pt_unlock (&pa->pt, AUDIO_FUNC)) {
return NULL;
}
@@ -342,7 +342,7 @@ static void *qpa_thread_in (void *arg)
to_grab -= chunk;
}
if (audio_pt_lock(&pa->pt, __func__)) {
if (audio_pt_lock (&pa->pt, AUDIO_FUNC)) {
return NULL;
}
@@ -352,7 +352,7 @@ static void *qpa_thread_in (void *arg)
}
exit:
audio_pt_unlock(&pa->pt, __func__);
audio_pt_unlock (&pa->pt, AUDIO_FUNC);
return NULL;
}
@@ -361,7 +361,7 @@ static int qpa_run_in (HWVoiceIn *hw)
int live, incr, dead;
PAVoiceIn *pa = (PAVoiceIn *) hw;
if (audio_pt_lock(&pa->pt, __func__)) {
if (audio_pt_lock (&pa->pt, AUDIO_FUNC)) {
return 0;
}
@@ -372,10 +372,10 @@ static int qpa_run_in (HWVoiceIn *hw)
pa->dead = dead - incr;
hw->wpos = pa->wpos;
if (pa->dead > 0) {
audio_pt_unlock_and_signal(&pa->pt, __func__);
audio_pt_unlock_and_signal (&pa->pt, AUDIO_FUNC);
}
else {
audio_pt_unlock(&pa->pt, __func__);
audio_pt_unlock (&pa->pt, AUDIO_FUNC);
}
return incr;
}
@@ -579,7 +579,7 @@ static int qpa_init_out(HWVoiceOut *hw, struct audsettings *as,
audio_pcm_init_info (&hw->info, &obt_as);
hw->samples = g->conf.samples;
pa->pcm_buf = audio_calloc(__func__, hw->samples, 1 << hw->info.shift);
pa->pcm_buf = audio_calloc (AUDIO_FUNC, hw->samples, 1 << hw->info.shift);
pa->rpos = hw->rpos;
if (!pa->pcm_buf) {
dolog ("Could not allocate buffer (%d bytes)\n",
@@ -587,7 +587,7 @@ static int qpa_init_out(HWVoiceOut *hw, struct audsettings *as,
goto fail2;
}
if (audio_pt_init(&pa->pt, qpa_thread_out, hw, AUDIO_CAP, __func__)) {
if (audio_pt_init (&pa->pt, qpa_thread_out, hw, AUDIO_CAP, AUDIO_FUNC)) {
goto fail3;
}
@@ -636,7 +636,7 @@ static int qpa_init_in(HWVoiceIn *hw, struct audsettings *as, void *drv_opaque)
audio_pcm_init_info (&hw->info, &obt_as);
hw->samples = g->conf.samples;
pa->pcm_buf = audio_calloc(__func__, hw->samples, 1 << hw->info.shift);
pa->pcm_buf = audio_calloc (AUDIO_FUNC, hw->samples, 1 << hw->info.shift);
pa->wpos = hw->wpos;
if (!pa->pcm_buf) {
dolog ("Could not allocate buffer (%d bytes)\n",
@@ -644,7 +644,7 @@ static int qpa_init_in(HWVoiceIn *hw, struct audsettings *as, void *drv_opaque)
goto fail2;
}
if (audio_pt_init(&pa->pt, qpa_thread_in, hw, AUDIO_CAP, __func__)) {
if (audio_pt_init (&pa->pt, qpa_thread_in, hw, AUDIO_CAP, AUDIO_FUNC)) {
goto fail3;
}
@@ -667,17 +667,17 @@ static void qpa_fini_out (HWVoiceOut *hw)
void *ret;
PAVoiceOut *pa = (PAVoiceOut *) hw;
audio_pt_lock(&pa->pt, __func__);
audio_pt_lock (&pa->pt, AUDIO_FUNC);
pa->done = 1;
audio_pt_unlock_and_signal(&pa->pt, __func__);
audio_pt_join(&pa->pt, &ret, __func__);
audio_pt_unlock_and_signal (&pa->pt, AUDIO_FUNC);
audio_pt_join (&pa->pt, &ret, AUDIO_FUNC);
if (pa->stream) {
pa_stream_unref (pa->stream);
pa->stream = NULL;
}
audio_pt_fini(&pa->pt, __func__);
audio_pt_fini (&pa->pt, AUDIO_FUNC);
g_free (pa->pcm_buf);
pa->pcm_buf = NULL;
}
@@ -687,17 +687,17 @@ static void qpa_fini_in (HWVoiceIn *hw)
void *ret;
PAVoiceIn *pa = (PAVoiceIn *) hw;
audio_pt_lock(&pa->pt, __func__);
audio_pt_lock (&pa->pt, AUDIO_FUNC);
pa->done = 1;
audio_pt_unlock_and_signal(&pa->pt, __func__);
audio_pt_join(&pa->pt, &ret, __func__);
audio_pt_unlock_and_signal (&pa->pt, AUDIO_FUNC);
audio_pt_join (&pa->pt, &ret, AUDIO_FUNC);
if (pa->stream) {
pa_stream_unref (pa->stream);
pa->stream = NULL;
}
audio_pt_fini(&pa->pt, __func__);
audio_pt_fini (&pa->pt, AUDIO_FUNC);
g_free (pa->pcm_buf);
pa->pcm_buf = NULL;
}
@@ -781,22 +781,23 @@ static int qpa_ctl_in (HWVoiceIn *hw, int cmd, ...)
pa_threaded_mainloop_lock (g->mainloop);
op = pa_context_set_source_output_volume (g->context,
pa_stream_get_index (pa->stream),
/* FIXME: use the upcoming "set_source_output_{volume,mute}" */
op = pa_context_set_source_volume_by_index (g->context,
pa_stream_get_device_index (pa->stream),
&v, NULL, NULL);
if (!op) {
qpa_logerr (pa_context_errno (g->context),
"set_source_output_volume() failed\n");
"set_source_volume() failed\n");
} else {
pa_operation_unref(op);
}
op = pa_context_set_source_output_mute (g->context,
op = pa_context_set_source_mute_by_index (g->context,
pa_stream_get_index (pa->stream),
sw->vol.mute, NULL, NULL);
if (!op) {
qpa_logerr (pa_context_errno (g->context),
"set_source_output_mute() failed\n");
"set_source_mute() failed\n");
} else {
pa_operation_unref (op);
}

View File

@@ -71,12 +71,6 @@ void NAME (void *opaque, struct st_sample *ibuf, struct st_sample *obuf,
while (rate->ipos <= (rate->opos >> 32)) {
ilast = *ibuf++;
rate->ipos++;
/* if ipos overflow, there is a infinite loop */
if (rate->ipos == 0xffffffff) {
rate->ipos = 1;
rate->opos = rate->opos & 0xffffffff;
}
/* See if we finished the input buffer yet */
if (ibuf >= iend) {
goto the_end;
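The overflow guard in this hunk concerns the resampler's 32.32 fixed-point positions: ipos counts whole input frames consumed, while the top 32 bits of opos track how far the output has advanced into the input. A rough standalone sketch of that stepping, with arbitrarily chosen rates:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    int inrate = 44100, outrate = 48000;
    /* input frames consumed per output frame, as a 32.32 fixed-point step */
    uint64_t opos_inc = ((uint64_t) inrate << 32) / outrate;
    uint64_t opos = 0;
    uint32_t ipos = 0;

    for (int frame = 0; frame < 4; frame++) {
        while (ipos <= (uint32_t) (opos >> 32)) {
            ipos++;                       /* consume one input frame */
        }
        printf("output frame %d consumed %u input frames\n", frame, ipos);
        opos += opos_inc;
    }
    return 0;
}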

View File

@@ -38,14 +38,10 @@
#define AUDIO_CAP "sdl"
#include "audio_int.h"
#define USE_SEMAPHORE (SDL_MAJOR_VERSION < 2)
typedef struct SDLVoiceOut {
HWVoiceOut hw;
int live;
#if USE_SEMAPHORE
int rpos;
#endif
int decr;
} SDLVoiceOut;
@@ -57,10 +53,8 @@ static struct {
static struct SDLAudioState {
int exit;
#if USE_SEMAPHORE
SDL_mutex *mutex;
SDL_sem *sem;
#endif
int initialized;
bool driver_created;
} glob_sdl;
@@ -79,45 +73,31 @@ static void GCC_FMT_ATTR (1, 2) sdl_logerr (const char *fmt, ...)
static int sdl_lock (SDLAudioState *s, const char *forfn)
{
#if USE_SEMAPHORE
if (SDL_LockMutex (s->mutex)) {
sdl_logerr ("SDL_LockMutex for %s failed\n", forfn);
return -1;
}
#else
SDL_LockAudio();
#endif
return 0;
}
static int sdl_unlock (SDLAudioState *s, const char *forfn)
{
#if USE_SEMAPHORE
if (SDL_UnlockMutex (s->mutex)) {
sdl_logerr ("SDL_UnlockMutex for %s failed\n", forfn);
return -1;
}
#else
SDL_UnlockAudio();
#endif
return 0;
}
static int sdl_post (SDLAudioState *s, const char *forfn)
{
#if USE_SEMAPHORE
if (SDL_SemPost (s->sem)) {
sdl_logerr ("SDL_SemPost for %s failed\n", forfn);
return -1;
}
#endif
return 0;
}
#if USE_SEMAPHORE
static int sdl_wait (SDLAudioState *s, const char *forfn)
{
if (SDL_SemWait (s->sem)) {
@@ -126,7 +106,6 @@ static int sdl_wait (SDLAudioState *s, const char *forfn)
}
return 0;
}
#endif
static int sdl_unlock_and_post (SDLAudioState *s, const char *forfn)
{
@@ -267,7 +246,6 @@ static void sdl_callback (void *opaque, Uint8 *buf, int len)
int to_mix, decr;
/* dolog ("in callback samples=%d\n", samples); */
#if USE_SEMAPHORE
sdl_wait (s, "sdl_callback");
if (s->exit) {
return;
@@ -277,7 +255,7 @@ static void sdl_callback (void *opaque, Uint8 *buf, int len)
return;
}
if (audio_bug(__func__, sdl->live < 0 || sdl->live > hw->samples)) {
if (audio_bug (AUDIO_FUNC, sdl->live < 0 || sdl->live > hw->samples)) {
dolog ("sdl->live=%d hw->samples=%d\n",
sdl->live, hw->samples);
return;
@@ -286,11 +264,6 @@ static void sdl_callback (void *opaque, Uint8 *buf, int len)
if (!sdl->live) {
goto again;
}
#else
if (s->exit || !sdl->live) {
break;
}
#endif
/* dolog ("in callback live=%d\n", live); */
to_mix = audio_MIN (samples, sdl->live);
@@ -301,11 +274,7 @@ static void sdl_callback (void *opaque, Uint8 *buf, int len)
/* dolog ("in callback to_mix %d, chunk %d\n", to_mix, chunk); */
hw->clip (buf, src, chunk);
#if USE_SEMAPHORE
sdl->rpos = (sdl->rpos + chunk) % hw->samples;
#else
hw->rpos = (hw->rpos + chunk) % hw->samples;
#endif
to_mix -= chunk;
buf += chunk << hw->info.shift;
}
@@ -313,21 +282,12 @@ static void sdl_callback (void *opaque, Uint8 *buf, int len)
sdl->live -= decr;
sdl->decr += decr;
#if USE_SEMAPHORE
again:
if (sdl_unlock (s, "sdl_callback")) {
return;
}
#endif
}
/* dolog ("done len=%d\n", len); */
#if (SDL_MAJOR_VERSION >= 2)
/* SDL2 does not clear the remaining buffer for us, so do it on our own */
if (samples) {
memset(buf, 0, samples << hw->info.shift);
}
#endif
}
static int sdl_write_out (SWVoiceOut *sw, void *buf, int len)
@@ -355,12 +315,8 @@ static int sdl_run_out (HWVoiceOut *hw, int live)
decr = audio_MIN (sdl->decr, live);
sdl->decr -= decr;
#if USE_SEMAPHORE
sdl->live = live - decr;
hw->rpos = sdl->rpos;
#else
sdl->live = live;
#endif
if (sdl->live > 0) {
sdl_unlock_and_post (s, "sdl_run_out");
@@ -449,7 +405,6 @@ static void *sdl_audio_init (void)
return NULL;
}
#if USE_SEMAPHORE
s->mutex = SDL_CreateMutex ();
if (!s->mutex) {
sdl_logerr ("Failed to create SDL mutex\n");
@@ -464,7 +419,6 @@ static void *sdl_audio_init (void)
SDL_QuitSubSystem (SDL_INIT_AUDIO);
return NULL;
}
#endif
s->driver_created = true;
return s;
@@ -474,10 +428,8 @@ static void sdl_audio_fini (void *opaque)
{
SDLAudioState *s = opaque;
sdl_close (s);
#if USE_SEMAPHORE
SDL_DestroySemaphore (s->sem);
SDL_DestroyMutex (s->mutex);
#endif
SDL_QuitSubSystem (SDL_INIT_AUDIO);
s->driver_created = false;
}

View File

@@ -19,7 +19,6 @@
#include "qemu/osdep.h"
#include "hw/hw.h"
#include "qemu/host-utils.h"
#include "qemu/error-report.h"
#include "qemu/timer.h"
#include "ui/qemu-spice.h"

View File

@@ -1,17 +0,0 @@
# See docs/devel/tracing.txt for syntax documentation.
# audio/alsaaudio.c
alsa_revents(int revents) "revents = %d"
alsa_pollout(int i, int fd) "i = %d fd = %d"
alsa_set_handler(int events, int index, int fd, int err) "events=0x%x index=%d fd=%d err=%d"
alsa_wrote_zero(int len) "Failed to write %d frames (wrote zero)"
alsa_read_zero(long len) "Failed to read %ld frames (read zero)"
alsa_xrun_out(void) "Recovering from playback xrun"
alsa_xrun_in(void) "Recovering from capture xrun"
alsa_resume_out(void) "Resuming suspended output stream"
alsa_resume_in(void) "Resuming suspended input stream"
alsa_no_frames(int state) "No frames available and ALSA state is %d"
# audio/ossaudio.c
oss_version(int version) "OSS version = 0x%x"
oss_invalid_available_size(int size, int bufsize) "Invalid available size, size=%d bufsize=%d"

View File

@@ -22,7 +22,7 @@
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "qemu/host-utils.h"
#include "hw/hw.h"
#include "qemu/timer.h"
#include "audio.h"
@@ -139,7 +139,7 @@ static int wav_init_out(HWVoiceOut *hw, struct audsettings *as,
audio_pcm_init_info (&hw->info, &wav_as);
hw->samples = 1024;
wav->pcm_buf = audio_calloc(__func__, hw->samples, 1 << hw->info.shift);
wav->pcm_buf = audio_calloc (AUDIO_FUNC, hw->samples, 1 << hw->info.shift);
if (!wav->pcm_buf) {
dolog ("Could not allocate buffer (%d bytes)\n",
hw->samples << hw->info.shift);

View File

@@ -1,7 +1,6 @@
#include "qemu/osdep.h"
#include "hw/hw.h"
#include "monitor/monitor.h"
#include "qapi/error.h"
#include "qemu/error-report.h"
#include "audio.h"
@@ -89,7 +88,6 @@ static void wav_capture_destroy (void *opaque)
WAVState *wav = opaque;
AUD_del_capture (wav->cap, wav);
g_free (wav);
}
static void wav_capture_info (void *opaque)

View File

@@ -1,12 +1,11 @@
common-obj-y += rng.o rng-egd.o
common-obj-$(CONFIG_POSIX) += rng-random.o
common-obj-y += msmouse.o testdev.o
common-obj-$(CONFIG_BRLAPI) += baum.o
baum.o-cflags := $(SDL_CFLAGS)
common-obj-$(CONFIG_TPM) += tpm.o
common-obj-y += hostmem.o hostmem-ram.o
common-obj-$(CONFIG_LINUX) += hostmem-file.o
common-obj-y += cryptodev.o
common-obj-y += cryptodev-builtin.o
common-obj-$(CONFIG_LINUX) += hostmem-memfd.o

View File

@@ -1,7 +1,7 @@
/*
* QEMU Baum Braille Device
*
* Copyright (c) 2008, 2010-2011, 2016-2017 Samuel Thibault
* Copyright (c) 2008 Samuel Thibault
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
@@ -24,13 +24,15 @@
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu-common.h"
#include "chardev/char.h"
#include "sysemu/char.h"
#include "qemu/timer.h"
#include "hw/usb.h"
#include "ui/console.h"
#include <brlapi.h>
#include <brlapi_constants.h>
#include <brlapi_keycodes.h>
#ifdef CONFIG_SDL
#include <SDL_syswm.h>
#endif
#if 0
#define DPRINTF(fmt, ...) \
@@ -85,12 +87,11 @@
#define BUF_SIZE 256
typedef struct {
Chardev parent;
CharDriverState *chr;
brlapi_handle_t *brlapi;
int brlapi_fd;
unsigned int x, y;
bool deferred_init;
uint8_t in_buf[BUF_SIZE];
uint8_t in_buf_used;
@@ -98,17 +99,11 @@ typedef struct {
uint8_t out_buf_used, out_buf_ptr;
QEMUTimer *cellCount_timer;
} BaumChardev;
#define TYPE_CHARDEV_BRAILLE "chardev-braille"
#define BAUM_CHARDEV(obj) OBJECT_CHECK(BaumChardev, (obj), TYPE_CHARDEV_BRAILLE)
} BaumDriverState;
/* Let's assume NABCC by default */
enum way {
DOTS2ASCII,
ASCII2DOTS
};
static const uint8_t nabcc_translation[2][256] = {
static const uint8_t nabcc_translation[256] = {
[0] = ' ',
#ifndef BRLAPI_DOTS
#define BRLAPI_DOTS(d1,d2,d3,d4,d5,d6,d7,d8) \
((d1?BRLAPI_DOT1:0)|\
@@ -120,151 +115,111 @@ static const uint8_t nabcc_translation[2][256] = {
(d7?BRLAPI_DOT7:0)|\
(d8?BRLAPI_DOT8:0))
#endif
#define DO(dots, ascii) \
[DOTS2ASCII][dots] = ascii, \
[ASCII2DOTS][ascii] = dots
DO(0, ' '),
DO(BRLAPI_DOTS(1, 0, 0, 0, 0, 0, 0, 0), 'a'),
DO(BRLAPI_DOTS(1, 1, 0, 0, 0, 0, 0, 0), 'b'),
DO(BRLAPI_DOTS(1, 0, 0, 1, 0, 0, 0, 0), 'c'),
DO(BRLAPI_DOTS(1, 0, 0, 1, 1, 0, 0, 0), 'd'),
DO(BRLAPI_DOTS(1, 0, 0, 0, 1, 0, 0, 0), 'e'),
DO(BRLAPI_DOTS(1, 1, 0, 1, 0, 0, 0, 0), 'f'),
DO(BRLAPI_DOTS(1, 1, 0, 1, 1, 0, 0, 0), 'g'),
DO(BRLAPI_DOTS(1, 1, 0, 0, 1, 0, 0, 0), 'h'),
DO(BRLAPI_DOTS(0, 1, 0, 1, 0, 0, 0, 0), 'i'),
DO(BRLAPI_DOTS(0, 1, 0, 1, 1, 0, 0, 0), 'j'),
DO(BRLAPI_DOTS(1, 0, 1, 0, 0, 0, 0, 0), 'k'),
DO(BRLAPI_DOTS(1, 1, 1, 0, 0, 0, 0, 0), 'l'),
DO(BRLAPI_DOTS(1, 0, 1, 1, 0, 0, 0, 0), 'm'),
DO(BRLAPI_DOTS(1, 0, 1, 1, 1, 0, 0, 0), 'n'),
DO(BRLAPI_DOTS(1, 0, 1, 0, 1, 0, 0, 0), 'o'),
DO(BRLAPI_DOTS(1, 1, 1, 1, 0, 0, 0, 0), 'p'),
DO(BRLAPI_DOTS(1, 1, 1, 1, 1, 0, 0, 0), 'q'),
DO(BRLAPI_DOTS(1, 1, 1, 0, 1, 0, 0, 0), 'r'),
DO(BRLAPI_DOTS(0, 1, 1, 1, 0, 0, 0, 0), 's'),
DO(BRLAPI_DOTS(0, 1, 1, 1, 1, 0, 0, 0), 't'),
DO(BRLAPI_DOTS(1, 0, 1, 0, 0, 1, 0, 0), 'u'),
DO(BRLAPI_DOTS(1, 1, 1, 0, 0, 1, 0, 0), 'v'),
DO(BRLAPI_DOTS(0, 1, 0, 1, 1, 1, 0, 0), 'w'),
DO(BRLAPI_DOTS(1, 0, 1, 1, 0, 1, 0, 0), 'x'),
DO(BRLAPI_DOTS(1, 0, 1, 1, 1, 1, 0, 0), 'y'),
DO(BRLAPI_DOTS(1, 0, 1, 0, 1, 1, 0, 0), 'z'),
[BRLAPI_DOTS(1,0,0,0,0,0,0,0)] = 'a',
[BRLAPI_DOTS(1,1,0,0,0,0,0,0)] = 'b',
[BRLAPI_DOTS(1,0,0,1,0,0,0,0)] = 'c',
[BRLAPI_DOTS(1,0,0,1,1,0,0,0)] = 'd',
[BRLAPI_DOTS(1,0,0,0,1,0,0,0)] = 'e',
[BRLAPI_DOTS(1,1,0,1,0,0,0,0)] = 'f',
[BRLAPI_DOTS(1,1,0,1,1,0,0,0)] = 'g',
[BRLAPI_DOTS(1,1,0,0,1,0,0,0)] = 'h',
[BRLAPI_DOTS(0,1,0,1,0,0,0,0)] = 'i',
[BRLAPI_DOTS(0,1,0,1,1,0,0,0)] = 'j',
[BRLAPI_DOTS(1,0,1,0,0,0,0,0)] = 'k',
[BRLAPI_DOTS(1,1,1,0,0,0,0,0)] = 'l',
[BRLAPI_DOTS(1,0,1,1,0,0,0,0)] = 'm',
[BRLAPI_DOTS(1,0,1,1,1,0,0,0)] = 'n',
[BRLAPI_DOTS(1,0,1,0,1,0,0,0)] = 'o',
[BRLAPI_DOTS(1,1,1,1,0,0,0,0)] = 'p',
[BRLAPI_DOTS(1,1,1,1,1,0,0,0)] = 'q',
[BRLAPI_DOTS(1,1,1,0,1,0,0,0)] = 'r',
[BRLAPI_DOTS(0,1,1,1,0,0,0,0)] = 's',
[BRLAPI_DOTS(0,1,1,1,1,0,0,0)] = 't',
[BRLAPI_DOTS(1,0,1,0,0,1,0,0)] = 'u',
[BRLAPI_DOTS(1,1,1,0,0,1,0,0)] = 'v',
[BRLAPI_DOTS(0,1,0,1,1,1,0,0)] = 'w',
[BRLAPI_DOTS(1,0,1,1,0,1,0,0)] = 'x',
[BRLAPI_DOTS(1,0,1,1,1,1,0,0)] = 'y',
[BRLAPI_DOTS(1,0,1,0,1,1,0,0)] = 'z',
DO(BRLAPI_DOTS(1, 0, 0, 0, 0, 0, 1, 0), 'A'),
DO(BRLAPI_DOTS(1, 1, 0, 0, 0, 0, 1, 0), 'B'),
DO(BRLAPI_DOTS(1, 0, 0, 1, 0, 0, 1, 0), 'C'),
DO(BRLAPI_DOTS(1, 0, 0, 1, 1, 0, 1, 0), 'D'),
DO(BRLAPI_DOTS(1, 0, 0, 0, 1, 0, 1, 0), 'E'),
DO(BRLAPI_DOTS(1, 1, 0, 1, 0, 0, 1, 0), 'F'),
DO(BRLAPI_DOTS(1, 1, 0, 1, 1, 0, 1, 0), 'G'),
DO(BRLAPI_DOTS(1, 1, 0, 0, 1, 0, 1, 0), 'H'),
DO(BRLAPI_DOTS(0, 1, 0, 1, 0, 0, 1, 0), 'I'),
DO(BRLAPI_DOTS(0, 1, 0, 1, 1, 0, 1, 0), 'J'),
DO(BRLAPI_DOTS(1, 0, 1, 0, 0, 0, 1, 0), 'K'),
DO(BRLAPI_DOTS(1, 1, 1, 0, 0, 0, 1, 0), 'L'),
DO(BRLAPI_DOTS(1, 0, 1, 1, 0, 0, 1, 0), 'M'),
DO(BRLAPI_DOTS(1, 0, 1, 1, 1, 0, 1, 0), 'N'),
DO(BRLAPI_DOTS(1, 0, 1, 0, 1, 0, 1, 0), 'O'),
DO(BRLAPI_DOTS(1, 1, 1, 1, 0, 0, 1, 0), 'P'),
DO(BRLAPI_DOTS(1, 1, 1, 1, 1, 0, 1, 0), 'Q'),
DO(BRLAPI_DOTS(1, 1, 1, 0, 1, 0, 1, 0), 'R'),
DO(BRLAPI_DOTS(0, 1, 1, 1, 0, 0, 1, 0), 'S'),
DO(BRLAPI_DOTS(0, 1, 1, 1, 1, 0, 1, 0), 'T'),
DO(BRLAPI_DOTS(1, 0, 1, 0, 0, 1, 1, 0), 'U'),
DO(BRLAPI_DOTS(1, 1, 1, 0, 0, 1, 1, 0), 'V'),
DO(BRLAPI_DOTS(0, 1, 0, 1, 1, 1, 1, 0), 'W'),
DO(BRLAPI_DOTS(1, 0, 1, 1, 0, 1, 1, 0), 'X'),
DO(BRLAPI_DOTS(1, 0, 1, 1, 1, 1, 1, 0), 'Y'),
DO(BRLAPI_DOTS(1, 0, 1, 0, 1, 1, 1, 0), 'Z'),
[BRLAPI_DOTS(1,0,0,0,0,0,1,0)] = 'A',
[BRLAPI_DOTS(1,1,0,0,0,0,1,0)] = 'B',
[BRLAPI_DOTS(1,0,0,1,0,0,1,0)] = 'C',
[BRLAPI_DOTS(1,0,0,1,1,0,1,0)] = 'D',
[BRLAPI_DOTS(1,0,0,0,1,0,1,0)] = 'E',
[BRLAPI_DOTS(1,1,0,1,0,0,1,0)] = 'F',
[BRLAPI_DOTS(1,1,0,1,1,0,1,0)] = 'G',
[BRLAPI_DOTS(1,1,0,0,1,0,1,0)] = 'H',
[BRLAPI_DOTS(0,1,0,1,0,0,1,0)] = 'I',
[BRLAPI_DOTS(0,1,0,1,1,0,1,0)] = 'J',
[BRLAPI_DOTS(1,0,1,0,0,0,1,0)] = 'K',
[BRLAPI_DOTS(1,1,1,0,0,0,1,0)] = 'L',
[BRLAPI_DOTS(1,0,1,1,0,0,1,0)] = 'M',
[BRLAPI_DOTS(1,0,1,1,1,0,1,0)] = 'N',
[BRLAPI_DOTS(1,0,1,0,1,0,1,0)] = 'O',
[BRLAPI_DOTS(1,1,1,1,0,0,1,0)] = 'P',
[BRLAPI_DOTS(1,1,1,1,1,0,1,0)] = 'Q',
[BRLAPI_DOTS(1,1,1,0,1,0,1,0)] = 'R',
[BRLAPI_DOTS(0,1,1,1,0,0,1,0)] = 'S',
[BRLAPI_DOTS(0,1,1,1,1,0,1,0)] = 'T',
[BRLAPI_DOTS(1,0,1,0,0,1,1,0)] = 'U',
[BRLAPI_DOTS(1,1,1,0,0,1,1,0)] = 'V',
[BRLAPI_DOTS(0,1,0,1,1,1,1,0)] = 'W',
[BRLAPI_DOTS(1,0,1,1,0,1,1,0)] = 'X',
[BRLAPI_DOTS(1,0,1,1,1,1,1,0)] = 'Y',
[BRLAPI_DOTS(1,0,1,0,1,1,1,0)] = 'Z',
DO(BRLAPI_DOTS(0, 0, 1, 0, 1, 1, 0, 0), '0'),
DO(BRLAPI_DOTS(0, 1, 0, 0, 0, 0, 0, 0), '1'),
DO(BRLAPI_DOTS(0, 1, 1, 0, 0, 0, 0, 0), '2'),
DO(BRLAPI_DOTS(0, 1, 0, 0, 1, 0, 0, 0), '3'),
DO(BRLAPI_DOTS(0, 1, 0, 0, 1, 1, 0, 0), '4'),
DO(BRLAPI_DOTS(0, 1, 0, 0, 0, 1, 0, 0), '5'),
DO(BRLAPI_DOTS(0, 1, 1, 0, 1, 0, 0, 0), '6'),
DO(BRLAPI_DOTS(0, 1, 1, 0, 1, 1, 0, 0), '7'),
DO(BRLAPI_DOTS(0, 1, 1, 0, 0, 1, 0, 0), '8'),
DO(BRLAPI_DOTS(0, 0, 1, 0, 1, 0, 0, 0), '9'),
[BRLAPI_DOTS(0,0,1,0,1,1,0,0)] = '0',
[BRLAPI_DOTS(0,1,0,0,0,0,0,0)] = '1',
[BRLAPI_DOTS(0,1,1,0,0,0,0,0)] = '2',
[BRLAPI_DOTS(0,1,0,0,1,0,0,0)] = '3',
[BRLAPI_DOTS(0,1,0,0,1,1,0,0)] = '4',
[BRLAPI_DOTS(0,1,0,0,0,1,0,0)] = '5',
[BRLAPI_DOTS(0,1,1,0,1,0,0,0)] = '6',
[BRLAPI_DOTS(0,1,1,0,1,1,0,0)] = '7',
[BRLAPI_DOTS(0,1,1,0,0,1,0,0)] = '8',
[BRLAPI_DOTS(0,0,1,0,1,0,0,0)] = '9',
DO(BRLAPI_DOTS(0, 0, 0, 1, 0, 1, 0, 0), '.'),
DO(BRLAPI_DOTS(0, 0, 1, 1, 0, 1, 0, 0), '+'),
DO(BRLAPI_DOTS(0, 0, 1, 0, 0, 1, 0, 0), '-'),
DO(BRLAPI_DOTS(1, 0, 0, 0, 0, 1, 0, 0), '*'),
DO(BRLAPI_DOTS(0, 0, 1, 1, 0, 0, 0, 0), '/'),
DO(BRLAPI_DOTS(1, 1, 1, 0, 1, 1, 0, 0), '('),
DO(BRLAPI_DOTS(0, 1, 1, 1, 1, 1, 0, 0), ')'),
[BRLAPI_DOTS(0,0,0,1,0,1,0,0)] = '.',
[BRLAPI_DOTS(0,0,1,1,0,1,0,0)] = '+',
[BRLAPI_DOTS(0,0,1,0,0,1,0,0)] = '-',
[BRLAPI_DOTS(1,0,0,0,0,1,0,0)] = '*',
[BRLAPI_DOTS(0,0,1,1,0,0,0,0)] = '/',
[BRLAPI_DOTS(1,1,1,0,1,1,0,0)] = '(',
[BRLAPI_DOTS(0,1,1,1,1,1,0,0)] = ')',
DO(BRLAPI_DOTS(1, 1, 1, 1, 0, 1, 0, 0), '&'),
DO(BRLAPI_DOTS(0, 0, 1, 1, 1, 1, 0, 0), '#'),
[BRLAPI_DOTS(1,1,1,1,0,1,0,0)] = '&',
[BRLAPI_DOTS(0,0,1,1,1,1,0,0)] = '#',
DO(BRLAPI_DOTS(0, 0, 0, 0, 0, 1, 0, 0), ','),
DO(BRLAPI_DOTS(0, 0, 0, 0, 1, 1, 0, 0), ';'),
DO(BRLAPI_DOTS(1, 0, 0, 0, 1, 1, 0, 0), ':'),
DO(BRLAPI_DOTS(0, 1, 1, 1, 0, 1, 0, 0), '!'),
DO(BRLAPI_DOTS(1, 0, 0, 1, 1, 1, 0, 0), '?'),
DO(BRLAPI_DOTS(0, 0, 0, 0, 1, 0, 0, 0), '"'),
DO(BRLAPI_DOTS(0, 0, 1, 0, 0, 0, 0, 0), '\''),
DO(BRLAPI_DOTS(0, 0, 0, 1, 0, 0, 0, 0), '`'),
DO(BRLAPI_DOTS(0, 0, 0, 1, 1, 0, 1, 0), '^'),
DO(BRLAPI_DOTS(0, 0, 0, 1, 1, 0, 0, 0), '~'),
DO(BRLAPI_DOTS(0, 1, 0, 1, 0, 1, 1, 0), '['),
DO(BRLAPI_DOTS(1, 1, 0, 1, 1, 1, 1, 0), ']'),
DO(BRLAPI_DOTS(0, 1, 0, 1, 0, 1, 0, 0), '{'),
DO(BRLAPI_DOTS(1, 1, 0, 1, 1, 1, 0, 0), '}'),
DO(BRLAPI_DOTS(1, 1, 1, 1, 1, 1, 0, 0), '='),
DO(BRLAPI_DOTS(1, 1, 0, 0, 0, 1, 0, 0), '<'),
DO(BRLAPI_DOTS(0, 0, 1, 1, 1, 0, 0, 0), '>'),
DO(BRLAPI_DOTS(1, 1, 0, 1, 0, 1, 0, 0), '$'),
DO(BRLAPI_DOTS(1, 0, 0, 1, 0, 1, 0, 0), '%'),
DO(BRLAPI_DOTS(0, 0, 0, 1, 0, 0, 1, 0), '@'),
DO(BRLAPI_DOTS(1, 1, 0, 0, 1, 1, 0, 0), '|'),
DO(BRLAPI_DOTS(1, 1, 0, 0, 1, 1, 1, 0), '\\'),
DO(BRLAPI_DOTS(0, 0, 0, 1, 1, 1, 0, 0), '_'),
[BRLAPI_DOTS(0,0,0,0,0,1,0,0)] = ',',
[BRLAPI_DOTS(0,0,0,0,1,1,0,0)] = ';',
[BRLAPI_DOTS(1,0,0,0,1,1,0,0)] = ':',
[BRLAPI_DOTS(0,1,1,1,0,1,0,0)] = '!',
[BRLAPI_DOTS(1,0,0,1,1,1,0,0)] = '?',
[BRLAPI_DOTS(0,0,0,0,1,0,0,0)] = '"',
[BRLAPI_DOTS(0,0,1,0,0,0,0,0)] ='\'',
[BRLAPI_DOTS(0,0,0,1,0,0,0,0)] = '`',
[BRLAPI_DOTS(0,0,0,1,1,0,1,0)] = '^',
[BRLAPI_DOTS(0,0,0,1,1,0,0,0)] = '~',
[BRLAPI_DOTS(0,1,0,1,0,1,1,0)] = '[',
[BRLAPI_DOTS(1,1,0,1,1,1,1,0)] = ']',
[BRLAPI_DOTS(0,1,0,1,0,1,0,0)] = '{',
[BRLAPI_DOTS(1,1,0,1,1,1,0,0)] = '}',
[BRLAPI_DOTS(1,1,1,1,1,1,0,0)] = '=',
[BRLAPI_DOTS(1,1,0,0,0,1,0,0)] = '<',
[BRLAPI_DOTS(0,0,1,1,1,0,0,0)] = '>',
[BRLAPI_DOTS(1,1,0,1,0,1,0,0)] = '$',
[BRLAPI_DOTS(1,0,0,1,0,1,0,0)] = '%',
[BRLAPI_DOTS(0,0,0,1,0,0,1,0)] = '@',
[BRLAPI_DOTS(1,1,0,0,1,1,0,0)] = '|',
[BRLAPI_DOTS(1,1,0,0,1,1,1,0)] ='\\',
[BRLAPI_DOTS(0,0,0,1,1,1,0,0)] = '_',
};
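One side of this table uses the DO() helper so that a single entry populates both lookup directions (dots-to-ASCII and ASCII-to-dots). A trimmed sketch of that bidirectional designated-initializer idiom, with made-up code values:

#include <stdio.h>

enum { CODE2CHAR, CHAR2CODE };

#define MAP(code, ch) \
    [CODE2CHAR][code] = (ch), \
    [CHAR2CODE][ch]   = (code)

static const unsigned char table[2][256] = {
    MAP(0x01, 'a'),
    MAP(0x03, 'b'),
    MAP(0x09, 'c'),
};

int main(void)
{
    printf("%c\n", table[CODE2CHAR][0x03]);      /* b */
    printf("0x%02x\n", table[CHAR2CODE]['c']);   /* 0x09 */
    return 0;
}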
/* The guest OS has started discussing with us, finish initializing BrlAPI */
static int baum_deferred_init(BaumChardev *baum)
{
int tty = BRLAPI_TTY_DEFAULT;
QemuConsole *con;
if (baum->deferred_init) {
return 1;
}
if (brlapi__getDisplaySize(baum->brlapi, &baum->x, &baum->y) == -1) {
brlapi_perror("baum: brlapi__getDisplaySize");
return 0;
}
if (baum->y > 1) {
baum->y = 1;
}
if (baum->x > 84) {
baum->x = 84;
}
con = qemu_console_lookup_by_index(0);
if (con && qemu_console_is_graphic(con)) {
tty = qemu_console_get_window_id(con);
if (tty == -1)
tty = BRLAPI_TTY_DEFAULT;
}
if (brlapi__enterTtyMode(baum->brlapi, tty, NULL) == -1) {
brlapi_perror("baum: brlapi__enterTtyMode");
return 0;
}
baum->deferred_init = 1;
return 1;
}
/* The serial port can receive more of our data */
static void baum_chr_accept_input(struct Chardev *chr)
static void baum_accept_input(struct CharDriverState *chr)
{
BaumChardev *baum = BAUM_CHARDEV(chr);
BaumDriverState *baum = chr->opaque;
int room, first;
if (!baum->out_buf_used)
@@ -288,25 +243,24 @@ static void baum_chr_accept_input(struct Chardev *chr)
}
/* We want to send a packet */
static void baum_write_packet(BaumChardev *baum, const uint8_t *buf, int len)
static void baum_write_packet(BaumDriverState *baum, const uint8_t *buf, int len)
{
Chardev *chr = CHARDEV(baum);
uint8_t io_buf[1 + 2 * len], *cur = io_buf;
int room;
*cur++ = ESC;
while (len--)
if ((*cur++ = *buf++) == ESC)
*cur++ = ESC;
room = qemu_chr_be_can_write(chr);
room = qemu_chr_be_can_write(baum->chr);
len = cur - io_buf;
if (len <= room) {
/* Fits */
qemu_chr_be_write(chr, io_buf, len);
qemu_chr_be_write(baum->chr, io_buf, len);
} else {
int first;
uint8_t out;
/* Can't fit all, send what can be, and store the rest. */
qemu_chr_be_write(chr, io_buf, room);
qemu_chr_be_write(baum->chr, io_buf, room);
len -= room;
cur = io_buf + room;
if (len > BUF_SIZE - baum->out_buf_used) {
@@ -331,14 +285,14 @@ static void baum_write_packet(BaumChardev *baum, const uint8_t *buf, int len)
/* Called when the other end seems to have a wrong idea of our display size */
static void baum_cellCount_timer_cb(void *opaque)
{
BaumChardev *baum = BAUM_CHARDEV(opaque);
BaumDriverState *baum = opaque;
uint8_t cell_count[] = { BAUM_RSP_CellCount, baum->x * baum->y };
DPRINTF("Timeout waiting for DisplayData, sending cell count\n");
baum_write_packet(baum, cell_count, sizeof(cell_count));
}
/* Try to interpret a whole incoming packet */
static int baum_eat_packet(BaumChardev *baum, const uint8_t *buf, int len)
static int baum_eat_packet(BaumDriverState *baum, const uint8_t *buf, int len)
{
const uint8_t *cur = buf;
uint8_t req = 0;
@@ -392,10 +346,8 @@ static int baum_eat_packet(BaumChardev *baum, const uint8_t *buf, int len)
cursor = i + 1;
c &= ~(BRLAPI_DOT7|BRLAPI_DOT8);
}
c = nabcc_translation[DOTS2ASCII][c];
if (!c) {
if (!(c = nabcc_translation[c]))
c = '?';
}
text[i] = c;
}
timer_del(baum->cellCount_timer);
@@ -479,17 +431,15 @@ static int baum_eat_packet(BaumChardev *baum, const uint8_t *buf, int len)
}
/* The other end is writing some data. Store it and try to interpret */
static int baum_chr_write(Chardev *chr, const uint8_t *buf, int len)
static int baum_write(CharDriverState *chr, const uint8_t *buf, int len)
{
BaumChardev *baum = BAUM_CHARDEV(chr);
BaumDriverState *baum = chr->opaque;
int tocopy, cur, eaten, orig_len = len;
if (!len)
return 0;
if (!baum->brlapi)
return len;
if (!baum_deferred_init(baum))
return len;
while (len) {
/* Complete our buffer as much as possible */
@@ -520,31 +470,20 @@ static int baum_chr_write(Chardev *chr, const uint8_t *buf, int len)
}
/* Send the key code to the other end */
static void baum_send_key(BaumChardev *baum, uint8_t type, uint8_t value)
{
static void baum_send_key(BaumDriverState *baum, uint8_t type, uint8_t value) {
uint8_t packet[] = { type, value };
DPRINTF("writing key %x %x\n", type, value);
baum_write_packet(baum, packet, sizeof(packet));
}
static void baum_send_key2(BaumChardev *baum, uint8_t type, uint8_t value,
uint8_t value2)
{
uint8_t packet[] = { type, value, value2 };
DPRINTF("writing key %x %x\n", type, value);
baum_write_packet(baum, packet, sizeof(packet));
}
/* We got some data on the BrlAPI socket */
static void baum_chr_read(void *opaque)
{
BaumChardev *baum = BAUM_CHARDEV(opaque);
BaumDriverState *baum = opaque;
brlapi_keyCode_t code;
int ret;
if (!baum->brlapi)
return;
if (!baum_deferred_init(baum))
return;
while ((ret = brlapi__readKey(baum->brlapi, 0, &code)) == 1) {
DPRINTF("got key %"BRLAPI_PRIxKEYCODE"\n", code);
/* Emulate */
@@ -601,17 +540,7 @@ static void baum_chr_read(void *opaque)
}
break;
case BRLAPI_KEY_TYPE_SYM:
{
brlapi_keyCode_t keysym = code & BRLAPI_KEY_CODE_MASK;
if (keysym < 0x100) {
uint8_t dots = nabcc_translation[ASCII2DOTS][keysym];
if (dots) {
baum_send_key2(baum, BAUM_RSP_EntryKeys, 0, dots);
baum_send_key2(baum, BAUM_RSP_EntryKeys, 0, 0);
}
}
break;
}
break;
}
}
if (ret == -1 && (brlapi_errno != BRLAPI_ERROR_LIBCERR || errno != EINTR)) {
@@ -622,24 +551,45 @@ static void baum_chr_read(void *opaque)
}
}
static void char_braille_finalize(Object *obj)
static void baum_close(struct CharDriverState *chr)
{
BaumChardev *baum = BAUM_CHARDEV(obj);
BaumDriverState *baum = chr->opaque;
timer_free(baum->cellCount_timer);
if (baum->brlapi) {
brlapi__closeConnection(baum->brlapi);
g_free(baum->brlapi);
}
g_free(baum);
}
static void baum_chr_open(Chardev *chr,
ChardevBackend *backend,
bool *be_opened,
Error **errp)
static CharDriverState *chr_baum_init(const char *id,
ChardevBackend *backend,
ChardevReturn *ret,
Error **errp)
{
BaumChardev *baum = BAUM_CHARDEV(chr);
ChardevCommon *common = backend->u.braille.data;
BaumDriverState *baum;
CharDriverState *chr;
brlapi_handle_t *handle;
#if defined(CONFIG_SDL)
#if SDL_COMPILEDVERSION < SDL_VERSIONNUM(2, 0, 0)
SDL_SysWMinfo info;
#endif
#endif
int tty;
chr = qemu_chr_alloc(common, errp);
if (!chr) {
return NULL;
}
baum = g_malloc0(sizeof(BaumDriverState));
baum->chr = chr;
chr->opaque = baum;
chr->chr_write = baum_write;
chr->chr_accept_input = baum_accept_input;
chr->chr_close = baum_close;
handle = g_malloc0(brlapi_getHandleSize());
baum->brlapi = handle;
@@ -648,37 +598,52 @@ static void baum_chr_open(Chardev *chr,
if (baum->brlapi_fd == -1) {
error_setg(errp, "brlapi__openConnection: %s",
brlapi_strerror(brlapi_error_location()));
g_free(handle);
baum->brlapi = NULL;
return;
goto fail_handle;
}
baum->deferred_init = 0;
baum->cellCount_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL, baum_cellCount_timer_cb, baum);
if (brlapi__getDisplaySize(handle, &baum->x, &baum->y) == -1) {
error_setg(errp, "brlapi__getDisplaySize: %s",
brlapi_strerror(brlapi_error_location()));
goto fail;
}
#if defined(CONFIG_SDL)
#if SDL_COMPILEDVERSION < SDL_VERSIONNUM(2, 0, 0)
memset(&info, 0, sizeof(info));
SDL_VERSION(&info.version);
if (SDL_GetWMInfo(&info))
tty = info.info.x11.wmwindow;
else
#endif
#endif
tty = BRLAPI_TTY_DEFAULT;
if (brlapi__enterTtyMode(handle, tty, NULL) == -1) {
error_setg(errp, "brlapi__enterTtyMode: %s",
brlapi_strerror(brlapi_error_location()));
goto fail;
}
qemu_set_fd_handler(baum->brlapi_fd, baum_chr_read, NULL, baum);
return chr;
fail:
timer_free(baum->cellCount_timer);
brlapi__closeConnection(handle);
fail_handle:
g_free(handle);
g_free(chr);
g_free(baum);
return NULL;
}
static void char_braille_class_init(ObjectClass *oc, void *data)
{
ChardevClass *cc = CHARDEV_CLASS(oc);
cc->open = baum_chr_open;
cc->chr_write = baum_chr_write;
cc->chr_accept_input = baum_chr_accept_input;
}
static const TypeInfo char_braille_type_info = {
.name = TYPE_CHARDEV_BRAILLE,
.parent = TYPE_CHARDEV,
.instance_size = sizeof(BaumChardev),
.instance_finalize = char_braille_finalize,
.class_init = char_braille_class_init,
};
static void register_types(void)
{
type_register_static(&char_braille_type_info);
register_char_driver("braille", CHARDEV_BACKEND_KIND_BRAILLE, NULL,
chr_baum_init);
}
type_init(register_types);

View File

@@ -1,400 +0,0 @@
/*
* QEMU Cryptodev backend for QEMU cipher APIs
*
* Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
*
* Authors:
* Gonglei <arei.gonglei@huawei.com>
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*/
#include "qemu/osdep.h"
#include "sysemu/cryptodev.h"
#include "hw/boards.h"
#include "qapi/error.h"
#include "standard-headers/linux/virtio_crypto.h"
#include "crypto/cipher.h"
/**
* @TYPE_CRYPTODEV_BACKEND_BUILTIN:
* name of backend that uses QEMU cipher API
*/
#define TYPE_CRYPTODEV_BACKEND_BUILTIN "cryptodev-backend-builtin"
#define CRYPTODEV_BACKEND_BUILTIN(obj) \
OBJECT_CHECK(CryptoDevBackendBuiltin, \
(obj), TYPE_CRYPTODEV_BACKEND_BUILTIN)
typedef struct CryptoDevBackendBuiltin
CryptoDevBackendBuiltin;
typedef struct CryptoDevBackendBuiltinSession {
QCryptoCipher *cipher;
uint8_t direction; /* encryption or decryption */
uint8_t type; /* cipher? hash? aead? */
QTAILQ_ENTRY(CryptoDevBackendBuiltinSession) next;
} CryptoDevBackendBuiltinSession;
/* Max number of symmetric sessions */
#define MAX_NUM_SESSIONS 256
#define CRYPTODEV_BUITLIN_MAX_AUTH_KEY_LEN 512
#define CRYPTODEV_BUITLIN_MAX_CIPHER_KEY_LEN 64
struct CryptoDevBackendBuiltin {
CryptoDevBackend parent_obj;
CryptoDevBackendBuiltinSession *sessions[MAX_NUM_SESSIONS];
};
static void cryptodev_builtin_init(
CryptoDevBackend *backend, Error **errp)
{
/* Only support one queue */
int queues = backend->conf.peers.queues;
CryptoDevBackendClient *cc;
if (queues != 1) {
error_setg(errp,
"Only support one queue in cryptdov-builtin backend");
return;
}
cc = cryptodev_backend_new_client(
"cryptodev-builtin", NULL);
cc->info_str = g_strdup_printf("cryptodev-builtin0");
cc->queue_index = 0;
backend->conf.peers.ccs[0] = cc;
backend->conf.crypto_services =
1u << VIRTIO_CRYPTO_SERVICE_CIPHER |
1u << VIRTIO_CRYPTO_SERVICE_HASH |
1u << VIRTIO_CRYPTO_SERVICE_MAC;
backend->conf.cipher_algo_l = 1u << VIRTIO_CRYPTO_CIPHER_AES_CBC;
backend->conf.hash_algo = 1u << VIRTIO_CRYPTO_HASH_SHA1;
/*
* Set the Maximum length of crypto request.
* Why this value? Just avoid to overflow when
* memory allocation for each crypto request.
*/
backend->conf.max_size = LONG_MAX - sizeof(CryptoDevBackendSymOpInfo);
backend->conf.max_cipher_key_len = CRYPTODEV_BUITLIN_MAX_CIPHER_KEY_LEN;
backend->conf.max_auth_key_len = CRYPTODEV_BUITLIN_MAX_AUTH_KEY_LEN;
cryptodev_backend_set_ready(backend, true);
}
static int
cryptodev_builtin_get_unused_session_index(
CryptoDevBackendBuiltin *builtin)
{
size_t i;
for (i = 0; i < MAX_NUM_SESSIONS; i++) {
if (builtin->sessions[i] == NULL) {
return i;
}
}
return -1;
}
#define AES_KEYSIZE_128 16
#define AES_KEYSIZE_192 24
#define AES_KEYSIZE_256 32
#define AES_KEYSIZE_128_XTS AES_KEYSIZE_256
#define AES_KEYSIZE_256_XTS 64
static int
cryptodev_builtin_get_aes_algo(uint32_t key_len, int mode, Error **errp)
{
int algo;
if (key_len == AES_KEYSIZE_128) {
algo = QCRYPTO_CIPHER_ALG_AES_128;
} else if (key_len == AES_KEYSIZE_192) {
algo = QCRYPTO_CIPHER_ALG_AES_192;
} else if (key_len == AES_KEYSIZE_256) { /* equals AES_KEYSIZE_128_XTS */
if (mode == QCRYPTO_CIPHER_MODE_XTS) {
algo = QCRYPTO_CIPHER_ALG_AES_128;
} else {
algo = QCRYPTO_CIPHER_ALG_AES_256;
}
} else if (key_len == AES_KEYSIZE_256_XTS) {
if (mode == QCRYPTO_CIPHER_MODE_XTS) {
algo = QCRYPTO_CIPHER_ALG_AES_256;
} else {
goto err;
}
} else {
goto err;
}
return algo;
err:
error_setg(errp, "Unsupported key length :%u", key_len);
return -1;
}
static int cryptodev_builtin_create_cipher_session(
CryptoDevBackendBuiltin *builtin,
CryptoDevBackendSymSessionInfo *sess_info,
Error **errp)
{
int algo;
int mode;
QCryptoCipher *cipher;
int index;
CryptoDevBackendBuiltinSession *sess;
if (sess_info->op_type != VIRTIO_CRYPTO_SYM_OP_CIPHER) {
error_setg(errp, "Unsupported optype :%u", sess_info->op_type);
return -1;
}
index = cryptodev_builtin_get_unused_session_index(builtin);
if (index < 0) {
error_setg(errp, "Total number of sessions created exceeds %u",
MAX_NUM_SESSIONS);
return -1;
}
switch (sess_info->cipher_alg) {
case VIRTIO_CRYPTO_CIPHER_AES_ECB:
mode = QCRYPTO_CIPHER_MODE_ECB;
algo = cryptodev_builtin_get_aes_algo(sess_info->key_len,
mode, errp);
if (algo < 0) {
return -1;
}
break;
case VIRTIO_CRYPTO_CIPHER_AES_CBC:
mode = QCRYPTO_CIPHER_MODE_CBC;
algo = cryptodev_builtin_get_aes_algo(sess_info->key_len,
mode, errp);
if (algo < 0) {
return -1;
}
break;
case VIRTIO_CRYPTO_CIPHER_AES_CTR:
mode = QCRYPTO_CIPHER_MODE_CTR;
algo = cryptodev_builtin_get_aes_algo(sess_info->key_len,
mode, errp);
if (algo < 0) {
return -1;
}
break;
case VIRTIO_CRYPTO_CIPHER_AES_XTS:
mode = QCRYPTO_CIPHER_MODE_XTS;
algo = cryptodev_builtin_get_aes_algo(sess_info->key_len,
mode, errp);
if (algo < 0) {
return -1;
}
break;
case VIRTIO_CRYPTO_CIPHER_3DES_ECB:
mode = QCRYPTO_CIPHER_MODE_ECB;
algo = QCRYPTO_CIPHER_ALG_3DES;
break;
case VIRTIO_CRYPTO_CIPHER_3DES_CBC:
mode = QCRYPTO_CIPHER_MODE_CBC;
algo = QCRYPTO_CIPHER_ALG_3DES;
break;
case VIRTIO_CRYPTO_CIPHER_3DES_CTR:
mode = QCRYPTO_CIPHER_MODE_CTR;
algo = QCRYPTO_CIPHER_ALG_3DES;
break;
default:
error_setg(errp, "Unsupported cipher alg :%u",
sess_info->cipher_alg);
return -1;
}
cipher = qcrypto_cipher_new(algo, mode,
sess_info->cipher_key,
sess_info->key_len,
errp);
if (!cipher) {
return -1;
}
sess = g_new0(CryptoDevBackendBuiltinSession, 1);
sess->cipher = cipher;
sess->direction = sess_info->direction;
sess->type = sess_info->op_type;
builtin->sessions[index] = sess;
return index;
}
static int64_t cryptodev_builtin_sym_create_session(
CryptoDevBackend *backend,
CryptoDevBackendSymSessionInfo *sess_info,
uint32_t queue_index, Error **errp)
{
CryptoDevBackendBuiltin *builtin =
CRYPTODEV_BACKEND_BUILTIN(backend);
int64_t session_id = -1;
int ret;
switch (sess_info->op_code) {
case VIRTIO_CRYPTO_CIPHER_CREATE_SESSION:
ret = cryptodev_builtin_create_cipher_session(
builtin, sess_info, errp);
if (ret < 0) {
return ret;
} else {
session_id = ret;
}
break;
case VIRTIO_CRYPTO_HASH_CREATE_SESSION:
case VIRTIO_CRYPTO_MAC_CREATE_SESSION:
default:
error_setg(errp, "Unsupported opcode :%" PRIu32 "",
sess_info->op_code);
return -1;
}
return session_id;
}
static int cryptodev_builtin_sym_close_session(
CryptoDevBackend *backend,
uint64_t session_id,
uint32_t queue_index, Error **errp)
{
CryptoDevBackendBuiltin *builtin =
CRYPTODEV_BACKEND_BUILTIN(backend);
if (session_id >= MAX_NUM_SESSIONS ||
builtin->sessions[session_id] == NULL) {
error_setg(errp, "Cannot find a valid session id: %" PRIu64 "",
session_id);
return -1;
}
qcrypto_cipher_free(builtin->sessions[session_id]->cipher);
g_free(builtin->sessions[session_id]);
builtin->sessions[session_id] = NULL;
return 0;
}
static int cryptodev_builtin_sym_operation(
CryptoDevBackend *backend,
CryptoDevBackendSymOpInfo *op_info,
uint32_t queue_index, Error **errp)
{
CryptoDevBackendBuiltin *builtin =
CRYPTODEV_BACKEND_BUILTIN(backend);
CryptoDevBackendBuiltinSession *sess;
int ret;
if (op_info->session_id >= MAX_NUM_SESSIONS ||
builtin->sessions[op_info->session_id] == NULL) {
error_setg(errp, "Cannot find a valid session id: %" PRIu64 "",
op_info->session_id);
return -VIRTIO_CRYPTO_INVSESS;
}
if (op_info->op_type == VIRTIO_CRYPTO_SYM_OP_ALGORITHM_CHAINING) {
error_setg(errp,
"Algorithm chain is unsupported for cryptdoev-builtin");
return -VIRTIO_CRYPTO_NOTSUPP;
}
sess = builtin->sessions[op_info->session_id];
if (op_info->iv_len > 0) {
ret = qcrypto_cipher_setiv(sess->cipher, op_info->iv,
op_info->iv_len, errp);
if (ret < 0) {
return -VIRTIO_CRYPTO_ERR;
}
}
if (sess->direction == VIRTIO_CRYPTO_OP_ENCRYPT) {
ret = qcrypto_cipher_encrypt(sess->cipher, op_info->src,
op_info->dst, op_info->src_len, errp);
if (ret < 0) {
return -VIRTIO_CRYPTO_ERR;
}
} else {
ret = qcrypto_cipher_decrypt(sess->cipher, op_info->src,
op_info->dst, op_info->src_len, errp);
if (ret < 0) {
return -VIRTIO_CRYPTO_ERR;
}
}
return VIRTIO_CRYPTO_OK;
}
static void cryptodev_builtin_cleanup(
CryptoDevBackend *backend,
Error **errp)
{
CryptoDevBackendBuiltin *builtin =
CRYPTODEV_BACKEND_BUILTIN(backend);
size_t i;
int queues = backend->conf.peers.queues;
CryptoDevBackendClient *cc;
for (i = 0; i < MAX_NUM_SESSIONS; i++) {
if (builtin->sessions[i] != NULL) {
cryptodev_builtin_sym_close_session(
backend, i, 0, errp);
}
}
for (i = 0; i < queues; i++) {
cc = backend->conf.peers.ccs[i];
if (cc) {
cryptodev_backend_free_client(cc);
backend->conf.peers.ccs[i] = NULL;
}
}
cryptodev_backend_set_ready(backend, false);
}
static void
cryptodev_builtin_class_init(ObjectClass *oc, void *data)
{
CryptoDevBackendClass *bc = CRYPTODEV_BACKEND_CLASS(oc);
bc->init = cryptodev_builtin_init;
bc->cleanup = cryptodev_builtin_cleanup;
bc->create_session = cryptodev_builtin_sym_create_session;
bc->close_session = cryptodev_builtin_sym_close_session;
bc->do_sym_op = cryptodev_builtin_sym_operation;
}
static const TypeInfo cryptodev_builtin_info = {
.name = TYPE_CRYPTODEV_BACKEND_BUILTIN,
.parent = TYPE_CRYPTODEV_BACKEND,
.class_init = cryptodev_builtin_class_init,
.instance_size = sizeof(CryptoDevBackendBuiltin),
};
static void
cryptodev_builtin_register_types(void)
{
type_register_static(&cryptodev_builtin_info);
}
type_init(cryptodev_builtin_register_types);

View File

@@ -1,270 +0,0 @@
/*
* QEMU Crypto Device Implementation
*
* Copyright (c) 2016 HUAWEI TECHNOLOGIES CO., LTD.
*
* Authors:
* Gonglei <arei.gonglei@huawei.com>
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*/
#include "qemu/osdep.h"
#include "sysemu/cryptodev.h"
#include "hw/boards.h"
#include "qapi/error.h"
#include "qapi/visitor.h"
#include "qapi-visit.h"
#include "qemu/config-file.h"
#include "qom/object_interfaces.h"
#include "hw/virtio/virtio-crypto.h"
static QTAILQ_HEAD(, CryptoDevBackendClient) crypto_clients;
CryptoDevBackendClient *
cryptodev_backend_new_client(const char *model,
const char *name)
{
CryptoDevBackendClient *cc;
cc = g_malloc0(sizeof(CryptoDevBackendClient));
cc->model = g_strdup(model);
if (name) {
cc->name = g_strdup(name);
}
QTAILQ_INSERT_TAIL(&crypto_clients, cc, next);
return cc;
}
void cryptodev_backend_free_client(
CryptoDevBackendClient *cc)
{
QTAILQ_REMOVE(&crypto_clients, cc, next);
g_free(cc->name);
g_free(cc->model);
g_free(cc->info_str);
g_free(cc);
}
void cryptodev_backend_cleanup(
CryptoDevBackend *backend,
Error **errp)
{
CryptoDevBackendClass *bc =
CRYPTODEV_BACKEND_GET_CLASS(backend);
if (bc->cleanup) {
bc->cleanup(backend, errp);
}
}
int64_t cryptodev_backend_sym_create_session(
CryptoDevBackend *backend,
CryptoDevBackendSymSessionInfo *sess_info,
uint32_t queue_index, Error **errp)
{
CryptoDevBackendClass *bc =
CRYPTODEV_BACKEND_GET_CLASS(backend);
if (bc->create_session) {
return bc->create_session(backend, sess_info, queue_index, errp);
}
return -1;
}
int cryptodev_backend_sym_close_session(
CryptoDevBackend *backend,
uint64_t session_id,
uint32_t queue_index, Error **errp)
{
CryptoDevBackendClass *bc =
CRYPTODEV_BACKEND_GET_CLASS(backend);
if (bc->close_session) {
return bc->close_session(backend, session_id, queue_index, errp);
}
return -1;
}
static int cryptodev_backend_sym_operation(
CryptoDevBackend *backend,
CryptoDevBackendSymOpInfo *op_info,
uint32_t queue_index, Error **errp)
{
CryptoDevBackendClass *bc =
CRYPTODEV_BACKEND_GET_CLASS(backend);
if (bc->do_sym_op) {
return bc->do_sym_op(backend, op_info, queue_index, errp);
}
return -VIRTIO_CRYPTO_ERR;
}
int cryptodev_backend_crypto_operation(
CryptoDevBackend *backend,
void *opaque,
uint32_t queue_index, Error **errp)
{
VirtIOCryptoReq *req = opaque;
if (req->flags == CRYPTODEV_BACKEND_ALG_SYM) {
CryptoDevBackendSymOpInfo *op_info;
op_info = req->u.sym_op_info;
return cryptodev_backend_sym_operation(backend,
op_info, queue_index, errp);
} else {
error_setg(errp, "Unsupported cryptodev alg type: %" PRIu32 "",
req->flags);
return -VIRTIO_CRYPTO_NOTSUPP;
}
return -VIRTIO_CRYPTO_ERR;
}
static void
cryptodev_backend_get_queues(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
CryptoDevBackend *backend = CRYPTODEV_BACKEND(obj);
uint32_t value = backend->conf.peers.queues;
visit_type_uint32(v, name, &value, errp);
}
static void
cryptodev_backend_set_queues(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
CryptoDevBackend *backend = CRYPTODEV_BACKEND(obj);
Error *local_err = NULL;
uint32_t value;
visit_type_uint32(v, name, &value, &local_err);
if (local_err) {
goto out;
}
if (!value) {
error_setg(&local_err, "Property '%s.%s' doesn't take value '%"
PRIu32 "'", object_get_typename(obj), name, value);
goto out;
}
backend->conf.peers.queues = value;
out:
error_propagate(errp, local_err);
}
static void
cryptodev_backend_complete(UserCreatable *uc, Error **errp)
{
CryptoDevBackend *backend = CRYPTODEV_BACKEND(uc);
CryptoDevBackendClass *bc = CRYPTODEV_BACKEND_GET_CLASS(uc);
Error *local_err = NULL;
if (bc->init) {
bc->init(backend, &local_err);
if (local_err) {
goto out;
}
}
return;
out:
error_propagate(errp, local_err);
}
void cryptodev_backend_set_used(CryptoDevBackend *backend, bool used)
{
backend->is_used = used;
}
bool cryptodev_backend_is_used(CryptoDevBackend *backend)
{
return backend->is_used;
}
void cryptodev_backend_set_ready(CryptoDevBackend *backend, bool ready)
{
backend->ready = ready;
}
bool cryptodev_backend_is_ready(CryptoDevBackend *backend)
{
return backend->ready;
}
static bool
cryptodev_backend_can_be_deleted(UserCreatable *uc)
{
return !cryptodev_backend_is_used(CRYPTODEV_BACKEND(uc));
}
static void cryptodev_backend_instance_init(Object *obj)
{
object_property_add(obj, "queues", "uint32",
cryptodev_backend_get_queues,
cryptodev_backend_set_queues,
NULL, NULL, NULL);
/* Initialize devices' queues property to 1 */
object_property_set_int(obj, 1, "queues", NULL);
}
static void cryptodev_backend_finalize(Object *obj)
{
CryptoDevBackend *backend = CRYPTODEV_BACKEND(obj);
cryptodev_backend_cleanup(backend, NULL);
}
static void
cryptodev_backend_class_init(ObjectClass *oc, void *data)
{
UserCreatableClass *ucc = USER_CREATABLE_CLASS(oc);
ucc->complete = cryptodev_backend_complete;
ucc->can_be_deleted = cryptodev_backend_can_be_deleted;
QTAILQ_INIT(&crypto_clients);
}
static const TypeInfo cryptodev_backend_info = {
.name = TYPE_CRYPTODEV_BACKEND,
.parent = TYPE_OBJECT,
.instance_size = sizeof(CryptoDevBackend),
.instance_init = cryptodev_backend_instance_init,
.instance_finalize = cryptodev_backend_finalize,
.class_size = sizeof(CryptoDevBackendClass),
.class_init = cryptodev_backend_class_init,
.interfaces = (InterfaceInfo[]) {
{ TYPE_USER_CREATABLE },
{ }
}
};
static void
cryptodev_backend_register_types(void)
{
type_register_static(&cryptodev_backend_info);
}
type_init(cryptodev_backend_register_types);

View File

@@ -31,9 +31,8 @@ typedef struct HostMemoryBackendFile HostMemoryBackendFile;
struct HostMemoryBackendFile {
HostMemoryBackend parent_obj;
bool discard_data;
bool share;
char *mem_path;
uint64_t align;
};
static void
@@ -52,19 +51,27 @@ file_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
#ifndef CONFIG_LINUX
error_setg(errp, "-mem-path not supported on this host");
#else
if (!host_memory_backend_mr_inited(backend)) {
if (!memory_region_size(&backend->mr)) {
gchar *path;
backend->force_prealloc = mem_prealloc;
path = object_get_canonical_path(OBJECT(backend));
memory_region_init_ram_from_file(&backend->mr, OBJECT(backend),
path,
backend->size, fb->align, backend->share,
backend->size, fb->share,
fb->mem_path, errp);
g_free(path);
}
#endif
}
static void
file_backend_class_init(ObjectClass *oc, void *data)
{
HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
bc->alloc = file_backend_memory_alloc;
}
static char *get_mem_path(Object *o, Error **errp)
{
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
@@ -77,7 +84,7 @@ static void set_mem_path(Object *o, const char *str, Error **errp)
HostMemoryBackend *backend = MEMORY_BACKEND(o);
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
if (host_memory_backend_mr_inited(backend)) {
if (memory_region_size(&backend->mr)) {
error_setg(errp, "cannot change property value");
return;
}
@@ -85,82 +92,33 @@ static void set_mem_path(Object *o, const char *str, Error **errp)
fb->mem_path = g_strdup(str);
}
static bool file_memory_backend_get_discard_data(Object *o, Error **errp)
{
return MEMORY_BACKEND_FILE(o)->discard_data;
}
static void file_memory_backend_set_discard_data(Object *o, bool value,
Error **errp)
{
MEMORY_BACKEND_FILE(o)->discard_data = value;
}
static void file_memory_backend_get_align(Object *o, Visitor *v,
const char *name, void *opaque,
Error **errp)
static bool file_memory_backend_get_share(Object *o, Error **errp)
{
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
uint64_t val = fb->align;
visit_type_size(v, name, &val, errp);
return fb->share;
}
static void file_memory_backend_set_align(Object *o, Visitor *v,
const char *name, void *opaque,
Error **errp)
static void file_memory_backend_set_share(Object *o, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(o);
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(o);
Error *local_err = NULL;
uint64_t val;
if (host_memory_backend_mr_inited(backend)) {
error_setg(&local_err, "cannot change property value");
goto out;
}
visit_type_size(v, name, &val, &local_err);
if (local_err) {
goto out;
}
fb->align = val;
out:
error_propagate(errp, local_err);
}
static void file_backend_unparent(Object *obj)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
HostMemoryBackendFile *fb = MEMORY_BACKEND_FILE(obj);
if (host_memory_backend_mr_inited(backend) && fb->discard_data) {
void *ptr = memory_region_get_ram_ptr(&backend->mr);
uint64_t sz = memory_region_size(&backend->mr);
qemu_madvise(ptr, sz, QEMU_MADV_REMOVE);
if (memory_region_size(&backend->mr)) {
error_setg(errp, "cannot change property value");
return;
}
fb->share = value;
}
static void
file_backend_class_init(ObjectClass *oc, void *data)
file_backend_instance_init(Object *o)
{
HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
bc->alloc = file_backend_memory_alloc;
oc->unparent = file_backend_unparent;
object_class_property_add_bool(oc, "discard-data",
file_memory_backend_get_discard_data, file_memory_backend_set_discard_data,
&error_abort);
object_class_property_add_str(oc, "mem-path",
get_mem_path, set_mem_path,
&error_abort);
object_class_property_add(oc, "align", "int",
file_memory_backend_get_align,
file_memory_backend_set_align,
NULL, NULL, &error_abort);
object_property_add_bool(o, "share",
file_memory_backend_get_share,
file_memory_backend_set_share, NULL);
object_property_add_str(o, "mem-path", get_mem_path,
set_mem_path, NULL);
}
static void file_backend_instance_finalize(Object *o)
@@ -174,6 +132,7 @@ static const TypeInfo file_backend_info = {
.name = TYPE_MEMORY_BACKEND_FILE,
.parent = TYPE_MEMORY_BACKEND,
.class_init = file_backend_class_init,
.instance_init = file_backend_instance_init,
.instance_finalize = file_backend_instance_finalize,
.instance_size = sizeof(HostMemoryBackendFile),
};


@@ -1,170 +0,0 @@
/*
* QEMU host memfd memory backend
*
* Copyright (C) 2018 Red Hat Inc
*
* Authors:
* Marc-André Lureau <marcandre.lureau@redhat.com>
*
* This work is licensed under the terms of the GNU GPL, version 2 or later.
* See the COPYING file in the top-level directory.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "sysemu/hostmem.h"
#include "sysemu/sysemu.h"
#include "qom/object_interfaces.h"
#include "qemu/memfd.h"
#include "qapi/error.h"
#define TYPE_MEMORY_BACKEND_MEMFD "memory-backend-memfd"
#define MEMORY_BACKEND_MEMFD(obj) \
OBJECT_CHECK(HostMemoryBackendMemfd, (obj), TYPE_MEMORY_BACKEND_MEMFD)
typedef struct HostMemoryBackendMemfd HostMemoryBackendMemfd;
struct HostMemoryBackendMemfd {
HostMemoryBackend parent_obj;
bool hugetlb;
uint64_t hugetlbsize;
bool seal;
};
static void
memfd_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
{
HostMemoryBackendMemfd *m = MEMORY_BACKEND_MEMFD(backend);
char *name;
int fd;
if (!backend->size) {
error_setg(errp, "can't create backend with size 0");
return;
}
if (host_memory_backend_mr_inited(backend)) {
return;
}
backend->force_prealloc = mem_prealloc;
fd = qemu_memfd_create(TYPE_MEMORY_BACKEND_MEMFD, backend->size,
m->hugetlb, m->hugetlbsize, m->seal ?
F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL : 0,
errp);
if (fd == -1) {
return;
}
name = object_get_canonical_path(OBJECT(backend));
memory_region_init_ram_from_fd(&backend->mr, OBJECT(backend),
name, backend->size, true, fd, errp);
g_free(name);
}
static bool
memfd_backend_get_hugetlb(Object *o, Error **errp)
{
return MEMORY_BACKEND_MEMFD(o)->hugetlb;
}
static void
memfd_backend_set_hugetlb(Object *o, bool value, Error **errp)
{
MEMORY_BACKEND_MEMFD(o)->hugetlb = value;
}
static void
memfd_backend_set_hugetlbsize(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
HostMemoryBackendMemfd *m = MEMORY_BACKEND_MEMFD(obj);
Error *local_err = NULL;
uint64_t value;
if (host_memory_backend_mr_inited(MEMORY_BACKEND(obj))) {
error_setg(&local_err, "cannot change property value");
goto out;
}
visit_type_size(v, name, &value, &local_err);
if (local_err) {
goto out;
}
if (!value) {
error_setg(&local_err, "Property '%s.%s' doesn't take value '%"
PRIu64 "'", object_get_typename(obj), name, value);
goto out;
}
m->hugetlbsize = value;
out:
error_propagate(errp, local_err);
}
static void
memfd_backend_get_hugetlbsize(Object *obj, Visitor *v, const char *name,
void *opaque, Error **errp)
{
HostMemoryBackendMemfd *m = MEMORY_BACKEND_MEMFD(obj);
uint64_t value = m->hugetlbsize;
visit_type_size(v, name, &value, errp);
}
static bool
memfd_backend_get_seal(Object *o, Error **errp)
{
return MEMORY_BACKEND_MEMFD(o)->seal;
}
static void
memfd_backend_set_seal(Object *o, bool value, Error **errp)
{
MEMORY_BACKEND_MEMFD(o)->seal = value;
}
static void
memfd_backend_instance_init(Object *obj)
{
HostMemoryBackendMemfd *m = MEMORY_BACKEND_MEMFD(obj);
/* default to sealed file */
m->seal = true;
}
static void
memfd_backend_class_init(ObjectClass *oc, void *data)
{
HostMemoryBackendClass *bc = MEMORY_BACKEND_CLASS(oc);
bc->alloc = memfd_backend_memory_alloc;
object_class_property_add_bool(oc, "hugetlb",
memfd_backend_get_hugetlb,
memfd_backend_set_hugetlb,
&error_abort);
object_class_property_add(oc, "hugetlbsize", "int",
memfd_backend_get_hugetlbsize,
memfd_backend_set_hugetlbsize,
NULL, NULL, &error_abort);
object_class_property_add_bool(oc, "seal",
memfd_backend_get_seal,
memfd_backend_set_seal,
&error_abort);
}
static const TypeInfo memfd_backend_info = {
.name = TYPE_MEMORY_BACKEND_MEMFD,
.parent = TYPE_MEMORY_BACKEND,
.instance_init = memfd_backend_instance_init,
.class_init = memfd_backend_class_init,
.instance_size = sizeof(HostMemoryBackendMemfd),
};
static void register_types(void)
{
type_register_static(&memfd_backend_info);
}
type_init(register_types);
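A minimal standalone sketch of the seal set passed to qemu_memfd_create() above, assuming a Linux host where glibc (2.27 or newer) exposes memfd_create(2); this is illustrative only, not QEMU code:
#define _GNU_SOURCE
#include <stdio.h>
#include <unistd.h>
#include <fcntl.h>
#include <sys/mman.h>
int main(void)
{
    /* Create an anonymous memfd that may be sealed later. */
    int fd = memfd_create("demo", MFD_ALLOW_SEALING);
    if (fd < 0) {
        perror("memfd_create");
        return 1;
    }
    /* Give it a size before sealing against resizing. */
    if (ftruncate(fd, 4096) < 0) {
        perror("ftruncate");
        return 1;
    }
    /* Same seal combination as above: no grow, no shrink, no further seals. */
    if (fcntl(fd, F_ADD_SEALS, F_SEAL_GROW | F_SEAL_SHRINK | F_SEAL_SEAL) < 0) {
        perror("F_ADD_SEALS");
        return 1;
    }
    printf("sealed memfd, fd=%d\n", fd);
    close(fd);
    return 0;
}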


@@ -28,8 +28,8 @@ ram_backend_memory_alloc(HostMemoryBackend *backend, Error **errp)
}
path = object_get_canonical_path_component(OBJECT(backend));
memory_region_init_ram_shared_nomigrate(&backend->mr, OBJECT(backend), path,
backend->size, backend->share, errp);
memory_region_init_ram(&backend->mr, OBJECT(backend), path,
backend->size, errp);
g_free(path);
}


@@ -14,6 +14,7 @@
#include "hw/boards.h"
#include "qapi/error.h"
#include "qapi/visitor.h"
#include "qapi-types.h"
#include "qapi-visit.h"
#include "qemu/config-file.h"
#include "qom/object_interfaces.h"
@@ -44,7 +45,7 @@ host_memory_backend_set_size(Object *obj, Visitor *v, const char *name,
Error *local_err = NULL;
uint64_t value;
if (host_memory_backend_mr_inited(backend)) {
if (memory_region_size(&backend->mr)) {
error_setg(&local_err, "cannot change property value");
goto out;
}
@@ -145,7 +146,7 @@ static void host_memory_backend_set_merge(Object *obj, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (!host_memory_backend_mr_inited(backend)) {
if (!memory_region_size(&backend->mr)) {
backend->merge = value;
return;
}
@@ -171,7 +172,7 @@ static void host_memory_backend_set_dump(Object *obj, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (!host_memory_backend_mr_inited(backend)) {
if (!memory_region_size(&backend->mr)) {
backend->dump = value;
return;
}
@@ -196,7 +197,6 @@ static bool host_memory_backend_get_prealloc(Object *obj, Error **errp)
static void host_memory_backend_set_prealloc(Object *obj, bool value,
Error **errp)
{
Error *local_err = NULL;
HostMemoryBackend *backend = MEMORY_BACKEND(obj);
if (backend->force_prealloc) {
@@ -207,7 +207,7 @@ static void host_memory_backend_set_prealloc(Object *obj, bool value,
}
}
if (!host_memory_backend_mr_inited(backend)) {
if (!memory_region_size(&backend->mr)) {
backend->prealloc = value;
return;
}
@@ -217,11 +217,7 @@ static void host_memory_backend_set_prealloc(Object *obj, bool value,
void *ptr = memory_region_get_ram_ptr(&backend->mr);
uint64_t sz = memory_region_size(&backend->mr);
os_mem_prealloc(fd, ptr, sz, smp_cpus, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
os_mem_prealloc(fd, ptr, sz);
backend->prealloc = true;
}
}
@@ -234,31 +230,32 @@ static void host_memory_backend_init(Object *obj)
backend->merge = machine_mem_merge(machine);
backend->dump = machine_dump_guest_core(machine);
backend->prealloc = mem_prealloc;
}
bool host_memory_backend_mr_inited(HostMemoryBackend *backend)
{
/*
* NOTE: We forbid zero-length memory backend, so here zero means
* "we haven't inited the backend memory region yet".
*/
return memory_region_size(&backend->mr) != 0;
object_property_add_bool(obj, "merge",
host_memory_backend_get_merge,
host_memory_backend_set_merge, NULL);
object_property_add_bool(obj, "dump",
host_memory_backend_get_dump,
host_memory_backend_set_dump, NULL);
object_property_add_bool(obj, "prealloc",
host_memory_backend_get_prealloc,
host_memory_backend_set_prealloc, NULL);
object_property_add(obj, "size", "int",
host_memory_backend_get_size,
host_memory_backend_set_size, NULL, NULL, NULL);
object_property_add(obj, "host-nodes", "int",
host_memory_backend_get_host_nodes,
host_memory_backend_set_host_nodes, NULL, NULL, NULL);
object_property_add_enum(obj, "policy", "HostMemPolicy",
HostMemPolicy_lookup,
host_memory_backend_get_policy,
host_memory_backend_set_policy, NULL);
}
MemoryRegion *
host_memory_backend_get_memory(HostMemoryBackend *backend, Error **errp)
{
return host_memory_backend_mr_inited(backend) ? &backend->mr : NULL;
}
void host_memory_backend_set_mapped(HostMemoryBackend *backend, bool mapped)
{
backend->is_mapped = mapped;
}
bool host_memory_backend_is_mapped(HostMemoryBackend *backend)
{
return backend->is_mapped;
return memory_region_size(&backend->mr) ? &backend->mr : NULL;
}
static void
@@ -273,7 +270,8 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
if (bc->alloc) {
bc->alloc(backend, &local_err);
if (local_err) {
goto out;
error_propagate(errp, local_err);
return;
}
ptr = memory_region_get_ram_ptr(&backend->mr);
@@ -303,7 +301,7 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
return;
} else if (maxnode == 0 && backend->policy != MPOL_DEFAULT) {
error_setg(errp, "host-nodes must be set for policy %s",
HostMemPolicy_str(backend->policy));
HostMemPolicy_lookup[backend->policy]);
return;
}
@@ -329,63 +327,24 @@ host_memory_backend_memory_complete(UserCreatable *uc, Error **errp)
* specified NUMA policy in place.
*/
if (backend->prealloc) {
os_mem_prealloc(memory_region_get_fd(&backend->mr), ptr, sz,
smp_cpus, &local_err);
if (local_err) {
goto out;
}
os_mem_prealloc(memory_region_get_fd(&backend->mr), ptr, sz);
}
}
out:
error_propagate(errp, local_err);
}
static bool
host_memory_backend_can_be_deleted(UserCreatable *uc)
host_memory_backend_can_be_deleted(UserCreatable *uc, Error **errp)
{
if (host_memory_backend_is_mapped(MEMORY_BACKEND(uc))) {
MemoryRegion *mr;
mr = host_memory_backend_get_memory(MEMORY_BACKEND(uc), errp);
if (memory_region_is_mapped(mr)) {
return false;
} else {
return true;
}
}
static char *get_id(Object *o, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(o);
return g_strdup(backend->id);
}
static void set_id(Object *o, const char *str, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(o);
if (backend->id) {
error_setg(errp, "cannot change property value");
return;
}
backend->id = g_strdup(str);
}
static bool host_memory_backend_get_share(Object *o, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(o);
return backend->share;
}
static void host_memory_backend_set_share(Object *o, bool value, Error **errp)
{
HostMemoryBackend *backend = MEMORY_BACKEND(o);
if (host_memory_backend_mr_inited(backend)) {
error_setg(errp, "cannot change property value");
return;
}
backend->share = value;
}
static void
host_memory_backend_class_init(ObjectClass *oc, void *data)
{
@@ -393,38 +352,6 @@ host_memory_backend_class_init(ObjectClass *oc, void *data)
ucc->complete = host_memory_backend_memory_complete;
ucc->can_be_deleted = host_memory_backend_can_be_deleted;
object_class_property_add_bool(oc, "merge",
host_memory_backend_get_merge,
host_memory_backend_set_merge, &error_abort);
object_class_property_add_bool(oc, "dump",
host_memory_backend_get_dump,
host_memory_backend_set_dump, &error_abort);
object_class_property_add_bool(oc, "prealloc",
host_memory_backend_get_prealloc,
host_memory_backend_set_prealloc, &error_abort);
object_class_property_add(oc, "size", "int",
host_memory_backend_get_size,
host_memory_backend_set_size,
NULL, NULL, &error_abort);
object_class_property_add(oc, "host-nodes", "int",
host_memory_backend_get_host_nodes,
host_memory_backend_set_host_nodes,
NULL, NULL, &error_abort);
object_class_property_add_enum(oc, "policy", "HostMemPolicy",
&HostMemPolicy_lookup,
host_memory_backend_get_policy,
host_memory_backend_set_policy, &error_abort);
object_class_property_add_str(oc, "id", get_id, set_id, &error_abort);
object_class_property_add_bool(oc, "share",
host_memory_backend_get_share, host_memory_backend_set_share,
&error_abort);
}
static void host_memory_backend_finalize(Object *o)
{
HostMemoryBackend *backend = MEMORY_BACKEND(o);
g_free(backend->id);
}
static const TypeInfo host_memory_backend_info = {
@@ -435,7 +362,6 @@ static const TypeInfo host_memory_backend_info = {
.class_init = host_memory_backend_class_init,
.instance_size = sizeof(HostMemoryBackend),
.instance_init = host_memory_backend_init,
.instance_finalize = host_memory_backend_finalize,
.interfaces = (InterfaceInfo[]) {
{ TYPE_USER_CREATABLE },
{ }

backends/msmouse.c Normal file

@@ -0,0 +1,93 @@
/*
* QEMU Microsoft serial mouse emulation
*
* Copyright (c) 2008 Lubomir Rintel
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "sysemu/char.h"
#include "ui/console.h"
#define MSMOUSE_LO6(n) ((n) & 0x3f)
#define MSMOUSE_HI2(n) (((n) & 0xc0) >> 6)
static void msmouse_event(void *opaque,
int dx, int dy, int dz, int buttons_state)
{
CharDriverState *chr = (CharDriverState *)opaque;
unsigned char bytes[4] = { 0x40, 0x00, 0x00, 0x00 };
/* Movement deltas */
bytes[0] |= (MSMOUSE_HI2(dy) << 2) | MSMOUSE_HI2(dx);
bytes[1] |= MSMOUSE_LO6(dx);
bytes[2] |= MSMOUSE_LO6(dy);
/* Buttons */
bytes[0] |= (buttons_state & 0x01 ? 0x20 : 0x00);
bytes[0] |= (buttons_state & 0x02 ? 0x10 : 0x00);
bytes[3] |= (buttons_state & 0x04 ? 0x20 : 0x00);
/* We always send the full packet, so that we do not have to keep track
of previous state of the middle button. This can potentially confuse
some very old drivers for two button mice though. */
qemu_chr_be_write(chr, bytes, 4);
}
static int msmouse_chr_write (struct CharDriverState *s, const uint8_t *buf, int len)
{
/* Ignore writes to mouse port */
return len;
}
static void msmouse_chr_close (struct CharDriverState *chr)
{
g_free (chr);
}
static CharDriverState *qemu_chr_open_msmouse(const char *id,
ChardevBackend *backend,
ChardevReturn *ret,
Error **errp)
{
ChardevCommon *common = backend->u.msmouse.data;
CharDriverState *chr;
chr = qemu_chr_alloc(common, errp);
if (!chr) {
return NULL;
}
chr->chr_write = msmouse_chr_write;
chr->chr_close = msmouse_chr_close;
chr->explicit_be_open = true;
qemu_add_mouse_event_handler(msmouse_event, chr, 0, "QEMU Microsoft Mouse");
return chr;
}
static void register_types(void)
{
register_char_driver("msmouse", CHARDEV_BACKEND_KIND_MSMOUSE, NULL,
qemu_chr_open_msmouse);
}
type_init(register_types);
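The byte layout built in msmouse_event() above can be exercised outside QEMU. A minimal sketch, reusing only the MSMOUSE_* macros shown above; the encode() helper and the sample deltas are hypothetical:
#include <stdio.h>
#include <stdint.h>
#define MSMOUSE_LO6(n) ((n) & 0x3f)
#define MSMOUSE_HI2(n) (((n) & 0xc0) >> 6)
static void encode(int dx, int dy, int buttons, uint8_t bytes[4])
{
    /* Byte 0: sync bit, button bits, high 2 bits of each delta. */
    bytes[0] = 0x40 | (MSMOUSE_HI2(dy) << 2) | MSMOUSE_HI2(dx);
    bytes[1] = MSMOUSE_LO6(dx);              /* low 6 bits of dx */
    bytes[2] = MSMOUSE_LO6(dy);              /* low 6 bits of dy */
    bytes[3] = 0x00;
    bytes[0] |= (buttons & 0x01) ? 0x20 : 0x00;  /* left button   */
    bytes[0] |= (buttons & 0x02) ? 0x10 : 0x00;  /* right button  */
    bytes[3] |= (buttons & 0x04) ? 0x20 : 0x00;  /* middle button */
}
int main(void)
{
    uint8_t b[4];
    encode(-3, 5, 0x01, b);                  /* small move, left button held */
    printf("%02x %02x %02x %02x\n", b[0], b[1], b[2], b[3]);
    return 0;
}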


@@ -12,9 +12,10 @@
#include "qemu/osdep.h"
#include "sysemu/rng.h"
#include "chardev/char-fe.h"
#include "sysemu/char.h"
#include "qapi/error.h"
#include "qapi/qmp/qerror.h"
#include "hw/qdev.h" /* just for DEFINE_PROP_CHR */
#define TYPE_RNG_EGD "rng-egd"
#define RNG_EGD(obj) OBJECT_CHECK(RngEgd, (obj), TYPE_RNG_EGD)
@@ -23,7 +24,7 @@ typedef struct RngEgd
{
RngBackend parent;
CharBackend chr;
CharDriverState *chr;
char *chr_name;
} RngEgd;
@@ -40,9 +41,7 @@ static void rng_egd_request_entropy(RngBackend *b, RngRequest *req)
header[0] = 0x02;
header[1] = len;
/* XXX this blocks the entire thread. Rewrite to use
* qemu_chr_fe_write and background I/O callbacks */
qemu_chr_fe_write_all(&s->chr, header, sizeof(header));
qemu_chr_fe_write(s->chr, header, sizeof(header));
size -= len;
}
@@ -86,7 +85,6 @@ static void rng_egd_chr_read(void *opaque, const uint8_t *buf, int size)
static void rng_egd_opened(RngBackend *b, Error **errp)
{
RngEgd *s = RNG_EGD(b);
Chardev *chr;
if (s->chr_name == NULL) {
error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
@@ -94,19 +92,21 @@ static void rng_egd_opened(RngBackend *b, Error **errp)
return;
}
chr = qemu_chr_find(s->chr_name);
if (chr == NULL) {
s->chr = qemu_chr_find(s->chr_name);
if (s->chr == NULL) {
error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
"Device '%s' not found", s->chr_name);
return;
}
if (!qemu_chr_fe_init(&s->chr, chr, errp)) {
if (qemu_chr_fe_claim(s->chr) != 0) {
error_setg(errp, QERR_DEVICE_IN_USE, s->chr_name);
return;
}
/* FIXME we should resubmit pending requests when the CDS reconnects. */
qemu_chr_fe_set_handlers(&s->chr, rng_egd_chr_can_read,
rng_egd_chr_read, NULL, NULL, s, NULL, true);
qemu_chr_add_handlers(s->chr, rng_egd_chr_can_read, rng_egd_chr_read,
NULL, s);
}
static void rng_egd_set_chardev(Object *obj, const char *value, Error **errp)
@@ -125,10 +125,9 @@ static void rng_egd_set_chardev(Object *obj, const char *value, Error **errp)
static char *rng_egd_get_chardev(Object *obj, Error **errp)
{
RngEgd *s = RNG_EGD(obj);
Chardev *chr = qemu_chr_fe_get_driver(&s->chr);
if (chr && chr->label) {
return g_strdup(chr->label);
if (s->chr && s->chr->label) {
return g_strdup(s->chr->label);
}
return NULL;
@@ -145,7 +144,11 @@ static void rng_egd_finalize(Object *obj)
{
RngEgd *s = RNG_EGD(obj);
qemu_chr_fe_deinit(&s->chr, false);
if (s->chr) {
qemu_chr_add_handlers(s->chr, NULL, NULL, NULL, NULL);
qemu_chr_fe_release(s->chr);
}
g_free(s->chr_name);
}


@@ -17,7 +17,7 @@
#include "qapi/qmp/qerror.h"
#include "qemu/main-loop.h"
struct RngRandom
struct RndRandom
{
RngBackend parent;
@@ -34,7 +34,7 @@ struct RngRandom
static void entropy_available(void *opaque)
{
RngRandom *s = RNG_RANDOM(opaque);
RndRandom *s = RNG_RANDOM(opaque);
while (!QSIMPLEQ_EMPTY(&s->parent.requests)) {
RngRequest *req = QSIMPLEQ_FIRST(&s->parent.requests);
@@ -57,7 +57,7 @@ static void entropy_available(void *opaque)
static void rng_random_request_entropy(RngBackend *b, RngRequest *req)
{
RngRandom *s = RNG_RANDOM(b);
RndRandom *s = RNG_RANDOM(b);
if (QSIMPLEQ_EMPTY(&s->parent.requests)) {
/* If there are no pending requests yet, we need to
@@ -68,7 +68,7 @@ static void rng_random_request_entropy(RngBackend *b, RngRequest *req)
static void rng_random_opened(RngBackend *b, Error **errp)
{
RngRandom *s = RNG_RANDOM(b);
RndRandom *s = RNG_RANDOM(b);
if (s->filename == NULL) {
error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
@@ -83,7 +83,7 @@ static void rng_random_opened(RngBackend *b, Error **errp)
static char *rng_random_get_filename(Object *obj, Error **errp)
{
RngRandom *s = RNG_RANDOM(obj);
RndRandom *s = RNG_RANDOM(obj);
return g_strdup(s->filename);
}
@@ -92,7 +92,7 @@ static void rng_random_set_filename(Object *obj, const char *filename,
Error **errp)
{
RngBackend *b = RNG_BACKEND(obj);
RngRandom *s = RNG_RANDOM(obj);
RndRandom *s = RNG_RANDOM(obj);
if (b->opened) {
error_setg(errp, QERR_PERMISSION_DENIED);
@@ -105,7 +105,7 @@ static void rng_random_set_filename(Object *obj, const char *filename,
static void rng_random_init(Object *obj)
{
RngRandom *s = RNG_RANDOM(obj);
RndRandom *s = RNG_RANDOM(obj);
object_property_add_str(obj, "filename",
rng_random_get_filename,
@@ -118,7 +118,7 @@ static void rng_random_init(Object *obj)
static void rng_random_finalize(Object *obj)
{
RngRandom *s = RNG_RANDOM(obj);
RndRandom *s = RNG_RANDOM(obj);
if (s->fd != -1) {
qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
@@ -139,7 +139,7 @@ static void rng_random_class_init(ObjectClass *klass, void *data)
static const TypeInfo rng_random_info = {
.name = TYPE_RNG_RANDOM,
.parent = TYPE_RNG_BACKEND,
.instance_size = sizeof(RngRandom),
.instance_size = sizeof(RndRandom),
.class_init = rng_random_class_init,
.instance_init = rng_random_init,
.instance_finalize = rng_random_finalize,


@@ -25,23 +25,18 @@
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "chardev/char.h"
#include "sysemu/char.h"
#define BUF_SIZE 32
typedef struct {
Chardev parent;
CharDriverState *chr;
uint8_t in_buf[32];
int in_buf_used;
} TestdevChardev;
#define TYPE_CHARDEV_TESTDEV "chardev-testdev"
#define TESTDEV_CHARDEV(obj) \
OBJECT_CHECK(TestdevChardev, (obj), TYPE_CHARDEV_TESTDEV)
} TestdevCharState;
/* Try to interpret a whole incoming packet */
static int testdev_eat_packet(TestdevChardev *testdev)
static int testdev_eat_packet(TestdevCharState *testdev)
{
const uint8_t *cur = testdev->in_buf;
int len = testdev->in_buf_used;
@@ -82,9 +77,9 @@ static int testdev_eat_packet(TestdevChardev *testdev)
}
/* The other end is writing some data. Store it and try to interpret */
static int testdev_chr_write(Chardev *chr, const uint8_t *buf, int len)
static int testdev_write(CharDriverState *chr, const uint8_t *buf, int len)
{
TestdevChardev *testdev = TESTDEV_CHARDEV(chr);
TestdevCharState *testdev = chr->opaque;
int tocopy, eaten, orig_len = len;
while (len) {
@@ -107,23 +102,35 @@ static int testdev_chr_write(Chardev *chr, const uint8_t *buf, int len)
return orig_len;
}
static void char_testdev_class_init(ObjectClass *oc, void *data)
static void testdev_close(struct CharDriverState *chr)
{
ChardevClass *cc = CHARDEV_CLASS(oc);
TestdevCharState *testdev = chr->opaque;
cc->chr_write = testdev_chr_write;
g_free(testdev);
}
static const TypeInfo char_testdev_type_info = {
.name = TYPE_CHARDEV_TESTDEV,
.parent = TYPE_CHARDEV,
.instance_size = sizeof(TestdevChardev),
.class_init = char_testdev_class_init,
};
static CharDriverState *chr_testdev_init(const char *id,
ChardevBackend *backend,
ChardevReturn *ret,
Error **errp)
{
TestdevCharState *testdev;
CharDriverState *chr;
testdev = g_new0(TestdevCharState, 1);
testdev->chr = chr = g_new0(CharDriverState, 1);
chr->opaque = testdev;
chr->chr_write = testdev_write;
chr->chr_close = testdev_close;
return chr;
}
static void register_types(void)
{
type_register_static(&char_testdev_type_info);
register_char_driver("testdev", CHARDEV_BACKEND_KIND_TESTDEV, NULL,
chr_testdev_init);
}
type_init(register_types);


@@ -15,193 +15,184 @@
#include "qemu/osdep.h"
#include "sysemu/tpm_backend.h"
#include "qapi/error.h"
#include "qapi/qmp/qerror.h"
#include "sysemu/tpm.h"
#include "qemu/thread.h"
#include "qemu/main-loop.h"
#include "block/thread-pool.h"
#include "qemu/error-report.h"
static void tpm_backend_request_completed(void *opaque, int ret)
{
TPMBackend *s = TPM_BACKEND(opaque);
TPMIfClass *tic = TPM_IF_GET_CLASS(s->tpmif);
tic->request_completed(s->tpmif, ret);
/* no need for atomic, as long as the BQL is taken */
s->cmd = NULL;
object_unref(OBJECT(s));
}
static int tpm_backend_worker_thread(gpointer data)
{
TPMBackend *s = TPM_BACKEND(data);
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
Error *err = NULL;
k->handle_request(s, s->cmd, &err);
if (err) {
error_report_err(err);
return -1;
}
return 0;
}
void tpm_backend_finish_sync(TPMBackend *s)
{
while (s->cmd) {
aio_poll(qemu_get_aio_context(), true);
}
}
#include "sysemu/tpm_backend_int.h"
enum TpmType tpm_backend_get_type(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->type;
return k->ops->type;
}
int tpm_backend_init(TPMBackend *s, TPMIf *tpmif, Error **errp)
const char *tpm_backend_get_desc(TPMBackend *s)
{
if (s->tpmif) {
error_setg(errp, "TPM backend '%s' is already initialized", s->id);
return -1;
}
s->tpmif = tpmif;
object_ref(OBJECT(tpmif));
s->had_startup_error = false;
return 0;
}
int tpm_backend_startup_tpm(TPMBackend *s, size_t buffersize)
{
int res = 0;
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
/* terminate a running TPM */
tpm_backend_finish_sync(s);
return k->ops->desc();
}
res = k->startup_tpm ? k->startup_tpm(s, buffersize) : 0;
void tpm_backend_destroy(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
s->had_startup_error = (res != 0);
k->ops->destroy(s);
}
return res;
int tpm_backend_init(TPMBackend *s, TPMState *state,
TPMRecvDataCB *datacb)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->init(s, state, datacb);
}
int tpm_backend_startup_tpm(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->startup_tpm(s);
}
bool tpm_backend_had_startup_error(TPMBackend *s)
{
return s->had_startup_error;
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->ops->had_startup_error(s);
}
void tpm_backend_deliver_request(TPMBackend *s, TPMBackendCmd *cmd)
size_t tpm_backend_realloc_buffer(TPMBackend *s, TPMSizedBuffer *sb)
{
ThreadPool *pool = aio_get_thread_pool(qemu_get_aio_context());
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
if (s->cmd != NULL) {
error_report("There is a TPM request pending");
return;
}
return k->ops->realloc_buffer(sb);
}
s->cmd = cmd;
object_ref(OBJECT(s));
thread_pool_submit_aio(pool, tpm_backend_worker_thread, s,
tpm_backend_request_completed, s);
void tpm_backend_deliver_request(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
k->ops->deliver_request(s);
}
void tpm_backend_reset(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
if (k->reset) {
k->reset(s);
}
tpm_backend_finish_sync(s);
s->had_startup_error = false;
k->ops->reset(s);
}
void tpm_backend_cancel_cmd(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
k->cancel_cmd(s);
k->ops->cancel_cmd(s);
}
bool tpm_backend_get_tpm_established_flag(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->get_tpm_established_flag ?
k->get_tpm_established_flag(s) : false;
return k->ops->get_tpm_established_flag(s);
}
int tpm_backend_reset_tpm_established_flag(TPMBackend *s, uint8_t locty)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->reset_tpm_established_flag ?
k->reset_tpm_established_flag(s, locty) : 0;
return k->ops->reset_tpm_established_flag(s, locty);
}
TPMVersion tpm_backend_get_tpm_version(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->get_tpm_version(s);
return k->ops->get_tpm_version(s);
}
size_t tpm_backend_get_buffer_size(TPMBackend *s)
{
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
return k->get_buffer_size(s);
}
TPMInfo *tpm_backend_query_tpm(TPMBackend *s)
{
TPMInfo *info = g_new0(TPMInfo, 1);
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
TPMIfClass *tic = TPM_IF_GET_CLASS(s->tpmif);
info->id = g_strdup(s->id);
info->model = tic->model;
info->options = k->get_tpm_options(s);
return info;
}
static void tpm_backend_instance_finalize(Object *obj)
static bool tpm_backend_prop_get_opened(Object *obj, Error **errp)
{
TPMBackend *s = TPM_BACKEND(obj);
object_unref(OBJECT(s->tpmif));
g_free(s->id);
return s->opened;
}
void tpm_backend_open(TPMBackend *s, Error **errp)
{
object_property_set_bool(OBJECT(s), true, "opened", errp);
}
static void tpm_backend_prop_set_opened(Object *obj, bool value, Error **errp)
{
TPMBackend *s = TPM_BACKEND(obj);
TPMBackendClass *k = TPM_BACKEND_GET_CLASS(s);
Error *local_err = NULL;
if (value == s->opened) {
return;
}
if (!value && s->opened) {
error_setg(errp, QERR_PERMISSION_DENIED);
return;
}
if (k->opened) {
k->opened(s, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
}
s->opened = true;
}
static void tpm_backend_instance_init(Object *obj)
{
object_property_add_bool(obj, "opened",
tpm_backend_prop_get_opened,
tpm_backend_prop_set_opened,
NULL);
}
void tpm_backend_thread_deliver_request(TPMBackendThread *tbt)
{
g_thread_pool_push(tbt->pool, (gpointer)TPM_BACKEND_CMD_PROCESS_CMD, NULL);
}
void tpm_backend_thread_create(TPMBackendThread *tbt,
GFunc func, gpointer user_data)
{
if (!tbt->pool) {
tbt->pool = g_thread_pool_new(func, user_data, 1, TRUE, NULL);
g_thread_pool_push(tbt->pool, (gpointer)TPM_BACKEND_CMD_INIT, NULL);
}
}
void tpm_backend_thread_end(TPMBackendThread *tbt)
{
if (tbt->pool) {
g_thread_pool_push(tbt->pool, (gpointer)TPM_BACKEND_CMD_END, NULL);
g_thread_pool_free(tbt->pool, FALSE, TRUE);
tbt->pool = NULL;
}
}
static const TypeInfo tpm_backend_info = {
.name = TYPE_TPM_BACKEND,
.parent = TYPE_OBJECT,
.instance_size = sizeof(TPMBackend),
.instance_finalize = tpm_backend_instance_finalize,
.instance_init = tpm_backend_instance_init,
.class_size = sizeof(TPMBackendClass),
.abstract = true,
};
static const TypeInfo tpm_if_info = {
.name = TYPE_TPM_IF,
.parent = TYPE_INTERFACE,
.class_size = sizeof(TPMIfClass),
};
static void register_types(void)
{
type_register_static(&tpm_backend_info);
type_register_static(&tpm_if_info);
}
type_init(register_types);
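The tbt->pool handling above is plain GLib thread-pool usage. A minimal sketch of that pattern with a single exclusive worker, under the assumption of a host glib-2.0 development setup; worker() and the pushed command values are illustrative:
#include <glib.h>
#include <stdio.h>
static void worker(gpointer data, gpointer user_data)
{
    /* Runs on the pool's single worker thread for each pushed item. */
    printf("processing command %ld\n", (long)(gintptr)data);
}
int main(void)
{
    /* max_threads = 1, exclusive = TRUE: one dedicated worker thread. */
    GThreadPool *pool = g_thread_pool_new(worker, NULL, 1, TRUE, NULL);
    g_thread_pool_push(pool, (gpointer)(gintptr)1, NULL);
    g_thread_pool_push(pool, (gpointer)(gintptr)2, NULL);
    /* immediate = FALSE, wait = TRUE: drain queued work before freeing. */
    g_thread_pool_free(pool, FALSE, TRUE);
    return 0;
}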


@@ -29,10 +29,10 @@
#include "exec/cpu-common.h"
#include "sysemu/kvm.h"
#include "sysemu/balloon.h"
#include "trace-root.h"
#include "trace.h"
#include "qmp-commands.h"
#include "qapi/error.h"
#include "qapi/qmp/qerror.h"
#include "qapi/qmp/qjson.h"
static QEMUBalloonEvent *balloon_event_fn;
static QEMUBalloonStatus *balloon_stat_fn;

block.c

File diff suppressed because it is too large


@@ -1,38 +1,36 @@
block-obj-y += raw-format.o qcow.o vdi.o vmdk.o cloop.o bochs.o vpc.o vvfat.o dmg.o
block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o qcow2-bitmap.o
block-obj-y += qed.o qed-l2-cache.o qed-table.o qed-cluster.o
block-obj-y += raw_bsd.o qcow.o vdi.o vmdk.o cloop.o bochs.o vpc.o vvfat.o
block-obj-y += qcow2.o qcow2-refcount.o qcow2-cluster.o qcow2-snapshot.o qcow2-cache.o
block-obj-y += qed.o qed-gencb.o qed-l2-cache.o qed-table.o qed-cluster.o
block-obj-y += qed-check.o
block-obj-y += vhdx.o vhdx-endian.o vhdx-log.o
block-obj-$(CONFIG_VHDX) += vhdx.o vhdx-endian.o vhdx-log.o
block-obj-y += quorum.o
block-obj-y += parallels.o blkdebug.o blkverify.o blkreplay.o
block-obj-y += block-backend.o snapshot.o qapi.o
block-obj-$(CONFIG_WIN32) += file-win32.o win32-aio.o
block-obj-$(CONFIG_POSIX) += file-posix.o
block-obj-$(CONFIG_WIN32) += raw-win32.o win32-aio.o
block-obj-$(CONFIG_POSIX) += raw-posix.o
block-obj-$(CONFIG_LINUX_AIO) += linux-aio.o
block-obj-y += null.o mirror.o commit.o io.o
block-obj-y += null.o mirror.o io.o
block-obj-y += throttle-groups.o
block-obj-$(CONFIG_LINUX) += nvme.o
block-obj-y += nbd.o nbd-client.o sheepdog.o
block-obj-$(CONFIG_LIBISCSI) += iscsi.o
block-obj-$(if $(CONFIG_LIBISCSI),y,n) += iscsi-opts.o
block-obj-$(CONFIG_LIBNFS) += nfs.o
block-obj-$(CONFIG_CURL) += curl.o
block-obj-$(CONFIG_RBD) += rbd.o
block-obj-$(CONFIG_GLUSTERFS) += gluster.o
block-obj-$(CONFIG_VXHS) += vxhs.o
block-obj-$(CONFIG_ARCHIPELAGO) += archipelago.o
block-obj-$(CONFIG_LIBSSH2) += ssh.o
block-obj-y += accounting.o dirty-bitmap.o
block-obj-y += dictzip.o
block-obj-y += tar.o
block-obj-y += write-threshold.o
block-obj-y += backup.o
block-obj-$(CONFIG_REPLICATION) += replication.o
block-obj-y += throttle.o
block-obj-y += crypto.o
common-obj-y += stream.o
common-obj-y += commit.o
common-obj-y += backup.o
nfs.o-libs := $(LIBNFS_LIBS)
iscsi.o-cflags := $(LIBISCSI_CFLAGS)
iscsi.o-libs := $(LIBISCSI_LIBS)
curl.o-cflags := $(CURL_CFLAGS)
@@ -41,12 +39,10 @@ rbd.o-cflags := $(RBD_CFLAGS)
rbd.o-libs := $(RBD_LIBS)
gluster.o-cflags := $(GLUSTERFS_CFLAGS)
gluster.o-libs := $(GLUSTERFS_LIBS)
vxhs.o-libs := $(VXHS_LIBS)
ssh.o-cflags := $(LIBSSH2_CFLAGS)
ssh.o-libs := $(LIBSSH2_LIBS)
block-obj-$(if $(CONFIG_BZIP2),m,n) += dmg-bz2.o
dmg-bz2.o-libs := $(BZIP2_LIBS)
archipelago.o-libs := $(ARCHIPELAGO_LIBS)
block-obj-m += dmg.o
dmg.o-libs := $(BZIP2_LIBS)
qcow.o-libs := -lz
linux-aio.o-libs := -laio
parallels.o-cflags := $(LIBXML2_CFLAGS)
parallels.o-libs := $(LIBXML2_LIBS)


@@ -32,19 +32,15 @@
static QEMUClockType clock_type = QEMU_CLOCK_REALTIME;
static const int qtest_latency_ns = NANOSECONDS_PER_SECOND / 1000;
void block_acct_init(BlockAcctStats *stats)
{
qemu_mutex_init(&stats->lock);
if (qtest_enabled()) {
clock_type = QEMU_CLOCK_VIRTUAL;
}
}
void block_acct_setup(BlockAcctStats *stats, bool account_invalid,
bool account_failed)
void block_acct_init(BlockAcctStats *stats, bool account_invalid,
bool account_failed)
{
stats->account_invalid = account_invalid;
stats->account_failed = account_failed;
if (qtest_enabled()) {
clock_type = QEMU_CLOCK_VIRTUAL;
}
}
void block_acct_cleanup(BlockAcctStats *stats)
@@ -53,7 +49,6 @@ void block_acct_cleanup(BlockAcctStats *stats)
QSLIST_FOREACH_SAFE(s, &stats->intervals, entries, next) {
g_free(s);
}
qemu_mutex_destroy(&stats->lock);
}
void block_acct_add_interval(BlockAcctStats *stats, unsigned interval_length)
@@ -63,15 +58,12 @@ void block_acct_add_interval(BlockAcctStats *stats, unsigned interval_length)
s = g_new0(BlockAcctTimedStats, 1);
s->interval_length = interval_length;
s->stats = stats;
qemu_mutex_lock(&stats->lock);
QSLIST_INSERT_HEAD(&stats->intervals, s, entries);
for (i = 0; i < BLOCK_MAX_IOTYPE; i++) {
timed_average_init(&s->latency[i], clock_type,
(uint64_t) interval_length * NANOSECONDS_PER_SECOND);
}
qemu_mutex_unlock(&stats->lock);
}
BlockAcctTimedStats *block_acct_interval_next(BlockAcctStats *stats,
@@ -94,8 +86,7 @@ void block_acct_start(BlockAcctStats *stats, BlockAcctCookie *cookie,
cookie->type = type;
}
static void block_account_one_io(BlockAcctStats *stats, BlockAcctCookie *cookie,
bool failed)
void block_acct_done(BlockAcctStats *stats, BlockAcctCookie *cookie)
{
BlockAcctTimedStats *s;
int64_t time_ns = qemu_clock_get_ns(clock_type);
@@ -107,16 +98,31 @@ static void block_account_one_io(BlockAcctStats *stats, BlockAcctCookie *cookie,
assert(cookie->type < BLOCK_MAX_IOTYPE);
qemu_mutex_lock(&stats->lock);
stats->nr_bytes[cookie->type] += cookie->bytes;
stats->nr_ops[cookie->type]++;
stats->total_time_ns[cookie->type] += latency_ns;
stats->last_access_time_ns = time_ns;
if (failed) {
stats->failed_ops[cookie->type]++;
} else {
stats->nr_bytes[cookie->type] += cookie->bytes;
stats->nr_ops[cookie->type]++;
QSLIST_FOREACH(s, &stats->intervals, entries) {
timed_average_account(&s->latency[cookie->type], latency_ns);
}
}
void block_acct_failed(BlockAcctStats *stats, BlockAcctCookie *cookie)
{
assert(cookie->type < BLOCK_MAX_IOTYPE);
stats->failed_ops[cookie->type]++;
if (stats->account_failed) {
BlockAcctTimedStats *s;
int64_t time_ns = qemu_clock_get_ns(clock_type);
int64_t latency_ns = time_ns - cookie->start_time_ns;
if (qtest_enabled()) {
latency_ns = qtest_latency_ns;
}
if (!failed || stats->account_failed) {
stats->total_time_ns[cookie->type] += latency_ns;
stats->last_access_time_ns = time_ns;
@@ -124,45 +130,29 @@ static void block_account_one_io(BlockAcctStats *stats, BlockAcctCookie *cookie,
timed_average_account(&s->latency[cookie->type], latency_ns);
}
}
qemu_mutex_unlock(&stats->lock);
}
void block_acct_done(BlockAcctStats *stats, BlockAcctCookie *cookie)
{
block_account_one_io(stats, cookie, false);
}
void block_acct_failed(BlockAcctStats *stats, BlockAcctCookie *cookie)
{
block_account_one_io(stats, cookie, true);
}
void block_acct_invalid(BlockAcctStats *stats, enum BlockAcctType type)
{
assert(type < BLOCK_MAX_IOTYPE);
/* block_account_one_io() updates total_time_ns[], but this one does
* not. The reason is that invalid requests are accounted during their
* submission, therefore there's no actual I/O involved.
*/
qemu_mutex_lock(&stats->lock);
/* block_acct_done() and block_acct_failed() update
* total_time_ns[], but this one does not. The reason is that
* invalid requests are accounted during their submission,
* therefore there's no actual I/O involved. */
stats->invalid_ops[type]++;
if (stats->account_invalid) {
stats->last_access_time_ns = qemu_clock_get_ns(clock_type);
}
qemu_mutex_unlock(&stats->lock);
}
void block_acct_merge_done(BlockAcctStats *stats, enum BlockAcctType type,
int num_requests)
{
assert(type < BLOCK_MAX_IOTYPE);
qemu_mutex_lock(&stats->lock);
stats->merged[type] += num_requests;
qemu_mutex_unlock(&stats->lock);
}
int64_t block_acct_idle_time_ns(BlockAcctStats *stats)
@@ -177,9 +167,7 @@ double block_acct_queue_depth(BlockAcctTimedStats *stats,
assert(type < BLOCK_MAX_IOTYPE);
qemu_mutex_lock(&stats->stats->lock);
sum = timed_average_sum(&stats->latency[type], &elapsed);
qemu_mutex_unlock(&stats->stats->lock);
return (double) sum / elapsed;
}

block/archipelago.c Normal file

File diff suppressed because it is too large


@@ -16,22 +16,27 @@
#include "trace.h"
#include "block/block.h"
#include "block/block_int.h"
#include "block/blockjob_int.h"
#include "block/block_backup.h"
#include "block/blockjob.h"
#include "qapi/error.h"
#include "qapi/qmp/qerror.h"
#include "qemu/ratelimit.h"
#include "qemu/cutils.h"
#include "sysemu/block-backend.h"
#include "qemu/bitmap.h"
#include "qemu/error-report.h"
#define BACKUP_CLUSTER_SIZE_DEFAULT (1 << 16)
#define SLICE_TIME 100000000ULL /* ns */
typedef struct CowRequest {
int64_t start;
int64_t end;
QLIST_ENTRY(CowRequest) list;
CoQueue wait_queue; /* coroutines blocked on this request */
} CowRequest;
typedef struct BackupBlockJob {
BlockJob common;
BlockBackend *target;
BlockDriverState *target;
/* bitmap for sync=incremental */
BdrvDirtyBitmap *sync_bitmap;
MirrorSyncMode sync_mode;
@@ -39,15 +44,18 @@ typedef struct BackupBlockJob {
BlockdevOnError on_source_error;
BlockdevOnError on_target_error;
CoRwlock flush_rwlock;
uint64_t bytes_read;
uint64_t sectors_read;
unsigned long *done_bitmap;
int64_t cluster_size;
bool compress;
NotifierWithReturn before_write;
QLIST_HEAD(, CowRequest) inflight_reqs;
HBitmap *copy_bitmap;
} BackupBlockJob;
/* Size of a cluster in sectors, instead of bytes. */
static inline int64_t cluster_size_sectors(BackupBlockJob *job)
{
return job->cluster_size / BDRV_SECTOR_SIZE;
}
/* See if in-flight requests overlap and wait for them to complete */
static void coroutine_fn wait_for_overlapping_requests(BackupBlockJob *job,
int64_t start,
@@ -59,8 +67,8 @@ static void coroutine_fn wait_for_overlapping_requests(BackupBlockJob *job,
do {
retry = false;
QLIST_FOREACH(req, &job->inflight_reqs, list) {
if (end > req->start_byte && start < req->end_byte) {
qemu_co_queue_wait(&req->wait_queue, NULL);
if (end > req->start && start < req->end) {
qemu_co_queue_wait(&req->wait_queue);
retry = true;
break;
}
@@ -70,10 +78,10 @@ static void coroutine_fn wait_for_overlapping_requests(BackupBlockJob *job,
/* Keep track of an in-flight request */
static void cow_request_begin(CowRequest *req, BackupBlockJob *job,
int64_t start, int64_t end)
int64_t start, int64_t end)
{
req->start_byte = start;
req->end_byte = end;
req->start = start;
req->end = end;
qemu_co_queue_init(&req->wait_queue);
QLIST_INSERT_HEAD(&job->inflight_reqs, req, list);
}
@@ -85,81 +93,90 @@ static void cow_request_end(CowRequest *req)
qemu_co_queue_restart_all(&req->wait_queue);
}
static int coroutine_fn backup_do_cow(BackupBlockJob *job,
int64_t offset, uint64_t bytes,
static int coroutine_fn backup_do_cow(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
bool *error_is_read,
bool is_write_notifier)
{
BlockBackend *blk = job->common.blk;
BackupBlockJob *job = (BackupBlockJob *)bs->job;
CowRequest cow_request;
struct iovec iov;
QEMUIOVector bounce_qiov;
void *bounce_buffer = NULL;
int ret = 0;
int64_t start, end; /* bytes */
int n; /* bytes */
int64_t sectors_per_cluster = cluster_size_sectors(job);
int64_t start, end;
int n;
qemu_co_rwlock_rdlock(&job->flush_rwlock);
start = QEMU_ALIGN_DOWN(offset, job->cluster_size);
end = QEMU_ALIGN_UP(bytes + offset, job->cluster_size);
start = sector_num / sectors_per_cluster;
end = DIV_ROUND_UP(sector_num + nb_sectors, sectors_per_cluster);
trace_backup_do_cow_enter(job, start, offset, bytes);
trace_backup_do_cow_enter(job, start, sector_num, nb_sectors);
wait_for_overlapping_requests(job, start, end);
cow_request_begin(&cow_request, job, start, end);
for (; start < end; start += job->cluster_size) {
if (!hbitmap_get(job->copy_bitmap, start / job->cluster_size)) {
for (; start < end; start++) {
if (test_bit(start, job->done_bitmap)) {
trace_backup_do_cow_skip(job, start);
continue; /* already copied */
}
hbitmap_reset(job->copy_bitmap, start / job->cluster_size, 1);
trace_backup_do_cow_process(job, start);
n = MIN(job->cluster_size, job->common.len - start);
n = MIN(sectors_per_cluster,
job->common.len / BDRV_SECTOR_SIZE -
start * sectors_per_cluster);
if (!bounce_buffer) {
bounce_buffer = blk_blockalign(blk, job->cluster_size);
bounce_buffer = qemu_blockalign(bs, job->cluster_size);
}
iov.iov_base = bounce_buffer;
iov.iov_len = n;
iov.iov_len = n * BDRV_SECTOR_SIZE;
qemu_iovec_init_external(&bounce_qiov, &iov, 1);
ret = blk_co_preadv(blk, start, bounce_qiov.size, &bounce_qiov,
is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0);
if (is_write_notifier) {
ret = bdrv_co_readv_no_serialising(bs,
start * sectors_per_cluster,
n, &bounce_qiov);
} else {
ret = bdrv_co_readv(bs, start * sectors_per_cluster, n,
&bounce_qiov);
}
if (ret < 0) {
trace_backup_do_cow_read_fail(job, start, ret);
if (error_is_read) {
*error_is_read = true;
}
hbitmap_set(job->copy_bitmap, start / job->cluster_size, 1);
goto out;
}
if (buffer_is_zero(iov.iov_base, iov.iov_len)) {
ret = blk_co_pwrite_zeroes(job->target, start,
bounce_qiov.size, BDRV_REQ_MAY_UNMAP);
ret = bdrv_co_write_zeroes(job->target,
start * sectors_per_cluster,
n, BDRV_REQ_MAY_UNMAP);
} else {
ret = blk_co_pwritev(job->target, start,
bounce_qiov.size, &bounce_qiov,
job->compress ? BDRV_REQ_WRITE_COMPRESSED : 0);
ret = bdrv_co_writev(job->target,
start * sectors_per_cluster, n,
&bounce_qiov);
}
if (ret < 0) {
trace_backup_do_cow_write_fail(job, start, ret);
if (error_is_read) {
*error_is_read = false;
}
hbitmap_set(job->copy_bitmap, start / job->cluster_size, 1);
goto out;
}
set_bit(start, job->done_bitmap);
/* Publish progress, guest I/O counts as progress too. Note that the
* offset field is an opaque progress value, it is not a disk offset.
*/
job->bytes_read += n;
job->common.offset += n;
job->sectors_read += n;
job->common.offset += n * BDRV_SECTOR_SIZE;
}
out:
@@ -169,7 +186,7 @@ out:
cow_request_end(&cow_request);
trace_backup_do_cow_return(job, offset, bytes, ret);
trace_backup_do_cow_return(job, sector_num, nb_sectors, ret);
qemu_co_rwlock_unlock(&job->flush_rwlock);
@@ -180,14 +197,14 @@ static int coroutine_fn backup_before_write_notify(
NotifierWithReturn *notifier,
void *opaque)
{
BackupBlockJob *job = container_of(notifier, BackupBlockJob, before_write);
BdrvTrackedRequest *req = opaque;
int64_t sector_num = req->offset >> BDRV_SECTOR_BITS;
int nb_sectors = req->bytes >> BDRV_SECTOR_BITS;
assert(req->bs == blk_bs(job->common.blk));
assert(QEMU_IS_ALIGNED(req->offset, BDRV_SECTOR_SIZE));
assert(QEMU_IS_ALIGNED(req->bytes, BDRV_SECTOR_SIZE));
assert((req->offset & (BDRV_SECTOR_SIZE - 1)) == 0);
assert((req->bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
return backup_do_cow(job, req->offset, req->bytes, NULL, true);
return backup_do_cow(req->bs, sector_num, nb_sectors, NULL, true);
}
static void backup_set_speed(BlockJob *job, int64_t speed, Error **errp)
@@ -198,13 +215,22 @@ static void backup_set_speed(BlockJob *job, int64_t speed, Error **errp)
error_setg(errp, QERR_INVALID_PARAMETER, "speed");
return;
}
ratelimit_set_speed(&s->limit, speed, SLICE_TIME);
ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
}
static void backup_iostatus_reset(BlockJob *job)
{
BackupBlockJob *s = container_of(job, BackupBlockJob, common);
if (s->target->blk) {
blk_iostatus_reset(s->target->blk);
}
}
static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
{
BdrvDirtyBitmap *bm;
BlockDriverState *bs = blk_bs(job->common.blk);
BlockDriverState *bs = job->common.bs;
if (ret < 0 || block_job_is_cancelled(&job->common)) {
/* Merge the successor back into the parent, delete nothing. */
@@ -233,93 +259,24 @@ static void backup_abort(BlockJob *job)
}
}
static void backup_clean(BlockJob *job)
{
BackupBlockJob *s = container_of(job, BackupBlockJob, common);
assert(s->target);
blk_unref(s->target);
s->target = NULL;
}
static void backup_attached_aio_context(BlockJob *job, AioContext *aio_context)
{
BackupBlockJob *s = container_of(job, BackupBlockJob, common);
blk_set_aio_context(s->target, aio_context);
}
void backup_do_checkpoint(BlockJob *job, Error **errp)
{
BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common);
int64_t len;
assert(job->driver->job_type == BLOCK_JOB_TYPE_BACKUP);
if (backup_job->sync_mode != MIRROR_SYNC_MODE_NONE) {
error_setg(errp, "The backup job only supports block checkpoint in"
" sync=none mode");
return;
}
len = DIV_ROUND_UP(backup_job->common.len, backup_job->cluster_size);
hbitmap_set(backup_job->copy_bitmap, 0, len);
}
void backup_wait_for_overlapping_requests(BlockJob *job, int64_t offset,
uint64_t bytes)
{
BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common);
int64_t start, end;
assert(job->driver->job_type == BLOCK_JOB_TYPE_BACKUP);
start = QEMU_ALIGN_DOWN(offset, backup_job->cluster_size);
end = QEMU_ALIGN_UP(offset + bytes, backup_job->cluster_size);
wait_for_overlapping_requests(backup_job, start, end);
}
void backup_cow_request_begin(CowRequest *req, BlockJob *job,
int64_t offset, uint64_t bytes)
{
BackupBlockJob *backup_job = container_of(job, BackupBlockJob, common);
int64_t start, end;
assert(job->driver->job_type == BLOCK_JOB_TYPE_BACKUP);
start = QEMU_ALIGN_DOWN(offset, backup_job->cluster_size);
end = QEMU_ALIGN_UP(offset + bytes, backup_job->cluster_size);
cow_request_begin(req, backup_job, start, end);
}
void backup_cow_request_end(CowRequest *req)
{
cow_request_end(req);
}
static void backup_drain(BlockJob *job)
{
BackupBlockJob *s = container_of(job, BackupBlockJob, common);
/* Need to keep a reference in case blk_drain triggers execution
* of backup_complete...
*/
if (s->target) {
BlockBackend *target = s->target;
blk_ref(target);
blk_drain(target);
blk_unref(target);
}
}
static const BlockJobDriver backup_job_driver = {
.instance_size = sizeof(BackupBlockJob),
.job_type = BLOCK_JOB_TYPE_BACKUP,
.set_speed = backup_set_speed,
.iostatus_reset = backup_iostatus_reset,
.commit = backup_commit,
.abort = backup_abort,
};
static BlockErrorAction backup_error_action(BackupBlockJob *job,
bool read, int error)
{
if (read) {
return block_job_error_action(&job->common, job->on_source_error,
true, error);
return block_job_error_action(&job->common, job->common.bs,
job->on_source_error, true, error);
} else {
return block_job_error_action(&job->common, job->on_target_error,
false, error);
return block_job_error_action(&job->common, job->target,
job->on_target_error, false, error);
}
}
@@ -329,8 +286,11 @@ typedef struct {
static void backup_complete(BlockJob *job, void *opaque)
{
BackupBlockJob *s = container_of(job, BackupBlockJob, common);
BackupCompleteData *data = opaque;
bdrv_unref(s->target);
block_job_completed(job, data->ret);
g_free(data);
}
@@ -346,11 +306,11 @@ static bool coroutine_fn yield_and_check(BackupBlockJob *job)
*/
if (job->common.speed) {
uint64_t delay_ns = ratelimit_calculate_delay(&job->limit,
job->bytes_read);
job->bytes_read = 0;
block_job_sleep_ns(&job->common, delay_ns);
job->sectors_read);
job->sectors_read = 0;
block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, delay_ns);
} else {
block_job_sleep_ns(&job->common, 0);
block_job_sleep_ns(&job->common, QEMU_CLOCK_REALTIME, 0);
}
if (block_job_is_cancelled(&job->common)) {
@@ -362,122 +322,121 @@ static bool coroutine_fn yield_and_check(BackupBlockJob *job)
static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
{
int ret;
bool error_is_read;
int ret = 0;
int clusters_per_iter;
uint32_t granularity;
int64_t sector;
int64_t cluster;
int64_t end;
int64_t last_cluster = -1;
int64_t sectors_per_cluster = cluster_size_sectors(job);
BlockDriverState *bs = job->common.bs;
HBitmapIter hbi;
hbitmap_iter_init(&hbi, job->copy_bitmap, 0);
while ((cluster = hbitmap_iter_next(&hbi)) != -1) {
do {
if (yield_and_check(job)) {
return 0;
}
ret = backup_do_cow(job, cluster * job->cluster_size,
job->cluster_size, &error_is_read, false);
if (ret < 0 && backup_error_action(job, error_is_read, -ret) ==
BLOCK_ERROR_ACTION_REPORT)
{
return ret;
}
} while (ret < 0);
granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
clusters_per_iter = MAX((granularity / job->cluster_size), 1);
bdrv_dirty_iter_init(job->sync_bitmap, &hbi);
/* Find the next dirty sector(s) */
while ((sector = hbitmap_iter_next(&hbi)) != -1) {
cluster = sector / sectors_per_cluster;
/* Fake progress updates for any clusters we skipped */
if (cluster != last_cluster + 1) {
job->common.offset += ((cluster - last_cluster - 1) *
job->cluster_size);
}
for (end = cluster + clusters_per_iter; cluster < end; cluster++) {
do {
if (yield_and_check(job)) {
return ret;
}
ret = backup_do_cow(bs, cluster * sectors_per_cluster,
sectors_per_cluster, &error_is_read,
false);
if ((ret < 0) &&
backup_error_action(job, error_is_read, -ret) ==
BLOCK_ERROR_ACTION_REPORT) {
return ret;
}
} while (ret < 0);
}
/* If the bitmap granularity is smaller than the backup granularity,
* we need to advance the iterator pointer to the next cluster. */
if (granularity < job->cluster_size) {
bdrv_set_dirty_iter(&hbi, cluster * sectors_per_cluster);
}
last_cluster = cluster - 1;
}
return 0;
}
/* init copy_bitmap from sync_bitmap */
static void backup_incremental_init_copy_bitmap(BackupBlockJob *job)
{
BdrvDirtyBitmapIter *dbi;
int64_t offset;
int64_t end = DIV_ROUND_UP(bdrv_dirty_bitmap_size(job->sync_bitmap),
job->cluster_size);
dbi = bdrv_dirty_iter_new(job->sync_bitmap);
while ((offset = bdrv_dirty_iter_next(dbi)) != -1) {
int64_t cluster = offset / job->cluster_size;
int64_t next_cluster;
offset += bdrv_dirty_bitmap_granularity(job->sync_bitmap);
if (offset >= bdrv_dirty_bitmap_size(job->sync_bitmap)) {
hbitmap_set(job->copy_bitmap, cluster, end - cluster);
break;
}
offset = bdrv_dirty_bitmap_next_zero(job->sync_bitmap, offset);
if (offset == -1) {
hbitmap_set(job->copy_bitmap, cluster, end - cluster);
break;
}
next_cluster = DIV_ROUND_UP(offset, job->cluster_size);
hbitmap_set(job->copy_bitmap, cluster, next_cluster - cluster);
if (next_cluster >= end) {
break;
}
bdrv_set_dirty_iter(dbi, next_cluster * job->cluster_size);
/* Play some final catchup with the progress meter */
end = DIV_ROUND_UP(job->common.len, job->cluster_size);
if (last_cluster + 1 < end) {
job->common.offset += ((end - last_cluster - 1) * job->cluster_size);
}
job->common.offset = job->common.len -
hbitmap_count(job->copy_bitmap) * job->cluster_size;
bdrv_dirty_iter_free(dbi);
return ret;
}
static void coroutine_fn backup_run(void *opaque)
{
BackupBlockJob *job = opaque;
BackupCompleteData *data;
BlockDriverState *bs = blk_bs(job->common.blk);
int64_t offset, nb_clusters;
BlockDriverState *bs = job->common.bs;
BlockDriverState *target = job->target;
BlockdevOnError on_target_error = job->on_target_error;
NotifierWithReturn before_write = {
.notify = backup_before_write_notify,
};
int64_t start, end;
int64_t sectors_per_cluster = cluster_size_sectors(job);
int ret = 0;
QLIST_INIT(&job->inflight_reqs);
qemu_co_rwlock_init(&job->flush_rwlock);
nb_clusters = DIV_ROUND_UP(job->common.len, job->cluster_size);
job->copy_bitmap = hbitmap_alloc(nb_clusters, 0);
if (job->sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
backup_incremental_init_copy_bitmap(job);
} else {
hbitmap_set(job->copy_bitmap, 0, nb_clusters);
start = 0;
end = DIV_ROUND_UP(job->common.len, job->cluster_size);
job->done_bitmap = bitmap_new(end);
if (target->blk) {
blk_set_on_error(target->blk, on_target_error, on_target_error);
blk_iostatus_enable(target->blk);
}
job->before_write.notify = backup_before_write_notify;
bdrv_add_before_write_notifier(bs, &job->before_write);
bdrv_add_before_write_notifier(bs, &before_write);
if (job->sync_mode == MIRROR_SYNC_MODE_NONE) {
/* All bits are set in copy_bitmap to allow any cluster to be copied.
* This does not actually require them to be copied. */
while (!block_job_is_cancelled(&job->common)) {
/* Yield until the job is cancelled. We just let our before_write
* notify callback service CoW requests. */
block_job_yield(&job->common);
job->common.busy = false;
qemu_coroutine_yield();
job->common.busy = true;
}
} else if (job->sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
ret = backup_run_incremental(job);
} else {
/* Both FULL and TOP sync modes require copying. */
for (offset = 0; offset < job->common.len;
offset += job->cluster_size) {
for (; start < end; start++) {
bool error_is_read;
int alloced = 0;
if (yield_and_check(job)) {
break;
}
if (job->sync_mode == MIRROR_SYNC_MODE_TOP) {
int i;
int64_t n;
int i, n;
int alloced = 0;
/* Check to see if these blocks are already in the
* backing file. */
for (i = 0; i < job->cluster_size;) {
for (i = 0; i < sectors_per_cluster;) {
/* bdrv_is_allocated() only returns true/false based
* on the first set of sectors it comes across that
* are all in the same state.
@@ -485,11 +444,12 @@ static void coroutine_fn backup_run(void *opaque)
* backup cluster length. We end up copying more than
* needed but at some point that is always the case. */
alloced =
bdrv_is_allocated(bs, offset + i,
job->cluster_size - i, &n);
bdrv_is_allocated(bs,
start * sectors_per_cluster + i,
sectors_per_cluster - i, &n);
i += n;
if (alloced || n == 0) {
if (alloced == 1 || n == 0) {
break;
}
}
@@ -501,12 +461,8 @@ static void coroutine_fn backup_run(void *opaque)
}
}
/* FULL sync mode we copy the whole drive. */
if (alloced < 0) {
ret = alloced;
} else {
ret = backup_do_cow(job, offset, job->cluster_size,
&error_is_read, false);
}
ret = backup_do_cow(bs, start * sectors_per_cluster,
sectors_per_cluster, &error_is_read, false);
if (ret < 0) {
/* Depending on error action, fail now or retry cluster */
BlockErrorAction action =
@@ -514,44 +470,35 @@ static void coroutine_fn backup_run(void *opaque)
if (action == BLOCK_ERROR_ACTION_REPORT) {
break;
} else {
offset -= job->cluster_size;
start--;
continue;
}
}
}
}
notifier_with_return_remove(&job->before_write);
notifier_with_return_remove(&before_write);
/* wait until pending backup_do_cow() calls have completed */
qemu_co_rwlock_wrlock(&job->flush_rwlock);
qemu_co_rwlock_unlock(&job->flush_rwlock);
hbitmap_free(job->copy_bitmap);
g_free(job->done_bitmap);
if (target->blk) {
blk_iostatus_disable(target->blk);
}
bdrv_op_unblock_all(target, job->common.blocker);
data = g_malloc(sizeof(*data));
data->ret = ret;
block_job_defer_to_main_loop(&job->common, backup_complete, data);
}
static const BlockJobDriver backup_job_driver = {
.instance_size = sizeof(BackupBlockJob),
.job_type = BLOCK_JOB_TYPE_BACKUP,
.start = backup_run,
.set_speed = backup_set_speed,
.commit = backup_commit,
.abort = backup_abort,
.clean = backup_clean,
.attached_aio_context = backup_attached_aio_context,
.drain = backup_drain,
};
BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
BlockDriverState *target, int64_t speed,
MirrorSyncMode sync_mode, BdrvDirtyBitmap *sync_bitmap,
bool compress,
void backup_start(BlockDriverState *bs, BlockDriverState *target,
int64_t speed, MirrorSyncMode sync_mode,
BdrvDirtyBitmap *sync_bitmap,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
int creation_flags,
BlockCompletionFunc *cb, void *opaque,
BlockJobTxn *txn, Error **errp)
{
@@ -562,55 +509,57 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
assert(bs);
assert(target);
assert(cb);
if (bs == target) {
error_setg(errp, "Source and target cannot be the same");
return NULL;
return;
}
if ((on_source_error == BLOCKDEV_ON_ERROR_STOP ||
on_source_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
(!bs->blk || !blk_iostatus_is_enabled(bs->blk))) {
error_setg(errp, QERR_INVALID_PARAMETER, "on-source-error");
return;
}
if (!bdrv_is_inserted(bs)) {
error_setg(errp, "Device is not inserted: %s",
bdrv_get_device_name(bs));
return NULL;
return;
}
if (!bdrv_is_inserted(target)) {
error_setg(errp, "Device is not inserted: %s",
bdrv_get_device_name(target));
return NULL;
}
if (compress && target->drv->bdrv_co_pwritev_compressed == NULL) {
error_setg(errp, "Compression is not supported for this drive %s",
bdrv_get_device_name(target));
return NULL;
return;
}
if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_BACKUP_SOURCE, errp)) {
return NULL;
return;
}
if (bdrv_op_is_blocked(target, BLOCK_OP_TYPE_BACKUP_TARGET, errp)) {
return NULL;
return;
}
if (sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
if (!sync_bitmap) {
error_setg(errp, "must provide a valid bitmap name for "
"\"incremental\" sync mode");
return NULL;
return;
}
/* Create a new bitmap, and freeze/disable this one. */
if (bdrv_dirty_bitmap_create_successor(bs, sync_bitmap, errp) < 0) {
return NULL;
return;
}
} else if (sync_bitmap) {
error_setg(errp,
"a sync_bitmap was provided to backup_run, "
"but received an incompatible sync_mode (%s)",
MirrorSyncMode_str(sync_mode));
return NULL;
MirrorSyncMode_lookup[sync_mode]);
return;
}
len = bdrv_getlength(bs);
@@ -620,46 +569,23 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
goto error;
}
/* job->common.len is fixed, so we can't allow resize */
job = block_job_create(job_id, &backup_job_driver, bs,
BLK_PERM_CONSISTENT_READ,
BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE |
BLK_PERM_WRITE_UNCHANGED | BLK_PERM_GRAPH_MOD,
speed, creation_flags, cb, opaque, errp);
job = block_job_create(&backup_job_driver, bs, speed, cb, opaque, errp);
if (!job) {
goto error;
}
/* The target must match the source in size, so no resize here either */
job->target = blk_new(BLK_PERM_WRITE,
BLK_PERM_CONSISTENT_READ | BLK_PERM_WRITE |
BLK_PERM_WRITE_UNCHANGED | BLK_PERM_GRAPH_MOD);
ret = blk_insert_bs(job->target, target, errp);
if (ret < 0) {
goto error;
}
job->on_source_error = on_source_error;
job->on_target_error = on_target_error;
job->target = target;
job->sync_mode = sync_mode;
job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_INCREMENTAL ?
sync_bitmap : NULL;
job->compress = compress;
/* If there is no backing file on the target, we cannot rely on COW if our
* backup cluster size is smaller than the target cluster size. Even for
* targets with a backing file, try to avoid COW if possible. */
ret = bdrv_get_info(target, &bdi);
if (ret == -ENOTSUP && !target->backing) {
/* Cluster size is not defined */
warn_report("The target block device doesn't provide "
"information about the block size and it doesn't have a "
"backing file. The default block size of %u bytes is "
"used. If the actual block size of the target exceeds "
"this default, the backup may be unusable",
BACKUP_CLUSTER_SIZE_DEFAULT);
job->cluster_size = BACKUP_CLUSTER_SIZE_DEFAULT;
} else if (ret < 0 && !target->backing) {
ret = bdrv_get_info(job->target, &bdi);
if (ret < 0 && !target->backing) {
error_setg_errno(errp, -ret,
"Couldn't determine the cluster size of the target image, "
"which has no backing file");
@@ -673,22 +599,18 @@ BlockJob *backup_job_create(const char *job_id, BlockDriverState *bs,
job->cluster_size = MAX(BACKUP_CLUSTER_SIZE_DEFAULT, bdi.cluster_size);
}
/* Required permissions are already taken with target's blk_new() */
block_job_add_bdrv(&job->common, "target", target, 0, BLK_PERM_ALL,
&error_abort);
bdrv_op_block_all(target, job->common.blocker);
job->common.len = len;
job->common.co = qemu_coroutine_create(backup_run);
block_job_txn_add_job(txn, &job->common);
return &job->common;
qemu_coroutine_enter(job->common.co, job);
return;
error:
if (sync_bitmap) {
bdrv_reclaim_dirty_bitmap(bs, sync_bitmap, NULL);
}
if (job) {
backup_clean(&job->common);
block_job_early_fail(&job->common);
block_job_unref(&job->common);
}
return NULL;
}
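
A note on the sync=top loop above: bdrv_is_allocated() only answers for the first run of equally-allocated sectors it finds, so both variants of the hunk (the byte-based offset/cluster_size form and the sector-based start/sectors_per_cluster form) keep re-querying until the whole backup cluster has been classified. The following is a minimal, self-contained sketch of that pattern; the is_allocated callback, the cluster size and all names are stand-ins, not QEMU API, and error handling (alloced < 0) is omitted.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Stand-in for bdrv_is_allocated(): reports whether the run starting at
 * 'offset' is allocated and how many bytes (*pnum) share that state. */
typedef int (*is_allocated_fn)(uint64_t offset, uint64_t bytes, uint64_t *pnum);

/* Returns true if any byte of the cluster at 'offset' is allocated,
 * mirroring the per-cluster probe loop in backup_run(). */
static bool cluster_needs_copy(uint64_t offset, uint64_t cluster_size,
                               is_allocated_fn is_allocated)
{
    uint64_t i = 0;

    while (i < cluster_size) {
        uint64_t n = 0;
        int alloced = is_allocated(offset + i, cluster_size - i, &n);

        if (alloced || n == 0) {
            return alloced > 0; /* allocated run found, or nothing left */
        }
        i += n; /* skip the unallocated run and keep probing */
    }
    return false;
}

/* Toy backend: pretend bytes in [4096, 8192) are allocated. */
static int demo_is_allocated(uint64_t offset, uint64_t bytes, uint64_t *pnum)
{
    if (offset < 4096) {
        *pnum = 4096 - offset < bytes ? 4096 - offset : bytes;
        return 0;
    }
    if (offset < 8192) {
        *pnum = 8192 - offset < bytes ? 8192 - offset : bytes;
        return 1;
    }
    *pnum = bytes;
    return 0;
}

int main(void)
{
    printf("cluster 0: %d\n", cluster_needs_copy(0, 65536, demo_is_allocated));
    printf("cluster 1: %d\n", cluster_needs_copy(65536, 65536, demo_is_allocated));
    return 0;
}

As the comment in the hunk says, this can copy more than strictly needed, but trimming to the backup cluster keeps the copy granularity fixed.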


@@ -1,7 +1,6 @@
/*
* Block protocol for I/O error injection
*
* Copyright (C) 2016-2017 Red Hat, Inc.
* Copyright (c) 2010 Kevin Wolf <kwolf@redhat.com>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
@@ -29,23 +28,15 @@
#include "qemu/config-file.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "qemu/option.h"
#include "qapi/qmp/qbool.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qstring.h"
#include "sysemu/qtest.h"
typedef struct BDRVBlkdebugState {
int state;
int new_state;
uint64_t align;
uint64_t max_transfer;
uint64_t opt_write_zero;
uint64_t max_write_zero;
uint64_t opt_discard;
uint64_t max_discard;
/* For blkdebug_refresh_filename() */
char *config_file;
QLIST_HEAD(, BlkdebugRule) rules[BLKDBG__MAX];
QSIMPLEQ_HEAD(, BlkdebugRule) active_rules;
@@ -54,6 +45,7 @@ typedef struct BDRVBlkdebugState {
typedef struct BlkdebugAIOCB {
BlockAIOCB common;
QEMUBH *bh;
int ret;
} BlkdebugAIOCB;
@@ -63,6 +55,10 @@ typedef struct BlkdebugSuspendedReq {
QLIST_ENTRY(BlkdebugSuspendedReq) next;
} BlkdebugSuspendedReq;
static const AIOCBInfo blkdebug_aiocb_info = {
.aiocb_size = sizeof(BlkdebugAIOCB),
};
enum {
ACTION_INJECT_ERROR,
ACTION_SET_STATE,
@@ -78,7 +74,7 @@ typedef struct BlkdebugRule {
int error;
int immediately;
int once;
int64_t offset;
int64_t sector;
} inject;
struct {
int new_state;
@@ -149,6 +145,20 @@ static QemuOptsList *config_groups[] = {
NULL
};
static int get_event_by_name(const char *name, BlkdebugEvent *event)
{
int i;
for (i = 0; i < BLKDBG__MAX; i++) {
if (!strcmp(BlkdebugEvent_lookup[i], name)) {
*event = i;
return 0;
}
}
return -1;
}
struct add_rule_data {
BDRVBlkdebugState *s;
int action;
@@ -159,18 +169,16 @@ static int add_rule(void *opaque, QemuOpts *opts, Error **errp)
struct add_rule_data *d = opaque;
BDRVBlkdebugState *s = d->s;
const char* event_name;
int event;
BlkdebugEvent event;
struct BlkdebugRule *rule;
int64_t sector;
/* Find the right event for the rule */
event_name = qemu_opt_get(opts, "event");
if (!event_name) {
error_setg(errp, "Missing event name for rule");
return -1;
}
event = qapi_enum_parse(&BlkdebugEvent_lookup, event_name, -1, errp);
if (event < 0) {
} else if (get_event_by_name(event_name, &event) < 0) {
error_setg(errp, "Invalid event name \"%s\"", event_name);
return -1;
}
@@ -189,9 +197,7 @@ static int add_rule(void *opaque, QemuOpts *opts, Error **errp)
rule->options.inject.once = qemu_opt_get_bool(opts, "once", 0);
rule->options.inject.immediately =
qemu_opt_get_bool(opts, "immediately", 0);
sector = qemu_opt_get_number(opts, "sector", -1);
rule->options.inject.offset =
sector == -1 ? -1 : sector * BDRV_SECTOR_SIZE;
rule->options.inject.sector = qemu_opt_get_number(opts, "sector", -1);
break;
case ACTION_SET_STATE:
@@ -244,6 +250,7 @@ static int read_config(BDRVBlkdebugState *s, const char *filename,
ret = qemu_config_parse(f, config_groups, filename);
if (ret < 0) {
error_setg(errp, "Could not parse blkdebug config file");
ret = -EINVAL;
goto fail;
}
}
@@ -292,7 +299,7 @@ static void blkdebug_parse_filename(const char *filename, QDict *options,
if (!strstart(filename, "blkdebug:", &filename)) {
/* There was no prefix; therefore, all options have to be already
present in the QDict (except for the filename) */
qdict_put_str(options, "x-image", filename);
qdict_put(options, "x-image", qstring_from_str(filename));
return;
}
@@ -311,7 +318,7 @@ static void blkdebug_parse_filename(const char *filename, QDict *options,
/* TODO Allow multi-level nesting and set file.filename here */
filename = c + 1;
qdict_put_str(options, "x-image", filename);
qdict_put(options, "x-image", qstring_from_str(filename));
}
static QemuOptsList runtime_opts = {
@@ -333,31 +340,6 @@ static QemuOptsList runtime_opts = {
.type = QEMU_OPT_SIZE,
.help = "Required alignment in bytes",
},
{
.name = "max-transfer",
.type = QEMU_OPT_SIZE,
.help = "Maximum transfer size in bytes",
},
{
.name = "opt-write-zero",
.type = QEMU_OPT_SIZE,
.help = "Optimum write zero alignment in bytes",
},
{
.name = "max-write-zero",
.type = QEMU_OPT_SIZE,
.help = "Maximum write zero size in bytes",
},
{
.name = "opt-discard",
.type = QEMU_OPT_SIZE,
.help = "Optimum discard alignment in bytes",
},
{
.name = "max-discard",
.type = QEMU_OPT_SIZE,
.help = "Maximum discard size in bytes",
},
{ /* end of list */ }
},
};
@@ -368,8 +350,9 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
BDRVBlkdebugState *s = bs->opaque;
QemuOpts *opts;
Error *local_err = NULL;
int ret;
const char *config;
uint64_t align;
int ret;
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
@@ -380,8 +363,8 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
}
/* Read rules from config file or command line options */
s->config_file = g_strdup(qemu_opt_get(opts, "config"));
ret = read_config(s, s->config_file, options, errp);
config = qemu_opt_get(opts, "config");
ret = read_config(s, config, options, errp);
if (ret) {
goto out;
}
@@ -398,245 +381,127 @@ static int blkdebug_open(BlockDriverState *bs, QDict *options, int flags,
goto out;
}
bs->supported_write_flags = BDRV_REQ_FUA &
bs->file->bs->supported_write_flags;
bs->supported_zero_flags = (BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP) &
bs->file->bs->supported_zero_flags;
ret = -EINVAL;
/* Set alignment overrides */
s->align = qemu_opt_get_size(opts, "align", 0);
if (s->align && (s->align >= INT_MAX || !is_power_of_2(s->align))) {
error_setg(errp, "Cannot meet constraints with align %" PRIu64,
s->align);
goto out;
}
align = MAX(s->align, bs->file->bs->bl.request_alignment);
s->max_transfer = qemu_opt_get_size(opts, "max-transfer", 0);
if (s->max_transfer &&
(s->max_transfer >= INT_MAX ||
!QEMU_IS_ALIGNED(s->max_transfer, align))) {
error_setg(errp, "Cannot meet constraints with max-transfer %" PRIu64,
s->max_transfer);
goto out;
}
s->opt_write_zero = qemu_opt_get_size(opts, "opt-write-zero", 0);
if (s->opt_write_zero &&
(s->opt_write_zero >= INT_MAX ||
!QEMU_IS_ALIGNED(s->opt_write_zero, align))) {
error_setg(errp, "Cannot meet constraints with opt-write-zero %" PRIu64,
s->opt_write_zero);
goto out;
}
s->max_write_zero = qemu_opt_get_size(opts, "max-write-zero", 0);
if (s->max_write_zero &&
(s->max_write_zero >= INT_MAX ||
!QEMU_IS_ALIGNED(s->max_write_zero,
MAX(s->opt_write_zero, align)))) {
error_setg(errp, "Cannot meet constraints with max-write-zero %" PRIu64,
s->max_write_zero);
goto out;
}
s->opt_discard = qemu_opt_get_size(opts, "opt-discard", 0);
if (s->opt_discard &&
(s->opt_discard >= INT_MAX ||
!QEMU_IS_ALIGNED(s->opt_discard, align))) {
error_setg(errp, "Cannot meet constraints with opt-discard %" PRIu64,
s->opt_discard);
goto out;
}
s->max_discard = qemu_opt_get_size(opts, "max-discard", 0);
if (s->max_discard &&
(s->max_discard >= INT_MAX ||
!QEMU_IS_ALIGNED(s->max_discard,
MAX(s->opt_discard, align)))) {
error_setg(errp, "Cannot meet constraints with max-discard %" PRIu64,
s->max_discard);
goto out;
/* Set request alignment */
align = qemu_opt_get_size(opts, "align", bs->request_alignment);
if (align > 0 && align < INT_MAX && !(align & (align - 1))) {
bs->request_alignment = align;
} else {
error_setg(errp, "Invalid alignment");
ret = -EINVAL;
goto fail_unref;
}
ret = 0;
goto out;
fail_unref:
bdrv_unref_child(bs, bs->file);
out:
if (ret < 0) {
g_free(s->config_file);
}
qemu_opts_del(opts);
return ret;
}
static int rule_check(BlockDriverState *bs, uint64_t offset, uint64_t bytes)
static void error_callback_bh(void *opaque)
{
struct BlkdebugAIOCB *acb = opaque;
qemu_bh_delete(acb->bh);
acb->common.cb(acb->common.opaque, acb->ret);
qemu_aio_unref(acb);
}
static BlockAIOCB *inject_error(BlockDriverState *bs,
BlockCompletionFunc *cb, void *opaque, BlkdebugRule *rule)
{
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
int error;
bool immediately;
QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
uint64_t inject_offset = rule->options.inject.offset;
if (inject_offset == -1 ||
(bytes && inject_offset >= offset &&
inject_offset < offset + bytes))
{
break;
}
}
if (!rule || !rule->options.inject.error) {
return 0;
}
immediately = rule->options.inject.immediately;
error = rule->options.inject.error;
int error = rule->options.inject.error;
struct BlkdebugAIOCB *acb;
QEMUBH *bh;
bool immediately = rule->options.inject.immediately;
if (rule->options.inject.once) {
QSIMPLEQ_REMOVE(&s->active_rules, rule, BlkdebugRule, active_next);
remove_rule(rule);
}
if (!immediately) {
aio_co_schedule(qemu_get_current_aio_context(), qemu_coroutine_self());
qemu_coroutine_yield();
if (immediately) {
return NULL;
}
return -error;
acb = qemu_aio_get(&blkdebug_aiocb_info, bs, cb, opaque);
acb->ret = -error;
bh = aio_bh_new(bdrv_get_aio_context(bs), error_callback_bh, acb);
acb->bh = bh;
qemu_bh_schedule(bh);
return &acb->common;
}
static int coroutine_fn
blkdebug_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
static BlockAIOCB *blkdebug_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
int err;
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
/* Sanity check block layer guarantees */
assert(QEMU_IS_ALIGNED(offset, bs->bl.request_alignment));
assert(QEMU_IS_ALIGNED(bytes, bs->bl.request_alignment));
if (bs->bl.max_transfer) {
assert(bytes <= bs->bl.max_transfer);
QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
if (rule->options.inject.sector == -1 ||
(rule->options.inject.sector >= sector_num &&
rule->options.inject.sector < sector_num + nb_sectors)) {
break;
}
}
err = rule_check(bs, offset, bytes);
if (err) {
return err;
if (rule && rule->options.inject.error) {
return inject_error(bs, cb, opaque, rule);
}
return bdrv_co_preadv(bs->file, offset, bytes, qiov, flags);
return bdrv_aio_readv(bs->file->bs, sector_num, qiov, nb_sectors,
cb, opaque);
}
static int coroutine_fn
blkdebug_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
static BlockAIOCB *blkdebug_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
int err;
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
/* Sanity check block layer guarantees */
assert(QEMU_IS_ALIGNED(offset, bs->bl.request_alignment));
assert(QEMU_IS_ALIGNED(bytes, bs->bl.request_alignment));
if (bs->bl.max_transfer) {
assert(bytes <= bs->bl.max_transfer);
QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
if (rule->options.inject.sector == -1 ||
(rule->options.inject.sector >= sector_num &&
rule->options.inject.sector < sector_num + nb_sectors)) {
break;
}
}
err = rule_check(bs, offset, bytes);
if (err) {
return err;
if (rule && rule->options.inject.error) {
return inject_error(bs, cb, opaque, rule);
}
return bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);
return bdrv_aio_writev(bs->file->bs, sector_num, qiov, nb_sectors,
cb, opaque);
}
static int blkdebug_co_flush(BlockDriverState *bs)
static BlockAIOCB *blkdebug_aio_flush(BlockDriverState *bs,
BlockCompletionFunc *cb, void *opaque)
{
int err = rule_check(bs, 0, 0);
BDRVBlkdebugState *s = bs->opaque;
BlkdebugRule *rule = NULL;
if (err) {
return err;
QSIMPLEQ_FOREACH(rule, &s->active_rules, active_next) {
if (rule->options.inject.sector == -1) {
break;
}
}
return bdrv_co_flush(bs->file->bs);
if (rule && rule->options.inject.error) {
return inject_error(bs, cb, opaque, rule);
}
return bdrv_aio_flush(bs->file->bs, cb, opaque);
}
static int coroutine_fn blkdebug_co_pwrite_zeroes(BlockDriverState *bs,
int64_t offset, int bytes,
BdrvRequestFlags flags)
{
uint32_t align = MAX(bs->bl.request_alignment,
bs->bl.pwrite_zeroes_alignment);
int err;
/* Only pass through requests that are larger than requested
* preferred alignment (so that we test the fallback to writes on
* unaligned portions), and check that the block layer never hands
* us anything unaligned that crosses an alignment boundary. */
if (bytes < align) {
assert(QEMU_IS_ALIGNED(offset, align) ||
QEMU_IS_ALIGNED(offset + bytes, align) ||
DIV_ROUND_UP(offset, align) ==
DIV_ROUND_UP(offset + bytes, align));
return -ENOTSUP;
}
assert(QEMU_IS_ALIGNED(offset, align));
assert(QEMU_IS_ALIGNED(bytes, align));
if (bs->bl.max_pwrite_zeroes) {
assert(bytes <= bs->bl.max_pwrite_zeroes);
}
err = rule_check(bs, offset, bytes);
if (err) {
return err;
}
return bdrv_co_pwrite_zeroes(bs->file, offset, bytes, flags);
}
static int coroutine_fn blkdebug_co_pdiscard(BlockDriverState *bs,
int64_t offset, int bytes)
{
uint32_t align = bs->bl.pdiscard_alignment;
int err;
/* Only pass through requests that are larger than requested
* minimum alignment, and ensure that unaligned requests do not
* cross optimum discard boundaries. */
if (bytes < bs->bl.request_alignment) {
assert(QEMU_IS_ALIGNED(offset, align) ||
QEMU_IS_ALIGNED(offset + bytes, align) ||
DIV_ROUND_UP(offset, align) ==
DIV_ROUND_UP(offset + bytes, align));
return -ENOTSUP;
}
assert(QEMU_IS_ALIGNED(offset, bs->bl.request_alignment));
assert(QEMU_IS_ALIGNED(bytes, bs->bl.request_alignment));
if (align && bytes >= align) {
assert(QEMU_IS_ALIGNED(offset, align));
assert(QEMU_IS_ALIGNED(bytes, align));
}
if (bs->bl.max_pdiscard) {
assert(bytes <= bs->bl.max_pdiscard);
}
err = rule_check(bs, offset, bytes);
if (err) {
return err;
}
return bdrv_co_pdiscard(bs->file->bs, offset, bytes);
}
static int64_t coroutine_fn blkdebug_co_get_block_status(
BlockDriverState *bs, int64_t sector_num, int nb_sectors, int *pnum,
BlockDriverState **file)
{
assert(QEMU_IS_ALIGNED(sector_num | nb_sectors,
DIV_ROUND_UP(bs->bl.request_alignment,
BDRV_SECTOR_SIZE)));
return bdrv_co_get_block_status_from_file(bs, sector_num, nb_sectors,
pnum, file);
}
static void blkdebug_close(BlockDriverState *bs)
{
@@ -649,8 +514,6 @@ static void blkdebug_close(BlockDriverState *bs)
remove_rule(rule);
}
}
g_free(s->config_file);
}
static void suspend_request(BlockDriverState *bs, BlkdebugRule *rule)
@@ -730,13 +593,13 @@ static int blkdebug_debug_breakpoint(BlockDriverState *bs, const char *event,
{
BDRVBlkdebugState *s = bs->opaque;
struct BlkdebugRule *rule;
int blkdebug_event;
BlkdebugEvent blkdebug_event;
blkdebug_event = qapi_enum_parse(&BlkdebugEvent_lookup, event, -1, NULL);
if (blkdebug_event < 0) {
if (get_event_by_name(event, &blkdebug_event) < 0) {
return -ENOENT;
}
rule = g_malloc(sizeof(*rule));
*rule = (struct BlkdebugRule) {
.event = blkdebug_event,
@@ -757,7 +620,7 @@ static int blkdebug_debug_resume(BlockDriverState *bs, const char *tag)
QLIST_FOREACH_SAFE(r, &s->suspended_reqs, next, next) {
if (!strcmp(r->tag, tag)) {
qemu_coroutine_enter(r->co);
qemu_coroutine_enter(r->co, NULL);
return 0;
}
}
@@ -783,7 +646,7 @@ static int blkdebug_debug_remove_breakpoint(BlockDriverState *bs,
}
QLIST_FOREACH_SAFE(r, &s->suspended_reqs, next, r_next) {
if (!strcmp(r->tag, tag)) {
qemu_coroutine_enter(r->co);
qemu_coroutine_enter(r->co, NULL);
ret = 0;
}
}
@@ -808,9 +671,13 @@ static int64_t blkdebug_getlength(BlockDriverState *bs)
return bdrv_getlength(bs->file->bs);
}
static int blkdebug_truncate(BlockDriverState *bs, int64_t offset)
{
return bdrv_truncate(bs->file->bs, offset);
}
static void blkdebug_refresh_filename(BlockDriverState *bs, QDict *options)
{
BDRVBlkdebugState *s = bs->opaque;
QDict *opts;
const QDictEntry *e;
bool force_json = false;
@@ -831,20 +698,17 @@ static void blkdebug_refresh_filename(BlockDriverState *bs, QDict *options)
}
if (!force_json && bs->file->bs->exact_filename[0]) {
int ret = snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"blkdebug:%s:%s", s->config_file ?: "",
bs->file->bs->exact_filename);
if (ret >= sizeof(bs->exact_filename)) {
/* An overflow makes the filename unusable, so do not report any */
bs->exact_filename[0] = 0;
}
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"blkdebug:%s:%s",
qdict_get_try_str(options, "config") ?: "",
bs->file->bs->exact_filename);
}
opts = qdict_new();
qdict_put_str(opts, "driver", "blkdebug");
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("blkdebug")));
QINCREF(bs->file->bs->full_open_options);
qdict_put(opts, "image", bs->file->bs->full_open_options);
qdict_put_obj(opts, "image", QOBJECT(bs->file->bs->full_open_options));
for (e = qdict_first(options); e; e = qdict_next(options, e)) {
if (strcmp(qdict_entry_key(e), "x-image")) {
@@ -856,30 +720,6 @@ static void blkdebug_refresh_filename(BlockDriverState *bs, QDict *options)
bs->full_open_options = opts;
}
static void blkdebug_refresh_limits(BlockDriverState *bs, Error **errp)
{
BDRVBlkdebugState *s = bs->opaque;
if (s->align) {
bs->bl.request_alignment = s->align;
}
if (s->max_transfer) {
bs->bl.max_transfer = s->max_transfer;
}
if (s->opt_write_zero) {
bs->bl.pwrite_zeroes_alignment = s->opt_write_zero;
}
if (s->max_write_zero) {
bs->bl.max_pwrite_zeroes = s->max_write_zero;
}
if (s->opt_discard) {
bs->bl.pdiscard_alignment = s->opt_discard;
}
if (s->max_discard) {
bs->bl.max_pdiscard = s->max_discard;
}
}
static int blkdebug_reopen_prepare(BDRVReopenState *reopen_state,
BlockReopenQueue *queue, Error **errp)
{
@@ -890,24 +730,18 @@ static BlockDriver bdrv_blkdebug = {
.format_name = "blkdebug",
.protocol_name = "blkdebug",
.instance_size = sizeof(BDRVBlkdebugState),
.is_filter = true,
.bdrv_parse_filename = blkdebug_parse_filename,
.bdrv_file_open = blkdebug_open,
.bdrv_close = blkdebug_close,
.bdrv_reopen_prepare = blkdebug_reopen_prepare,
.bdrv_child_perm = bdrv_filter_default_perms,
.bdrv_getlength = blkdebug_getlength,
.bdrv_truncate = blkdebug_truncate,
.bdrv_refresh_filename = blkdebug_refresh_filename,
.bdrv_refresh_limits = blkdebug_refresh_limits,
.bdrv_co_preadv = blkdebug_co_preadv,
.bdrv_co_pwritev = blkdebug_co_pwritev,
.bdrv_co_flush_to_disk = blkdebug_co_flush,
.bdrv_co_pwrite_zeroes = blkdebug_co_pwrite_zeroes,
.bdrv_co_pdiscard = blkdebug_co_pdiscard,
.bdrv_co_get_block_status = blkdebug_co_get_block_status,
.bdrv_aio_readv = blkdebug_aio_readv,
.bdrv_aio_writev = blkdebug_aio_writev,
.bdrv_aio_flush = blkdebug_aio_flush,
.bdrv_debug_event = blkdebug_debug_event,
.bdrv_debug_breakpoint = blkdebug_debug_breakpoint,
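
In both variants of the blkdebug read/write paths above, the first active rule whose injection point falls inside the request (or whose point is -1, meaning "any request") decides whether an error is injected, and "once" rules are dropped after they fire. The following is a standalone sketch of that matching step using a plain array instead of QEMU's QSIMPLEQ and byte offsets as in the newer variant; the type and function names here are illustrative, not QEMU API.

#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

/* Illustrative stand-in for the inject-error options of a blkdebug rule. */
typedef struct {
    int64_t offset; /* -1 matches every request */
    int     error;  /* errno to inject, 0 = rule inactive */
    int     once;   /* drop the rule after it fires */
} InjectRule;

/* Return the errno to inject for a request [offset, offset + bytes), or 0. */
static int match_rule(InjectRule *rules, size_t n_rules,
                      uint64_t offset, uint64_t bytes)
{
    for (size_t i = 0; i < n_rules; i++) {
        InjectRule *r = &rules[i];
        int64_t p = r->offset;

        if (!r->error) {
            continue;
        }
        if (p == -1 ||
            (bytes && p >= (int64_t)offset && p < (int64_t)(offset + bytes))) {
            int err = r->error;
            if (r->once) {
                r->error = 0; /* "once" rules are removed after they match */
            }
            return err;
        }
    }
    return 0;
}

int main(void)
{
    InjectRule rules[] = {
        { .offset = 512 * 100, .error = 5 /* EIO */, .once = 1 },
    };

    printf("%d\n", match_rule(rules, 1, 512 * 90, 512 * 20)); /* matches: 5 */
    printf("%d\n", match_rule(rules, 1, 512 * 90, 512 * 20)); /* consumed: 0 */
    return 0;
}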


@@ -20,6 +20,11 @@ typedef struct Request {
QEMUBH *bh;
} Request;
/* Next request id.
This counter is global, because requests from different
block devices should not get overlapping ids. */
static uint64_t request_id;
static int blkreplay_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
@@ -37,6 +42,9 @@ static int blkreplay_open(BlockDriverState *bs, QDict *options, int flags,
ret = 0;
fail:
if (ret < 0) {
bdrv_unref_child(bs, bs->file);
}
return ret;
}
@@ -57,7 +65,7 @@ static int64_t blkreplay_getlength(BlockDriverState *bs)
static void blkreplay_bh_cb(void *opaque)
{
Request *req = opaque;
aio_co_wake(req->co);
qemu_coroutine_enter(req->co, NULL);
qemu_bh_delete(req->bh);
g_free(req);
}
@@ -73,44 +81,44 @@ static void block_request_create(uint64_t reqid, BlockDriverState *bs,
replay_block_event(req->bh, reqid);
}
static int coroutine_fn blkreplay_co_preadv(BlockDriverState *bs,
uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
static int coroutine_fn blkreplay_co_readv(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
{
uint64_t reqid = blkreplay_next_id();
int ret = bdrv_co_preadv(bs->file, offset, bytes, qiov, flags);
uint64_t reqid = request_id++;
int ret = bdrv_co_readv(bs->file->bs, sector_num, nb_sectors, qiov);
block_request_create(reqid, bs, qemu_coroutine_self());
qemu_coroutine_yield();
return ret;
}
static int coroutine_fn blkreplay_co_pwritev(BlockDriverState *bs,
uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
static int coroutine_fn blkreplay_co_writev(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, QEMUIOVector *qiov)
{
uint64_t reqid = blkreplay_next_id();
int ret = bdrv_co_pwritev(bs->file, offset, bytes, qiov, flags);
uint64_t reqid = request_id++;
int ret = bdrv_co_writev(bs->file->bs, sector_num, nb_sectors, qiov);
block_request_create(reqid, bs, qemu_coroutine_self());
qemu_coroutine_yield();
return ret;
}
static int coroutine_fn blkreplay_co_pwrite_zeroes(BlockDriverState *bs,
int64_t offset, int bytes, BdrvRequestFlags flags)
static int coroutine_fn blkreplay_co_write_zeroes(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
{
uint64_t reqid = blkreplay_next_id();
int ret = bdrv_co_pwrite_zeroes(bs->file, offset, bytes, flags);
uint64_t reqid = request_id++;
int ret = bdrv_co_write_zeroes(bs->file->bs, sector_num, nb_sectors, flags);
block_request_create(reqid, bs, qemu_coroutine_self());
qemu_coroutine_yield();
return ret;
}
static int coroutine_fn blkreplay_co_pdiscard(BlockDriverState *bs,
int64_t offset, int bytes)
static int coroutine_fn blkreplay_co_discard(BlockDriverState *bs,
int64_t sector_num, int nb_sectors)
{
uint64_t reqid = blkreplay_next_id();
int ret = bdrv_co_pdiscard(bs->file->bs, offset, bytes);
uint64_t reqid = request_id++;
int ret = bdrv_co_discard(bs->file->bs, sector_num, nb_sectors);
block_request_create(reqid, bs, qemu_coroutine_self());
qemu_coroutine_yield();
@@ -119,7 +127,7 @@ static int coroutine_fn blkreplay_co_pdiscard(BlockDriverState *bs,
static int coroutine_fn blkreplay_co_flush(BlockDriverState *bs)
{
uint64_t reqid = blkreplay_next_id();
uint64_t reqid = request_id++;
int ret = bdrv_co_flush(bs->file->bs);
block_request_create(reqid, bs, qemu_coroutine_self());
qemu_coroutine_yield();
@@ -134,14 +142,13 @@ static BlockDriver bdrv_blkreplay = {
.bdrv_file_open = blkreplay_open,
.bdrv_close = blkreplay_close,
.bdrv_child_perm = bdrv_filter_default_perms,
.bdrv_getlength = blkreplay_getlength,
.bdrv_co_preadv = blkreplay_co_preadv,
.bdrv_co_pwritev = blkreplay_co_pwritev,
.bdrv_co_readv = blkreplay_co_readv,
.bdrv_co_writev = blkreplay_co_writev,
.bdrv_co_pwrite_zeroes = blkreplay_co_pwrite_zeroes,
.bdrv_co_pdiscard = blkreplay_co_pdiscard,
.bdrv_co_write_zeroes = blkreplay_co_write_zeroes,
.bdrv_co_discard = blkreplay_co_discard,
.bdrv_co_flush = blkreplay_co_flush,
};
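
Both blkreplay variants above pair every request with an id drawn from a single counter (blkreplay_next_id() upstream, a bare global request_id++ in the older code) so that record and replay observe the same ordering across all block devices. A trivial illustration of why the counter must be global rather than per-device, in plain C with no QEMU types:

#include <stdint.h>
#include <stdio.h>

/* One shared counter: ids never collide, even when two devices interleave. */
static uint64_t request_id;

static uint64_t next_id(void)
{
    return request_id++;
}

int main(void)
{
    /* Simulate interleaved requests from two block devices. */
    printf("dev0 read  -> id %llu\n", (unsigned long long)next_id());
    printf("dev1 write -> id %llu\n", (unsigned long long)next_id());
    printf("dev0 flush -> id %llu\n", (unsigned long long)next_id());
    return 0;
}

The driver then yields the coroutine and lets the replay machinery wake it via the bottom half created in block_request_create(), which is where the deterministic ordering is actually enforced.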


@@ -14,42 +14,44 @@
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qstring.h"
#include "qemu/cutils.h"
#include "qemu/option.h"
typedef struct {
BdrvChild *test_file;
} BDRVBlkverifyState;
typedef struct BlkverifyRequest {
Coroutine *co;
BlockDriverState *bs;
typedef struct BlkverifyAIOCB BlkverifyAIOCB;
struct BlkverifyAIOCB {
BlockAIOCB common;
QEMUBH *bh;
/* Request metadata */
bool is_write;
uint64_t offset;
uint64_t bytes;
int flags;
int (*request_fn)(BdrvChild *, int64_t, unsigned int, QEMUIOVector *,
BdrvRequestFlags);
int ret; /* test image result */
int raw_ret; /* raw image result */
int64_t sector_num;
int nb_sectors;
int ret; /* first completed request's result */
unsigned int done; /* completion counter */
QEMUIOVector *qiov; /* user I/O vector */
QEMUIOVector *raw_qiov; /* cloned I/O vector for raw file */
} BlkverifyRequest;
QEMUIOVector raw_qiov; /* cloned I/O vector for raw file */
void *buf; /* buffer for raw file I/O */
static void GCC_FMT_ATTR(2, 3) blkverify_err(BlkverifyRequest *r,
void (*verify)(BlkverifyAIOCB *acb);
};
static const AIOCBInfo blkverify_aiocb_info = {
.aiocb_size = sizeof(BlkverifyAIOCB),
};
static void GCC_FMT_ATTR(2, 3) blkverify_err(BlkverifyAIOCB *acb,
const char *fmt, ...)
{
va_list ap;
va_start(ap, fmt);
fprintf(stderr, "blkverify: %s offset=%" PRId64 " bytes=%" PRId64 " ",
r->is_write ? "write" : "read", r->offset, r->bytes);
fprintf(stderr, "blkverify: %s sector_num=%" PRId64 " nb_sectors=%d ",
acb->is_write ? "write" : "read", acb->sector_num,
acb->nb_sectors);
vfprintf(stderr, fmt, ap);
fprintf(stderr, "\n");
va_end(ap);
@@ -68,7 +70,7 @@ static void blkverify_parse_filename(const char *filename, QDict *options,
if (!strstart(filename, "blkverify:", &filename)) {
/* There was no prefix; therefore, all options have to be already
present in the QDict (except for the filename) */
qdict_put_str(options, "x-image", filename);
qdict_put(options, "x-image", qstring_from_str(filename));
return;
}
@@ -85,7 +87,7 @@ static void blkverify_parse_filename(const char *filename, QDict *options,
/* TODO Allow multi-level nesting and set file.filename here */
filename = c + 1;
qdict_put_str(options, "x-image", filename);
qdict_put(options, "x-image", qstring_from_str(filename));
}
static QemuOptsList runtime_opts = {
@@ -143,6 +145,9 @@ static int blkverify_open(BlockDriverState *bs, QDict *options, int flags,
ret = 0;
fail:
if (ret < 0) {
bdrv_unref_child(bs, bs->file);
}
qemu_opts_del(opts);
return ret;
}
@@ -162,106 +167,116 @@ static int64_t blkverify_getlength(BlockDriverState *bs)
return bdrv_getlength(s->test_file->bs);
}
static void coroutine_fn blkverify_do_test_req(void *opaque)
static BlkverifyAIOCB *blkverify_aio_get(BlockDriverState *bs, bool is_write,
int64_t sector_num, QEMUIOVector *qiov,
int nb_sectors,
BlockCompletionFunc *cb,
void *opaque)
{
BlkverifyRequest *r = opaque;
BDRVBlkverifyState *s = r->bs->opaque;
BlkverifyAIOCB *acb = qemu_aio_get(&blkverify_aiocb_info, bs, cb, opaque);
r->ret = r->request_fn(s->test_file, r->offset, r->bytes, r->qiov,
r->flags);
r->done++;
qemu_coroutine_enter_if_inactive(r->co);
acb->bh = NULL;
acb->is_write = is_write;
acb->sector_num = sector_num;
acb->nb_sectors = nb_sectors;
acb->ret = -EINPROGRESS;
acb->done = 0;
acb->qiov = qiov;
acb->buf = NULL;
acb->verify = NULL;
return acb;
}
static void coroutine_fn blkverify_do_raw_req(void *opaque)
static void blkverify_aio_bh(void *opaque)
{
BlkverifyRequest *r = opaque;
BlkverifyAIOCB *acb = opaque;
r->raw_ret = r->request_fn(r->bs->file, r->offset, r->bytes, r->raw_qiov,
r->flags);
r->done++;
qemu_coroutine_enter_if_inactive(r->co);
}
static int coroutine_fn
blkverify_co_prwv(BlockDriverState *bs, BlkverifyRequest *r, uint64_t offset,
uint64_t bytes, QEMUIOVector *qiov, QEMUIOVector *raw_qiov,
int flags, bool is_write)
{
Coroutine *co_a, *co_b;
*r = (BlkverifyRequest) {
.co = qemu_coroutine_self(),
.bs = bs,
.offset = offset,
.bytes = bytes,
.qiov = qiov,
.raw_qiov = raw_qiov,
.flags = flags,
.is_write = is_write,
.request_fn = is_write ? bdrv_co_pwritev : bdrv_co_preadv,
};
co_a = qemu_coroutine_create(blkverify_do_test_req, r);
co_b = qemu_coroutine_create(blkverify_do_raw_req, r);
qemu_coroutine_enter(co_a);
qemu_coroutine_enter(co_b);
while (r->done < 2) {
qemu_coroutine_yield();
qemu_bh_delete(acb->bh);
if (acb->buf) {
qemu_iovec_destroy(&acb->raw_qiov);
qemu_vfree(acb->buf);
}
if (r->ret != r->raw_ret) {
blkverify_err(r, "return value mismatch %d != %d", r->ret, r->raw_ret);
}
return r->ret;
acb->common.cb(acb->common.opaque, acb->ret);
qemu_aio_unref(acb);
}
static int coroutine_fn
blkverify_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
static void blkverify_aio_cb(void *opaque, int ret)
{
BlkverifyRequest r;
QEMUIOVector raw_qiov;
void *buf;
ssize_t cmp_offset;
int ret;
BlkverifyAIOCB *acb = opaque;
buf = qemu_blockalign(bs->file->bs, qiov->size);
qemu_iovec_init(&raw_qiov, qiov->niov);
qemu_iovec_clone(&raw_qiov, qiov, buf);
switch (++acb->done) {
case 1:
acb->ret = ret;
break;
ret = blkverify_co_prwv(bs, &r, offset, bytes, qiov, &raw_qiov, flags,
false);
case 2:
if (acb->ret != ret) {
blkverify_err(acb, "return value mismatch %d != %d", acb->ret, ret);
}
cmp_offset = qemu_iovec_compare(qiov, &raw_qiov);
if (cmp_offset != -1) {
blkverify_err(&r, "contents mismatch at offset %" PRId64,
offset + cmp_offset);
if (acb->verify) {
acb->verify(acb);
}
acb->bh = aio_bh_new(bdrv_get_aio_context(acb->common.bs),
blkverify_aio_bh, acb);
qemu_bh_schedule(acb->bh);
break;
}
qemu_iovec_destroy(&raw_qiov);
qemu_vfree(buf);
return ret;
}
static int coroutine_fn
blkverify_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
static void blkverify_verify_readv(BlkverifyAIOCB *acb)
{
BlkverifyRequest r;
return blkverify_co_prwv(bs, &r, offset, bytes, qiov, qiov, flags, true);
ssize_t offset = qemu_iovec_compare(acb->qiov, &acb->raw_qiov);
if (offset != -1) {
blkverify_err(acb, "contents mismatch in sector %" PRId64,
acb->sector_num + (int64_t)(offset / BDRV_SECTOR_SIZE));
}
}
static int blkverify_co_flush(BlockDriverState *bs)
static BlockAIOCB *blkverify_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
BDRVBlkverifyState *s = bs->opaque;
BlkverifyAIOCB *acb = blkverify_aio_get(bs, false, sector_num, qiov,
nb_sectors, cb, opaque);
acb->verify = blkverify_verify_readv;
acb->buf = qemu_blockalign(bs->file->bs, qiov->size);
qemu_iovec_init(&acb->raw_qiov, acb->qiov->niov);
qemu_iovec_clone(&acb->raw_qiov, qiov, acb->buf);
bdrv_aio_readv(s->test_file->bs, sector_num, qiov, nb_sectors,
blkverify_aio_cb, acb);
bdrv_aio_readv(bs->file->bs, sector_num, &acb->raw_qiov, nb_sectors,
blkverify_aio_cb, acb);
return &acb->common;
}
static BlockAIOCB *blkverify_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
BDRVBlkverifyState *s = bs->opaque;
BlkverifyAIOCB *acb = blkverify_aio_get(bs, true, sector_num, qiov,
nb_sectors, cb, opaque);
bdrv_aio_writev(s->test_file->bs, sector_num, qiov, nb_sectors,
blkverify_aio_cb, acb);
bdrv_aio_writev(bs->file->bs, sector_num, qiov, nb_sectors,
blkverify_aio_cb, acb);
return &acb->common;
}
static BlockAIOCB *blkverify_aio_flush(BlockDriverState *bs,
BlockCompletionFunc *cb,
void *opaque)
{
BDRVBlkverifyState *s = bs->opaque;
/* Only flush test file, the raw file is not important */
return bdrv_co_flush(s->test_file->bs);
return bdrv_aio_flush(s->test_file->bs, cb, opaque);
}
static bool blkverify_recurse_is_first_non_filter(BlockDriverState *bs,
@@ -278,6 +293,22 @@ static bool blkverify_recurse_is_first_non_filter(BlockDriverState *bs,
return bdrv_recurse_is_first_non_filter(s->test_file->bs, candidate);
}
/* Propagate AioContext changes to ->test_file */
static void blkverify_detach_aio_context(BlockDriverState *bs)
{
BDRVBlkverifyState *s = bs->opaque;
bdrv_detach_aio_context(s->test_file->bs);
}
static void blkverify_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
BDRVBlkverifyState *s = bs->opaque;
bdrv_attach_aio_context(s->test_file->bs, new_context);
}
static void blkverify_refresh_filename(BlockDriverState *bs, QDict *options)
{
BDRVBlkverifyState *s = bs->opaque;
@@ -289,12 +320,13 @@ static void blkverify_refresh_filename(BlockDriverState *bs, QDict *options)
&& s->test_file->bs->full_open_options)
{
QDict *opts = qdict_new();
qdict_put_str(opts, "driver", "blkverify");
qdict_put_obj(opts, "driver", QOBJECT(qstring_from_str("blkverify")));
QINCREF(bs->file->bs->full_open_options);
qdict_put(opts, "raw", bs->file->bs->full_open_options);
qdict_put_obj(opts, "raw", QOBJECT(bs->file->bs->full_open_options));
QINCREF(s->test_file->bs->full_open_options);
qdict_put(opts, "test", s->test_file->bs->full_open_options);
qdict_put_obj(opts, "test",
QOBJECT(s->test_file->bs->full_open_options));
bs->full_open_options = opts;
}
@@ -302,14 +334,10 @@ static void blkverify_refresh_filename(BlockDriverState *bs, QDict *options)
if (bs->file->bs->exact_filename[0]
&& s->test_file->bs->exact_filename[0])
{
int ret = snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"blkverify:%s:%s",
bs->file->bs->exact_filename,
s->test_file->bs->exact_filename);
if (ret >= sizeof(bs->exact_filename)) {
/* An overflow makes the filename unusable, so do not report any */
bs->exact_filename[0] = 0;
}
snprintf(bs->exact_filename, sizeof(bs->exact_filename),
"blkverify:%s:%s",
bs->file->bs->exact_filename,
s->test_file->bs->exact_filename);
}
}
@@ -321,13 +349,15 @@ static BlockDriver bdrv_blkverify = {
.bdrv_parse_filename = blkverify_parse_filename,
.bdrv_file_open = blkverify_open,
.bdrv_close = blkverify_close,
.bdrv_child_perm = bdrv_filter_default_perms,
.bdrv_getlength = blkverify_getlength,
.bdrv_refresh_filename = blkverify_refresh_filename,
.bdrv_co_preadv = blkverify_co_preadv,
.bdrv_co_pwritev = blkverify_co_pwritev,
.bdrv_co_flush = blkverify_co_flush,
.bdrv_aio_readv = blkverify_aio_readv,
.bdrv_aio_writev = blkverify_aio_writev,
.bdrv_aio_flush = blkverify_aio_flush,
.bdrv_attach_aio_context = blkverify_attach_aio_context,
.bdrv_detach_aio_context = blkverify_detach_aio_context,
.is_filter = true,
.bdrv_recurse_is_first_non_filter = blkverify_recurse_is_first_non_filter,
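
The core of blkverify in both variants above is the same: the caller's I/O vector is cloned, the same read is issued to the raw and test files, and the first byte at which the two results differ is reported (qemu_iovec_compare() returns that offset, or -1 if they match). The following is a self-contained model of that comparison over plain buffers; it assumes a 512-byte sector as in QEMU's BDRV_SECTOR_SIZE and uses ordinary arrays instead of I/O vectors.

#include <inttypes.h>
#include <stddef.h>
#include <stdint.h>
#include <stdio.h>

#define SECTOR_SIZE 512

/* Return the offset of the first differing byte, or -1 if equal,
 * mimicking what qemu_iovec_compare() provides for I/O vectors. */
static int64_t first_mismatch(const uint8_t *a, const uint8_t *b, size_t len)
{
    for (size_t i = 0; i < len; i++) {
        if (a[i] != b[i]) {
            return (int64_t)i;
        }
    }
    return -1;
}

int main(void)
{
    uint8_t test[2 * SECTOR_SIZE] = {0};
    uint8_t raw[2 * SECTOR_SIZE]  = {0};
    int64_t sector_num = 100; /* start of the request, as in the AIO variant */

    raw[SECTOR_SIZE + 7] = 0xff; /* corrupt one byte in the second sector */

    int64_t off = first_mismatch(test, raw, sizeof(test));
    if (off != -1) {
        fprintf(stderr,
                "blkverify: contents mismatch in sector %" PRId64 "\n",
                sector_num + off / SECTOR_SIZE);
    }
    return 0;
}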

File diff suppressed because it is too large


@@ -27,8 +27,6 @@
#include "qemu-common.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "qemu/bswap.h"
#include "qemu/error-report.h"
/**************************************************************/
@@ -105,24 +103,9 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
struct bochs_header bochs;
int ret;
bs->file = bdrv_open_child(NULL, options, "file", bs, &child_file,
false, errp);
if (!bs->file) {
return -EINVAL;
}
bs->read_only = 1; // no write support yet
if (!bdrv_is_read_only(bs)) {
error_report("Opening bochs images without an explicit read-only=on "
"option is deprecated. Future versions will refuse to "
"open the image instead of automatically marking the "
"image read-only.");
ret = bdrv_set_read_only(bs, true, errp); /* no write support yet */
if (ret < 0) {
return ret;
}
}
ret = bdrv_pread(bs->file, 0, &bochs, sizeof(bochs));
ret = bdrv_pread(bs->file->bs, 0, &bochs, sizeof(bochs));
if (ret < 0) {
return ret;
}
@@ -156,7 +139,7 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
return -ENOMEM;
}
ret = bdrv_pread(bs->file, le32_to_cpu(bochs.header), s->catalog_bitmap,
ret = bdrv_pread(bs->file->bs, le32_to_cpu(bochs.header), s->catalog_bitmap,
s->catalog_size * 4);
if (ret < 0) {
goto fail;
@@ -204,11 +187,6 @@ fail:
return ret;
}
static void bochs_refresh_limits(BlockDriverState *bs, Error **errp)
{
bs->bl.request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O */
}
static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
{
BDRVBochsState *s = bs->opaque;
@@ -230,7 +208,7 @@ static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
(s->extent_blocks + s->bitmap_blocks));
/* read in bitmap for current extent */
ret = bdrv_pread(bs->file, bitmap_offset + (extent_offset / 8),
ret = bdrv_pread(bs->file->bs, bitmap_offset + (extent_offset / 8),
&bitmap_entry, 1);
if (ret < 0) {
return ret;
@@ -243,52 +221,38 @@ static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
return bitmap_offset + (512 * (s->bitmap_blocks + extent_offset));
}
static int coroutine_fn
bochs_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
static int bochs_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
BDRVBochsState *s = bs->opaque;
uint64_t sector_num = offset >> BDRV_SECTOR_BITS;
int nb_sectors = bytes >> BDRV_SECTOR_BITS;
uint64_t bytes_done = 0;
QEMUIOVector local_qiov;
int ret;
assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
qemu_iovec_init(&local_qiov, qiov->niov);
qemu_co_mutex_lock(&s->lock);
while (nb_sectors > 0) {
int64_t block_offset = seek_to_sector(bs, sector_num);
if (block_offset < 0) {
ret = block_offset;
goto fail;
}
qemu_iovec_reset(&local_qiov);
qemu_iovec_concat(&local_qiov, qiov, bytes_done, 512);
if (block_offset > 0) {
ret = bdrv_co_preadv(bs->file, block_offset, 512,
&local_qiov, 0);
return block_offset;
} else if (block_offset > 0) {
ret = bdrv_pread(bs->file->bs, block_offset, buf, 512);
if (ret < 0) {
goto fail;
return ret;
}
} else {
qemu_iovec_memset(&local_qiov, 0, 0, 512);
memset(buf, 0, 512);
}
nb_sectors--;
sector_num++;
bytes_done += 512;
buf += 512;
}
return 0;
}
ret = 0;
fail:
static coroutine_fn int bochs_co_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
int ret;
BDRVBochsState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = bochs_read(bs, sector_num, buf, nb_sectors);
qemu_co_mutex_unlock(&s->lock);
qemu_iovec_destroy(&local_qiov);
return ret;
}
@@ -303,9 +267,7 @@ static BlockDriver bdrv_bochs = {
.instance_size = sizeof(BDRVBochsState),
.bdrv_probe = bochs_probe,
.bdrv_open = bochs_open,
.bdrv_child_perm = bdrv_format_default_perms,
.bdrv_refresh_limits = bochs_refresh_limits,
.bdrv_co_preadv = bochs_co_preadv,
.bdrv_read = bochs_co_read,
.bdrv_close = bochs_close,
};
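
Both bochs read paths above resolve each 512-byte sector through seek_to_sector(): a negative result is an error, zero means the sector is unallocated and reads as zeroes, and a positive value is the host-file offset to read from. The following is a self-contained model of that sparse-read loop; the lookup and read callbacks are stand-ins for seek_to_sector() and bdrv_pread(), not QEMU API.

#include <stdint.h>
#include <stdio.h>
#include <string.h>

#define SECTOR_SIZE 512

/* Stand-in for seek_to_sector(): <0 error, 0 hole, >0 host offset. */
typedef int64_t (*seek_fn)(int64_t sector_num);

/* Stand-in for reading SECTOR_SIZE bytes from the image file at 'offset'. */
typedef int (*pread_fn)(int64_t offset, uint8_t *buf);

static int sparse_read(int64_t sector_num, int nb_sectors, uint8_t *buf,
                       seek_fn seek, pread_fn pread_sector)
{
    while (nb_sectors > 0) {
        int64_t block_offset = seek(sector_num);

        if (block_offset < 0) {
            return (int)block_offset;     /* lookup failed */
        } else if (block_offset > 0) {
            int ret = pread_sector(block_offset, buf);
            if (ret < 0) {
                return ret;
            }
        } else {
            memset(buf, 0, SECTOR_SIZE);  /* unallocated: reads as zeroes */
        }
        nb_sectors--;
        sector_num++;
        buf += SECTOR_SIZE;
    }
    return 0;
}

/* Toy image: even sectors are holes, odd sectors live at a fake offset. */
static int64_t demo_seek(int64_t sector_num)
{
    return (sector_num % 2) ? 4096 + sector_num * SECTOR_SIZE : 0;
}

static int demo_pread(int64_t offset, uint8_t *buf)
{
    (void)offset;
    memset(buf, 0xab, SECTOR_SIZE);
    return 0;
}

int main(void)
{
    uint8_t buf[4 * SECTOR_SIZE];
    int ret = sparse_read(10, 4, buf, demo_seek, demo_pread);

    printf("ret=%d first byte of sector 0: %02x, sector 1: %02x\n",
           ret, buf[0], buf[SECTOR_SIZE]);
    return 0;
}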


@@ -23,11 +23,9 @@
*/
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu/error-report.h"
#include "qemu-common.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "qemu/bswap.h"
#include <zlib.h>
/* Maximum compressed block size */
@@ -67,25 +65,10 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
uint32_t offsets_size, max_compressed_block_size = 1, i;
int ret;
bs->file = bdrv_open_child(NULL, options, "file", bs, &child_file,
false, errp);
if (!bs->file) {
return -EINVAL;
}
if (!bdrv_is_read_only(bs)) {
error_report("Opening cloop images without an explicit read-only=on "
"option is deprecated. Future versions will refuse to "
"open the image instead of automatically marking the "
"image read-only.");
ret = bdrv_set_read_only(bs, true, errp);
if (ret < 0) {
return ret;
}
}
bs->read_only = 1;
/* read header */
ret = bdrv_pread(bs->file, 128, &s->block_size, 4);
ret = bdrv_pread(bs->file->bs, 128, &s->block_size, 4);
if (ret < 0) {
return ret;
}
@@ -111,7 +94,7 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
return -EINVAL;
}
ret = bdrv_pread(bs->file, 128 + 4, &s->n_blocks, 4);
ret = bdrv_pread(bs->file->bs, 128 + 4, &s->n_blocks, 4);
if (ret < 0) {
return ret;
}
@@ -142,7 +125,7 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
return -ENOMEM;
}
ret = bdrv_pread(bs->file, 128 + 4 + 4, s->offsets, offsets_size);
ret = bdrv_pread(bs->file->bs, 128 + 4 + 4, s->offsets, offsets_size);
if (ret < 0) {
goto fail;
}
@@ -214,11 +197,6 @@ fail:
return ret;
}
static void cloop_refresh_limits(BlockDriverState *bs, Error **errp)
{
bs->bl.request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O */
}
static inline int cloop_read_block(BlockDriverState *bs, int block_num)
{
BDRVCloopState *s = bs->opaque;
@@ -227,7 +205,7 @@ static inline int cloop_read_block(BlockDriverState *bs, int block_num)
int ret;
uint32_t bytes = s->offsets[block_num + 1] - s->offsets[block_num];
ret = bdrv_pread(bs->file, s->offsets[block_num],
ret = bdrv_pread(bs->file->bs, s->offsets[block_num],
s->compressed_block, bytes);
if (ret != bytes) {
return -1;
@@ -251,38 +229,33 @@ static inline int cloop_read_block(BlockDriverState *bs, int block_num)
return 0;
}
static int coroutine_fn
cloop_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
static int cloop_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
BDRVCloopState *s = bs->opaque;
uint64_t sector_num = offset >> BDRV_SECTOR_BITS;
int nb_sectors = bytes >> BDRV_SECTOR_BITS;
int ret, i;
assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
qemu_co_mutex_lock(&s->lock);
int i;
for (i = 0; i < nb_sectors; i++) {
void *data;
uint32_t sector_offset_in_block =
((sector_num + i) % s->sectors_per_block),
block_num = (sector_num + i) / s->sectors_per_block;
if (cloop_read_block(bs, block_num) != 0) {
ret = -EIO;
goto fail;
return -1;
}
data = s->uncompressed_block + sector_offset_in_block * 512;
qemu_iovec_from_buf(qiov, i * 512, data, 512);
memcpy(buf + i * 512,
s->uncompressed_block + sector_offset_in_block * 512, 512);
}
return 0;
}
ret = 0;
fail:
static coroutine_fn int cloop_co_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
int ret;
BDRVCloopState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = cloop_read(bs, sector_num, buf, nb_sectors);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
@@ -300,9 +273,7 @@ static BlockDriver bdrv_cloop = {
.instance_size = sizeof(BDRVCloopState),
.bdrv_probe = cloop_probe,
.bdrv_open = cloop_open,
.bdrv_child_perm = bdrv_format_default_perms,
.bdrv_refresh_limits = cloop_refresh_limits,
.bdrv_co_preadv = cloop_co_preadv,
.bdrv_read = cloop_co_read,
.bdrv_close = cloop_close,
};
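
cloop addresses data by compressed blocks, so each requested sector in the read loop above is translated into a block number plus an offset inside the decompressed block. The arithmetic in isolation, with made-up values; sectors_per_block comes from the image header in the real driver:

#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint32_t sectors_per_block = 128; /* block_size / 512, from the header */
    int64_t sector_num = 1000;        /* first sector of the request */
    int nb_sectors = 3;

    for (int i = 0; i < nb_sectors; i++) {
        uint32_t block_num = (uint32_t)((sector_num + i) / sectors_per_block);
        uint32_t sector_offset_in_block =
            (uint32_t)((sector_num + i) % sectors_per_block);

        /* cloop_read_block(bs, block_num) decompresses this block once and
         * caches it; the sector is then copied from
         * uncompressed_block + sector_offset_in_block * 512. */
        printf("sector %lld -> block %u, offset %u bytes\n",
               (long long)(sector_num + i), block_num,
               sector_offset_in_block * 512);
    }
    return 0;
}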


@@ -13,10 +13,9 @@
*/
#include "qemu/osdep.h"
#include "qemu/cutils.h"
#include "trace.h"
#include "block/block_int.h"
#include "block/blockjob_int.h"
#include "block/blockjob.h"
#include "qapi/error.h"
#include "qapi/qmp/qerror.h"
#include "qemu/ratelimit.h"
@@ -36,35 +35,29 @@ enum {
typedef struct CommitBlockJob {
BlockJob common;
RateLimit limit;
BlockDriverState *commit_top_bs;
BlockBackend *top;
BlockBackend *base;
BlockDriverState *active;
BlockDriverState *top;
BlockDriverState *base;
BlockdevOnError on_error;
int base_flags;
int orig_overlay_flags;
char *backing_file_str;
} CommitBlockJob;
static int coroutine_fn commit_populate(BlockBackend *bs, BlockBackend *base,
int64_t offset, uint64_t bytes,
static int coroutine_fn commit_populate(BlockDriverState *bs,
BlockDriverState *base,
int64_t sector_num, int nb_sectors,
void *buf)
{
int ret = 0;
QEMUIOVector qiov;
struct iovec iov = {
.iov_base = buf,
.iov_len = bytes,
};
assert(bytes < SIZE_MAX);
qemu_iovec_init_external(&qiov, &iov, 1);
ret = blk_co_preadv(bs, offset, qiov.size, &qiov, 0);
if (ret < 0) {
ret = bdrv_read(bs, sector_num, buf, nb_sectors);
if (ret) {
return ret;
}
ret = blk_co_pwritev(base, offset, qiov.size, &qiov, 0);
if (ret < 0) {
ret = bdrv_write(base, sector_num, buf, nb_sectors);
if (ret) {
return ret;
}
@@ -79,29 +72,15 @@ static void commit_complete(BlockJob *job, void *opaque)
{
CommitBlockJob *s = container_of(job, CommitBlockJob, common);
CommitCompleteData *data = opaque;
BlockDriverState *top = blk_bs(s->top);
BlockDriverState *base = blk_bs(s->base);
BlockDriverState *commit_top_bs = s->commit_top_bs;
BlockDriverState *active = s->active;
BlockDriverState *top = s->top;
BlockDriverState *base = s->base;
BlockDriverState *overlay_bs;
int ret = data->ret;
bool remove_commit_top_bs = false;
/* Make sure commit_top_bs and top stay around until bdrv_replace_node() */
bdrv_ref(top);
bdrv_ref(commit_top_bs);
/* Remove base node parent that still uses BLK_PERM_WRITE/RESIZE before
* the normal backing chain can be restored. */
blk_unref(s->base);
if (!block_job_is_cancelled(&s->common) && ret == 0) {
/* success */
ret = bdrv_drop_intermediate(s->commit_top_bs, base,
s->backing_file_str);
} else {
/* XXX Can (or should) we somehow keep 'consistent read' blocked even
* after the failed/cancelled commit job is gone? If we already wrote
* something to base, the intermediate images aren't valid any more. */
remove_commit_top_bs = true;
ret = bdrv_drop_intermediate(active, top, base, s->backing_file_str);
}
/* restore base open flags here if appropriate (e.g., change the base back
@@ -110,87 +89,82 @@ static void commit_complete(BlockJob *job, void *opaque)
if (s->base_flags != bdrv_get_flags(base)) {
bdrv_reopen(base, s->base_flags, NULL);
}
overlay_bs = bdrv_find_overlay(active, top);
if (overlay_bs && s->orig_overlay_flags != bdrv_get_flags(overlay_bs)) {
bdrv_reopen(overlay_bs, s->orig_overlay_flags, NULL);
}
g_free(s->backing_file_str);
blk_unref(s->top);
/* If there is more than one reference to the job (e.g. if called from
* block_job_finish_sync()), block_job_completed() won't free it and
* therefore the blockers on the intermediate nodes remain. This would
* cause bdrv_set_backing_hd() to fail. */
block_job_remove_all_bdrv(job);
block_job_completed(&s->common, ret);
g_free(data);
/* If bdrv_drop_intermediate() didn't already do that, remove the commit
* filter driver from the backing chain. Do this as the final step so that
* the 'consistent read' permission can be granted. */
if (remove_commit_top_bs) {
bdrv_child_try_set_perm(commit_top_bs->backing, 0, BLK_PERM_ALL,
&error_abort);
bdrv_replace_node(commit_top_bs, backing_bs(commit_top_bs),
&error_abort);
}
bdrv_unref(commit_top_bs);
bdrv_unref(top);
}
static void coroutine_fn commit_run(void *opaque)
{
CommitBlockJob *s = opaque;
CommitCompleteData *data;
int64_t offset;
uint64_t delay_ns = 0;
BlockDriverState *top = s->top;
BlockDriverState *base = s->base;
int64_t sector_num, end;
int ret = 0;
int64_t n = 0; /* bytes */
int n = 0;
void *buf = NULL;
int bytes_written = 0;
int64_t base_len;
ret = s->common.len = blk_getlength(s->top);
ret = s->common.len = bdrv_getlength(top);
if (s->common.len < 0) {
goto out;
}
ret = base_len = blk_getlength(s->base);
ret = base_len = bdrv_getlength(base);
if (base_len < 0) {
goto out;
}
if (base_len < s->common.len) {
ret = blk_truncate(s->base, s->common.len, PREALLOC_MODE_OFF, NULL);
ret = bdrv_truncate(base, s->common.len);
if (ret) {
goto out;
}
}
buf = blk_blockalign(s->top, COMMIT_BUFFER_SIZE);
end = s->common.len >> BDRV_SECTOR_BITS;
buf = qemu_blockalign(top, COMMIT_BUFFER_SIZE);
for (offset = 0; offset < s->common.len; offset += n) {
for (sector_num = 0; sector_num < end; sector_num += n) {
uint64_t delay_ns = 0;
bool copy;
wait:
/* Note that even when no rate limit is applied we need to yield
* with no pending I/O here so that bdrv_drain_all() returns.
*/
block_job_sleep_ns(&s->common, delay_ns);
block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, delay_ns);
if (block_job_is_cancelled(&s->common)) {
break;
}
/* Copy if allocated above the base */
ret = bdrv_is_allocated_above(blk_bs(s->top), blk_bs(s->base),
offset, COMMIT_BUFFER_SIZE, &n);
ret = bdrv_is_allocated_above(top, base, sector_num,
COMMIT_BUFFER_SIZE / BDRV_SECTOR_SIZE,
&n);
copy = (ret == 1);
trace_commit_one_iteration(s, offset, n, ret);
trace_commit_one_iteration(s, sector_num, n, ret);
if (copy) {
ret = commit_populate(s->top, s->base, offset, n, buf);
bytes_written += n;
if (s->common.speed) {
delay_ns = ratelimit_calculate_delay(&s->limit, n);
if (delay_ns > 0) {
goto wait;
}
}
ret = commit_populate(top, base, sector_num, n, buf);
bytes_written += n * BDRV_SECTOR_SIZE;
}
if (ret < 0) {
BlockErrorAction action =
block_job_error_action(&s->common, false, s->on_error, -ret);
if (action == BLOCK_ERROR_ACTION_REPORT) {
if (s->on_error == BLOCKDEV_ON_ERROR_STOP ||
s->on_error == BLOCKDEV_ON_ERROR_REPORT ||
(s->on_error == BLOCKDEV_ON_ERROR_ENOSPC && ret == -ENOSPC)) {
goto out;
} else {
n = 0;
@@ -198,11 +172,7 @@ static void coroutine_fn commit_run(void *opaque)
}
}
/* Publish progress */
s->common.offset += n;
if (copy && s->common.speed) {
delay_ns = ratelimit_calculate_delay(&s->limit, n);
}
s->common.offset += n * BDRV_SECTOR_SIZE;
}
ret = 0;
@@ -223,65 +193,33 @@ static void commit_set_speed(BlockJob *job, int64_t speed, Error **errp)
error_setg(errp, QERR_INVALID_PARAMETER, "speed");
return;
}
ratelimit_set_speed(&s->limit, speed, SLICE_TIME);
ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
}
static const BlockJobDriver commit_job_driver = {
.instance_size = sizeof(CommitBlockJob),
.job_type = BLOCK_JOB_TYPE_COMMIT,
.set_speed = commit_set_speed,
.start = commit_run,
};
static int coroutine_fn bdrv_commit_top_preadv(BlockDriverState *bs,
uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
{
return bdrv_co_preadv(bs->backing, offset, bytes, qiov, flags);
}
static void bdrv_commit_top_refresh_filename(BlockDriverState *bs, QDict *opts)
{
bdrv_refresh_filename(bs->backing->bs);
pstrcpy(bs->exact_filename, sizeof(bs->exact_filename),
bs->backing->bs->filename);
}
static void bdrv_commit_top_close(BlockDriverState *bs)
{
}
static void bdrv_commit_top_child_perm(BlockDriverState *bs, BdrvChild *c,
const BdrvChildRole *role,
BlockReopenQueue *reopen_queue,
uint64_t perm, uint64_t shared,
uint64_t *nperm, uint64_t *nshared)
{
*nperm = 0;
*nshared = BLK_PERM_ALL;
}
/* Dummy node that provides consistent read to its users without requiring it
* from its backing file and that allows writes on the backing file chain. */
static BlockDriver bdrv_commit_top = {
.format_name = "commit_top",
.bdrv_co_preadv = bdrv_commit_top_preadv,
.bdrv_co_get_block_status = bdrv_co_get_block_status_from_backing,
.bdrv_refresh_filename = bdrv_commit_top_refresh_filename,
.bdrv_close = bdrv_commit_top_close,
.bdrv_child_perm = bdrv_commit_top_child_perm,
};
void commit_start(const char *job_id, BlockDriverState *bs,
BlockDriverState *base, BlockDriverState *top, int64_t speed,
BlockdevOnError on_error, const char *backing_file_str,
const char *filter_node_name, Error **errp)
void commit_start(BlockDriverState *bs, BlockDriverState *base,
BlockDriverState *top, int64_t speed,
BlockdevOnError on_error, BlockCompletionFunc *cb,
void *opaque, const char *backing_file_str, Error **errp)
{
CommitBlockJob *s;
BlockReopenQueue *reopen_queue = NULL;
int orig_overlay_flags;
int orig_base_flags;
BlockDriverState *iter;
BlockDriverState *commit_top_bs = NULL;
BlockDriverState *overlay_bs;
Error *local_err = NULL;
int ret;
if ((on_error == BLOCKDEV_ON_ERROR_STOP ||
on_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
(!bs->blk || !blk_iostatus_is_enabled(bs->blk))) {
error_setg(errp, "Invalid parameter combination");
return;
}
assert(top != bs);
if (top == base) {
@@ -289,258 +227,51 @@ void commit_start(const char *job_id, BlockDriverState *bs,
return;
}
s = block_job_create(job_id, &commit_job_driver, bs, 0, BLK_PERM_ALL,
speed, BLOCK_JOB_DEFAULT, NULL, NULL, errp);
overlay_bs = bdrv_find_overlay(bs, top);
if (overlay_bs == NULL) {
error_setg(errp, "Could not find overlay image for %s:", top->filename);
return;
}
orig_base_flags = bdrv_get_flags(base);
orig_overlay_flags = bdrv_get_flags(overlay_bs);
/* convert base & overlay_bs to r/w, if necessary */
if (!(orig_overlay_flags & BDRV_O_RDWR)) {
reopen_queue = bdrv_reopen_queue(reopen_queue, overlay_bs, NULL,
orig_overlay_flags | BDRV_O_RDWR);
}
if (!(orig_base_flags & BDRV_O_RDWR)) {
reopen_queue = bdrv_reopen_queue(reopen_queue, base, NULL,
orig_base_flags | BDRV_O_RDWR);
}
if (reopen_queue) {
bdrv_reopen_multiple(reopen_queue, &local_err);
if (local_err != NULL) {
error_propagate(errp, local_err);
return;
}
}
s = block_job_create(&commit_job_driver, bs, speed, cb, opaque, errp);
if (!s) {
return;
}
/* convert base to r/w, if necessary */
orig_base_flags = bdrv_get_flags(base);
if (!(orig_base_flags & BDRV_O_RDWR)) {
bdrv_reopen(base, orig_base_flags | BDRV_O_RDWR, &local_err);
if (local_err != NULL) {
error_propagate(errp, local_err);
goto fail;
}
}
s->base = base;
s->top = top;
s->active = bs;
/* Insert commit_top block node above top, so we can block consistent read
* on the backing chain below it */
commit_top_bs = bdrv_new_open_driver(&bdrv_commit_top, filter_node_name, 0,
errp);
if (commit_top_bs == NULL) {
goto fail;
}
if (!filter_node_name) {
commit_top_bs->implicit = true;
}
commit_top_bs->total_sectors = top->total_sectors;
bdrv_set_aio_context(commit_top_bs, bdrv_get_aio_context(top));
s->base_flags = orig_base_flags;
s->orig_overlay_flags = orig_overlay_flags;
bdrv_set_backing_hd(commit_top_bs, top, &local_err);
if (local_err) {
bdrv_unref(commit_top_bs);
commit_top_bs = NULL;
error_propagate(errp, local_err);
goto fail;
}
bdrv_replace_node(top, commit_top_bs, &local_err);
if (local_err) {
bdrv_unref(commit_top_bs);
commit_top_bs = NULL;
error_propagate(errp, local_err);
goto fail;
}
s->commit_top_bs = commit_top_bs;
bdrv_unref(commit_top_bs);
/* Block all nodes between top and base, because they will
* disappear from the chain after this operation. */
assert(bdrv_chain_contains(top, base));
for (iter = top; iter != base; iter = backing_bs(iter)) {
/* XXX BLK_PERM_WRITE needs to be allowed so we don't block ourselves
* at s->base (if writes are blocked for a node, they are also blocked
* for its backing file). The other options would be a second filter
* driver above s->base. */
ret = block_job_add_bdrv(&s->common, "intermediate node", iter, 0,
BLK_PERM_WRITE_UNCHANGED | BLK_PERM_WRITE,
errp);
if (ret < 0) {
goto fail;
}
}
ret = block_job_add_bdrv(&s->common, "base", base, 0, BLK_PERM_ALL, errp);
if (ret < 0) {
goto fail;
}
s->base = blk_new(BLK_PERM_CONSISTENT_READ
| BLK_PERM_WRITE
| BLK_PERM_RESIZE,
BLK_PERM_CONSISTENT_READ
| BLK_PERM_GRAPH_MOD
| BLK_PERM_WRITE_UNCHANGED);
ret = blk_insert_bs(s->base, base, errp);
if (ret < 0) {
goto fail;
}
/* Required permissions are already taken with block_job_add_bdrv() */
s->top = blk_new(0, BLK_PERM_ALL);
ret = blk_insert_bs(s->top, top, errp);
if (ret < 0) {
goto fail;
}
s->base_flags = orig_base_flags;
s->backing_file_str = g_strdup(backing_file_str);
s->on_error = on_error;
s->common.co = qemu_coroutine_create(commit_run);
trace_commit_start(bs, base, top, s);
block_job_start(&s->common);
return;
fail:
if (s->base) {
blk_unref(s->base);
}
if (s->top) {
blk_unref(s->top);
}
if (commit_top_bs) {
bdrv_replace_node(commit_top_bs, top, &error_abort);
}
block_job_early_fail(&s->common);
}
#define COMMIT_BUF_SIZE (2048 * BDRV_SECTOR_SIZE)
/* commit COW file into the raw image */
int bdrv_commit(BlockDriverState *bs)
{
BlockBackend *src, *backing;
BlockDriverState *backing_file_bs = NULL;
BlockDriverState *commit_top_bs = NULL;
BlockDriver *drv = bs->drv;
int64_t offset, length, backing_length;
int ro, open_flags;
int64_t n;
int ret = 0;
uint8_t *buf = NULL;
Error *local_err = NULL;
if (!drv)
return -ENOMEDIUM;
if (!bs->backing) {
return -ENOTSUP;
}
if (bdrv_op_is_blocked(bs, BLOCK_OP_TYPE_COMMIT_SOURCE, NULL) ||
bdrv_op_is_blocked(bs->backing->bs, BLOCK_OP_TYPE_COMMIT_TARGET, NULL)) {
return -EBUSY;
}
ro = bs->backing->bs->read_only;
open_flags = bs->backing->bs->open_flags;
if (ro) {
if (bdrv_reopen(bs->backing->bs, open_flags | BDRV_O_RDWR, NULL)) {
return -EACCES;
}
}
src = blk_new(BLK_PERM_CONSISTENT_READ, BLK_PERM_ALL);
backing = blk_new(BLK_PERM_WRITE | BLK_PERM_RESIZE, BLK_PERM_ALL);
ret = blk_insert_bs(src, bs, &local_err);
if (ret < 0) {
error_report_err(local_err);
goto ro_cleanup;
}
/* Insert commit_top block node above backing, so we can write to it */
backing_file_bs = backing_bs(bs);
commit_top_bs = bdrv_new_open_driver(&bdrv_commit_top, NULL, BDRV_O_RDWR,
&local_err);
if (commit_top_bs == NULL) {
error_report_err(local_err);
goto ro_cleanup;
}
bdrv_set_aio_context(commit_top_bs, bdrv_get_aio_context(backing_file_bs));
bdrv_set_backing_hd(commit_top_bs, backing_file_bs, &error_abort);
bdrv_set_backing_hd(bs, commit_top_bs, &error_abort);
ret = blk_insert_bs(backing, backing_file_bs, &local_err);
if (ret < 0) {
error_report_err(local_err);
goto ro_cleanup;
}
length = blk_getlength(src);
if (length < 0) {
ret = length;
goto ro_cleanup;
}
backing_length = blk_getlength(backing);
if (backing_length < 0) {
ret = backing_length;
goto ro_cleanup;
}
/* If our top snapshot is larger than the backing file image,
* grow the backing file image if possible. If not possible,
* we must return an error */
if (length > backing_length) {
ret = blk_truncate(backing, length, PREALLOC_MODE_OFF, &local_err);
if (ret < 0) {
error_report_err(local_err);
goto ro_cleanup;
}
}
/* blk_try_blockalign() for src will choose an alignment that works for
* backing as well, so no need to compare the alignment manually. */
buf = blk_try_blockalign(src, COMMIT_BUF_SIZE);
if (buf == NULL) {
ret = -ENOMEM;
goto ro_cleanup;
}
for (offset = 0; offset < length; offset += n) {
ret = bdrv_is_allocated(bs, offset, COMMIT_BUF_SIZE, &n);
if (ret < 0) {
goto ro_cleanup;
}
if (ret) {
ret = blk_pread(src, offset, buf, n);
if (ret < 0) {
goto ro_cleanup;
}
ret = blk_pwrite(backing, offset, buf, n, 0);
if (ret < 0) {
goto ro_cleanup;
}
}
}
if (drv->bdrv_make_empty) {
ret = drv->bdrv_make_empty(bs);
if (ret < 0) {
goto ro_cleanup;
}
blk_flush(src);
}
/*
* Make sure all data we wrote to the backing device is actually
* stable on disk.
*/
blk_flush(backing);
ret = 0;
ro_cleanup:
qemu_vfree(buf);
blk_unref(backing);
if (backing_file_bs) {
bdrv_set_backing_hd(bs, backing_file_bs, &error_abort);
}
bdrv_unref(commit_top_bs);
blk_unref(src);
if (ro) {
/* ignoring error return here */
bdrv_reopen(bs->backing->bs, open_flags & ~BDRV_O_RDWR, NULL);
}
return ret;
trace_commit_start(bs, base, top, s, s->common.co, opaque);
qemu_coroutine_enter(s->common.co, s);
}

block/crypto.c

@@ -24,12 +24,15 @@
#include "sysemu/block-backend.h"
#include "crypto/block.h"
#include "qapi/opts-visitor.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qobject-input-visitor.h"
#include "qapi-visit.h"
#include "qapi/error.h"
#include "qemu/option.h"
#include "block/crypto.h"
#define BLOCK_CRYPTO_OPT_LUKS_KEY_SECRET "key-secret"
#define BLOCK_CRYPTO_OPT_LUKS_CIPHER_ALG "cipher-alg"
#define BLOCK_CRYPTO_OPT_LUKS_CIPHER_MODE "cipher-mode"
#define BLOCK_CRYPTO_OPT_LUKS_IVGEN_ALG "ivgen-alg"
#define BLOCK_CRYPTO_OPT_LUKS_IVGEN_HASH_ALG "ivgen-hash-alg"
#define BLOCK_CRYPTO_OPT_LUKS_HASH_ALG "hash-alg"
typedef struct BlockCrypto BlockCrypto;
@@ -55,13 +58,13 @@ static ssize_t block_crypto_read_func(QCryptoBlock *block,
size_t offset,
uint8_t *buf,
size_t buflen,
void *opaque,
Error **errp)
Error **errp,
void *opaque)
{
BlockDriverState *bs = opaque;
ssize_t ret;
ret = bdrv_pread(bs->file, offset, buf, buflen);
ret = bdrv_pread(bs->file->bs, offset, buf, buflen);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not read encryption header");
return ret;
@@ -82,8 +85,8 @@ static ssize_t block_crypto_write_func(QCryptoBlock *block,
size_t offset,
const uint8_t *buf,
size_t buflen,
void *opaque,
Error **errp)
Error **errp,
void *opaque)
{
struct BlockCryptoCreateData *data = opaque;
ssize_t ret;
@@ -99,8 +102,8 @@ static ssize_t block_crypto_write_func(QCryptoBlock *block,
static ssize_t block_crypto_init_func(QCryptoBlock *block,
size_t headerlen,
void *opaque,
Error **errp)
Error **errp,
void *opaque)
{
struct BlockCryptoCreateData *data = opaque;
int ret;
@@ -131,7 +134,11 @@ static QemuOptsList block_crypto_runtime_opts_luks = {
.name = "crypto",
.head = QTAILQ_HEAD_INITIALIZER(block_crypto_runtime_opts_luks.head),
.desc = {
BLOCK_CRYPTO_OPT_DEF_LUKS_KEY_SECRET(""),
{
.name = BLOCK_CRYPTO_OPT_LUKS_KEY_SECRET,
.type = QEMU_OPT_STRING,
.help = "ID of the secret that provides the encryption key",
},
{ /* end of list */ }
},
};
@@ -146,33 +153,58 @@ static QemuOptsList block_crypto_create_opts_luks = {
.type = QEMU_OPT_SIZE,
.help = "Virtual disk size"
},
BLOCK_CRYPTO_OPT_DEF_LUKS_KEY_SECRET(""),
BLOCK_CRYPTO_OPT_DEF_LUKS_CIPHER_ALG(""),
BLOCK_CRYPTO_OPT_DEF_LUKS_CIPHER_MODE(""),
BLOCK_CRYPTO_OPT_DEF_LUKS_IVGEN_ALG(""),
BLOCK_CRYPTO_OPT_DEF_LUKS_IVGEN_HASH_ALG(""),
BLOCK_CRYPTO_OPT_DEF_LUKS_HASH_ALG(""),
BLOCK_CRYPTO_OPT_DEF_LUKS_ITER_TIME(""),
{
.name = BLOCK_CRYPTO_OPT_LUKS_KEY_SECRET,
.type = QEMU_OPT_STRING,
.help = "ID of the secret that provides the encryption key",
},
{
.name = BLOCK_CRYPTO_OPT_LUKS_CIPHER_ALG,
.type = QEMU_OPT_STRING,
.help = "Name of encryption cipher algorithm",
},
{
.name = BLOCK_CRYPTO_OPT_LUKS_CIPHER_MODE,
.type = QEMU_OPT_STRING,
.help = "Name of encryption cipher mode",
},
{
.name = BLOCK_CRYPTO_OPT_LUKS_IVGEN_ALG,
.type = QEMU_OPT_STRING,
.help = "Name of IV generator algorithm",
},
{
.name = BLOCK_CRYPTO_OPT_LUKS_IVGEN_HASH_ALG,
.type = QEMU_OPT_STRING,
.help = "Name of IV generator hash algorithm",
},
{
.name = BLOCK_CRYPTO_OPT_LUKS_HASH_ALG,
.type = QEMU_OPT_STRING,
.help = "Name of encryption hash algorithm",
},
{ /* end of list */ }
},
};
QCryptoBlockOpenOptions *
static QCryptoBlockOpenOptions *
block_crypto_open_opts_init(QCryptoBlockFormat format,
QDict *opts,
QemuOpts *opts,
Error **errp)
{
Visitor *v;
OptsVisitor *ov;
QCryptoBlockOpenOptions *ret = NULL;
Error *local_err = NULL;
Error *end_err = NULL;
ret = g_new0(QCryptoBlockOpenOptions, 1);
ret->format = format;
v = qobject_input_visitor_new_keyval(QOBJECT(opts));
ov = opts_visitor_new(opts);
visit_start_struct(v, NULL, NULL, 0, &local_err);
visit_start_struct(opts_get_visitor(ov),
NULL, NULL, 0, &local_err);
if (local_err) {
goto out;
}
@@ -180,23 +212,16 @@ block_crypto_open_opts_init(QCryptoBlockFormat format,
switch (format) {
case Q_CRYPTO_BLOCK_FORMAT_LUKS:
visit_type_QCryptoBlockOptionsLUKS_members(
v, &ret->u.luks, &local_err);
break;
case Q_CRYPTO_BLOCK_FORMAT_QCOW:
visit_type_QCryptoBlockOptionsQCow_members(
v, &ret->u.qcow, &local_err);
opts_get_visitor(ov), &ret->u.luks, &local_err);
break;
default:
error_setg(&local_err, "Unsupported block format %d", format);
break;
}
if (!local_err) {
visit_check_struct(v, &local_err);
}
visit_end_struct(v, NULL);
visit_end_struct(opts_get_visitor(ov), &end_err);
error_propagate(&local_err, end_err);
out:
if (local_err) {
@@ -204,26 +229,28 @@ block_crypto_open_opts_init(QCryptoBlockFormat format,
qapi_free_QCryptoBlockOpenOptions(ret);
ret = NULL;
}
visit_free(v);
opts_visitor_cleanup(ov);
return ret;
}
QCryptoBlockCreateOptions *
static QCryptoBlockCreateOptions *
block_crypto_create_opts_init(QCryptoBlockFormat format,
QDict *opts,
QemuOpts *opts,
Error **errp)
{
Visitor *v;
OptsVisitor *ov;
QCryptoBlockCreateOptions *ret = NULL;
Error *local_err = NULL;
Error *end_err = NULL;
ret = g_new0(QCryptoBlockCreateOptions, 1);
ret->format = format;
v = qobject_input_visitor_new_keyval(QOBJECT(opts));
ov = opts_visitor_new(opts);
visit_start_struct(v, NULL, NULL, 0, &local_err);
visit_start_struct(opts_get_visitor(ov),
NULL, NULL, 0, &local_err);
if (local_err) {
goto out;
}
@@ -231,23 +258,16 @@ block_crypto_create_opts_init(QCryptoBlockFormat format,
switch (format) {
case Q_CRYPTO_BLOCK_FORMAT_LUKS:
visit_type_QCryptoBlockCreateOptionsLUKS_members(
v, &ret->u.luks, &local_err);
break;
case Q_CRYPTO_BLOCK_FORMAT_QCOW:
visit_type_QCryptoBlockOptionsQCow_members(
v, &ret->u.qcow, &local_err);
opts_get_visitor(ov), &ret->u.luks, &local_err);
break;
default:
error_setg(&local_err, "Unsupported block format %d", format);
break;
}
if (!local_err) {
visit_check_struct(v, &local_err);
}
visit_end_struct(v, NULL);
visit_end_struct(opts_get_visitor(ov), &end_err);
error_propagate(&local_err, end_err);
out:
if (local_err) {
@@ -255,7 +275,7 @@ block_crypto_create_opts_init(QCryptoBlockFormat format,
qapi_free_QCryptoBlockCreateOptions(ret);
ret = NULL;
}
visit_free(v);
opts_visitor_cleanup(ov);
return ret;
}
@@ -273,16 +293,6 @@ static int block_crypto_open_generic(QCryptoBlockFormat format,
int ret = -EINVAL;
QCryptoBlockOpenOptions *open_opts = NULL;
unsigned int cflags = 0;
QDict *cryptoopts = NULL;
bs->file = bdrv_open_child(NULL, options, "file", bs, &child_file,
false, errp);
if (!bs->file) {
return -EINVAL;
}
bs->supported_write_flags = BDRV_REQ_FUA &
bs->file->bs->supported_write_flags;
opts = qemu_opts_create(opts_spec, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
@@ -291,9 +301,7 @@ static int block_crypto_open_generic(QCryptoBlockFormat format,
goto cleanup;
}
cryptoopts = qemu_opts_to_qdict(opts, NULL);
open_opts = block_crypto_open_opts_init(format, cryptoopts, errp);
open_opts = block_crypto_open_opts_init(format, opts, errp);
if (!open_opts) {
goto cleanup;
}
@@ -301,7 +309,7 @@ static int block_crypto_open_generic(QCryptoBlockFormat format,
if (flags & BDRV_O_NO_IO) {
cflags |= QCRYPTO_BLOCK_OPEN_NO_IO;
}
crypto->block = qcrypto_block_open(open_opts, NULL,
crypto->block = qcrypto_block_open(open_opts,
block_crypto_read_func,
bs,
cflags,
@@ -312,11 +320,11 @@ static int block_crypto_open_generic(QCryptoBlockFormat format,
goto cleanup;
}
bs->encrypted = true;
bs->encrypted = 1;
bs->valid_key = 1;
ret = 0;
cleanup:
QDECREF(cryptoopts);
qapi_free_QCryptoBlockOpenOptions(open_opts);
return ret;
}
@@ -336,16 +344,13 @@ static int block_crypto_create_generic(QCryptoBlockFormat format,
.opts = opts,
.filename = filename,
};
QDict *cryptoopts;
cryptoopts = qemu_opts_to_qdict(opts, NULL);
create_opts = block_crypto_create_opts_init(format, cryptoopts, errp);
create_opts = block_crypto_create_opts_init(format, opts, errp);
if (!create_opts) {
return -1;
}
crypto = qcrypto_block_create(create_opts, NULL,
crypto = qcrypto_block_create(create_opts,
block_crypto_init_func,
block_crypto_write_func,
&data,
@@ -358,24 +363,21 @@ static int block_crypto_create_generic(QCryptoBlockFormat format,
ret = 0;
cleanup:
QDECREF(cryptoopts);
qcrypto_block_free(crypto);
blk_unref(data.blk);
qapi_free_QCryptoBlockCreateOptions(create_opts);
return ret;
}
static int block_crypto_truncate(BlockDriverState *bs, int64_t offset,
PreallocMode prealloc, Error **errp)
static int block_crypto_truncate(BlockDriverState *bs, int64_t offset)
{
BlockCrypto *crypto = bs->opaque;
uint64_t payload_offset =
size_t payload_offset =
qcrypto_block_get_payload_offset(crypto->block);
assert(payload_offset < (INT64_MAX - offset));
offset += payload_offset;
return bdrv_truncate(bs->file, offset, prealloc, errp);
return bdrv_truncate(bs->file->bs, offset);
}
static void block_crypto_close(BlockDriverState *bs)
@@ -385,65 +387,66 @@ static void block_crypto_close(BlockDriverState *bs)
}
/*
* 1 MB bounce buffer gives good performance / memory tradeoff
* when using cache=none|directsync.
*/
#define BLOCK_CRYPTO_MAX_IO_SIZE (1024 * 1024)
#define BLOCK_CRYPTO_MAX_SECTORS 32
static coroutine_fn int
block_crypto_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
block_crypto_co_readv(BlockDriverState *bs, int64_t sector_num,
int remaining_sectors, QEMUIOVector *qiov)
{
BlockCrypto *crypto = bs->opaque;
uint64_t cur_bytes; /* number of bytes in current iteration */
int cur_nr_sectors; /* number of sectors in current iteration */
uint64_t bytes_done = 0;
uint8_t *cipher_data = NULL;
QEMUIOVector hd_qiov;
int ret = 0;
uint64_t sector_size = qcrypto_block_get_sector_size(crypto->block);
uint64_t payload_offset = qcrypto_block_get_payload_offset(crypto->block);
assert(!flags);
assert(payload_offset < INT64_MAX);
assert(QEMU_IS_ALIGNED(offset, sector_size));
assert(QEMU_IS_ALIGNED(bytes, sector_size));
size_t payload_offset =
qcrypto_block_get_payload_offset(crypto->block) / 512;
qemu_iovec_init(&hd_qiov, qiov->niov);
/* Bounce buffer because we don't wish to expose cipher text
* in qiov which points to guest memory.
/* Bounce buffer so we have a linear mem region for
* entire sector. XXX optimize so we avoid bounce
* buffer in case that qiov->niov == 1
*/
cipher_data =
qemu_try_blockalign(bs->file->bs, MIN(BLOCK_CRYPTO_MAX_IO_SIZE,
qemu_try_blockalign(bs->file->bs, MIN(BLOCK_CRYPTO_MAX_SECTORS * 512,
qiov->size));
if (cipher_data == NULL) {
ret = -ENOMEM;
goto cleanup;
}
while (bytes) {
cur_bytes = MIN(bytes, BLOCK_CRYPTO_MAX_IO_SIZE);
while (remaining_sectors) {
cur_nr_sectors = remaining_sectors;
if (cur_nr_sectors > BLOCK_CRYPTO_MAX_SECTORS) {
cur_nr_sectors = BLOCK_CRYPTO_MAX_SECTORS;
}
qemu_iovec_reset(&hd_qiov);
qemu_iovec_add(&hd_qiov, cipher_data, cur_bytes);
qemu_iovec_add(&hd_qiov, cipher_data, cur_nr_sectors * 512);
ret = bdrv_co_preadv(bs->file, payload_offset + offset + bytes_done,
cur_bytes, &hd_qiov, 0);
ret = bdrv_co_readv(bs->file->bs,
payload_offset + sector_num,
cur_nr_sectors, &hd_qiov);
if (ret < 0) {
goto cleanup;
}
if (qcrypto_block_decrypt(crypto->block, offset + bytes_done,
cipher_data, cur_bytes, NULL) < 0) {
if (qcrypto_block_decrypt(crypto->block,
sector_num,
cipher_data, cur_nr_sectors * 512,
NULL) < 0) {
ret = -EIO;
goto cleanup;
}
qemu_iovec_from_buf(qiov, bytes_done, cipher_data, cur_bytes);
qemu_iovec_from_buf(qiov, bytes_done,
cipher_data, cur_nr_sectors * 512);
bytes -= cur_bytes;
bytes_done += cur_bytes;
remaining_sectors -= cur_nr_sectors;
sector_num += cur_nr_sectors;
bytes_done += cur_nr_sectors * 512;
}
cleanup:
@@ -455,58 +458,63 @@ block_crypto_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
static coroutine_fn int
block_crypto_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
block_crypto_co_writev(BlockDriverState *bs, int64_t sector_num,
int remaining_sectors, QEMUIOVector *qiov)
{
BlockCrypto *crypto = bs->opaque;
uint64_t cur_bytes; /* number of bytes in current iteration */
int cur_nr_sectors; /* number of sectors in current iteration */
uint64_t bytes_done = 0;
uint8_t *cipher_data = NULL;
QEMUIOVector hd_qiov;
int ret = 0;
uint64_t sector_size = qcrypto_block_get_sector_size(crypto->block);
uint64_t payload_offset = qcrypto_block_get_payload_offset(crypto->block);
assert(!(flags & ~BDRV_REQ_FUA));
assert(payload_offset < INT64_MAX);
assert(QEMU_IS_ALIGNED(offset, sector_size));
assert(QEMU_IS_ALIGNED(bytes, sector_size));
size_t payload_offset =
qcrypto_block_get_payload_offset(crypto->block) / 512;
qemu_iovec_init(&hd_qiov, qiov->niov);
/* Bounce buffer because we're not permitted to touch
* contents of qiov - it points to guest memory.
/* Bounce buffer so we have a linear mem region for
* entire sector. XXX optimize so we avoid bounce
* buffer in case that qiov->niov == 1
*/
cipher_data =
qemu_try_blockalign(bs->file->bs, MIN(BLOCK_CRYPTO_MAX_IO_SIZE,
qemu_try_blockalign(bs->file->bs, MIN(BLOCK_CRYPTO_MAX_SECTORS * 512,
qiov->size));
if (cipher_data == NULL) {
ret = -ENOMEM;
goto cleanup;
}
while (bytes) {
cur_bytes = MIN(bytes, BLOCK_CRYPTO_MAX_IO_SIZE);
while (remaining_sectors) {
cur_nr_sectors = remaining_sectors;
qemu_iovec_to_buf(qiov, bytes_done, cipher_data, cur_bytes);
if (cur_nr_sectors > BLOCK_CRYPTO_MAX_SECTORS) {
cur_nr_sectors = BLOCK_CRYPTO_MAX_SECTORS;
}
if (qcrypto_block_encrypt(crypto->block, offset + bytes_done,
cipher_data, cur_bytes, NULL) < 0) {
qemu_iovec_to_buf(qiov, bytes_done,
cipher_data, cur_nr_sectors * 512);
if (qcrypto_block_encrypt(crypto->block,
sector_num,
cipher_data, cur_nr_sectors * 512,
NULL) < 0) {
ret = -EIO;
goto cleanup;
}
qemu_iovec_reset(&hd_qiov);
qemu_iovec_add(&hd_qiov, cipher_data, cur_bytes);
qemu_iovec_add(&hd_qiov, cipher_data, cur_nr_sectors * 512);
ret = bdrv_co_pwritev(bs->file, payload_offset + offset + bytes_done,
cur_bytes, &hd_qiov, flags);
ret = bdrv_co_writev(bs->file->bs,
payload_offset + sector_num,
cur_nr_sectors, &hd_qiov);
if (ret < 0) {
goto cleanup;
}
bytes -= cur_bytes;
bytes_done += cur_bytes;
remaining_sectors -= cur_nr_sectors;
sector_num += cur_nr_sectors;
bytes_done += cur_nr_sectors * 512;
}
cleanup:
@@ -516,22 +524,13 @@ block_crypto_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
return ret;
}
static void block_crypto_refresh_limits(BlockDriverState *bs, Error **errp)
{
BlockCrypto *crypto = bs->opaque;
uint64_t sector_size = qcrypto_block_get_sector_size(crypto->block);
bs->bl.request_alignment = sector_size; /* No sub-sector I/O */
}
static int64_t block_crypto_getlength(BlockDriverState *bs)
{
BlockCrypto *crypto = bs->opaque;
int64_t len = bdrv_getlength(bs->file->bs);
uint64_t offset = qcrypto_block_get_payload_offset(crypto->block);
assert(offset < INT64_MAX);
assert(offset < len);
ssize_t offset = qcrypto_block_get_payload_offset(crypto->block);
len -= offset;
@@ -564,69 +563,19 @@ static int block_crypto_create_luks(const char *filename,
filename, opts, errp);
}
static int block_crypto_get_info_luks(BlockDriverState *bs,
BlockDriverInfo *bdi)
{
BlockDriverInfo subbdi;
int ret;
ret = bdrv_get_info(bs->file->bs, &subbdi);
if (ret != 0) {
return ret;
}
bdi->unallocated_blocks_are_zero = false;
bdi->cluster_size = subbdi.cluster_size;
return 0;
}
static ImageInfoSpecific *
block_crypto_get_specific_info_luks(BlockDriverState *bs)
{
BlockCrypto *crypto = bs->opaque;
ImageInfoSpecific *spec_info;
QCryptoBlockInfo *info;
info = qcrypto_block_get_info(crypto->block, NULL);
if (!info) {
return NULL;
}
if (info->format != Q_CRYPTO_BLOCK_FORMAT_LUKS) {
qapi_free_QCryptoBlockInfo(info);
return NULL;
}
spec_info = g_new(ImageInfoSpecific, 1);
spec_info->type = IMAGE_INFO_SPECIFIC_KIND_LUKS;
spec_info->u.luks.data = g_new(QCryptoBlockInfoLUKS, 1);
*spec_info->u.luks.data = info->u.luks;
/* Blank out pointers we've just stolen to avoid double free */
memset(&info->u.luks, 0, sizeof(info->u.luks));
qapi_free_QCryptoBlockInfo(info);
return spec_info;
}
BlockDriver bdrv_crypto_luks = {
.format_name = "luks",
.instance_size = sizeof(BlockCrypto),
.bdrv_probe = block_crypto_probe_luks,
.bdrv_open = block_crypto_open_luks,
.bdrv_close = block_crypto_close,
.bdrv_child_perm = bdrv_format_default_perms,
.bdrv_create = block_crypto_create_luks,
.bdrv_truncate = block_crypto_truncate,
.create_opts = &block_crypto_create_opts_luks,
.bdrv_refresh_limits = block_crypto_refresh_limits,
.bdrv_co_preadv = block_crypto_co_preadv,
.bdrv_co_pwritev = block_crypto_co_pwritev,
.bdrv_co_readv = block_crypto_co_readv,
.bdrv_co_writev = block_crypto_co_writev,
.bdrv_getlength = block_crypto_getlength,
.bdrv_get_info = block_crypto_get_info_luks,
.bdrv_get_specific_info = block_crypto_get_specific_info_luks,
};
static void block_crypto_init(void)

block/crypto.h

@@ -1,101 +0,0 @@
/*
* QEMU block full disk encryption
*
* Copyright (c) 2015-2017 Red Hat, Inc.
*
* This library is free software; you can redistribute it and/or
* modify it under the terms of the GNU Lesser General Public
* License as published by the Free Software Foundation; either
* version 2 of the License, or (at your option) any later version.
*
* This library is distributed in the hope that it will be useful,
* but WITHOUT ANY WARRANTY; without even the implied warranty of
* MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU
* Lesser General Public License for more details.
*
* You should have received a copy of the GNU Lesser General Public
* License along with this library; if not, see <http://www.gnu.org/licenses/>.
*
*
*/
#ifndef BLOCK_CRYPTO_H__
#define BLOCK_CRYPTO_H__
#define BLOCK_CRYPTO_OPT_DEF_KEY_SECRET(prefix, helpstr) \
{ \
.name = prefix BLOCK_CRYPTO_OPT_QCOW_KEY_SECRET, \
.type = QEMU_OPT_STRING, \
.help = helpstr, \
}
#define BLOCK_CRYPTO_OPT_QCOW_KEY_SECRET "key-secret"
#define BLOCK_CRYPTO_OPT_DEF_QCOW_KEY_SECRET(prefix) \
BLOCK_CRYPTO_OPT_DEF_KEY_SECRET(prefix, \
"ID of the secret that provides the AES encryption key")
#define BLOCK_CRYPTO_OPT_LUKS_KEY_SECRET "key-secret"
#define BLOCK_CRYPTO_OPT_LUKS_CIPHER_ALG "cipher-alg"
#define BLOCK_CRYPTO_OPT_LUKS_CIPHER_MODE "cipher-mode"
#define BLOCK_CRYPTO_OPT_LUKS_IVGEN_ALG "ivgen-alg"
#define BLOCK_CRYPTO_OPT_LUKS_IVGEN_HASH_ALG "ivgen-hash-alg"
#define BLOCK_CRYPTO_OPT_LUKS_HASH_ALG "hash-alg"
#define BLOCK_CRYPTO_OPT_LUKS_ITER_TIME "iter-time"
#define BLOCK_CRYPTO_OPT_DEF_LUKS_KEY_SECRET(prefix) \
BLOCK_CRYPTO_OPT_DEF_KEY_SECRET(prefix, \
"ID of the secret that provides the keyslot passphrase")
#define BLOCK_CRYPTO_OPT_DEF_LUKS_CIPHER_ALG(prefix) \
{ \
.name = prefix BLOCK_CRYPTO_OPT_LUKS_CIPHER_ALG, \
.type = QEMU_OPT_STRING, \
.help = "Name of encryption cipher algorithm", \
}
#define BLOCK_CRYPTO_OPT_DEF_LUKS_CIPHER_MODE(prefix) \
{ \
.name = prefix BLOCK_CRYPTO_OPT_LUKS_CIPHER_MODE, \
.type = QEMU_OPT_STRING, \
.help = "Name of encryption cipher mode", \
}
#define BLOCK_CRYPTO_OPT_DEF_LUKS_IVGEN_ALG(prefix) \
{ \
.name = prefix BLOCK_CRYPTO_OPT_LUKS_IVGEN_ALG, \
.type = QEMU_OPT_STRING, \
.help = "Name of IV generator algorithm", \
}
#define BLOCK_CRYPTO_OPT_DEF_LUKS_IVGEN_HASH_ALG(prefix) \
{ \
.name = prefix BLOCK_CRYPTO_OPT_LUKS_IVGEN_HASH_ALG, \
.type = QEMU_OPT_STRING, \
.help = "Name of IV generator hash algorithm", \
}
#define BLOCK_CRYPTO_OPT_DEF_LUKS_HASH_ALG(prefix) \
{ \
.name = prefix BLOCK_CRYPTO_OPT_LUKS_HASH_ALG, \
.type = QEMU_OPT_STRING, \
.help = "Name of encryption hash algorithm", \
}
#define BLOCK_CRYPTO_OPT_DEF_LUKS_ITER_TIME(prefix) \
{ \
.name = prefix BLOCK_CRYPTO_OPT_LUKS_ITER_TIME, \
.type = QEMU_OPT_NUMBER, \
.help = "Time to spend in PBKDF in milliseconds", \
}
QCryptoBlockCreateOptions *
block_crypto_create_opts_init(QCryptoBlockFormat format,
QDict *opts,
Error **errp);
QCryptoBlockOpenOptions *
block_crypto_open_opts_init(QCryptoBlockFormat format,
QDict *opts,
Error **errp);
#endif /* BLOCK_CRYPTO_H__ */

View File

@@ -21,13 +21,12 @@
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu-common.h"
#include "qemu/error-report.h"
#include "qemu/option.h"
#include "block/block_int.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qbool.h"
#include "qapi/qmp/qstring.h"
#include "crypto/secret.h"
#include <curl/curl.h>
@@ -37,16 +36,10 @@
// #define DEBUG_VERBOSE
#ifdef DEBUG_CURL
#define DEBUG_CURL_PRINT 1
#define DPRINTF(fmt, ...) do { printf(fmt, ## __VA_ARGS__); } while (0)
#else
#define DEBUG_CURL_PRINT 0
#define DPRINTF(fmt, ...) do { } while (0)
#endif
#define DPRINTF(fmt, ...) \
do { \
if (DEBUG_CURL_PRINT) { \
fprintf(stderr, fmt, ## __VA_ARGS__); \
} \
} while (0)
#if LIBCURL_VERSION_NUM >= 0x071000
/* The multi interface timer callback was introduced in 7.16.0 */
@@ -69,7 +62,8 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
#endif
#define PROTOCOLS (CURLPROTO_HTTP | CURLPROTO_HTTPS | \
CURLPROTO_FTP | CURLPROTO_FTPS)
CURLPROTO_FTP | CURLPROTO_FTPS | \
CURLPROTO_TFTP)
#define CURL_NUM_STATES 8
#define CURL_NUM_ACB 8
@@ -77,12 +71,15 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
#define CURL_TIMEOUT_DEFAULT 5
#define CURL_TIMEOUT_MAX 10000
#define FIND_RET_NONE 0
#define FIND_RET_OK 1
#define FIND_RET_WAIT 2
#define CURL_BLOCK_OPT_URL "url"
#define CURL_BLOCK_OPT_READAHEAD "readahead"
#define CURL_BLOCK_OPT_SSLVERIFY "sslverify"
#define CURL_BLOCK_OPT_TIMEOUT "timeout"
#define CURL_BLOCK_OPT_COOKIE "cookie"
#define CURL_BLOCK_OPT_COOKIE_SECRET "cookie-secret"
#define CURL_BLOCK_OPT_USERNAME "username"
#define CURL_BLOCK_OPT_PASSWORD_SECRET "password-secret"
#define CURL_BLOCK_OPT_PROXY_USERNAME "proxy-username"
@@ -90,15 +87,13 @@ static CURLMcode __curl_multi_socket_action(CURLM *multi_handle,
struct BDRVCURLState;
static bool libcurl_initialized;
typedef struct CURLAIOCB {
Coroutine *co;
BlockAIOCB common;
QEMUBH *bh;
QEMUIOVector *qiov;
uint64_t offset;
uint64_t bytes;
int ret;
int64_t sector_num;
int nb_sectors;
size_t start;
size_t end;
@@ -116,7 +111,7 @@ typedef struct CURLState
CURL *curl;
QLIST_HEAD(, CURLSocket) sockets;
char *orig_buf;
uint64_t buf_start;
size_t buf_start;
size_t buf_off;
size_t buf_len;
char range[128];
@@ -127,7 +122,7 @@ typedef struct CURLState
typedef struct BDRVCURLState {
CURLM *multi;
QEMUTimer timer;
uint64_t len;
size_t len;
CURLState states[CURL_NUM_STATES];
char *url;
size_t readahead_size;
@@ -136,8 +131,6 @@ typedef struct BDRVCURLState {
char *cookie;
bool accept_range;
AioContext *aio_context;
QemuMutex mutex;
CoQueue free_state_waitq;
char *username;
char *password;
char *proxyusername;
@@ -149,7 +142,6 @@ static void curl_multi_do(void *arg);
static void curl_multi_read(void *arg);
#ifdef NEED_CURL_TIMER_CALLBACK
/* Called from curl_multi_do_locked, with s->mutex held. */
static int curl_timer_cb(CURLM *multi, long timeout_ms, void *opaque)
{
BDRVCURLState *s = opaque;
@@ -166,7 +158,6 @@ static int curl_timer_cb(CURLM *multi, long timeout_ms, void *opaque)
}
#endif
/* Called from curl_multi_do_locked, with s->mutex held. */
static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
void *userp, void *sp)
{
@@ -197,26 +188,25 @@ static int curl_sock_cb(CURL *curl, curl_socket_t fd, int action,
switch (action) {
case CURL_POLL_IN:
aio_set_fd_handler(s->aio_context, fd, false,
curl_multi_read, NULL, NULL, state);
curl_multi_read, NULL, state);
break;
case CURL_POLL_OUT:
aio_set_fd_handler(s->aio_context, fd, false,
NULL, curl_multi_do, NULL, state);
NULL, curl_multi_do, state);
break;
case CURL_POLL_INOUT:
aio_set_fd_handler(s->aio_context, fd, false,
curl_multi_read, curl_multi_do, NULL, state);
curl_multi_read, curl_multi_do, state);
break;
case CURL_POLL_REMOVE:
aio_set_fd_handler(s->aio_context, fd, false,
NULL, NULL, NULL, NULL);
NULL, NULL, NULL);
break;
}
return 0;
}
/* Called from curl_multi_do_locked, with s->mutex held. */
static size_t curl_header_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
{
BDRVCURLState *s = opaque;
@@ -231,7 +221,6 @@ static size_t curl_header_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
return realsize;
}
/* Called from curl_multi_do_locked, with s->mutex held. */
static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
{
CURLState *s = ((CURLState*)opaque);
@@ -259,7 +248,7 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
continue;
if ((s->buf_off >= acb->end)) {
size_t request_length = acb->bytes;
size_t request_length = acb->nb_sectors * BDRV_SECTOR_SIZE;
qemu_iovec_from_buf(acb->qiov, 0, s->orig_buf + acb->start,
acb->end - acb->start);
@@ -270,11 +259,9 @@ static size_t curl_read_cb(void *ptr, size_t size, size_t nmemb, void *opaque)
request_length - offset);
}
acb->ret = 0;
acb->common.cb(acb->common.opaque, 0);
qemu_aio_unref(acb);
s->acb[i] = NULL;
qemu_mutex_unlock(&s->s->mutex);
aio_co_wake(acb->co);
qemu_mutex_lock(&s->s->mutex);
}
}
@@ -283,19 +270,18 @@ read_end:
return size * nmemb;
}
/* Called with s->mutex held. */
static bool curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
CURLAIOCB *acb)
static int curl_find_buf(BDRVCURLState *s, size_t start, size_t len,
CURLAIOCB *acb)
{
int i;
uint64_t end = start + len;
uint64_t clamped_end = MIN(end, s->len);
uint64_t clamped_len = clamped_end - start;
size_t end = start + len;
size_t clamped_end = MIN(end, s->len);
size_t clamped_len = clamped_end - start;
for (i=0; i<CURL_NUM_STATES; i++) {
CURLState *state = &s->states[i];
uint64_t buf_end = (state->buf_start + state->buf_off);
uint64_t buf_fend = (state->buf_start + state->buf_len);
size_t buf_end = (state->buf_start + state->buf_off);
size_t buf_fend = (state->buf_start + state->buf_len);
if (!state->orig_buf)
continue;
@@ -314,8 +300,9 @@ static bool curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
if (clamped_len < len) {
qemu_iovec_memset(acb->qiov, clamped_len, 0, len - clamped_len);
}
acb->ret = 0;
return true;
acb->common.cb(acb->common.opaque, 0);
return FIND_RET_OK;
}
// Wait for unfinished chunks
@@ -333,16 +320,15 @@ static bool curl_find_buf(BDRVCURLState *s, uint64_t start, uint64_t len,
for (j=0; j<CURL_NUM_ACB; j++) {
if (!state->acb[j]) {
state->acb[j] = acb;
return true;
return FIND_RET_WAIT;
}
}
}
}
return false;
return FIND_RET_NONE;
}
/* Called with s->mutex held. */
static void curl_multi_check_completion(BDRVCURLState *s)
{
int msgs_in_queue;
@@ -384,11 +370,9 @@ static void curl_multi_check_completion(BDRVCURLState *s)
continue;
}
acb->ret = -EIO;
acb->common.cb(acb->common.opaque, -EPROTO);
qemu_aio_unref(acb);
state->acb[i] = NULL;
qemu_mutex_unlock(&s->mutex);
aio_co_wake(acb->co);
qemu_mutex_lock(&s->mutex);
}
}
@@ -398,9 +382,9 @@ static void curl_multi_check_completion(BDRVCURLState *s)
}
}
/* Called with s->mutex held. */
static void curl_multi_do_locked(CURLState *s)
static void curl_multi_do(void *arg)
{
CURLState *s = (CURLState *)arg;
CURLSocket *socket, *next_socket;
int running;
int r;
@@ -418,23 +402,12 @@ static void curl_multi_do_locked(CURLState *s)
}
}
static void curl_multi_do(void *arg)
{
CURLState *s = (CURLState *)arg;
qemu_mutex_lock(&s->s->mutex);
curl_multi_do_locked(s);
qemu_mutex_unlock(&s->s->mutex);
}
static void curl_multi_read(void *arg)
{
CURLState *s = (CURLState *)arg;
qemu_mutex_lock(&s->s->mutex);
curl_multi_do_locked(s);
curl_multi_do(arg);
curl_multi_check_completion(s->s);
qemu_mutex_unlock(&s->s->mutex);
}
static void curl_multi_timeout_do(void *arg)
@@ -447,38 +420,40 @@ static void curl_multi_timeout_do(void *arg)
return;
}
qemu_mutex_lock(&s->mutex);
curl_multi_socket_action(s->multi, CURL_SOCKET_TIMEOUT, 0, &running);
curl_multi_check_completion(s);
qemu_mutex_unlock(&s->mutex);
#else
abort();
#endif
}
/* Called with s->mutex held. */
static CURLState *curl_find_state(BDRVCURLState *s)
static CURLState *curl_init_state(BlockDriverState *bs, BDRVCURLState *s)
{
CURLState *state = NULL;
int i;
int i, j;
do {
for (i=0; i<CURL_NUM_STATES; i++) {
for (j=0; j<CURL_NUM_ACB; j++)
if (s->states[i].acb[j])
continue;
if (s->states[i].in_use)
continue;
for (i = 0; i < CURL_NUM_STATES; i++) {
if (!s->states[i].in_use) {
state = &s->states[i];
state->in_use = 1;
break;
}
}
return state;
}
if (!state) {
aio_poll(bdrv_get_aio_context(bs), true);
}
} while(!state);
static int curl_init_state(BDRVCURLState *s, CURLState *state)
{
if (!state->curl) {
state->curl = curl_easy_init();
if (!state->curl) {
return -EIO;
return NULL;
}
curl_easy_setopt(state->curl, CURLOPT_URL, s->url);
curl_easy_setopt(state->curl, CURLOPT_SSL_VERIFYPEER,
@@ -531,17 +506,11 @@ static int curl_init_state(BDRVCURLState *s, CURLState *state)
QLIST_INIT(&state->sockets);
state->s = s;
return 0;
return state;
}
/* Called with s->mutex held. */
static void curl_clean_state(CURLState *s)
{
int j;
for (j = 0; j < CURL_NUM_ACB; j++) {
assert(!s->acb[j]);
}
if (s->s->multi)
curl_multi_remove_handle(s->s->multi, s->curl);
@@ -553,14 +522,12 @@ static void curl_clean_state(CURLState *s)
}
s->in_use = 0;
qemu_co_enter_next(&s->s->free_state_waitq, &s->s->mutex);
}
static void curl_parse_filename(const char *filename, QDict *options,
Error **errp)
{
qdict_put_str(options, CURL_BLOCK_OPT_URL, filename);
qdict_put(options, CURL_BLOCK_OPT_URL, qstring_from_str(filename));
}
static void curl_detach_aio_context(BlockDriverState *bs)
@@ -568,7 +535,6 @@ static void curl_detach_aio_context(BlockDriverState *bs)
BDRVCURLState *s = bs->opaque;
int i;
qemu_mutex_lock(&s->mutex);
for (i = 0; i < CURL_NUM_STATES; i++) {
if (s->states[i].in_use) {
curl_clean_state(&s->states[i]);
@@ -584,7 +550,6 @@ static void curl_detach_aio_context(BlockDriverState *bs)
curl_multi_cleanup(s->multi);
s->multi = NULL;
}
qemu_mutex_unlock(&s->mutex);
timer_del(&s->timer);
}
@@ -637,11 +602,6 @@ static QemuOptsList runtime_opts = {
.type = QEMU_OPT_STRING,
.help = "Pass the cookie or list of cookies with each request"
},
{
.name = CURL_BLOCK_OPT_COOKIE_SECRET,
.type = QEMU_OPT_STRING,
.help = "ID of secret used as cookie passed with each request"
},
{
.name = CURL_BLOCK_OPT_USERNAME,
.type = QEMU_OPT_STRING,
@@ -676,28 +636,16 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
Error *local_err = NULL;
const char *file;
const char *cookie;
const char *cookie_secret;
double d;
const char *secretid;
const char *protocol_delimiter;
int ret;
static int inited = 0;
if (flags & BDRV_O_RDWR) {
error_setg(errp, "curl block device does not support writes");
return -EROFS;
}
if (!libcurl_initialized) {
ret = curl_global_init(CURL_GLOBAL_ALL);
if (ret) {
error_setg(errp, "libcurl initialization failed with %d", ret);
return -EIO;
}
libcurl_initialized = true;
}
qemu_mutex_init(&s->mutex);
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err) {
@@ -723,22 +671,7 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
s->sslverify = qemu_opt_get_bool(opts, CURL_BLOCK_OPT_SSLVERIFY, true);
cookie = qemu_opt_get(opts, CURL_BLOCK_OPT_COOKIE);
cookie_secret = qemu_opt_get(opts, CURL_BLOCK_OPT_COOKIE_SECRET);
if (cookie && cookie_secret) {
error_setg(errp,
"curl driver cannot handle both cookie and cookie secret");
goto out_noclean;
}
if (cookie_secret) {
s->cookie = qcrypto_secret_lookup_as_utf8(cookie_secret, errp);
if (!s->cookie) {
goto out_noclean;
}
} else {
s->cookie = g_strdup(cookie);
}
s->cookie = g_strdup(cookie);
file = qemu_opt_get(opts, CURL_BLOCK_OPT_URL);
if (file == NULL) {
@@ -746,15 +679,6 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
goto out_noclean;
}
if (!strstart(file, bs->drv->protocol_name, &protocol_delimiter) ||
!strstart(protocol_delimiter, "://", NULL))
{
error_setg(errp, "%s curl driver cannot handle the URL '%s' (does not "
"start with '%s://')", bs->drv->protocol_name, file,
bs->drv->protocol_name);
goto out_noclean;
}
s->username = g_strdup(qemu_opt_get(opts, CURL_BLOCK_OPT_USERNAME));
secretid = qemu_opt_get(opts, CURL_BLOCK_OPT_PASSWORD_SECRET);
@@ -775,23 +699,20 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
}
}
if (!inited) {
curl_global_init(CURL_GLOBAL_ALL);
inited = 1;
}
DPRINTF("CURL: Opening %s\n", file);
qemu_co_queue_init(&s->free_state_waitq);
s->aio_context = bdrv_get_aio_context(bs);
s->url = g_strdup(file);
qemu_mutex_lock(&s->mutex);
state = curl_find_state(s);
qemu_mutex_unlock(&s->mutex);
if (!state) {
state = curl_init_state(bs, s);
if (!state)
goto out_noclean;
}
// Get file size
if (curl_init_state(s, state) < 0) {
goto out;
}
s->accept_range = false;
curl_easy_setopt(state->curl, CURLOPT_NOBODY, 1);
curl_easy_setopt(state->curl, CURLOPT_HEADERFUNCTION,
@@ -799,28 +720,11 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
curl_easy_setopt(state->curl, CURLOPT_HEADERDATA, s);
if (curl_easy_perform(state->curl))
goto out;
if (curl_easy_getinfo(state->curl, CURLINFO_CONTENT_LENGTH_DOWNLOAD, &d)) {
curl_easy_getinfo(state->curl, CURLINFO_CONTENT_LENGTH_DOWNLOAD, &d);
if (d)
s->len = (size_t)d;
else if(!s->len)
goto out;
}
/* Prior to CURL 7.19.4 a return value of 0 could mean that the file size is
* not known or that the size is zero. From 7.19.4 CURL returns -1 if the size
* is not known and zero if it is really a zero-length file. */
#if LIBCURL_VERSION_NUM >= 0x071304
if (d < 0) {
pstrcpy(state->errmsg, CURL_ERROR_SIZE,
"Server didn't report file size.");
goto out;
}
#else
if (d <= 0) {
pstrcpy(state->errmsg, CURL_ERROR_SIZE,
"Unknown file size or zero-length file.");
goto out;
}
#endif
s->len = d;
if ((!strncasecmp(s->url, "http://", strlen("http://"))
|| !strncasecmp(s->url, "https://", strlen("https://")))
&& !s->accept_range) {
@@ -828,11 +732,9 @@ static int curl_open(BlockDriverState *bs, QDict *options, int flags,
"Server does not support 'range' (byte ranges).");
goto out;
}
DPRINTF("CURL: Size = %" PRIu64 "\n", s->len);
DPRINTF("CURL: Size = %zd\n", s->len);
qemu_mutex_lock(&s->mutex);
curl_clean_state(state);
qemu_mutex_unlock(&s->mutex);
curl_easy_cleanup(state->curl);
state->curl = NULL;
@@ -846,51 +748,53 @@ out:
curl_easy_cleanup(state->curl);
state->curl = NULL;
out_noclean:
qemu_mutex_destroy(&s->mutex);
g_free(s->cookie);
g_free(s->url);
g_free(s->username);
g_free(s->proxyusername);
g_free(s->proxypassword);
qemu_opts_del(opts);
return -EINVAL;
}
static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
static const AIOCBInfo curl_aiocb_info = {
.aiocb_size = sizeof(CURLAIOCB),
};
static void curl_readv_bh_cb(void *p)
{
CURLState *state;
int running;
BDRVCURLState *s = bs->opaque;
CURLAIOCB *acb = p;
BDRVCURLState *s = acb->common.bs->opaque;
uint64_t start = acb->offset;
uint64_t end;
qemu_bh_delete(acb->bh);
acb->bh = NULL;
qemu_mutex_lock(&s->mutex);
size_t start = acb->sector_num * BDRV_SECTOR_SIZE;
size_t end;
// In case we have the requested data already (e.g. read-ahead),
// we can just call the callback and be done.
if (curl_find_buf(s, start, acb->bytes, acb)) {
goto out;
switch (curl_find_buf(s, start, acb->nb_sectors * BDRV_SECTOR_SIZE, acb)) {
case FIND_RET_OK:
qemu_aio_unref(acb);
// fall through
case FIND_RET_WAIT:
return;
default:
break;
}
// No cache found, so let's start a new request
for (;;) {
state = curl_find_state(s);
if (state) {
break;
}
qemu_co_queue_wait(&s->free_state_waitq, &s->mutex);
}
if (curl_init_state(s, state) < 0) {
curl_clean_state(state);
acb->ret = -EIO;
goto out;
state = curl_init_state(acb->common.bs, s);
if (!state) {
acb->common.cb(acb->common.opaque, -EIO);
qemu_aio_unref(acb);
return;
}
acb->start = 0;
acb->end = MIN(acb->bytes, s->len - start);
acb->end = MIN(acb->nb_sectors * BDRV_SECTOR_SIZE, s->len - start);
state->buf_off = 0;
g_free(state->orig_buf);
@@ -900,41 +804,38 @@ static void curl_setup_preadv(BlockDriverState *bs, CURLAIOCB *acb)
state->orig_buf = g_try_malloc(state->buf_len);
if (state->buf_len && state->orig_buf == NULL) {
curl_clean_state(state);
acb->ret = -ENOMEM;
goto out;
acb->common.cb(acb->common.opaque, -ENOMEM);
qemu_aio_unref(acb);
return;
}
state->acb[0] = acb;
snprintf(state->range, 127, "%" PRIu64 "-%" PRIu64, start, end);
DPRINTF("CURL (AIO): Reading %" PRIu64 " at %" PRIu64 " (%s)\n",
acb->bytes, start, state->range);
snprintf(state->range, 127, "%zd-%zd", start, end);
DPRINTF("CURL (AIO): Reading %llu at %zd (%s)\n",
(acb->nb_sectors * BDRV_SECTOR_SIZE), start, state->range);
curl_easy_setopt(state->curl, CURLOPT_RANGE, state->range);
curl_multi_add_handle(s->multi, state->curl);
/* Tell curl it needs to kick things off */
curl_multi_socket_action(s->multi, CURL_SOCKET_TIMEOUT, 0, &running);
out:
qemu_mutex_unlock(&s->mutex);
}
static int coroutine_fn curl_co_preadv(BlockDriverState *bs,
uint64_t offset, uint64_t bytes, QEMUIOVector *qiov, int flags)
static BlockAIOCB *curl_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
CURLAIOCB acb = {
.co = qemu_coroutine_self(),
.ret = -EINPROGRESS,
.qiov = qiov,
.offset = offset,
.bytes = bytes
};
CURLAIOCB *acb;
curl_setup_preadv(bs, &acb);
while (acb.ret == -EINPROGRESS) {
qemu_coroutine_yield();
}
return acb.ret;
acb = qemu_aio_get(&curl_aiocb_info, bs, cb, opaque);
acb->qiov = qiov;
acb->sector_num = sector_num;
acb->nb_sectors = nb_sectors;
acb->bh = aio_bh_new(bdrv_get_aio_context(bs), curl_readv_bh_cb, acb);
qemu_bh_schedule(acb->bh);
return &acb->common;
}
static void curl_close(BlockDriverState *bs)
@@ -943,13 +844,9 @@ static void curl_close(BlockDriverState *bs)
DPRINTF("CURL: Close\n");
curl_detach_aio_context(bs);
qemu_mutex_destroy(&s->mutex);
g_free(s->cookie);
g_free(s->url);
g_free(s->username);
g_free(s->proxyusername);
g_free(s->proxypassword);
}
static int64_t curl_getlength(BlockDriverState *bs)
@@ -968,7 +865,7 @@ static BlockDriver bdrv_http = {
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_co_preadv = curl_co_preadv,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
@@ -984,7 +881,7 @@ static BlockDriver bdrv_https = {
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_co_preadv = curl_co_preadv,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
@@ -1000,7 +897,7 @@ static BlockDriver bdrv_ftp = {
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_co_preadv = curl_co_preadv,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
@@ -1016,7 +913,23 @@ static BlockDriver bdrv_ftps = {
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_co_preadv = curl_co_preadv,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
};
static BlockDriver bdrv_tftp = {
.format_name = "tftp",
.protocol_name = "tftp",
.instance_size = sizeof(BDRVCURLState),
.bdrv_parse_filename = curl_parse_filename,
.bdrv_file_open = curl_open,
.bdrv_close = curl_close,
.bdrv_getlength = curl_getlength,
.bdrv_aio_readv = curl_aio_readv,
.bdrv_detach_aio_context = curl_detach_aio_context,
.bdrv_attach_aio_context = curl_attach_aio_context,
@@ -1028,6 +941,7 @@ static void curl_block_init(void)
bdrv_register(&bdrv_https);
bdrv_register(&bdrv_ftp);
bdrv_register(&bdrv_ftps);
bdrv_register(&bdrv_tftp);
}
block_init(curl_block_init);

block/dictzip.c (new file, 586 lines)

@@ -0,0 +1,586 @@
/*
* DictZip Block driver for dictzip enabled gzip files
*
* Use the "dictzip" tool from the "dictd" package to create gzip files that
* contain the extra DictZip headers.
*
* dictzip(1) is a compression program which creates compressed files in the
* gzip format (see RFC 1952). However, unlike gzip(1), dictzip(1) compresses
* the file in pieces and stores an index to the pieces in the gzip header.
* This allows random access to the file at the granularity of the compressed
* pieces (currently about 64kB) while maintaining good compression ratios
* (within 5% of the expected ratio for dictionary data).
* dictd(8) uses files stored in this format.
*
* For details on DictZip see http://dict.org/.
*
* Copyright (c) 2009 Alexander Graf <agraf@suse.de>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu-common.h"
#include "block/block_int.h"
#include <zlib.h>
// #define DEBUG
#ifdef DEBUG
#define dprintf(fmt, ...) do { printf("dzip: " fmt, ## __VA_ARGS__); } while (0)
#else
#define dprintf(fmt, ...) do { } while (0)
#endif
#define SECTOR_SIZE 512
#define Z_STREAM_COUNT 4
#define CACHE_COUNT 20
/* magic values */
#define GZ_MAGIC1 0x1f
#define GZ_MAGIC2 0x8b
#define DZ_MAGIC1 'R'
#define DZ_MAGIC2 'A'
#define GZ_FEXTRA 0x04 /* Optional field (random access index) */
#define GZ_FNAME 0x08 /* Original name */
#define GZ_COMMENT 0x10 /* Zero-terminated, human-readable comment */
#define GZ_FHCRC 0x02 /* Header CRC16 */
/* offsets */
#define GZ_ID 0 /* GZ_MAGIC (16bit) */
#define GZ_FLG 3 /* FLaGs (see above) */
#define GZ_XLEN 10 /* eXtra LENgth (16bit) */
#define GZ_SI 12 /* Subfield ID (16bit) */
#define GZ_VERSION 16 /* Version for subfield format */
#define GZ_CHUNKSIZE 18 /* Chunk size (16bit) */
#define GZ_CHUNKCNT 20 /* Number of chunks (16bit) */
#define GZ_RNDDATA 22 /* Random access data (16bit) */
#define GZ_99_CHUNKSIZE 18 /* Chunk size (32bit) */
#define GZ_99_CHUNKCNT 22 /* Number of chunks (32bit) */
#define GZ_99_FILESIZE 26 /* Size of unpacked file (64bit) */
#define GZ_99_RNDDATA 34 /* Random access data (32bit) */
struct BDRVDictZipState;
typedef struct DictZipAIOCB {
BlockAIOCB common;
struct BDRVDictZipState *s;
QEMUIOVector *qiov; /* QIOV of the original request */
QEMUIOVector *qiov_gz; /* QIOV of the gz subrequest */
QEMUBH *bh; /* BH for cache */
z_stream *zStream; /* stream to use for decoding */
int zStream_id; /* stream id of the above pointer */
size_t start; /* offset into the uncompressed file */
size_t len; /* uncompressed bytes to read */
uint8_t *gzipped; /* the gzipped data */
uint8_t *buf; /* cached result */
size_t gz_len; /* amount of gzip data */
size_t gz_start; /* uncompressed starting point of gzip data */
uint64_t offset; /* offset for "start" into the uncompressed chunk */
int chunks_len; /* amount of uncompressed data in all gzip data */
} DictZipAIOCB;
typedef struct dict_cache {
size_t start;
size_t len;
uint8_t *buf;
} DictCache;
typedef struct BDRVDictZipState {
BlockDriverState *hd;
z_stream zStream[Z_STREAM_COUNT];
DictCache cache[CACHE_COUNT];
int cache_index;
uint8_t stream_in_use;
uint64_t chunk_len;
uint32_t chunk_cnt;
uint16_t *chunks;
uint32_t *chunks32;
uint64_t *offsets;
int64_t file_len;
} BDRVDictZipState;
static int start_zStream(z_stream *zStream)
{
zStream->zalloc = NULL;
zStream->zfree = NULL;
zStream->opaque = NULL;
zStream->next_in = 0;
zStream->avail_in = 0;
zStream->next_out = NULL;
zStream->avail_out = 0;
return inflateInit2( zStream, -15 );
}
static QemuOptsList runtime_opts = {
.name = "dzip",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "filename",
.type = QEMU_OPT_STRING,
.help = "URL to the dictzip file",
},
{ /* end of list */ }
},
};
static int dictzip_open(BlockDriverState *bs, QDict *options, int flags, Error **errp)
{
BDRVDictZipState *s = bs->opaque;
const char *err = "Unknown (read error?)";
uint8_t magic[2];
char buf[100];
uint8_t header_flags;
uint16_t chunk_len16;
uint16_t chunk_cnt16;
uint32_t chunk_len32;
uint16_t header_ver;
uint16_t tmp_short;
uint64_t offset;
int chunks_len;
int headerLength = GZ_XLEN - 1;
int rnd_offs;
int ret;
int i;
QemuOpts *opts;
Error *local_err = NULL;
const char *filename;
opts = qemu_opts_create(&runtime_opts, NULL, 0, &error_abort);
qemu_opts_absorb_qdict(opts, options, &local_err);
if (local_err != NULL) {
error_propagate(errp, local_err);
ret = -EINVAL;
goto fail;
}
filename = qemu_opt_get(opts, "filename");
if (!strncmp(filename, "dzip://", 7))
filename += 7;
else if (!strncmp(filename, "dzip:", 5))
filename += 5;
ret = bdrv_open(&s->hd, filename, NULL, NULL, flags | BDRV_O_PROTOCOL, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
qemu_opts_del(opts);
return ret;
}
/* initialize zlib streams */
for (i = 0; i < Z_STREAM_COUNT; i++) {
if (start_zStream( &s->zStream[i] ) != Z_OK) {
err = s->zStream[i].msg;
goto fail;
}
}
/* gzip header */
if (bdrv_pread(s->hd, GZ_ID, &magic, sizeof(magic)) != sizeof(magic))
goto fail;
if (!((magic[0] == GZ_MAGIC1) && (magic[1] == GZ_MAGIC2))) {
err = "No gzip file";
goto fail;
}
/* dzip header */
if (bdrv_pread(s->hd, GZ_FLG, &header_flags, 1) != 1)
goto fail;
if (!(header_flags & GZ_FEXTRA)) {
err = "Not a dictzip file (wrong flags)";
goto fail;
}
/* extra length */
if (bdrv_pread(s->hd, GZ_XLEN, &tmp_short, 2) != 2)
goto fail;
headerLength += le16_to_cpu(tmp_short) + 2;
/* DictZip magic */
if (bdrv_pread(s->hd, GZ_SI, &magic, 2) != 2)
goto fail;
if (magic[0] != DZ_MAGIC1 || magic[1] != DZ_MAGIC2) {
err = "Not a dictzip file (missing extra magic)";
goto fail;
}
/* DictZip version */
if (bdrv_pread(s->hd, GZ_VERSION, &header_ver, 2) != 2)
goto fail;
header_ver = le16_to_cpu(header_ver);
switch (header_ver) {
case 1: /* Normal DictZip */
/* number of chunks */
if (bdrv_pread(s->hd, GZ_CHUNKSIZE, &chunk_len16, 2) != 2)
goto fail;
s->chunk_len = le16_to_cpu(chunk_len16);
/* chunk count */
if (bdrv_pread(s->hd, GZ_CHUNKCNT, &chunk_cnt16, 2) != 2)
goto fail;
s->chunk_cnt = le16_to_cpu(chunk_cnt16);
chunks_len = sizeof(short) * s->chunk_cnt;
rnd_offs = GZ_RNDDATA;
break;
case 99: /* Special Alex pigz version */
/* number of chunks */
if (bdrv_pread(s->hd, GZ_99_CHUNKSIZE, &chunk_len32, 4) != 4)
goto fail;
dprintf("chunk len [%#x] = %d\n", GZ_99_CHUNKSIZE, chunk_len32);
s->chunk_len = le32_to_cpu(chunk_len32);
/* chunk count */
if (bdrv_pread(s->hd, GZ_99_CHUNKCNT, &s->chunk_cnt, 4) != 4)
goto fail;
s->chunk_cnt = le32_to_cpu(s->chunk_cnt);
dprintf("chunk len | count = %"PRId64" | %d\n", s->chunk_len, s->chunk_cnt);
/* file size */
if (bdrv_pread(s->hd, GZ_99_FILESIZE, &s->file_len, 8) != 8)
goto fail;
s->file_len = le64_to_cpu(s->file_len);
chunks_len = sizeof(int) * s->chunk_cnt;
rnd_offs = GZ_99_RNDDATA;
break;
default:
err = "Invalid DictZip version";
goto fail;
}
/* random access data */
s->chunks = g_malloc(chunks_len);
if (header_ver == 99)
s->chunks32 = (uint32_t *)s->chunks;
if (bdrv_pread(s->hd, rnd_offs, s->chunks, chunks_len) != chunks_len)
goto fail;
/* orig filename */
if (header_flags & GZ_FNAME) {
if (bdrv_pread(s->hd, headerLength + 1, buf, sizeof(buf)) != sizeof(buf))
goto fail;
buf[sizeof(buf) - 1] = '\0';
headerLength += strlen(buf) + 1;
if (strlen(buf) == sizeof(buf))
goto fail;
dprintf("filename: %s\n", buf);
}
/* comment field */
if (header_flags & GZ_COMMENT) {
if (bdrv_pread(s->hd, headerLength, buf, sizeof(buf)) != sizeof(buf))
goto fail;
buf[sizeof(buf) - 1] = '\0';
headerLength += strlen(buf) + 1;
if (strlen(buf) == sizeof(buf))
goto fail;
dprintf("comment: %s\n", buf);
}
if (header_flags & GZ_FHCRC)
headerLength += 2;
/* uncompressed file length*/
if (!s->file_len) {
uint32_t file_len;
if (bdrv_pread(s->hd, bdrv_getlength(s->hd) - 4, &file_len, 4) != 4)
goto fail;
s->file_len = le32_to_cpu(file_len);
}
/* compute offsets */
s->offsets = g_malloc(sizeof( *s->offsets ) * s->chunk_cnt);
for (offset = headerLength + 1, i = 0; i < s->chunk_cnt; i++) {
s->offsets[i] = offset;
switch (header_ver) {
case 1:
offset += le16_to_cpu(s->chunks[i]);
break;
case 99:
offset += le32_to_cpu(s->chunks32[i]);
break;
}
dprintf("chunk %#"PRIx64" - %#"PRIx64" = offset %#"PRIx64" -> %#"PRIx64"\n", i * s->chunk_len, (i+1) * s->chunk_len, s->offsets[i], offset);
}
qemu_opts_del(opts);
return 0;
fail:
fprintf(stderr, "DictZip: Error opening file: %s\n", err);
bdrv_unref(s->hd);
if (s->chunks)
g_free(s->chunks);
qemu_opts_del(opts);
return -EINVAL;
}
/* This callback gets invoked when we have the result in cache already */
static void dictzip_cache_cb(void *opaque)
{
DictZipAIOCB *acb = (DictZipAIOCB *)opaque;
qemu_iovec_from_buf(acb->qiov, 0, acb->buf, acb->len);
acb->common.cb(acb->common.opaque, 0);
qemu_bh_delete(acb->bh);
qemu_aio_unref(acb);
}
/* This callback gets invoked by the underlying block reader when we have
* all compressed data. We uncompress in here. */
static void dictzip_read_cb(void *opaque, int ret)
{
DictZipAIOCB *acb = (DictZipAIOCB *)opaque;
struct BDRVDictZipState *s = acb->s;
uint8_t *buf;
DictCache *cache;
int r, i;
buf = g_malloc(acb->chunks_len);
/* try to find zlib stream for decoding */
do {
for (i = 0; i < Z_STREAM_COUNT; i++) {
if (!(s->stream_in_use & (1 << i))) {
s->stream_in_use |= (1 << i);
acb->zStream_id = i;
acb->zStream = &s->zStream[i];
break;
}
}
} while(!acb->zStream);
/* sure, we could handle more streams, but this callback should be single
threaded and when it's not, we really want to know! */
assert(i == 0);
/* uncompress the chunk */
acb->zStream->next_in = acb->gzipped;
acb->zStream->avail_in = acb->gz_len;
acb->zStream->next_out = buf;
acb->zStream->avail_out = acb->chunks_len;
r = inflate( acb->zStream, Z_PARTIAL_FLUSH );
if ( (r != Z_OK) && (r != Z_STREAM_END) )
fprintf(stderr, "Error inflating: [%d] %s\n", r, acb->zStream->msg);
if ( r == Z_STREAM_END )
inflateReset(acb->zStream);
dprintf("inflating [%d] left: %d | %d bytes\n", r, acb->zStream->avail_in, acb->zStream->avail_out);
s->stream_in_use &= ~(1 << acb->zStream_id);
/* notify the caller */
qemu_iovec_from_buf(acb->qiov, 0, buf + acb->offset, acb->len);
acb->common.cb(acb->common.opaque, 0);
/* fill the cache */
cache = &s->cache[s->cache_index];
s->cache_index++;
if (s->cache_index == CACHE_COUNT)
s->cache_index = 0;
cache->len = 0;
if (cache->buf)
g_free(cache->buf);
cache->start = acb->gz_start;
cache->buf = buf;
cache->len = acb->chunks_len;
/* free occupied resources */
g_free(acb->qiov_gz);
qemu_aio_unref(acb);
}
static const AIOCBInfo dictzip_aiocb_info = {
.aiocb_size = sizeof(DictZipAIOCB),
};
/* This is where we get a request from a caller to read something */
static BlockAIOCB *dictzip_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
BDRVDictZipState *s = bs->opaque;
DictZipAIOCB *acb;
QEMUIOVector *qiov_gz;
struct iovec *iov;
uint8_t *buf;
size_t start = sector_num * SECTOR_SIZE;
size_t len = nb_sectors * SECTOR_SIZE;
size_t end = start + len;
size_t gz_start;
size_t gz_len;
int64_t gz_sector_num;
int gz_nb_sectors;
int first_chunk, last_chunk;
int first_offset;
int i;
acb = qemu_aio_get(&dictzip_aiocb_info, bs, cb, opaque);
if (!acb)
return NULL;
/* Search Cache */
for (i = 0; i < CACHE_COUNT; i++) {
if (!s->cache[i].len)
continue;
if ((start >= s->cache[i].start) &&
(end <= (s->cache[i].start + s->cache[i].len))) {
acb->buf = s->cache[i].buf + (start - s->cache[i].start);
acb->len = len;
acb->qiov = qiov;
acb->bh = qemu_bh_new(dictzip_cache_cb, acb);
qemu_bh_schedule(acb->bh);
return &acb->common;
}
}
/* No cache, so let's decode */
/* We need to read these chunks */
first_chunk = start / s->chunk_len;
first_offset = start - first_chunk * s->chunk_len;
last_chunk = end / s->chunk_len;
gz_start = s->offsets[first_chunk];
gz_len = 0;
for (i = first_chunk; i <= last_chunk; i++) {
if (s->chunks32)
gz_len += le32_to_cpu(s->chunks32[i]);
else
gz_len += le16_to_cpu(s->chunks[i]);
}
gz_sector_num = gz_start / SECTOR_SIZE;
gz_nb_sectors = (gz_len / SECTOR_SIZE);
/* account for tail and heads */
while ((gz_start + gz_len) > ((gz_sector_num + gz_nb_sectors) * SECTOR_SIZE))
gz_nb_sectors++;
/* Allocate qiov, iov and buf in one chunk so we only need to free qiov */
qiov_gz = g_malloc0(sizeof(QEMUIOVector) + sizeof(struct iovec) +
(gz_nb_sectors * SECTOR_SIZE));
iov = (struct iovec *)(((char *)qiov_gz) + sizeof(QEMUIOVector));
buf = ((uint8_t *)iov) + sizeof(struct iovec *);
/* Kick off the read by the backing file, so we can start decompressing */
iov->iov_base = (void *)buf;
iov->iov_len = gz_nb_sectors * 512;
qemu_iovec_init_external(qiov_gz, iov, 1);
dprintf("read %zd - %zd => %zd - %zd\n", start, end, gz_start, gz_start + gz_len);
acb->s = s;
acb->qiov = qiov;
acb->qiov_gz = qiov_gz;
acb->start = start;
acb->len = len;
acb->gzipped = buf + (gz_start % SECTOR_SIZE);
acb->gz_len = gz_len;
acb->gz_start = first_chunk * s->chunk_len;
acb->offset = first_offset;
acb->chunks_len = (last_chunk - first_chunk + 1) * s->chunk_len;
return bdrv_aio_readv(s->hd, gz_sector_num, qiov_gz, gz_nb_sectors,
dictzip_read_cb, acb);
}
static void dictzip_close(BlockDriverState *bs)
{
BDRVDictZipState *s = bs->opaque;
int i;
for (i = 0; i < CACHE_COUNT; i++) {
if (!s->cache[i].len)
continue;
g_free(s->cache[i].buf);
}
for (i = 0; i < Z_STREAM_COUNT; i++) {
inflateEnd(&s->zStream[i]);
}
if (s->chunks)
g_free(s->chunks);
if (s->offsets)
g_free(s->offsets);
dprintf("Close\n");
}
static int64_t dictzip_getlength(BlockDriverState *bs)
{
BDRVDictZipState *s = bs->opaque;
dprintf("getlength -> %ld\n", s->file_len);
return s->file_len;
}
static BlockDriver bdrv_dictzip = {
.format_name = "dzip",
.protocol_name = "dzip",
.instance_size = sizeof(BDRVDictZipState),
.bdrv_file_open = dictzip_open,
.bdrv_close = dictzip_close,
.bdrv_getlength = dictzip_getlength,
.bdrv_aio_readv = dictzip_aio_readv,
};
static void dictzip_block_init(void)
{
bdrv_register(&bdrv_dictzip);
}
block_init(dictzip_block_init);
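Both the cache fill in the read callback and the cache search at the top of dictzip_aio_readv() follow one simple policy: a fixed array of decompressed extents evicted round-robin. A standalone sketch of that policy (hypothetical CACHE_COUNT and entry type, not the driver's structures):
/* Sketch: fixed-slot, round-robin cache of decompressed extents. */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#define CACHE_COUNT 4
typedef struct {
    size_t start;          /* uncompressed offset the buffer begins at */
    size_t len;            /* 0 means the slot is unused */
    unsigned char *buf;
} CacheEntry;
static CacheEntry cache[CACHE_COUNT];
static int cache_index;
/* Return a pointer into a cached extent covering [start, start + len), or NULL. */
static unsigned char *cache_lookup(size_t start, size_t len)
{
    for (int i = 0; i < CACHE_COUNT; i++) {
        if (cache[i].len &&
            start >= cache[i].start &&
            start + len <= cache[i].start + cache[i].len) {
            return cache[i].buf + (start - cache[i].start);
        }
    }
    return NULL;
}
/* Insert a freshly decompressed extent, evicting the oldest slot. */
static void cache_insert(size_t start, unsigned char *buf, size_t len)
{
    CacheEntry *e = &cache[cache_index];
    cache_index = (cache_index + 1) % CACHE_COUNT;
    free(e->buf);
    e->start = start;
    e->buf = buf;
    e->len = len;
}
int main(void)
{
    unsigned char *extent = malloc(4096);
    memset(extent, 'x', 4096);
    cache_insert(8192, extent, 4096);
    printf("hit: %p  miss: %p\n",
           (void *)cache_lookup(9000, 100), (void *)cache_lookup(0, 100));
    return 0;
}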


@@ -1,7 +1,7 @@
/*
* Block Dirty Bitmap
*
* Copyright (c) 2016-2017 Red Hat. Inc
* Copyright (c) 2016 Red Hat. Inc
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
@@ -37,51 +37,14 @@
* or enabled. A frozen bitmap can only abdicate() or reclaim().
*/
struct BdrvDirtyBitmap {
QemuMutex *mutex;
HBitmap *bitmap; /* Dirty bitmap implementation */
HBitmap *meta; /* Meta dirty bitmap */
HBitmap *bitmap; /* Dirty sector bitmap implementation */
BdrvDirtyBitmap *successor; /* Anonymous child; implies frozen status */
char *name; /* Optional non-empty unique ID */
int64_t size; /* Size of the bitmap, in bytes */
bool disabled; /* Bitmap is disabled. It ignores all writes to
the device */
int active_iterators; /* How many iterators are active */
bool readonly; /* Bitmap is read-only. This field also
prevents the respective image from being
modified (i.e. blocks writes and discards).
Such operations must fail and both the image
and this bitmap must remain unchanged while
this flag is set. */
bool persistent; /* bitmap must be saved to owner disk image */
int64_t size; /* Size of the bitmap (Number of sectors) */
bool disabled; /* Bitmap is read-only */
QLIST_ENTRY(BdrvDirtyBitmap) list;
};
struct BdrvDirtyBitmapIter {
HBitmapIter hbi;
BdrvDirtyBitmap *bitmap;
};
static inline void bdrv_dirty_bitmaps_lock(BlockDriverState *bs)
{
qemu_mutex_lock(&bs->dirty_bitmap_mutex);
}
static inline void bdrv_dirty_bitmaps_unlock(BlockDriverState *bs)
{
qemu_mutex_unlock(&bs->dirty_bitmap_mutex);
}
void bdrv_dirty_bitmap_lock(BdrvDirtyBitmap *bitmap)
{
qemu_mutex_lock(bitmap->mutex);
}
void bdrv_dirty_bitmap_unlock(BdrvDirtyBitmap *bitmap)
{
qemu_mutex_unlock(bitmap->mutex);
}
/* Called with BQL or dirty_bitmap lock taken. */
BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs, const char *name)
{
BdrvDirtyBitmap *bm;
@@ -95,16 +58,13 @@ BdrvDirtyBitmap *bdrv_find_dirty_bitmap(BlockDriverState *bs, const char *name)
return NULL;
}
/* Called with BQL taken. */
void bdrv_dirty_bitmap_make_anon(BdrvDirtyBitmap *bitmap)
{
assert(!bdrv_dirty_bitmap_frozen(bitmap));
g_free(bitmap->name);
bitmap->name = NULL;
bitmap->persistent = false;
}
/* Called with BQL taken. */
BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
uint32_t granularity,
const char *name,
@@ -112,84 +72,41 @@ BdrvDirtyBitmap *bdrv_create_dirty_bitmap(BlockDriverState *bs,
{
int64_t bitmap_size;
BdrvDirtyBitmap *bitmap;
uint32_t sector_granularity;
assert(is_power_of_2(granularity) && granularity >= BDRV_SECTOR_SIZE);
assert((granularity & (granularity - 1)) == 0);
if (name && bdrv_find_dirty_bitmap(bs, name)) {
error_setg(errp, "Bitmap already exists: %s", name);
return NULL;
}
bitmap_size = bdrv_getlength(bs);
sector_granularity = granularity >> BDRV_SECTOR_BITS;
assert(sector_granularity);
bitmap_size = bdrv_nb_sectors(bs);
if (bitmap_size < 0) {
error_setg_errno(errp, -bitmap_size, "could not get length of device");
errno = -bitmap_size;
return NULL;
}
bitmap = g_new0(BdrvDirtyBitmap, 1);
bitmap->mutex = &bs->dirty_bitmap_mutex;
bitmap->bitmap = hbitmap_alloc(bitmap_size, ctz32(granularity));
bitmap->bitmap = hbitmap_alloc(bitmap_size, ctz32(sector_granularity));
bitmap->size = bitmap_size;
bitmap->name = g_strdup(name);
bitmap->disabled = false;
bdrv_dirty_bitmaps_lock(bs);
QLIST_INSERT_HEAD(&bs->dirty_bitmaps, bitmap, list);
bdrv_dirty_bitmaps_unlock(bs);
return bitmap;
}
/* bdrv_create_meta_dirty_bitmap
*
* Create a meta dirty bitmap that tracks the changes of bits in @bitmap. I.e.
* when a dirty status bit in @bitmap is changed (either from reset to set or
* the other way around), its respective meta dirty bitmap bit will be marked
* dirty as well.
*
* @bitmap: the block dirty bitmap for which to create a meta dirty bitmap.
* @chunk_size: how many bytes of bitmap data does each bit in the meta bitmap
* track.
*/
void bdrv_create_meta_dirty_bitmap(BdrvDirtyBitmap *bitmap,
int chunk_size)
{
assert(!bitmap->meta);
qemu_mutex_lock(bitmap->mutex);
bitmap->meta = hbitmap_create_meta(bitmap->bitmap,
chunk_size * BITS_PER_BYTE);
qemu_mutex_unlock(bitmap->mutex);
}
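The comment above describes the meta bitmap as a bitmap over the primary bitmap: whenever a primary bit changes, the meta bit covering that slice of bitmap data is set. A toy illustration of the idea with plain byte arrays and an assumed chunk of 8 primary bits per meta bit (nothing from HBitmap):
/* Sketch: a "meta" bitmap with one bit per 8 bits of the primary bitmap.
 * Changing a primary bit marks the covering meta bit dirty. */
#include <stdio.h>
#include <stdint.h>
#define PRIMARY_BITS 64
#define BITS_PER_META 8          /* assumed chunk: 8 primary bits per meta bit */
static uint8_t primary[PRIMARY_BITS / 8];
static uint8_t meta[PRIMARY_BITS / BITS_PER_META / 8 + 1];
static void set_bit(uint8_t *map, unsigned bit) { map[bit / 8] |= (1u << (bit % 8)); }
static int  get_bit(const uint8_t *map, unsigned bit) { return (map[bit / 8] >> (bit % 8)) & 1; }
/* Change a primary bit and record the change in the meta bitmap. */
static void primary_set(unsigned bit)
{
    if (!get_bit(primary, bit)) {
        set_bit(primary, bit);
        set_bit(meta, bit / BITS_PER_META);
    }
}
int main(void)
{
    primary_set(3);
    primary_set(42);
    printf("meta bits: ");
    for (unsigned i = 0; i < PRIMARY_BITS / BITS_PER_META; i++) {
        printf("%d", get_bit(meta, i));
    }
    printf("\n");   /* bits 0 and 5 are set: the chunks containing bits 3 and 42 */
    return 0;
}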
void bdrv_release_meta_dirty_bitmap(BdrvDirtyBitmap *bitmap)
{
assert(bitmap->meta);
qemu_mutex_lock(bitmap->mutex);
hbitmap_free_meta(bitmap->bitmap);
bitmap->meta = NULL;
qemu_mutex_unlock(bitmap->mutex);
}
int64_t bdrv_dirty_bitmap_size(const BdrvDirtyBitmap *bitmap)
{
return bitmap->size;
}
const char *bdrv_dirty_bitmap_name(const BdrvDirtyBitmap *bitmap)
{
return bitmap->name;
}
/* Called with BQL taken. */
bool bdrv_dirty_bitmap_frozen(BdrvDirtyBitmap *bitmap)
{
return bitmap->successor;
}
/* Called with BQL taken. */
bool bdrv_dirty_bitmap_enabled(BdrvDirtyBitmap *bitmap)
{
return !(bitmap->disabled || bitmap->successor);
}
/* Called with BQL taken. */
DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap)
{
if (bdrv_dirty_bitmap_frozen(bitmap)) {
@@ -204,7 +121,6 @@ DirtyBitmapStatus bdrv_dirty_bitmap_status(BdrvDirtyBitmap *bitmap)
/**
* Create a successor bitmap destined to replace this bitmap after an operation.
* Requires that the bitmap is not frozen and has no successor.
* Called with BQL taken.
*/
int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap, Error **errp)
@@ -237,7 +153,6 @@ int bdrv_dirty_bitmap_create_successor(BlockDriverState *bs,
/**
* For a bitmap with a successor, yield our name to the successor,
* delete the old bitmap, and return a handle to the new bitmap.
* Called with BQL taken.
*/
BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap,
@@ -256,8 +171,6 @@ BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
bitmap->name = NULL;
successor->name = name;
bitmap->successor = NULL;
successor->persistent = bitmap->persistent;
bitmap->persistent = false;
bdrv_release_dirty_bitmap(bs, bitmap);
return successor;
@@ -267,7 +180,6 @@ BdrvDirtyBitmap *bdrv_dirty_bitmap_abdicate(BlockDriverState *bs,
* In cases of failure where we can no longer safely delete the parent,
* we may wish to re-join the parent and child/successor.
* The merged parent will be un-frozen, but not explicitly re-enabled.
* Called with BQL taken.
*/
BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
BdrvDirtyBitmap *parent,
@@ -292,111 +204,59 @@ BdrvDirtyBitmap *bdrv_reclaim_dirty_bitmap(BlockDriverState *bs,
/**
* Truncates _all_ bitmaps attached to a BDS.
* Called with BQL taken.
*/
void bdrv_dirty_bitmap_truncate(BlockDriverState *bs, int64_t bytes)
void bdrv_dirty_bitmap_truncate(BlockDriverState *bs)
{
BdrvDirtyBitmap *bitmap;
uint64_t size = bdrv_nb_sectors(bs);
bdrv_dirty_bitmaps_lock(bs);
QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
assert(!bdrv_dirty_bitmap_frozen(bitmap));
assert(!bitmap->active_iterators);
hbitmap_truncate(bitmap->bitmap, bytes);
bitmap->size = bytes;
hbitmap_truncate(bitmap->bitmap, size);
bitmap->size = size;
}
bdrv_dirty_bitmaps_unlock(bs);
}
static bool bdrv_dirty_bitmap_has_name(BdrvDirtyBitmap *bitmap)
{
return !!bdrv_dirty_bitmap_name(bitmap);
}
/* Called with BQL taken. */
static void bdrv_do_release_matching_dirty_bitmap(
BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
bool (*cond)(BdrvDirtyBitmap *bitmap))
static void bdrv_do_release_matching_dirty_bitmap(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap,
bool only_named)
{
BdrvDirtyBitmap *bm, *next;
bdrv_dirty_bitmaps_lock(bs);
QLIST_FOREACH_SAFE(bm, &bs->dirty_bitmaps, list, next) {
if ((!bitmap || bm == bitmap) && (!cond || cond(bm))) {
assert(!bm->active_iterators);
if ((!bitmap || bm == bitmap) && (!only_named || bm->name)) {
assert(!bdrv_dirty_bitmap_frozen(bm));
assert(!bm->meta);
QLIST_REMOVE(bm, list);
hbitmap_free(bm->bitmap);
g_free(bm->name);
g_free(bm);
if (bitmap) {
goto out;
return;
}
}
}
if (bitmap) {
abort();
}
out:
bdrv_dirty_bitmaps_unlock(bs);
}
/* Called with BQL taken. */
void bdrv_release_dirty_bitmap(BlockDriverState *bs, BdrvDirtyBitmap *bitmap)
{
bdrv_do_release_matching_dirty_bitmap(bs, bitmap, NULL);
bdrv_do_release_matching_dirty_bitmap(bs, bitmap, false);
}
/**
* Release all named dirty bitmaps attached to a BDS (for use in bdrv_close()).
* There must not be any frozen bitmaps attached.
* This function does not remove persistent bitmaps from the storage.
* Called with BQL taken.
*/
void bdrv_release_named_dirty_bitmaps(BlockDriverState *bs)
{
bdrv_do_release_matching_dirty_bitmap(bs, NULL, bdrv_dirty_bitmap_has_name);
bdrv_do_release_matching_dirty_bitmap(bs, NULL, true);
}
/**
* Release all persistent dirty bitmaps attached to a BDS (for use in
* bdrv_inactivate_recurse()).
* There must not be any frozen bitmaps attached.
* This function does not remove persistent bitmaps from the storage.
*/
void bdrv_release_persistent_dirty_bitmaps(BlockDriverState *bs)
{
bdrv_do_release_matching_dirty_bitmap(bs, NULL,
bdrv_dirty_bitmap_get_persistance);
}
/**
* Remove persistent dirty bitmap from the storage if it exists.
* Absence of the bitmap is not an error, because we have the following scenario:
* a BdrvDirtyBitmap can have .persistent = true but not be saved yet, so it has
* no stored version. For such a bitmap, bdrv_remove_persistent_dirty_bitmap()
* should not fail.
* This function doesn't release corresponding BdrvDirtyBitmap.
*/
void bdrv_remove_persistent_dirty_bitmap(BlockDriverState *bs,
const char *name,
Error **errp)
{
if (bs->drv && bs->drv->bdrv_remove_persistent_dirty_bitmap) {
bs->drv->bdrv_remove_persistent_dirty_bitmap(bs, name, errp);
}
}
/* Called with BQL taken. */
void bdrv_disable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
{
assert(!bdrv_dirty_bitmap_frozen(bitmap));
bitmap->disabled = true;
}
/* Called with BQL taken. */
void bdrv_enable_dirty_bitmap(BdrvDirtyBitmap *bitmap)
{
assert(!bdrv_dirty_bitmap_frozen(bitmap));
@@ -409,7 +269,6 @@ BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
BlockDirtyInfoList *list = NULL;
BlockDirtyInfoList **plist = &list;
bdrv_dirty_bitmaps_lock(bs);
QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
BlockDirtyInfo *info = g_new0(BlockDirtyInfo, 1);
BlockDirtyInfoList *entry = g_new0(BlockDirtyInfoList, 1);
@@ -422,19 +281,17 @@ BlockDirtyInfoList *bdrv_query_dirty_bitmaps(BlockDriverState *bs)
*plist = entry;
plist = &entry->next;
}
bdrv_dirty_bitmaps_unlock(bs);
return list;
}
/* Called within bdrv_dirty_bitmap_lock..unlock */
bool bdrv_get_dirty_locked(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
int64_t offset)
int bdrv_get_dirty(BlockDriverState *bs, BdrvDirtyBitmap *bitmap,
int64_t sector)
{
if (bitmap) {
return hbitmap_get(bitmap->bitmap, offset);
return hbitmap_get(bitmap->bitmap, sector);
} else {
return false;
return 0;
}
}
@@ -458,83 +315,33 @@ uint32_t bdrv_get_default_bitmap_granularity(BlockDriverState *bs)
return granularity;
}
uint32_t bdrv_dirty_bitmap_granularity(const BdrvDirtyBitmap *bitmap)
uint32_t bdrv_dirty_bitmap_granularity(BdrvDirtyBitmap *bitmap)
{
return 1U << hbitmap_granularity(bitmap->bitmap);
return BDRV_SECTOR_SIZE << hbitmap_granularity(bitmap->bitmap);
}
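Both variants of bdrv_dirty_bitmap_granularity() recover the byte granularity from the hbitmap's power-of-two shift: one stores the byte granularity directly via ctz32(granularity), the other stores a sector granularity and scales back up by BDRV_SECTOR_SIZE. A quick standalone check of that round trip, using the compiler builtin in place of ctz32:
/* Sketch: granularity <-> shift round trip used by the dirty bitmap code. */
#include <stdio.h>
#include <stdint.h>
#define BDRV_SECTOR_SIZE 512
#define BDRV_SECTOR_BITS 9
int main(void)
{
    uint32_t granularity = 65536;                /* must be a power of two */
    unsigned shift = __builtin_ctz(granularity);                      /* ctz32() */
    unsigned sector_shift = __builtin_ctz(granularity >> BDRV_SECTOR_BITS);
    printf("bytes:   1 << %u = %u\n", shift, 1u << shift);
    printf("sectors: %u << %u = %u\n", BDRV_SECTOR_SIZE, sector_shift,
           BDRV_SECTOR_SIZE << sector_shift);    /* same value both ways */
    return 0;
}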
BdrvDirtyBitmapIter *bdrv_dirty_iter_new(BdrvDirtyBitmap *bitmap)
void bdrv_dirty_iter_init(BdrvDirtyBitmap *bitmap, HBitmapIter *hbi)
{
BdrvDirtyBitmapIter *iter = g_new(BdrvDirtyBitmapIter, 1);
hbitmap_iter_init(&iter->hbi, bitmap->bitmap, 0);
iter->bitmap = bitmap;
bitmap->active_iterators++;
return iter;
}
BdrvDirtyBitmapIter *bdrv_dirty_meta_iter_new(BdrvDirtyBitmap *bitmap)
{
BdrvDirtyBitmapIter *iter = g_new(BdrvDirtyBitmapIter, 1);
hbitmap_iter_init(&iter->hbi, bitmap->meta, 0);
iter->bitmap = bitmap;
bitmap->active_iterators++;
return iter;
}
void bdrv_dirty_iter_free(BdrvDirtyBitmapIter *iter)
{
if (!iter) {
return;
}
assert(iter->bitmap->active_iterators > 0);
iter->bitmap->active_iterators--;
g_free(iter);
}
int64_t bdrv_dirty_iter_next(BdrvDirtyBitmapIter *iter)
{
return hbitmap_iter_next(&iter->hbi);
}
/* Called within bdrv_dirty_bitmap_lock..unlock */
void bdrv_set_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
int64_t offset, int64_t bytes)
{
assert(bdrv_dirty_bitmap_enabled(bitmap));
assert(!bdrv_dirty_bitmap_readonly(bitmap));
hbitmap_set(bitmap->bitmap, offset, bytes);
hbitmap_iter_init(hbi, bitmap->bitmap, 0);
}
void bdrv_set_dirty_bitmap(BdrvDirtyBitmap *bitmap,
int64_t offset, int64_t bytes)
{
bdrv_dirty_bitmap_lock(bitmap);
bdrv_set_dirty_bitmap_locked(bitmap, offset, bytes);
bdrv_dirty_bitmap_unlock(bitmap);
}
/* Called within bdrv_dirty_bitmap_lock..unlock */
void bdrv_reset_dirty_bitmap_locked(BdrvDirtyBitmap *bitmap,
int64_t offset, int64_t bytes)
int64_t cur_sector, int nr_sectors)
{
assert(bdrv_dirty_bitmap_enabled(bitmap));
assert(!bdrv_dirty_bitmap_readonly(bitmap));
hbitmap_reset(bitmap->bitmap, offset, bytes);
hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
}
void bdrv_reset_dirty_bitmap(BdrvDirtyBitmap *bitmap,
int64_t offset, int64_t bytes)
int64_t cur_sector, int nr_sectors)
{
bdrv_dirty_bitmap_lock(bitmap);
bdrv_reset_dirty_bitmap_locked(bitmap, offset, bytes);
bdrv_dirty_bitmap_unlock(bitmap);
assert(bdrv_dirty_bitmap_enabled(bitmap));
hbitmap_reset(bitmap->bitmap, cur_sector, nr_sectors);
}
void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out)
{
assert(bdrv_dirty_bitmap_enabled(bitmap));
assert(!bdrv_dirty_bitmap_readonly(bitmap));
bdrv_dirty_bitmap_lock(bitmap);
if (!out) {
hbitmap_reset_all(bitmap->bitmap);
} else {
@@ -543,162 +350,38 @@ void bdrv_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap **out)
hbitmap_granularity(backup));
*out = backup;
}
bdrv_dirty_bitmap_unlock(bitmap);
}
void bdrv_undo_clear_dirty_bitmap(BdrvDirtyBitmap *bitmap, HBitmap *in)
{
HBitmap *tmp = bitmap->bitmap;
assert(bdrv_dirty_bitmap_enabled(bitmap));
assert(!bdrv_dirty_bitmap_readonly(bitmap));
bitmap->bitmap = in;
hbitmap_free(tmp);
}
uint64_t bdrv_dirty_bitmap_serialization_size(const BdrvDirtyBitmap *bitmap,
uint64_t offset, uint64_t bytes)
{
return hbitmap_serialization_size(bitmap->bitmap, offset, bytes);
}
uint64_t bdrv_dirty_bitmap_serialization_align(const BdrvDirtyBitmap *bitmap)
{
return hbitmap_serialization_align(bitmap->bitmap);
}
void bdrv_dirty_bitmap_serialize_part(const BdrvDirtyBitmap *bitmap,
uint8_t *buf, uint64_t offset,
uint64_t bytes)
{
hbitmap_serialize_part(bitmap->bitmap, buf, offset, bytes);
}
void bdrv_dirty_bitmap_deserialize_part(BdrvDirtyBitmap *bitmap,
uint8_t *buf, uint64_t offset,
uint64_t bytes, bool finish)
{
hbitmap_deserialize_part(bitmap->bitmap, buf, offset, bytes, finish);
}
void bdrv_dirty_bitmap_deserialize_zeroes(BdrvDirtyBitmap *bitmap,
uint64_t offset, uint64_t bytes,
bool finish)
{
hbitmap_deserialize_zeroes(bitmap->bitmap, offset, bytes, finish);
}
void bdrv_dirty_bitmap_deserialize_ones(BdrvDirtyBitmap *bitmap,
uint64_t offset, uint64_t bytes,
bool finish)
{
hbitmap_deserialize_ones(bitmap->bitmap, offset, bytes, finish);
}
void bdrv_dirty_bitmap_deserialize_finish(BdrvDirtyBitmap *bitmap)
{
hbitmap_deserialize_finish(bitmap->bitmap);
}
void bdrv_set_dirty(BlockDriverState *bs, int64_t offset, int64_t bytes)
void bdrv_set_dirty(BlockDriverState *bs, int64_t cur_sector,
int nr_sectors)
{
BdrvDirtyBitmap *bitmap;
if (QLIST_EMPTY(&bs->dirty_bitmaps)) {
return;
}
bdrv_dirty_bitmaps_lock(bs);
QLIST_FOREACH(bitmap, &bs->dirty_bitmaps, list) {
if (!bdrv_dirty_bitmap_enabled(bitmap)) {
continue;
}
assert(!bdrv_dirty_bitmap_readonly(bitmap));
hbitmap_set(bitmap->bitmap, offset, bytes);
hbitmap_set(bitmap->bitmap, cur_sector, nr_sectors);
}
bdrv_dirty_bitmaps_unlock(bs);
}
/**
* Advance a BdrvDirtyBitmapIter to an arbitrary offset.
* Advance an HBitmapIter to an arbitrary offset.
*/
void bdrv_set_dirty_iter(BdrvDirtyBitmapIter *iter, int64_t offset)
void bdrv_set_dirty_iter(HBitmapIter *hbi, int64_t offset)
{
hbitmap_iter_init(&iter->hbi, iter->hbi.hb, offset);
assert(hbi->hb);
hbitmap_iter_init(hbi, hbi->hb, offset);
}
int64_t bdrv_get_dirty_count(BdrvDirtyBitmap *bitmap)
{
return hbitmap_count(bitmap->bitmap);
}
int64_t bdrv_get_meta_dirty_count(BdrvDirtyBitmap *bitmap)
{
return hbitmap_count(bitmap->meta);
}
bool bdrv_dirty_bitmap_readonly(const BdrvDirtyBitmap *bitmap)
{
return bitmap->readonly;
}
/* Called with BQL taken. */
void bdrv_dirty_bitmap_set_readonly(BdrvDirtyBitmap *bitmap, bool value)
{
qemu_mutex_lock(bitmap->mutex);
bitmap->readonly = value;
qemu_mutex_unlock(bitmap->mutex);
}
bool bdrv_has_readonly_bitmaps(BlockDriverState *bs)
{
BdrvDirtyBitmap *bm;
QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
if (bm->readonly) {
return true;
}
}
return false;
}
/* Called with BQL taken. */
void bdrv_dirty_bitmap_set_persistance(BdrvDirtyBitmap *bitmap, bool persistent)
{
qemu_mutex_lock(bitmap->mutex);
bitmap->persistent = persistent;
qemu_mutex_unlock(bitmap->mutex);
}
bool bdrv_dirty_bitmap_get_persistance(BdrvDirtyBitmap *bitmap)
{
return bitmap->persistent;
}
bool bdrv_has_changed_persistent_bitmaps(BlockDriverState *bs)
{
BdrvDirtyBitmap *bm;
QLIST_FOREACH(bm, &bs->dirty_bitmaps, list) {
if (bm->persistent && !bm->readonly) {
return true;
}
}
return false;
}
BdrvDirtyBitmap *bdrv_dirty_bitmap_next(BlockDriverState *bs,
BdrvDirtyBitmap *bitmap)
{
return bitmap == NULL ? QLIST_FIRST(&bs->dirty_bitmaps) :
QLIST_NEXT(bitmap, list);
}
char *bdrv_dirty_bitmap_sha256(const BdrvDirtyBitmap *bitmap, Error **errp)
{
return hbitmap_sha256(bitmap->bitmap, errp);
}
int64_t bdrv_dirty_bitmap_next_zero(BdrvDirtyBitmap *bitmap, uint64_t offset)
{
return hbitmap_next_zero(bitmap->bitmap, offset);
}


@@ -1,61 +0,0 @@
/*
* DMG bzip2 uncompression
*
* Copyright (c) 2004 Johannes E. Schindelin
* Copyright (c) 2016 Red Hat, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "dmg.h"
#include <bzlib.h>
static int dmg_uncompress_bz2_do(char *next_in, unsigned int avail_in,
char *next_out, unsigned int avail_out)
{
int ret;
uint64_t total_out;
bz_stream bzstream = {};
ret = BZ2_bzDecompressInit(&bzstream, 0, 0);
if (ret != BZ_OK) {
return -1;
}
bzstream.next_in = next_in;
bzstream.avail_in = avail_in;
bzstream.next_out = next_out;
bzstream.avail_out = avail_out;
ret = BZ2_bzDecompress(&bzstream);
total_out = ((uint64_t)bzstream.total_out_hi32 << 32) +
bzstream.total_out_lo32;
BZ2_bzDecompressEnd(&bzstream);
if (ret != BZ_STREAM_END ||
total_out != avail_out) {
return -1;
}
return 0;
}
__attribute__((constructor))
static void dmg_bz2_init(void)
{
assert(!dmg_uncompress_bz2);
dmg_uncompress_bz2 = dmg_uncompress_bz2_do;
}
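dmg_uncompress_bz2_do() accepts a chunk only if bzip2 decompression ends cleanly and fills the output buffer exactly. The same all-or-nothing behaviour can be exercised with bzlib's one-shot buffer helpers; a minimal sketch with arbitrary test data, not the driver's chunk buffers (build with -lbz2):
/* Sketch: compress and fully decompress a buffer with libbz2,
 * checking the output size the way the chunk decoder above does. */
#include <stdio.h>
#include <bzlib.h>
int main(void)
{
    const char src[] = "one dmg chunk worth of data";
    char comp[256], out[256];
    unsigned int clen = sizeof(comp), olen = sizeof(out);
    if (BZ2_bzBuffToBuffCompress(comp, &clen, (char *)src, sizeof(src),
                                 1 /* blockSize100k */, 0, 0) != BZ_OK) {
        return 1;
    }
    if (BZ2_bzBuffToBuffDecompress(out, &olen, comp, clen, 0, 0) != BZ_OK ||
        olen != sizeof(src)) {
        return 1;       /* reject short or failed decompression, as above */
    }
    printf("decompressed %u bytes: %s\n", olen, out);
    return 0;
}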


@@ -28,10 +28,11 @@
#include "qemu/bswap.h"
#include "qemu/error-report.h"
#include "qemu/module.h"
#include "dmg.h"
int (*dmg_uncompress_bz2)(char *next_in, unsigned int avail_in,
char *next_out, unsigned int avail_out);
#include <zlib.h>
#ifdef CONFIG_BZIP2
#include <bzlib.h>
#endif
#include <glib.h>
enum {
/* Limit chunk sizes to prevent unreasonable amounts of memory being used
@@ -41,6 +42,31 @@ enum {
DMG_SECTORCOUNTS_MAX = DMG_LENGTHS_MAX / 512,
};
typedef struct BDRVDMGState {
CoMutex lock;
/* each chunk contains a certain number of sectors,
* offsets[i] is the offset in the .dmg file,
* lengths[i] is the length of the compressed chunk,
* sectors[i] is the sector beginning at offsets[i],
* sectorcounts[i] is the number of sectors in that chunk,
* the sectors array is ordered
* 0<=i<n_chunks */
uint32_t n_chunks;
uint32_t* types;
uint64_t* offsets;
uint64_t* lengths;
uint64_t* sectors;
uint64_t* sectorcounts;
uint32_t current_chunk;
uint8_t *compressed_chunk;
uint8_t *uncompressed_chunk;
z_stream zstream;
#ifdef CONFIG_BZIP2
bz_stream bzstream;
#endif
} BDRVDMGState;
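The comment in BDRVDMGState notes that sectors[] is ordered and that sectorcounts[i] gives each chunk's extent, which is what a lookup like the driver's search_chunk() needs to locate the chunk containing a given sector. A standalone binary-search sketch over small made-up tables:
/* Sketch: find the chunk whose [sectors[i], sectors[i] + sectorcounts[i])
 * range contains sector_num, assuming sectors[] is sorted. */
#include <stdio.h>
#include <stdint.h>
int main(void)
{
    const uint64_t sectors[]      = {0, 100, 260, 900};
    const uint64_t sectorcounts[] = {100, 160, 640, 64};
    const uint32_t n_chunks = 4;
    uint64_t sector_num = 512;
    uint32_t lo = 0, hi = n_chunks;
    while (lo < hi) {                       /* binary search on chunk starts */
        uint32_t mid = (lo + hi) / 2;
        if (sectors[mid] <= sector_num) {
            lo = mid + 1;
        } else {
            hi = mid;
        }
    }
    /* lo is now the first chunk starting after sector_num; check lo - 1 */
    if (lo > 0 && sector_num < sectors[lo - 1] + sectorcounts[lo - 1]) {
        printf("sector %llu is in chunk %u\n",
               (unsigned long long)sector_num, lo - 1);
    } else {
        printf("sector %llu is in no chunk\n", (unsigned long long)sector_num);
    }
    return 0;
}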
static int dmg_probe(const uint8_t *buf, int buf_size, const char *filename)
{
int len;
@@ -61,7 +87,7 @@ static int read_uint64(BlockDriverState *bs, int64_t offset, uint64_t *result)
uint64_t buffer;
int ret;
ret = bdrv_pread(bs->file, offset, &buffer, 8);
ret = bdrv_pread(bs->file->bs, offset, &buffer, 8);
if (ret < 0) {
return ret;
}
@@ -75,7 +101,7 @@ static int read_uint32(BlockDriverState *bs, int64_t offset, uint32_t *result)
uint32_t buffer;
int ret;
ret = bdrv_pread(bs->file, offset, &buffer, 4);
ret = bdrv_pread(bs->file->bs, offset, &buffer, 4);
if (ret < 0) {
return ret;
}
@@ -111,7 +137,7 @@ static void update_max_chunk_size(BDRVDMGState *s, uint32_t chunk,
uncompressed_sectors = s->sectorcounts[chunk];
break;
case 1: /* copy */
uncompressed_sectors = DIV_ROUND_UP(s->lengths[chunk], 512);
uncompressed_sectors = (s->lengths[chunk] + 511) / 512;
break;
case 2: /* zero */
/* as the all-zeroes block may be large, it is treated specially: the
@@ -128,9 +154,8 @@ static void update_max_chunk_size(BDRVDMGState *s, uint32_t chunk,
}
}
static int64_t dmg_find_koly_offset(BdrvChild *file, Error **errp)
static int64_t dmg_find_koly_offset(BlockDriverState *file_bs, Error **errp)
{
BlockDriverState *file_bs = file->bs;
int64_t length;
int64_t offset = 0;
uint8_t buffer[515];
@@ -154,7 +179,7 @@ static int64_t dmg_find_koly_offset(BdrvChild *file, Error **errp)
offset = length - 511 - 512;
}
length = length < 515 ? length : 515;
ret = bdrv_pread(file, offset, buffer, length);
ret = bdrv_pread(file_bs, offset, buffer, length);
if (ret < 0) {
error_setg_errno(errp, -ret, "Failed while reading UDIF trailer");
return ret;
@@ -185,9 +210,10 @@ static bool dmg_is_known_block_type(uint32_t entry_type)
case 0x00000001: /* uncompressed */
case 0x00000002: /* zeroes */
case 0x80000005: /* zlib */
return true;
#ifdef CONFIG_BZIP2
case 0x80000006: /* bzip2 */
return !!dmg_uncompress_bz2;
#endif
return true;
default:
return false;
}
@@ -330,7 +356,7 @@ static int dmg_read_resource_fork(BlockDriverState *bs, DmgHeaderState *ds,
offset += 4;
buffer = g_realloc(buffer, count);
ret = bdrv_pread(bs->file, offset, buffer, count);
ret = bdrv_pread(bs->file->bs, offset, buffer, count);
if (ret < 0) {
goto fail;
}
@@ -367,7 +393,7 @@ static int dmg_read_plist_xml(BlockDriverState *bs, DmgHeaderState *ds,
buffer = g_malloc(info_length + 1);
buffer[info_length] = '\0';
ret = bdrv_pread(bs->file, info_begin, buffer, info_length);
ret = bdrv_pread(bs->file->bs, info_begin, buffer, info_length);
if (ret != info_length) {
ret = -EINVAL;
goto fail;
@@ -413,25 +439,7 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
int64_t offset;
int ret;
bs->file = bdrv_open_child(NULL, options, "file", bs, &child_file,
false, errp);
if (!bs->file) {
return -EINVAL;
}
if (!bdrv_is_read_only(bs)) {
error_report("Opening dmg images without an explicit read-only=on "
"option is deprecated. Future versions will refuse to "
"open the image instead of automatically marking the "
"image read-only.");
ret = bdrv_set_read_only(bs, true, errp);
if (ret < 0) {
return ret;
}
}
block_module_load_one("dmg-bz2");
bs->read_only = 1;
s->n_chunks = 0;
s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;
/* used by dmg_read_mish_block to keep track of the current I/O position */
@@ -440,7 +448,7 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
ds.max_sectors_per_chunk = 1;
/* locate the UDIF trailer */
offset = dmg_find_koly_offset(bs->file, errp);
offset = dmg_find_koly_offset(bs->file->bs, errp);
if (offset < 0) {
ret = offset;
goto fail;
@@ -538,11 +546,6 @@ fail:
return ret;
}
static void dmg_refresh_limits(BlockDriverState *bs, Error **errp)
{
bs->bl.request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O */
}
static inline int is_sector_in_chunk(BDRVDMGState* s,
uint32_t chunk_num, uint64_t sector_num)
{
@@ -578,6 +581,9 @@ static inline int dmg_read_chunk(BlockDriverState *bs, uint64_t sector_num)
if (!is_sector_in_chunk(s, s->current_chunk, sector_num)) {
int ret;
uint32_t chunk = search_chunk(s, sector_num);
#ifdef CONFIG_BZIP2
uint64_t total_out;
#endif
if (chunk >= s->n_chunks) {
return -1;
@@ -588,7 +594,7 @@ static inline int dmg_read_chunk(BlockDriverState *bs, uint64_t sector_num)
case 0x80000005: { /* zlib compressed */
/* we need to buffer, because only the chunk as a whole can be
* inflated. */
ret = bdrv_pread(bs->file, s->offsets[chunk],
ret = bdrv_pread(bs->file->bs, s->offsets[chunk],
s->compressed_chunk, s->lengths[chunk]);
if (ret != s->lengths[chunk]) {
return -1;
@@ -608,29 +614,36 @@ static inline int dmg_read_chunk(BlockDriverState *bs, uint64_t sector_num)
return -1;
}
break; }
#ifdef CONFIG_BZIP2
case 0x80000006: /* bzip2 compressed */
if (!dmg_uncompress_bz2) {
break;
}
/* we need to buffer, because only the chunk as a whole can be
* inflated. */
ret = bdrv_pread(bs->file, s->offsets[chunk],
ret = bdrv_pread(bs->file->bs, s->offsets[chunk],
s->compressed_chunk, s->lengths[chunk]);
if (ret != s->lengths[chunk]) {
return -1;
}
ret = dmg_uncompress_bz2((char *)s->compressed_chunk,
(unsigned int) s->lengths[chunk],
(char *)s->uncompressed_chunk,
(unsigned int)
(512 * s->sectorcounts[chunk]));
if (ret < 0) {
return ret;
ret = BZ2_bzDecompressInit(&s->bzstream, 0, 0);
if (ret != BZ_OK) {
return -1;
}
s->bzstream.next_in = (char *)s->compressed_chunk;
s->bzstream.avail_in = (unsigned int) s->lengths[chunk];
s->bzstream.next_out = (char *)s->uncompressed_chunk;
s->bzstream.avail_out = (unsigned int) 512 * s->sectorcounts[chunk];
ret = BZ2_bzDecompress(&s->bzstream);
total_out = ((uint64_t)s->bzstream.total_out_hi32 << 32) +
s->bzstream.total_out_lo32;
BZ2_bzDecompressEnd(&s->bzstream);
if (ret != BZ_STREAM_END ||
total_out != 512 * s->sectorcounts[chunk]) {
return -1;
}
break;
#endif /* CONFIG_BZIP2 */
case 1: /* copy */
ret = bdrv_pread(bs->file, s->offsets[chunk],
ret = bdrv_pread(bs->file->bs, s->offsets[chunk],
s->uncompressed_chunk, s->lengths[chunk]);
if (ret != s->lengths[chunk]) {
return -1;
@@ -646,42 +659,38 @@ static inline int dmg_read_chunk(BlockDriverState *bs, uint64_t sector_num)
return 0;
}
static int coroutine_fn
dmg_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
static int dmg_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
BDRVDMGState *s = bs->opaque;
uint64_t sector_num = offset >> BDRV_SECTOR_BITS;
int nb_sectors = bytes >> BDRV_SECTOR_BITS;
int ret, i;
assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
qemu_co_mutex_lock(&s->lock);
int i;
for (i = 0; i < nb_sectors; i++) {
uint32_t sector_offset_in_chunk;
void *data;
if (dmg_read_chunk(bs, sector_num + i) != 0) {
ret = -EIO;
goto fail;
return -1;
}
/* Special case: current chunk is all zeroes. Do not perform a memcpy as
* s->uncompressed_chunk may be too small to cover the large all-zeroes
* section. dmg_read_chunk is called to find s->current_chunk */
if (s->types[s->current_chunk] == 2) { /* all zeroes block entry */
qemu_iovec_memset(qiov, i * 512, 0, 512);
memset(buf + i * 512, 0, 512);
continue;
}
sector_offset_in_chunk = sector_num + i - s->sectors[s->current_chunk];
data = s->uncompressed_chunk + sector_offset_in_chunk * 512;
qemu_iovec_from_buf(qiov, i * 512, data, 512);
memcpy(buf + i * 512,
s->uncompressed_chunk + sector_offset_in_chunk * 512, 512);
}
return 0;
}
ret = 0;
fail:
static coroutine_fn int dmg_co_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
int ret;
BDRVDMGState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = dmg_read(bs, sector_num, buf, nb_sectors);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
@@ -706,9 +715,7 @@ static BlockDriver bdrv_dmg = {
.instance_size = sizeof(BDRVDMGState),
.bdrv_probe = dmg_probe,
.bdrv_open = dmg_open,
.bdrv_refresh_limits = dmg_refresh_limits,
.bdrv_child_perm = bdrv_format_default_perms,
.bdrv_co_preadv = dmg_co_preadv,
.bdrv_read = dmg_co_read,
.bdrv_close = dmg_close,
};


@@ -1,58 +0,0 @@
/*
* Header for DMG driver
*
* Copyright (c) 2004-2006 Fabrice Bellard
* Copyright (c) 2016 Red hat, Inc.
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#ifndef BLOCK_DMG_H
#define BLOCK_DMG_H
#include "qemu-common.h"
#include "block/block_int.h"
#include <zlib.h>
typedef struct BDRVDMGState {
CoMutex lock;
/* each chunk contains a certain number of sectors,
* offsets[i] is the offset in the .dmg file,
* lengths[i] is the length of the compressed chunk,
* sectors[i] is the sector beginning at offsets[i],
* sectorcounts[i] is the number of sectors in that chunk,
* the sectors array is ordered
* 0<=i<n_chunks */
uint32_t n_chunks;
uint32_t *types;
uint64_t *offsets;
uint64_t *lengths;
uint64_t *sectors;
uint64_t *sectorcounts;
uint32_t current_chunk;
uint8_t *compressed_chunk;
uint8_t *uncompressed_chunk;
z_stream zstream;
} BDRVDMGState;
extern int (*dmg_uncompress_bz2)(char *next_in, unsigned int avail_in,
char *next_out, unsigned int avail_out);
#endif

File diff suppressed because it is too large

block/io.c (2769 changed lines)

File diff suppressed because it is too large


@@ -1,70 +0,0 @@
/*
* QEMU Block driver for iSCSI images (static options)
*
* Copyright (c) 2017 Peter Lieven <pl@kamp.de>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
* in the Software without restriction, including without limitation the rights
* to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
* copies of the Software, and to permit persons to whom the Software is
* furnished to do so, subject to the following conditions:
*
* The above copyright notice and this permission notice shall be included in
* all copies or substantial portions of the Software.
*
* THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
* IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
* FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL
* THE AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
* LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
* OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "qemu/config-file.h"
#include "qemu/option.h"
static QemuOptsList qemu_iscsi_opts = {
.name = "iscsi",
.head = QTAILQ_HEAD_INITIALIZER(qemu_iscsi_opts.head),
.desc = {
{
.name = "user",
.type = QEMU_OPT_STRING,
.help = "username for CHAP authentication to target",
},{
.name = "password",
.type = QEMU_OPT_STRING,
.help = "password for CHAP authentication to target",
},{
.name = "password-secret",
.type = QEMU_OPT_STRING,
.help = "ID of the secret providing password for CHAP "
"authentication to target",
},{
.name = "header-digest",
.type = QEMU_OPT_STRING,
.help = "HeaderDigest setting. "
"{CRC32C|CRC32C-NONE|NONE-CRC32C|NONE}",
},{
.name = "initiator-name",
.type = QEMU_OPT_STRING,
.help = "Initiator iqn name to use when connecting",
},{
.name = "timeout",
.type = QEMU_OPT_NUMBER,
.help = "Request timeout in seconds (default 0 = no timeout)",
},
{ /* end of list */ }
},
};
static void iscsi_block_opts_init(void)
{
qemu_add_opts(&qemu_iscsi_opts);
}
block_init(iscsi_block_opts_init);

File diff suppressed because it is too large


@@ -11,10 +11,8 @@
#include "qemu-common.h"
#include "block/aio.h"
#include "qemu/queue.h"
#include "block/block.h"
#include "block/raw-aio.h"
#include "qemu/event_notifier.h"
#include "qemu/coroutine.h"
#include <libaio.h>
@@ -28,10 +26,11 @@
*/
#define MAX_EVENTS 128
#define MAX_QUEUED_IO 128
struct qemu_laiocb {
BlockAIOCB common;
Coroutine *co;
LinuxAioState *ctx;
struct qemu_laio_state *ctx;
struct iocb iocb;
ssize_t ret;
size_t nbytes;
@@ -42,28 +41,26 @@ struct qemu_laiocb {
typedef struct {
int plugged;
unsigned int in_queue;
unsigned int in_flight;
unsigned int n;
bool blocked;
QSIMPLEQ_HEAD(, qemu_laiocb) pending;
} LaioQueue;
struct LinuxAioState {
AioContext *aio_context;
struct qemu_laio_state {
io_context_t ctx;
EventNotifier e;
/* io queue for submit at batch. Protected by AioContext lock. */
/* io queue for submit at batch */
LaioQueue io_q;
/* I/O completion processing. Only runs in I/O thread. */
/* I/O completion processing */
QEMUBH *completion_bh;
struct io_event events[MAX_EVENTS];
int event_idx;
int event_max;
};
static void ioq_submit(LinuxAioState *s);
static void ioq_submit(struct qemu_laio_state *s);
static inline ssize_t io_event_ret(struct io_event *ev)
{
@@ -73,7 +70,8 @@ static inline ssize_t io_event_ret(struct io_event *ev)
/*
* Completes an AIO request (calls the callback and frees the ACB).
*/
static void qemu_laio_process_completion(struct qemu_laiocb *laiocb)
static void qemu_laio_process_completion(struct qemu_laio_state *s,
struct qemu_laiocb *laiocb)
{
int ret;
@@ -87,191 +85,74 @@ static void qemu_laio_process_completion(struct qemu_laiocb *laiocb)
qemu_iovec_memset(laiocb->qiov, ret, 0,
laiocb->qiov->size - ret);
} else {
ret = -ENOSPC;
ret = -EINVAL;
}
}
}
laiocb->common.cb(laiocb->common.opaque, ret);
laiocb->ret = ret;
if (laiocb->co) {
/* If the coroutine is already entered it must be in ioq_submit() and
* will notice laio->ret has been filled in when it eventually runs
* later. Coroutines cannot be entered recursively so avoid doing
* that!
*/
if (!qemu_coroutine_entered(laiocb->co)) {
aio_co_wake(laiocb->co);
}
} else {
laiocb->common.cb(laiocb->common.opaque, ret);
qemu_aio_unref(laiocb);
}
qemu_aio_unref(laiocb);
}
/**
* aio_ring buffer which is shared between userspace and kernel.
*
* This is copied from linux/fs/aio.c; a common header does not exist,
* but AIO has existed for ages so we assume the ABI is stable.
*/
struct aio_ring {
unsigned id; /* kernel internal index number */
unsigned nr; /* number of io_events */
unsigned head; /* Written to by userland or by kernel. */
unsigned tail;
unsigned magic;
unsigned compat_features;
unsigned incompat_features;
unsigned header_length; /* size of aio_ring */
struct io_event io_events[0];
};
/**
* io_getevents_peek:
* @ctx: AIO context
* @events: pointer on events array, output value
* Returns the number of completed events and sets a pointer
* on events array. This function does not update the internal
* ring buffer, only reads head and tail. When @events has been
* processed io_getevents_commit() must be called.
*/
static inline unsigned int io_getevents_peek(io_context_t ctx,
struct io_event **events)
{
struct aio_ring *ring = (struct aio_ring *)ctx;
unsigned int head = ring->head, tail = ring->tail;
unsigned int nr;
nr = tail >= head ? tail - head : ring->nr - head;
*events = ring->io_events + head;
/* To avoid speculative loads of s->events[i] before observing tail.
Paired with smp_wmb() inside linux/fs/aio.c: aio_complete(). */
smp_rmb();
return nr;
}
/**
* io_getevents_commit:
* @ctx: AIO context
* @nr: the number of events on which head should be advanced
*
* Advances head of a ring buffer.
*/
static inline void io_getevents_commit(io_context_t ctx, unsigned int nr)
{
struct aio_ring *ring = (struct aio_ring *)ctx;
if (nr) {
ring->head = (ring->head + nr) % ring->nr;
}
}
/**
* io_getevents_advance_and_peek:
* @ctx: AIO context
* @events: pointer on events array, output value
* @nr: the number of events on which head should be advanced
*
* Advances head of a ring buffer and returns number of elements left.
*/
static inline unsigned int
io_getevents_advance_and_peek(io_context_t ctx,
struct io_event **events,
unsigned int nr)
{
io_getevents_commit(ctx, nr);
return io_getevents_peek(ctx, events);
}
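io_getevents_peek() and io_getevents_commit() read the kernel's shared aio_ring directly: peek reports how many events sit between head and tail without modifying the ring, and commit advances head once those events have been consumed. The same head/tail discipline on an ordinary in-process ring (single-threaded, so no memory barriers) looks like:
/* Sketch: peek/commit over a plain circular event ring. */
#include <stdio.h>
#define RING_NR 8
struct ring {
    unsigned head, tail;     /* head: consumer, tail: producer */
    int events[RING_NR];
};
/* Return the number of contiguous events available and a pointer to them. */
static unsigned ring_peek(struct ring *r, int **events)
{
    unsigned head = r->head, tail = r->tail;
    unsigned nr = tail >= head ? tail - head : RING_NR - head;
    *events = &r->events[head];
    return nr;
}
/* Consume nr events by advancing head. */
static void ring_commit(struct ring *r, unsigned nr)
{
    if (nr) {
        r->head = (r->head + nr) % RING_NR;
    }
}
int main(void)
{
    struct ring r = { .head = 6, .tail = 2,
                      .events = {0, 1, 2, 3, 4, 5, 6, 7} };
    int *ev;
    unsigned nr;
    while ((nr = ring_peek(&r, &ev))) {      /* takes two passes at the wrap */
        for (unsigned i = 0; i < nr; i++) {
            printf("event %d\n", ev[i]);
        }
        ring_commit(&r, nr);
    }
    return 0;
}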
/**
* qemu_laio_process_completions:
* @s: AIO state
*
* Fetches completed I/O requests and invokes their callbacks.
/* The completion BH fetches completed I/O requests and invokes their
* callbacks.
*
* The function is somewhat tricky because it supports nested event loops, for
* example when a request callback invokes aio_poll(). In order to do this,
* indices are kept in LinuxAioState. Function schedules BH completion so it
* can be called again in a nested event loop. When there are no events left
* to complete the BH is being canceled.
* the completion events array and index are kept in qemu_laio_state. The BH
* reschedules itself as long as there are completions pending so it will
* either be called again in a nested event loop or will be called after all
* events have been completed. When there are no events left to complete, the
* BH returns without rescheduling.
*/
static void qemu_laio_process_completions(LinuxAioState *s)
static void qemu_laio_completion_bh(void *opaque)
{
struct io_event *events;
struct qemu_laio_state *s = opaque;
/* Fetch more completion events when empty */
if (s->event_idx == s->event_max) {
do {
struct timespec ts = { 0 };
s->event_max = io_getevents(s->ctx, MAX_EVENTS, MAX_EVENTS,
s->events, &ts);
} while (s->event_max == -EINTR);
s->event_idx = 0;
if (s->event_max <= 0) {
s->event_max = 0;
return; /* no more events */
}
}
/* Reschedule so nested event loops see currently pending completions */
qemu_bh_schedule(s->completion_bh);
while ((s->event_max = io_getevents_advance_and_peek(s->ctx, &events,
s->event_idx))) {
for (s->event_idx = 0; s->event_idx < s->event_max; ) {
struct iocb *iocb = events[s->event_idx].obj;
struct qemu_laiocb *laiocb =
/* Process completion events */
while (s->event_idx < s->event_max) {
struct iocb *iocb = s->events[s->event_idx].obj;
struct qemu_laiocb *laiocb =
container_of(iocb, struct qemu_laiocb, iocb);
laiocb->ret = io_event_ret(&events[s->event_idx]);
laiocb->ret = io_event_ret(&s->events[s->event_idx]);
s->event_idx++;
/* Change counters one-by-one because we can be nested. */
s->io_q.in_flight--;
s->event_idx++;
qemu_laio_process_completion(laiocb);
}
qemu_laio_process_completion(s, laiocb);
}
qemu_bh_cancel(s->completion_bh);
/* If we are nested we have to notify the level above that we are done
* by setting event_max to zero; the upper level will then jump out of its
* own `for` loop. If we are the last, all counters have dropped to zero. */
s->event_max = 0;
s->event_idx = 0;
}
static void qemu_laio_process_completions_and_submit(LinuxAioState *s)
{
qemu_laio_process_completions(s);
aio_context_acquire(s->aio_context);
if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
ioq_submit(s);
}
aio_context_release(s->aio_context);
}
static void qemu_laio_completion_bh(void *opaque)
{
LinuxAioState *s = opaque;
qemu_laio_process_completions_and_submit(s);
}
static void qemu_laio_completion_cb(EventNotifier *e)
{
LinuxAioState *s = container_of(e, LinuxAioState, e);
struct qemu_laio_state *s = container_of(e, struct qemu_laio_state, e);
if (event_notifier_test_and_clear(&s->e)) {
qemu_laio_process_completions_and_submit(s);
qemu_bh_schedule(s->completion_bh);
}
}
static bool qemu_laio_poll_cb(void *opaque)
{
EventNotifier *e = opaque;
LinuxAioState *s = container_of(e, LinuxAioState, e);
struct io_event *events;
if (!io_getevents_peek(s->ctx, &events)) {
return false;
}
qemu_laio_process_completions_and_submit(s);
return true;
}
static void laio_cancel(BlockAIOCB *blockacb)
{
struct qemu_laiocb *laiocb = (struct qemu_laiocb *)blockacb;
@@ -300,26 +181,22 @@ static void ioq_init(LaioQueue *io_q)
{
QSIMPLEQ_INIT(&io_q->pending);
io_q->plugged = 0;
io_q->in_queue = 0;
io_q->in_flight = 0;
io_q->n = 0;
io_q->blocked = false;
}
static void ioq_submit(LinuxAioState *s)
static void ioq_submit(struct qemu_laio_state *s)
{
int ret, len;
struct qemu_laiocb *aiocb;
struct iocb *iocbs[MAX_EVENTS];
struct iocb *iocbs[MAX_QUEUED_IO];
QSIMPLEQ_HEAD(, qemu_laiocb) completed;
do {
if (s->io_q.in_flight >= MAX_EVENTS) {
break;
}
len = 0;
QSIMPLEQ_FOREACH(aiocb, &s->io_q.pending, next) {
iocbs[len++] = &aiocb->iocb;
if (s->io_q.in_flight + len >= MAX_EVENTS) {
if (len == MAX_QUEUED_IO) {
break;
}
}
@@ -329,56 +206,55 @@ static void ioq_submit(LinuxAioState *s)
break;
}
if (ret < 0) {
/* Fail the first request, retry the rest */
aiocb = QSIMPLEQ_FIRST(&s->io_q.pending);
QSIMPLEQ_REMOVE_HEAD(&s->io_q.pending, next);
s->io_q.in_queue--;
aiocb->ret = ret;
qemu_laio_process_completion(aiocb);
continue;
abort();
}
s->io_q.in_flight += ret;
s->io_q.in_queue -= ret;
s->io_q.n -= ret;
aiocb = container_of(iocbs[ret - 1], struct qemu_laiocb, iocb);
QSIMPLEQ_SPLIT_AFTER(&s->io_q.pending, aiocb, next, &completed);
} while (ret == len && !QSIMPLEQ_EMPTY(&s->io_q.pending));
s->io_q.blocked = (s->io_q.in_queue > 0);
if (s->io_q.in_flight) {
/* We can try to complete something just right away if there are
* still requests in-flight. */
qemu_laio_process_completions(s);
/*
* Even if we have completed everything (in_flight == 0), the queue can
* still have pending requests (in_queue > 0). We do not attempt to
* repeat submission to avoid an IO hang. The reason is simple: s->e is
* still set and the completion callback will be called shortly, and all
* pending requests will be submitted from there.
*/
}
s->io_q.blocked = (s->io_q.n > 0);
}
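ioq_submit() gathers pending iocbs into an array and hands them to the kernel with io_submit(), leaving whatever was not accepted on the queue. A minimal standalone libaio round trip showing the io_setup/io_prep_pread/io_submit/io_getevents calls involved (single read of an arbitrary file path, no batching, no eventfd wiring); build with -laio:
/* Sketch: single Linux AIO read with libaio. */
#include <stdio.h>
#include <fcntl.h>
#include <unistd.h>
#include <libaio.h>
int main(void)
{
    char buf[4096] __attribute__((aligned(512)));
    int fd = open("/etc/hostname", O_RDONLY);   /* arbitrary readable file */
    if (fd < 0) {
        return 1;
    }
    io_context_t ctx = 0;
    if (io_setup(128, &ctx) < 0) {              /* kernel event ring, 128 slots */
        return 1;
    }
    struct iocb cb, *cbs[1] = { &cb };
    io_prep_pread(&cb, fd, buf, sizeof(buf), 0);
    if (io_submit(ctx, 1, cbs) != 1) {          /* returns number of iocbs accepted */
        return 1;
    }
    struct io_event ev;
    if (io_getevents(ctx, 1, 1, &ev, NULL) == 1) {
        printf("read %ld bytes\n", (long)ev.res);
    }
    io_destroy(ctx);
    close(fd);
    return 0;
}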
void laio_io_plug(BlockDriverState *bs, LinuxAioState *s)
void laio_io_plug(BlockDriverState *bs, void *aio_ctx)
{
struct qemu_laio_state *s = aio_ctx;
s->io_q.plugged++;
}
void laio_io_unplug(BlockDriverState *bs, LinuxAioState *s)
void laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug)
{
assert(s->io_q.plugged);
if (--s->io_q.plugged == 0 &&
!s->io_q.blocked && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
struct qemu_laio_state *s = aio_ctx;
assert(s->io_q.plugged > 0 || !unplug);
if (unplug && --s->io_q.plugged > 0) {
return;
}
if (!s->io_q.blocked && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
ioq_submit(s);
}
}
static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset,
int type)
BlockAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque, int type)
{
LinuxAioState *s = laiocb->ctx;
struct iocb *iocbs = &laiocb->iocb;
QEMUIOVector *qiov = laiocb->qiov;
struct qemu_laio_state *s = aio_ctx;
struct qemu_laiocb *laiocb;
struct iocb *iocbs;
off_t offset = sector_num * 512;
laiocb = qemu_aio_get(&laio_aiocb_info, bs, cb, opaque);
laiocb->nbytes = nb_sectors * 512;
laiocb->ctx = s;
laiocb->ret = -EINPROGRESS;
laiocb->is_read = (type == QEMU_AIO_READ);
laiocb->qiov = qiov;
iocbs = &laiocb->iocb;
switch (type) {
case QEMU_AIO_WRITE:
@@ -391,88 +267,43 @@ static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset,
default:
fprintf(stderr, "%s: invalid AIO request type 0x%x.\n",
__func__, type);
return -EIO;
goto out_free_aiocb;
}
io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e));
QSIMPLEQ_INSERT_TAIL(&s->io_q.pending, laiocb, next);
s->io_q.in_queue++;
s->io_q.n++;
if (!s->io_q.blocked &&
(!s->io_q.plugged ||
s->io_q.in_flight + s->io_q.in_queue >= MAX_EVENTS)) {
(!s->io_q.plugged || s->io_q.n >= MAX_QUEUED_IO)) {
ioq_submit(s);
}
return 0;
}
int coroutine_fn laio_co_submit(BlockDriverState *bs, LinuxAioState *s, int fd,
uint64_t offset, QEMUIOVector *qiov, int type)
{
int ret;
struct qemu_laiocb laiocb = {
.co = qemu_coroutine_self(),
.nbytes = qiov->size,
.ctx = s,
.ret = -EINPROGRESS,
.is_read = (type == QEMU_AIO_READ),
.qiov = qiov,
};
ret = laio_do_submit(fd, &laiocb, offset, type);
if (ret < 0) {
return ret;
}
if (laiocb.ret == -EINPROGRESS) {
qemu_coroutine_yield();
}
return laiocb.ret;
}
BlockAIOCB *laio_submit(BlockDriverState *bs, LinuxAioState *s, int fd,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque, int type)
{
struct qemu_laiocb *laiocb;
off_t offset = sector_num * BDRV_SECTOR_SIZE;
int ret;
laiocb = qemu_aio_get(&laio_aiocb_info, bs, cb, opaque);
laiocb->nbytes = nb_sectors * BDRV_SECTOR_SIZE;
laiocb->ctx = s;
laiocb->ret = -EINPROGRESS;
laiocb->is_read = (type == QEMU_AIO_READ);
laiocb->qiov = qiov;
ret = laio_do_submit(fd, laiocb, offset, type);
if (ret < 0) {
qemu_aio_unref(laiocb);
return NULL;
}
return &laiocb->common;
out_free_aiocb:
qemu_aio_unref(laiocb);
return NULL;
}
void laio_detach_aio_context(LinuxAioState *s, AioContext *old_context)
void laio_detach_aio_context(void *s_, AioContext *old_context)
{
aio_set_event_notifier(old_context, &s->e, false, NULL, NULL);
struct qemu_laio_state *s = s_;
aio_set_event_notifier(old_context, &s->e, false, NULL);
qemu_bh_delete(s->completion_bh);
s->aio_context = NULL;
}
void laio_attach_aio_context(LinuxAioState *s, AioContext *new_context)
void laio_attach_aio_context(void *s_, AioContext *new_context)
{
s->aio_context = new_context;
struct qemu_laio_state *s = s_;
s->completion_bh = aio_bh_new(new_context, qemu_laio_completion_bh, s);
aio_set_event_notifier(new_context, &s->e, false,
qemu_laio_completion_cb,
qemu_laio_poll_cb);
qemu_laio_completion_cb);
}
LinuxAioState *laio_init(void)
void *laio_init(void)
{
LinuxAioState *s;
struct qemu_laio_state *s;
s = g_malloc0(sizeof(*s));
if (event_notifier_init(&s->e, false) < 0) {
@@ -494,8 +325,10 @@ out_free_state:
return NULL;
}
void laio_cleanup(LinuxAioState *s)
void laio_cleanup(void *s_)
{
struct qemu_laio_state *s = s_;
event_notifier_cleanup(&s->e);
if (io_destroy(s->ctx) != 0) {

Some files were not shown because too many files have changed in this diff.