Call vbe_update_vgaregs() when the guest touches GFX, SEQ or CRT
registers, to make sure the vga registers will always have the
values needed by vbe mode. This makes sure the sanity checks
applied by vbe_fixup_regs() are effective.
Without this guests can muck with shift_control, can turn on planar
vga modes or text mode emulation while VBE is active, making qemu
take code paths meant for CGA compatibility, but with the very
large display widths and heigts settable using VBE registers.
Which is good for one or another buffer overflow. Not that
critical as they typically read overflows happening somewhere
in the display code. So guests can DoS by crashing qemu with a
segfault, but it is probably not possible to break out of the VM.
Fixes: CVE-2016-3712
Reported-by: Zuozhi Fzz <zuozhi.fzz@alibaba-inc.com>
Reported-by: P J P <ppandit@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
[BR: BSC#978160 CVE-2016-3712]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Call the new vbe_update_vgaregs() function on vbe configuration
changes, to make sure vga registers are up-to-date.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
[BR: BSC#978160 CVE-2016-3712]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When enabling vbe mode qemu will setup a bunch of vga registers to make
sure the vga emulation operates in correct mode for a linear
framebuffer. Move that code to a separate function so we can call it
from other places too.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
[BR: BSC#978160 CVE-2016-3712]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Makes code a bit easier to read.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
[BR: BSC#978160 CVE-2016-3712]
Signed-off-by: Bruce Rogers <brogers@suse.com>
vga allows banked access to video memory using the window at 0xa00000
and it supports a different access modes with different address
calculations.
The VBE bochs extentions support banked access too, using the
VBE_DISPI_INDEX_BANK register. The code tries to take the different
address calculations into account and applies different limits to
VBE_DISPI_INDEX_BANK depending on the current access mode.
Which is probably effective in stopping misprogramming by accident.
But from a security point of view completely useless as an attacker
can easily change access modes after setting the bank register.
Drop the bogus check, add range checks to vga_mem_{readb,writeb}
instead.
Fixes: CVE-2016-3710
Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
[BR: BSC#978158 CVE-2016-3710]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When processing Task Priorty Register(TPR) access, it could leak
automatic stack variable 'imm32' in patch_instruction().
Initialise the variable to avoid it.
Reported by: Donghai Zdh <donghai.zdh@alibaba-inc.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[BR: BSC#975700 CVE-2016-4020]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When receiving packets over MIPSnet network device, it uses
receive buffer of size 1514 bytes. In case the controller
accepts large(MTU) packets, it could lead to memory corruption.
Add check to avoid it.
Reported by: Oleksandr Bazhaniuk <oleksandr.bazhaniuk@intel.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[BR: BSC#975136 CVE-2016-4002]
Signed-off-by: Bruce Rogers <brogers@suse.com>
While computing IP checksum, 'net_checksum_calculate' reads
payload length from the packet. It could exceed the given 'data'
buffer size. Add a check to avoid it.
Reported-by: Liu Ling <liuling-it@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 362786f14a)
[BR: BSC#970037 CVE-2016-2857]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Requests are now created in the RngBackend parent class and the
code path is shared by both rng-egd and rng-random.
This commit fixes the rng-random implementation which processed
only one request at a time and simply discarded all but the most
recent one. In the guest this manifested as delayed completion
of reads from virtio-rng, i.e. a read was completed only after
another read was issued.
By switching rng-random to use the same request queue as rng-egd,
the unsafe stack-based allocation of the entropy buffer is
eliminated and replaced with g_malloc.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1456994238-9585-5-git-send-email-lprosek@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
(cherry picked from commit 60253ed1e6)
[BR: BSC#970036 CVE-2016-2858]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Ne2000 NIC uses ring buffer of NE2000_MEM_SIZE(49152)
bytes to process network packets. Registers PSTART & PSTOP
define ring buffer size & location. Setting these registers
to invalid values could lead to infinite loop or OOB r/w
access issues. Add check to avoid it.
Reported-by: Yang Hongke <yanghongke@huawei.com>
Tested-by: Yang Hongke <yanghongke@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 415ab35a44)
[BR: BSC#969350 CVE-2016-2841]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Loading the BIOS in the mac99 machine is interesting, because there is a
PROM in the middle of the BIOS region (from 16K to 32K). Before memory
region accesses were clamped, when QEMU was asked to load a BIOS from
0xfff00000 to 0xffffffff it would put even those 16K from the BIOS file
into the region. This is weird because those 16K were not actually
visible between 0xfff04000 and 0xfff07fff. However, it worked.
After clamping was added, this also worked. In this case, the
cpu_physical_memory_write_rom_internal function split the write in
three parts: the first 16K were copied, the PROM area (second 16K) were
ignored, then the rest was copied.
Problems then started with commit 965eb2f (exec: do not clamp accesses
to MMIO regions, 2015-06-17). Clamping accesses is not done for MMIO
regions because they can overlap wildly, and MMIO registers can be
expected to perform full-width accesses based only on their address
(with no respect for adjacent registers that could decode to completely
different MemoryRegions). However, this lack of clamping also applied
to the PROM area! cpu_physical_memory_write_rom_internal thus failed
to copy the third range above, i.e. only copied the first 16K of the BIOS.
In effect, address_space_translate is expecting _something else_ to do
the clamping for MMIO regions if the incoming length is large. This
"something else" is memory_access_size in the case of address_space_rw,
so use the same logic in cpu_physical_memory_write_rom_internal.
Reported-by: Alexander Graf <agraf@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Fixes: 965eb2f
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit b242e0e0e2)
[BR: BSC#969122 CVE-2015-8818]
Signed-off-by: Bruce Rogers <brogers@suse.com>
It is common for MMIO registers to overlap, for example a 4 byte register
at 0xcf8 (totally random choice... :)) and a 1 byte register at 0xcf9.
If these registers are implemented via separate MemoryRegions, it is
wrong to clamp the accesses as the value written would be truncated.
Hence for these regions the effects of commit 23820db (exec: Respect
as_translate_internal length clamp, 2015-03-16, previously applied as
commit c3c1bb99) must be skipped.
Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 965eb2fcdf)
[BR: support patch for BSC#969122]
Signed-off-by: Bruce Rogers <brogers@suse.com>
address_space_translate_internal will clamp the *plen length argument
based on the size of the memory region being queried. The iommu walker
logic in addresss_space_translate was ignoring this by discarding the
post fn call value of *plen. Fix by just always using *plen as the
length argument throughout the fn, removing the len local variable.
This fixes a bootloader bug when a single elf section spans multiple
QEMU memory regions.
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Message-Id: <1426570554-15940-1-git-send-email-peter.crosthwaite@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit c3c1bb99d1)
[BR: BSC#969121 CVE-2015-8817]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When processing remote NDIS control message packets,
the USB Net device emulator uses a fixed length(4096) data buffer.
The incoming informationBufferOffset & Length combination could
overflow and cross that range. Check control message buffer
offsets and length to avoid it.
Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1455648821-17340-3-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit fe3c546c5f)
[BR: BSC#967969 CVE-2016-2538]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The _cmp_bytes variable added by commit "bea60dd ui/vnc: fix potential
memory corruption issues" can become negative. Result is (possibly
exploitable) memory corruption. Reason for that is it uses the stride
instead of bytes per scanline to apply limits.
For the server surface is is actually fine. vnc creates that itself,
there is never any padding and thus scanline length always equals stride.
For the guest surface scanline length and stride are typically identical
too, but it doesn't has to be that way. So add and use a new variable
(guest_ll) for the guest scanline length. Also rename min_stride to
line_bytes to make more clear what it actually is. Finally sprinkle
in an assert() to make sure we never use a negative _cmp_bytes again.
Reported-by: 范祚至(库特) <zuozhi.fzz@alibaba-inc.com>
Reviewed-by: P J P <ppandit@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit eb8934b041)
[AF: BSC#942845]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
The start_xmit() and e1000_receive_iov() functions implement DMA transfers
iterating over a set of descriptors that the guest's e1000 driver
prepares:
- the TDLEN and RDLEN registers store the total size of the descriptor
area,
- while the TDH and RDH registers store the offset (in whole tx / rx
descriptors) into the area where the transfer is supposed to start.
Each time a descriptor is processed, the TDH and RDH register is bumped
(as appropriate for the transfer direction).
QEMU already contains logic to deal with bogus transfers submitted by the
guest:
- Normally, the transmit case wants to increase TDH from its initial value
to TDT. (TDT is allowed to be numerically smaller than the initial TDH
value; wrapping at or above TDLEN bytes to zero is normal.) The failsafe
that QEMU currently has here is a check against reaching the original
TDH value again -- a complete wraparound, which should never happen.
- In the receive case RDH is increased from its initial value until
"total_size" bytes have been received; preferably in a single step, or
in "s->rxbuf_size" byte steps, if the latter is smaller. However, null
RX descriptors are skipped without receiving data, while RDH is
incremented just the same. QEMU tries to prevent an infinite loop
(processing only null RX descriptors) by detecting whether RDH assumes
its original value during the loop. (Again, wrapping from RDLEN to 0 is
normal.)
What both directions miss is that the guest could program TDLEN and RDLEN
so low, and the initial TDH and RDH so high, that these registers will
immediately be truncated to zero, and then never reassume their initial
values in the loop -- a full wraparound will never occur.
The condition that expresses this is:
xdh_start >= s->mac_reg[XDLEN] / sizeof(desc)
i.e., TDH or RDH start out after the last whole rx or tx descriptor that
fits into the TDLEN or RDLEN sized area.
This condition could be checked before we enter the loops, but
pci_dma_read() / pci_dma_write() knows how to fill in buffers safely for
bogus DMA addresses, so we just extend the existing failsafes with the
above condition.
This is CVE-2016-1981.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Petr Matousek <pmatouse@redhat.com>
Cc: Stefano Stabellini <stefano.stabellini@eu.citrix.com>
Cc: Prasad Pandit <ppandit@redhat.com>
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: qemu-stable@nongnu.org
RHBZ: https://bugzilla.redhat.com/show_bug.cgi?id=1296044
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit dd793a7488)
[BR: BSC#963782 CVE-2016-1981]
Signed-off-by: Bruce Rogers <brogers@suse.com>
USB Ehci emulation supports host controller capability registers.
But its mmio '.write' function was missing, which lead to a null
pointer dereference issue. Add a do nothing 'ehci_caps_write'
definition to avoid it; Do nothing because capability registers
are Read Only(RO).
Reported-by: Zuozhi Fzz <zuozhi.fzz@alibaba-inc.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[BR: BSC#964413 CVE-2016-2198]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When processing 'sendkey' command, hmp_sendkey routine null
terminates the 'keyname_buf' array. This results in an OOB
write issue, if 'keyname_len' was to fall outside of
'keyname_buf' array.
Since the keyname's length is known the keyname_buf can be
removed altogether by adding a length parameter to
index_from_key() and using it for the error output as well.
Reported-by: Ling Liu <liuling-it@360.cn>
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Message-Id: <20160113080958.GA18934@olga>
[Comparison with "<" dumbed down, test for junk after strtoul()
tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit 64ffbe04ea)
[BR: BSC#960334 CVE-2015-8619]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
hmp.c
ui/input-legacy.c
Validation of l2 header length assumed minimal packet size as
eth_header + 2 * vlan_header regardless of the actual protocol.
This caused crash for valid non-IP packets shorter than 22 bytes, as
'tx_pkt->packet_type' hasn't been assigned for such packets, and
'vmxnet3_on_tx_done_update_stats()' expects it to be properly set.
Refine header length validation in 'vmxnet_tx_pkt_parse_headers'.
Check its return value during packet processing flow.
As a side effect, in case IPv4 and IPv6 header validation failure,
corrupt packets will be dropped.
Signed-off-by: Dana Rubin <dana.rubin@ravellosystems.com>
Signed-off-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 7278b36fcab9af469563bd7b9dadebe2ae25e48)
[CYL: BSC#960835 CVE-2015-8744]
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Vmxnet3 device emulator does not check if the device is active
before activating it, also it did not free the transmit & receive
buffers while deactivating the device, thus resulting in memory
leakage on the host. This patch fixes both these issues to avoid
host memory leakage.
Reported-by: Qinghao Tang <luodalongde@gmail.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit aa4a3dce1c)
[CYL: BSC#959386 CVE-2015-8568 CVE-2015-8567]
Signed-off-by: Chunyan Liu <cyliu@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Conflicts:
hw/net/vmxnet3.c
Signed-off-by: Bruce Rogers <brogers@suse.com>
When processing NCQ commands, AHCI device emulation prepares a
NCQ transfer object; To which an aio control block(aiocb) object
is assigned in 'execute_ncq_command'. In case, when the NCQ
command is invalid, the 'aiocb' object is not assigned, and NCQ
transfer object is left as 'used'. This leads to a use after
free kind of error in 'bdrv_aio_cancel_async' via 'ahci_reset_port'.
Reset NCQ transfer object to 'unused' to avoid it.
[Maintainer edit: s/ACHI/AHCI/ in the commit message. --js]
Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1452282511-4116-1-git-send-email-ppandit@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit 4ab0359a8a)
[LM: BSC#961333 CVE-2016-1568]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
qpci_msix_pending() writes on pba region, causing qemu to SEGV:
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffff7fba8c0 (LWP 25882)]
0x0000000000000000 in ?? ()
(gdb) bt
#0 0x0000000000000000 in ()
#1 0x00005555556556c5 in memory_region_oldmmio_write_accessor (mr=0x5555579f3f80, addr=0, value=0x7fffffffbf68, size=4, shift=0, mask=4294967295, attrs=...) at /home/elmarco/src/qemu/memory.c:434
#2 0x00005555556558e1 in access_with_adjusted_size (addr=0, value=0x7fffffffbf68, size=4, access_size_min=1, access_size_max=4, access=0x55555565563e <memory_region_oldmmio_write_accessor>, mr=0x5555579f3f80, attrs=...) at /home/elmarco/src/qemu/memory.c:506
#3 0x00005555556581eb in memory_region_dispatch_write (mr=0x5555579f3f80, addr=0, data=0, size=4, attrs=...) at /home/elmarco/src/qemu/memory.c:1176
#4 0x000055555560b6f9 in address_space_rw (as=0x555555eff4e0 <address_space_memory>, addr=3759147008, attrs=..., buf=0x7fffffffc1b0 "", len=4, is_write=true) at /home/elmarco/src/qemu/exec.c:2439
#5 0x000055555560baa2 in cpu_physical_memory_rw (addr=3759147008, buf=0x7fffffffc1b0 "", len=4, is_write=1) at /home/elmarco/src/qemu/exec.c:2534
#6 0x000055555564c005 in cpu_physical_memory_write (addr=3759147008, buf=0x7fffffffc1b0, len=4) at /home/elmarco/src/qemu/include/exec/cpu-common.h:80
#7 0x000055555564cd9c in qtest_process_command (chr=0x55555642b890, words=0x5555578de4b0) at /home/elmarco/src/qemu/qtest.c:378
#8 0x000055555564db77 in qtest_process_inbuf (chr=0x55555642b890, inbuf=0x55555641b340) at /home/elmarco/src/qemu/qtest.c:569
#9 0x000055555564dc07 in qtest_read (opaque=0x55555642b890, buf=0x7fffffffc2e0 "writel 0xe0100800 0x0\n", size=22) at /home/elmarco/src/qemu/qtest.c:581
#10 0x000055555574ce3e in qemu_chr_be_write (s=0x55555642b890, buf=0x7fffffffc2e0 "writel 0xe0100800 0x0\n", len=22) at qemu-char.c:306
#11 0x0000555555751263 in tcp_chr_read (chan=0x55555642bcf0, cond=G_IO_IN, opaque=0x55555642b890) at qemu-char.c:2876
#12 0x00007ffff64c9a8a in g_main_context_dispatch (context=0x55555641c400) at gmain.c:3122
(without this patch, this can be reproduced with the ivshmem qtest)
Implement an empty mmio write to avoid the crash.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 43b11a91dd)
[LM: BSC#958917 CVE-2015-7549]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
Commit "156a2e4 ehci: make idt processing more robust" tries to avoid a
DoS by the guest (create a circular iTD queue and let qemu ehci
emulation run in circles forever). Unfortunately this has two problems:
First it misses the case of siTDs, and second it reportedly breaks
FreeBSD.
So lets go for a different approach: just count the number of iTDs and
siTDs we have seen per frame and apply a limit. That should really
catch all cases now.
Reported-by: 杜少博 <dushaobo@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 1ae3f2f178)
[BR: BSC#976109 CVE-2016-4037]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Don't assume a specific layout for control messages.
Required by virtio 1.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
(cherry picked from commit 7882080388)
[LM: BSC#940929 CVE-2015-5745]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
While sending 'SetPixelFormat' messages to a VNC server,
the client could set the 'red-max', 'green-max' and 'blue-max'
values to be zero. This leads to a floating point exception in
write_png_palette while doing frame buffer updates.
Reported-by: Lian Yihan <lianyihan@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 4c65fed8bd)
[LM: BSC#958491 CVE-2015-8504]
Signed-off-by: Lin Ma <lma@suse.com>
Signed-off-by: Bruce Rogers <brogers@suse.com>
While processing controller 'CTRL_GET_INFO' command, the routine
'megasas_ctrl_get_info' overflows the '&info' object size. Use its
appropriate size to null initialise it.
Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <alpine.LFD.2.20.1512211501420.22471@wniryva>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: P J P <ppandit@redhat.com>
(cherry picked from commit 36fef36b91)
[BR: BSC#961556 CVE-2015-8613]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Hello,
A null pointer dereference issue was reported by Mr Ling Liu, CC'd here. It
occurs while doing I/O port write operations via hmp interface. In that,
'current_cpu' remains null as it is not called from cpu_exec loop, which
results in the said issue.
Below is a proposed (tested)patch to fix this issue; Does it look okay?
===
From ae88a4947fab9a148cd794f8ad2d812e7f5a1d0f Mon Sep 17 00:00:00 2001
From: Prasad J Pandit <pjp@fedoraproject.org>
Date: Fri, 18 Dec 2015 11:16:07 +0530
Subject: [PATCH] i386: avoid null pointer dereference
When I/O port write operation is called from hmp interface,
'current_cpu' remains null, as it is not called from cpu_exec()
loop. This leads to a null pointer dereference in vapic_write
routine. Add check to avoid it.
Reported-by: Ling Liu <liuling-it@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <alpine.LFD.2.20.1512181129320.9805@wniryva>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: P J P <ppandit@redhat.com>
(cherry picked from commit 4c1396cb57)
[BR: BSC#962320 CVE-2016-1922]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When processing firmware configurations, an OOB r/w access occurs
if 's->cur_entry' is set to be invalid(FW_CFG_INVALID=0xffff).
Add a check to validate 's->cur_entry' to avoid such access.
Reported-by: Donghai Zdh <donghai.zdh@alibaba-inc.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
[BR: BSC#961691 CVE-2016-1714]
Signed-off-by: Bruce Rogers <brogers@suse.com>
When packet is truncated during receiving, we drop the packets but
neither discard the descriptor nor add and signal used
descriptor. This will lead several issues:
- sg mappings are leaked
- rx will be stalled if a lots of packets were truncated
In order to be consistent with vhost, fix by discarding the descriptor
in this case.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 0cf33fb6b4)
[BR: BSC#947159 CVE-2015-7295]
Signed-off-by: Bruce Rogers <brogers@suse.com>
This patch introduces virtqueue_discard() to discard a descriptor and
unmap the sgs. This will be used by the patch that will discard
descriptor when packet is truncated.
Cc: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 29b9f5efd7)
[BR: BSC#947159 CVE-2015-7295]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Factor out sg unmapping logic. This will be reused by the patch that
can discard descriptor.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Andrew James <andrew.james@hpe.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit ce31746157)
[BR: BSC#947159 CVE-2015-7295]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Backends could provide a packet whose length is greater than buffer
size. Check for this and truncate the packet to avoid rx buffer
overflow in this case.
Cc: Prasad J Pandit <pjp@fedoraproject.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Jason Wang <jasowang@redhat.com>
[BR: BSC#957162]
Signed-off-by: Bruce Rogers <brogers@suse.com>
We're a little too lenient with what we'll let an ATAPI drive handle.
Clamp down on the IDE command execution table to remove CD_OK permissions
from commands that are not and have never been ATAPI commands.
For ATAPI command validity, please see:
- ATA4 Section 6.5 ("PACKET Command feature set")
- ATA8/ACS Section 4.3 ("The PACKET feature set")
- ACS3 Section 4.3 ("The PACKET feature set")
ACS3 has a historical command validity table in Table B.4
("Historical Command Assignments") that can be referenced to find when
a command was introduced, deprecated, obsoleted, etc.
The only reference for ATAPI command validity is by checking that
version's PACKET feature set section.
ATAPI was introduced by T13 into ATA4, all commands retired prior to ATA4
therefore are assumed to have never been ATAPI commands.
Mandatory commands, as listed in ATA8-ACS3, are:
- DEVICE RESET
- EXECUTE DEVICE DIAGNOSTIC
- IDENTIFY DEVICE
- IDENTIFY PACKET DEVICE
- NOP
- PACKET
- READ SECTOR(S)
- SET FEATURES
Optional commands as listed in ATA8-ACS3, are:
- FLUSH CACHE
- READ LOG DMA EXT
- READ LOG EXT
- WRITE LOG DMA EXT
- WRITE LOG EXT
All other commands are illegal to send to an ATAPI device and should
be rejected by the device.
CD_OK removal justifications:
0x06 WIN_DSM Defined in ACS2. Not valid for ATAPI.
0x21 WIN_READ_ONCE Retired in ATA5. Not ATAPI in ATA4.
0x94 WIN_STANDBYNOW2 Retired in ATA4. Did not coexist with ATAPI.
0x95 WIN_IDLEIMMEDIATE2 Retired in ATA4. Did not coexist with ATAPI.
0x96 WIN_STANDBY2 Retired in ATA4. Did not coexist with ATAPI.
0x97 WIN_SETIDLE2 Retired in ATA4. Did not coexist with ATAPI.
0x98 WIN_CHECKPOWERMODE2 Retired in ATA4. Did not coexist with ATAPI.
0x99 WIN_SLEEPNOW2 Retired in ATA4. Did not coexist with ATAPI.
0xE0 WIN_STANDBYNOW1 Not part of ATAPI in ATA4, ACS or ACS3.
0xE1 WIN_IDLEIMMDIATE Not part of ATAPI in ATA4, ACS or ACS3.
0xE2 WIN_STANDBY Not part of ATAPI in ATA4, ACS or ACS3.
0xE3 WIN_SETIDLE1 Not part of ATAPI in ATA4, ACS or ACS3.
0xE4 WIN_CHECKPOWERMODE1 Not part of ATAPI in ATA4, ACS or ACS3.
0xE5 WIN_SLEEPNOW1 Not part of ATAPI in ATA4, ACS or ACS3.
0xF8 WIN_READ_NATIVE_MAX Obsoleted in ACS3. Not ATAPI in ATA4 or ACS.
This patch fixes a divide by zero fault that can be caused by sending
the WIN_READ_NATIVE_MAX command to an ATAPI drive, which causes it to
attempt to use zeroed CHS values to perform sector arithmetic.
Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1441816082-21031-1-git-send-email-jsnow@redhat.com
CC: qemu-stable@nongnu.org
(cherry picked from commit d9033e1d3a)
[BR: CVE-2015-6855 BSC#945404]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Ne2000 NIC uses ring buffer of NE2000_MEM_SIZE(49152)
bytes to process network packets. While receiving packets
via ne2000_receive() routine, a local 'index' variable
could exceed the ring buffer size, which could lead to a
memory buffer overflow. Added other checks at initialisation.
Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: P J P <pjp@fedoraproject.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 9bbdbc66e5)
[BR: BSC#945987]
Signed-off-by: Bruce Rogers <brogers@suse.com>
Ne2000 NIC uses ring buffer of NE2000_MEM_SIZE(49152)
bytes to process network packets. While receiving packets
via ne2000_receive() routine, a local 'index' variable
could exceed the ring buffer size, leading to an infinite
loop situation.
Reported-by: Qinghao Tang <luodalongde@gmail.com>
Signed-off-by: P J P <pjp@fedoraproject.org>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 737d2b3c41)
[BR: BSC#945989]
Signed-off-by: Bruce Rogers <brogers@suse.com>
bits_per_pixel that are less than 8 could result in accessing
non-initialized buffers later in the code due to the expectation
that bytes_per_pixel value that is used to initialize these buffers is
never zero.
To fix this check that bits_per_pixel from the client is one of the
values that the rfb protocol specification allows.
This is CVE-2014-7815.
Signed-off-by: Petr Matousek <pmatouse@redhat.com>
[ kraxel: apply codestyle fix ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit e6908bfe8e)
[AF: BSC#902737]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
This is additional hardening against an end_transfer_func that fails to
clear the DRQ status bit. The bit must be unset as soon as the PIO
transfer has completed, so it's better to do this in a central place
instead of duplicating the code in all commands (and forgetting it in
some).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
[BR: BSC#938344]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The command must be completed on all code paths. START STOP UNIT with
pwrcnd set should succeed without doing anything.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
[BR: BSC#938344]
Signed-off-by: Bruce Rogers <brogers@suse.com>
If the end_transfer_func of a command is called because enough data has
been read or written for the current PIO transfer, and it fails to
correctly call the command completion functions, the DRQ bit in the
status register and s->end_transfer_func may remain set. This allows the
guest to access further bytes in s->io_buffer beyond s->data_end, and
eventually overflowing the io_buffer.
One case where this currently happens is emulation of the ATAPI command
START STOP UNIT.
This patch fixes the problem by adding explicit array bounds checks
before accessing the buffer instead of relying on end_transfer_func to
function correctly.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
[BR: BSC#938344]
Signed-off-by: Bruce Rogers <brogers@suse.com>
In this version I used mkdtemp(3) which is:
_BSD_SOURCE
|| /* Since glibc 2.10: */
(_POSIX_C_SOURCE >= 200809L || _XOPEN_SOURCE >= 700)
(POSIX.1-2008), so should be available on systems we care about.
While at it, reset the resulting directory name within smb structure
on error so cleanup function wont try to remove directory which we
failed to create.
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit 8b8f1c7e9d)
[BR: BSC#932267]
Signed-off-by: Bruce Rogers <brogers@suse.com>
4096 is the maximum length per TMD and it is also currently the size of
the relay buffer pcnet driver uses for sending the packet data to QEMU
for further processing. With packet spanning multiple TMDs it can
happen that the overall packet size will be bigger than sizeof(buffer),
which results in memory corruption.
Fix this by only allowing to queue maximum sizeof(buffer) bytes.
This is CVE-2015-3209.
Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Reported-by: Matt Tait <matttait@google.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[BR: BSC#932770]
Signed-off-by: Bruce Rogers <brogers@suse.com>
s->xmit_pos maybe assigned to a negative value (-1),
but in this branch variable s->xmit_pos as an index to
array s->buffer. Let's add a check for s->xmit_pos.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 7b50d00911)
[BR: BSC#932770]
Signed-off-by: Bruce Rogers <brogers@suse.com>
The VNC server websockets decoder will read and buffer data from
websockets clients until it sees the end of the HTTP headers,
as indicated by \r\n\r\n. In theory this allows a malicious to
trick QEMU into consuming an arbitrary amount of RAM. In practice,
because QEMU runs g_strstr_len() across the buffered header data,
it will spend increasingly long burning CPU time searching for
the substring match and less & less time reading data. So while
this does cause arbitrary memory growth, the bigger problem is
that QEMU will be burning 100% of available CPU time.
A novnc websockets client typically sends headers of around
512 bytes in length. As such it is reasonable to place a 4096
byte limit on the amount of data buffered while searching for
the end of HTTP headers.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 2cdb5e142f)
[AF: BSC#924018]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
The logic for decoding websocket frames wants to fully
decode the frame header and payload, before allowing the
VNC server to see any of the payload data. There is no
size limit on websocket payloads, so this allows a
malicious network client to consume 2^64 bytes in memory
in QEMU. It can trigger this denial of service before
the VNC server even performs any authentication.
The fix is to decode the header, and then incrementally
decode the payload data as it is needed. With this fix
the websocket decoder will allow at most 4k of data to
be buffered before decoding and processing payload.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
[ kraxel: fix frequent spurious disconnects, suggested by Peter Maydell ]
[ kraxel: fix 32bit build ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit a2bebfd6e0)
[AF: BSC#924018]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Signed-off-by: Bruce Rogers <brogers@suse.com>
During processing of certain commands such as FD_CMD_READ_ID and
FD_CMD_DRIVE_SPECIFICATION_COMMAND the fifo memory access could
get out of bounds leading to memory corruption with values coming
from the guest.
Fix this by making sure that the index is always bounded by the
allocated memory.
This is CVE-2015-3456.
Signed-off-by: Petr Matousek <pmatouse@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
(cherry picked from commit e907746266)
[AF: BOO#929339]
Signed-off-by: Andreas Färber <afaerber@suse.de>
The blkpg ioctl can take different payloads depending on the opcode in
its payload structure. Create a new special ioctl handler that can only
deal with partition style ones for now.
Signed-off-by: Alexander Graf <agraf@suse.de>
---
Andreas, if you like feel free to squash this into your patch and submit
it upstream.
We check whether the passed in counter value is negative on all calls
that involve g_posix_timers. However, we also check check for negativity
of that value after casting it - at which point it couldn't possibly be
negative anymore.
Cast the check to int16_t. Maybe this is correct. Maybe the check should
get removed completely.
Signed-off-by: Alexander Graf <agraf@suse.de>
When doing lseek, SEEK_SET indicates that the offset is an unsigned variable.
Other seek types have parameters that can be negative.
When converting from 32bit to 64bit parameters, we need to take this into
account and enable SEEK_END and SEEK_CUR to be negative, while SEEK_SET stays
absolute positioned which we need to maintain as unsigned.
Signed-off-by: Alexander Graf <agraf@suse.de>
Virtio-Console can only process one character at a time. Using it on S390
gave me strage "lags" where I got the character I pressed before when
pressing one. So I typed in "abc" and only received "a", then pressed "d"
but the guest received "b" and so on.
While the stdio driver calls a poll function that just processes on its
queue in case virtio-console can't take multiple characters at once, the
muxer does not have such callbacks, so it can't empty its queue.
To work around that limitation, I introduced a new timer that only gets
active when the guest can not receive any more characters. In that case
it polls again after a while to check if the guest is now receiving input.
This patch fixes input when using -nographic on s390 for me.
Some termcaps (found using SLES11SP1) use [? sequences. According to man
console_codes (http://linux.die.net/man/4/console_codes) the question mark
is a nop and should simply be ignored.
This patch does exactly that, rendering screen output readable when
outputting guest serial consoles to the graphical console emulator.
Signed-off-by: Alexander Graf <agraf@suse.de>
Tar is a very widely used format to store data in. Sometimes people even put
virtual machine images in there.
So it makes sense for qemu to be able to read from tar files. I implemented a
written from scratch reader that also knows about the GNU sparse format, which
is what pigz creates.
This version checks for filenames that end on well-known extensions. The logic
could be changed to search for filenames given on the command line, but that
would require changes to more parts of qemu.
The tar reader in conjunctiuon with dzip gives us the chance to download
tar'ed up virtual machine images (even via http) and instantly make use of
them.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Bruce Rogers <brogers@novell.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
[TH: Use bdrv_open options instead of filename]
Signed-off-by: Tim Hardeck <thardeck@suse.de>
[AF: bdrv_file_open got an Error **errp argument, bdrv_delete -> brd_unref]
[AF: qemu_opts_create_nofail() -> qemu_opts_create(),
bdrv_file_open() -> bdrv_open(), based on work by brogers]
[AF: error_is_set() dropped for v2.1.0-rc0]
Signed-off-by: Andreas Färber <afaerber@suse.de>
DictZip is an extension to the gzip format that allows random seeks in gzip
compressed files by cutting the file into pieces and storing the piece offsets
in the "extra" header of the gzip format.
Thanks to that extension, we can use gzip compressed files as block backend,
though only in read mode.
This makes a lot of sense when stacked with tar files that can then be shipped
to VM users. If a VM image is inside a tar file that is inside a DictZip
enabled gzip file, the user can run the tar.gz file as is without having to
extract the image first.
Tar patch follows.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Bruce Rogers <brogers@novell.com>
Signed-off-by: Andreas Färber <afaerber@suse.de>
[TH: Use bdrv_open options instead of filename]
Signed-off-by: Tim Hardeck <thardeck@suse.de>
[AF: Error **errp added for bdrv_file_open, bdrv_delete -> bdrv_unref]
[AF: qemu_opts_create_nofail() -> qemu_opts_create(),
bdrv_file_open() -> bdrv_open(), based on work by brogers]
[AF: error_is_set() dropped for v2.1.0-rc0]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Linux syscalls pass pointers or data length or other information of that sort
to the kernel. This is all stuff you don't want to have sign extended.
Otherwise a host 64bit variable parameter with a size parameter will extend
it to a negative number, breaking lseek for example.
Pass syscall arguments as ulong always.
Signed-off-by: Alexander Graf <agraf@suse.de>
Fedora 17 for ARM reads /proc/cpuinfo and fails if it doesn't contain
ARM related contents. This patch implements a quick hack to expose real
/proc/cpuinfo data taken from a real world machine.
The real fix would be to generate at least the flags automatically based
on the selected CPU. Please do not submit this patch upstream until this
has happened.
Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Rebased for v1.6 and v1.7]
Signed-off-by: Andreas Färber <afaerber@suse.de>
Running multi-threaded code can easily expose some of the fundamental
breakages in QEMU's design. It's just not a well supported scenario.
So if we pin the whole process to a single host CPU, we guarantee that
we will never have concurrent memory access actually happen. We can still
get scheduled away at any time, so it's no complete guarantee, but apparently
it reduces the odds well enough to get my test cases to pass.
This gets Java 1.7 working for me again on my test box.
Signed-off-by: Alexander Graf <agraf@suse.de>
The tcg code generator is not thread safe. Lock its generation between
different threads.
Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Rebased onto exec.c/translate-all.c split for 1.4]
[AF: Rebased for v2.1.0-rc0]
Signed-off-by: Andreas Färber <afaerber@suse.de>
During invocations of losetup, we run into an ioctl that doesn't
exist. However, because of that we output an error, which then
screws up the kiwi logic around that call.
So let's silently ignore that bogus ioctl.
Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Rebased for v2.1.0-rc0]
Signed-off-by: Andreas Färber <afaerber@suse.de>
When running automoc4 as linux-user guest program, it segfaults right after
it creates a thread. Bisecting pointed to commit a84fac1426 which introduces
tb_flush on reset.
So something in our thread creation is broken. But for now, let's revert the
change to at least get a working build again.
[AF: Rebased, fixed typo]
When we have a working host binary equivalent for the guest binary we're
trying to run, let's just use that instead as it will be a lot faster.
Signed-off-by: Alexander Graf <agraf@suse.de>
When entering the guest we take a lock to ensure that nobody else messes
with our TB chaining while we're doing it. If we get a segfault inside that
code, we manage to work on, but will not unlock the lock.
This patch forces unlocking of that lock in the segv handler. I'm not sure
this is the right approach though. Maybe we should rather make sure we don't
segfault in the code? I would greatly appreciate someone more intelligible
than me to look at this :).
Example code to trigger this is at: http://csgraf.de/tmp/conftest.c
Reported-by: Fabio Erculiani <lxnay@sabayon.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Andreas Färber <afaerber@suse.de>
When using hugetlbfs (which is required for HV mode KVM on 970), we
check for MMU notifiers that on 970 can not be implemented properly.
So disable the check for mmu notifiers on PowerPC guests, making
KVM guests work there, even if possibly racy in some odd circumstances.
When using qemu's linux-user binaries through binfmt, argv[0] gets lost
along the execution because qemu only gets passed in the full file name
to the executable while argv[0] can be something completely different.
This breaks in some subtile situations, such as the grep and make test
suites.
This patch adds a wrapper binary called qemu-$TARGET-binfmt that can be
used with binfmt's P flag which passes the full path _and_ argv[0] to
the binfmt handler.
The binary would be smart enough to be versatile and only exist in the
system once, creating the qemu binary path names from its own argv[0].
However, this seemed like it didn't fit the make system too well, so
we're currently creating a new binary for each target archictecture.
CC: Reinhard Max <max@suse.de>
Signed-off-by: Alexander Graf <agraf@suse.de>
[AF: Rebased onto new Makefile infrastructure, twice]
[AF: Updated for aarch64 for v2.0.0-rc1]
[AF: Rebased onto Makefile changes for v2.1.0-rc0]
Signed-off-by: Andreas Färber <afaerber@suse.de>
the direction given in the ioctl should be correct so we can assume the
communication is uni-directional. The alsa developers did not like this
concept though and declared ioctls IOC_R and IOC_W even though they were
IOC_RW.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Ulrich Hecht <uli@suse.de>
Hack to prevent ALSA from using mmap() interface to simplify emulation.
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Ulrich Hecht <uli@suse.de>
After 'Machine as QOM' series the machine type input triggers
the creation of the machine class.
If the machine type is set in the configuration file, the machine
class is not updated accordingly and remains the default.
Fixed that by querying the machine options after the configuration
file is loaded.
Cc: qemu-stable@nongnu.org
Reported-by: William Dauchy <william@gandi.net>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 364c3e6b8d)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
spapr_tce_table_finalize() can SEGV if the object was not previously
realized. In particular this can be triggered by running
qemu-system-ppc -device spapr-tce-table,?
The basic problem is that we have mismatched initialization versus
finalization: spapr_tce_table_finalize() is attempting to undo things that
are done in spapr_tce_table_realize(), not an instance_init function.
Therefore, replace spapr_tce_table_finalize() with
spapr_tce_table_unrealize().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 5f9490de56)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Because of wrong return value of .save_live_pending() in
migration/block.c, migration finishes before the whole disk is
transferred. Such situation occurs when the migration process is fast
enough, for example when source and dest are on the same host.
If in the bulk phase we return something < max_size, we will skip
transferring the tail of the device. Currently we have "set pending to
BLOCK_SIZE if it is zero" for bulk phase, but there no guarantee, that
it will be < max_size.
True approach is to return, for example, max_size+1 when we are in the
bulk phase.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@parallels.com>
Message-id: 1419933856-4018-2-git-send-email-vsementsov@parallels.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 04636dc410)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If QEMU is started with -numa ... Windows only notices that
CPU has been hot-added but it will not online such CPUs.
It's caused by the fact that possible CPUs are flagged as
not enabled in SRAT and Windows honoring that information
doesn't use corresponding CPU.
ACPI 5.0 Spec regarding to flag says:
"
Table 5-47 Local APIC Flags
...
Enabled: if zero, this processor is unusable, and the operating system
support will not attempt to use it.
"
Fix QEMU to adhere to spec and mark possible CPUs as enabled
in SRAT.
With that Windows onlines hot-added CPUs as expected.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit dd0247e09a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If TB ends with an opcode that crosses page boundary and the following
page is not executable then EPC1 for the code fetch exception wrongly
points at the beginning of the TB. Always treat instruction that crosses
page boundary as a separate TB.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
(cherry picked from commit 01673a3401)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When stopping an audio voice, call the audio backend's fini
method before calling audio_pcm_hw_free_resources_ rather than
afterwards. This allows backends which use helper threads (like
pulseaudio) to terminate those threads before the conv_buf or
mix_buf are freed and avoids race conditions where the helper
may access a NULL pointer or freed memory.
Cc: qemu-stable@nongnu.org
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1418406239-9838-1-git-send-email-peter.maydell@linaro.org
(cherry picked from commit b28fb27b5e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Old kernels that used high memory only allowed the initrd to be in the
first 896MB of memory. If you load the initrd above, they complain
that "initrd extends beyond end of memory".
In order to fix this, while not breaking machines with small amounts
of memory fixed by cdebec5 (linuxboot: compute initrd loading address,
2014-10-06), we need to distinguish two cases. If pc.c placed the
initrd at end of memory, use the new algorithm based on the e801
memory map. If instead pc.c placed the initrd at the maximum address
specified by the bzImage, leave it there.
The only interesting part is that the low-memory info block is now
loaded very early, in real mode, and thus the 32-bit address has
to be converted into a real mode segment. The initrd address is
also patched in the info block before entering real mode, it is
simpler that way.
This fixes booting the RHEL4.8 32-bit installation image with 1GB
of RAM.
Cc: qemu-stable@nongnu.org
Cc: mst@redhat.com
Cc: jsnow@redhat.com
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 269e235849)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Even though hw/i386/pc.c tries to compute a valid loading address for the
initrd, close to the top of RAM, this does not take into account other
data that is malloced into that memory by SeaBIOS.
Luckily we can easily look at the memory map to find out how much memory is
used up there. This patch places the initrd in the first four gigabytes,
below the first hole (as returned by INT 15h, AX=e801h).
Without this patch:
[ 0.000000] init_memory_mapping: [mem 0x07000000-0x07fdffff]
[ 0.000000] RAMDISK: [mem 0x0710a000-0x07fd7fff]
With this patch:
[ 0.000000] init_memory_mapping: [mem 0x07000000-0x07fdffff]
[ 0.000000] RAMDISK: [mem 0x07112000-0x07fdffff]
So linuxboot is able to use the 64k that were added as padding for
QEMU <= 2.1.
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit cdebec5e40)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If a qcow2 image specifies a backing file format that doesn't correspond
to any format driver that qemu knows, we shouldn't fall back to probing,
but simply error out.
Not looking up the backing file driver in bdrv_open_backing_file(), but
just filling in the "driver" option if it isn't there moves us closer to
the goal of having everything in QDict options and gets us the error
handling of bdrv_open(), which correctly refuses unknown drivers.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1416935562-7760-4-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit c5f6e493bb)
Conflicts:
tests/qemu-iotests/group
*removed context from upstream iotest groups
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
qcow2_cache_flush() may fail; if one of the caches failed to be flushed
successfully to disk in qcow2_close() the image should not be marked
clean, and we should emit a warning.
This breaks the (qcow2-specific) iotests 026, 071 and 089; change their
output accordingly.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 3b5e14c76a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 9e52c53b8c)
*included to maintain parity with unit tests which inject errors
via blkdebug. needed for:
"qcow2: Flushing the caches in qcow2_close may fail"
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
In qcow2_alloc_cluster_offset(), *num is limited to
INT_MAX >> BDRV_SECTOR_BITS by all callers. However, since remaining is
of type uint64_t, we might as well cast *num to that type before
performing the shift.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 11c89769dc)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Add a test for creating and amending images (amendment uses the creation
options) with formats not supporting creation over protocols not
supporting creation.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 2247798d13)
Conflicts:
tests/qemu-iotests/group
*removed context dependencies from upstream iotest groups
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
There may be NBD tests which do not create a sample image and simply
test whether wrong usage of the protocol is rejected as expected. In
this case, there will be no NBD server and trying to kill it during
clean-up will fail.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit f798068c56)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The image options which can be amended are described by the .create_opts
field for every driver. This field must therefore be non-NULL so that
anything can be amended in the first place. Check that this holds true
before going into qemu_opts_create() (because if .create_opts is NULL,
the create_opts pointer in img_amend() will be NULL after
qemu_opts_append()).
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit b2439d26f0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If a driver supports image creation, it needs to set the .create_opts
field. We can use that to make sure .create_opts for both drivers
involved is not NULL for the target image in qemu-img convert, which is
important so that the create_opts pointer in img_convert() is not NULL
after the qemu_opts_append() calls and when going into
qemu_opts_create().
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit f75613cf24)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If a driver supports image creation, it needs to set the .create_opts
field. We can use that to make sure .create_opts for both drivers
involved is not NULL in bdrv_img_create(), which is important so that
the create_opts pointer in that function is not NULL after the
qemu_opts_append() calls and when going into qemu_opts_create().
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit c614972408)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The nfs protocol driver is capable of creating images, but did not
specify any creation options. Fix it.
A way to test this issue is the following:
$ qemu-img create -f nfs nfs://127.0.0.1/foo.qcow2 64M
Without this patch, it segfaults. With this patch, it does not. However,
this is not something that should really work; qemu-img should check
whether the parameter for the -f option (and -O for convert) is indeed a
format, and error out if it is not. Therefore, I am not making it an
iotest.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit fd752801ae)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
We can always assume raw, file and qcow2 being available; so do not use
bdrv_find_format() to locate their BlockDriver objects but statically
reference the respective objects.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit ef8104378c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
There are some block drivers which are essential to QEMU and may not be
removed: These are raw, file and qcow2 (as the default non-raw format).
Make their BlockDriver objects public so they can be directly referenced
throughout the block layer without needing to call bdrv_find_format()
and having to deal with an error at runtime, while the real problem
occurred during linking (where raw, file or qcow2 were not linked into
qemu).
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit 5f535a941e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The code in invalidate_and_set_dirty() needs to handle addr/length
combinations which cross guest physical page boundaries. This can happen,
for example, when disk I/O reads large blocks into guest RAM which previously
held code that we have cached translations for. Unfortunately we were only
checking the clean/dirty status of the first page in the range, and then
were calling a tb_invalidate function which only handles ranges that don't
cross page boundaries. Fix the function to deal with multipage ranges.
The symptoms of this bug were that guest code would misbehave (eg segfault),
in particular after a guest reboot but potentially any time the guest
reused a page of its physical RAM for new code.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1416167061-13203-1-git-send-email-peter.maydell@linaro.org
(cherry picked from commit f874bf905f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Old BIOSes left some padding by mistake after the req_size/resp_size.
New QEMU does not like it, thinking it is a bidirectional command.
As a workaround, we can check if the ANY_LAYOUT bit is set; if not, we
always consider the first buffer as the virtio-scsi request/response,
because, back when QEMU did not support ANY_LAYOUT, it expected the
payload to start at the second element of the iovec.
This can show up during migration.
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 55783a5521)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Memory slots have to be page aligned to get entered into KVM. There
is existing logic that tries to ensure that we pad memory slots that
are not page aligned to the biggest region that would still fit in the
alignment requirements.
Unfortunately, that logic is broken. It tries to calculate the start
offset based on the region size.
Fix up the logic to do the thing it was intended to do and document it
properly in the comment above it.
With this patch applied, I can successfully run an e500 guest with more
than 3GB RAM (at which point RAM starts overlapping subpage memory regions).
Cc: qemu-stable@nongnu.org
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit f2a64032a1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Entry opcode needs to check if moving to new register frame would cause
register window overflow. Entry used in function prologue never
overflows because preceding windowed call* opcode writes return address
to the target register window frame, causing overflow exceptions at the
point of call. But when a sequence of entry opcodes is used for register
window spilling there may not be a call or other opcode that would cause
window check between entries and they would not raise overflow exception
themselves resulting in data corruption.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
(cherry picked from commit 1b3e71f8ee)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
A linux guest will be issuing messages:
[ 32.124042] DC390: Deadlock in DataIn_0: DMA aborted unfinished: 000000 bytes remain!!
[ 32.126348] DC390: DataIn_0: DMA State: 0
and the HBA will fail to work properly.
Reason is the emulation is not setting the 'DMA transfer done'
status correctly.
Signed-off-by: Hannes Reinecke <hare@suse.de>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit c3543fb5fe)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The g_hash_table_iter_* functions for iterating through a hash table
are not present in glib 2.12, which is our current minimum requirement.
Rewrite the code to use g_hash_table_foreach() instead.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit f8833a37c0)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If there are still pending i/o while deleting snapshot,
because deleting snapshot is done in non-coroutine context, and
the pending i/o read/write (bdrv_co_do_rw) is done in coroutine context,
so it's possible to cause concurrency problem between above two operations.
Add bdrv_drain_all() to bdrv_snapshot_delete() to avoid this problem.
Signed-off-by: Zhang Haoyu <zhanghy@sangfor.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 201410211637596311287@sangfor.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
(cherry picked from commit 3432a1929e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
U-boot for xtensa always treats uImage load address as virtual address.
This is important when booting uImage on xtensa core with MMUv2, because
MMUv2 has fixed non-identity virtual-to-physical mapping after reset.
Always do virtual-to-physical translation of uImage load address and
load uImage at the translated address. This fixes booting uImage kernels
on dc232b and other MMUv2 cores.
Cc: qemu-stable@nongnu.org
Reported-by: Waldemar Brodkorb <mail@waldemar-brodkorb.de>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
(cherry picked from commit 6d2e453053)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Such address translation is needed when load address recorded in uImage
is a virtual address. When the actual load address is requested, return
untranslated address: user that needs the translated address can always
apply translation function to it and those that need it untranslated
don't need to do the inverse translation.
Add translation function pointer and its parameter to uimage_load
prototype. Update all existing users.
No user-visible functional changes.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 25bda50a0c)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Commit 9d8bf2d1 moved the softmmu slow path out of line and introduce a
regression at the same time by always calling tcg_out_tlb_load with
is_load=1. This makes impossible to run any significant code under
qemu-system-mips*.
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
(cherry picked from commit 0a2923f848)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
bits_per_pixel that are less than 8 could result in accessing
non-initialized buffers later in the code due to the expectation
that bytes_per_pixel value that is used to initialize these buffers is
never zero.
To fix this check that bits_per_pixel from the client is one of the
values that the rfb protocol specification allows.
This is CVE-2014-7815.
Signed-off-by: Petr Matousek <pmatouse@redhat.com>
[ kraxel: apply codestyle fix ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit e6908bfe8e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
qemu_shutdown_requested may be interrupted by qemu_system_killed. If the
latter sets shutdown_requested after qemu_shutdown_requested has read it
but before it was cleared, the shutdown event is lost. Fix this by using
atomic_xchg.
This provides a different fix for the problem which commit 15124e142
attempts to deal with. That commit breaks use of ^C to drop into gdb,
and so this approach is better (and 15124e142 can be reverted).
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
[PMM: commit message tweak]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
(cherry picked from commit 817ef04db2)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
commit 57f97834ef cleaned up
the cac_applet_pki_process_apdu function to have a single
exit point. Unfortunately, that commit introduced a bug
where the sign buffer can get free'd and nullified while
it's still being used.
This commit corrects the bug by introducing a boolean to
track whether or not the sign buffer should be freed in
the function exit path.
Signed-off-by: Ray Strode <rstrode@redhat.com>
Reviewed-by: Alon Levy <alon@pobox.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit 81b49e8f89)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
While writing an L1 table sector, qcow2_write_l1_entry() copies the
respective range from s->l1_table to the local "buf" array. The size of
s->l1_table does not have to be a multiple of L1_ENTRIES_PER_SECTOR;
thus, limit the index which is used for copying all entries to the L1
size.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
(cherry picked from commit a1391444fe)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Switch vmsvga_update_rect over to use vmsvga_verify_rect. Slight change
in behavior: We don't try to automatically fixup rectangles any more.
In case we find invalid update requests we'll do a full-screen update
instead.
Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Don Koch <dkoch@verizon.com>
(cherry picked from commit 1735fe1edb)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Quick & easy stopgap for CVE-2014-3689: We just compile out the
hardware acceleration functions which lack sanity checks. Thankfully
we have capability bits for them (SVGA_CAP_RECT_COPY and
SVGA_CAP_RECT_FILL), so guests should deal just fine, in theory.
Subsequent patches will add the missing checks and re-enable the
hardware acceleration emulation.
Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Don Koch <dkoch@verizon.com>
(cherry picked from commit 83afa38eb2)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
We used to be able to address both the QEMU and the KVM APIC via "apic".
This doesn't work anymore. So we need to use their parent class to turn
off the vapic on machines that should not expose them.
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit df1fd4b541)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is
dropped again when the property is deleted.
The upshot of this is that we always have a refcount >= 1. Upon
unplug the virtio-9p child is not finalized!
Drop our reference after the child property has been added to the
parent.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 8f3d60e568)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
virtio-9p-pci all duplicate the qdev properties of their
V9fsState child. This approach does not work well with
string or pointer properties since we must be careful
about leaking or double-freeing them.
Use the QOM alias property to forward property accesses to the
V9fsState child. This way no duplication is necessary.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 48833071d9)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is dropped
again when the property is deleted.
The upshot of this is that we always have a refcount >= 1. Upon hot
unplug the virtio-balloon child is not finalized!
Drop our reference after the child property has been added to the
parent.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 91ba212088)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is dropped
again when the property is deleted.
The upshot of this is that we always have a refcount >= 1. Upon hot
unplug the virtio-rng child is not finalized!
Drop our reference after the child property has been added to the
parent.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 352fa88dfb)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
virtio-rng-{pci, s390, ccw} all duplicate the
qdev properties of their VirtIORNG child.
This approach does not work well with string or pointer
properties since we must be careful about leaking or
double-freeing them.
Use the QOM alias property to forward property accesses to the
VirtIORNG child. This way no duplication is necessary.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 8ee486ae33)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is dropped
again when the property is deleted.
The upshot of this is that we always have a refcount >= 1. Upon hot
unplug the virtio-serial child is not finalized!
Drop our reference after the child property has been added to the
parent.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit e77ca8b92a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
virtio-serial-{pci, s390, ccw} all duplicate the
qdev properties of their VirtIOSerial child.
This approach does not work well with string or pointer
properties since we must be careful about leaking or
double-freeing them.
Use the QOM alias property to forward property accesses to the
VirtIOSerial child. This way no duplication is necessary.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 4f456d8025)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is dropped
again when the property is deleted.
The upshot of this is that we always have a refcount >= 1. Upon hot
unplug the virtio-scsi/vhost-scsi child is not finalized!
Drop our reference after the child property has been added to the
parent.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 1312f12bcc)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
{virtio, vhost}-scsi-{pci, s390, ccw} all duplicate the
qdev properties of their VirtIOSCSI/VHostSCSI child.
This approach does not work well with string or pointer
properties since we must be careful about leaking or
double-freeing them.
Use the QOM alias property to forward property accesses to the
VirtIOSCSI/VHostSCSI child. This way no duplication is necessary.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit c39343fd81)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
object_initialize() leaves the object with a refcount of 1.
object_property_add_child() adds its own reference which is dropped
again when the property is deleted.
The upshot of this is that we always have a refcount >= 1. Upon hot
unplug the virtio-net child is not finalized!
Drop our reference after the child property has been added to the
parent.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 6a0c6b5978)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
virtio-net-pci, virtio-net-s390, and virtio-net-ccw all duplicate the
qdev properties of their VirtIONet child. This approach does not work
well with string or pointer properties since we must be careful about
leaking or double-freeing them.
Use the QOM alias property to forward property accesses to the
VirtIONet child. This way no duplication is necessary.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 7779edfeb1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
QEMU currently allows the number of VCPUs to not be a multiple of the
number of threads per socket, but the smbios socket count calculation
introduced by commit c97294ec1b doesn't
take that into account, triggering an assertion. e.g.:
$ ./x86_64-softmmu/qemu-system-x86_64 -smp 4,sockets=2,cores=6,threads=1
qemu-system-x86_64: /home/ehabkost/rh/proj/virt/qemu/hw/i386/smbios.c:825: smbios_get_tables: Assertion `smbios_smp_sockets >= 1' failed.
Aborted (core dumped)
Socket count calculation doesn't belong to smbios.c and should
eventually be moved to the main SMP topology configuration code. But
while we don't move the code, at least make it correct by rounding up
the division.
Cc: Gabriel Somlo <somlo@cmu.edu>
Cc: qemu-stable@nongnu.org
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 7dfddd7f88)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Since 3687d532 we've been unconditionally adding qom-test to our qtests
for every arch. However, some archs inherit their tests from Makefile
variables for other archs, such as i386/x86_64,
microblaze/microblazeel, and xtensa/xtensaeb. Since these are evaluated
in a lazy manner, we ultimately end up adding qom-test twice.
In the case x86_64, where we have a large number of machine types that
we rerun qom-test for, this has lead to a fairly noticeable increase
in the overall run-time of `make check` (78s vs. 42s on my machine).
Similar speed-ups are visible for other such archs, but not nearly as
significant.
Fix this by only adding qom-test to an arch's test list if it's not
already present.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andreas Färber <afaerber@suse.de>
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
(cherry picked from commit 2b8419cb49)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
It should not break memory hotplug feature if there is non-NUMA option.
This patch would also allow to use pc-dimm as replacement for initial memory
for non-NUMA configs.
Note: After this patch, the memory hotplug can work normally for Linux guest OS
when there is non-NUMA option and NUMA option. But not support Windows guest OS
to hotplug memory with no-NUMA config, actully, it's Windows limitation.
Reviewed-By: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit fc50ff0666)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Header length check should happen only if backend is kernel. For user
backend there is no reason to reset this bit.
vhost-user code does not define .has_vnet_hdr_len so
VIRTIO_NET_F_MRG_RXBUF cannot be negotiated even if both sides
support it.
Signed-off-by: Damjan Marion <damarion@cisco.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit d8e80ae37a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When a QMP client changes the polling interval time by setting
the guest-stats-polling-interval property, the interval value
is stored and manipulated as an int64_t variable.
However, the balloon_stats_change_timer() function, which is
used to set the actual timer with the interval value, takes
an int instead, causing an overflow for big interval values.
This commit fix this bug by changing balloon_stats_change_timer()
to take an int64_t and also it limits the polling interval value
to UINT_MAX to avoid other kinds of overflow.
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
(cherry picked from commit 1f9296b51a)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Commit cdaa86a54 ("Add G_IO_HUP handler for socket chardev") exposed a bug in
the way the HMP monitor handles its command buffer. When a client closes the
connection to the monitor, tcp_chr_read() will detect the G_IO_HUP condition
and call tcp_chr_disconnect() to close the server-side connection too. Due to
the fact that monitor reads 1 byte at a time (for each tcp_chr_read()), the
monitor readline state / buffers might contain junk (i.e. a half-finished
command). Thus, without calling readline_restart() on mon->rs in
CHR_EVENT_OPEN, future HMP commands will fail.
Signed-off-by: Stratos Psomadakis <psomas@grnet.gr>
Signed-off-by: Dimitris Aragiorgis <dimara@grnet.gr>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit e5554e2015)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
This is more of an exercise of the dealloc visitor, where it may
erroneously use an uninitialized discriminator field as indication
that union fields corresponding to that discriminator field/type are
present, which can lead to attempts to free random chunks of heap
memory.
Cc: qemu-stable@nongnu.org
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit cb55111b4e)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
In some cases an input visitor might bail out on filling out a
struct for various reasons, such as missing fields when running
in strict mode. In the case of a QAPI Union type, this may lead
to cases where the .kind field which encodes the union type
is uninitialized. Subsequently, other visitors, such as the
dealloc visitor, may use this .kind value as if it were
initialized, leading to assumptions about the union type which
in this case may lead to segfaults. For example, freeing an
integer value.
However, we can generally rely on the fact that the always-present
.data void * field that we generate for these union types will
always be NULL in cases where .kind is uninitialized (at least,
there shouldn't be a reason where we'd do this purposefully).
So pass this information on to Visitor implementation via these
optional start_union/end_union interfaces so this information
can be used to guard against the situation above. We will make
use of this information in a subsequent patch for the dealloc
visitor.
Cc: qemu-stable@nongnu.org
Reported-by: Fam Zheng <famz@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
(cherry picked from commit cee2dedb85)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
This patch initializes monitor for gdbstub with the qemu_chr_alloc function
instead of just allocating the memory. Initialization function call
is required, because it also creates chr_write_lock mutex, which is used
when writing to this character device.
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 462efe9e53)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
On sPAPR, virtio devices are connected to the PCI bus and use MSI-X.
Commit cc943c36fa has modified MSI-X
so that writes are made using the bus master address space and follow
the IOMMU path.
Unfortunately, the IOMMU address space address space does not have an
MSI window: the notification is silently dropped in unassigned_mem_write
instead of reaching the guest... The most visible effect is that all
virtio devices are non-functional on sPAPR since then. :(
This patch does the following:
1) map the MSI window into the IOMMU address space for each PHB
- since each PHB instantiates its own IOMMU address space, we
can safely map the window at a fixed address (SPAPR_PCI_MSI_WINDOW)
- no real need to keep the MSI window setup in a separate function,
the spapr_pci_msi_init() code moves to spapr_phb_realize().
2) kill the global MSI window as it is not needed in the end
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
(cherry picked from commit 8c46f7ec85)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
commit cc943c36fa
pci: Use bus master address space for delivering MSI/MSI-X messages
breaks virtio-net for rhel6.[56] x86 guests because they don't
enable bus mastering for virtio PCI devices. For the same reason,
rhel6.[56] ppc64 guests cannot boot on a virtio-blk disk anymore.
Old guests forgot to enable bus mastering, enable it automatically on
DRIVER (guests use some devices before DRIVER_OK).
Reported-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Tested-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit e43c0b2ea5)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The spec says (and real HW confirms this) that, if the bus master bit
is 0, the device will not generate any PCI accesses. MSI and MSI-X
messages fall among these, so we should use the corresponding address
space to deliver them. This will prevent delivery if bus master support
is disabled.
Cc: qemu-stable@nongnu.org
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit cc943c36fa)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When we migrate we ask the kernel about its current belief on what the guest
time would be. However, I've seen cases where the kvmclock guest structure
indicates a time more recent than the kvm returned time.
To make sure we never go backwards, calculate what the guest would have seen as time at the point of migration and use that value instead of the kernel returned one when it's more recent.
This bases the view of the kvmclock after migration on the
same foundation in host as well as guest.
Signed-off-by: Alexander Graf <agraf@suse.de>
Cc: qemu-stable@nongnu.org
Reviewed-by: Marcelo Tosatti <mtosatti@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
(cherry picked from commit 9a48bcd1b8)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Add back the PCIe config capabilities on XHCI cards in non-PCIe slots,
but only for machine types before 2.1.
This fixes a migration incompatibility in the XHCI PCI devices
caused by:
058fdcf52c - xhci: add endpoint cap on express bus only
Note that in fixing it for compatibility with older QEMUs, it breaks
compatibility with existing QEMU 2.1's on older machine types.
The status before this patch was (if it used an XHCI adapter):
machine type | source qemu
any pre-2.1 - FAIL
any 2.1... - PASS
With this patch:
machine type | source qemu
any pre-2.1 - PASS
pre-2.1 2.1... - FAIL
2.1 2.1... - PASS
A test to trigger it is to add '-device nec-usb-xhci,id=xhci,addr=0x12'
to the command line.
Cc: qemu-stable@nongnu.org
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
(cherry picked from commit e6043e92c2)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
If memory allocation fails when using the -mem-prealloc command-line
option, QEMU exits without printing any error information to
the user:
# qemu [...] -m 1G -mem-prealloc -mem-path /dev/hugepages
# echo $?
1
This commit adds an error message, so that we print instead:
# qemu [...] -m 1G -mem-prealloc -mem-path /dev/hugepages
qemu: unable to map backing store for hugepages: Cannot allocate memory
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
(cherry picked from commit e4d9df4fb1)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
At present, this function doesn't have partial cleanup implemented,
which will cause resource leaks in some scenarios.
Example:
1. Assume that "dc->realize(dev, &local_err)" executes successful
and local_err == NULL;
2. device hotplug in hotplug_handler_plug() executes but fails
(it is prone to occur). Then local_err != NULL;
3. error_propagate(errp, local_err) and return. But the resources
which have been allocated in dc->realize() will be leaked.
Simple backtrace:
dc->realize()
|->device_realize
|->pci_qdev_init()
|->do_pci_register_device()
|->etc.
Add fuller cleanup logic which assures that function can
goto appropriate error label as local_err population is
detected at each relevant point.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit 1d45a705fc)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Forcefully unrealize all children regardless of errors in earlier
iterations (if any). We should keep going with cleanup operation
rather than report an error immediately. Therefore store the first
child unrealization failure and propagate it at the end. We also
forcefully unregister vmsd and unrealize actual object, too.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Andreas Färber <afaerber@suse.de>
(cherry picked from commit cd4520adca)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Since QEMU 2.1, we are allocating more space for ACPI tables, so no
space is left after initrd for the BIOS to allocate memory.
Besides ACPI tables, there are a few other uses of high memory in
SeaBIOS: SMBIOS tables and USB drivers use it in particular. These uses
allocate a very small amount of memory. Malloc metadata also lives
there. So we need _some_ extra padding there to avoid initrd breakage,
but not much.
John Snow found a case where RHEL5 was broken by the recent change to
ACPI_TABLE_SIZE; in his case 4KB of extra padding are fine, but just to
be safe I am adding 32KB, which is roughly the same amount of padding
that was left by QEMU 2.0 and earlier.
Move initrd to leave some space for the BIOS.
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: John Snow <jsnow@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 438f92ee9f)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
This reverts commit a1bc7b827e422e1ff065640d8ec5347c4aadfcd8.
virtio: don't call device on !vm_running
It turns out that virtio net assumes that vm_running
is updated before device status callback in many places,
so this change leads to asserts.
Previous commit fixes the root issue that motivated
a1bc7b827e422e1ff065640d8ec5347c4aadfcd8 differently,
so there's no longer a need for this change.
In the future, we might be able to drop checking vm_running
completely, and check vm state directly.
Reported-by: Dietmar Maurer <dietmar@proxmox.com>
Cc: qemu-stable@nongnu.org
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 9e8e8c4865)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
On vm stop, vm_running state set to stopped
before device is notified, so callbacks can get envoked with
vm_running = false; and this is not an error.
Cc: qemu-stable@nongnu.org
Acked-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
(cherry picked from commit 131c5221fe)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
This patch is predicated on cc943c, which was dropped from
stable tree for other reasons.
This reverts commit 0824ca6bd1.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2014-09-23 10:48:06 -05:00
169 changed files with 7195 additions and 635 deletions
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.