Compare commits

...

1573 Commits

Author SHA1 Message Date
Gerd Hoffmann
3b1154fff1 Revert "bios: Add fast variant of SeaBIOS for use with -kernel on x86."
This reverts commit 4e04ab6a63.

Also remove pc-bios/bios-fast.bin.

Commit was merged by mistake.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-07-04 17:23:33 +02:00
Peter Maydell
3173a1fd54 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20160704' into staging
target-arm queue:
 * fix semihosting SYS_HEAPINFO call for A64 guests
 * fix crash if guest tries to write to ROM on imx boards
 * armv7m_nvic: fix crash for debugger reads from some registers
 * virt: mark PCIe host controller as dma-coherent in the DT
 * add data-driven register API
 * Xilinx Zynq: add devcfg device model
 * m25p80: fix various bugs
 * ast2400: add SMC controllers and SPI flash slaves

# gpg: Signature made Mon 04 Jul 2016 13:17:34 BST
# gpg:                using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20160704: (23 commits)
  ast2400: create SPI flash slaves
  ast2400: add SPI flash slaves
  ast2400: add SMC controllers (FMC and SPI)
  m25p80: qdev-ify drive property
  m25p80: change cur_addr to 32 bit integer
  m25p80: avoid out of bounds accesses
  m25p80: do not put iovec on the stack
  ssi: change ssi_slave_init to be a realize ops
  xilinx_zynq: Connect devcfg to the Zynq machine model
  dma: Add Xilinx Zynq devcfg device model
  register: Add block initialise helper
  register: QOMify
  register: Define REG and FIELD macros
  register: Add Memory API glue
  register: Add Register API
  bitops: Add MAKE_64BIT_MASK macro
  hw/arm/virt: mark the PCIe host controller as DMA coherent in the DT
  armv7m_nvic: Use qemu_get_cpu(0) instead of current_cpu
  memory: Assert that memory_region_init_rom_device() ops aren't NULL
  imx: Use memory_region_init_rom() for ROMs
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 14:33:05 +01:00
Peter Maydell
9b9611c85d Merge remote-tracking branch 'remotes/kraxel/tags/pull-seabios-20160704-1' into staging
seabios: update from 1.9.1 to 1.9.3

# gpg: Signature made Mon 04 Jul 2016 10:29:47 BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-seabios-20160704-1:
  seabios: update binaries from 1.9.1 to 1.9.3
  seabios: update 128k config
  bios: Add fast variant of SeaBIOS for use with -kernel on x86.
  seabios: update submodule from 1.9.1 to 1.9.3

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:39:31 +01:00
Cédric Le Goater
e1ad9bc405 ast2400: create SPI flash slaves
A set of SPI flash slaves is attached under the flash controllers of
the palmetto platform. "n25q256a" flash modules are used for the BMC
and "mx25l25635e" for the host. These types are common in the
OpenPower ecosystem.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-9-git-send-email-clg@kaod.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Cédric Le Goater
924ed16386 ast2400: add SPI flash slaves
Each controller on the ast2400 has a memory range on which it maps its
flash module slaves. Each slave is assigned a memory segment for its
mapping that can be changed at bootime with the Segment Address
Register. This is not supported in the current implementation so we
are using the defaults provided by the specs.

Each SPI flash slave can then be accessed in two modes: Command and
User. When in User mode, accesses to the memory segment of the slaves
are translated in SPI transfers. When in Command mode, the HW
generates the SPI commands automatically and the memory segment is
accessed as if doing a MMIO. Other SPI controllers call that mode
linear addressing mode.

For this purpose, we are adding below each crontoller an array of
structs gathering for each SPI flash module, a segment rank, a
MemoryRegion to handle the memory accesses and the associated SPI
slave device, which should be a m25p80.

Only the User mode is supported for now but we are preparing ground
for the Command mode. The framework is sufficient to support Linux.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-8-git-send-email-clg@kaod.org
[PMM: Use g_new0() rather than g_malloc0()]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Cédric Le Goater
7c1c69bca4 ast2400: add SMC controllers (FMC and SPI)
The Aspeed AST2400 soc includes a static memory controller for the BMC
which supports NOR, NAND and SPI flash memory modules. This controller
has two modes : the SMC for the legacy interface which supports only
one module and the FMC for the new interface which supports up to five
modules. The AST2400 also includes a SPI only controller used for the
host firmware, commonly called BIOS on Intel. It can be used in three
mode : a SPI master, SPI slave and SPI pass-through

Below is the initial framework for the SMC controller (FMC mode only)
and the SPI controller: the sysbus object, MMIO for registers
configuration and controls. Each controller has a SPI bus and a
configurable number of CS lines for SPI flash slaves.

The differences between the controllers are small, so they are
abstracted using indirections on the register numbers.

Only SPI flash modules are supported.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-7-git-send-email-clg@kaod.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: added one missing error_propagate]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Paolo Bonzini
73bce5187b m25p80: qdev-ify drive property
This allows specifying the property via -drive if=none and creating
the flash device with -device.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-6-git-send-email-clg@kaod.org
[clg: added an extra fix for sabrelite_init()
      keeping the test on flash_dev did not seem necessary. ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Paolo Bonzini
b7f480c3f6 m25p80: change cur_addr to 32 bit integer
The maximum amount of storage that can be addressed by the m25p80 command
set is 4 GiB.  However, cur_addr is currently a 64-bit integer.  To avoid
further problems related to sign extension of signed 32-bit integer
expressions, change cur_addr to a 32 bit integer.  Preserve migration
format by adding a dummy 4-byte field in place of the (big-endian)
high four bytes in the formerly 64-bit cur_addr field.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-5-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Paolo Bonzini
b68cb06093 m25p80: avoid out of bounds accesses
s->cur_addr can be made to point outside s->storage, either by
writing a value >= 128 to s->ear (because s->ear * MAX_3BYTES_SIZE
is a signed integer and sign-extends into the 64-bit cur_addr),
or just by writing an address beyond the size of the flash being
emulated.  Avoid the sign extension to make the code cleaner, and
on top of that mask s->cur_addr to s->size.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-4-git-send-email-clg@kaod.org
Reviewed by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Paolo Bonzini
cace7b801d m25p80: do not put iovec on the stack
When doing a read-modify-write cycle, QEMU uses the iovec after returning
from blk_aio_pwritev.  m25p80 puts the iovec on the stack of blk_aio_pwritev's
caller, which causes trouble in this case.  This has been a problem
since commit 243e6f6 ("m25p80: Switch to byte-based block access",
2016-05-12) started doing writes at a smaller granularity than 512 bytes.
In principle however it could have broken before when using -drive
if=mtd,cache=none on a disk with 4K native sectors.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-3-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Cédric Le Goater
7673bb4cd3 ssi: change ssi_slave_init to be a realize ops
This enables qemu to handle late inits and report errors. All the SSI
slave routine names were changed accordingly. Code was modified to
handle errors when possible (m25p80 and ssi-sd)

Tested with the m25p80 slave object.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1467138270-32481-2-git-send-email-clg@kaod.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Peter Crosthwaite
f4b99537f1 xilinx_zynq: Connect devcfg to the Zynq machine model
Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 85f39c9a13569b1113dacac3b952b0af54fc1260.1467053537.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Alistair Francis
034c2e6902 dma: Add Xilinx Zynq devcfg device model
Add a minimal model for the devcfg device which is part of Zynq.
This model supports DMA capabilities and interrupt generation.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 83df49d8fa2d203a421ca71620809e4b04754e65.1467053537.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Peter Crosthwaite
a74229597e register: Add block initialise helper
Add a helper that will scan a static RegisterAccessInfo Array
and populate a container MemoryRegion with registers as defined.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 347b810b2799e413c98d5bbeca97bcb1557946c3.1467053537.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Peter Crosthwaite
49e14ddbce register: QOMify
QOMify registers as a child of TYPE_DEVICE. This allows registers to
define GPIOs.

Define an init helper that will do QOM initialisation.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 2545f71db26bf5586ca0c08a3e3cf1b217450552.1467053537.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Peter Crosthwaite
684204593d register: Define REG and FIELD macros
Define some macros that can be used for defining registers and fields.

The REG32 macro will define A_FOO, for the byte address of a register
as well as R_FOO for the uint32_t[] register number (A_FOO / 4).

The FIELD macro will define FOO_BAR_MASK, FOO_BAR_SHIFT and
FOO_BAR_LENGTH constants for field BAR in register FOO.

Finally, there are some shorthand helpers for extracting/depositing
fields from registers based on these naming schemes.

Usage can greatly reduce the verbosity of device code.

The deposit and extract macros (eg FIELD_EX32, FIELD_DP32  etc.) can be
used to generate extract and deposits without any repetition of the name
stems.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: bbd87a3c03b1f173b1ed73a6d502c0196c18a72f.1467053537.git.alistair.francis@xilinx.com
[ EI Changes:
  * Add Deposit macros
]
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Alistair Francis
0b73c9bb06 register: Add Memory API glue
Add memory io handlers that glue the register API to the memory API.
Just translation functions at this stage. Although it does allow for
devices to be created without all-in-one mmio r/w handlers.

This patch also adds the RegisterInfoArray struct, which allows all of
the individual RegisterInfo structs to be grouped into a single memory
region.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: f7704d8ac6ac0f469ed35401f8151a38bd01468b.1467053537.git.alistair.francis@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Alistair Francis
1599121b57 register: Add Register API
This API provides some encapsulation of registers and factors out some
common functionality to common code. Bits of device state (usually MMIO
registers) often have all sorts of access restrictions and semantics
associated with them. This API allows you to define what those
restrictions are on a bit-by-bit basis.

Helper functions are then used to access the register which observe the
semantics defined by the RegisterAccessInfo struct.

Some features:
Bits can be marked as read_only (ro field)
Bits can be marked as write-1-clear (w1c field)
Bits can be marked as reserved (rsvd field)
Reset values can be defined (reset)
Bits can be marked clear on read (cor)
Pre and post action callbacks can be added to read and write ops
Verbose debugging info can be enabled/disabled

Useful for defining device register spaces in a data driven way. Cuts
down on a lot of the verbosity and repetition in the switch-case blocks
in the standard foo_mmio_read/write functions.

Also useful for automated generation of device models from hardware
design sources.

Signed-off-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 40d62c7e1bf6e63bb4193ec46b15092a7d981e59.1467053537.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Alistair Francis
ae2923b5c2 bitops: Add MAKE_64BIT_MASK macro
Add a macro that creates a 64bit value which has length number of ones
shifted across by the value of shift.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 9773244aa1c8c26b8b82cb261d8f5dd4b7b9fcf9.1467053537.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Ard Biesheuvel
5d636e21c4 hw/arm/virt: mark the PCIe host controller as DMA coherent in the DT
Since QEMU performs cacheable accesses to guest memory when doing DMA
as part of the implementation of emulated PCI devices, guest drivers
should use cacheable accesses as well when running under KVM. Since this
essentially means that emulated PCI devices are DMA coherent, set the
'dma-coherent' DT property on the PCIe host controller DT node.

This brings the DT description into line with the ACPI description,
which already marks the PCI bridge as cache coherent (see commit
bc64b96c98).

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 1467134090-5099-1-git-send-email-ard.biesheuvel@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Andrey Smirnov
a19861666b armv7m_nvic: Use qemu_get_cpu(0) instead of current_cpu
Starting QEMU with -S results in current_cpu containing its initial
value of NULL. It is however possible to connect to such QEMU instance
and query various CPU registers, one example being CPUID, and doing that
results in QEMU segfaulting.

Using qemu_get_cpu(0) seem reasonable enough given that ARMv7M
architecture is a single core architecture.

Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 13:15:22 +01:00
Peter Maydell
39e0b03dec memory: Assert that memory_region_init_rom_device() ops aren't NULL
It doesn't make sense to pass a NULL ops argument to
memory_region_init_rom_device(), because the effect will
be that if the guest tries to write to the memory region
then QEMU will segfault. Catch the bug earlier by sanity
checking the arguments to this function, and remove the
misleading documentation that suggests that passing NULL
might be sensible.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1467122287-24974-4-git-send-email-peter.maydell@linaro.org
2016-07-04 13:06:35 +01:00
Peter Maydell
a7aeb5f7b2 imx: Use memory_region_init_rom() for ROMs
The imx boards were all incorrectly creating ROMs using
memory_region_init_rom_device() with a NULL ops pointer. This
will cause QEMU to abort if the guest tries to write to the
ROM. Switch to the new memory_region_init_rom() instead.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1467122287-24974-3-git-send-email-peter.maydell@linaro.org
2016-07-04 13:06:35 +01:00
Peter Maydell
a1777f7f64 memory: Provide memory_region_init_rom()
Provide a new helper function memory_region_init_rom() for memory
regions which are read-only (and unlike those created by
memory_region_init_rom_device() don't have special behaviour
for writes). This has the same behaviour as calling
memory_region_init_ram() and then memory_region_set_readonly()
(which is what we do today in boards with pure ROMs) but is a
more easily discoverable API for the purpose.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1467122287-24974-2-git-send-email-peter.maydell@linaro.org
2016-07-04 13:06:35 +01:00
Peter Maydell
f5666418c4 target-arm/arm-semi.c: Fix SYS_HEAPINFO for 64-bit guests
SYS_HEAPINFO is one of the few semihosting calls which has to write
values back into a parameter block in memory.  When we added
support for 64-bit semihosting we updated the code which reads from
the parameter block to read 64-bit words but forgot to change the
code that writes back into the block. Update it to treat the
block as a set of words of the appropriate width for the guest.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1466783381-29506-3-git-send-email-peter.maydell@linaro.org
2016-07-04 13:06:35 +01:00
Peter Maydell
d317091d5e linux-user: Make semihosting heap/stack fields abi_ulongs
The fields in the TaskState heap_base, heap_limit and stack_base
are all guest addresses (representing the locations of the heap
and stack for the guest binary), so they should be abi_ulong
rather than uint32_t. (This only in practice affects ARM AArch64
since all the other semihosting implementations are 32-bit.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Message-id: 1466783381-29506-2-git-send-email-peter.maydell@linaro.org
2016-07-04 13:06:35 +01:00
Peter Maydell
e2c8f9e44e Merge remote-tracking branch 'remotes/thibault/tags/samuel-thibault' into staging
slirp updates

# gpg: Signature made Sun 03 Jul 2016 23:03:04 BST
# gpg:                using RSA key 0xE3E51CE8FB6B2F1D
# gpg: Good signature from "Samuel Thibault <samuel.thibault@gnu.org>"
# gpg:                 aka "Samuel Thibault <sthibault@debian.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@inria.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@labri.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@ens-lyon.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 900C B024 B679 31D4 0F82  304B D017 8C76 7D06 9EE6
#      Subkey fingerprint: F632 74CD C630 0873 CB3D  29D9 E3E5 1CE8 FB6B 2F1D

* remotes/thibault/tags/samuel-thibault:
  slirp: Add support for stateless DHCPv6
  slirp: Remove superfluous memset() calls from the TFTP code
  slirp: Add RDNSS advertisement
  slirp: Support link-local DNS addresses
  slirp: Add dns6 resolution
  slirp: Split get_dns_addr

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-04 10:49:17 +01:00
Gerd Hoffmann
6e03a28e1c seabios: update binaries from 1.9.1 to 1.9.3
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-07-04 11:28:58 +02:00
Gerd Hoffmann
ea996ebd65 seabios: update 128k config
Turn off mpt-scsi and bootsplash to keep size below 128k.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-07-04 11:28:58 +02:00
Richard W.M. Jones
4e04ab6a63 bios: Add fast variant of SeaBIOS for use with -kernel on x86.
This commit adds a fast variant of SeaBIOS called 'bios-fast.bin'.

It's designed to be the fastest (also the smallest, but that's not the
main aim) SeaBIOS that is just enough to boot a Linux kernel using the
-kernel option on i686 and x86_64.

This commit does not modify the -kernel option to use this.  You have
to specify it by doing something like this:

  -kernel vmlinuz -bios bios-fast.bin

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-07-04 11:28:58 +02:00
Gerd Hoffmann
8692aa2979 seabios: update submodule from 1.9.1 to 1.9.3
git shortlog
============

Alex Williamson (1):
      fw/pci: Add support for mapping Intel IGD via QEMU

Haozhong Zhang (1):
      fw/msr_feature_control: add support to set MSR_IA32_FEATURE_CONTROL

Kevin O'Connor (1):
      build: fix .text section address alignment

Marcel Apfelbaum (1):
      fw/pci: add Q35 S3 support

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-07-04 11:28:58 +02:00
Thomas Huth
7b143999f2 slirp: Add support for stateless DHCPv6
Provide basic support for stateless DHCPv6 (see RFC 3736) so
that guests can also automatically boot via IPv6 with SLIRP
(for IPv6 network booting, see RFC 5970 for details).

Tested with:

    qemu-system-ppc64 -nographic -vga none -boot n -net nic \
        -net user,ipv6=yes,ipv4=no,tftp=/path/to/tftp,bootfile=ppc64.img

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2016-07-03 23:59:42 +02:00
Thomas Huth
e5857062a6 slirp: Remove superfluous memset() calls from the TFTP code
Commit fad7fb9ccd  ("Add IPv6 support to the TFTP code")
refactored some common code for preparing the mbuf into a new
function called tftp_prep_mbuf_data(). One part of this common
code is to do a "memset(m->m_data, 0, m->m_size);" for the related
buffer first. However, at two spots, the memset() was not removed
from the calling function, so it currently done twice in these code
paths. Thus let's delete these superfluous memsets in the calling
functions now.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2016-07-03 23:59:42 +02:00
Samuel Thibault
f7725df387 slirp: Add RDNSS advertisement
This adds the RDNSS option to IPv6 router advertisements, so that the guest
can autoconfigure the DNS server address.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>

---
Changes since last submission:
- Disable on windows, until we have support for it
2016-07-03 23:31:12 +02:00
Samuel Thibault
ef763fa4bd slirp: Support link-local DNS addresses
They look like fe80::%eth0

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>

---
Changes since last submission:
- fix windows build
2016-07-03 23:29:13 +02:00
Samuel Thibault
1d17654e76 slirp: Add dns6 resolution
This makes get_dns_addr address family-agnostic, thus allowing to add the
IPv6 case.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2016-07-03 23:27:08 +02:00
Samuel Thibault
972487b878 slirp: Split get_dns_addr
Separate get_dns_addr into get_dns_addr_cached and get_dns_addr_resolv_conf
to make conversion to IPv6 easier.

Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
2016-07-03 23:24:54 +02:00
Peter Maydell
96b39d8327 Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
Only trivial fixes.

# gpg: Signature made Fri 01 Jul 2016 13:39:06 BST
# gpg:                using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <gkurz@fr.ibm.com>"
# gpg:                 aka "Greg Kurz <groug@free.fr>"
# gpg:                 aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg:                 aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg:                 aka "Gregory Kurz (Cimai Technology) <gkurz@cimai.com>"
# gpg:                 aka "Gregory Kurz (Meiosys Technology) <gkurz@meiosys.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894  DBA2 02FC 3AEB 0101 DBC2

* remotes/gkurz/tags/for-upstream:
  9p: synth: drop v9fs_ prefix
  9p: don't include <sys/uio.h>

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-01 19:29:27 +01:00
Alexander Shopov
9a48e36700 Added Bulgarian translation
Signed-off-by: Alexander Shopov <ash@kambanaria.org>
Message-id: 20160626105922.40590-2-ash@kambanaria.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-01 16:06:57 +01:00
Greg Kurz
b05528b533 9p: synth: drop v9fs_ prefix
To have shorter lines and be consistent with other fs devices.

Acked-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
2016-07-01 14:38:54 +02:00
Greg Kurz
8d85a22aab 9p: don't include <sys/uio.h>
The <sys/uio.h> system header doesn't exist on all host platforms. Code
should include "qemu/osdep.h" instead to avoid build breaks on plafforms
that don't define CONFIG_IOVEC (like win32, if it is to support 9p one day).

Acked-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Michael Fritscher <michael@fritscher.net>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
2016-07-01 14:38:54 +02:00
Peter Maydell
1b756f1abf Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160701' into staging
ppc patch queue 2016-07-01

Here's the current ppc patch queue.  This is a fairly large batch,
containing:
    * A number of further preliminary patches towards full hypervisor
      mode emulation
    * Some further fixes / cleanups for the recently merged device_add
      based CPU hotplug
    * Preliminary patches towards supporting a native (rather than
      paravirtualized) XICS device.  This will be needed to emulate a
      physical Power machine, including hypervisor capabilities
    * Assorted bug fixes

# gpg: Signature made Fri 01 Jul 2016 06:56:35 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.7-20160701: (23 commits)
  qmp: fix spapr example of query-hotpluggable-cpus
  spapr: drop duplicate variable in spapr_core_release()
  spapr: do proper error propagation in spapr_cpu_core_realize_child()
  spapr: drop reference on child object during core realization
  spapr: Restore support for 970MP and POWER8NVL CPU cores
  target-ppc: gen_pause for instructions: yield, mdoio, mdoom, miso
  ppc/xics: Replace "icp" with "xics" in most places
  ppc/xics: Implement H_IPOLL using an accessor
  ppc/xics: Move SPAPR specific code to a separate file
  ppc/xics: Rename existing xics to xics_spapr
  ppc: Fix 64K pages support in full emulation
  target-ppc: Eliminate redundant and incorrect function booke206_page_size_to_tlb
  spapr: Restore support for older PowerPC CPU cores
  spapr: fix write-past-end-of-array error in cpu core device init code
  hw/ppc/spapr: Add some missing hcall function set strings
  ppc: Print HSRR0/HSRR1 in "info registers"
  ppc: LPCR is a HV resource
  ppc: Initial HDEC support
  ppc: Enforce setting MSR:EE,IR and DR when MSR:PR is set
  ppc: Fix conditions for delivering external interrupts to a guest
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-01 13:31:48 +01:00
Peter Maydell
94e31093ff Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20160630.0' into staging
VFIO updates 2016-06-30

 - Fix VGA quirks (stable 2.6) (Alex Williamson)
 - Registering PCIe extended capabilities (Chen Fan)
 - Hide read-only SR-IOV capability from VM (Alex Williamson)
 - MemoryRegionIOMMUOps.notify_started/stopped (Alexey Kardashevskiy)
 - hw_error on intel_iommu notify_started  (Alex Williamson)

# gpg: Signature made Thu 30 Jun 2016 20:45:55 BST
# gpg:                using RSA key 0x239B9B6E3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B  8A90 239B 9B6E 3BB0 8B22

* remotes/awilliam/tags/vfio-update-20160630.0:
  intel_iommu: Throw hw_error on notify_started
  memory: Add MemoryRegionIOMMUOps.notify_started/stopped callbacks
  vfio/pci: Hide SR-IOV capability
  vfio: add pcie extended capability support
  vfio/pci: Fix VGA quirks

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-01 11:52:14 +01:00
Peter Maydell
1fb4c13e4f Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2016-06-30' into staging
QAPI patches 2016-06-30

# gpg: Signature made Thu 30 Jun 2016 14:29:43 BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-qapi-2016-06-30:
  qapi: Fix memleak in string visitors on int lists
  qapi: Simplify use of range.h
  range: Create range.c for code that should not be inline
  qapi: Fix crash on missing alternate member of QAPI struct
  checkpatch: There is no qemu_strtod()
  qobject: Correct JSON lexer grammar comments
  json-streamer: Don't leak tokens on incomplete parse

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-07-01 11:18:01 +01:00
Igor Mammedov
13f5e8003e qmp: fix spapr example of query-hotpluggable-cpus
27393c33 qapi: keep names in 'CpuInstanceProperties' in sync with struct CPUCore
added -id suffix to property names but forgot to fix example in qmp-commands.hx

Fix example to have 'core-id' instead of 'core' to match current code

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:47 +10:00
Greg Kurz
8a1eb71bd8 spapr: drop duplicate variable in spapr_core_release()
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:47 +10:00
Greg Kurz
f11235b920 spapr: do proper error propagation in spapr_cpu_core_realize_child()
This patch changes spapr_cpu_core_realize_child() to have a local error
pointer and use error_propagate() as it is supposed to be done.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:47 +10:00
Greg Kurz
8e758dee66 spapr: drop reference on child object during core realization
When a core is being realized, we create a child object for each thread
of the core.

The child is first initialized with object_initialize() which sets its ref
count to 1, and then added to the core with object_property_add_child()
which bumps the ref count to 2.

When the core gets released, object_unparent() decreases the ref count to 1,
and we g_free() the object: we hence loose the reference on an unfinalized
object. This is likely to cause random crashes.

Let's drop the extra reference as soon as we don't need it, after the
thread is added to the core.

Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:47 +10:00
Bharata B Rao
470f215787 spapr: Restore support for 970MP and POWER8NVL CPU cores
Introduction of core based CPU hotplug for PowerPC sPAPR didn't
add support for 970MP and POWER8NVL based core types. Add support for
the same.

While we are here, add support for explicit specification of POWER5+_v2.1
core type.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:47 +10:00
Aaron Larson
9e196938aa target-ppc: gen_pause for instructions: yield, mdoio, mdoom, miso
Call gen_pause for all "or rx,rx,rx" encodings other nop.  This
provides a reasonable implementation for yield, and a better
approximation for mdoio, mdoom, and miso.  The choice to pause for all
encodings !=0 leverages the PowerISA admonition that the reserved
encodings might change program priority, providing a slight "future
proofing".

Signed-off-by: Aaron Larson <alarson@ddci.com>
Acked-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:47 +10:00
Benjamin Herrenschmidt
27f2458245 ppc/xics: Replace "icp" with "xics" in most places
The "ICP" is a different object than the "XICS". For historical reasons,
we have a number of places where we name a variable "icp" while it contains
a XICSState pointer. There *is* an ICPState structure too so this makes
the code really confusing.

This is a mechanical replacement of all those instances to use the name
"xics" instead. There should be no functional change.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[spapr_cpu_init has been moved to spapr_cpu_core.c, change there]
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:47 +10:00
Benjamin Herrenschmidt
1cbd222055 ppc/xics: Implement H_IPOLL using an accessor
None of the other presenter functions directly mucks with the
internal state, so don't do it there either.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:47 +10:00
Benjamin Herrenschmidt
9c7027ba94 ppc/xics: Move SPAPR specific code to a separate file
Leave the core ICP/ICS logic in xics.c and move the top level
class wrapper, hypercall and RTAS handlers to xics_spapr.c

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[add cpu.h in xics_spapr.c, move set_nr_irqs and set_nr_servers to
 xics_spapr.c]
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:46 +10:00
Benjamin Herrenschmidt
161deaf225 ppc/xics: Rename existing xics to xics_spapr
The common class doesn't change, the KVM one is sPAPR specific. Rename
variables and functions to xics_spapr.

Retain the type name as "xics" to preserve migration for existing sPAPR
guests.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 13:41:46 +10:00
Benjamin Herrenschmidt
4322e8ced5 ppc: Fix 64K pages support in full emulation
We were always advertising only 4K & 16M. Additionally the code wasn't
properly matching the page size with the PTE content, which meant we
could potentially hit an incorrect PTE if the guest used multiple sizes.

Finally, honor the CPU capabilities when decoding the size from the SLB
so we don't try to use 64K pages on 970.

This still doesn't add support for MPSS (Multiple Page Sizes per Segment)

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: fixed checkpatch.pl errors
      commits 61a36c9b5a and 1114e712c9 reworked the hpte code
      doing insertion/removal in hw/ppc/spapr_hcall.c. The hunks
      modifying these areas were removed. ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Aaron Larson
a36848ff7c target-ppc: Eliminate redundant and incorrect function booke206_page_size_to_tlb
Eliminate redundant and incorrect booke206_page_size_to_tlb function
from ppce500_spin.c in preference to previously existing but newly
exported definition from e500.c

Defect analysis:

The booke206_page_size_to_tlb function in e500.c was updated in commit
2bd9543 "ppc: booke206: use MAV=2.0 TSIZE definition, fix 4G pages" to
reflect a change in the definition of MAS1_TSIZE_SHIFT from 8
(corresponding to a min TLB page size of 4kb) to a value of 7 (TLB
page size 2k).  The booke206_page_size_to_tlb() function defined in
ppce500_spin.c was never updated to reflect the change in
MAS1_TSIZE_SHIFT.

In http://lists.nongnu.org/archive/html/qemu-ppc/2016-06/msg00533.html,
Scott Wood suggested this "root cause" explanation:

SW> The patch that changed MAS1_TSIZE_SHIFT from 8 to 7 was around the
SW> same time as the patch that added this code, which is probably why
SW> adjusting it got missed.  Commit 2bd9543cd3 did update the
SW> equivalent code in ppce500_mpc8544ds.c, which now resides in
SW> hw/ppc/e500.c and has been changed to not assume a power-of-2
SW> size.  The ppce500_spin version should be eliminated.

Signed-off-by: Aaron Larson <alarson@ddci.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Bharata B Rao
ff461b8da9 spapr: Restore support for older PowerPC CPU cores
Introduction of core based CPU hotplug for PowerPC sPAPR didn't
add support for 970 and POWER5+ based core types. Add support for
the same.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Greg Kurz
dde35bc966 spapr: fix write-past-end-of-array error in cpu core device init code
This fixes a potential QEMU crash introduced by commit 3b54254966.

Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Thomas Huth
6cc09e261b hw/ppc/spapr: Add some missing hcall function set strings
Add "hcall-sprg0" (for H_SET_SPRG0), "hcall-copy" (for H_PAGE_INIT)
and "hcall-debug" (for H_LOGICAL_CI_LOAD/STORE) to the property
"ibm,hypertas-functions" to indicate that we support these hypercalls.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Benjamin Herrenschmidt
f2b70fded9 ppc: Print HSRR0/HSRR1 in "info registers"
They are generally useful when debugging HV mode stuff

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: fixed checkpatch.pl errors ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Benjamin Herrenschmidt
635dff20a3 ppc: LPCR is a HV resource
Don't allow access in guest mode

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Benjamin Herrenschmidt
4b236b621b ppc: Initial HDEC support
The current behaviour isn't completely right, as for the DEC, we
don't properly re-arm when wrapping around, but I will fix this
in a separate patch.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: fixed checkpatch.pl errors ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Benjamin Herrenschmidt
b378bb0948 ppc: Enforce setting MSR:EE,IR and DR when MSR:PR is set
The architecture specifies that any instruction that sets MSR:PR will also
set MSR:EE, IR and DR.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Benjamin Herrenschmidt
d1dbe37c1e ppc: Fix conditions for delivering external interrupts to a guest
External interrupts can bypass the MSR_EE test if they occur in guest
mode and LPES0 is clear. In that case they are directed to the hypervisor

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Benjamin Herrenschmidt
4b3fc37788 ppc: Use a helper to filter writes to LPCR
This handles filtering bits based on what is implemented by a
given architecture version. We also use it to copy to LPCR
some of the relevant 970 HID4 bits.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: fixed checkpatch.pl errors ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Benjamin Herrenschmidt
88536935c0 ppc: Update LPCR definitions
Includes all the bits up to ISA 2.07

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: fixed checkpatch.pl errors ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Benjamin Herrenschmidt
8eeb330c69 ppc: Add a bunch of hypervisor SPRs to Book3s
We don't give them a KVM reg number yet as no current KVM version
supports HV mode.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: SPRs AMOR,DAWR,DARWX were already included in commit f401dd32cb]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-07-01 09:57:01 +10:00
Alex Williamson
3cb3b1549f intel_iommu: Throw hw_error on notify_started
We don't currently support the MemoryRegionIOMMUOps notifier, so throw
an error should a device require it.

Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Peter Xu <peterx@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-06-30 13:00:24 -06:00
Alexey Kardashevskiy
d22d8956b1 memory: Add MemoryRegionIOMMUOps.notify_started/stopped callbacks
The IOMMU driver may change behavior depending on whether a notifier
client is present.  In the case of POWER, this represents a change in
the visibility of the IOTLB, for other drivers such as intel-iommu and
future AMD-Vi emulation, notifier support is not yet enabled and this
provides the opportunity to flag that incompatibility.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Peter Xu <peterx@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
[new log & extracted from [PATCH qemu v17 12/12] spapr_iommu, vfio, memory: Notify IOMMU about starting/stopping listening]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-06-30 13:00:23 -06:00
Alex Williamson
e37dac06dc vfio/pci: Hide SR-IOV capability
The kernel currently exposes the SR-IOV capability as read-only
through vfio-pci.  This is sufficient to protect the host kernel, but
has the potential to confuse guests without further virtualization.
In particular, OVMF tries to size the VF BARs and comes up with absurd
results, ending with an assert.  There's not much point in adding
virtualization to a read-only capability, so we simply hide it for
now.  If the kernel ever enables SR-IOV virtualization, we should
easily be able to test it through VF BAR sizing or explicit flags.

Testing whether we should parse extended capabilities is also pulled
into the function to keep these assumptions in one place.

Tested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-06-30 13:00:23 -06:00
Chen Fan
325ae8d548 vfio: add pcie extended capability support
For vfio pcie device, we could expose the extended capability on
PCIE bus. due to add a new pcie capability at the tail of the chain,
in order to avoid config space overwritten, we introduce a copy config
for parsing extended caps. and rebuild the pcie extended config space.

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Tested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-06-30 13:00:23 -06:00
Alex Williamson
4d3fc4fdc6 vfio/pci: Fix VGA quirks
Commit 2d82f8a3cd ("vfio/pci: Convert all MemoryRegion to dynamic
alloc and consistent functions") converted VFIOPCIDevice.vga to be
dynamically allocted, negating the need for VFIOPCIDevice.has_vga.
Unfortunately not all of the has_vga users were converted, nor was
the field removed from the structure.  Correct these oversights.

Reported-by: Peter Maloney <peter.maloney@brockmann-consult.de>
Tested-by: Peter Maloney <peter.maloney@brockmann-consult.de>
Fixes: 2d82f8a3cd ("vfio/pci: Convert all MemoryRegion to dynamic alloc and consistent functions")
Fixes: https://bugs.launchpad.net/qemu/+bug/1591628
Cc: qemu-stable@nongnu.org
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-06-30 13:00:22 -06:00
Peter Maydell
ddf31aa853 linux-user: Fix compilation when F_SETPIPE_SZ isn't defined
Older kernels don't have F_SETPIPE_SZ and F_GETPIPE_SZ (in
particular RHEL6's system headers don't define these). Add
ifdefs so that we can gracefully fall back to not supporting
those guest ioctls rather than failing to build.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-id: 1467304429-21470-1-git-send-email-peter.maydell@linaro.org
2016-06-30 19:27:51 +01:00
Paolo Bonzini
8a0b4de048 pcspk: fix KVM
The link property that was added to the pcspk device has the wrong type:
it is only correct for TCG and for KVM's userspace or split irqchip
options.  The default KVM option (fully in-kernel irqchip) breaks
because it uses a PIT whose type is a sibling of TYPE_I8254.

Fixes: 873b4d3f05
Tested-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1467298657-6588-1-git-send-email-pbonzini@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-30 19:00:02 +01:00
Eric Blake
db486cc334 qapi: Fix memleak in string visitors on int lists
Commit 7f8f9ef1 introduced the ability to store a list of
integers as a sorted list of ranges, but when merging ranges,
it leaks one or more ranges.  It was also using range_get_last()
incorrectly within range_compare() (a range is a start/end pair,
but range_get_last() is for start/len pairs), and will also
mishandle a range ending in UINT64_MAX (remember, we document
that no range covers 2**64 bytes, but that ranges that end on
UINT64_MAX have end < begin).

The whole merge algorithm was rather complex, and included
unnecessary passes over data within glib functions, and enough
indirection to make it hard to easily plug the data leaks.
Since we are already hard-coding things to a list of ranges,
just rewrite the thing to open-code the traversal and
comparisons, by making the range_compare() helper function give
us an answer that is easier to use, at which point we avoid the
need to pass any callbacks to g_list_*(). Then by reusing
range_extend() instead of duplicating effort with range_merge(),
we cover the corner cases correctly.

Drop the now-unused range_merge() and ranges_can_merge().

Doing this lets test-string-{input,output}-visitor pass under
valgrind without leaks.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1464712890-14262-4-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Comment hoisted out of loop]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-06-30 15:28:54 +02:00
Eric Blake
7c47959d0c qapi: Simplify use of range.h
Calling our function g_list_insert_sorted_merged is a misnomer,
since we are NOT writing a glib function.  Furthermore, we are
making every caller pass the same comparator function of
range_merge(): any caller that would try otherwise would break
in weird ways since our internal call to ranges_can_merge() is
hard-coded to operate only on ranges, rather than paying
attention to the caller's comparator.

Better is to fix things so that callers don't have to care about
our internal comparator, by picking a function name and updating
the parameter type away from a gratuitous use of void*, to make
it obvious that we are operating specifically on a list of ranges
and not a generic list.  Plus, refactoring the code here will
make it easier to plug a memory leak in the next patch.

range_compare() is now internal only, and moves to the .c file.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1464712890-14262-3-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-06-30 15:28:51 +02:00
Eric Blake
fec0fc0a13 range: Create range.c for code that should not be inline
g_list_insert_sorted_merged() is rather large to be an inline
function; move it to its own file.  range_merge() and
ranges_can_merge() can likewise move, as they are only used
internally.  Also, it becomes obvious that the condition within
range_merge() is already satisfied by its caller, and that the
return value is not used.

The diffstat is misleading, because of the copyright boilerplate.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1464712890-14262-2-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-06-30 15:28:40 +02:00
Eric Blake
9b4e38fe6a qapi: Fix crash on missing alternate member of QAPI struct
If a QAPI struct has a mandatory alternate member which is not
present on input, the input visitor reports an error for the
missing alternate without setting the discriminator, but the
cleanup code for the struct still tries to use the dealloc
visitor to clean up the alternate.

Commit dbf11922 changed visit_start_alternate to set *obj to NULL
when an error occurs, where it was previously left untouched.
Thus, before the patch, the dealloc visitor is blindly trying to
cleanup whatever branch corresponds to (*obj)->type == 0 (that is,
QTYPE_NONE, because *obj still pointed to zeroed memory), which
selects the default branch of the switch and sets an error, but
this second error is ignored by the way the dealloc visitor is
used; but after the patch, the attempt to switch dereferences NULL.

When cleaning up after a partial object parse, we specifically
check for !*obj after visit_start_struct() (see gen_visit_object());
doing the same for alternates fixes the crash. Enhance the testsuite
to give coverage for both missing struct and missing alternate
members.

Also add an abort - we expect visit_start_alternate() to either set an
error or to set (*obj)->type to a valid QType that corresponds to
actual user input, and QTYPE_NONE should never be reachable from valid
input.  Had the abort() been in place earlier, we might have noticed
the dealloc visitor dereferencing bogus zeroed memory prior to when
commit dbf11922 forced our hand by setting *obj to NULL and causing a
fault.

Test case:

{'execute':'blockdev-add', 'arguments':{'options':{'driver':'raw'}}}

The choice of 'driver':'raw' selects a BlockdevOptionsGenericFormat
struct, which has a mandatory 'file':'BlockdevRef' in QAPI.  Since
'file' is missing as a sibling of 'driver', this should report a
graceful error rather than fault.  After this patch, we are back to:

{"error": {"class": "GenericError", "desc": "Parameter 'file' is missing"}}

Generated code in qapi-visit.c changes as:

|@@ -2444,6 +2444,9 @@ void visit_type_BlockdevRef(Visitor *v,
|     if (err) {
|         goto out;
|     }
|+    if (!*obj) {
|+        goto out_obj;
|+    }
|     switch ((*obj)->type) {
|     case QTYPE_QDICT:
|         visit_start_struct(v, name, NULL, 0, &err);
|@@ -2459,10 +2462,13 @@ void visit_type_BlockdevRef(Visitor *v,
|     case QTYPE_QSTRING:
|         visit_type_str(v, name, &(*obj)->u.reference, &err);
|         break;
|+    case QTYPE_NONE:
|+        abort();
|     default:
|         error_setg(&err, QERR_INVALID_PARAMETER_TYPE, name ? name : "null",
|                    "BlockdevRef");
|     }
|+out_obj:
|     visit_end_alternate(v);

Reported by Kashyap Chamarthy <kchamart@redhat.com>
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1466012271-5204-1-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Tested-by: Kashyap Chamarthy <kchamart@redhat.com>
[Commit message tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-06-30 15:24:36 +02:00
Eric Blake
01fb8e192d checkpatch: There is no qemu_strtod()
Maybe there should be; but until there is, we should not flag
strtod() calls as something to replaced with qemu_strtod().

We also lack qemu_strtof() and qemu_strtold(), but as no one
has been using strtof() or strtold(), it's not worth complicating
the regex for them.

(Ironically, I had to use 'git commit -n' since checkpatch uses
TAB indents, in violation of its own recommendations.)

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1465526889-8339-3-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-06-30 15:24:36 +02:00
Eric Blake
ff5394ad5b qobject: Correct JSON lexer grammar comments
Fix the regex comments describing what we parse as JSON.  No change
to the lexer itself, just to the comments:
- The "" and '' string construction was missing alternation between
different escape sequences
- The construction for numbers forgot to handle optional leading '-'
- The construction for numbers was grouped incorrectly so that it
didn't permit '0.1'
- The construction for numbers forgot to mark the exponent as optional
- No mention that our '' string and "\'" are JSON extensions
- No mention of our %d and related extensions when constructing JSON

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1465526889-8339-2-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Eric's regexp simplification squashed in]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-06-30 15:24:36 +02:00
Eric Blake
ba4dba5434 json-streamer: Don't leak tokens on incomplete parse
Valgrind complained about a number of leaks in
tests/check-qobject-json:

==12657==    definitely lost: 17,247 bytes in 1,234 blocks

All of which had the same root cause: on an incomplete parse,
we were abandoning the token queue without cleaning up the
allocated data within each queue element.  Introduced in
commit 95385fe, when we switched from QList (which recursively
frees contents) to g_queue (which does not).

We don't yet require glib 2.32 with its g_queue_free_full(),
so open-code it instead.

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1463608012-12760-1-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-06-30 15:24:36 +02:00
Markus Armbruster
297e8005f8 MAINTAINERS: Remove Blue Swirl leftovers
Blue hasn't been active in the QEMU project for a long time.  Drop his
last MAINTAINERS entries.

As per Paolo's recommendation, downgrade status of "BSD user" from
Maintained to Orphan since the FreeBSD guys effectively forked it, and
"SPARC target" from Maintained to Odd Fixes, since we still have the
overall TCG maintainer looking after it.

I'm leaving Checkpatch's status at Odd Fixes.  Calling it Maintained
wouldn't be wrong, but I'm not comfortable upgrading it while nobody
is willing to have his name nailed to the thing.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-30 13:34:49 +01:00
Greg Kurz
8c1cd719a5 MAINTAINERS: update email address for Greg Kurz
While here, also add a section for the tree I use for 9p.

Signed-off-by: Greg Kurz <groug@kaod.org>
Acked-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Cédric Le Goater <clg@fr.ibm.com>
Acked-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Message-id: 146617410554.7281.1733165006203821878.stgit@bahia.lan
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-30 13:32:15 +01:00
Peter Maydell
1ec20c2a3a Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* serial port fixes (Paolo)
* Q35 modeling improvements (Paolo, Vasily)
* chardev cleanup improvements (Marc-André)
* iscsi bugfix (Peter L.)
* cpu_exec patch from multi-arch patches (Peter C.)
* pci-assign tweak (Lin Ma)

# gpg: Signature made Wed 29 Jun 2016 15:56:30 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (35 commits)
  socket: unlink unix socket on remove
  socket: add listen feature
  char: clean up remaining chardevs when leaving
  vhost-user: disable chardev handlers on close
  vhost-user-test: fix g_cond_wait_until compat implementation
  vl: smp_parse: fix regression
  ich9: implement SCI_IRQ_SEL register
  ich9: implement ACPI_EN register
  serial: reinstate watch after migration
  serial: remove watch on reset
  char: change qemu_chr_fe_add_watch to return unsigned
  serial: separate serial_xmit and serial_watch_cb
  serial: simplify tsr_retry reset
  serial: make tsr_retry unsigned
  iscsi: fix assertion in is_sector_request_lun_aligned
  target-*: Don't redefine cpu_exec()
  pci-assign: Move "Invalid ROM" error message to pci-assign-load-rom.c
  vnc: generalize "VNC server running on ..." message
  scsi: esp: fix migration
  MC146818 RTC: add GPIO access to output IRQ
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-29 19:14:48 +01:00
Peter Maydell
ef8757f1fe Merge remote-tracking branch 'remotes/cody/tags/block-pull-request' into staging
# gpg: Signature made Wed 29 Jun 2016 04:09:26 BST
# gpg:                using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg:                 aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg:                 aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98  D624 BDBE 7B27 C0DE 3057

* remotes/cody/tags/block-pull-request:
  mirror: fix misleading comments
  blockjob: assert(cb) when create job
  iotests: add small-granularity mirror test
  mirror: limit niov to IOV_MAX elements, again
  mirror: clarify mirror_do_read return code
  block/gluster: add support for selecting debug logging level
  mirror: fix trace_mirror_yield_in_flight usage in mirror_iteration()
  block/nfs: add support for libnfs pagecache
  block/nfs: refuse readahead if cache.direct is on
  block/gluster: add support for SEEK_DATA/SEEK_HOLE

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-29 16:08:49 +01:00
Marc-André Lureau
74b6ce43e3 socket: unlink unix socket on remove
qemu leaves unix socket files behind when removing a listening chardev
or leaving. qemu could clean that up, even if doing so isn't race-free.

Fixes:
https://bugzilla.redhat.com/show_bug.cgi?id=1347077

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1466105332-10285-4-git-send-email-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 16:49:41 +02:00
Marc-André Lureau
3fa27a9a1e socket: add listen feature
Add a flag to tell whether the channel socket is listening.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1466105332-10285-3-git-send-email-marcandre.lureau@redhat.com>
Acked-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-06-29 16:49:41 +02:00
Marc-André Lureau
c1111a24a3 char: clean up remaining chardevs when leaving
This helps to remove various chardev resources leaks when leaving qemu.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1466105332-10285-2-git-send-email-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 16:49:41 +02:00
Paolo Bonzini
25f0d2aa5e vhost-user: disable chardev handlers on close
This otherwise causes a use-after-free if network backend cleanup
is performed before character device cleanup.

Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 16:49:40 +02:00
Paolo Bonzini
634d39b4e3 vhost-user-test: fix g_cond_wait_until compat implementation
This fixes compilation with glib versions up to 2.30, such
as the one in CentOS 6.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 16:49:40 +02:00
Peter Maydell
845d1e7e42 Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Tue 28 Jun 2016 22:27:20 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request:
  trace: [*-user] Add events to trace guest syscalls in syscall emulation mode
  trace: enable tracing in qemu-img
  qemu-img: move common options parsing before commands processing
  trace: enable tracing in qemu-nbd
  trace: enable tracing in qemu-io
  trace: move qemu_trace_opts to trace/control.c
  doc: move text describing --trace to specific .texi file
  doc: sync help description for --trace with man for qemu.1

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-29 14:07:57 +01:00
Andrew Jones
66f37d360b vl: smp_parse: fix regression
Commit 0544edd88a "vl: smp_parse: cleanups" regressed any -smp
config that left either cores or threads unspecified, and specified
a topology supporting more cpus than the given online cpus. The
correct way to calculate the missing parameter would be to use
maxcpus, but it's too late to change that now. Restore the old
way, which is to calculate it with the online cpus (as is still
done), but then, if the result is zero, just set it to one.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Message-Id: <1466526844-29245-1-git-send-email-drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:48 +02:00
Paolo Bonzini
8f242cb724 ich9: implement SCI_IRQ_SEL register
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:48 +02:00
Paolo Bonzini
6d356c8c9e ich9: implement ACPI_EN register
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:48 +02:00
Paolo Bonzini
9f34a35e00 serial: reinstate watch after migration
Otherwise, a serial port can get stuck if it is migrated while flow control
is in effect.

Tested-by: Bret Ketchum <bcketchum@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:47 +02:00
Paolo Bonzini
a1df76da57 serial: remove watch on reset
Otherwise, this can cause serial_xmit to be entered with LSR.TEMT=0,
which is invalid and causes an assertion failure.

Reported-by: Bret Ketchum <bcketchum@gmail.com>
Tested-by: Bret Ketchum <bcketchum@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:47 +02:00
Paolo Bonzini
6f1de6b70d char: change qemu_chr_fe_add_watch to return unsigned
g_source_attach can return any value between 1 and UINT_MAX if you let
QEMU run long enough.  However, qemu_chr_fe_add_watch can also return
a negative errno value when the device is disconnected or does not
support chr_add_watch.  Change it to return zero to avoid overloading
these values.

Fix the cadence_uart which asserts in this case (easily obtained with
"-serial pty").

Tested-by: Bret Ketchum <bcketchum@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:47 +02:00
Paolo Bonzini
b0585e7e07 serial: separate serial_xmit and serial_watch_cb
serial_xmit starts transmission of whatever is in the transmitter
register, THR or FIFO; serial_watch_cb is a wrapper around it and is
only used as a qemu_chr_fe_add_watch callback.

Tested-by: Bret Ketchum <bcketchum@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:47 +02:00
Paolo Bonzini
bce933b85a serial: simplify tsr_retry reset
Move common code outside the if, and reset tsr_retry even in loopback mode.
Right now it cannot become non-zero, but it will be possible as soon as
we start respecting the baud rate.

Tested-by: Bret Ketchum <bcketchum@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:47 +02:00
Paolo Bonzini
807464d8a7 serial: make tsr_retry unsigned
It can never become negative; reflect this in the type of the field
and simplify the conditions.

Tested-by: Bret Ketchum <bcketchum@gmail.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:47 +02:00
Peter Lieven
0ead93120e iscsi: fix assertion in is_sector_request_lun_aligned
Commit 94d047a added an assertion the the request alignment check.
This introduced 2 issues:
 a) A off-by-one error since a request of BDRV_REQUEST_MAX_SECTORS
    is actually allowed.
 b) The bdrv_get_block_status call in the read path to check the allocation
    status requests up to INT_MAX sectors which triggers the assertion.

Fixes: 94d047a35b
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <1466414680-18383-1-git-send-email-pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:47 +02:00
Peter Crosthwaite
8642c1b81e target-*: Don't redefine cpu_exec()
This function needs to be converted to QOM hook and virtualised for
multi-arch. This rename interferes, as cpu-qom will not have access
to the renaming causing name divergence. This rename doesn't really do
anything anyway so just delete it.

Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-Id: <69bd25a8678b8b31b91cd9760c777bed1aafb44e.1437212383.git.crosthwaite.peter@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Crosthwaite <crosthwaitepeter@gmail.com>
2016-06-29 14:03:47 +02:00
be968c721e pci-assign: Move "Invalid ROM" error message to pci-assign-load-rom.c
In function pci_assign_dev_load_option_rom, For those pci devices don't
have 'rom' file under sysfs or if loading ROM from external file, The
function returns NULL, and won't set the passed 'size' variable.

In these 2 cases, qemu still reports "Invalid ROM" error message, Users
may be confused by it.

Signed-off-by: Lin Ma <lma@suse.com>
Message-Id: <1466010327-22368-1-git-send-email-lma@suse.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:47 +02:00
Paolo Bonzini
33df7bf3bf vnc: generalize "VNC server running on ..." message
The message is useful whenever the user specifies "-vnc to=XX".
Move it to ui/vnc.c.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:47 +02:00
Paolo Bonzini
cc96677469 scsi: esp: fix migration
Commit 926cde5 ("scsi: esp: make cmdbuf big enough for maximum CDB size",
2016-06-16) changed the size of a migrated field.  Split it in two
parts, and only migrate the second part in a new vmstate version.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:47 +02:00
Efimov Vasily
3638439d54 MC146818 RTC: add GPIO access to output IRQ
The MC146818 RTC device has output IRQ line. Currently the corresponding field
is only accessible through direct access. Such access violates Qemu model.

The patch makes the field accessible through GPIO. It also updates the setting
of the IRQ during initialization.

Signed-off-by: Efimov Vasily <real@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:46 +02:00
Efimov Vasily
250263033c isa: introduce wrapper isa_connect_gpio_out
Currently a direct access to the device structure field is used to connect ISA
device IRQ to the bus. GPIO access should be used instead if possible.

The patch adds wrapper isa_connect_gpio_out. The function connects specified
output GPIO to specified ISA IRQ.

Signed-off-by: Efimov Vasily <real@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:46 +02:00
Efimov Vasily
ea5d42508c ICH9 LPC: move call of isa_bus_irqs to 'realize' method
The isa_bus_irqs function initializes ISA bus IRQ array pointer with specified
value.

Previously the ICH9 LPC bridge model did not have its own IRQs but
only IRQ pointer cache. And same GSI were used for ISA bus and other sources
behind the bridge (PCI, SCI). Hence, the pc_q35_init was only possible place to
setup both ISA bus IRQs and the bridge IRQ cache.

As a result, the call of isa_bus_irqs was made from pc_q35_init.

Now the ICH9 LPC bridge has its own output IRQs which are connected to GSI. The
output IRQs are already used to route IRQs from PCI and SCI.

The patch makes the ICH9 LPC bridge output IRQs to used for ISA bus too.

Signed-off-by: Efimov Vasily <real@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:46 +02:00
Efimov Vasily
f999c0de05 ICH9 LPC: handle GSI as qdev GPIO
The ICH9 LPC bridge has 24 output IRQs connected to GSI. Currently the IRQs are
referenced by pointers. The pointers are initialized at startup by direct access
to the structure fields. This violates Qemu device model.

The patch makes the IRQs handling to use GPIO model.

Signed-off-by: Efimov Vasily <real@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:46 +02:00
Paolo Bonzini
35a6b23c82 ich9: unify pic and ioapic IRQ vectors
ich9->pic and ich9->ioapic differ for the first 16 GSIs (because
ich9->pic is wired to 8259+IOAPIC but ich9->ioapic is wired to
IOAPIC only).  However, ich9->ioapic is never used for the first
16 GSIs, so the two vectors can be merged.

Reviewed-by: Efimov Vasily <real@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:46 +02:00
Paolo Bonzini
a94dd6a9d6 ich9: clean up ich9_lpc_update_pic/ich9_lpc_update_apic and callers
Make ich9_lpc_update_pic take care only of GSIs 0-15, and
ich9_lpc_update_apic take care only of GSIs 16-23.  Assert
that they are called with the correct GSI indices.

Reviewed-by: Efimov Vasily <real@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:46 +02:00
Paolo Bonzini
f62efcacfa ich9: call ich9_lpc_update_pic for disabled pirqs
An asserted pirq can be disabled and the corresponding GSIs
should then go down to 0.  However, because of the conditional in
ich9_lpc_update_by_pirq, the legacy 8259 pin could remain stuck to 1.

Reviewed-by: Efimov Vasily <real@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:46 +02:00
Efimov Vasily
f2dd8ebdf4 ICH9 SMB: make TYPE_ICH9_SMB_DEVICE macro public
ICH9 SMB bridge can be created using qdev API despite existence of helper
function. The type name is needed for such creation. Using a preprocessor
alias instead the string type name itself is preferable.

The patch makes the alias accessible through the header.

Signed-off-by: Efimov Vasily <real@ispras.ru>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:46 +02:00
Efimov Vasily
d812b3d68d port92: handle A20 IRQ as GPIO
The port92 device has outgouing IRQ line A20. Currently the IRQ is referenced
by a pointer which normally is set during machine initialization. The
pointer is never changed at runtime. Hence, common GPIO model can be applied
to A20 IRQ line. Note that checking for IRQ to be connected as in
previous version of code is not required qemu_set_irq will do it.

Signed-off-by: Efimov Vasily <real@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:46 +02:00
Efimov Vasily
3115b9e2d2 pckbd: handle A20 IRQ as GPIO
The i8042 device has outgouing IRQ line A20. Currently the IRQ is referenced
by a pointer which normally is set during machine initialization. The pointer
is never changed at runtime. So common GPIO model can be applied to A20 IRQ
line. Note that checking for IRQ to be connected as in previous version
of code is not required because qemu_set_irq will do it.

Signed-off-by: Efimov Vasily <real@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:46 +02:00
Efimov Vasily
8d1c7158a0 pc_q35: configure Q35 instance using properties
Currently, Q35 instance is configured using direct access to structure fields.
The patch uses property interface to set the fields.

Signed-off-by: Efimov Vasily <real@ispras.ru>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:46 +02:00
Efimov Vasily
401f2f3ef1 Q35: implement property interfece to several parameters
During creation of Q35 instance several parameters are set using direct access.
It violates Qemu device model. Correctly, the parameters should be handled as
object properties.

The patch adds four link type properties for fields:
mch.ram_memory
mch.pci_address_space
mch.system_memory
mch.address_space_io
And, it adds two size type properties for fields:
mch.below_4g_mem_size
mch.above_4g_mem_size

Signed-off-by: Efimov Vasily <real@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:46 +02:00
Efimov Vasily
1a004c7fc8 pflash: make TYPE_CFI_PFLASH0{1,2} macros public
qdev API can be used to create CFI pflash devices despite existance of helper
functions. The type name is needed in course of such creation. Using the
preprocessor alias instead of the string literal itself is preferable.

The patch makes the aliases accessible through the header.

Signed-off-by: Efimov Vasily <real@ispras.ru>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:46 +02:00
Efimov Vasily
936a6447c8 vmport: identify vmport type by macro TYPE_VMPORT
Currently vmport device is identified by the string literal. Using a
preprocessor alias instead is preferable.

Signed-off-by: Efimov Vasily <real@ispras.ru>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:45 +02:00
Efimov Vasily
873b4d3f05 pcspk: convert "pit" property type from ptr to link
The speaker device needs pointer to ISA PIT device to operate. But according to
qdev-properties.h, properties of pointer type should be avoided. It seems a
link type property is a good substitution.

Signed-off-by: Efimov Vasily <real@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:45 +02:00
Efimov Vasily
e8ad4d1680 ide: move headers to include folder
The patch moves "hw/ide/achi.h", "hw/ide/pci.h" and "hw/ide/internal.h" headers
to corresponding folders inside "include" folder alike other Qemu headers.

Signed-off-by: Efimov Vasily <real@ispras.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-29 14:03:45 +02:00
Peter Maydell
3e904d6ade Merge remote-tracking branch 'remotes/riku/tags/pull-linux-user-20160628' into staging
Drop building linux-user targets on HPPA or m68k host systems
and add safe_syscall support for i386, aarch64, arm, ppc64 and
s390x.

# gpg: Signature made Tue 28 Jun 2016 19:31:16 BST
# gpg:                using RSA key 0xB44890DEDE3C9BC0
# gpg: Good signature from "Riku Voipio <riku.voipio@iki.fi>"
# gpg:                 aka "Riku Voipio <riku.voipio@linaro.org>"
# Primary key fingerprint: FF82 03C8 C391 98AE 0581  41EF B448 90DE DE3C 9BC0

* remotes/riku/tags/pull-linux-user-20160628: (24 commits)
  linux-user: Provide safe_syscall for ppc64
  linux-user: Provide safe_syscall for s390x
  linux-user: Provide safe_syscall for aarch64
  linux-user: Provide safe_syscall for arm
  linux-user: Provide safe_syscall for i386
  linux-user: fix x86_64 safe_syscall
  linux-user: don't swap NLMSG_DATA() fields
  linux-user: fd_trans_host_to_target_data() must process only received data
  linux-user: add missing return in netlink switch statement
  linux-user: update get_thread_area/set_thread_area strace
  linux-user: fix clone() strace
  linux-user: add socket() strace
  linux-user: add socketcall() strace
  linux-user: Support F_GETPIPE_SZ and F_SETPIPE_SZ fcntls
  linux-user: Fix wrong type used for argument to rt_sigqueueinfo
  linux-user: Create a hostdep.h for each host architecture
  user-exec: Remove unused code for OSX hosts
  user-exec: Delete now-unused hppa and m68k cpu_signal_handler() code
  configure: Don't allow user-only targets for unknown CPU architectures
  configure: Don't override ARCH=unknown if enabling TCI
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-29 10:43:08 +01:00
Changlong Xie
15d6729850 mirror: fix misleading comments
s/target bs/to_replace/, also we check to_replace bs is not
blocked in qmp_drive_mirror() not here

Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1466672241-22485-3-git-send-email-xiecl.fnst@cn.fujitsu.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-06-28 23:08:25 -04:00
Changlong Xie
b48100cf07 blockjob: assert(cb) when create job
Callback for block job should always exist

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1466672241-22485-2-git-send-email-xiecl.fnst@cn.fujitsu.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-06-28 23:08:13 -04:00
John Snow
ccee3d8f97 iotests: add small-granularity mirror test
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466625064-11280-4-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-06-28 22:53:03 -04:00
John Snow
e4808881cb mirror: limit niov to IOV_MAX elements, again
During the refactor of mirror_iteration in e5b43573,
we regressed the fix introduced in cae98cb8.

This patch re-adds IOV_MAX checking to cases where we
aren't checking alignment (and size) already.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466625064-11280-3-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-06-28 22:53:03 -04:00
John Snow
176129552f mirror: clarify mirror_do_read return code
mirror_do_read intends to return the number of sectors processed after
the starting sector, without regard to how many sectors were processed
before the starting sector due to alignment.

Clean up the comments and code to hopefully illustrate this more clearly.

This also fixes an issue in initialization where if the mirror buffer size
is initialized to smaller than the number of sectors being requested for
transfer, we report back an incorrectly large number to the caller.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466625064-11280-2-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-06-28 22:53:03 -04:00
Jeff Cody
7eac868a50 block/gluster: add support for selecting debug logging level
This adds commandline support for the logging level of the
gluster protocol driver, output to stdout.  The option is 'debug',
e.g.:

-drive filename=gluster://192.168.15.180/gv2/test.qcow2,debug=9

Debug levels are 0-9, with 9 being the most verbose, and 0 representing
no debugging output.  The default is the same as it was before, which
is a level of 4.  The current logging levels defined in the gluster
source are:

    0 - None
    1 - Emergency
    2 - Alert
    3 - Critical
    4 - Error
    5 - Warning
    6 - Notice
    7 - Info
    8 - Debug
    9 - Trace

(From: glusterfs/logging.h)

Reviewed-by: Niels de Vos <ndevos@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-06-28 22:52:45 -04:00
Denis V. Lunev
ff04198bf5 mirror: fix trace_mirror_yield_in_flight usage in mirror_iteration()
trace_mirror_yield_in_flight accepts 2nd arguments in sectors while here
we pass chunks instead.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1466518157-27140-1-git-send-email-den@openvz.org
CC: Jeff Cody <jcody@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-06-28 22:52:45 -04:00
Peter Lieven
d99b26c42a block/nfs: add support for libnfs pagecache
upcoming libnfs will have support for a read cache that can
significantly help to speed up requests since libnfs by design
circumvents the kernel cache.

Example:
 qemu -cdrom nfs://127.0.0.1/iso/my.iso?pagecache=1024

The pagecache parameters takes the maximum amount of pages to
cache.  A page in libnfs is always the NFS_BLKSIZE which is
4KB.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1463662083-20814-3-git-send-email-pl@kamp.de
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-06-28 22:52:45 -04:00
Peter Lieven
38f8d5e025 block/nfs: refuse readahead if cache.direct is on
if we open a NFS export with disabled cache we should refuse
the readahead feature as it will cache data inside libnfs.

If a export was opened with readahead enabled it should
futher not be allowed to disable the cache while running.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1463662083-20814-2-git-send-email-pl@kamp.de
Signed-off-by: Jeff Cody <jcody@redhat.com>
2016-06-28 22:52:45 -04:00
Niels de Vos
947eb2030e block/gluster: add support for SEEK_DATA/SEEK_HOLE
GlusterFS 3.8 contains support for SEEK_DATA and SEEK_HOLE. This makes
it possible to detect sparse areas in files.

Signed-off-by: Niels de Vos <ndevos@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
2016-06-28 22:52:45 -04:00
Lluís Vilanova
9c15e70086 trace: [*-user] Add events to trace guest syscalls in syscall emulation mode
Adds two events to trace syscalls in syscall emulation mode (*-user):

* guest_user_syscall: Emitted before the syscall is emulated; contains
  the syscall number and arguments.

* guest_user_syscall_ret: Emitted after the syscall is emulated;
  contains the syscall number and return value.

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-id: 146651712411.12388.10024905980452504938.stgit@fimbulvetr.bsc.es
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-28 21:14:12 +01:00
Denis V. Lunev
06a1e0c197 trace: enable tracing in qemu-img
The command will work this way:
    qemu-img --trace "qcow2*" create -f qcow2 1.img 64G

[Quote "qcow2*" to protect against shell globbing as suggested by Eric
Blake <eblake@redhat.com>.
--Stefan]

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1466174654-30130-8-git-send-email-den@openvz.org
Suggested by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-28 21:14:12 +01:00
Denis V. Lunev
10985131e3 qemu-img: move common options parsing before commands processing
This is necessary to enable creation of common qemu-img options which will
be specified before command.

The patch also enables '-V' alias to '--version' (exactly like in other
block utilities) and documents this change.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1466174654-30130-7-git-send-email-den@openvz.org
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-28 21:14:12 +01:00
Denis V. Lunev
39ca463e81 trace: enable tracing in qemu-nbd
Please note, trace_init_backends() must be called in the final process,
i.e. after daemonization. This is necessary to keep tracing thread in
the proper process.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1466174654-30130-6-git-send-email-den@openvz.org
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-28 21:14:12 +01:00
Denis V. Lunev
e9a80859d6 trace: enable tracing in qemu-io
Moving trace_init_backends() into trace_opt_parse() is not possible. This
should be called after daemonize() in vl.c.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1466174654-30130-5-git-send-email-den@openvz.org
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-28 21:14:12 +01:00
Denis V. Lunev
e9e0bb2af2 trace: move qemu_trace_opts to trace/control.c
The patch also creates trace_opt_parse() helper in trace/control.c to reuse
this code in next patches for qemu-nbd and qemu-io.

The patch also makes trace_init_events() static, as this call is not used
outside the module anymore.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1466174654-30130-4-git-send-email-den@openvz.org
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-28 21:14:12 +01:00
Denis V. Lunev
eeb2b8f78d doc: move text describing --trace to specific .texi file
This text will be included to qemu-nbd/qemu-img mans in the next patches.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1466174654-30130-3-git-send-email-den@openvz.org
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-28 21:14:12 +01:00
Denis V. Lunev
e370ad9999 doc: sync help description for --trace with man for qemu.1
[s/descriprion/description/ in commit message as suggested by Eric Blake
<eblake@redhat.com>.
--Stefan]

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1466174654-30130-2-git-send-email-den@openvz.org
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-28 21:14:12 +01:00
Peter Maydell
d7f3040357 cputlb: don't cpu_abort() if guest tries to execute outside RAM or RAM
In get_page_addr_code(), if the guest program counter turns out not to
be in ROM or RAM, we can't handle executing from it, and we call
cpu_abort(). This results in the message
  qemu: fatal: Trying to execute code outside RAM or ROM at 0x08000000
followed by a guest register dump, and then QEMU dumps core.

This situation happens in one of two cases:
 (1) a guest kernel bug, where it jumped off into nowhere
 (2) a user command line mistake, where they tried to run an image for
     board A on a QEMU model of board B, or where they didn't provide
     an image at all, and QEMU executed through a ROM or RAM full of
     NOP instructions and then fell off the end

In either case, a core dump of QEMU itself is entirely useless, and
only confuses users into thinking that this is a bug in QEMU rather
than a bug in the guest or a problem with their command line. (This
is a variation on the general idea that we shouldn't assert() on
something the user can accidentally provoke.)

Replace the cpu_abort() with something that explains the situation
a bit better and exits QEMU without dumping core.

(See LP:1062220 for several examples of confused users.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson  <rth@twiddle.net>
Message-id: 1466442425-11885-1-git-send-email-peter.maydell@linaro.org
2016-06-28 18:50:53 +01:00
Peter Maydell
7dd929dfdc configure: Make AVX2 test robust to non-ELF systems
The AVX2 optimization test assumes that the object format
is ELF and the system has the readelf utility. If this isn't
true then configure might fail or emit a warning (since in
a pipe "foo | bar >/dev/null 2>&1" does not redirect the
stderr of foo, only of bar). Adjust the check so that if
we don't have readelf or don't have an ELF object then we
just don't enable the AVX2 optimization.

Reported-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-id: 1466287502-18730-3-git-send-email-pmaydell@chiark.greenend.org.uk
2016-06-28 15:40:40 +01:00
Peter Maydell
92fe2ba8b0 configure: Improve usermode relocation linker option probe
The probe we do to determine what flags to use to make the usermode
executables use a non-default text address has some flaws:
 * we run it even if we're not building the user binaries
 * we don't expect "ld --verbose" to fail

The combination of these two results in a harmless but
ugly "ld: unknown option: --verbose" message when running
configure on OSX.

Improve the probe to only run when we need it and to fail
nicely when even the backstop 'ld --verbose' approach fails.

Reported-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Message-id: 1466287502-18730-2-git-send-email-pmaydell@chiark.greenend.org.uk
2016-06-28 15:40:40 +01:00
Peter Maydell
b7a511248d hw/sh4/sh_pci.c: Use ldl_le_p() and stl_le_p()
Use ldl_le_p() and stl_le_p() instead of le32_to_cpup() and
cpu_to_le32w(); the former handle misaligned addresses and don't
need casts, and the latter are deprecated.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Aurelien Jarno <aurelien@aurel32.net>
Message-id: 1465575021-3774-1-git-send-email-peter.maydell@linaro.org
2016-06-28 15:09:32 +01:00
Peter Maydell
12c8720d8f Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Tue 28 Jun 2016 14:23:24 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  virtio-blk: add num-queues device property
  virtio-blk: dataplane multiqueue support
  virtio-blk: live migrate s->rq with multiqueue
  virtio-blk: associate request with a virtqueue
  virtio-blk: tell dataplane which vq to notify
  virtio-blk: multiqueue batch notify
  virtio-blk: add VirtIOBlockConf->num_queues
  dma-helpers: dma_blk_io() cancel support
  Revert "virtio: sync the dataplane vring state to the virtqueue before virtio_save"

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-28 14:27:21 +01:00
Stefan Hajnoczi
2f2705908f virtio-blk: add num-queues device property
Multiqueue virtio-blk can be enabled as follows:

  qemu -device virtio-blk-pci,num-queues=8

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466511196-12612-8-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-28 13:08:32 +01:00
Stefan Hajnoczi
51b04ac5c6 virtio-blk: dataplane multiqueue support
Monitor ioeventfds for all virtqueues in the device's AioContext.  This
is not true multiqueue because requests from all virtqueues are
processed in a single IOThread.  In the future it will be possible to
use multiple IOThreads when the QEMU block layer supports multiqueue.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466511196-12612-7-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-28 13:08:32 +01:00
Stefan Hajnoczi
30d8bf6d17 virtio-blk: live migrate s->rq with multiqueue
Add a field for the virtqueue index when migrating the s->rq request
list.  The new field is only needed when num_queues > 1.  Existing QEMUs
are unaffected by this change and therefore virtio-blk migration stays
compatible.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466511196-12612-6-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-28 13:08:32 +01:00
Stefan Hajnoczi
edaffd9f0b virtio-blk: associate request with a virtqueue
Multiqueue requires that each request knows to which virtqueue it
belongs.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466511196-12612-5-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-28 13:08:32 +01:00
Stefan Hajnoczi
b234cdda95 virtio-blk: tell dataplane which vq to notify
Let the virtio_blk_data_plane_notify() caller decide which virtqueue to
notify.  This will allow the function to be used with multiqueue.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466511196-12612-4-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-28 13:08:32 +01:00
Stefan Hajnoczi
e21737ab15 virtio-blk: multiqueue batch notify
The batch notification BH needs to know which virtqueues to notify when
multiqueue is enabled.  Use a bitmap to track the virtqueues with
pending notifications.

At this point there is only one virtqueue so hard-code virtqueue index
0.  A later patch will switch to real virtqueue indices.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466511196-12612-3-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-28 13:08:31 +01:00
Stefan Hajnoczi
84419863f7 virtio-blk: add VirtIOBlockConf->num_queues
The num_queues field is always 1 for the time being.  A later patch will
make it a configurable device property so that multiqueue can be
enabled.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466511196-12612-2-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-28 13:08:31 +01:00
Stefan Hajnoczi
5fa78b2a1c dma-helpers: dma_blk_io() cancel support
Attempting to cancel a dma_blk_io() request causes an abort(3):

  void bdrv_aio_cancel(BlockAIOCB *acb)
  {
      ...
      while (acb->refcnt > 1) {
          if (acb->aiocb_info->get_aio_context) {
              aio_poll(acb->aiocb_info->get_aio_context(acb), true);
          } else if (acb->bs) {
              aio_poll(bdrv_get_aio_context(acb->bs), true);
          } else {
              abort();
          }
      }
      ...
  }

This happens because DMAAIOCB->bs is NULL and
dma_aiocb_info.get_aio_context() is also NULL.

This patch trivially implements dma_aiocb_info.get_aio_context() by
fetching the DMAAIOCB->ctx field.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466451417-27988-1-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-28 13:08:31 +01:00
Stefan Hajnoczi
17c42b1f6b Revert "virtio: sync the dataplane vring state to the virtqueue before virtio_save"
This reverts commit 10a06fd65f.

Dataplane has used the same virtqueue code as non-dataplane since
commits e24a47c5b7 ("virtio-scsi: do not
use vring in dataplane") and 03de2f5274
("virtio-blk: do not use vring in dataplane").  It is no longer
necessary to stop dataplane in order to sync state since there is no
duplicated virtqueue state.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Message-id: 1466503331-9831-1-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-28 13:08:31 +01:00
Peter Maydell
40428feaeb Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Tue 28 Jun 2016 04:29:53 BST
# gpg:                using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request:
  vmxnet3: Fix reading/writing guest memory specially when behind an IOMMU
  rtl8139: save/load RxMulOk counter (again)
  Change net/socket.c to use socket_*() functions
  net: mipsnet: check transmit buffer size before sending
  net: fix qemu_announce_self not emitting packets

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-28 10:32:13 +01:00
Pranith Kumar
aa4b04a09c misc/aspeed_scu: Fix build error caused by missing header
Tracing configurations error out currently as follows:

/home/travis/build/pranith/qemu/hw/misc/aspeed_scu.c: In function ‘aspeed_scu_read’:
/home/travis/build/pranith/qemu/hw/misc/aspeed_scu.c:130:9: error: implicit declaration of function ‘qemu_log_mask’ [-Werror=implicit-function-declaration]
/home/travis/build/pranith/qemu/hw/misc/aspeed_scu.c:130:9: error: nested extern declaration of ‘qemu_log_mask’ [-Werror=nested-externs]
/home/travis/build/pranith/qemu/hw/misc/aspeed_scu.c:130:23: error: ‘LOG_GUEST_ERROR’ undeclared (first use in this function)
/home/travis/build/pranith/qemu/hw/misc/aspeed_scu.c:130:23: note: each undeclared identifier is reported only once for each function it appears in
/home/travis/build/pranith/qemu/hw/misc/aspeed_scu.c: In function ‘aspeed_scu_write’:
/home/travis/build/pranith/qemu/hw/misc/aspeed_scu.c:154:23: error: ‘LOG_GUEST_ERROR’ undeclared (first use in this function)

This is caused by a missing header file. Fix it.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 20160627215304.821-1-bobby.prani@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-28 09:41:30 +01:00
Peter Maydell
dc154b1db4 Merge remote-tracking branch 'remotes/jnsnow/tags/ide-pull-request' into staging
# gpg: Signature made Mon 27 Jun 2016 20:23:19 BST
# gpg:                using RSA key 0x7DEF8106AAFC390E
# gpg: Good signature from "John Snow (John Huston) <jsnow@redhat.com>"
# Primary key fingerprint: FAEB 9711 A12C F475 812F  18F2 88A9 064D 1835 61EB
#      Subkey fingerprint: F9B7 ABDB BCAC DF95 BE76  CBD0 7DEF 8106 AAFC 390E

* remotes/jnsnow/tags/ide-pull-request:
  macio: Use blk_drain instead of blk_drain_all

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-28 09:04:05 +01:00
KarimAllah Ahmed
c508277335 vmxnet3: Fix reading/writing guest memory specially when behind an IOMMU
When a PCI device lives behind an IOMMU, it should use 'pci_dma_*' family of
functions when any transfer from/to guest memory is required while
'cpu_physical_memory_*' family of functions completely bypass any MMU/IOMMU in
the system.

vmxnet3 in some places was using 'cpu_physical_memory_*' family of functions
which works fine with the default QEMU setup where IOMMU is not enabled but
fails miserably when IOMMU is enabled. This commit converts all such instances
in favor of 'pci_dma_*'

Cc: Dmitry Fleytman <dmitry@daynix.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: qemu-devel@nongnu.org
Cc: Anthony Liguori <aliguori@amazon.com>
Signed-off-by: KarimAllah Ahmed <karahmed@amazon.de>
Acked-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-28 10:13:57 +08:00
David Vrabel
46fe8bef4d rtl8139: save/load RxMulOk counter (again)
Commit 9d29cdeaac (rtl8139: port
TallyCounters to vmstate) introduced in incompatibility in the v4
format as it omitted the RxOkMul counter.

There are presumably no users that were impacted by the v4 to v4'
breakage, so increase the save version to 5 and re-add the field,
keeping backward compatibility with v4'.

We can't have a field conditional on the section version in
vmstate_tally_counters since this version checked would not be the
section version (but the version defined in this structure).  So, move
all the fields into the main state structure.

Signed-off-by: David Vrabel <david.vrabel@citrix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-28 10:13:57 +08:00
Ashijeet Acharya
7e8449594c Change net/socket.c to use socket_*() functions
Use socket_*() functions from include/qemu/sockets.h instead of
listen()/bind()/connect()/parse_host_port(). socket_*() fucntions are
QAPI based and this patch  performs this api conversion since
everything will be using QAPI based sockets in the future. Also add a
helper function socket_address_to_string() in util/qemu-sockets.c
which returns the string representation of socket address. Thetask was
listed on http://wiki.qemu.org/BiteSizedTasks page.

Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-28 10:13:57 +08:00
Prasad J Pandit
d88d3a0938 net: mipsnet: check transmit buffer size before sending
When processing MIPSnet I/O port write operation, it uses a
transmit buffer tx_buffer[MAX_ETH_FRAME_SIZE=1514]. Two indices
's->tx_written' and 's->tx_count' are used to control data written
to this buffer. If the two were to be equal before writing, it'd
lead to an OOB write access beyond tx_buffer. Add check to avoid it.

Reported-by: Li Qiang <qiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-28 10:13:57 +08:00
Peter Lieven
ca1ee3d6b5 net: fix qemu_announce_self not emitting packets
commit fefe2a78 accidently dropped the code path for injecting
raw packets. This feature is needed for sending gratuitous ARPs
after an incoming migration has completed. The result is increased
network downtime for vservers where the network card is not virtio-net
with the VIRTIO_NET_F_GUEST_ANNOUNCE feature.

Fixes: fefe2a78ab
Cc: qemu-stable@nongnu.org
Cc: hongyang.yang@easystack.cn
Signed-off-by: Peter Lieven <pl@kamp.de>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-28 10:13:57 +08:00
Richard Henderson
fdc997ef54 target-alpha: Avoid gcc 6.1 werror for linux-user
Using gcc 6.1 for alpha-linux-user target we see the following build error:

.../target-alpha/translate.c: In function ‘in_superpage’:
.../target-alpha/translate.c:454:52: error: self-comparison always evaluates to true [-Werror=tautological-compare]
             && addr >> TARGET_VIRT_ADDR_SPACE_BITS == addr >> 63);

Reported-by: Pranith Kumar <bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-id: 1466192793-2559-1-git-send-email-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 21:03:58 +01:00
Fam Zheng
0d0437aac6 macio: Use blk_drain instead of blk_drain_all
We only care about the associated backend, so blk_drain is more
appropriate here.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20160612065603.21911-1-famz@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
2016-06-27 14:28:31 -04:00
Peter Maydell
14e60aaece hw/net/e1000: Don't use *_to_cpup()
Don't use *_to_cpup() to do byte-swapped loads; instead use
ld*_p() which correctly handle misaligned accesses.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Dmitry Fleytman <dmitry@daynix.com <mailto:dmitry@daynix.com>>
Message-id: 1466097446-981-6-git-send-email-peter.maydell@linaro.org
2016-06-27 16:39:56 +01:00
Peter Maydell
7542d3e706 hw/net/virtio-net.c: Don't use *_to_cpup()
Don't use *_to_cpup() to do byte-swapped loads; instead use
ld*_p() which correctly handle misaligned accesses.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Dmitry Fleytman <dmitry@daynix.com <mailto:dmitry@daynix.com>>
Message-id: 1466097446-981-5-git-send-email-peter.maydell@linaro.org
2016-06-27 16:39:56 +01:00
Peter Maydell
4071887b58 hw/net/rocker: Don't use *_to_cpup()
Don't use *_to_cpup() to do byte-swapped loads; instead use
ld*_p() which correctly handle misaligned accesses.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Dmitry Fleytman <dmitry@daynix.com <mailto:dmitry@daynix.com>>
Message-id: 1466097446-981-4-git-send-email-peter.maydell@linaro.org
2016-06-27 16:39:56 +01:00
Peter Maydell
6960bfca3a hw/net/rtl8139.c: Don't use *_to_cpup()
Don't use *_to_cpup() to do byte-swapped loads; instead use
ld*_p() which correctly handle misaligned accesses.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Dmitry Fleytman <dmitry@daynix.com <mailto:dmitry@daynix.com>>
Message-id: 1466097446-981-3-git-send-email-peter.maydell@linaro.org
2016-06-27 16:39:56 +01:00
Peter Maydell
4d9be25200 hw/net/eepro100.c: Don't use cpu_to_*w() and *_to_cpup()
Don't use cpu_to_*w() and *_to_cpup() to do byte-swapped loads
and stores; instead use ld*_p() and st*_p() which correctly handle
misaligned accesses.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-by: Jason Wang <jasowang@redhat.com>
Acked-by: Dmitry Fleytman <dmitry@daynix.com <mailto:dmitry@daynix.com>>
Message-id: 1466097446-981-2-git-send-email-peter.maydell@linaro.org
2016-06-27 16:39:56 +01:00
Peter Maydell
f12103afaa Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20160627' into staging
target-arm queue:
 * arm_gicv3: add missing 'break' statements
 * cadence_uart: protect against transmit errors
 * cadence_gem: avoid infinite loops with misconfigured buffer
 * cadence_gem: set the 'last' bit when 'wrap' is set
 * reenable tmp105 test case
 * palmetto-bmc: add ASPEED system control unit model
 * m25p80: add new 512Mbit and 1Gbit devices

# gpg: Signature made Mon 27 Jun 2016 15:43:42 BST
# gpg:                using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20160627:
  m25p80: Fix WINBOND fast read command handling
  m25p80: New flash devices.
  m25p80: Fast read commands family changes.
  m25p80: Introduce configuration registers.
  m25p80: Introduce quad and equad modes.
  m25p80: Add additional flash commands:
  m25p80: Introduce COLLECTING_VAR_LEN_DATA state.
  m25p80: Allow more than four banks.
  m25p80: Make a table for JEDEC ID.
  m25p80: Replace JEDEC ID masking with function.
  palmetto-bmc: Configure the SCU's hardware strapping register
  ast2400: Integrate the SCU model and set silicon revision
  hw/misc: Add a model for the ASPEED System Control Unit
  arm: Re-enable tmp105 test
  cadence_gem: Set the last bit when wrap is set
  cadence_gem: Avoid infinite loops with a misconfigured buffer
  cadence_uart: Protect against transmit errors
  hw/intc/arm_gicv3: Add missing break

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 15:46:33 +01:00
Marcin Krzeminski
3830c7a460 m25p80: Fix WINBOND fast read command handling
This commit fix obvious bug in WINBOND command handling.
Datasheet states that default dummy cycles is 8 so fix it.

Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Message-id: 1466755631-25201-11-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 15:37:34 +01:00
Marcin Krzeminski
dadb2f9078 m25p80: New flash devices.
Macronix: mx66u51235f and mx66u1g45g
Micron: mt25ql01g and mt25qu01g
Spansion: s25fs512s and s70fs01gs

Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-10-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 15:37:34 +01:00
Marcin Krzeminski
cf6f1efe0b m25p80: Fast read commands family changes.
Support for Spansion and Macronix flashes.
Additionally Numonyx(Micron) moved from default
in fast read commands family. Also moved fast read
command decoding to functions.

Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-9-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 15:37:34 +01:00
Marcin Krzeminski
d9cc8701f1 m25p80: Introduce configuration registers.
Configuration registers for Spansion and Macronix devices.

Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-8-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 15:37:34 +01:00
Marcin Krzeminski
7a69c10002 m25p80: Introduce quad and equad modes.
Quad and Equad modes for Spansion and Macronix flash devices.
This commit also includes modification and new command to manipulate
quad mode (status registers and dedicated commands).
This work is based on Pawel Lenkow work.

Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-7-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 15:37:34 +01:00
Marcin Krzeminski
30467afe7b m25p80: Add additional flash commands:
Page program 4byte/quad and erase 32K sectors 4 bytes.

Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-6-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 15:37:33 +01:00
Marcin Krzeminski
9964674e50 m25p80: Introduce COLLECTING_VAR_LEN_DATA state.
Some flash allows to stop read at any time.
Allow framework to support this.

Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-5-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 15:37:33 +01:00
Marcin Krzeminski
e02b3bf263 m25p80: Allow more than four banks.
Allow to have more than four 16MiB regions for bigger flash devices.

Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-4-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 15:37:33 +01:00
Marcin Krzeminski
e3ba6cd67f m25p80: Make a table for JEDEC ID.
Since it is now longer than 4. This work based on Pawel Lenkow
changes and the kernel SPI framework.

Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466755631-25201-3-git-send-email-marcin.krzeminski@nokia.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 15:37:33 +01:00
Marcin Krzeminski
c7cd0a6c24 m25p80: Replace JEDEC ID masking with function.
Instead of always reading and comparing jededc ID,
replace it by function.

Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Message-id: 1466755631-25201-2-git-send-email-marcin.krzeminski@nokia.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 15:37:33 +01:00
Andrew Jeffery
87e79af074 palmetto-bmc: Configure the SCU's hardware strapping register
The magic constant configures the following options:

* 28:27: Configure DRAM size as 256MB
* 26:24: DDR3 SDRAM with CL = 6, CWL = 5
* 23: Configure 24/48MHz CLKIN
* 22: Disable GPIOE pass-through mode
* 21: Disable GPIOD pass-through mode
* 20: Enable LPC decode of SuperIO 0x2E/0x4E addresses
* 19: Disable ACPI
* 18: Configure 48MHz CLKIN
* 17: Disable BMC 2nd boot watchdog timer
* 16: Decode SuperIO address 0x2E
* 15: VGA Class Code
* 14: Enable LPC dedicated reset pin
* 13:12: Enable SPI Master and SPI Slave to AHB Bridge
* 11:10: Select CPU:AHB ratio = 2:1
* 9:8: Select 384MHz H-PLL
* 7: Configure MAC#2 for RMII/NCSI
* 6: Configure MAC#1 for RMII/NCSI
* 5: No VGA BIOS ROM
* 4: Boot using 32bit SPI address mode
* 3:2: Select 16MB VGA memory
* 1:0: Boot from SPI flash memory

Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1466744305-23163-4-git-send-email-andrew@aj.id.au
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 15:37:33 +01:00
Andrew Jeffery
334973bbae ast2400: Integrate the SCU model and set silicon revision
By specifying the silicon revision we select the appropriate reset
values for the SoC.

Additionally, expose hardware strapping properties aliasing those
provided by the SCU for board-specific configuration.

Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1466744305-23163-3-git-send-email-andrew@aj.id.au
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 15:37:33 +01:00
Andrew Jeffery
1c8a2388aa hw/misc: Add a model for the ASPEED System Control Unit
The SCU is a collection of chip-level control registers that manage the
various functions supported by ASPEED SoCs. Typically the bits control
interactions with clocks, external hardware or reset behaviour, and we
can largly take a hands-off approach to reads and writes.

Firmware makes heavy use of the state to determine how to boot, but the
reset values vary from SoC to SoC (eg AST2400 vs AST2500). A qdev
property is exposed so that the integrating SoC model can configure the
silicon revision, which in-turn selects the appropriate reset values.
Further qdev properties are exposed so the board model can configure the
board-dependent hardware strapping.

Almost all provided AST2400 reset values are specified by the datasheet.
The notable exception is SOC_SCRATCH1, where we mark the DRAM as
successfully initialised to avoid unnecessary dark corners in the SoC's
u-boot support.

Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1466744305-23163-2-git-send-email-andrew@aj.id.au
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: drop unnecessary inttypes.h include]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 15:37:33 +01:00
Thomas Huth
1f5c1cfbae arm: Re-enable tmp105 test
The tmp105 test is currently not executed since the following
line in the Makefile overwrites the check-qtest-arm-y variable
instead of extending it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1466760306-21849-1-git-send-email-thuth@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 15:37:32 +01:00
Alistair Francis
cbdab58d46 cadence_gem: Set the last bit when wrap is set
The Cadence GEM data sheet says:
"Wrap - marks last descriptor in transmit buffer descriptor list. This
can be set for any buffer within the frame."
which seems to imply that when the wrap bit is set so is the last bit.

Previously if the wrap bit is set, but the last is not then QEMU will
enter an infinite loop.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reported-by: Li Qiang <liqiang6-s@360.cn>
Reported-by: P J P <ppandit@redhat.com>
Message-id: eb23f15c67989ea6a53609dc66568399dadf52a7.1466539342.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 15:37:32 +01:00
Alistair Francis
f265ae8c79 cadence_gem: Avoid infinite loops with a misconfigured buffer
A guest can write zero to the DMACFG resulting in an infinite loop when
it reaches the while(bytes_to_copy) loop.

To avoid this issue enforce a minimum size for the RX buffer. Hardware
does not have this enforcement and relies on the guest to set a non-zero
value.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reported-by: Li Qiang <liqiang6-s@360.cn>
Reported-by: P J P <ppandit@redhat.com>
Message-id: 84bb1c391b833275da3f573d4972920cea34c188.1466539342.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 15:37:32 +01:00
Alistair Francis
f6cf41932e cadence_uart: Protect against transmit errors
If qemu_chr_fe_write() returns an error (represented by a negative
number) we should skip incrementing the count and initiating a
memmove().

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 667e5dc534d33338fcfc2471e5aa32fe7cbd13dc.1466546703.git.alistair.francis@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 15:37:32 +01:00
Shannon Zhao
92b30c2f7d hw/intc/arm_gicv3: Add missing break
These are spotted by coverity 1356936 and 1356937.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1466387717-13740-1-git-send-email-zhaoshenglong@huawei.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 15:37:32 +01:00
Peter Maydell
aa8151b7df Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160627' into staging
ppc patch queue for 2016-06-27

Small queue this time.  Main reason for sending it is the pair of
patches to fix up the new cpu hotplug model used on Power to what
should be an actually usable state.  There's also a small BookE bugfix
and a XICS trivial cleanup.

# gpg: Signature made Mon 27 Jun 2016 06:28:37 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.7-20160627:
  qapi: keep names in 'CpuInstanceProperties' in sync with struct CPUCore
  qapi: Report support for -device cpu hotplug in query-machines
  ppc/xics: Remove unused xics_set_irq_type()
  target-ppc: ppce500_spin.c uses SPR_PIR, should use SPR_BOOKE_PIR

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 12:54:54 +01:00
Peter Maydell
4b86bac21c Merge remote-tracking branch 'remotes/lalrae/tags/mips-20160624' into staging
MIPS patches 2016-06-24

Changes:
* support IEEE 754-2008 in MIPS CPUs

# gpg: Signature made Fri 24 Jun 2016 16:09:38 BST
# gpg:                using RSA key 0x52118E3C0B29DA6B
# gpg: Good signature from "Leon Alrae <leon.alrae@imgtec.com>"
# Primary key fingerprint: 8DD3 2F98 5495 9D66 35D4  4FC0 5211 8E3C 0B29 DA6B

* remotes/lalrae/tags/mips-20160624:
  target-mips: Add FCR31's FS bit definition
  target-mips: Implement FCR31's R/W bitmask and related functionalities
  target-mips: Add nan2008 flavor of <CEIL|CVT|FLOOR|ROUND|TRUNC>.<L|W>.<S|D>
  target-mips: Add abs2008 flavor of <ABS|NEG>.<S|D>
  target-mips: Activate IEEE 754-2008 signaling NaN bit meaning for MSA
  linux-user: Update preprocessor constants for Mips-specific e_flags bits
  softfloat: Handle snan_bit_is_one == 0 in MIPS pickNaNMulAdd()
  softfloat: For Mips only, correct default NaN values
  softfloat: Clean code format in fpu/softfloat-specialize.h
  softfloat: Implement run-time-configurable meaning of signaling NaN bit

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 11:48:22 +01:00
Peter Maydell
929bf947f7 Merge remote-tracking branch 'remotes/mcayland/tags/qemu-sparc-signed' into staging
qemu-sparc update

# gpg: Signature made Fri 24 Jun 2016 18:19:36 BST
# gpg:                using RSA key 0x5BC2C56FAE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* remotes/mcayland/tags/qemu-sparc-signed:
  target-sparc: fix register corruption in ldstub if there is no write permission

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-27 10:56:11 +01:00
Peter Krempa
27393c33d8 qapi: keep names in 'CpuInstanceProperties' in sync with struct CPUCore
struct CPUCore uses 'id' suffix in the property name. As docs for
query-hotpluggable-cpus state that the cpu core properties should be
passed back to device_add by management in case new members are added
and thus the names for the fields should be kept in sync.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
[dwg: Removed a duplicated word in comment]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-27 13:15:06 +10:00
Peter Krempa
62c9467dff qapi: Report support for -device cpu hotplug in query-machines
For management apps it's very useful to know whether the selected
machine type supports cpu hotplug via the new -device approach. Using
the presence of 'query-hotpluggable-cpus' alone is not enough as a
witness.

Add a property to 'MachineInfo' called 'hotpluggable-cpus' that will
report the presence of this feature.

Example of output:
    {
        "hotpluggable-cpus": false,
        "name": "mac99",
        "cpu-max": 1
    },
    {
        "hotpluggable-cpus": true,
        "name": "pseries-2.7",
        "is-default": true,
        "cpu-max": 255,
        "alias": "pseries"
    },

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-27 13:13:35 +10:00
Benjamin Herrenschmidt
d29f086169 ppc/xics: Remove unused xics_set_irq_type()
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
[dwg: Adjusted for context to apply without original series]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-27 13:13:30 +10:00
Aaron Larson
6d18a7a1ff target-ppc: ppce500_spin.c uses SPR_PIR, should use SPR_BOOKE_PIR
ppce500_spin.c uses SPR_PIR to initialize the spin table, however on
Book E processors the correct SPR is SPR_BOOKE_PIR.

Signed-off-by: Aaron Larson <alarson@ddci.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-27 13:12:22 +10:00
Richard Henderson
4ba92cd736 linux-user: Provide safe_syscall for ppc64
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-26 13:17:22 +03:00
Richard Henderson
c9bc3437a9 linux-user: Provide safe_syscall for s390x
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-26 13:17:22 +03:00
Richard Henderson
31f875f211 linux-user: Provide safe_syscall for aarch64
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
[RV] Updated syscall argument comment to match code
2016-06-26 13:17:22 +03:00
Richard Henderson
e942fefa6e linux-user: Provide safe_syscall for arm
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-26 13:17:22 +03:00
Richard Henderson
5d3acaf89c linux-user: Provide safe_syscall for i386
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-26 13:17:22 +03:00
Richard Henderson
4eed9990a0 linux-user: fix x86_64 safe_syscall
Do what the comment says, test for signal_pending non-zero,
rather than the current code which tests for bit 0 non-zero.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-26 13:17:22 +03:00
Laurent Vivier
b9403979b5 linux-user: don't swap NLMSG_DATA() fields
If the structure pointed by NLMSG_DATA() is bigger
than the size of NLMSG_DATA(), don't swap its fields
to avoid memory corruption.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-26 13:17:22 +03:00
Laurent Vivier
48dc0f2c3d linux-user: fd_trans_host_to_target_data() must process only received data
if we process the whole buffer, the netlink helpers can try
to swap invalid data.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-26 13:17:22 +03:00
Laurent Vivier
84f34b00c8 linux-user: add missing return in netlink switch statement
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-26 13:17:21 +03:00
Laurent Vivier
9a6309e7fa linux-user: update get_thread_area/set_thread_area strace
int get_thread_area(struct user_desc *u_info);
       int set_thread_area(struct user_desc *u_info);

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-26 13:17:21 +03:00
Laurent Vivier
84bd828429 linux-user: fix clone() strace
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-26 13:17:21 +03:00
Laurent Vivier
8997d1bd18 linux-user: add socket() strace
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-26 13:17:21 +03:00
Laurent Vivier
fb3aabf384 linux-user: add socketcall() strace
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-26 13:17:21 +03:00
Peter Maydell
7e3b92ece0 linux-user: Support F_GETPIPE_SZ and F_SETPIPE_SZ fcntls
Support the F_GETPIPE_SZ and F_SETPIPE_SZ fcntl operations.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-26 13:17:21 +03:00
Peter Maydell
4debae6fa5 linux-user: Fix wrong type used for argument to rt_sigqueueinfo
The third argument to the rt_sigqueueinfo syscall is a pointer to
a siginfo_t, not a pointer to a sigset_t. Fix the error in the
arguments to lock_user(), which meant that we would not have
detected some faults that we should.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-26 13:17:21 +03:00
Peter Maydell
ba4537805d linux-user: Create a hostdep.h for each host architecture
In commit 4d330cee37 a new hostdep.h file was added, with the intent
that host architectures which needed one could provide it, and the
build system would automatically fall back to a generic version if
there was no version for the host architecture. Although this works,
it has a flaw: if a subsequent commit switches an architecture from
"uses generic/hostdep.h" to "uses its own hostdep.h" nothing in the
makefile dependencies notices this and so doing a rebuild without
a manual 'make clean' will fail.

So we drop the idea of having a 'generic' version in favour of
every architecture we support having its own hostdep.h, even if
it doesn't have anything in it. (There are only thirteen of these.)

If the dependency files claim that an object file depends on a
nonexistent file, our dependency system means that make will
rebuild the object file, and regenerate the dependencies in
the process. So moving between trees prior to this commit and
trees after this commit works without requiring a 'make clean'.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-26 13:17:21 +03:00
Peter Maydell
c5679026d5 user-exec: Remove unused code for OSX hosts
Since we dropped darwin-user support many years ago, the code in
user-exec to support hosts which define __APPLE__ is unused; delete it.

Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-26 13:17:20 +03:00
Peter Maydell
4259a820d2 user-exec: Delete now-unused hppa and m68k cpu_signal_handler() code
Now that configure blocks attempts to build user-mode code on hppa
and m68k hosts, we can delete the cpu_signal_handler() implementations
for those architectures.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-26 13:17:20 +03:00
Peter Maydell
affc88cc9b configure: Don't allow user-only targets for unknown CPU architectures
For the user-only targets, we need to know something about the host CPU
architecture even if we are using the TCI interpreter rather than TCG.
(In particular user-exec.c has code for handling signals that needs
to know about that host's context structures.)

Specifically forbid building the user-only targets on unknown CPU
architectures, rather than allowing them to configure but then fail
when building user-exec.c.

This change drops supports for two configurations which were theoretically
possible before:
 * linux-user targets on M68K hosts using TCI
 * linux-user targets on HPPA hosts using TCI

We don't think anybody is actually trying to use these in practice, though:
 * interpreted TCG on a slow host CPU would be unusably slow
 * the m68k user-exec.c support is missing is_write detection so guest
   code which writes to the same page it is executing from was broken
   (will include any guest program using signals)
 * HPPA TCG backend support was dropped two and a half years ago
   with no complaints

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-26 13:17:20 +03:00
Peter Maydell
997f6ed3a1 configure: Don't override ARCH=unknown if enabling TCI
At the moment if configure finds an unknown CPU it will set
ARCH to 'unknown', and then later either bail out or set it
to 'tci' (depending on whether the user passed configure the
--enable-tcg-interpreter switch). This is unnecessarily
confusing, because we could be using TCI in two cases:
 * a known host architecture (in which case ARCH is set to
   the actual host architecture, like 'i386')
 * an unknown host architecture (in which case ARCH is
   set to 'tci')
so nothing can rely on ARCH=tci to mean "using TCI".
Remove the line setting ARCH, so we leave it as "unknown",
which is what the actual situation is.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-26 13:17:20 +03:00
Peter Maydell
1d48fdd9d8 linux-user: Don't use sigfillset() on uc->uc_sigmask
The kernel and libc have different ideas about what a sigset_t
is -- for the kernel it is only _NSIG / 8 bytes in size (usually
8 bytes), but for libc it is much larger, 128 bytes. In most
situations the difference doesn't matter, because if you pass a
pointer to a libc sigset_t to the kernel it just acts on the first
8 bytes of it, but for the ucontext_t* argument to a signal handler
it trips us up. The kernel allocates this ucontext_t on the stack
according to its idea of the sigset_t type, but the type of the
ucontext_t defined by the libc headers uses the libc type, and
so do the manipulator functions like sigfillset(). This means that
 (1) sizeof(uc->uc_sigmask) is much larger than the actual
     space used on the stack
 (2) sigfillset(&uc->uc_sigmask) will write garbage 0xff bytes
     off the end of the structure, which can trash data that
     was on the stack before the signal handler was invoked,
     and may result in a crash after the handler returns

To avoid this, we use a memset() of the correct size to fill
the signal mask rather than using the libc function.

This fixes a problem where we would crash at least some of the
time on an i386 host when a signal was taken.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-26 13:17:20 +03:00
Peter Maydell
435da5e709 linux-user: Use safe_syscall wrapper for fcntl
Use the safe_syscall wrapper for fcntl. This is straightforward now
that we always use 'struct fcntl64' on the host, as we don't need
to select whether to call the host's fcntl64 or fcntl syscall
(a detail that the libc previously hid for us).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-26 13:17:20 +03:00
Peter Maydell
213d3e9ea2 linux-user: Use __get_user() and __put_user() to handle structs in do_fcntl()
Use the __get_user() and __put_user() to handle reading and writing the
guest structures in do_ioctl(). This has two benefits:
 * avoids possible errors due to misaligned guest pointers
 * correctly sign extends signed fields (like l_start in struct flock)
   which might be different sizes between guest and host

To do this we abstract out into copy_from/to_user functions. We
also standardize on always using host flock64 and the F_GETLK64
etc flock commands, as this means we always have 64 bit offsets
whether the host is 64-bit or 32-bit and we don't need to support
conversion to both host struct flock and struct flock64.

In passing we fix errors in converting l_type from the host to
the target (where we were doing a byteswap of the host value
before trying to do the convert-bitmasks operation rather than
otherwise, and inexplicably shifting left by 1); these were
accidentally left over when the original simple "just shift by 1"
arm<->x86 conversion of commit 43f238d was changed to the more
general scheme of using target_to_host_bitmask() functions in 2ba7f73.

[RV: fixed ifdef guard for eabi functions]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-26 13:16:41 +03:00
Artyom Tarasenko
b64d2e57e7 target-sparc: fix register corruption in ldstub if there is no write permission
Signed-off-by: Artyom Tarasenko <atar4qemu@gmail.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2016-06-24 18:18:32 +01:00
Aleksandar Markovic
77be419980 target-mips: Add FCR31's FS bit definition
Add preprocessor definition of FCR31's FS bit, and update related
code for setting this bit.

Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-06-24 13:43:53 +01:00
Aleksandar Markovic
599bc5e89c target-mips: Implement FCR31's R/W bitmask and related functionalities
This patch implements read and write access rules for Mips floating
point control and status register (FCR31). The change can be divided
into following parts:

- Add fields that will keep FCR31's R/W bitmask in procesor
  definitions and processor float_status structure.

- Add appropriate value for FCR31's R/W bitmask for each supported
  processor.

- Add function for setting snan_bit_is_one, and integrate it in
  appropriate places.

- Modify handling of CTC1 (case 31) instruction to use FCR31's R/W
  bitmask.

- Modify handling user mode executables for Mips, in relation to the
  bit EF_MIPS_NAN2008 from ELF header, that is in turn related to
  reading and writing to FCR31.

- Modify gdb behavior in relation to FCR31.

Signed-off-by: Thomas Schwinge <thomas@codesourcery.com>
Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com>
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-06-24 13:43:52 +01:00
Aleksandar Markovic
87552089b6 target-mips: Add nan2008 flavor of <CEIL|CVT|FLOOR|ROUND|TRUNC>.<L|W>.<S|D>
New set of helpers for handling nan2008-syle versions of instructions
<CEIL|CVT|FLOOR|ROUND|TRUNC>.<L|W>.<S|D>, for Mips R6.

All involved instructions have float operand and integer result. Their
core functionality is implemented via invocations of appropriate SoftFloat
functions. The problematic cases are when the operand is a NaN, and also
when the operand (float) is out of the range of the result.

Here one can distinguish three cases:

CASE MIPS-A: (FCR31.NAN2008 == 1)

   1. Operand is a NaN, result should be 0;
   2. Operand is larger than INT_MAX, result should be INT_MAX;
   3. Operand is smaller than INT_MIN, result should be INT_MIN.

CASE MIPS-B: (FCR31.NAN2008 == 0)

   1. Operand is a NaN, result should be INT_MAX;
   2. Operand is larger than INT_MAX, result should be INT_MAX;
   3. Operand is smaller than INT_MIN, result should be INT_MAX.

CASE SoftFloat:

   1. Operand is a NaN, result is INT_MAX;
   2. Operand is larger than INT_MAX, result is INT_MAX;
   3. Operand is smaller than INT_MIN, result is INT_MIN.

Current implementation of <CEIL|CVT|FLOOR|ROUND|TRUNC>.<L|W>.<S|D>
implements case MIPS-B. This patch relates to case MIPS-A. For case
MIPS-A, only return value for NaN-operands should be corrected after
appropriate SoftFloat library function is called.

Related MSA instructions FTRUNC_S and FTINT_S already handle well
all cases, in the fashion similar to the code from this patch.

Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
[leon.alrae@imgtec.com:
 * removed a statement from the description which caused slight confusion]
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-06-24 13:41:47 +01:00
Aleksandar Markovic
6be7748005 target-mips: Add abs2008 flavor of <ABS|NEG>.<S|D>
Updated handling of instructions <ABS|NEG>.<S|D>. Note that legacy
(pre-abs2008) ABS and NEG instructions are arithmetic (and, therefore,
any NaN operand causes signaling invalid operation), while abs2008
ones are non-arithmetic, always and only changing the sign bit, even
for NaN-like operands. Details on these instructions are documented
in [1] p. 35 and 359.

Implementation-wise, abs2008 versions are implemented without helpers,
for simplicity and performance sake.

[1] "MIPS Architecture For Programmers Volume II-A:
    The MIPS64 Instruction Set Reference Manual",
    Imagination Technologies LTD, Revision 6.04, November 13, 2015

Signed-off-by: Thomas Schwinge <thomas@codesourcery.com>
Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com>
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-06-24 13:41:46 +01:00
Aleksandar Markovic
40bd6dd456 target-mips: Activate IEEE 754-2008 signaling NaN bit meaning for MSA
Function msa_reset() is updated so that flag snan_bit_is_one is
properly set to 0.

By applying this patch, a number of incorrect MSA behaviors that
require IEEE 754-2008 compliance will be fixed. Those are behaviors
that (up to the moment of applying this patch) did not get the desired
functionality from SoftFloat library with respect to distinguishing
between quiet and signaling NaN, getting default NaN values (both
quiet and signaling), establishing if a floating point number is NaN
or not, etc.

Two examples:

* FMAX, FMIN will now correctly detect and propagate NaNs.
* FCLASS.D ans FCLASS.S will now correcty detect NaN flavors.

Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-06-24 13:41:45 +01:00
Aleksandar Markovic
52d4c8ee93 linux-user: Update preprocessor constants for Mips-specific e_flags bits
Missing values EF_MIPS_FP64 and EF_MIPS_NAN2008 added.

Signed-off-by: Thomas Schwinge <thomas@codesourcery.com>
Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com>
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-06-24 13:41:45 +01:00
Aleksandar Markovic
c27644f0e9 softfloat: Handle snan_bit_is_one == 0 in MIPS pickNaNMulAdd()
Only for Mips platform, and only for cases when snan_bit_is_one is 0,
correct the order of argument comparisons in pickNaNMulAdd().

For more info, see [1], page 53, section "3.5.3 NaN Propagation".

[1] "MIPS Architecture for Programmers Volume IV-j:
    The MIPS32 SIMD Architecture Module",
    Imagination Technologies LTD, Revision 1.12, February 3, 2016

Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[leon.alrae@imgtec.com:
 * reworded the subject of the patch
 * swapped if/else code blocks to match the commit description]
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-06-24 13:41:32 +01:00
Aleksandar Markovic
a7c04d545a softfloat: For Mips only, correct default NaN values
Only for Mips platform, and only for cases when snan_bit_is_one is 0,
correct default NaN values (in their 16-, 32-, and 64-bit flavors).

For more info, see [1], page 84, Table 6.3 "Value Supplied When a New
Quiet NaN Is Created", and [2], page 52, Table 3.7 "Default NaN
Encodings".

[1] "MIPS Architecture For Programmers Volume II-A:
    The MIPS64 Instruction Set Reference Manual",
    Imagination Technologies LTD, Revision 6.04, November 13, 2015

[2] "MIPS Architecture for Programmers Volume IV-j:
    The MIPS32 SIMD Architecture Module",
    Imagination Technologies LTD, Revision 1.12, February 3, 2016

Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-06-24 13:41:31 +01:00
Aleksandar Markovic
a59eaea646 softfloat: Clean code format in fpu/softfloat-specialize.h
fpu/softfloat-specialize.h is the most critical file in SoftFloat
library, since it handles numerous differences between platforms in
relation to floating point arithmetics. This patch makes the code
in this file more consistent format-wise, and hopefully easier to
debug and maintain.

Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-06-24 13:41:30 +01:00
Aleksandar Markovic
af39bc8c49 softfloat: Implement run-time-configurable meaning of signaling NaN bit
This patch modifies SoftFloat library so that it can be configured in
run-time in relation to the meaning of signaling NaN bit, while, at the
same time, strictly preserving its behavior on all existing platforms.

Background:

In floating-point calculations, there is a need for denoting undefined or
unrepresentable values. This is achieved by defining certain floating-point
numerical values to be NaNs (which stands for "not a number"). For additional
reasons, virtually all modern floating-point unit implementations use two
kinds of NaNs: quiet and signaling. The binary representations of these two
kinds of NaNs, as a rule, differ only in one bit (that bit is, traditionally,
the first bit of mantissa).

Up to 2008, standards for floating-point did not specify all details about
binary representation of NaNs. More specifically, the meaning of the bit
that is used for distinguishing between signaling and quiet NaNs was not
strictly prescribed. (IEEE 754-2008 was the first floating-point standard
that defined that meaning clearly, see [1], p. 35) As a result, different
platforms took different approaches, and that presented considerable
challenge for multi-platform emulators like QEMU.

Mips platform represents the most complex case among QEMU-supported
platforms regarding signaling NaN bit. Up to the Release 6 of Mips
architecture, "1" in signaling NaN bit denoted signaling NaN, which is
opposite to IEEE 754-2008 standard. From Release 6 on, Mips architecture
adopted IEEE standard prescription, and "0" denotes signaling NaN. On top of
that, Mips architecture for SIMD (also known as MSA, or vector instructions)
also specifies signaling bit in accordance to IEEE standard. MSA unit can be
implemented with both pre-Release 6 and Release 6 main processor units.

QEMU uses SoftFloat library to implement various floating-point-related
instructions on all platforms. The current QEMU implementation allows for
defining meaning of signaling NaN bit during build time, and is implemented
via preprocessor macro called SNAN_BIT_IS_ONE.

On the other hand, the change in this patch enables SoftFloat library to be
configured in run-time. This configuration is meant to occur during CPU
initialization, at the moment when it is definitely known what desired
behavior for particular CPU (or any additional FPUs) is.

The change is implemented so that it is consistent with existing
implementation of similar cases. This means that structure float_status is
used for passing the information about desired signaling NaN bit on each
invocation of SoftFloat functions. The additional field in float_status is
called snan_bit_is_one, which supersedes macro SNAN_BIT_IS_ONE.

IMPORTANT:

This change is not meant to create any change in emulator behavior or
functionality on any platform. It just provides the means for SoftFloat
library to be used in a more flexible way - in other words, it will just
prepare SoftFloat library for usage related to Mips platform and its
specifics regarding signaling bit meaning, which is done in some of
subsequent patches from this series.

Further break down of changes:

  1) Added field snan_bit_is_one to the structure float_status, and
     correspondent setter function set_snan_bit_is_one().

  2) Constants <float16|float32|float64|floatx80|float128>_default_nan
     (used both internally and externally) converted to functions
     <float16|float32|float64|floatx80|float128>_default_nan(float_status*).
     This is necessary since they are dependent on signaling bit meaning.
     At the same time, for the sake of code cleanup and simplicity, constants
     <floatx80|float128>_default_nan_<low|high> (used only internally within
     SoftFloat library) are removed, as not needed.

  3) Added a float_status* argument to SoftFloat library functions
     XXX_is_quiet_nan(XXX a_), XXX_is_signaling_nan(XXX a_),
     XXX_maybe_silence_nan(XXX a_). This argument must be present in
     order to enable correct invocation of new version of functions
     XXX_default_nan(). (XXX is <float16|float32|float64|floatx80|float128>
     here)

  4) Updated code for all platforms to reflect changes in SoftFloat library.
     This change is twofolds: it includes modifications of SoftFloat library
     functions invocations, and an addition of invocation of function
     set_snan_bit_is_one() during CPU initialization, with arguments that
     are appropriate for each particular platform. It was established that
     all platforms zero their main CPU data structures, so snan_bit_is_one(0)
     in appropriate places is not added, as it is not needed.

[1] "IEEE Standard for Floating-Point Arithmetic",
    IEEE Computer Society, August 29, 2008.

Signed-off-by: Thomas Schwinge <thomas@codesourcery.com>
Signed-off-by: Maciej W. Rozycki <macro@codesourcery.com>
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Tested-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Tested-by: Leon Alrae <leon.alrae@imgtec.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[leon.alrae@imgtec.com:
 * cherry-picked 2 chunks from patch #2 to fix compilation warnings]
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-06-24 13:40:37 +01:00
Peter Maydell
a01aef5d2f Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, pci, virtio: new features, cleanups, fixes

nvdimm label support
cpu acpi hotplug rework
virtio rework
misc cleanups and fixes

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Fri 24 Jun 2016 06:50:32 BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream: (34 commits)
  virtio-bus: remove old set_host_notifier callback
  virtio-mmio: convert to ioeventfd callbacks
  virtio-pci: convert to ioeventfd callbacks
  virtio-ccw: convert to ioeventfd callbacks
  virtio-bus: have callers tolerate new host notifier api
  virtio-bus: common ioeventfd infrastructure
  pc: acpi: drop intermediate PCMachineState.node_cpu
  acpi-test-data: update expected
  pc: use new CPU hotplug interface since 2.7 machine type
  acpi: cpuhp: add cpu._OST handling
  acpi: cpuhp: implement hot-remove parts of CPU hotplug interface
  acpi: cpuhp: implement hot-add parts of CPU hotplug interface
  pc: acpi: introduce AcpiDeviceIfClass.madt_cpu hook
  acpi: cpuhp: add CPU devices AML with _STA method
  pc: piix4/ich9: add 'cpu-hotplug-legacy' property
  docs: update ACPI CPU hotplug spec with new protocol
  i386: pci-assign: Fix MSI-X table size
  docs: add NVDIMM ACPI documentation
  nvdimm acpi: support Set Namespace Label Data function
  nvdimm acpi: support Get Namespace Label Data function
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-24 11:00:15 +01:00
Peter Maydell
55d72a7eb3 linux-user: Avoid possible misalignment in host_to_target_siginfo()
host_to_target_siginfo() is implemented by a combination of
host_to_target_siginfo_noswap() followed by tswap_siginfo().
The first of these two functions assumes that the target_siginfo_t
it is writing to is correctly aligned, but the pointer passed
into host_to_target_siginfo() is directly from the guest and
might be misaligned. Use a local variable to avoid this problem.
(tswap_siginfo() does now correctly handle a misaligned destination.)

We have to add a memset() to host_to_target_siginfo_noswap()
to avoid some false positive "may be used uninitialized" warnings
from gcc about subfields of the _sifields union if it chooses to
inline both tswap_siginfo() and host_to_target_siginfo_noswap()
into host_to_target_siginfo().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Peter Maydell <riku.voipio@linaro.org>
2016-06-24 11:55:44 +03:00
Cornelia Huck
21a4d96243 virtio-bus: remove old set_host_notifier callback
All users have been converted to the new ioevent callbacks.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 08:47:35 +03:00
Cornelia Huck
c0971bcb7c virtio-mmio: convert to ioeventfd callbacks
Convert to the new interface.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 08:47:35 +03:00
Cornelia Huck
9f06e71a56 virtio-pci: convert to ioeventfd callbacks
Convert to new interface.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 08:47:35 +03:00
Cornelia Huck
7c55f68a63 virtio-ccw: convert to ioeventfd callbacks
Use the new interface.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 08:47:35 +03:00
Cornelia Huck
b1f0a33d80 virtio-bus: have callers tolerate new host notifier api
Have vhost and dataplane use the new api for transports that
have been converted.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 08:47:35 +03:00
Cornelia Huck
6798e245a3 virtio-bus: common ioeventfd infrastructure
Introduce a set of ioeventfd callbacks on the virtio-bus level
that can be implemented by the individual transports. At the
virtio-bus level, do common handling for host notifiers (which
is actually most of it).

Two things of note:
- When setting the host notifier, we only switch from/to the
  generic ioeventfd handler. This fixes a latent bug where we
  had no ioeventfd assigned for a certain window.
- We always iterate over all possible virtio queues, even though
  ccw (currently) has a lower limit. It does not really matter
  here.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 08:47:35 +03:00
Igor Mammedov
1f3aba377d pc: acpi: drop intermediate PCMachineState.node_cpu
PCMachineState.node_cpu was used for mapping APIC ID
to numa node id as CPU entries in SRAT used to be
built on sparse APIC ID bitmap (up to apic_id_limit).
However since commit
  5803fce pc: acpi: SRAT: create only valid processor lapic entries
CPU entries in SRAT aren't build using apic bitmap
but using 0..maxcpus index instead which is also used
for creating numa_info[x].node_cpu map.
So instead of doing useless intermediate conversion from
  1. node by cpu index -> node by apic id
       i.e. numa_info[x].node_cpu -> PCMachineState.node_cpu
  2. apic id -> srat entry PMX
       PCMachineState.node_cpu[apic id] -> PMX value
use numa_info[x].node_cpu map directly like ARM does and do
  1. numa_info[x].node_cpu -> PMX value using index
     in range 0..maxcpus
and drop not necessary PCMachineState.node_cpu and related
code.

That also removes the last (not counting legacy hotplug)
dependency of ACPI code on apic_id_limit and need to allocate
huge sparse PCMachineState.node_cpu array in case of 32-bit
APIC IDs.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 08:34:47 +03:00
Michael S. Tsirkin
d8d69e1f51 acpi-test-data: update expected
switched to new cpu hotplug interface, aml changed.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 08:22:07 +03:00
Igor Mammedov
679dd1a957 pc: use new CPU hotplug interface since 2.7 machine type
For compatibility reasons PC/Q35 will start with legacy
CPU hotplug interface by default but with new CPU hotplug
AML code since 2.7 machine type. That way legacy firmware
that doesn't use QEMU generated ACPI tables will be
able to continue using legacy CPU hotplug interface.

While new machine type, with firmware supporting QEMU
provided ACPI tables, will generate new CPU hotplug AML,
which will switch to new CPU hotplug interface when
guest OS executes its _INI method on ACPI tables
loading.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:21:38 +03:00
Igor Mammedov
76623d00ae acpi: cpuhp: add cpu._OST handling
it adds HW and AML parts for CPU_Device._OST method
handling to allow OSPM reports status of hot-(un)plug
operation.
And extends QMP command query-acpi-ospm-status to report
CPU's OST info along with already reported PC-DIMM devices.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:21:35 +03:00
Igor Mammedov
8872c25a26 acpi: cpuhp: implement hot-remove parts of CPU hotplug interface
it adds hw registers needed for handling CPU hot-remove and
corresponding AML methods to request and eject a CPU with
necessary hotplug callbacks in pc,piix4,ich9 code.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:21:26 +03:00
Igor Mammedov
d2238cb678 acpi: cpuhp: implement hot-add parts of CPU hotplug interface
it adds hw registers needed for handling CPU hot-add and
corresponding AML methods to handle hot-add events on
guest side.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:21:22 +03:00
Igor Mammedov
ac35f13ba8 pc: acpi: introduce AcpiDeviceIfClass.madt_cpu hook
Add madt_cpu callback to AcpiDeviceIfClass and use
it for generating LAPIC MADT entries for CPUs.

Later it will be used for generating x2APIC
entries in case of more than 255 CPUs and also
would be reused by ARM target when ACPI CPU hotplug
is introduced there.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:21:16 +03:00
Igor Mammedov
5e1b5d9388 acpi: cpuhp: add CPU devices AML with _STA method
it adds CPU objects to DSDT with _STA method
and QEMU side of CPU hotplug interface initialization
with registers sufficient to handle _STA requests,
including necessary hotplug callbacks in piix4,ich9 code.

Hot-(un)plug hw/acpi parts will be added by
corresponding follow up patches.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:21:01 +03:00
Igor Mammedov
16bcab97eb pc: piix4/ich9: add 'cpu-hotplug-legacy' property
It will be used to select which hotplug call-back is called
and for switching from legacy mode into new one.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:20:55 +03:00
Igor Mammedov
abd49bc2ed docs: update ACPI CPU hotplug spec with new protocol
Add description of new CPU hotplug interface.

To switch from from legacy mode into new mode use fact
that write accesses into CPU present bitmap were never
used before and were ignored by QEMU.
So use it to as a way to switch from legacy mode.
That way pc/q35 machine starts in legacy mode and
QEMU generated ACPI tables will switch to new CPU
hotplug interface during runtime.
In case QEMU is started with legacy BIOS (that doesn't
support QEMU generated ACPI tables), legacy CPU hotplug
will remain active and could be used by BIOS built in
ACPI tables for CPU hotplug.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:20:22 +03:00
Ido Yariv
aa1dd39ca3 i386: pci-assign: Fix MSI-X table size
The current code creates a whole page mmio region for the MSI-X table
size.

However, the page containing the MSI-X table may contain other registers
not related to MSI-X. Creating an mmio region for the whole page masks
such registers and may break drivers in the guest OS.

Since maximal number of entries is known, use that instead to deduce the
table size when setting up the mmio region.

Signed-off-by: Ido Yariv <ido@wizery.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:13:57 +03:00
Xiao Guangrong
15b82b1dc5 docs: add NVDIMM ACPI documentation
It describes the basic concepts of NVDIMM ACPI and the interfaces
between QEMU and the ACPI BIOS

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:13:57 +03:00
Xiao Guangrong
14e44198ff nvdimm acpi: support Set Namespace Label Data function
Function 6 is used to set Namespace Label Data

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:13:57 +03:00
Xiao Guangrong
2b9e57fc7f nvdimm acpi: support Get Namespace Label Data function
Function 5 is used to get Namespace Label Data

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:13:57 +03:00
Xiao Guangrong
5797dcdc7a nvdimm acpi: support Get Namespace Label Size function
Function 4 is used to get Namespace label size

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:13:57 +03:00
Xiao Guangrong
d15fc53f8d nvdimm acpi: check revision
Currently only revision 1 is supported

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:13:57 +03:00
Xiao Guangrong
189f4d5635 nvdimm acpi: abstract the operations for root & nvdimm devices
It separates the operations between root device and nvdimm devices
in order to introducing label functions support for nvdimm device

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:13:57 +03:00
Xiao Guangrong
90623ebf60 nvdimm acpi: check UUID
Check arg0 which indicates UUID to see if it is valid

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:13:57 +03:00
Xiao Guangrong
4568c94806 nvdimm acpi: save arg3 of _DSM method
Check if the input Arg3 is valid then store it into ARG3 if it is
needed

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:13:57 +03:00
Xiao Guangrong
732b530c1b nvdimm acpi: set HDLE properly
Now we pass HDLE to Qemu properly, use 0 for root device and use the
handle for nvdimm devices

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:13:57 +03:00
Xiao Guangrong
052889b8e9 acpi: add aml_call5
It will be used by NVDIMM ACPI

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:13:57 +03:00
Xiao Guangrong
b265f27c5a acpi: add aml_object_type
Implement ObjectType which is used by NVDIMM _DSM method in
later patch

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:13:57 +03:00
Xiao Guangrong
d6fb213a62 nvdimm: support nvdimm label
Introduce a parameter, 'label-size', which is the size of nvdimm label
data area which is reserved at the end of backend memory. It is required
at least 128k

Two callbacks, read_label_data() and write_label_data(), are used to
operate the label area

Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:13:57 +03:00
Xiao Guangrong
8df1426e44 pc-dimm: introduce get_vmstate_memory_region callback
This callback returns the MemoryRegion that is the memory of dimm should
be kept during live migration

nvdimm device is different with pc-dimm as its memory includes not only
the MemoryRegion directly mapping to guest's address space but also the
memory used as label data

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:13:57 +03:00
Corey Minyard
f4eda2d429 bios: Add tests for the IPMI ACPI and SMBIOS entries
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:13:57 +03:00
Corey Minyard
86e91dd713 acpi: Add IPMI table entries
Use the ACPI table construction tools to create an ACPI entry
for IPMI.  This adds a function called build_acpi_ipmi_devices
to add an DSDT entry for IPMI if IPMI is compiled in and an
IPMI device exists.  It also adds a dummy function if IPMI
is not compiled in.

This conforms to section "C3-2 Locating IPMI System Interfaces in
ACPI Name Space" in the IPMI 2.0 specification.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:13:57 +03:00
Corey Minyard
35658f6e0c ipmi: Add SMBIOS table entry
Add an IPMI table entry to the SMBIOS.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:13:57 +03:00
Corey Minyard
0517cc9863 smbios: Move table build tools into an include file.
This will let things in other files (like IPMI) build SMBIOS tables.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-24 05:13:57 +03:00
Peter Maydell
c728876752 Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160623' into staging
ppc patch queue for 2016-06-23

Currently outstanding patches for spapr, target-ppc and related
devices.  This batch has:
    * Significant new progress towards full support for hypervisor
      mode
    * Assorted bugfixes
    * Some preliminary patches towards dynamic DMA window support

The last involves a change to memory.c, which Paolo has said I can
take through this tree.

# gpg: Signature made Thu 23 Jun 2016 06:47:53 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.7-20160623:
  ppc: Disable huge page support if it is not available for main RAM
  ppc: Add P7/P8 Power Management instructions
  ppc: Move exception generation code out of line
  ppc: Turn a bunch of booleans from int to bool
  ppc: Add real mode CI load/store instructions for P7 and P8
  ppc: Rework generation of priv and inval interrupts
  ppc: Fix generation if ISI/DSI vs. HV mode
  ppc: Fix POWER7 and POWER8 exception definitions
  ppc: fix exception model for HV mode
  ppc: define a default LPCR value
  ppc: Fix rfi/rfid/hrfi/... emulation
  memory: Add reporting of supported page sizes
  ppc: Improve emulation of THRM registers
  target-ppc: Fix rlwimi, rlwinm, rlwnm again
  ppc64: disable gen_pause() for linux-user mode
  tests: Use '+=' to add additional tests, not '='
  powerpc/mm: Update the WIMG check during H_ENTER

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-23 11:53:14 +01:00
Peter Maydell
c6eb076aec Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20160622-2' into staging
usb: add hotplug support for usb-bot and usb-uas.

# gpg: Signature made Wed 22 Jun 2016 12:45:46 BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-usb-20160622-2:
  usb-uas: hotplug support
  usb-bot: hotplug support
  usb: Add QOM property "attached".
  usb: make USBDevice->attached bool
  usb-storage: qcow2 encryption support is finally gone, zap dead code

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-23 11:18:57 +01:00
Peter Maydell
59a79f65ba Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20160622-tag' into staging
xen-20160622

# gpg: Signature made Wed 22 Jun 2016 12:45:56 BST
# gpg:                using RSA key 0x894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3  0AEA 894F 8F48 70E1 AE90

* remotes/sstabellini/tags/xen-20160622-tag:
  xen: move xen_sysdev to xen_backend.c
  xen: fix qdisk BLKIF_OP_DISCARD for 32/64 word size mix
  xen: fix style of hw/block/xen_blkif.h

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-23 10:44:56 +01:00
Thomas Huth
86b50f2e1b ppc: Disable huge page support if it is not available for main RAM
On powerpc, we must only signal huge page support to the guest if
all memory areas are capable of supporting huge pages. The commit
2d103aae87 ("fix hugepage support when using memory-backend-file")
already fixed the case when the user specified the mem-path property
for NUMA memory nodes instead of using the global "-mem-path" option.
However, there is one more case where it currently can go wrong.
When specifying additional memory DIMMs without using NUMA, e.g.

 qemu-system-ppc64 -enable-kvm ... -m 1G,slots=2,maxmem=2G \
    -device pc-dimm,id=dimm-mem1,memdev=mem1 -object \
    memory-backend-file,policy=default,mem-path=/...,size=1G,id=mem1

the code in getrampagesize() currently assumes that huge pages
are possible since they are enabled for the mem1 object. But
since the main RAM is not backed by a huge page filesystem,
the guest Linux kernel then crashes very quickly after being
started. So in case the we've got "normal" memory without NUMA
and without the global "-mem-path" option, we must not announce
huge pages to the guest. Since this is likely a mis-configuration
by the user, also spill out a message in this case.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-23 12:53:42 +10:00
Benjamin Herrenschmidt
7778a575c7 ppc: Add P7/P8 Power Management instructions
This adds the ISA 2.06 and later power management instructions
(doze, nap, sleep and rvwinkle) and associated wakeup cause testing
in LPCR

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: fixed checkpatch.pl errors ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-23 12:43:54 +10:00
Benjamin Herrenschmidt
b9971cc53e ppc: Move exception generation code out of line
There's no point inlining this, if you hit the exception case you exit
anyway, and not inlining saves about 100K of code size (and cache
footprint).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: removed '__attribute__((noinline))' from original patch ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-23 12:43:54 +10:00
Benjamin Herrenschmidt
5c3ae92910 ppc: Turn a bunch of booleans from int to bool
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-23 12:43:54 +10:00
Benjamin Herrenschmidt
b781537560 ppc: Add real mode CI load/store instructions for P7 and P8
Those instructions are only available in hypervisor real mode and
allow cache inhibited garded access to devices in that mode.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: fixed checkpatch.pl errors ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-23 12:43:54 +10:00
Benjamin Herrenschmidt
9b2fadda3e ppc: Rework generation of priv and inval interrupts
Recent server processors use the Hypervisor Emulation Assistance
interrupt for illegal instructions and *some* type of SPR accesses.

Also the code was always generating inval instructions even for priv
violations due to setting the wrong flags

Finally, the checking for PR/HV was open coded everywhere.

This reworks it all, using little helper macros for checking, and
adding the HV interrupt (which gets converted back to program check
in the slow path of excp_helper.c on CPUs that don't want it).

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: fixed checkpatch.pl errors ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-23 12:43:54 +10:00
Benjamin Herrenschmidt
33595dc9f3 ppc: Fix generation if ISI/DSI vs. HV mode
Under some circumstances, we need to direct ISI and DSI interrupts
at the hypervisor, turning them into HISI/HDSI, and using different
SPRs (HDSISR and HDAR) depending on the combination of MSR_DR and
the corresponding VPM bits in LPCR.

This moves part of the code into helpers that are fixed to select
the right exception type and registers. On pre-P7 processors, LPCR
is 0 which provides the old behaviour of directing the interrupts
at the supervisor.

Thanks to Andrei Warkentin for finding a bug when HV=1

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[clg: Merged a fix on POWERPC_EXCP_HDSI fixing the condition on
      msr_hv, from Andrei Warkentin <andrey.warkentin@gmail.com> ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-23 12:43:25 +10:00
Benjamin Herrenschmidt
f03a1af581 ppc: Fix POWER7 and POWER8 exception definitions
We were initializing unused ones and missing some

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[clg: fixed checkpatch.pl errors ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-23 12:43:25 +10:00
Benjamin Herrenschmidt
6d49d6d4ed ppc: fix exception model for HV mode
This properly implements LPES0 handling for HV vs. !HV mode and
removes the unsupported LPES1. This has been removed from the specs
since ISA v2.07.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: AIL implementation was fixed in commit 5c94b2a5e5. This patch
      only contains the bits of the original patch related to LPES0
      handling, adapted commit log.
      fixed checkpatch.pl errors. ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-23 12:43:25 +10:00
Benjamin Herrenschmidt
61687db252 ppc: define a default LPCR value
This allows us to set the appropriate LPCR bits which will be used
when fixing the exception model for the HV mode.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[clg: previous commit 26a7f1291b did not include the LPCR setting as
      it was not needed at the time, adapted commit log ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-23 12:43:25 +10:00
Benjamin Herrenschmidt
a2e71b28e8 ppc: Fix rfi/rfid/hrfi/... emulation
This reworks emulation of the various "rfi" variants. I removed
some masking bits that I couldn't make sense of, the only bit that
I am aware we should mask here is POW, the CPU's MSR mask should
take care of the rest.

This also fixes some problems when running 32-bit userspace under
a 64-bit kernel.

This patch broke 32bit OpenBIOS when run under a 970 cpu. A fix was
proposed here :

    https://www.coreboot.org/pipermail/openbios/2016-June/009452.html

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[clg: updated the commit log with the reference of the openbios fix ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: Remove hunk which disabled rfi on 64-bit CPUS.  The change was
 correct, but we need to fix OpenBIOS before applying it]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-23 12:42:25 +10:00
Gerd Hoffmann
0d4cf3e72a usb-uas: hotplug support
Make attached property settable and turns off auto-attach in case the
device was hotplugged.  Hotplugging works simliar to usb-bot now.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1465984019-28963-6-git-send-email-kraxel@redhat.com
2016-06-22 12:53:26 +02:00
Gerd Hoffmann
b78ecd0998 usb-bot: hotplug support
This patch marks usb-bot as hot-pluggable device, makes attached
property settable and turns off auto-attach in case the device
was hotplugged.

Hot-plugging a usb-bot device with one or more scsi devices can be
done this way now:

  (1) device-add usb-bot,id=foo
  (2) device-add scsi-{hd,cd},bus=foo.0,lun=0
  (2b) optionally add more devices (luns 0 ... 15).
  (3) qom-set foo.attached = true

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1465984019-28963-5-git-send-email-kraxel@redhat.com
2016-06-22 12:53:26 +02:00
Gerd Hoffmann
1e351dc373 usb: Add QOM property "attached".
USB devices in attached state are visible to the guest.  This patch adds
a QOM property for this.  Write access is opt-in per device.  Some
devices manage attached state automatically (usb-host, usb-serial,
usb-redir), so we can't enable write access universally but have to do
it on a case by case base.  So far, no device opts in.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1465984019-28963-4-git-send-email-kraxel@redhat.com

[ minor codestyle fix ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-06-22 12:53:26 +02:00
Gerd Hoffmann
eb19d2b9d1 usb: make USBDevice->attached bool
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1465984019-28963-3-git-send-email-kraxel@redhat.com
2016-06-22 12:53:26 +02:00
Gerd Hoffmann
8d3830efca usb-storage: qcow2 encryption support is finally gone, zap dead code
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1465984019-28963-2-git-send-email-kraxel@redhat.com
2016-06-22 12:53:26 +02:00
Juergen Gross
25f8f6b4c2 xen: move xen_sysdev to xen_backend.c
Commit 9432e53a5b added xen_sysdev as a
system device to serve as an anchor for removable virtual buses. This
introduced a build failure for non-x86 builds with CONFIG_XEN_BACKEND
set, as xen_sysdev was defined in a x86 specific file while being
consumed in an architecture independent source.

Move the xen_sysdev definition and initialization to xen_backend.c to
avoid the build failure.

Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
2016-06-22 11:28:42 +01:00
Juergen Gross
16246018d3 xen: fix qdisk BLKIF_OP_DISCARD for 32/64 word size mix
In case the word size of the domU and qemu running the qdisk backend
differ BLKIF_OP_DISCARD will not work reliably, as the request
structure in the ring have different layouts for different word size.

Correct this by copying the request structure in case of different
word size element by element in the BLKIF_OP_DISCARD case, too.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
2016-06-22 11:28:17 +01:00
Juergen Gross
0d8e58942a xen: fix style of hw/block/xen_blkif.h
Fix hw/block/xen_blkif.h to match qemu coding style.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
2016-06-22 11:28:17 +01:00
Alexey Kardashevskiy
f682e9c244 memory: Add reporting of supported page sizes
Every IOMMU has some granularity which MemoryRegionIOMMUOps::translate
uses when translating, however this information is not available outside
the translate context for various checks.

This adds a get_min_page_size callback to MemoryRegionIOMMUOps and
a wrapper for it so IOMMU users (such as VFIO) can know the minimum
actual page size supported by an IOMMU.

As IOMMU MR represents a guest IOMMU, this uses TARGET_PAGE_SIZE
as fallback.

This removes vfio_container_granularity() and uses new helper in
memory_region_iommu_replay() when replaying IOMMU mappings on added
IOMMU memory region.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
[dwg: Removed an unnecessary calculation]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-22 11:13:09 +10:00
Benjamin Herrenschmidt
f0278900d3 ppc: Improve emulation of THRM registers
The 75x and 74xx processors have some thermal monitoring SPRs that
some OSes such as MacOS do use. Our current "dumb" implementation
isn't good enough and will cause some versions of MacOS to hang during
boot.

This lifts an improved emulation from MacOnLinux and adapts it to
qemu, thus fixing the problem.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[dwg: Fixed typo in comment, a number of minor checkpatch warnings,
 and a compile failure with CONFIG_USER_ONLY]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-22 11:12:17 +10:00
Richard Henderson
820724d170 target-ppc: Fix rlwimi, rlwinm, rlwnm again
In 63ae0915f8, I arranged to use a 32-bit rotate, without
considering the effect of a mask value that wraps around to
the high bits of the word.

[dwg: In 2e11b15 this was partially fixed, but an edge case was still
incorrect, which this fixes]

Signed-off-by: Richard Henderson <rth@twiddle.net>
[dwg: Folded with a revert of 2e11b15, an earlier buggy version of
 this patch which already went upstream]
Tested-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-22 11:12:17 +10:00
Laurent Vivier
7f2b1744b3 ppc64: disable gen_pause() for linux-user mode
While trying to install a fedora container with
"lxc-create -t fedora -- -I qemu-ppc64" the installation abort with
the following error:

qemu: fatal: Unknown exception 0x65537. Aborting

NIP 0000004000927924   LR 00000040009e325c CTR 0000004000927480 XER 0000000000000000 CPU#0
MSR 9000000102806000 HID0 0000000000000000  HF 9000000002806000 iidx 3 didx 3
TB 00248932 1069155773327487
GPR00 00000040009e325c 00000040007ff800 0000004000aba098 0000000000000000
GPR04 00000040007ff878 0000004000dcb588 0000004000dcb830 0000004000a7a098
GPR08 0000000000000000 0000000000000000 00000040007ff878 0000004000927960
GPR12 0000000022022448 0000004000e2aef0 0000000000000000 0000000000000000
GPR16 0000000000000000 0000000000000000 0000000000000002 0000000000000001
GPR20 0000000000000000 0000000000000000 0000000000000000 0000004000800699
GPR24 0000004000e13320 0000000000000000 0000004000ac9ad8 0000004000ac9ae0
GPR28 0000000000000001 00000000100210a0 0000000000000000 0000000000000038
CR 22022442  [ E  E  -  E  E  G  G  E  ]             RES ffffffffffffffff
FPR00 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR04 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR08 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR12 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR16 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR20 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR24 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPR28 0000000000000000 0000000000000000 0000000000000000 0000000000000000
FPSCR 0000000000000000
/usr/share/lxc/templates/lxc-fedora: line 487: 26661 Aborted                 (core dumped) chroot . yum -y --nogpgcheck --installroot /run/install install python rpm yum

I've bisected until the commit:

    commit b68e60e6f0
    Author: Benjamin Herrenschmidt <benh@kernel.crashing.org>
    Date:   Tue May 3 18:03:33 2016 +0200

        ppc: Get out of emulation on SMT "OR" ops

        Otherwise tight loops at smt_low for example, which OPAL does,
        eat so much CPU that we can't boot a kernel anymore. With that,
        I can boot 8 CPUs just fine with powernv.

        Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
        Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
        Signed-off-by: David Gibson <david@gibson.dropbear.id.au>

We can fix that by preventing to send EXCP_HLT in the case of linux-user mode,
as the main loop doesn't know how to manage it.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-22 11:12:17 +10:00
Thomas Huth
0ccac16f59 tests: Use '+=' to add additional tests, not '='
The recent commit that added the prom-env-test accidentially
overwrote the check-qtest-ppc-y, check-qtest-ppc64-y and
check-qtest-sparc-y variables instead of extending them.

Fixes: fcbf4a3c0c
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-22 11:12:17 +10:00
Aneesh Kumar K.V
c117590769 powerpc/mm: Update the WIMG check during H_ENTER
Support for 0 value for memeory coherence is optional and with ppc64
we can always enable memory coherence. Linux kernel did that during
the development of 4.7 kernel. But that resulted in failure in Qemu
in H_ENTER hcall due to below check. The mentioned change was reverted
in the kernel and kernel right now enable memory coherence only if
cache inhibited is not set. Nevertheless update qemu WIMG flag check
to cover the case where we enable memory coherence along with cache
inhibited flag.

In order to handle older and newer kernel version consider both Cache
inhibitted and (cache inhibitted | memory conference) as valid values
for wimg flags.

Signed-off-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-22 11:12:17 +10:00
Peter Maydell
6f1d2d1c5a Merge remote-tracking branch 'remotes/stsquad/tags/pull-travis-20160621-1' into staging
This pull request contains:

  - disable sparse testing
  - add trusty build target
  - add libnfs-dev for NFS block driver

These are the same patches posted last week for any last minute review.

# gpg: Signature made Tue 21 Jun 2016 10:06:34 BST
# gpg:                using RSA key 0xFBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>"
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8  DF35 FBD0 DB09 5A9E 2A44

* remotes/stsquad/tags/pull-travis-20160621-1:
  .travis.yml: disable Sparse testing
  .travis.yml: add trusty GCE target
  .travis.yml: add libnfs-dev for NFS block driver

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-21 15:19:58 +01:00
Gerd Hoffmann
55543e7623 milkymist: fix tmu2.c build failure (missing error.h include)
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Michael Walle <michael@walle.cc>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-21 13:25:09 +01:00
Peter Maydell
728cc990f6 Merge remote-tracking branch 'remotes/mcayland/tags/qemu-sparc-signed' into staging
qemu-sparc update

# gpg: Signature made Mon 20 Jun 2016 21:55:23 BST
# gpg:                using RSA key 0x5BC2C56FAE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C  C9C4 5BC2 C56F AE0F 321F

* remotes/mcayland/tags/qemu-sparc-signed:
  MAINTAINERS: remove Blue Swirl as SPARC maintainer
  MAINTAINERS: add Artyom Tarasenko as SPARC maintainer

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-21 10:36:16 +01:00
Peter Maydell
b0ad00b8c9 Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Mon 20 Jun 2016 21:29:27 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/tracing-pull-request: (42 commits)
  trace: split out trace events for linux-user/ directory
  trace: split out trace events for qom/ directory
  trace: split out trace events for target-ppc/ directory
  trace: split out trace events for target-s390x/ directory
  trace: split out trace events for target-sparc/ directory
  trace: split out trace events for net/ directory
  trace: split out trace events for audio/ directory
  trace: split out trace events for ui/ directory
  trace: split out trace events for hw/alpha/ directory
  trace: split out trace events for hw/arm/ directory
  trace: split out trace events for hw/acpi/ directory
  trace: split out trace events for hw/vfio/ directory
  trace: split out trace events for hw/s390x/ directory
  trace: split out trace events for hw/pci/ directory
  trace: split out trace events for hw/ppc/ directory
  trace: split out trace events for hw/9pfs/ directory
  trace: split out trace events for hw/i386/ directory
  trace: split out trace events for hw/isa/ directory
  trace: split out trace events for hw/sd/ directory
  trace: split out trace events for hw/sparc/ directory
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-20 22:30:34 +01:00
Mark Cave-Ayland
3a978051b2 MAINTAINERS: remove Blue Swirl as SPARC maintainer
Blue is no longer active in the QEMU project, so remove him from the list of
SPARC maintainers.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
CC: Blue Swirl <blauwirbel@gmail.com>
2016-06-20 21:55:16 +01:00
Mark Cave-Ayland
2c742bf736 MAINTAINERS: add Artyom Tarasenko as SPARC maintainer
Artyom has been working on QEMU's SPARC emulation for several years, providing
initial support for Solaris under qemu-system-sparc and more recently bugfixes
for qemu-system-sparc64 and TCG patch reviews. As work progresses on improving
emulation for sun4u machines and beyond, Artyom has agreed to take on
co-maintainership of SPARC with a focus on 64-bit architecture.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Acked-by: Artyom Tarasenko <atar4qemu@gmail.com>
2016-06-20 21:55:16 +01:00
Peter Maydell
7e13ea57f4 Merge remote-tracking branch 'remotes/mwalle/tags/lm32-queue/20160620' into staging
lm32/milkymist: some qomifying

# gpg: Signature made Mon 20 Jun 2016 17:27:53 BST
# gpg:                using RSA key 0xB458ABB0D8D378E3
# gpg: Good signature from "Michael Walle <michael@walle.cc>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2190 3E48 4537 A7C2 90CE  3EB2 B458 ABB0 D8D3 78E3

* remotes/mwalle/tags/lm32-queue/20160620:
  milkymist: update specification URLs
  hw/intc: QOM'ify lm32_pic.c
  hw/display: QOM'ify milkymist-vgafb.c
  hw/display: QOM'ify milkymist-tmu2.c
  hw/timer: QOM'ify milkymist_sysctl
  hw/timer: QOM'ify lm32_timer

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-20 18:14:26 +01:00
Daniel P. Berrange
f52347d5b0 trace: split out trace events for linux-user/ directory
Move all trace-events for files in the linux-user/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-id: 1466066426-16657-41-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:17 +01:00
Daniel P. Berrange
38b1eedcdc trace: split out trace events for qom/ directory
Move all trace-events for files in the qom/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-40-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:17 +01:00
Daniel P. Berrange
6dfba0ef6e trace: split out trace events for target-ppc/ directory
Move all trace-events for files in the target-ppc/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1466066426-16657-39-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:17 +01:00
Daniel P. Berrange
a4e21b3e35 trace: split out trace events for target-s390x/ directory
Move all trace-events for files in the target-s390x/ directory to
their own file.

[Added missing newline in target-s390x/trace-events as suggested by
Cornelia Huck <cornelia.huck@de.ibm.com>.
--Stefan]

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Message-id: 1466066426-16657-38-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:17 +01:00
Daniel P. Berrange
1cba9b2942 trace: split out trace events for target-sparc/ directory
Move all trace-events for files in the target-sparc/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-37-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:17 +01:00
Daniel P. Berrange
f3b0163b18 trace: split out trace events for net/ directory
Move all trace-events for files in the net/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-36-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:17 +01:00
Daniel P. Berrange
66d7a36d75 trace: split out trace events for audio/ directory
Move all trace-events for files in the audio/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-35-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:17 +01:00
Daniel P. Berrange
357ac7f318 trace: split out trace events for ui/ directory
Move all trace-events for files in the ui/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-34-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:17 +01:00
Daniel P. Berrange
4f92ce1376 trace: split out trace events for hw/alpha/ directory
Move all trace-events for files in the hw/alpha/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-33-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:17 +01:00
Daniel P. Berrange
0b8276d644 trace: split out trace events for hw/arm/ directory
Move all trace-events for files in the hw/arm/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-32-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:16 +01:00
Daniel P. Berrange
65b5bd3be5 trace: split out trace events for hw/acpi/ directory
Move all trace-events for files in the hw/acpi/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-31-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:16 +01:00
Daniel P. Berrange
1cf6ebc73f trace: split out trace events for hw/vfio/ directory
Move all trace-events for files in the hw/vfio/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-30-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:16 +01:00
Daniel P. Berrange
80aa71e98d trace: split out trace events for hw/s390x/ directory
Move all trace-events for files in the hw/s390x/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Message-id: 1466066426-16657-29-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:16 +01:00
Daniel P. Berrange
fec28134e5 trace: split out trace events for hw/pci/ directory
Move all trace-events for files in the hw/pci/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-28-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:16 +01:00
Daniel P. Berrange
3054fba87b trace: split out trace events for hw/ppc/ directory
Move all trace-events for files in the hw/ppc/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1466066426-16657-27-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:16 +01:00
Daniel P. Berrange
d018a2e931 trace: split out trace events for hw/9pfs/ directory
Move all trace-events for files in the hw/9pfs/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-26-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:16 +01:00
Daniel P. Berrange
5eb76e480b trace: split out trace events for hw/i386/ directory
Move all trace-events for files in the hw/i386/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-25-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:16 +01:00
Daniel P. Berrange
2b785e3cbf trace: split out trace events for hw/isa/ directory
Move all trace-events for files in the hw/isa/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-24-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:16 +01:00
Daniel P. Berrange
1374aecc7c trace: split out trace events for hw/sd/ directory
Move all trace-events for files in the hw/sd/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-23-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:16 +01:00
Daniel P. Berrange
f0b9e35687 trace: split out trace events for hw/sparc/ directory
Move all trace-events for files in the hw/sparc/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-22-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:16 +01:00
Daniel P. Berrange
8101345076 trace: split out trace events for hw/dma/ directory
Move all trace-events for files in the hw/dma/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-21-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:16 +01:00
Daniel P. Berrange
c3e203f30d trace: split out trace events for hw/timer/ directory
Move all trace-events for files in the hw/timer/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-20-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:16 +01:00
Daniel P. Berrange
d1d5119864 trace: split out trace events for hw/input/ directory
Move all trace-events for files in the hw/input/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-19-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:15 +01:00
Daniel P. Berrange
14750ef1b5 trace: split out trace events for hw/display/ directory
Move all trace-events for files in the hw/display/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-18-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:15 +01:00
Daniel P. Berrange
ddc63e4556 trace: split out trace events for hw/nvram/ directory
Move all trace-events for files in the hw/nvram/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-17-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:15 +01:00
Daniel P. Berrange
de4291ca4a trace: split out trace events for hw/scsi/ directory
Move all trace-events for files in the hw/scsi/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-16-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:15 +01:00
Daniel P. Berrange
7da2981e59 trace: split out trace events for hw/usb/ directory
Move all trace-events for files in the hw/usb/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-15-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:15 +01:00
Daniel P. Berrange
6b5bacf6af trace: split out trace events for hw/misc/ directory
Move all trace-events for files in the hw/misc/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-14-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:15 +01:00
Daniel P. Berrange
92fe6aff87 trace: split out trace events for hw/audio/ directory
Move all trace-events for files in the hw/audio/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-13-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:15 +01:00
Daniel P. Berrange
270ab88f7c trace: split out trace events for hw/virtio/ directory
Move all trace-events for files in the hw/virtio/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-12-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:15 +01:00
Daniel P. Berrange
cd8c2fe77b trace: split out trace events for hw/net/ directory
Move all trace-events for files in the hw/net/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-11-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:15 +01:00
Daniel P. Berrange
aebd4d17dc trace: split out trace events for hw/intc/ directory
Move all trace-events for files in the hw/intc/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-10-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:15 +01:00
Daniel P. Berrange
732d83145e trace: split out trace events for hw/char/ directory
Move all trace-events for files in the hw/char/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-9-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:15 +01:00
Daniel P. Berrange
92d3265212 trace: split out trace events for hw/block/ directory
Move all trace-events for files in the hw/block/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-8-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:15 +01:00
Daniel P. Berrange
b54ca48e40 trace: split out trace events for block/ directory
Move all trace-events for files in the block/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-7-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:14 +01:00
Daniel P. Berrange
521d47c6dc trace: split out trace events for migration/ directory
Move all trace-events for files in the migration/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-6-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:14 +01:00
Daniel P. Berrange
892bd32ea3 trace: split out trace events for io/ directory
Move all trace-events for files in the io/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-5-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:14 +01:00
Daniel P. Berrange
8451f2f28d trace: split out trace events for crypto/ directory
Move all trace-events for files in the crypto/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-4-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:14 +01:00
Daniel P. Berrange
492bb2dd65 trace: split out trace events for util/ directory
Move all trace-events for files in the util/ directory to
their own file.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-3-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:14 +01:00
Daniel P. Berrange
1412cf58be trace: add build framework for merging trace-events files
Switch make rules over to use trace-events-all as the
master trace events input file. Add rule that will
construct trace-events-all from $(trace-events-y).

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1466066426-16657-2-git-send-email-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:22:14 +01:00
Lluís Vilanova
dcdaadb6ea trace: [all] Add "guest_mem_before" event
The event is described in "trace-events". Note that the "MO_AMASK" flag
is not traced, since it does not seem to affect the visible semantics of
instructions.

[s/inline inline/inline/ to fix clang build.
--Stefan]

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 146549350711.18437.726780393247474362.stgit@fimbulvetr.bsc.es
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 17:21:56 +01:00
Michael Walle
6dbbe24337 milkymist: update specification URLs
The old milkymist.org domain just forwards to mm-labs.hk nowadays. I've
created a mirror of the documents.

Signed-off-by: Michael Walle <michael@walle.cc>
2016-06-20 18:12:04 +02:00
xiaoqiang zhao
5e502d31db hw/intc: QOM'ify lm32_pic.c
Drop the old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Acked-by: Michael Walle <michael@walle.cc>
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Michael Walle <michael@walle.cc>
2016-06-20 18:12:04 +02:00
xiaoqiang zhao
165b244b98 hw/display: QOM'ify milkymist-vgafb.c
* Drop the old SysBus init function and use instance_init
* Move graphic_console_init into realize stage

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Michael Walle <michael@walle.cc>
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Michael Walle <michael@walle.cc>
2016-06-20 18:11:59 +02:00
xiaoqiang zhao
cf79c64d58 hw/display: QOM'ify milkymist-tmu2.c
* Drop the old SysBus init function and use instance_init
* Move tmu2_glx_init into realize stage

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Michael Walle <michael@walle.cc>
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Michael Walle <michael@walle.cc>
2016-06-20 18:11:06 +02:00
xiaoqiang zhao
596ca93386 hw/timer: QOM'ify milkymist_sysctl
* split the old SysBus init function into an instance_init
  and a Device realize function
* use DeviceClass::realize instead of SysBusDeviceClass::init

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Acked-by: Michael Walle <michael@walle.cc>
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Michael Walle <michael@walle.cc>
2016-06-20 18:09:53 +02:00
xiaoqiang zhao
a18eac523a hw/timer: QOM'ify lm32_timer
* split the old SysBus init function into an instance_init
  and a Device realize function
* use DeviceClass::realize instead of SysBusDeviceClass::init

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Acked-by: Michael Walle <michael@walle.cc>
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: Michael Walle <michael@walle.cc>
2016-06-20 18:09:53 +02:00
Peter Maydell
7fa124b273 Merge remote-tracking branch 'remotes/armbru/tags/pull-error-2016-06-20' into staging
Error reporting patches for 2016-06-20

# gpg: Signature made Mon 20 Jun 2016 15:56:15 BST
# gpg:                using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867  4E5F 3870 B400 EB91 8653

* remotes/armbru/tags/pull-error-2016-06-20:
  log: Fix qemu_set_log_filename() error handling
  log: Fix qemu_set_dfilter_ranges() error reporting
  log: Plug memory leak on multiple -dfilter
  coccinelle: Remove unnecessary variables for function return value
  error: Remove unnecessary local_err variables
  error: Remove NULL checks on error_propagate() calls
  vl: Error messages need to go to stderr, fix some

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-20 16:19:18 +01:00
Markus Armbruster
daa76aa416 log: Fix qemu_set_log_filename() error handling
When qemu_set_log_filename() detects an invalid file name, it reports
an error, closes the log file (if any), and starts logging to stderr
(unless daemonized or nothing is being logged).

This is wrong.  Asking for an invalid log file on the command line
should be fatal.  Asking for one in the monitor should fail without
messing up an existing logfile.

Fix by converting qemu_set_log_filename() to Error.  Pass it
&error_fatal, except for hmp_logfile report errors.

This also permits testing without a subprocess, so do that.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1466011636-6112-4-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-06-20 16:39:08 +02:00
Markus Armbruster
bd6fee9f12 log: Fix qemu_set_dfilter_ranges() error reporting
g_error() is not an acceptable way to report errors to the user:

    $ qemu-system-x86_64 -dfilter 1000+0

    ** (process:17187): ERROR **: Failed to parse range in: 1000+0
    Trace/breakpoint trap (core dumped)

g_assert() isn't, either:

    $ qemu-system-x86_64 -dfilter 1000x+64
    **
    ERROR:/work/armbru/qemu/util/log.c:180:qemu_set_dfilter_ranges: assertion failed: (e == range_op)
    Aborted (core dumped)

Convert qemu_set_dfilter_ranges() to Error.  Rework its deeply nested
control flow.  Touch up the error messages.  Call it with
&error_fatal.

This also permits testing without a subprocess, so do that.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1466011636-6112-3-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-06-20 16:38:31 +02:00
Markus Armbruster
2ec62faea2 log: Plug memory leak on multiple -dfilter
-dfilter overwrites any previous filter.  The overwritten filter is
leaked.  Leaks since the beginning (commit 3514552, v2.6.0).  Free it
properly.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1466011636-6112-2-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-06-20 16:38:14 +02:00
Eduardo Habkost
9be385980d coccinelle: Remove unnecessary variables for function return value
Use Coccinelle script to replace 'ret = E; return ret' with
'return E'. The script will do the substitution only when the
function return type and variable type are the same.

Manual fixups:

* audio/audio.c: coding style of "read (...)" and "write (...)"
* block/qcow2-cluster.c: wrap line to make it shorter
* block/qcow2-refcount.c: change indentation of wrapped line
* target-tricore/op_helper.c: fix coding style of
  "remainder|quotient"
* target-mips/dsp_helper.c: reverted changes because I don't
  want to argue about checkpatch.pl
* ui/qemu-pixman.c: fix line indentation
* block/rbd.c: restore blank line between declarations and
  statements

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1465855078-19435-4-git-send-email-ehabkost@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Unused Coccinelle rule name dropped along with a redundant comment;
whitespace touched up in block/qcow2-cluster.c; stale commit message
paragraph deleted]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-06-20 16:38:13 +02:00
Eduardo Habkost
6b62d96137 error: Remove unnecessary local_err variables
This patch simplifies code that uses a local_err variable just to
immediately use it for an error_propagate() call.

Coccinelle patch used to perform the changes added to
scripts/coccinelle/remove_local_err.cocci.

Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1465855078-19435-3-git-send-email-ehabkost@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Blank line in s390-virtio-ccw.c restored]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-06-20 16:38:13 +02:00
Eduardo Habkost
621ff94d50 error: Remove NULL checks on error_propagate() calls
error_propagate() already ignores local_err==NULL, so there's no
need to check it before calling.

Coccinelle patch used to perform the changes added to
scripts/coccinelle/error_propagate_null.cocci.

Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1465855078-19435-2-git-send-email-ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-06-20 16:38:13 +02:00
Markus Armbruster
da002526ac vl: Error messages need to go to stderr, fix some
We print a few fatal error messages to stdout instead of stderr.
Reproducer:

    $ qemu-system-x86_64 -g 1024x768
    Option g not supported for this target
    $ qemu-system-x86_64 -g 1024x768 >/dev/null

Fix by printing them with error_report().  This also improves the messages.
The above one becomes

    qemu-system-x86_64: -g 1024x768: Option not supported for this target

Reported-by: Tobi {github.com/tobimensch}
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1464683498-28779-1-git-send-email-armbru@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-06-20 16:36:29 +02:00
Lluís Vilanova
7c2550432a exec: [tcg] Track which vCPU is performing translation and execution
Information is tracked inside the TCGContext structure, and later used
by tracing events with the 'tcg' and 'vcpu' properties.

The 'cpu' field is used to check tracing of translation-time
events ("*_trans"). The 'tcg_env' field is used to pass it to
execution-time events ("*_exec").

Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 146549350162.18437.3033661139638458143.stgit@fimbulvetr.bsc.es
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 15:30:01 +01:00
Peter Maydell
fd2590bccc Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Mon 20 Jun 2016 15:05:24 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35  775A 9CA4 ABB3 81AB 73C8

* remotes/stefanha/tags/block-pull-request:
  backup: follow AioContext change gracefully
  mirror: follow AioContext change gracefully
  blockjob: add AioContext attached callback
  block: use safe iteration over AioContext notifiers
  blockjob: add block_job_get_aio_context()
  blockjob: add pause points
  blockjob: rename block_job_is_paused()
  blockjob: move iostatus reset out of block_job_enter()
  block: process before_write_notifiers in bdrv_co_discard
  block: fix race in bdrv_co_discard with drive-mirror
  block: fixed BdrvTrackedRequest filling in bdrv_co_discard
  libqos: add qvirtqueue_cleanup()
  libqos: drop duplicated virtio_pci.h definitions
  libqos: drop duplicated virtio_scsi.h definitions
  libqos: drop duplicated virtio_blk.h definitions
  libqos: drop duplicated virtio_vring.h structs
  libqos: drop duplicated virtio_ring.h bit definitions
  libqos: drop duplicated virtio_config.h definitions
  libqos: drop duplicated PCI vendor ID definition
  libqos: use virtio_ids.h for device ID definitions

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-20 15:07:56 +01:00
Stefan Hajnoczi
5ab4b69ce2 backup: follow AioContext change gracefully
Move s->target to the new AioContext when there is an AioContext change.

The backup_run() coroutine does not use asynchronous I/O so there is no
need to wait for in-flight requests in a BlockJobDriver->pause()
callback.

Guest writes are intercepted by the backup job.  Treat them as guest
activity and do it even while the job is paused.  This is necessary
since the only alternative would be to fail a job that experienced guest
writes during pause once the job is resumed.  In practice the guest
writes don't interfere with AioContext switching since bdrv_drain() is
used by bdrv_set_aio_context().

Loops already contain pause points because of block_job_sleep_ns() calls
in the yield_and_check() helper function.  It is necessary to convert a
raw qemu_coroutine_yield() to block_job_yield() so the
MIRROR_SYNC_MODE_NONE case can pause.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466096189-6477-9-git-send-email-stefanha@redhat.com
2016-06-20 14:25:41 +01:00
Stefan Hajnoczi
565ac01f8d mirror: follow AioContext change gracefully
Add block_job_pause_point() calls to mark quiescent points and make sure
to complete in-flight requests when switching AioContexts.

This patch solves undefined behavior in the mirror block job when the
BDS AioContext is changed by dataplane.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466096189-6477-8-git-send-email-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 14:25:41 +01:00
Stefan Hajnoczi
463e0be101 blockjob: add AioContext attached callback
Block jobs that use additional BDSes or event loop resources need a
callback to get their affairs in order when the AioContext is switched.

Simple block jobs don't need an attach callback, they automatically work
thanks to the generic attach/detach notifiers that this patch adds.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466096189-6477-7-git-send-email-stefanha@redhat.com
2016-06-20 14:25:41 +01:00
Stefan Hajnoczi
e8a095dadb block: use safe iteration over AioContext notifiers
It's possible that an AioContext notifier user was close to finishing
when .detach_aio_context() or .attached_aio_context() is called.  In
that case they may call bdrv_remove_aio_context_notifier() during the
callback.

Use safe iteration to avoid crashing when the notifier list is modified
during iteration.  We must not only handle the case where the current
aio notifier is removed during a callback but also the one where any
other aio notifier is removed.

The next patch adds an AioContext notifier for block jobs and they
really could be terminating just as .detach_aio_context() is invoked.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466096189-6477-6-git-send-email-stefanha@redhat.com
2016-06-20 14:25:41 +01:00
Stefan Hajnoczi
9f6bc648c4 blockjob: add block_job_get_aio_context()
Add a helper function to document why block jobs sometimes run in the
QEMU main loop and to avoid code duplication in a following patch.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466096189-6477-5-git-send-email-stefanha@redhat.com
2016-06-20 14:25:41 +01:00
Stefan Hajnoczi
fc9c0a9c4b blockjob: add pause points
Block jobs are coroutines that usually perform I/O but sometimes also
sleep or yield.  Currently only sleeping or yielded block jobs can be
paused.  This means jobs that do not sleep or yield (using
block_job_yield()) are unaffected by block_job_pause().

Add block_job_pause_point() so that block jobs can mark quiescent points
that are suitable for pausing.  This solves the problem that it can take
a block job a long time to pause if it is performing a long series of
I/O operations.

Transitioning to paused state involves a .pause()/.resume() callback.
These callbacks are used to ensure that I/O and event loop activity has
ceased while the job is at a pause point.

Note that this patch introduces a stricter pause state than previously.
The job->busy flag was incorrectly documented as a quiescent state
without I/O pending.  This is violated by any job that has I/O pending
across sleep or block_job_yield(), like the mirror block job.

[Add missing block_job_should_pause() check to avoid deadlock after
job->driver->pause() in block_job_pause_point().
--Stefan]

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466096189-6477-4-git-send-email-stefanha@redhat.com
2016-06-20 14:25:36 +01:00
Peter Maydell
5edbd4e304 Merge remote-tracking branch 'remotes/otubo/tags/pull-seccomp-20160620' into staging
seccomp branch queue

# gpg: Signature made Mon 20 Jun 2016 10:06:59 BST
# gpg:                using RSA key 0xFD0CFF5B12F8BD2F
# gpg: Good signature from "Eduardo Otubo (Software Engineer @ ProfitBricks) <eduardo.otubo@profitbricks.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 1C96 46B6 E1D1 C38A F2EC  3FDE FD0C FF5B 12F8 BD2F

* remotes/otubo/tags/pull-seccomp-20160620:
  seccomp: Add support for ppc/ppc64

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-20 12:53:35 +01:00
Stefan Hajnoczi
a7f3b7ff03 blockjob: rename block_job_is_paused()
The block_job_is_paused() function name is not great because callers
only use it to determine whether pausing has been requested.  Rename it
to highlight those semantics and remove it from the public header file
as there are no external callers.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1466096189-6477-3-git-send-email-stefanha@redhat.com
2016-06-20 11:44:12 +01:00
Stefan Hajnoczi
17bd51f936 blockjob: move iostatus reset out of block_job_enter()
The QMP block-job-resume command and cancellation may want to reset the
job's iostatus.  The next patches add a user who does not want to reset
iostatus so move it up to block_job_enter() callers.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1466096189-6477-2-git-send-email-stefanha@redhat.com
2016-06-20 11:44:12 +01:00
Denis V. Lunev
ec050f77a5 block: process before_write_notifiers in bdrv_co_discard
This is mandatory for correct backup creation. In the other case the
content under this area would be lost.

Dirty bits are set exactly like in bdrv_aligned_pwritev, i.e. they are set
even if notifier has returned a error.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy<vsementsov@virtuozzo.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1466093381-6120-4-git-send-email-den@openvz.org
CC: Fam Zheng <famz@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 11:44:12 +01:00
Denis V. Lunev
968d8b0627 block: fix race in bdrv_co_discard with drive-mirror
Actually we must set dirty bitmap dirty after we have written all our
zeroes for correct processing in drive mirror code. In the other case
we can face not zeroes in this area in mirror_iteration.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy<vsementsov@virtuozzo.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1466093381-6120-3-git-send-email-den@openvz.org
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 11:44:12 +01:00
Denis V. Lunev
3a36e474f2 block: fixed BdrvTrackedRequest filling in bdrv_co_discard
The request area is specified in bytes, not in sectors.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy<vsementsov@virtuozzo.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1466093381-6120-2-git-send-email-den@openvz.org
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 11:44:12 +01:00
Stefan Hajnoczi
f1d3b99154 libqos: add qvirtqueue_cleanup()
qvirtqueue_setup() allocates the vring and virtqueue state.  So far
there has been no function to free it.  Callers have been using
guest_free() for the vring but forgot to free the QVirtQueue state.

This patch solves the memory leak by introducing qvirtqueue_cleanup().

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-20 11:44:12 +01:00
Stefan Hajnoczi
c75f4c061b libqos: drop duplicated virtio_pci.h definitions
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1462798061-30382-9-git-send-email-stefanha@redhat.com
2016-06-20 11:44:12 +01:00
Stefan Hajnoczi
74f079a7ee libqos: drop duplicated virtio_scsi.h definitions
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1462798061-30382-8-git-send-email-stefanha@redhat.com
2016-06-20 11:44:12 +01:00
Stefan Hajnoczi
4565a3e029 libqos: drop duplicated virtio_blk.h definitions
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1462798061-30382-7-git-send-email-stefanha@redhat.com
2016-06-20 11:44:12 +01:00
Stefan Hajnoczi
780b11a097 libqos: drop duplicated virtio_vring.h structs
The descriptor element, used, and avail vring structs are defined in
virtio_ring.h.  There is no need to duplicate them in libqos virtio.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1462798061-30382-6-git-send-email-stefanha@redhat.com
2016-06-20 11:44:12 +01:00
Stefan Hajnoczi
ee3b850a70 libqos: drop duplicated virtio_ring.h bit definitions
Note that virtio_ring.h defines feature bits using their bit number:

  #define VIRTIO_RING_F_INDIRECT_DESC     28

On the other hand libqos virtio.h uses the bit mask:

  #define QVIRTIO_F_RING_INDIRECT_DESC    0x10000000

The patch makes the necessary adjustments.

I have used "1u << BITMASK" instead of "1ULL << BITMASK" because the
64-bit feature fields are not implemented in libqos virtio.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1462798061-30382-5-git-send-email-stefanha@redhat.com
2016-06-20 11:44:12 +01:00
Stefan Hajnoczi
1373a4c256 libqos: drop duplicated virtio_config.h definitions
Note that VIRTIO_F_ANY_LAYOUT and VIRTIO_F_NOTIFY_ON_EMPTY are bit
numbers in virtio_config.h but bit masks in qtest virtio.h.  Therefore
it's necessary to change users from X to (1u << X).

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1462798061-30382-4-git-send-email-stefanha@redhat.com
2016-06-20 11:44:12 +01:00
Stefan Hajnoczi
7ad1e708e6 libqos: drop duplicated PCI vendor ID definition
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1462798061-30382-3-git-send-email-stefanha@redhat.com
2016-06-20 11:44:11 +01:00
Stefan Hajnoczi
8ac9e205bd libqos: use virtio_ids.h for device ID definitions
Avoid redefining device IDs.  Use the standard Linux headers that are
already in the source tree.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1462798061-30382-2-git-send-email-stefanha@redhat.com
2016-06-20 11:44:11 +01:00
Peter Maydell
b1e3493b25 hw/intc/arm_gicv3: Fix compilation with simple trace backend
Fix missing includes of qemu/log.h, which broke compilation with the
simple trace backend (the default backend pulls in log.h implicitly
via trace.h).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Tested-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-id: 1466416634-9798-1-git-send-email-peter.maydell@linaro.org
2016-06-20 11:35:15 +01:00
Michael Strosaker
3e68445503 seccomp: Add support for ppc/ppc64
Support for ppc/ppc64 is official in libseccomp 2.3.0, so modify the
configuration script to allow qemuu to enable seccomp for those platforms.

Signed-off-by: Michael Strosaker <strosake@linux.vnet.ibm.com>
2016-06-20 11:04:09 +02:00
Peter Maydell
482b61844a Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20160617' into staging
target-arm queue:
 * GICv3 emulation

# gpg: Signature made Fri 17 Jun 2016 15:24:28 BST
# gpg:                using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20160617: (22 commits)
  ACPI: ARM: Present GIC version in MADT table
  hw/timer: Add value matching support to aspeed_timer
  target-arm/monitor.c: Advertise emulated GICv3 in capabilities
  target-arm/machine.c: Allow user to request GICv3 emulation
  hw/intc/arm_gicv3: Add IRQ handling CPU interface registers
  hw/intc/arm_gicv3: Implement CPU i/f SGI generation registers
  hw/intc/arm_gicv3: Implement gicv3_cpuif_update()
  hw/intc/arm_gicv3: Implement GICv3 CPU interface registers
  hw/intc/arm_gicv3: Implement gicv3_set_irq()
  hw/intc/arm_gicv3: Wire up distributor and redistributor MMIO regions
  hw/intc/arm_gicv3: Implement GICv3 redistributor registers
  hw/intc/arm_gicv3: Implement GICv3 distributor registers
  hw/intc/arm_gicv3: Implement functions to identify next pending irq
  hw/intc/arm_gicv3: ARM GICv3 device framework
  hw/intc/arm_gicv3: Add vmstate descriptors
  hw/intc/arm_gicv3: Move irq lines into GICv3CPUState structure
  hw/intc/arm_gicv3: Add state information
  target-arm: Add mp-affinity property for ARM CPU class
  target-arm: Provide hook to tell GICv3 about changes of security state
  target-arm: Define new arm_is_el3_or_mon() function
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-17 16:16:37 +01:00
Peter Maydell
da838dfc40 Merge remote-tracking branch 'remotes/ehabkost/tags/machine-pull-request' into staging
Machine queue, 2016-06-17

# gpg: Signature made Fri 17 Jun 2016 14:45:48 BST
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/machine-pull-request:
  vnc: Wrap vnc initialization code with CONFIG_VNC
  qdev: Use GList for global properties

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-17 15:31:27 +01:00
Shannon Zhao
f06765a94a ACPI: ARM: Present GIC version in MADT table
In ACPI 5.1 Errata, it adds GIC version in GIC distributor structure.
This is useful for guest kernel to identify which version GIC hardware
is. Update GIC distributor structure and present GIC version in MADT
table.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1465960955-17388-1-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-17 15:23:51 +01:00
Andrew Jeffery
1d3e65aa7a hw/timer: Add value matching support to aspeed_timer
Value matching allows Linux to boot with CONFIG_NO_HZ_IDLE=y on the
palmetto-bmc machine. Two match registers are provided for each timer.

Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1465974248-20434-1-git-send-email-andrew@aj.id.au
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-17 15:23:51 +01:00
Peter Maydell
3b1a222501 target-arm/monitor.c: Advertise emulated GICv3 in capabilities
Now we have an emulated GICv3 we should advertise it via the
capabilities in the monitor protocol.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1465915112-29272-21-git-send-email-peter.maydell@linaro.org
2016-06-17 15:23:51 +01:00
Peter Maydell
5de8229db5 target-arm/machine.c: Allow user to request GICv3 emulation
Now we have an emulated GICv3, remove the restriction in
gicv3_class_name() so that the user can request a GICv3 with
-machine gic-version=3 even when not using KVM.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1465915112-29272-20-git-send-email-peter.maydell@linaro.org
2016-06-17 15:23:51 +01:00
Peter Maydell
227a865366 hw/intc/arm_gicv3: Add IRQ handling CPU interface registers
Add the CPU interface registers which deal with acknowledging
and dismissing interrupts.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1465915112-29272-19-git-send-email-peter.maydell@linaro.org
2016-06-17 15:23:51 +01:00
Peter Maydell
b1a0eb777d hw/intc/arm_gicv3: Implement CPU i/f SGI generation registers
Implement the registers in the GICv3 CPU interface which generate
new SGI interrupts.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1465915112-29272-18-git-send-email-peter.maydell@linaro.org
2016-06-17 15:23:51 +01:00
Peter Maydell
f7b9358e2c hw/intc/arm_gicv3: Implement gicv3_cpuif_update()
Implement the gicv3_cpuif_update() function which deals with correctly
asserting IRQ and FIQ based on the current running priority of the CPU,
the priority of the highest priority pending interrupt and the CPU's
current exception level and security state.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1465915112-29272-17-git-send-email-peter.maydell@linaro.org
2016-06-17 15:23:51 +01:00
Peter Maydell
359fbe65e0 hw/intc/arm_gicv3: Implement GICv3 CPU interface registers
Implement the CPU interface registers for the GICv3; these are
CPU system registers, not MMIO registers.

This commit implements all the registers which are simple
accessors for GIC state, but not those which act as interfaces
for acknowledging, dismissing or generating interrupts. (Those
will be added in a later commit.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1465915112-29272-16-git-send-email-peter.maydell@linaro.org
2016-06-17 15:23:51 +01:00
Peter Maydell
c84428b33f hw/intc/arm_gicv3: Implement gicv3_set_irq()
Implement the code which updates the GIC state when an interrupt
input into the GIC is asserted.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1465915112-29272-15-git-send-email-peter.maydell@linaro.org
2016-06-17 15:23:51 +01:00
Peter Maydell
287c181ae4 hw/intc/arm_gicv3: Wire up distributor and redistributor MMIO regions
Wire up the MMIO functions exposed by the distributor and the
redistributor into MMIO regions exposed by the GICv3 device.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1465915112-29272-14-git-send-email-peter.maydell@linaro.org
2016-06-17 15:23:51 +01:00
Shlomo Pongratz
cec93a938a hw/intc/arm_gicv3: Implement GICv3 redistributor registers
Implement the redistributor registers of a GICv3.

Signed-off-by: Shlomo Pongratz <shlomo.pongratz@huawei.com>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1465915112-29272-13-git-send-email-peter.maydell@linaro.org
[PMM: significantly overhauled/rewritten:
 * use the new data structures
 * restructure register read/write to handle different width accesses
   natively, since almost all registers are 32-bit only, rather
   than implementing everything as byte accesses
 * implemented security extension support
]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-17 15:23:51 +01:00
Shlomo Pongratz
e52af51340 hw/intc/arm_gicv3: Implement GICv3 distributor registers
Implement the distributor registers of a GICv3.

Signed-off-by: Shlomo Pongratz <shlomo.pongratz@huawei.com>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1465915112-29272-12-git-send-email-peter.maydell@linaro.org
[PMM: significantly overhauled/rewritten:
 * use the new bitmap data structures
 * restructure register read/write to handle different width accesses
   natively, since almost all registers are 32-bit only, rather
   than implementing everything as byte accesses
 * implemented security extension support
]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-17 15:23:51 +01:00
Peter Maydell
ce187c3c15 hw/intc/arm_gicv3: Implement functions to identify next pending irq
Implement the GICv3 logic to recalculate the highest priority pending
interrupt for each CPU after some part of the GIC state has changed.
We avoid unnecessary full recalculation where possible.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1465915112-29272-11-git-send-email-peter.maydell@linaro.org
2016-06-17 15:23:51 +01:00
Shlomo Pongratz
56992670a4 hw/intc/arm_gicv3: ARM GICv3 device framework
This patch includes the device class itself, some ID register
value functions which will be needed by both distributor
and redistributor, and some skeleton functions for handling
interrupts coming in and going out, which will be filled in
in a subsequent patch.

Signed-off-by: Shlomo Pongratz <shlomo.pongratz@huawei.com>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1465915112-29272-10-git-send-email-peter.maydell@linaro.org
[PMM: pulled this patch earlier in the sequence, and left
 some code out of it for a later patch]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
2016-06-17 15:23:51 +01:00
Pavel Fedin
757caeed76 hw/intc/arm_gicv3: Add vmstate descriptors
Add state structure descriptors for the GICv3 state. We mark
the KVM GICv3 device as having a migration blocker until the
code to save and restore the state in the kernel is implemented.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1465915112-29272-9-git-send-email-peter.maydell@linaro.org
[PMM: Adjust to renamed struct fields; switched to using uint32_t
 array backed bitmaps; add migration blocker setting]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-17 15:23:51 +01:00
Peter Maydell
3faf2b0cd5 hw/intc/arm_gicv3: Move irq lines into GICv3CPUState structure
Move the GICv3 parent_irq and parent_fiq pointers into the
GICv3CPUState structure rather than giving them their own array.
This will make it easy to assert the IRQ and FIQ lines for a
particular CPU interface without having to know or calculate
the CPU index for the GICv3CPUState we are working on.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1465915112-29272-8-git-send-email-peter.maydell@linaro.org
2016-06-17 15:23:51 +01:00
Pavel Fedin
07e2034d08 hw/intc/arm_gicv3: Add state information
Add state information to GICv3 object structure and implement
arm_gicv3_common_reset().

This commit includes accessor functions for the fields which are
stored as bitmaps in uint32_t arrays.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1465915112-29272-7-git-send-email-peter.maydell@linaro.org
[PMM: significantly overhauled:
 * Add missing qom/cpu.h include
 * Remove legacy-only state fields (we can add them later if/when we add
   legacy emulation)
 * Use arrays of uint32_t to store the various distributor bitmaps,
   and provide accessor functions for the various set/test/etc operations
 * Add various missing register offset #defines
 * Accessor macros which combine distributor and redistributor behaviour
   removed
 * Fields in state structures renamed to match architectural register names
 * Corrected the reset value for GICR_IENABLER0 since we don't support
   legacy mode
 * Added ARM_LINUX_BOOT_IF interface for "we are directly booting a kernel in
   non-secure" so that we can fake up the firmware-mandated reconfiguration
   only when we need it
]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
2016-06-17 15:23:51 +01:00
Pavel Fedin
15a21fe028 target-arm: Add mp-affinity property for ARM CPU class
This allows to override default affinity IDs on a per-machine basis, and
possibility to retrieve IDs will be used by vGICv3 live migration code.

Signed-off-by: Pavel Fedin <p.fedin@samsung.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1465915112-29272-6-git-send-email-peter.maydell@linaro.org
2016-06-17 15:23:51 +01:00
Peter Maydell
bd7d00fc50 target-arm: Provide hook to tell GICv3 about changes of security state
The GICv3 CPU interface needs to know when the CPU it is attached
to makes an exception level or mode transition that changes the
security state, because whether it is asserting IRQ or FIQ can change
depending on these things. Provide a mechanism for letting the GICv3
device register a hook to be called on such changes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1465915112-29272-5-git-send-email-peter.maydell@linaro.org
2016-06-17 15:23:51 +01:00
Peter Maydell
712058764d target-arm: Define new arm_is_el3_or_mon() function
The GICv3 system registers need to know if the CPU is AArch64
in EL3 or AArch32 in Monitor mode. This happens to be the first
part of the check for arm_is_secure(), so factor it out into a
new arm_is_el3_or_mon() function that the GIC can also use.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1465915112-29272-4-git-send-email-peter.maydell@linaro.org
2016-06-17 15:23:51 +01:00
Peter Maydell
b355438de5 bitops.h: Implement half-shuffle and half-unshuffle ops
A half-shuffle operation takes a word with zeros in the high half:
 0000 0000 0000 0000 ABCD EFGH IJKL MNOP
and spreads the bits out so they are in every other bit of the word:
 0A0B 0C0D 0E0F 0G0H 0I0J 0K0L 0M0N 0O0P
A half-unshuffle performs the reverse operation.

Provide functions in bitops.h which implement these operations
for 32-bit and 64-bit inputs, and add tests for them.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1465915112-29272-3-git-send-email-peter.maydell@linaro.org
2016-06-17 15:23:51 +01:00
Peter Maydell
04716bc8fd migration: Define VMSTATE_UINT64_2DARRAY
Define a VMSTATE_UINT64_2DARRAY macro, to go with the ones we
already have for other type sizes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Tested-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1465915112-29272-2-git-send-email-peter.maydell@linaro.org
2016-06-17 15:23:51 +01:00
Peter Maydell
d121fcdf45 nbd/client.c: Correct trace format string
The trace format string in nbd_send_request uses PRIu16 for
request->type, but request->type is a uint32_t. This provokes
compiler warnings on the OSX clang. Use PRIu32 instead.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1466167331-17063-1-git-send-email-peter.maydell@linaro.org
2016-06-17 15:05:55 +01:00
Chao Peng
a663fbd9e2 vnc: Wrap vnc initialization code with CONFIG_VNC
commit f8c75b2486 (vnc: Initialization stubs) removed CONFIG_VNC in vl.c
code. However qemu_find_opts("vnc") is NULL when vnc is configured out.
Crash will happen in qemu_opts_foreach() before stub vnc_init_func() is
called. This patch add it back.

Cc: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-06-17 10:42:21 -03:00
Eduardo Habkost
f9a8b5530d qdev: Use GList for global properties
If the same GlobalProperty struct is registered twice, the list
entry gets corrupted, making tqe_next points to itself, and
qdev_prop_set_globals() gets stuck in a loop. The bug can be
easily reproduced by running:

  $ qemu-system-x86_64 -rtc-td-hack -rtc-td-hack

Change global_props to use GList instead of queue.h, making the
code simpler and able to deal with properties being registered
twice.

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-06-17 10:42:21 -03:00
Peter Maydell
98b5b7422f Merge remote-tracking branch 'remotes/amit-migration/tags/migration-for-2.7-5' into staging
Migration:

 - many compression/decompression fixes
 - trace improvements
 - static checker fix for detecting size mismatch in unused fields
 - fix VM save after snapshot

# gpg: Signature made Fri 17 Jun 2016 13:59:44 BST
# gpg:                using RSA key 0xEB0B4DFC657EF670
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg:                 aka "Amit Shah <amit@kernel.org>"
# gpg:                 aka "Amit Shah <amitshah@gmx.net>"
# Primary key fingerprint: 48CA 3722 5FE7 F4A8 B337  2735 1E9A 3B5F 8540 83B6
#      Subkey fingerprint: CC63 D332 AB8F 4617 4529  6534 EB0B 4DFC 657E F670

* remotes/amit-migration/tags/migration-for-2.7-5:
  vmstate-static-checker: fix size mismatch detection in unused fields
  migration: code clean up
  migration: refine the decompression code
  migration: refine the compression code
  migration: protect the quit flag by lock
  migration: refine ram_save_compressed_page
  qemu-file: Fix qemu_put_compression_data flaw
  migration: remove useless code
  migration: Fix a potential issue
  migration: Fix multi-thread compression bug
  migration: fix inability to save VM after snapshot
  migration: Trace improvements
  migration: Don't use *_to_cpup() and cpu_to_*w()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-17 14:09:46 +01:00
Amit Shah
0794d8895e vmstate-static-checker: fix size mismatch detection in unused fields
If a field changed from something to unused, the checker wasn't flagging
if the field size mismatched.  This was noticed in:

http://thread.gmane.org/gmane.comp.emulators.qemu/419802

where the 4->1 size change along with field name change to 'unused'
wasn't being flagged.  Fix this.

Signed-off-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <d7ec03a9b2edfa0616764887a51ba8f64fdd3f68.1466165736.git.amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-17 18:24:33 +05:30
Liang Li
0d9f9a5c52 migration: code clean up
Use 'QemuMutex comp_done_lock' and 'QemuCond comp_done_cond' instead
of 'QemuMutex *comp_done_lock' and 'QemuCond comp_done_cond'. To keep
consistent with 'QemuMutex decomp_done_lock' and
'QemuCond comp_done_cond'.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Message-Id: <1462433579-13691-10-git-send-email-liang.z.li@intel.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-17 18:24:31 +05:30
Liang Li
33d151f418 migration: refine the decompression code
The current code for multi-thread decompression is not clear,
especially in the aspect of using lock. Refine the code
to make it clear.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Message-Id: <1462433579-13691-9-git-send-email-liang.z.li@intel.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-17 18:24:28 +05:30
Liang Li
a7a9a88f9d migration: refine the compression code
The current code for multi-thread compression is not clear,
especially in the aspect of using lock. Refine the code
to make it clear.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Message-Id: <1462433579-13691-8-git-send-email-liang.z.li@intel.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-17 18:24:26 +05:30
Liang Li
90e56fb46d migration: protect the quit flag by lock
quit_comp_thread and quit_decomp_thread are accessed by several
thread, it's better to protect them with locks. We use a per
thread flag to replace the global one, and the new flag is protected
by a lock.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Message-Id: <1462433579-13691-7-git-send-email-liang.z.li@intel.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-17 18:24:23 +05:30
Liang Li
fc50438ed0 migration: refine ram_save_compressed_page
Use qemu_put_compression_data to do the compression directly
instead of using do_compress_ram_page, avoid some data copy.
very small improvement, at the same time, add code to check
if the compression is successful.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Message-Id: <1462433579-13691-6-git-send-email-liang.z.li@intel.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-17 18:24:21 +05:30
Liang Li
b3be28969b qemu-file: Fix qemu_put_compression_data flaw
Current qemu_put_compression_data can only work with no writable
QEMUFile, and can't work with the writable QEMUFile. But it does
not provide any measure to prevent users from using it with a
writable QEMUFile.

We should fix this flaw to make it works with writable QEMUFile.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Suggested-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1462433579-13691-5-git-send-email-liang.z.li@intel.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-17 18:24:18 +05:30
Liang Li
e7bb92e21a migration: remove useless code
page_buffer is set twice repeatedly, remove the previous set.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1462433579-13691-4-git-send-email-liang.z.li@intel.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-17 18:24:14 +05:30
Liang Li
5533b2e9bc migration: Fix a potential issue
At the end of live migration and before vm_start() on the destination
side, we should make sure all the decompression tasks are finished, if
this can not be guaranteed, the VM may get the incorrect memory data,
or the updated memory may be overwritten by the decompression thread.
Add the code to fix this potential issue.

Suggested-by: David Alan Gilbert <dgilbert@redhat.com>
Suggested-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Liang Li <liang.z.li@intel.com>
Message-Id: <1462433579-13691-3-git-send-email-liang.z.li@intel.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-17 18:24:09 +05:30
Liang Li
73a8912b8a migration: Fix multi-thread compression bug
Recently, a bug related to multiple thread compression feature for
live migration is reported. The destination side will be blocked
during live migration if there are heavy workload in host and
memory intensive workload in guest, this is most likely to happen
when there is one decompression thread.

Some parts of the decompression code are incorrect:
1. The main thread receives data from source side will enter a busy
loop to wait for a free decompression thread.
2. A lock is needed to protect the decomp_param[idx]->start, because
it is checked in the main thread and is updated in the decompression
thread.

Fix these two issues by following the code pattern for compression.

Signed-off-by: Liang Li <liang.z.li@intel.com>
Reported-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Tested-by: Daniel P. Berrange <berrange@redhat.com>

Signed-off-by: Liang Li <liang.z.li@intel.com>
Message-Id: <1462433579-13691-2-git-send-email-liang.z.li@intel.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-17 18:24:01 +05:30
Denis V. Lunev
6dcf66681a migration: fix inability to save VM after snapshot
The following sequence of operations fails:
    virsh start vm
    virsh snapshot-create vm
    virshh save vm --file file
with the following error
    error: Failed to save domain vm to file
    error: internal error: unable to execute QEMU command 'migrate':
    There's a migration process in progress

The problem is that qemu_savevm_state() calls migrate_init() which sets
migration state to MIGRATION_STATUS_SETUP and never cleaned it up.
This patch do the job.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Amit Shah <amit.shah@redhat.com>
Message-Id: <1466003203-26263-1-git-send-email-den@openvz.org>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-17 18:23:57 +05:30
Dr. David Alan Gilbert
023ad1a6e9 migration: Trace improvements
A couple of improvements to tracing that have come out of helping
people with migration problems:
  * vmstate_n_elems trace the count/name - for when you have problems
    getting array counts right
  * vmstate_subsection_load_bad - add the idstr, for when you receive a
    subsection you weren't expecting.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1465896986-16132-1-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-17 18:23:53 +05:30
Peter Maydell
4d88513157 migration: Don't use *_to_cpup() and cpu_to_*w()
The *_to_cpup() and cpu_to_*w() functions just compose a pointer
dereference with a byteswap. Instead use ld*_p() and st*_p(),
which handle potential pointer misalignment and avoid the need
to cast the pointer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Message-Id: <1465574962-2710-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-17 18:23:49 +05:30
Peter Maydell
4acc8fdfd3 Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160617' into staging
ppc patch queue for 2016-06-17

Here's the current accumulated set of spapr, ppc and related patches.
  * The big thing in here is CPU hotplug for spapr
    - This includes a number of acked generic changes adding new
      infrastructure for hotplugging cpu cores
  * A number of TCG bug fixes are also included
  * This adds a new testcase to make it harder to accidentally break
    Macintosh (and other openbios) platforms

# gpg: Signature made Fri 17 Jun 2016 07:35:29 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.7-20160617:
  spapr: implement query-hotpluggable-cpus callback
  hmp: Add 'info hotpluggable-cpus' HMP command
  QMP: Add query-hotpluggable-cpus
  spapr: CPU hot unplug support
  spapr: CPU hotplug support
  spapr: convert boot CPUs into CPU core devices
  spapr: Move spapr_cpu_init() to spapr_cpu_core.c
  spapr: Abstract CPU core device and type specific core devices
  qom: API to get instance_size of a type
  spapr_drc: Prevent detach racing against attach for CPU DR
  xics,xics_kvm: Handle CPU unplug correctly
  cpu: Abstract CPU core type
  qdev: hotplug: Introduce HotplugHandler.pre_plug() callback
  target-ppc: Fix rlwimi, rlwinm, rlwnm
  vfio: Fix broken EEH
  target-ppc: Bug in BookE wait instruction
  ppc / sparc: Add a tester for checking whether OpenBIOS runs successfully
  hw/ppc/spapr: Silence deprecation message in qtest mode

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-17 12:36:27 +01:00
Peter Maydell
7263a903c3 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, pci, virtio: new features, cleanups, fixes

Beginning of reconnect support for vhost-user.
Misc cleanups and fixes.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Fri 17 Jun 2016 01:28:39 BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17  0970 C350 3912 AFBE 8E67
#      Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA  8A0D 281F 0DB8 D28D 5469

* remotes/mst/tags/for_upstream:
  MAINTAINERS: add Marcel to PCI
  msi_init: change return value to 0 on success
  fix some coding style problems
  pci core: assert ENOSPC when add capability
  test: start vhost-user reconnect test
  tests: append i386 tests
  vhost-net: save & restore vring enable state
  vhost-net: save & restore vhost-user acked features
  vhost-net: do not crash if backend is not present
  vhost-user: disconnect on start failure
  qemu-char: add qemu_chr_disconnect to close a fd accepted by listen fd
  tests/vhost-user-bridge: workaround stale vring base
  tests/vhost-user-bridge: add client mode
  vhost-user: add ability to know vhost-user backend disconnection
  pci: fix pci_requester_id()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Conflicts:
	tests/Makefile.include
2016-06-17 11:25:46 +01:00
Igor Mammedov
2474bfd460 spapr: implement query-hotpluggable-cpus callback
It returns a list of present/possible to hotplug CPU
objects with a list of properties to use with
device_add.

in spapr case returned list would looks like:
-> { "execute": "query-hotpluggable-cpus" }
<- {"return": [
     { "props": { "core": 8 }, "type": "POWER8-spapr-cpu-core",
       "vcpus-count": 2 },
     { "props": { "core": 0 }, "type": "POWER8-spapr-cpu-core",
       "vcpus-count": 2,
       "qom-path": "/machine/unattached/device[0]"}
   ]}'

TODO:
  add 'node' property for core <-> numa node mapping

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:49 +10:00
Bharata B Rao
d2d8d46ff7 hmp: Add 'info hotpluggable-cpus' HMP command
This is the HMP equivalent for QMP query-hotpluggable-cpus.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[dwg: Fixed problem with printf formats on 32-bit host]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:49 +10:00
Igor Mammedov
d4633541ee QMP: Add query-hotpluggable-cpus
It will allow mgmt to query present and hotpluggable CPU objects,
it is required from a target platform that wishes to support command
to implement and set MachineClass.query_hotpluggable_cpus callback,
which will return a list of possible CPU objects with options that
would be needed for hotplugging possible CPU objects.

There are:
'type': 'str' - QOM CPU object type for usage with device_add
'vcpus-count': 'int' - number of logical VCPU threads per
                        CPU object (mgmt needs to know)

and a set of optional fields that are to used for hotplugging a CPU
objects and would allows mgmt tools to know what/where it could be
hotplugged;
[node],[socket],[core],[thread]

For present CPUs there is a 'qom-path' field which would allow mgmt to
inspect whatever object/abstraction the target platform considers
as CPU object.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:49 +10:00
Bharata B Rao
6f4b5c3ec5 spapr: CPU hot unplug support
Remove the CPU core device by removing the underlying CPU thread devices.
Hot removal of CPU for sPAPR guests is achieved by sending the hot unplug
notification to the guest. Release the vCPU object after CPU hot unplug so
that vCPU fd can be parked and reused.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:49 +10:00
Bharata B Rao
af81cf323c spapr: CPU hotplug support
Set up device tree entries for the hotplugged CPU core and use the
exising RTAS event logging infrastructure to send CPU hotplug notification
to the guest.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:49 +10:00
Bharata B Rao
94a94e4c49 spapr: convert boot CPUs into CPU core devices
Introduce sPAPRMachineClass.dr_cpu_enabled to indicate support for
CPU core hotplug. Initialize boot time CPUs as core deivces and prevent
topologies that result in partially filled cores. Both of these are done
only if CPU core hotplug is supported.

Note: An unrelated change in the call to xics_system_init() is done
in this patch as it makes sense to use the local variable smt introduced
in this patch instead of kvmppc_smt_threads() call here.

TODO: We derive sPAPR core type by looking at -cpu <model>. However
we don't take care of "compat=" feature yet for boot time as well
as hotplug CPUs.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:49 +10:00
Bharata B Rao
afd10a0fa6 spapr: Move spapr_cpu_init() to spapr_cpu_core.c
Start consolidating CPU init related routines in spapr_cpu_core.c. As
part of this, move spapr_cpu_init() and its dependencies from spapr.c
to spapr_cpu_core.c

No functionality change in this patch.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
[dwg: Rename TIMEBASE_FREQ to SPAPR_TIMEBASE_FREQ, since it's now in a
 public(ish) header]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:48 +10:00
Bharata B Rao
3b54254966 spapr: Abstract CPU core device and type specific core devices
Add sPAPR specific abastract CPU core device that is based on generic
CPU core device. Use this as base type to create sPAPR CPU specific core
devices.

TODO:
- Add core types for other remaining CPU types
- Handle CPU model alias correctly

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:48 +10:00
Bharata B Rao
3f97b53a68 qom: API to get instance_size of a type
Add an API object_type_get_size(const char *typename) that returns the
instance_size of the give typename.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:48 +10:00
Bharata B Rao
aab99135b6 spapr_drc: Prevent detach racing against attach for CPU DR
If a CPU is hot removed while hotplug of the same is still in progress,
the guest crashes. Prevent this by ensuring that detach is done only
after attach has completed.

The existing code already prevents such race for PCI hotplug. However
given that CPU is a logical DR unlike PCI and starts with ISOLATED
state, we need a logic that works for CPU too.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
               [Don't set awaiting_attach for PCI devices]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:48 +10:00
Bharata B Rao
4a4b344c7c xics,xics_kvm: Handle CPU unplug correctly
XICS is setup for each CPU during initialization. Provide a routine
to undo the same when CPU is unplugged. While here, move ss->cs management
into xics from xics_kvm since there is nothing KVM specific in it.
Also ensure xics reset doesn't set irq for CPUs that are already unplugged.

This allows reboot of a VM that has undergone CPU hotplug and unplug
to work correctly.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:48 +10:00
Bharata B Rao
f1020c2c26 cpu: Abstract CPU core type
Add an abstract CPU core type that could be used by machines that want
to define and hotplug CPUs in core granularity.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
               [Integer core property]
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
[dwg: changed property names to 'core-id' and 'nr-threads']
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:48 +10:00
Igor Mammedov
41346263c4 qdev: hotplug: Introduce HotplugHandler.pre_plug() callback
pre_plug callback is to be called before device.realize() is executed.
This would allow to check/set device's properties from HotplugHandler.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:48 +10:00
Richard Henderson
2e11b15dff target-ppc: Fix rlwimi, rlwinm, rlwnm
In 63ae0915f8, I arranged to use a 32-bit rotate, without
considering the effect of a mask value that wraps around to
the high bits of the word.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 16:33:33 +10:00
Gavin Shan
d917e88d85 vfio: Fix broken EEH
vfio_eeh_container_op() is the backend that communicates with
host kernel to support EEH functionality in QEMU. However, the
functon should return the value from host kernel instead of 0
unconditionally.

dwg: Specifically the problem occurs for the handful of EEH
sub-operations which can return a non-zero, non-error result.

Signed-off-by: Gavin Shan <gwshan@linux.vnet.ibm.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
[dwg: clarification to commit message]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 15:59:18 +10:00
Jakub Horak
35b5066ea7 target-ppc: Bug in BookE wait instruction
Fixed bug in code generation for the PowerPC "wait" instruction. It
doesn't make sense to store a non-initialized register.

Signed-off-by: Jakub Horak <thement@ibawizard.net>
[dwg: revised commit message]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 15:59:16 +10:00
Thomas Huth
fcbf4a3c0c ppc / sparc: Add a tester for checking whether OpenBIOS runs successfully
Since the mac99 and g3beige PowerPC machines recently broke without
being noticed, it would be good to have a tester for "make check"
that detects such issues immediately. A simple way to test the firmware
of these machines is to use the "-prom-env" parameter of QEMU. This
parameter can be used to put some Forth code into the 'boot-command'
firmware variable which then can signal success to the tester by
writing a magic value to a known memory location. And since some of the
Sparc machines are also using OpenBIOS, they are now tested with this
prom-env-tester, too.

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
[dwg: Removed sparc64, because it trips a TCG bug on 32-bit hosts]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 15:57:59 +10:00
Michael S. Tsirkin
874a235830 MAINTAINERS: add Marcel to PCI
Marcel is reviewing PCI patches anyway, things will
be easier if people remember to Cc him.

Cc: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-17 03:28:03 +03:00
Cao jin
2cbb1a6826 msi_init: change return value to 0 on success
No caller use its return value as msi capability offset, also in order
to make its return behaviour consistent with msix_init().

cc: Michael S. Tsirkin <mst@redhat.com>
cc: Paolo Bonzini <pbonzini@redhat.com>
cc: Hannes Reinecke <hare@suse.de>
cc: Markus Armbruster <armbru@redhat.com>
cc: Marcel Apfelbaum <marcel@redhat.com>

Acked-by: Hannes Reinecke <hare@suse.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-17 03:28:03 +03:00
Cao jin
52ea63dea4 fix some coding style problems
It has:
1. More newlines make the code block well separated.
2. Add more comments for msi_init.
3. Fix a indentation in vmxnet3.c.
4. ioh3420 & xio3130_downstream: put PCI Express capability init function
   together, make it more readable.

cc: Michael S. Tsirkin <mst@redhat.com>
cc: Markus Armbruster <armbru@redhat.com>
cc: Marcel Apfelbaum <marcel@redhat.com>
cc: Dmitry Fleytman <dmitry@daynix.com>
cc: Jason Wang <jasowang@redhat.com>

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-17 03:28:03 +03:00
Cao jin
97fe42f19b pci core: assert ENOSPC when add capability
ENOSPC is programming error, assert it for debugging.

cc: Michael S. Tsirkin <mst@redhat.com>
cc: Marcel Apfelbaum <marcel@redhat.com>
cc: Markus Armbruster <armbru@redhat.com>

Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-17 03:28:03 +03:00
Marc-André Lureau
4616e3592b test: start vhost-user reconnect test
This is a simple reconnect test, that simply checks if vhost-user
reconnection is possible and restore the state. A more complete test
would actually manipulate and check the ring contents (such extended
testing would benefit from the libvhost-user proposed in QEMU list to
avoid duplication of ring manipulations)

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Victor Kaplansky <victork@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-17 03:28:03 +03:00
Marc-André Lureau
0ee2e9daf8 tests: append i386 tests
Do not overwrite x86-64 tests, re-enable vhost-user-test.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Victor Kaplansky <victork@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-17 03:28:03 +03:00
Marc-André Lureau
bfc6cf31ce vhost-net: save & restore vring enable state
A driver may change the vring enable state at run time but vhost-user
backend may not be present (a contrived example is when the backend is
disconnected and the device is reconfigured after driver rebinding)

Restore the vring state when the vhost-user backend is started, so it
can process the ring.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Victor Kaplansky <victork@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-17 03:28:03 +03:00
Marc-André Lureau
a463215b08 vhost-net: save & restore vhost-user acked features
The initial vhost-user connection sets the features to be negotiated
with the driver. Renegotiation isn't possible without device reset.

To handle reconnection of vhost-user backend, ensure the same set of
features are provided, and reuse already acked features.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Victor Kaplansky <victork@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-17 03:28:03 +03:00
Marc-André Lureau
72b65f922b vhost-net: do not crash if backend is not present
Do not crash when backend is not present while enabling the ring. A
following patch will save the enabled state so it can be restored once
the backend is started.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Victor Kaplansky <victork@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-17 03:28:03 +03:00
Marc-André Lureau
0d572afd52 vhost-user: disconnect on start failure
If the backend failed to start (for example feature negociation failed),
do not exit, but disconnect the char device instead. Slightly more
robust for reconnect case.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Victor Kaplansky <victork@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-17 03:28:02 +03:00
Tetsuya Mukawa
7d9d17f71e qemu-char: add qemu_chr_disconnect to close a fd accepted by listen fd
The patch introduces qemu_chr_disconnect(). The function is used for
closing a fd accepted by listen fd. Though we already have qemu_chr_delete(),
but it closes not only accepted fd but also listen fd. This new function
is used when we still want to keep listen fd.

Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Victor Kaplansky <victork@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-17 03:28:02 +03:00
Marc-André Lureau
523b018dde tests/vhost-user-bridge: workaround stale vring base
This patch is a similar solution to what Yuanhan Liu/Huawei Xie have
suggested for DPDK. When vubr quits (killed or crashed), a restart of
vubr would get stale vring base from QEMU. That would break the kernel
virtio net completely, making it non-work any more, unless a driver
reset is done.

So, instead of getting the stale vring base from QEMU, Huawei suggested
we could get a proper one from used->idx. This works because the queues
packets are processed in order.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Victor Kaplansky <victork@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-17 03:28:02 +03:00
Marc-André Lureau
aef8486ede tests/vhost-user-bridge: add client mode
If -c is specified, vubr will try to connect to the socket instead of
listening for connections.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Victor Kaplansky <victork@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-17 03:28:02 +03:00
Tetsuya Mukawa
a6553598be vhost-user: add ability to know vhost-user backend disconnection
Current QEMU cannot detect vhost-user backend disconnection. The
patch adds ability to know it.
To know disconnection, add watcher to detect G_IO_HUP event. When
G_IO_HUP event is detected, the disconnected socket will be read
to cause a CHR_EVENT_CLOSED.

Signed-off-by: Tetsuya Mukawa <mukawa@igel.co.jp>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Yuanhan Liu <yuanhan.liu@linux.intel.com>
Reviewed-by: Victor Kaplansky <victork@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-17 03:28:02 +03:00
Peter Xu
4a94b3aa6d pci: fix pci_requester_id()
This fix SID verification failure when IOMMU IR is enabled with PCI
bridges. Existing pci_requester_id() is more like getting BDF info
only. Renaming it to pci_get_bdf(). Meanwhile, we provide the correct
implementation to get requester ID. VT-d spec 5.1.1 is a good reference
to go, though it talks only about interrupt delivery, the rule works
exactly the same for non-interrupt cases.

Currently, there are three use cases for pci_requester_id():

- PCIX status bits: here we need BDF only, not requester ID. Replacing
  with pci_get_bdf().
- PCIe Error injection and MSI delivery: for both these cases, we are
  looking for requester IDs. Here we should use the new impl.

To avoid a PCI walk every time we send MSI message, one requester_id
cache field is added to PCIDevice to cache the result when initialize
PCI device.

Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-17 03:28:02 +03:00
Thomas Huth
a1aa130989 hw/ppc/spapr: Silence deprecation message in qtest mode
When running "make check", there is currently always an error message
saying "spapr-pci-vfio-host-bridge is deprecated". This happens because
the QOM tests are instantiating all possible devices, and the error
message is currently located in the instance_init() function of the
device. Since it is legal for the tests to instantiate a device without
using it, the error message should be silenced when we're running in
test mode.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-17 09:47:59 +10:00
Peter Maydell
585fcd4b11 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* KVM startup speedup (Chao Peng)
* configure fixes and cleanups (David, Thomas)
* ctags fix (Sergey)
* NBD cleanups (Peter, Eric)
* "-L help" command line option (Richard)
* More esp.c bugfixes (me, Prasad)
* KVM_CAP_MAX_VCPU_ID support (Greg)

# gpg: Signature made Thu 16 Jun 2016 17:39:10 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4  E2F7 7E15 100C CD36 69B1
#      Subkey fingerprint: F133 3857 4B66 2389 866C  7682 BFFB D25F 78C7 AE83

* remotes/bonzini/tags/for-upstream: (29 commits)
  vl: smp_parse: cleanups
  scsi: esp: make cmdbuf big enough for maximum CDB size
  scsi: esp: clean up handle_ti/esp_do_dma if s->do_cmd
  scsi: esp: respect FIFO invariant after message phase
  scsi: esp: check buffer length before reading scsi command
  nbd: Avoid magic number for NBD max name size
  nbd: Detect servers that send unexpected error values
  nbd: Clean up ioctl handling of qemu-nbd -c
  nbd: Group all Linux-specific ioctl code in one place
  nbd: Reject unknown request flags
  nbd: Improve server handling of bogus commands
  nbd: Quit server after any write error
  nbd: More debug typo fixes, use correct formats
  nbd: Use BDRV_REQ_FUA for better FUA where supported
  vl.c: Add '-L help' which lists data dirs.
  KVM: use KVM_CAP_MAX_VCPU_ID
  scsi-disk: Use (unsigned long) typecasts when using "%lu" format string
  target-i386: kvm: cache KVM_GET_SUPPORTED_CPUID data
  nbd: simplify the nbd_request and nbd_reply structs
  nbd: Don't use cpu_to_*w() functions
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-16 17:58:45 +01:00
Andrew Jones
0544edd88a vl: smp_parse: cleanups
No functional changes; only some code movement and removal of
dead code (impossible conditions). Also, max_cpus can be
initialized to 1, like smp_cpus, because it's either set by the
user or set to smp_cpus, when smp_cpus is set by the user, or
set to 1, when nothing is set.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Message-Id: <1465580427-13596-2-git-send-email-drjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:05 +02:00
Prasad J Pandit
926cde5f3e scsi: esp: make cmdbuf big enough for maximum CDB size
While doing DMA read into ESP command buffer 's->cmdbuf', it could
write past the 's->cmdbuf' area, if it was transferring more than 16
bytes.  Increase the command buffer size to 32, which is maximum when
's->do_cmd' is set, and add a check on 'len' to avoid OOB access.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:05 +02:00
Paolo Bonzini
7f0b6e114a scsi: esp: clean up handle_ti/esp_do_dma if s->do_cmd
Avoid duplicated code between esp_do_dma and handle_ti.  esp_do_dma
has the same code that handle_ti contains after the call to esp_do_dma;
but the code in handle_ti is never reached because it is in an "else if".
Remove the else and also the pointless return.

esp_do_dma also has a partially dead assignment of the to_device
variable.  Sink it to the point where it's actually used.

Finally, assert that the other caller of esp_do_dma (esp_transfer_data)
only transfers data and not a command.  This is true because get_cmd
cancels the old request synchronously before its caller handle_satn_stop
sets do_cmd to 1.

Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:05 +02:00
Paolo Bonzini
d020aa504c scsi: esp: respect FIFO invariant after message phase
The FIFO contains two bytes; hence the write ptr should be two bytes ahead
of the read pointer.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:05 +02:00
Prasad J Pandit
d3cdc49138 scsi: esp: check buffer length before reading scsi command
The 53C9X Fast SCSI Controller(FSC) comes with an internal 16-byte
FIFO buffer. It is used to handle command and data transfer.
Routine get_cmd() in non-DMA mode, uses 'ti_size' to read scsi
command into a buffer. Add check to validate command length against
buffer size to avoid any overrun.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1464717207-7549-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:05 +02:00
Eric Blake
943cec86d0 nbd: Avoid magic number for NBD max name size
Declare a constant and use that when determining if an export
name fits within the constraints we are willing to support.

Note that upstream NBD recently documented that clients MUST
support export names of 256 bytes (not including trailing NUL),
and SHOULD support names up to 4096 bytes.  4096 is a bit big
(we would lose benefits of stack-allocation of a name array),
and we already have other limits in place (for example, qcow2
snapshot names are clamped around 1024).  So for now, just
stick to the required minimum, as that's easier to audit than
a full-scale support for larger names.

Signed-off-by: Eric Blake <eblake@redhat.com>

Message-Id: <1463006384-7734-12-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:05 +02:00
Eric Blake
f3c32fce36 nbd: Detect servers that send unexpected error values
Add some debugging to flag servers that are not compliant to
the NBD protocol.  This would have flagged the server bug
fixed in commit c0301fcc.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alex Bligh <alex@alex.org.uk>

Message-Id: <1463006384-7734-11-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:05 +02:00
Eric Blake
f57e2416aa nbd: Clean up ioctl handling of qemu-nbd -c
The kernel ioctl() interface into NBD is limited to 'unsigned long';
we MUST pass in input with that type (and not int or size_t, as
there may be platform ABIs where the wrong types promote incorrectly
through var-args).  Furthermore, on 32-bit platforms, the kernel
is limited to a maximum export size of 2T (our BLKSIZE of 512 times
a SIZE_BLOCKS constrained by 32 bit unsigned long).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1463006384-7734-8-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:05 +02:00
Eric Blake
98494e3b92 nbd: Group all Linux-specific ioctl code in one place
NBD ioctl()s are used to manage an NBD client session where
initial handshake is done in userspace, but then the transmission
phase is handed off to the kernel through a /dev/nbdX device.
As such, all ioctls sent to the kernel on the /dev/nbdX fd belong
in client.c; nbd_disconnect() was out-of-place in server.c.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1463006384-7734-7-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:05 +02:00
Eric Blake
ab7c548e26 nbd: Reject unknown request flags
The NBD protocol says that clients should not send a command flag
that has not been negotiated (whether by the client requesting an
option during a handshake, or because we advertise support for the
flag in response to NBD_OPT_EXPORT_NAME), and that servers should
reject invalid flags with EINVAL.  We were silently ignoring the
flags instead.  The client can't rely on our behavior, since it is
their fault for passing the bad flag in the first place, but it's
better to be robust up front than to possibly behave differently
than the client was expecting with the attempted flag.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alex Bligh <alex@alex.org.uk>
Message-Id: <1463006384-7734-6-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:05 +02:00
Eric Blake
29b6c3b319 nbd: Improve server handling of bogus commands
We have a few bugs in how we handle invalid client commands:

- A client can send an NBD_CMD_DISC where from + len overflows,
convincing us to reply with an error and stay connected, even
though the protocol requires us to silently disconnect. Fix by
hoisting the special case sooner.

- A client can send an NBD_CMD_WRITE where from + len overflows,
where we reply to the client with EINVAL without consuming the
payload; this will normally cause us to fail if the next thing
read is not the right magic, but in rare cases, could cause us
to interpret the data payload as valid commands and do things
not requested by the client. Fix by adding a complete flag to
track whether we are in sync or must disconnect.

Furthermore, we have split the checks for bogus from/len across
two functions, when it is easier to do it all at once.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1463006384-7734-5-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:05 +02:00
Eric Blake
63d5ef869e nbd: Quit server after any write error
We should never ignore failure from nbd_negotiate_send_rep(); if
we are unable to write to the client, then it is not worth trying
to continue the negotiation.  Fortunately, the problem is not
too severe - chances are that the errors being ignored here (mainly
inability to write the reply to the client) are indications of
a closed connection or something similar, which will also affect
the next attempt to interact with the client and eventually reach
a point where the errors are detected to end the loop.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1463006384-7734-4-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:05 +02:00
Eric Blake
2cb347493c nbd: More debug typo fixes, use correct formats
Clean up some debug message oddities missed earlier; this includes
some typos, and recognizing that %d is not necessarily compatible
with uint32_t. Also add a couple messages that I found useful
while debugging things.

Signed-off-by: Eric Blake <eblake@redhat.com>

Message-Id: <1463006384-7734-3-git-send-email-eblake@redhat.com>
[Do not use PRIx16, clang complains. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:05 +02:00
Eric Blake
a0c303693e nbd: Use BDRV_REQ_FUA for better FUA where supported
Rather than always flushing ourselves, let the block layer
forward the FUA on to the underlying device - where all
underlying layers also understand FUA, we are now more
efficient; and where any underlying layer doesn't understand
it, now the block layer takes care of the full flush fallback
on our behalf.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1463006384-7734-2-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:04 +02:00
Richard W.M. Jones
37146e7eaf vl.c: Add '-L help' which lists data dirs.
QEMU compiles a list of data directories from various sources.  When
consuming a QEMU binary it's useful to be able to get this list of
data directories: a primary reason is so you can list what BIOSes or
keymaps ship with this version of QEMU.  However without reproducing
the method that QEMU uses internally, it's not possible to get the
list of data directories.

This commit adds a simple '-L help' option that just lists out the
data directories as qemu calculates them:

$ ./x86_64-softmmu/qemu-system-x86_64 -L help
/home/rjones/d/qemu/pc-bios
/usr/local/share/qemu

$ ./x86_64-softmmu/qemu-system-x86_64 -L /tmp -L help
/tmp
/home/rjones/d/qemu/pc-bios
/usr/local/share/qemu

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1463416475-11728-2-git-send-email-rjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:04 +02:00
Greg Kurz
f31e326637 KVM: use KVM_CAP_MAX_VCPU_ID
As stated in linux/Documentation/virtual/kvm/api.txt:

The maximum possible value for max_vcpu_id can be retrieved using the
KVM_CAP_MAX_VCPU_ID of the KVM_CHECK_EXTENSION ioctl() at run-time.

If the KVM_CAP_MAX_VCPU_ID does not exist, you should assume that
max_vcpu_id is the same as the value returned from KVM_CAP_MAX_VCPUS.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Message-Id: <146424974323.5666.5471538288045048119.stgit@bahia.huguette.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:04 +02:00
Thomas Huth
142c21455b scsi-disk: Use (unsigned long) typecasts when using "%lu" format string
Some source code analyzers like cppcheck spill out a warning if
the sign of the argument does not match the format string.

Ticket: https://bugs.launchpad.net/qemu/+bug/1589564
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1465805418-15906-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:04 +02:00
Chao Peng
494e95e910 target-i386: kvm: cache KVM_GET_SUPPORTED_CPUID data
KVM_GET_SUPPORTED_CPUID ioctl is called frequently when initializing
CPU. Depends on CPU features and CPU count, the number of calls can be
extremely high which slows down QEMU booting significantly. In our
testing, we saw 5922 calls with switches:

    -cpu SandyBridge -smp 6,sockets=6,cores=1,threads=1

This ioctl takes more than 100ms, which is almost half of the total
QEMU startup time.

While for most cases the data returned from two different invocations
are not changed, that means, we can cache the data to avoid trapping
into kernel for the second time. To make sure the cache safe one
assumption is desirable: the ioctl is stateless. This is not true for
CPUID leaves in general (such as CPUID leaf 0xD, whose value depends
on guest XCR0 and IA32_XSS) but it is true of KVM_GET_SUPPORTED_CPUID,
which runs before there is a value for XCR0 and IA32_XSS.

Signed-off-by: Chao Peng <chao.p.peng@linux.intel.com>
Message-Id: <1465784487-23482-1-git-send-email-chao.p.peng@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:04 +02:00
Paolo Bonzini
56af2dda98 nbd: simplify the nbd_request and nbd_reply structs
These structs are never used to represent the bytes that go over the
network.  The big-endian network data is built into a uint8_t array
in nbd_{receive,send}_{request,reply}.  Remove the unused magic field,
reorder the struct to avoid holes, and remove the packed attribute.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:04 +02:00
Peter Maydell
f6be672084 nbd: Don't use cpu_to_*w() functions
The cpu_to_*w() functions just compose a pointer dereference
with a byteswap. Instead use st*_p(), which handles potential
pointer misalignment and avoids the need to cast the pointer.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1465575342-12146-1-git-send-email-peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:04 +02:00
Peter Maydell
773dce3c72 nbd: Don't use *_to_cpup() functions
The *_to_cpup() functions are not very useful, as they simply do
a pointer dereference and then a *_to_cpu(). Instead use either:
 * ld*_*_p(), if the data is at an address that might not be
   correctly aligned for the load
 * a local dereference and *_to_cpu(), if the pointer is
   the correct type and known to be correctly aligned

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1465570836-22211-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:04 +02:00
Thomas Huth
0fb2331254 configure: Remove unused CONFIG_SIGEV_THREAD_ID switch
The CONFIG_SIGEV_THREAD_ID switch is unused since the related code
has been removed by commit 6d32717155
("aio / timers: Remove alarm timers"), so it can safely be removed
nowadays.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1465571084-19885-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:04 +02:00
Dr. David Alan Gilbert
4fb8320a2e avx2 configure: Use primitives in test
Use the avx2 primitives during the test, thus making sure that the
compiler and assembler could actually use avx2.

This also detects the failure case on gcc 4.8.x with -save-temps
and avoids the need for the gcc version check in cutils.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1465557378-24105-3-git-send-email-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:04 +02:00
Dr. David Alan Gilbert
fc6e1de9d8 Make avx2 configure test work with -O2
When configured with --extra-cflags=-O2 gcc optimised out the test
and the readelf failed the check leaving avx2 disabled.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1465557378-24105-2-git-send-email-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:04 +02:00
Sergey Fedorov
ac99c624c6 Makefile: Fix tag file generation targets
"ctags" produces a file named "tags", not "ctags". It doesn't look
reasonable to use phony target name as a file name to remove. Just use
exact file names to remove in "ctags" and "TAGS" target receipts.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1465495115-24665-1-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:04 +02:00
Thomas Huth
e4650c81b3 configure: Enable -Werror for MinGW builds, too
MinGW seems to compile currently without warnings, so it should
be safe to enable -Werror now for this environment, too.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1465373606-18486-1-git-send-email-thuth@redhat.com>
Tested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:04 +02:00
Paolo Bonzini
e9abfcb57f clean-includes: run it once more
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:03 +02:00
Paolo Bonzini
02d0e09503 os-posix: include sys/mman.h
qemu/osdep.h checks whether MAP_ANONYMOUS is defined, but this check
is bogus without a previous inclusion of sys/mman.h.  Include it in
sysemu/os-posix.h and remove it from everywhere else.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:39:03 +02:00
Thomas Huth
89266923df configure: Remove unused CONFIG_ZERO_MALLOC setting
CONFIG_ZERO_MALLOC was only used in qemu-malloc.c and
this file has been removed with the following commit:

	41a748265f
	Remove qemu_malloc/qemu_free

So we don't need this configuration setting anymore.
This patch also removes the z_version variable, since
this is now also not needed anymore.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <1465398683-3152-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 18:32:35 +02:00
Peter Maydell
dc278c58fa Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches

# gpg: Signature made Thu 16 Jun 2016 15:01:27 BST
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74  56FE 7F09 B272 C88F 2FD6

* remotes/kevin/tags/for-upstream: (39 commits)
  hbitmap: add 'pos < size' asserts
  iotests: Add test for oVirt-like storage migration
  iotests: Add test for post-mirror backing chains
  block/null: Implement bdrv_refresh_filename()
  block/mirror: Fix target backing BDS
  block: Allow replacement of a BDS by its overlay
  rbd:change error_setg() to error_setg_errno()
  iotests: 095: Clean up QEMU before showing image info
  block: Create the commit block job before reopening any image
  block: Prevent sleeping jobs from resuming if they have been paused
  block: use the block job list in qmp_query_block_jobs()
  block: use the block job list in bdrv_drain_all()
  block: Fix snapshot=on with aio=native
  block: Remove bs->zero_beyond_eof
  qcow2: Let vmstate call qcow2_co_preadv/pwrite directly
  block: Make bdrv_load/save_vmstate coroutine_fns
  block: Allow .bdrv_load/save_vmstate() to return 0/-errno
  block: Make .bdrv_load_vmstate() vectored
  block: Introduce bdrv_preadv()
  doc: Fix mailing list address in tests/qemu-iotests/README
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-16 15:22:56 +01:00
Kevin Wolf
60251f4d3e Merge remote-tracking branch 'mreitz/tags/pull-block-for-kevin-2016-06-16' into queue-block
Block patches

# gpg: Signature made Thu Jun 16 15:21:35 2016 CEST
# gpg:                using RSA key 0x3BB14202E838ACAD
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1  1829 F407 DB00 61D5 CF40
#      Subkey fingerprint: 58B3 81CE 2DC8 9CF9 9730  EE64 3BB1 4202 E838 ACAD

* mreitz/tags/pull-block-for-kevin-2016-06-16:
  hbitmap: add 'pos < size' asserts
  iotests: Add test for oVirt-like storage migration
  iotests: Add test for post-mirror backing chains
  block/null: Implement bdrv_refresh_filename()
  block/mirror: Fix target backing BDS
  block: Allow replacement of a BDS by its overlay
  rbd:change error_setg() to error_setg_errno()
  iotests: 095: Clean up QEMU before showing image info
  block: Create the commit block job before reopening any image
  block: Prevent sleeping jobs from resuming if they have been paused
  block: use the block job list in qmp_query_block_jobs()
  block: use the block job list in bdrv_drain_all()

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-16 15:22:18 +02:00
Vladimir Sementsov-Ogievskiy
0e32119122 hbitmap: add 'pos < size' asserts
For now, fail in hbitmap_set on start + count > size will come from
hbitmap_set
  hb_count_between
    hbitmap_iter_init
      assert(pos < hb->size)

This patch adds such checks to set/get/reset functions of hbitmap.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 1465924093-76875-2-git-send-email-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-06-16 15:20:37 +02:00
Max Reitz
3dd48fdc55 iotests: Add test for oVirt-like storage migration
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20160610185750.30956-6-mreitz@redhat.com
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-06-16 15:20:37 +02:00
Max Reitz
298c6009dc iotests: Add test for post-mirror backing chains
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20160610185750.30956-5-mreitz@redhat.com
Reviewed-by: Fam Zheng <famz@redhat.com>
[mreitz@redhat.com: Removed unnecessary imports]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-06-16 15:20:37 +02:00
Max Reitz
67882b1535 block/null: Implement bdrv_refresh_filename()
The null block driver ignores any filename used for creating its BDSs,
which allows creating such BDSs even without any filename at all. In
that case, we currently construct a JSON filename when queried instead
of a plain "null-co://" or "null-aio://". This patch implements
bdrv_refresh_filename() to remedy this behavior.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20160610185750.30956-4-mreitz@redhat.com
[mreitz@redhat.com: Added commit message]
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-06-16 15:20:37 +02:00
Max Reitz
274fccee2b block/mirror: Fix target backing BDS
Currently, we are trying to move the backing BDS from the source to the
target in bdrv_replace_in_backing_chain() which is called from
mirror_exit(). However, mirror_complete() already tries to open the
target's backing chain with a call to bdrv_open_backing_file().

First, we should only set the target's backing BDS once. Second, the
mirroring block job has a better idea of what to set it to than the
generic code in bdrv_replace_in_backing_chain() (in fact, the latter's
conditions on when to move the backing BDS from source to target are not
really correct).

Therefore, remove that code from bdrv_replace_in_backing_chain() and
leave it to mirror_complete().

Depending on what kind of mirroring is performed, we furthermore want to
use different strategies to open the target's backing chain:

- If blockdev-mirror is used, we can assume the user made sure that the
  target already has the correct backing chain. In particular, we should
  not try to open a backing file if the target does not have any yet.

- If drive-mirror with mode=absolute-paths is used, we can and should
  reuse the already existing chain of nodes that the source BDS is in.
  In case of sync=full, no backing BDS is required; with sync=top, we
  just link the source's backing BDS to the target, and with sync=none,
  we use the source BDS as the target's backing BDS.
  We should not try to open these backing files anew because this would
  lead to two BDSs existing per physical file in the backing chain, and
  we would like to avoid such concurrent access.

- If drive-mirror with mode=existing is used, we have to use the
  information provided in the physical image file which means opening
  the target's backing chain completely anew, just as it has been done
  already.
  If the target's backing chain shares images with the source, this may
  lead to multiple BDSs per physical image file. But since we cannot
  reliably ascertain this case, there is nothing we can do about it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20160610185750.30956-3-mreitz@redhat.com
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-06-16 15:20:37 +02:00
Max Reitz
9bd910e2cb block: Allow replacement of a BDS by its overlay
change_parent_backing_link() asserts that the BDS to be replaced is not
used as a backing file. However, we may want to replace a BDS by its
overlay in which case that very link should not be redirected.

For instance, when doing a sync=none drive-mirror operation, we may have
the following BDS/BB forest before block job completion:

  target

  base <- source <- BlockBackend

During job completion, we want to establish the source BDS as the
target's backing node:

          target
            |
            v
  base <- source <- BlockBackend

This makes the target a valid replacement for the source:

          target <- BlockBackend
            |
            v
  base <- source

Without this modification to change_parent_backing_link() we have to
inject the target into the graph before the source is its backing node,
thus temporarily creating a wrong graph:

  target <- BlockBackend

  base <- source

Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20160610185750.30956-2-mreitz@redhat.com
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-06-16 15:20:37 +02:00
Vikhyat Umrao
87cd3d20e1 rbd:change error_setg() to error_setg_errno()
Ceph RBD block driver does not use error_setg_errno() where
it is possible to use. This patch replaces error_setg()
from error_setg_errno().

Signed-off-by: Vikhyat Umrao <vumrao@redhat.com>
Message-id: 1462780319-5796-1-git-send-email-vumrao@redhat.com
Reviewed-by: Josh Durgin <jdurgin@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-06-16 15:20:37 +02:00
Fam Zheng
6ea66b590c iotests: 095: Clean up QEMU before showing image info
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464944872-24484-1-git-send-email-famz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-06-16 15:20:37 +02:00
Alberto Garcia
834fe28ddf block: Create the commit block job before reopening any image
If the base or overlay images need to be reopened in read-write mode
but the block_job_create() call fails then no one will put those
images back in read-only mode.

We can solve this problem easily by calling block_job_create() first.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: aa495045770a6f1a7cc5d408397a17c75097fdd8.1464346103.git.berto@igalia.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-06-16 15:20:37 +02:00
Alberto Garcia
0824afda0c block: Prevent sleeping jobs from resuming if they have been paused
If we pause a block job and drain its BlockDriverState we want that
the job remains inactive until we call block_job_resume() again.

However if we pause the job while it is sleeping then it will resume
when the sleep timer fires.

This patch prevents that from happening by checking if the job has
been paused after it comes back from sleeping.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 3d9011151512326b890d22bdab3530244ef349d7.1464346103.git.berto@igalia.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-06-16 15:20:37 +02:00
Alberto Garcia
f0f55deda2 block: use the block job list in qmp_query_block_jobs()
qmp_query_block_jobs() uses bdrv_next() to look for block jobs, but
this function can only find those in top-level BlockDriverStates.

This patch uses block_job_next() instead.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: a8b7e5497b7c1fa67c12fcceae1630d01c3b1f96.1464346103.git.berto@igalia.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-06-16 15:20:37 +02:00
Alberto Garcia
eb1364ceac block: use the block job list in bdrv_drain_all()
bdrv_drain_all() pauses all block jobs by using bdrv_next() to iterate
over all top-level BlockDriverStates. Therefore the code is unable to
find block jobs in other nodes.

This patch uses block_job_next() to iterate over all block jobs.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 55ee7d7d4a65c28aa1a1b28823897ef326f328e2.1464346103.git.berto@igalia.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-06-16 15:20:37 +02:00
Kevin Wolf
418690447a block: Fix snapshot=on with aio=native
snapshot=on creates a temporary overlay that is always opened with
cache=unsafe (the cache mode specified by the user is only for the
actual image file and its children). This means that we must not inherit
the BDRV_O_NATIVE_AIO flag for the temporary overlay because trying to
use Linux AIO with cache=unsafe results in an error.

Reproducer without this patch:

$ x86_64-softmmu/qemu-system-x86_64 -drive file=/tmp/test.qcow2,cache=none,aio=native,snapshot=on
qemu-system-x86_64: -drive file=/tmp/test.qcow2,cache=none,aio=native,snapshot=on: aio=native was
specified, but it requires cache.direct=on, which was not specified.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-16 15:19:56 +02:00
Kevin Wolf
c9d20029f4 block: Remove bs->zero_beyond_eof
It is always true for open images now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-16 15:19:56 +02:00
Kevin Wolf
734a77584a qcow2: Let vmstate call qcow2_co_preadv/pwrite directly
We don't really want to go through the block layer in order to read from
or write to the vmstate in a qcow2 image. Doing so required a few ugly
hacks like saving and restoring the old image size (because writing to
vmstate offsets would increase the image size) or disabling the "reads
after EOF = zeroes" logic. When calling the right functions directly,
these hacks aren't necessary any more.

Note that .bdrv_vmstate_load/save() return 0 instead of the number of
bytes in case of success now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-16 15:19:56 +02:00
Kevin Wolf
1a8ae82217 block: Make bdrv_load/save_vmstate coroutine_fns
This allows drivers to share code between normal I/O and vmstate
accesses.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-16 15:19:56 +02:00
Kevin Wolf
b433d9424d block: Allow .bdrv_load/save_vmstate() to return 0/-errno
The return value of .bdrv_load/save_vmstate() can be any non-negative
number in case of success now. It used to be bytes/-errno.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-16 15:19:55 +02:00
Kevin Wolf
5ddda0b8f0 block: Make .bdrv_load_vmstate() vectored
This brings it in line with .bdrv_save_vmstate().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-16 15:19:55 +02:00
Kevin Wolf
f1e8474115 block: Introduce bdrv_preadv()
We already have a byte-based bdrv_pwritev(), but the read counterpart
was still missing. This commit adds it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-16 15:19:55 +02:00
Thomas Huth
48bea96572 doc: Fix mailing list address in tests/qemu-iotests/README
The address of the mailing list is qemu-devel@nongnu.org
instead of qemu-devel@savannah.nongnu.org. And while we're
at it, also mention the qemu-block mailing list here.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-16 15:19:55 +02:00
Kevin Wolf
ccb9dc1012 linux-aio: Cancel BH if not needed
linux-aio uses a BH in order to make sure that the remaining completions
are processed even in nested event loops of completion callbacks in
order to avoid deadlocks.

There is no need, however, to have the BH overhead for the first call
into qemu_laio_completion_bh() or after all pending completions have
already been processed. Therefore, this patch calls directly into
qemu_laio_completion_bh() in qemu_laio_completion_cb() and cancels
the BH after qemu_laio_completion_bh() has processed all pending
completions.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-16 15:19:55 +02:00
Kevin Wolf
23b0d9fb1d block: Don't enforce 512 byte minimum alignment
If block drivers say that they can do an alignment < 512 bytes, let's
just suppose they mean it. raw-posix used to be an offender with respect
to this, but it can actually deal with byte-aligned requests now.

The default is still 512 bytes for any drivers that only implement
sector-based interfaces, but it is 1 now for drivers that implement
.bdrv_co_preadv.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-16 15:19:55 +02:00
Kevin Wolf
9d52aa3c38 raw-posix: Implement .bdrv_co_preadv/pwritev
The raw-posix block driver actually supports byte-aligned requests now
on non-O_DIRECT images, like it already (and previously incorrectly)
claimed in bs->request_alignment.

For some block drivers this means that a RMW cycle can be avoided when
they write sub-sector metadata e.g. for cluster allocation.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-16 15:19:55 +02:00
Kevin Wolf
2174f12bde raw-posix: Switch to bdrv_co_* interfaces
In order to use the modern byte-based .bdrv_co_preadv/pwritev()
interface, this patch switches raw-posix to coroutine-based interfaces
as a first step. In terms of semantics and performance, it doesn't make
a difference with the existing code whether we go from a coroutine to a
callback-based interface already in block/io.c or only in linux-aio.c

As there have been concerns in the past that this change may be a step
in the wrong direction with respect to a possible AIO fast path, the
old callback-based interface for linux-aio is left around and can be
reactivated when a fast path (e.g. directly from virtio-blk dataplane,
bypassing the whole block layer) is implemented.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-16 15:19:55 +02:00
Kevin Wolf
9896c8765f block: Prepare bdrv_aligned_pwritev() for byte-aligned requests
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-16 15:19:55 +02:00
Kevin Wolf
49c0752600 block: Prepare bdrv_aligned_preadv() for byte-aligned requests
This patch makes bdrv_aligned_preadv() ready to accept byte-aligned
requests. Note that this doesn't mean that such requests are actually
made. The caller still ensures that all requests are aligned to at least
512 bytes.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-16 15:19:55 +02:00
Kevin Wolf
244483e64e block: Byte-based bdrv_co_do_copy_on_readv()
In a first step to convert the common I/O path to work on bytes rather
than sectors, this converts the copy-on-read logic that is used by
bdrv_aligned_preadv().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-16 15:19:55 +02:00
Daniel P. Berrange
8c0dcbc4ad block: drop support for using qcow[2] encryption with system emulators
Back in the 2.3.0 release we declared qcow[2] encryption as
deprecated, warning people that it would be removed in a future
release.

  commit a1f688f415
  Author: Markus Armbruster <armbru@redhat.com>
  Date:   Fri Mar 13 21:09:40 2015 +0100

    block: Deprecate QCOW/QCOW2 encryption

The code still exists today, but by a (happy?) accident we entirely
broke the ability to use qcow[2] encryption in the system emulators
in the 2.4.0 release due to

  commit 8336aafae1
  Author: Daniel P. Berrange <berrange@redhat.com>
  Date:   Tue May 12 17:09:18 2015 +0100

    qcow2/qcow: protect against uninitialized encryption key

This commit was designed to prevent future coding bugs which
might cause QEMU to read/write data on an encrypted block
device in plain text mode before a decryption key is set.

It turns out this preventative measure was a little too good,
because we already had a long standing bug where QEMU read
encrypted data in plain text mode during system emulator
startup, in order to guess disk geometry:

  Thread 10 (Thread 0x7fffd3fff700 (LWP 30373)):
  #0  0x00007fffe90b1a28 in raise () at /lib64/libc.so.6
  #1  0x00007fffe90b362a in abort () at /lib64/libc.so.6
  #2  0x00007fffe90aa227 in __assert_fail_base () at /lib64/libc.so.6
  #3  0x00007fffe90aa2d2 in  () at /lib64/libc.so.6
  #4  0x000055555587ae19 in qcow2_co_readv (bs=0x5555562accb0, sector_num=0, remaining_sectors=1, qiov=0x7fffffffd260) at block/qcow2.c:1229
  #5  0x000055555589b60d in bdrv_aligned_preadv (bs=bs@entry=0x5555562accb0, req=req@entry=0x7fffd3ffea50, offset=offset@entry=0, bytes=bytes@entry=512, align=align@entry=512, qiov=qiov@entry=0x7fffffffd260, flags=0) at block/io.c:908
  #6  0x000055555589b8bc in bdrv_co_do_preadv (bs=0x5555562accb0, offset=0, bytes=512, qiov=0x7fffffffd260, flags=<optimized out>) at block/io.c:999
  #7  0x000055555589c375 in bdrv_rw_co_entry (opaque=0x7fffffffd210) at block/io.c:544
  #8  0x000055555586933b in coroutine_thread (opaque=0x555557876310) at coroutine-gthread.c:134
  #9  0x00007ffff64e1835 in g_thread_proxy (data=0x5555562b5590) at gthread.c:778
  #10 0x00007ffff6bb760a in start_thread () at /lib64/libpthread.so.0
  #11 0x00007fffe917f59d in clone () at /lib64/libc.so.6

  Thread 1 (Thread 0x7ffff7ecab40 (LWP 30343)):
  #0  0x00007fffe91797a9 in syscall () at /lib64/libc.so.6
  #1  0x00007ffff64ff87f in g_cond_wait (cond=cond@entry=0x555555e085f0 <coroutine_cond>, mutex=mutex@entry=0x555555e08600 <coroutine_lock>) at gthread-posix.c:1397
  #2  0x00005555558692c3 in qemu_coroutine_switch (co=<optimized out>) at coroutine-gthread.c:117
  #3  0x00005555558692c3 in qemu_coroutine_switch (from_=0x5555562b5e30, to_=to_@entry=0x555557876310, action=action@entry=COROUTINE_ENTER) at coroutine-gthread.c:175
  #4  0x0000555555868a90 in qemu_coroutine_enter (co=0x555557876310, opaque=0x0) at qemu-coroutine.c:116
  #5  0x0000555555859b84 in thread_pool_completion_bh (opaque=0x7fffd40010e0) at thread-pool.c:187
  #6  0x0000555555859514 in aio_bh_poll (ctx=ctx@entry=0x5555562953b0) at async.c:85
  #7  0x0000555555864d10 in aio_dispatch (ctx=ctx@entry=0x5555562953b0) at aio-posix.c:135
  #8  0x0000555555864f75 in aio_poll (ctx=ctx@entry=0x5555562953b0, blocking=blocking@entry=true) at aio-posix.c:291
  #9  0x000055555589c40d in bdrv_prwv_co (bs=bs@entry=0x5555562accb0, offset=offset@entry=0, qiov=qiov@entry=0x7fffffffd260, is_write=is_write@entry=false, flags=flags@entry=(unknown: 0)) at block/io.c:591
  #10 0x000055555589c503 in bdrv_rw_co (bs=bs@entry=0x5555562accb0, sector_num=sector_num@entry=0, buf=buf@entry=0x7fffffffd2e0 "\321,", nb_sectors=nb_sectors@entry=21845, is_write=is_write@entry=false, flags=flags@entry=(unknown: 0)) at block/io.c:614
  #11 0x000055555589c562 in bdrv_read_unthrottled (nb_sectors=21845, buf=0x7fffffffd2e0 "\321,", sector_num=0, bs=0x5555562accb0) at block/io.c:622
  #12 0x000055555589c562 in bdrv_read_unthrottled (bs=0x5555562accb0, sector_num=sector_num@entry=0, buf=buf@entry=0x7fffffffd2e0 "\321,", nb_sectors=nb_sectors@entry=21845) at block/io.c:634
    nb_sectors@entry=1) at block/block-backend.c:504
  #14 0x0000555555752e9f in guess_disk_lchs (blk=blk@entry=0x5555562a5290, pcylinders=pcylinders@entry=0x7fffffffd52c, pheads=pheads@entry=0x7fffffffd530, psectors=psectors@entry=0x7fffffffd534) at hw/block/hd-geometry.c:68
  #15 0x0000555555752ff7 in hd_geometry_guess (blk=0x5555562a5290, pcyls=pcyls@entry=0x555557875d1c, pheads=pheads@entry=0x555557875d20, psecs=psecs@entry=0x555557875d24, ptrans=ptrans@entry=0x555557875d28) at hw/block/hd-geometry.c:133
  #16 0x0000555555752b87 in blkconf_geometry (conf=conf@entry=0x555557875d00, ptrans=ptrans@entry=0x555557875d28, cyls_max=cyls_max@entry=65536, heads_max=heads_max@entry=16, secs_max=secs_max@entry=255, errp=errp@entry=0x7fffffffd5e0) at hw/block/block.c:71
  #17 0x0000555555799bc4 in ide_dev_initfn (dev=0x555557875c80, kind=IDE_HD) at hw/ide/qdev.c:174
  #18 0x0000555555768394 in device_realize (dev=0x555557875c80, errp=0x7fffffffd640) at hw/core/qdev.c:247
  #19 0x0000555555769a81 in device_set_realized (obj=0x555557875c80, value=<optimized out>, errp=0x7fffffffd730) at hw/core/qdev.c:1058
  #20 0x00005555558240ce in property_set_bool (obj=0x555557875c80, v=<optimized out>, opaque=0x555557875de0, name=<optimized out>, errp=0x7fffffffd730)
        at qom/object.c:1514
  #21 0x0000555555826c87 in object_property_set_qobject (obj=obj@entry=0x555557875c80, value=value@entry=0x55555784bcb0, name=name@entry=0x55555591cb3d "realized", errp=errp@entry=0x7fffffffd730) at qom/qom-qobject.c:24
  #22 0x0000555555825760 in object_property_set_bool (obj=obj@entry=0x555557875c80, value=value@entry=true, name=name@entry=0x55555591cb3d "realized", errp=errp@entry=0x7fffffffd730) at qom/object.c:905
  #23 0x000055555576897b in qdev_init_nofail (dev=dev@entry=0x555557875c80) at hw/core/qdev.c:380
  #24 0x0000555555799ead in ide_create_drive (bus=bus@entry=0x555557629630, unit=unit@entry=0, drive=0x5555562b77e0) at hw/ide/qdev.c:122
  #25 0x000055555579a746 in pci_ide_create_devs (dev=dev@entry=0x555557628db0, hd_table=hd_table@entry=0x7fffffffd830) at hw/ide/pci.c:440
  #26 0x000055555579b165 in pci_piix3_ide_init (bus=<optimized out>, hd_table=0x7fffffffd830, devfn=<optimized out>) at hw/ide/piix.c:218
  #27 0x000055555568ca55 in pc_init1 (machine=0x5555562960a0, pci_enabled=1, kvmclock_enabled=<optimized out>) at /home/berrange/src/virt/qemu/hw/i386/pc_piix.c:256
  #28 0x0000555555603ab2 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4249

So the safety net is correctly preventing QEMU reading cipher
text as if it were plain text, during startup and aborting QEMU
to avoid bad usage of this data.

For added fun this bug only happens if the encrypted qcow2
file happens to have data written to the first cluster,
otherwise the cluster won't be allocated and so qcow2 would
not try the decryption routines at all, just return all 0's.

That no one even noticed, let alone reported, this bug that
has shipped in 2.4.0, 2.5.0 and 2.6.0 shows that the number
of actual users of encrypted qcow2 is approximately zero.

So rather than fix the crash, and backport it to stable
releases, just go ahead with what we have warned users about
and disable any use of qcow2 encryption in the system
emulators. qemu-img/qemu-io/qemu-nbd are still able to access
qcow2 encrypted images for the sake of data conversion.

In the future, qcow2 will gain support for the alternative
luks format, but when this happens it'll be using the
'-object secret' infrastructure for getting keys, which
avoids this problematic scenario entirely.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-16 15:19:55 +02:00
Eric Blake
fa16653874 block: Assert that flags are in range
Add a new BDRV_REQ_MASK constant, and use it to make sure that
caller flags are always valid.

Tested with 'make check' and with qemu-iotests on both '-raw'
and '-qcow2'; the only failure turned up was fixed in the
previous commit.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-16 15:19:55 +02:00
Eric Blake
73698c30ca block: Avoid bogus flags during mirroring
Commit e253f4b8 converted mirroring from sector-based bdrv_aio_*
to byte-based blk_aio_*, but failed to account for the subtle
difference in signatures (the former takes a semi-redundant length,
the latter takes a flags parameter).  Since all of our flags are
currently smaller in size than BDRV_SECTOR_SIZE, it has no ill
effects until we either perform sub-sector mirroring, or we start
asserting that no unexpected flags are set.  I found it while
testing new asserts when qemu-iotests 132 started warning about an
unknown flag 0x200000.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-16 15:19:55 +02:00
Kevin Wolf
604e861362 qemu-img bench: Fix uninitialised writethrough mode
If no -t option is specified, bool writethrough stayed uninitialised.
Initialise it as false, which makes cache=writeback the default cache
mode.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-16 15:19:55 +02:00
Cédric Le Goater
9e19036e5a m25p80: fix test on blk_pread() return value
commit 243e6f69c1 ("m25p80: Switch to byte-based block access")
replaced blk_read() calls with blk_pread() but return values are
different.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-16 15:19:55 +02:00
Vladimir Sementsov-Ogievskiy
479b5998d4 hmp: acquire aio_context in hmp_qemu_io
Acquire aio context before run command, this is mandatory for unit tests.

Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-16 15:19:55 +02:00
Colin Lord
38a53d506b blockdev: clarify error on attempt to open locked tray
When opening a device with a locked tray, gives an error explaining the
device tray is locked and that the user should wait and try again. This
is less confusing than the previous error, which simply stated that the
tray was locked.

Signed-off-by: Colin Lord <clord@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-16 15:19:55 +02:00
Kevin Wolf
d46a0bb24d qcow2: Implement .bdrv_co_pwritev()
This changes qcow2 to implement the byte-based .bdrv_co_pwritev
interface rather than the sector-based old one.

As preallocation uses the same allocation function as normal writes, and
the interface of that function needs to be changed, it is converted in
the same patch.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-06-16 15:19:55 +02:00
Kevin Wolf
8556739355 qcow2: Use bytes instead of sectors for QCowL2Meta
In preparation for implementing .bdrv_co_pwritev in qcow2.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-06-16 15:19:55 +02:00
Kevin Wolf
aaa4d20b49 qcow2: Make copy_sectors() byte based
This will allow copy on write operations where the overwritten part of
the cluster is not aligned to sector boundaries.

Also rename the function because it has nothing to do with sectors any
more.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-06-16 15:19:55 +02:00
Kevin Wolf
ecfe186380 qcow2: Implement .bdrv_co_preadv()
Reading from qcow2 images is now byte granularity.

Most of the affected code in qcow2 actually gets simpler with this
change. The only exception is encryption, which is fixed on 512 bytes
blocks; in order to keep this working, bs->request_alignment is set for
encrypted images.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-06-16 15:19:55 +02:00
Kevin Wolf
b2f65d6b02 qcow2: Work with bytes in qcow2_get_cluster_offset()
This patch changes the units that qcow2_get_cluster_offset() uses
internally, without touching the interface just yet. This will be done
in another patch.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-06-16 15:19:55 +02:00
Peter Maydell
a66370b08d Merge remote-tracking branch 'remotes/amit-migration/tags/migration-for-2.7-4' into staging
Migration:

- Fixes for TLS series
- Postcopy: Add stats, fix, test case

# gpg: Signature made Thu 16 Jun 2016 05:40:09 BST
# gpg:                using RSA key 0xEB0B4DFC657EF670
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg:                 aka "Amit Shah <amit@kernel.org>"
# gpg:                 aka "Amit Shah <amitshah@gmx.net>"
# Primary key fingerprint: 48CA 3722 5FE7 F4A8 B337  2735 1E9A 3B5F 8540 83B6
#      Subkey fingerprint: CC63 D332 AB8F 4617 4529  6534 EB0B 4DFC 657E F670

* remotes/amit-migration/tags/migration-for-2.7-4:
  migration: rename functions to starting migrations
  migration: fix typos in qapi-schema from latest migration additions
  Postcopy: Check for support when setting the capability
  tests: fix libqtest socket timeouts
  test: Postcopy
  Postcopy: Add stats on page requests
  Migration: Split out ram part of qmp_query_migrate
  Postcopy: Avoid 0 length discards

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-16 10:53:33 +01:00
Daniel P. Berrange
22724f4921 migration: rename functions to starting migrations
Apply the following renames for starting incoming migration:

 process_incoming_migration -> migration_fd_process_incoming
 migration_set_incoming_channel -> migration_channel_process_incoming
 migration_tls_set_incoming_channel -> migration_tls_channel_process_incoming

and for starting outgoing migration:

 migration_set_outgoing_channel -> migration_channel_connect
 migration_tls_set_outgoing_channel -> migration_tls_channel_connect

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1464776234-9910-3-git-send-email-berrange@redhat.com
Message-Id: <1464776234-9910-3-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-16 09:51:37 +05:30
Daniel P. Berrange
bdbba12b6f migration: fix typos in qapi-schema from latest migration additions
Recent migration QAPI enhancements had a few spelling mistakes
and also incorrect version number in a few places.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1464776234-9910-2-git-send-email-berrange@redhat.com
Message-Id: <1464776234-9910-2-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-16 09:51:37 +05:30
Dr. David Alan Gilbert
096631bd95 Postcopy: Check for support when setting the capability
Knowing whether the destination host supports migration with
postcopy can be tricky.
The destination doesn't need the capability set, however
if we set it then use the opportunity to do the test and
tell the user/management layer early.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 1465816605-29488-7-git-send-email-dgilbert@redhat.com
Message-Id: <1465816605-29488-7-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-16 09:50:07 +05:30
Andrea Arcangeli
f5d4579178 tests: fix libqtest socket timeouts
I kept getting timeouts and unix socket accept failures under high
load, the patch fixes it.

Signed-off-by: Andrea Arcangeli <aarcange@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Message-id: 1465816605-29488-6-git-send-email-dgilbert@redhat.com
Message-Id: <1465816605-29488-6-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-16 09:50:07 +05:30
Dr. David Alan Gilbert
ea0c6d6239 test: Postcopy
This is a postcopy test (x86 only) that actually runs the guest
and checks the memory contents.

The test runs from an x86 boot block with the hex embedded in the test;
the source for this is:

...........

.code16
.org 0x7c00
	.file	"fill.s"
	.text
	.globl	start
	.type	start, @function
start:             # at 0x7c00 ?
        cli
        lgdt gdtdesc
        mov $1,%eax
        mov %eax,%cr0  # Protected mode enable
        data32 ljmp $8,$0x7c20

.org 0x7c20
.code32
        # A20 enable - not sure I actually need this
        inb $0x92,%al
        or  $2,%al
        outb %al, $0x92

        # set up DS for the whole of RAM (needed on KVM)
        mov $16,%eax
        mov %eax,%ds

        mov $65,%ax
        mov $0x3f8,%dx
        outb %al,%dx

        # bl keeps a counter so we limit the output speed
        mov $0, %bl
mainloop:
        # Start from 1MB
        mov $(1024*1024),%eax
innerloop:
        incb (%eax)
        add $4096,%eax
        cmp $(100*1024*1024),%eax
        jl innerloop

        inc %bl
        jnz mainloop

        mov $66,%ax
        mov $0x3f8,%dx
        outb %al,%dx

	jmp mainloop

        # GDT magic from old (GPLv2)  Grub startup.S
        .p2align        2       /* force 4-byte alignment */
gdt:
        .word   0, 0
        .byte   0, 0, 0, 0

        /* -- code segment --
         * base = 0x00000000, limit = 0xFFFFF (4 KiB Granularity), present
         * type = 32bit code execute/read, DPL = 0
         */
        .word   0xFFFF, 0
        .byte   0, 0x9A, 0xCF, 0

        /* -- data segment --
         * base = 0x00000000, limit 0xFFFFF (4 KiB Granularity), present
         * type = 32 bit data read/write, DPL = 0
         */
        .word   0xFFFF, 0
        .byte   0, 0x92, 0xCF, 0

gdtdesc:
        .word   0x27                    /* limit */
        .long   gdt                     /* addr */

/* I'm a bootable disk */
.org 0x7dfe
        .byte 0x55
        .byte 0xAA

...........

and that can be assembled by the following magic:
    as --32 -march=i486 fill.s -o fill.o
    objcopy -O binary fill.o fill.boot
    dd if=fill.boot of=bootsect bs=256 count=2 skip=124
    xxd -i bootsect

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Message-id: 1465816605-29488-5-git-send-email-dgilbert@redhat.com
Message-Id: <1465816605-29488-5-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-16 09:50:07 +05:30
Dr. David Alan Gilbert
d3bf5418e2 Postcopy: Add stats on page requests
On the source, add a count of page requests received from the
destination.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Message-id: 1465816605-29488-4-git-send-email-dgilbert@redhat.com
Message-Id: <1465816605-29488-4-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-16 09:50:07 +05:30
Dr. David Alan Gilbert
a22463a5dc Migration: Split out ram part of qmp_query_migrate
The RAM section of qmp_query_migrate is reasonably complex
and repeated 3 times.  Split it out into a helper.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1465816605-29488-3-git-send-email-dgilbert@redhat.com
Reviwed-by: Denis V. Lunev <den@openvz.org>
Message-Id: <1465816605-29488-3-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-16 09:50:07 +05:30
Dr. David Alan Gilbert
d688c62d09 Postcopy: Avoid 0 length discards
The discard code in migration/ram.c would send request for
zero length discards in the case where no discards were needed.
It doesn't appear to have had any bad effect.

Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Message-id: 1465816605-29488-2-git-send-email-dgilbert@redhat.com
Message-Id: <1465816605-29488-2-git-send-email-dgilbert@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-06-16 09:50:07 +05:30
Peter Maydell
5deaac15bf Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging
X86 queue, 2016-06-14 (v2)

# gpg: Signature made Tue 14 Jun 2016 22:29:04 BST
# gpg:                using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF  D1AA 2807 936F 984D C5A6

* remotes/ehabkost/tags/x86-pull-request:
  target-i386: Consolidate calls of object_property_parse() in x86_cpu_parse_featurestr
  target-i386: Use cpu_generic_init() in cpu_x86_init()
  target-i386: Move xcc->kvm_required check to realize time
  target-i386: Remove assert(kvm_enabled()) from host_x86_cpu_initfn()
  target-i386: Move features logic that requires CPUState to realize time
  target-i386: Remove xlevel & hv-spinlocks option fixups
  target-i386: Implement CPUID[0xB] (Extended Topology Enumeration)
  pc: Add 2.7 machine
  target-i386: add Skylake-Client cpu model

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-15 16:12:19 +01:00
Eduardo Habkost
f6750e959a target-i386: Consolidate calls of object_property_parse() in x86_cpu_parse_featurestr
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-06-14 16:17:09 -03:00
Igor Mammedov
a57d0163e7 target-i386: Use cpu_generic_init() in cpu_x86_init()
Now cpu_x86_init() does nothing more or less
than duplicating cpu_generic_init() logic.
So simplify it by using cpu_generic_init().

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-06-14 16:17:09 -03:00
Igor Mammedov
104494ea25 target-i386: Move xcc->kvm_required check to realize time
It will allow to drop custom cpu_x86_init() and use
cpu_generic_init() instead, reducing cpu_x86_create()
to a simple 3-liner.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-06-14 16:17:09 -03:00
Eduardo Habkost
e435601058 target-i386: Remove assert(kvm_enabled()) from host_x86_cpu_initfn()
The code will be changed to allow creation of the CPU object and
report kvm_required errors only at realizefn, so we need to make
the instance_init function more flexible.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-06-14 16:17:09 -03:00
Igor Mammedov
dc15c0517b target-i386: Move features logic that requires CPUState to realize time
Making x86_cpu_parse_featurestr() a pure convertor
of legacy feature string into global properties, needs
it to be called before a CPU instance is created so
parser shouldn't modify CPUState directly or access
it at all. Hence move current hack that directly pokes
into CPUState, to set/unset +-feats, from parser to
CPU's realize method.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-06-14 16:17:09 -03:00
Eduardo Habkost
c19b85216b target-i386: Remove xlevel & hv-spinlocks option fixups
The "fixup will be removed in future versions" warnings are
present since QEMU 1.7.0, at least, so users should have fixed
their scripts and configurations, already.

In the case of libvirt users, libvirt doesn't use the "xlevel"
option, and already rejects HyperV spinlock retry count < 0xFFF.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-06-14 16:17:08 -03:00
Radim Krčmář
5232d00a04 target-i386: Implement CPUID[0xB] (Extended Topology Enumeration)
I looked at a dozen Intel CPU that have this CPUID and all of them
always had Core offset as 1 (a wasted bit when hyperthreading is
disabled) and Package offset at least 4 (wasted bits at <= 4 cores).

QEMU uses more compact IDs and it doesn't make much sense to change it
now.  I keep the SMT and Core sub-leaves even if there is just one
thread/core;  it makes the code simpler and there should be no harm.

Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-06-14 16:17:08 -03:00
Igor Mammedov
d86c145114 pc: Add 2.7 machine
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-06-14 16:17:08 -03:00
Eduardo Habkost
f6f949e929 target-i386: add Skylake-Client cpu model
Introduce Skylake-Client cpu mode which inherits the features from
Broadwell and supports some additional features that are: MPX,
XSAVEC, and XGETBV1.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-06-14 16:17:08 -03:00
Peter Maydell
49237b856a Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20160614-tag' into staging
Xen 2016/06/14

# gpg: Signature made Tue 14 Jun 2016 16:01:52 BST
# gpg:                using RSA key 0x894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3  0AEA 894F 8F48 70E1 AE90

* remotes/sstabellini/tags/xen-20160614-tag:
  xen: Clean up includes
  xen/blkif: avoid double access to any shared ring request fields

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 16:32:32 +01:00
Peter Maydell
1be08a0946 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20160614-2' into staging
target-arm queue:
 * add PMU support for virt machine under KVM
 * fix reset and migration of TTBCR(S)
 * add virt-2.7 machine type
 * QOMify various ARM devices
 * implement xilinx DisplayPort device
 * don't permit ARMv8-only Neon insns to work on ARMv7

# gpg: Signature made Tue 14 Jun 2016 16:01:45 BST
# gpg:                using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83  15CF 3C25 25ED 1436 0CDE

* remotes/pmaydell/tags/pull-target-arm-20160614-2: (30 commits)
  target-arm: Don't permit ARMv8-only Neon insns on ARMv7
  arm: xlnx-zynqmp: Add xlnx-dp and xlnx-dpdma
  introduce xlnx-dp
  introduce xlnx-dpdma
  hw/i2c-ddc.c: Implement DDC I2C slave
  introduce dpcd module
  introduce aux-bus
  i2c: Factor our send() and recv() common logic
  i2c: implement broadcast write
  i2cbus: remove unused dev field
  hw/sd: QOM'ify pl181.c
  hw/dma: QOM'ify pxa2xx_dma.c
  hw/misc: QOM'ify mst_fpga.c
  hw/misc: QOM'ify exynos4210_pmu.c
  hw/misc: QOM'ify arm_l2x0.c
  hw/gpio: QOM'ify zaurus.c
  hw/gpio: QOM'ify pl061.c
  hw/gpio: QOM'ify omap_gpio.c
  hw/i2c: QOM'ify versatile_i2c.c
  hw/i2c: QOM'ify omap_i2c.c
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 16:04:25 +01:00
Peter Maydell
fe8fcf3d64 target-arm: Don't permit ARMv8-only Neon insns on ARMv7
The Neon instructions VCVTA, VCVTM, VCVTN, VCVTP, VRINTA, VRINTM,
VRINTN, VRINTP, VRINTX, and VRINTZ were only introduced with ARMv8,
so they need a guard to make them UNDEF if the CPU only supports ARMv7.
(We got this right for all the other new-in-v8 insns, but forgot
it for these Neon 2-reg-misc ops.)

Reported-by: Christophe Lyon <christophe.lyon@linaro.org>
Tested-by: Christophe Lyon <christophe.lyon@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1465492511-9333-1-git-send-email-peter.maydell@linaro.org
2016-06-14 16:01:03 +01:00
KONRAD Frederic
b93dbcdd59 arm: xlnx-zynqmp: Add xlnx-dp and xlnx-dpdma
This adds the DP and the DPDMA to the Zynq MP platform.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Peter Crosthwaite <peter.crosthwaite@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Tested-By: Hyun Kwon <hyun.kwon@xilinx.com>
Message-id: 1465833014-21982-10-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 16:01:03 +01:00
KONRAD Frederic
58ac482a66 introduce xlnx-dp
This is the implementation of the DisplayPort.
It has an aux-bus to access dpcd and edid.

Graphic plane is connected to the channel 3.
Video plane is connected to the channel 0.
Audio stream are connected to the channels 4 and 5.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Tested-By: Hyun Kwon <hyun.kwon@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1465833014-21982-9-git-send-email-fred.konrad@greensocs.com
[PMM: fixed format strings]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 16:01:03 +01:00
KONRAD Frederic
d3c6369a96 introduce xlnx-dpdma
This is the implementation of the DPDMA.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Tested-By: Hyun Kwon <hyun.kwon@xilinx.com>
Message-id: 1465833014-21982-8-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 16:01:03 +01:00
Peter Maydell
78c71af804 hw/i2c-ddc.c: Implement DDC I2C slave
Implement an I2C slave which implements DDC and returns the
EDID data for an attached monitor.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Tested-by: Hyun Kwon <hyun.kwon@xilinx.com>
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1465833014-21982-7-git-send-email-fred.konrad@greensocs.com

  - Rebased on the current master.
  - Modified for QOM.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Tested-By: Hyun Kwon <hyun.kwon@xilinx.com>
[PMM: actually wire up the vmstate to dc->vmsd]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:15 +01:00
KONRAD Frederic
e27ed1bdd3 introduce dpcd module
This introduces dpcd module.
It wires on a aux-bus and can be accessed by the driver to get lane-speed, etc.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Tested-By: Hyun Kwon <hyun.kwon@xilinx.com>
Message-id: 1465833014-21982-6-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:15 +01:00
KONRAD Frederic
6fc7f77fd2 introduce aux-bus
This introduces a new bus: aux-bus.

It contains an address space for aux slaves devices and a bridge to an I2C bus
for I2C through AUX transactions.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Tested-By: Hyun Kwon <hyun.kwon@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1465833014-21982-5-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:15 +01:00
Peter Crosthwaite
056fca7b51 i2c: Factor our send() and recv() common logic
Most of the control flow logic between send and recv (error checking
etc) is the same. Factor this out into a common send_recv() API.
This is then usable by clients, where the control logic for send
and receive differs only by a boolean. E.g.

if (send)
   i2c_send(...):
else
   i2c_recv(...);

becomes:

i2c_send_recv(... , send);

Signed-off-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1465833014-21982-4-git-send-email-fred.konrad@greensocs.com
Changes from FK:
  * Rebased on master.
  * Rebased on my i2c broadcast patch.
Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:14 +01:00
KONRAD Frederic
2293c27fad i2c: implement broadcast write
This does a write to every slaves when the I2C bus get a write to address 0.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Tested-By: Hyun Kwon <hyun.kwon@xilinx.com>
Message-id: 1465833014-21982-3-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:14 +01:00
KONRAD Frederic
a9d2f1d45f i2cbus: remove unused dev field
The dev field in i2cbus is not used.
So just drop it.

Signed-off-by: KONRAD Frederic <fred.konrad@greensocs.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Tested-By: Hyun Kwon <hyun.kwon@xilinx.com>
Message-id: 1465833014-21982-2-git-send-email-fred.konrad@greensocs.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:14 +01:00
xiaoqiang zhao
0d554cb043 hw/sd: QOM'ify pl181.c
split the old SysBus init function into an instance_init
and a Device realize function

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1465815255-21776-13-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:14 +01:00
xiaoqiang zhao
c9796d714c hw/dma: QOM'ify pxa2xx_dma.c
split the old SysBus init function into an instance_init
and a Device realize function

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1465815255-21776-12-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:14 +01:00
xiaoqiang zhao
e2d4f17e55 hw/misc: QOM'ify mst_fpga.c
Drop the old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1465815255-21776-11-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:14 +01:00
xiaoqiang zhao
b4ebbab9a1 hw/misc: QOM'ify exynos4210_pmu.c
Drop the old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1465815255-21776-10-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:14 +01:00
xiaoqiang zhao
da8060bfc0 hw/misc: QOM'ify arm_l2x0.c
Drop the old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1465815255-21776-9-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:14 +01:00
xiaoqiang zhao
5367766742 hw/gpio: QOM'ify zaurus.c
Drop the old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1465815255-21776-8-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:13 +01:00
xiaoqiang zhao
09e6fb3e36 hw/gpio: QOM'ify pl061.c
* Merge the pl061_initfn into pl061_init
* Drop the old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1465815255-21776-7-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:13 +01:00
xiaoqiang zhao
ebc116f8c3 hw/gpio: QOM'ify omap_gpio.c
* Split the old SysBus init into an instance_init and
  DeviceClass::realize function
* Drop the SysBus init function

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1465815255-21776-6-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:13 +01:00
xiaoqiang zhao
8ce26fcd29 hw/i2c: QOM'ify versatile_i2c.c
Drop the old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1465815255-21776-5-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:13 +01:00
xiaoqiang zhao
758aba7d73 hw/i2c: QOM'ify omap_i2c.c
* Split the omap_i2c_init into an instance_init and realize function
* Drop the old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1465815255-21776-4-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:13 +01:00
xiaoqiang zhao
93d6599f46 hw/i2c: QOM'ify exynos4210_i2c.c
* Rename the exynos4210_i2c_realize to exynos4210_i2c_init
* Drop the old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1465815255-21776-3-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:13 +01:00
xiaoqiang zhao
00b2f75870 hw/i2c: QOM'ify bitbang_i2c.c
Drop the old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1465815255-21776-2-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:13 +01:00
Andrew Jones
1287f2b340 hw/arm/virt: create the 2.7 machine type
Signed-off-by: Andrew Jones <drjones@redhat.com>
Message-id: 1465746713-30414-5-git-send-email-drjones@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:13 +01:00
Andrew Jones
3356ebce9e hw/arm/virt: introduce DEFINE_VIRT_MACHINE_AS_LATEST
Create two variants of DEFINE_VIRT_MACHINE. One, just called
DEFINE_VIRT_MACHINE, that does not set properties that only
the latest machine type should have, and another that does.
This will hopefully reduce potential for errors when adding
new versions.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Message-id: 1465746713-30414-4-git-send-email-drjones@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:12 +01:00
Andrew Jones
ab093c3c55 hw/arm/virt: introduce DEFINE_VIRT_MACHINE
Use DEFINE_VIRT_MACHINE to generate versioned machine type info.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1465746713-30414-3-git-send-email-drjones@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:12 +01:00
Andrew Jones
7a2ecd95d9 hw/arm/virt: separate versioned type-init code
Rename machvirt_info (which is specifically for 2.6 TypeInfo)
to machvirt_2_6_info, and separate the type registration of the
abstract machine type from the versioned type.

Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1465746713-30414-2-git-send-email-drjones@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:12 +01:00
Peter Maydell
811595a2d4 target-arm: Fix reset and migration of TTBCR(S)
Commit 6459b94c26 broke reset and migration of the AArch32
TTBCR(S) register if the guest used non-LPAE page tables. This is
because the AArch32 TTBCR register definition is marked as ARM_CP_ALIAS,
meaning that the AArch64 variant has to handle migration and reset.
Although AArch64 TCR_EL3 doesn't need to care about the mask and
base_mask fields, AArch32 may do so, and so we must use the special
TTBCR reset and raw write functions to ensure they are set correctly.

This doesn't affect TCR_EL2, because the AArch32 equivalent of that
is HTCR, which never uses the non-LPAE page table variant.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reported-by: Pranith Kumar <bobby.prani+qemu@gmail.com>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-id: 1465488181-31977-1-git-send-email-peter.maydell@linaro.org
2016-06-14 15:59:12 +01:00
Shannon Zhao
8433dee027 hw/arm/virt-acpi-build: Add PMU IRQ number in ACPI table
Add PMU IRQ number in ACPI table, then we can use PMU in guest through
ACPI.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1465267577-1808-4-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:12 +01:00
Shannon Zhao
01fe6b6076 hw/arm/virt: Add PMU node for virt machine
Add a virtual PMU device for virt machine while use PPI 7 for PMU
overflow interrupt number.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1465267577-1808-3-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:12 +01:00
Shannon Zhao
5c0a3819f0 target-arm: kvm64: set guest PMUv3 feature bit if supported
Check if kvm supports guest PMUv3. If so, set the corresponding feature
bit for vcpu.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1465267577-1808-2-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 15:59:12 +01:00
Peter Maydell
b1b23e5bbf xen: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
2016-06-14 15:37:43 +01:00
Peter Maydell
7474f1be70 qdev_try_create(): Assert that devices we put onto the system bus are SysBusDevices
If qdev_try_create() is passed NULL for the bus, it will automatically
put the newly created device onto the default system bus. However
if the device is not actually a SysBusDevice then this will result
in later crashes (for instance when running the monitor "info qtree"
command) because code reasonably assumes that all devices on the system
bus are system bus devices.

Generally the mistake is that the calling code should create the
object with object_new(TYPE_FOO) rather than qdev_create(NULL, TYPE_FOO);
see commit 6749695eaa for an example of fixing this bug.

Assert in qdev_try_create() if the device isn't suitable to put on
the system bus, so that this mistake results in failure earlier
and more reliably.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
2016-06-14 15:07:43 +01:00
Peter Maydell
d32490ca74 Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20160614' into staging
More s390x patches, this time mostly dealing with channel I/O:
Bugfixes and cleanups, and dequeue pending interrupts after
machine checks.

# gpg: Signature made Tue 14 Jun 2016 13:09:43 BST
# gpg:                using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0  18CE DECF 6B93 C6F0 2FAF

* remotes/cohuck/tags/s390x-20160614:
  s390x/kvm: Fixup interrupt type for non-adapter I/O interrupts
  s390x: Limit s390-ccw machines to 248 CPUs
  virtio-ccw: Provide traces for indicator changes
  s390x/css: introduce property type for device ids
  s390x/css: clear IO irqs when generating IPI CRW
  s390x/kvm: add interface for clearing IO irqs
  linux-headers: update

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 13:14:55 +01:00
Christian Borntraeger
393ad2a4a1 s390x/kvm: Fixup interrupt type for non-adapter I/O interrupts
The current algorithm for I/O interrupts would result in a wrong
interrupt type for subchannel numbers fffe and ffff. In addition
a non adapter interrupt might look like an adapter interrupt for
any subchannel number that has the 0x0400 bit set.

No kernel has ever used the type outside logging - and the logging
was wrong all the time. For everything else the kernel used the
interrupt parameters.

Let's use the KVM_S390_INT_IO macro as for adapter interrupts.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-06-14 14:00:05 +02:00
Christian Borntraeger
dcddc75e47 s390x: Limit s390-ccw machines to 248 CPUs
The sclp scp read info call fills in a buffer with information about the
system. With more than 248 CPUs we overflow the 4k buffer of the SCCB,
leading to random data corruption. Basically ALL guest operating systems
call scp read info, so let's limit the machines to 248 CPUs to make it
obvious that >=249 does not work.

As KVM also limits itself to 248 and TCG on s390 does not support
SMP, this should cause no regression for any user as no VMs with more
than 248 VCPUs were ever possible.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Boris Fiuczynski <fiuczy@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-06-14 14:00:05 +02:00
Christian Borntraeger
06409bd91b virtio-ccw: Provide traces for indicator changes
This allows to trace changes in the summary and queue indicators
for the non-irqfd case. For irqfd, kernel traces are needed instead.

Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-06-14 14:00:05 +02:00
Cornelia Huck
06e686eaab s390x/css: introduce property type for device ids
Let's introduce a CssDevId to handle device ids of the xx.x.xxxx
type used for channel devices. This has some benefits:

- We can use them in virtio-ccw and split the validity checks for
  a channel device id in general from the constraint checking
  within the virtio-ccw scope.
- We can reuse the device id type for future non-virtio channel
  devices.

While we're at it, improve the validity checks and disallow e.g.
trailing characters.

Suggested-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Acked-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-06-14 13:34:50 +02:00
Halil Pasic
c1755b14fa s390x/css: clear IO irqs when generating IPI CRW
According to the Principles of Operation (more precisely the subsection
'Channel-Report Word'), a subchannel put into the installed parameters
initialized state is in the same state as after an I/O system reset (just
parameters possibly changed). This implies that any I/O interrupts for that
subchannel are no longer pending (as I/O system resets clear I/O
interrupts). Therefore, we need an interface to clear pending I/O
interrupts. Make css_generate_sch_crws clear the pending IO interrupts for
the subchannel.

Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-06-14 13:34:50 +02:00
Halil Pasic
9eccb8622c s390x/kvm: add interface for clearing IO irqs
According to the platform specification, under certain conditions,
pending IO interruptions have to be cleared. Let's add an interface
for that.

Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-06-14 13:34:50 +02:00
Cornelia Huck
ff804f15a1 linux-headers: update
Update to 4.7-rc2.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-06-14 13:34:50 +02:00
Peter Maydell
a28aae041a Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160614' into staging
ppc patch queue for 2016-06-14

Latest patch queue for ppc.
    * Allow qemu to support a generic architecture 2.07 (POWER8-era)
      compatibility mode.  This is useful for guests which are POWER8
      aware, but don't know about the specific POWER8 variant that
      qemu (and/or KVM) is emulating. (Thomas Huth)
    * Fix a bug where macio wasn't removing DMA mappings (Mark Cave-Ayland)
    * Add a workaround for Linux guest's miscalculation of maximum
      memory address (including hotplugged memory), which could break
      when hotplug memory was combined with VFIO.  The previous
      approach was technically correct by spec, but differed from
      PowerVM's behaviour enough to trip a guest kernel bug.  This
      works around the bug, while remaining correct-to-spec. (Bharata Rao)

# gpg: Signature made Tue 14 Jun 2016 06:53:58 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.7-20160614:
  spapr: Ensure all LMBs are represented in ibm,dynamic-memory
  macio: call dma_memory_unmap() at the end of each DMA transfer
  Add PowerPC AT_HWCAP2 definitions
  ppc: Add PowerISA 2.07 compatibility mode
  ppc: Improve PCR bit selection in ppc_set_compat()
  ppc: Provide function to get CPU class of the host CPU
  ppc: Split pcr_mask settings into supported bits and the register mask
  ppc/spapr: Refactor h_client_architecture_support() CPU parsing code

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-14 09:30:05 +01:00
Bharata B Rao
d0e5a8f293 spapr: Ensure all LMBs are represented in ibm,dynamic-memory
Memory hotplug can fail for some combinations of RAM and maxmem when
DDW is enabled in the presence of devices like nec-usb-xhci. DDW depends
on maximum addressable memory returned by guest and this value is currently
being calculated wrongly by the guest kernel routine memory_hotplug_max().
While there is an attempt to fix the guest kernel, this patch works
around the problem within QEMU itself.

memory_hotplug_max() routine in the guest kernel arrives at max
addressable memory by multiplying lmb-size with the lmb-count obtained
from ibm,dynamic-memory property. There are two assumptions here:

- All LMBs are part of ibm,dynamic memory: This is not true for PowerKVM
  where only hot-pluggable LMBs are present in this property.
- The memory area comprising of RAM and hotplug region is contiguous: This
  needn't be true always for PowerKVM as there can be gap between
  boot time RAM and hotplug region.

To work around this guest kernel bug, ensure that ibm,dynamic-memory
has information about all the LMBs (RMA, boot-time LMBs, future
hotpluggable LMBs, and dummy LMBs to cover the gap between RAM and
hotpluggable region).

RMA is represented separately by memory@0 node. Hence mark RMA LMBs
and also the LMBs for the gap b/n RAM and hotpluggable region as
reserved and as having no valid DRC so that these LMBs are not considered
by the guest.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Nathan Fontenot <nfont@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-14 13:20:01 +10:00
Mark Cave-Ayland
bc9ca5958d macio: call dma_memory_unmap() at the end of each DMA transfer
This ensures that the underlying memory is marked dirty once the transfer
is complete and resolves cache coherency problems under MacOS 9.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-14 10:43:24 +10:00
Anton Blanchard
42bff4772e Add PowerPC AT_HWCAP2 definitions
We need the PPC_FEATURE2_HAS_HTM bit in a subsequent patch, so
add the PowerPC AT_HWCAP2 definitions.

Signed-off-by: Anton Blanchard <anton@samba.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-14 10:41:38 +10:00
Thomas Huth
b30ff227c2 ppc: Add PowerISA 2.07 compatibility mode
Make sure that guests can use the PowerISA 2.07 CPU sPAPR
compatibility mode when they request it and the target CPU
supports it.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-14 10:41:38 +10:00
Thomas Huth
eac4fba965 ppc: Improve PCR bit selection in ppc_set_compat()
When using an olderr PowerISA level, all the upper compatibility
bits have to be enabled, too. For example when we want to run
something in PowerISA 2.05 compatibility mode on POWER8, the bit
for 2.06 has to be set beside the bit for 2.05.
Additionally, to make sure that we do not set bits that are not
supported by the host, we apply a mask with the known-to-be-good
bits here, too.

Signed-off-by: Thomas Huth <thuth@redhat.com>
[dwg: Added some #ifs to fix compile on 32-bit targets]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-14 10:41:38 +10:00
Thomas Huth
52b2519c4e ppc: Provide function to get CPU class of the host CPU
When running with KVM, we might be interested in some details
of the host CPU class, too, so provide a function to get the
corresponding CPU class.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-14 10:41:38 +10:00
Thomas Huth
8cd2ce7aaa ppc: Split pcr_mask settings into supported bits and the register mask
The current pcr_mask values are ambiguous: Should these be the mask
that defines valid bits in the PCR register? Or should these rather
indicate which compatibility levels are possible? Anyway, POWER6 and
POWER7 should certainly not use the same values here. So let's
introduce an additional variable "pcr_supported" here which is
used to indicate the valid compatibility levels, and use pcr_mask
to signal the valid bits in the PCR register.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-14 10:41:38 +10:00
Thomas Huth
7386ae6372 ppc/spapr: Refactor h_client_architecture_support() CPU parsing code
The h_client_architecture_support() function has become quite big
and nested already. So factor out the code that takes care of the
sPAPR compatibility PVRs (which will be modified by the following
patches).

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-14 10:41:37 +10:00
Peter Maydell
2c96c379ac Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20160613-1' into staging
usb: misc fixes.

# gpg: Signature made Mon 13 Jun 2016 14:09:15 BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-usb-20160613-1:
  vl: Eliminate usb_enabled()
  pxa2xx: Unconditionally enable USB controller
  hw/usb/dev-network.c: Use ldl_le_p() and stl_le_p()
  usb-host: add special case for bus+addr

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-13 15:15:03 +01:00
Jan Beulich
4837a1a516 xen/blkif: avoid double access to any shared ring request fields
Commit f9e98e5d7a ("xen/blkif: Avoid double access to
src->nr_segments") didn't go far enough: src->operation is also being
used twice. And nothing was done to prevent the compiler from using the
source side of the copy done by blk_get_request() (granted that's very
unlikely).

Move the barrier()s up, and add another one to blk_get_request().

Note that for completing XSA-155, the barrier() getting added to
blk_get_request() would suffice, and hence the changes to xen_blkif.h
are more like just cleanup. And since, as said, the unpatched code
getting compiled to something vulnerable is very unlikely (and not
observed in practice), this isn't being viewed as a new security issue.

Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
2016-06-13 14:32:28 +01:00
Peter Maydell
55e5c3a2d2 Merge remote-tracking branch 'remotes/berrange/tags/qcrypto-next-2016-06-13-v1' into staging
Merge qcrypto-next 2016/06/13 v1

# gpg: Signature made Mon 13 Jun 2016 12:43:22 BST
# gpg:                using RSA key 0xBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg:                 aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E  8E3F BE86 EBB4 1510 4FDF

* remotes/berrange/tags/qcrypto-next-2016-06-13-v1:
  crypto: aes: always rename internal symbols
  crypto: assert that qcrypto_hash_digest_len is in range
  crypto: remove temp files on completion of secrets test
  TLS: provide slightly more information when TLS certificate loading fails

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-13 13:05:02 +01:00
Mike Frysinger
c8d70e5973 crypto: aes: always rename internal symbols
OpenSSL's libcrypto always defines AES symbols with the same names as
qemu's local aes code.  This is problematic when enabling at least curl
as that frequently also uses libcrypto.  It might not be noticed when
running, but if you try to statically link, everything falls down.

An example snippet:
  LINK  qemu-nbd
.../libcrypto.a(aes-x86_64.o): In function 'AES_encrypt':
(.text+0x460): multiple definition of 'AES_encrypt'
crypto/aes.o:aes.c:(.text+0x670): first defined here
.../libcrypto.a(aes-x86_64.o): In function 'AES_decrypt':
(.text+0x9f0): multiple definition of 'AES_decrypt'
crypto/aes.o:aes.c:(.text+0xb30): first defined here
.../libcrypto.a(aes-x86_64.o): In function 'AES_cbc_encrypt':
(.text+0xf90): multiple definition of 'AES_cbc_encrypt'
crypto/aes.o:aes.c:(.text+0xff0): first defined here
collect2: error: ld returned 1 exit status
.../qemu-2.6.0/rules.mak:105: recipe for target 'qemu-nbd' failed
make: *** [qemu-nbd] Error 1

The aes.h header has redefines already for FreeBSD, but go ahead and
enable that for everyone since there's no real good reason to not use
a namespace all the time.

Signed-off-by: Mike Frysinger <vapier@chromium.org>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-06-13 12:41:17 +01:00
Paolo Bonzini
b35c1f3361 crypto: assert that qcrypto_hash_digest_len is in range
Otherwise unintended results could happen.  For example,
Coverity reports a division by zero in qcrypto_afsplit_hash.
While this cannot really happen, it shows that the contract
of qcrypto_hash_digest_len can be improved.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-06-13 12:41:17 +01:00
Daniel P. Berrange
e7ed11f083 crypto: remove temp files on completion of secrets test
The secret object tests left some temporary files on disk
when completing. Ensure they are unlink, and rename them
to make it more obvious where they come from.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-06-13 12:41:17 +01:00
Alex Bligh
b7b68166dc TLS: provide slightly more information when TLS certificate loading fails
Give slightly more information when certification loading fails.
Rather than have no information, you now get gnutls's only slightly
less unhelpful error messages.

Signed-off-by: Alex Bligh <alex@alex.org.uk>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
2016-06-13 12:41:17 +01:00
Eduardo Habkost
4bcbe0b636 vl: Eliminate usb_enabled()
This wrapper for machine_usb(current_machine) is not necessary,
replace all usages of usb_enabled() with machine_usb().

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: qemu-arm@nongnu.org
Cc: qemu-ppc@nongnu.org
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1465419025-21519-3-git-send-email-ehabkost@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-06-13 13:24:41 +02:00
Eduardo Habkost
c92cfba822 pxa2xx: Unconditionally enable USB controller
Simplify initialization logic by removing the usb_enabled()
check. The USB controller is part of the SoC, so it doesn't make
sense to create a system where it is not present.

Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Andrzej Zaborowski <balrogg@gmail.com>
Cc: qemu-arm@nongnu.org,
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1465419025-21519-2-git-send-email-ehabkost@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-06-13 13:24:41 +02:00
Peter Maydell
ec9125bc0e hw/usb/dev-network.c: Use ldl_le_p() and stl_le_p()
Use stl_le_p() and ldl_le_p() to read and write data from
buffers, rather than using pointer casts and cpu_to_le32()
for writes and le32_to_cpup() for reads. This:
 * avoids lots of casts
 * works even if the buffer isn't as aligned as the host would like
 * avoids using the *_to_cpup() functions which we want to get rid of

Note that there may still be some places where a pointer from the
guest is cast to a pointer to a host structure; these would also
have to be changed for the device to work on a host CPU which
enforces alignment restrictions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1465573077-29221-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-06-13 13:18:51 +02:00
Peter Maydell
8fdf038722 Merge remote-tracking branch 'remotes/sstabellini/tags/xen-20160613-tag' into staging
Xen 2016/06/13

# gpg: Signature made Mon 13 Jun 2016 11:53:18 BST
# gpg:                using RSA key 0x894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3  0AEA 894F 8F48 70E1 AE90

* remotes/sstabellini/tags/xen-20160613-tag:
  Introduce "xen-load-devices-state"
  exec: Fix qemu_ram_block_from_host for Xen

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-13 12:18:17 +01:00
Gerd Hoffmann
e058fa2dd5 usb-host: add special case for bus+addr
This patch changes usb-host behavior in case we hostbus= and hostaddr=
properties are used to identify the usb device in question.  Instead of
adding the device to the hotplug watchlist we try to open directly using
the given bus number and device address.

Putting a device specified by hostaddr to the hotplug watchlist isn't
a great idea as the address isn't a fixed property.  It changes every
time the device is plugged in.  So considering this case as "use the
device at bus:addr _now_" is more sane.  Also usb-host will throw errors
in case it can't initialize the host device.

Note: For devices on the hotplug watchlist (hostport or vendorid or
productid specified) qemu continues to ignore errors and keeps
monitoring the usb bus to see if the device eventually shows up.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1464945175-28939-1-git-send-email-kraxel@redhat.com
2016-06-13 13:17:06 +02:00
Wen Congyang
88c16567d2 Introduce "xen-load-devices-state"
Introduce a "xen-load-devices-state" QAPI command that can be used to
load the state of all devices, but not the RAM or the block devices of
the VM.

We only have hmp commands savevm/loadvm, and qmp commands
xen-save-devices-state.

We use this new command for COLO:
1. suspend both primary vm and secondary vm
2. sync the state
3. resume both primary vm and secondary vm

In such case, we need to update all devices' state in any time.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
2016-06-13 11:50:53 +01:00
Anthony PERARD
d6b6aec409 exec: Fix qemu_ram_block_from_host for Xen
Since f615f39 (exec: remove ram_addr argument from
qemu_ram_block_from_host), migration under Xen is likely to fail, with a
SEGV of QEMU. But the commit only reveal a bug with the calculation of
the offset value in qemu_ram_block_from_host().

This patch calculates the offset from the ram_addr as
qemu_ram_addr_from_host() will later calculate the ram_addr from the
offset.

Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
2016-06-13 11:50:20 +01:00
Peter Maydell
da2fdd0bd1 Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20160611' into staging
TB hashing improvements

# gpg: Signature made Sun 12 Jun 2016 01:12:50 BST
# gpg:                using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC  16A4 AD12 70CC 4DD0 279B

* remotes/rth/tags/pull-tcg-20160611:
  translate-all: add tb hash bucket info to 'info jit' dump
  tb hash: track translated blocks with qht
  qht: add test-qht-par to invoke qht-bench from 'check' target
  qht: add qht-bench, a performance benchmark
  qht: add test program
  qht: QEMU's fast, resizable and scalable Hash Table
  qdist: add test program
  qdist: add module to represent frequency distributions of data
  tb hash: hash phys_pc, pc, and flags with xxhash
  exec: add tb_hash_func5, derived from xxhash
  qemu-thread: add simple test-and-set spinlock
  include/processor.h: define cpu_relax()
  seqlock: rename write_lock/unlock to write_begin/end
  seqlock: remove optional mutex
  compiler.h: add QEMU_ALIGNED() to enforce struct alignment

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-13 10:12:44 +01:00
Emilio G. Cota
329844d4bc translate-all: add tb hash bucket info to 'info jit' dump
Examples:

- Good hashing, i.e. tb_hash_func5(phys_pc, pc, flags):
TB count            715135/2684354
[...]
TB hash buckets     388775/524288 (74.15% head buckets used)
TB hash occupancy   33.04% avg chain occ. Histogram: [0,10)%|▆ █  ▅▁▃▁▁|[90,100]%
TB hash avg chain   1.017 buckets. Histogram: 1|█▁▁|3

- Not-so-good hashing, i.e. tb_hash_func5(phys_pc, pc, 0):
TB count            712636/2684354
[...]
TB hash buckets     344924/524288 (65.79% head buckets used)
TB hash occupancy   31.64% avg chain occ. Histogram: [0,10)%|█ ▆  ▅▁▃▁▂|[90,100]%
TB hash avg chain   1.047 buckets. Histogram: 1|█▁▁▁|4

- Bad hashing, i.e. tb_hash_func5(phys_pc, 0, 0):
TB count            702818/2684354
[...]
TB hash buckets     112741/524288 (21.50% head buckets used)
TB hash occupancy   10.15% avg chain occ. Histogram: [0,10)%|█ ▁  ▁▁▁▁▁|[90,100]%
TB hash avg chain   2.107 buckets. Histogram: [1.0,10.2)|█▁▁▁▁▁▁▁▁▁|[83.8,93.0]

- Good hashing, but no auto-resize:
TB count            715634/2684354
TB hash buckets     8192/8192 (100.00% head buckets used)
TB hash occupancy   98.30% avg chain occ. Histogram: [95.3,95.8)%|▁▁▃▄▃▄▁▇▁█|[99.5,100.0]%
TB hash avg chain   22.070 buckets. Histogram: [15.0,16.7)|▁▂▅▄█▅▁▁▁▁|[30.3,32.0]

Acked-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-16-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 17:11:16 -07:00
Emilio G. Cota
909eaac9bb tb hash: track translated blocks with qht
Having a fixed-size hash table for keeping track of all translation blocks
is suboptimal: some workloads are just too big or too small to get maximum
performance from the hash table. The MRU promotion policy helps improve
performance when the hash table is a little undersized, but it cannot
make up for severely undersized hash tables.

Furthermore, frequent MRU promotions result in writes that are a scalability
bottleneck. For scalability, lookups should only perform reads, not writes.
This is not a big deal for now, but it will become one once MTTCG matures.

The appended fixes these issues by using qht as the implementation of
the TB hash table. This solution is superior to other alternatives considered,
namely:

- master: implementation in QEMU before this patchset
- xxhash: before this patch, i.e. fixed buckets + xxhash hashing + MRU.
- xxhash-rcu: fixed buckets + xxhash + RCU list + MRU.
              MRU is implemented here by adding an intermediate struct
              that contains the u32 hash and a pointer to the TB; this
              allows us, on an MRU promotion, to copy said struct (that is not
              at the head), and put this new copy at the head. After a grace
              period, the original non-head struct can be eliminated, and
              after another grace period, freed.
- qht-fixed-nomru: fixed buckets + xxhash + qht without auto-resize +
                   no MRU for lookups; MRU for inserts.
The appended solution is the following:
- qht-dyn-nomru: dynamic number of buckets + xxhash + qht w/ auto-resize +
                 no MRU for lookups; MRU for inserts.

The plots below compare the considered solutions. The Y axis shows the
boot time (in seconds) of a debian jessie image with arm-softmmu; the X axis
sweeps the number of buckets (or initial number of buckets for qht-autoresize).
The plots in PNG format (and with errorbars) can be seen here:
  http://imgur.com/a/Awgnq

Each test runs 5 times, and the entire QEMU process is pinned to a
single core for repeatability of results.

                            Host: Intel Xeon E5-2690

  28 ++------------+-------------+-------------+-------------+------------++
     A*****        +             +             +             master **A*** +
  27 ++    *                                                 xxhash ##B###++
     |      A******A******                               xxhash-rcu $$C$$$ |
  26 C$$                  A******A******            qht-fixed-nomru*%%D%%%++
     D%%$$                              A******A******A*qht-dyn-mru A*E****A
  25 ++ %%$$                                          qht-dyn-nomru &&F&&&++
     B#####%                                                               |
  24 ++    #C$$$$$                                                        ++
     |      B###  $                                                        |
     |          ## C$$$$$$                                                 |
  23 ++           #       C$$$$$$                                         ++
     |             B######       C$$$$$$                                %%%D
  22 ++                  %B######       C$$$$$$C$$$$$$C$$$$$$C$$$$$$C$$$$$$C
     |                    D%%%%%%B######      @E@@@@@@    %%%D%%%@@@E@@@@@@E
  21 E@@@@@@E@@@@@@F&&&@@@E@@@&&&D%%%%%%B######B######B######B######B######B
     +             E@@@   F&&&   +      E@     +      F&&&   +             +
  20 ++------------+-------------+-------------+-------------+------------++
     14            16            18            20            22            24
                             log2 number of buckets

                                 Host: Intel i7-4790K

  14.5 ++------------+------------+-------------+------------+------------++
       A**           +            +             +            master **A*** +
    14 ++ **                                                 xxhash ##B###++
  13.5 ++   **                                           xxhash-rcu $$C$$$++
       |                                            qht-fixed-nomru %%D%%% |
    13 ++     A******                                   qht-dyn-mru @@E@@@++
       |             A*****A******A******             qht-dyn-nomru &&F&&& |
  12.5 C$$                               A******A******A*****A******    ***A
    12 ++ $$                                                        A***  ++
       D%%% $$                                                             |
  11.5 ++  %%                                                             ++
       B###  %C$$$$$$                                                      |
    11 ++  ## D%%%%% C$$$$$                                               ++
       |     #      %      C$$$$$$                                         |
  10.5 F&&&&&&B######D%%%%%       C$$$$$$C$$$$$$C$$$$$$C$$$$$C$$$$$$    $$$C
    10 E@@@@@@E@@@@@@B#####B######B######E@@@@@@E@@@%%%D%%%%%D%%%###B######B
       +             F&&          D%%%%%%B######B######B#####B###@@@D%%%   +
   9.5 ++------------+------------+-------------+------------+------------++
       14            16           18            20           22            24
                              log2 number of buckets

Note that the original point before this patch series is X=15 for "master";
the little sensitivity to the increased number of buckets is due to the
poor hashing function in master.

xxhash-rcu has significant overhead due to the constant churn of allocating
and deallocating intermediate structs for implementing MRU. An alternative
would be do consider failed lookups as "maybe not there", and then
acquire the external lock (tb_lock in this case) to really confirm that
there was indeed a failed lookup. This, however, would not be enough
to implement dynamic resizing--this is more complex: see
"Resizable, Scalable, Concurrent Hash Tables via Relativistic
Programming" by Triplett, McKenney and Walpole. This solution was
discarded due to the very coarse RCU read critical sections that we have
in MTTCG; resizing requires waiting for readers after every pointer update,
and resizes require many pointer updates, so this would quickly become
prohibitive.

qht-fixed-nomru shows that MRU promotion is advisable for undersized
hash tables.

However, qht-dyn-mru shows that MRU promotion is not important if the
hash table is properly sized: there is virtually no difference in
performance between qht-dyn-nomru and qht-dyn-mru.

Before this patch, we're at X=15 on "xxhash"; after this patch, we're at
X=15 @ qht-dyn-nomru. This patch thus matches the best performance that we
can achieve with optimum sizing of the hash table, while keeping the hash
table scalable for readers.

The improvement we get before and after this patch for booting debian jessie
with arm-softmmu is:

- Intel Xeon E5-2690: 10.5% less time
- Intel i7-4790K: 5.2% less time

We could get this same improvement _for this particular workload_ by
statically increasing the size of the hash table. But this would hurt
workloads that do not need a large hash table. The dynamic (upward)
resizing allows us to start small and enlarge the hash table as needed.

A quick note on downsizing: the table is resized back to 2**15 buckets
on every tb_flush; this makes sense because it is not guaranteed that the
table will reach the same number of TBs later on (e.g. most bootup code is
thrown away after boot); it makes sense to grow the hash table as
more code blocks are translated. This also avoids the complication of
having to build downsizing hysteresis logic into qht.

Reviewed-by: Sergey Fedorov <serge.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-15-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 17:11:16 -07:00
Emilio G. Cota
896a9ee967 qht: add test-qht-par to invoke qht-bench from 'check' target
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-14-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 17:11:16 -07:00
Emilio G. Cota
515864a0d7 qht: add qht-bench, a performance benchmark
This serves as a performance benchmark as well as a stress test
for QHT. We can tweak quite a number of things, including the
number of resize threads and how frequently resizes are triggered.

A performance comparison of QHT vs CLHT[1] and ck_hs[2] using
this same benchmark program can be found here:
  http://imgur.com/a/0Bms4

The tests are run on a 64-core AMD Opteron 6376, pinning threads
to cores favoring same-socket cores. For each run, qht-bench is
invoked with:
  $ tests/qht-bench -d $duration -n $n -u $u -g $range
, where $duration is in seconds, $n is the number of threads,
$u is the update rate (0.0 to 100.0), and $range is the number
of keys.

Note that ck_hs's performance drops significantly as writes go
up, since it requires an external lock (I used a ck_spinlock)
around every write.

Also, note that CLHT instead of using a seqlock, relies on an
allocator that does not ever return the same address during the
same read-critical section. This gives it a slight performance
advantage over QHT on read-heavy workloads, since the seqlock
writes aren't there.

[1] CLHT: https://github.com/LPD-EPFL/CLHT
          https://infoscience.epfl.ch/record/207109/files/ascy_asplos15.pdf

[2] ck_hs: http://concurrencykit.org/
           http://backtrace.io/blog/blog/2015/03/13/workload-specialization/

A few of those plots are shown in text here, since that site
might not be online forever. Throughput is on Mops/s on the Y axis.

                             200K keys, 0 % updates

  450 ++--+------+------+-------+-------+-------+-------+------+-------+--++
      |   +      +      +       +       +       +       +      +      +N+  |
  400 ++                                                           ---+E+ ++
      |                                                       +++----      |
  350 ++          9 ++------+------++                       --+E+    -+H+ ++
      |             |      +H+-     |                 -+N+----   ---- +++  |
  300 ++          8 ++     +E+     ++             -----+E+  --+H+         ++
      |             |      +++      |         -+N+-----+H+--               |
  250 ++          7 ++------+------++  +++-----+E+----                    ++
  200 ++                    1         -+E+-----+H+                        ++
      |                           ----                     qht +-E--+      |
  150 ++                      -+E+                        clht +-H--+     ++
      |                   ----                              ck +-N--+      |
  100 ++               +E+                                                ++
      |            ----                                                    |
   50 ++       -+E+                                                       ++
      |   +E+E+  +      +       +       +       +       +      +       +   |
    0 ++--E------+------+-------+-------+-------+-------+------+-------+--++
          1      8      16      24      32      40      48     56      64
                                Number of threads

                             200K keys, 1 % updates

  350 ++--+------+------+-------+-------+-------+-------+------+-------+--++
      |   +      +      +       +       +       +       +      +     -+E+  |
  300 ++                                                         -----+H+ ++
      |                                                       +E+--        |
      |           9 ++------+------++                  +++----             |
  250 ++            |      +E+   -- |                 -+E+                ++
      |           8 ++         --  ++             ----                     |
  200 ++            |      +++-     |  +++  ---+E+                        ++
      |           7 ++------N------++ -+E+--               qht +-E--+      |
      |                     1  +++----                    clht +-H--+      |
  150 ++                      -+E+                          ck +-N--+     ++
      |                   ----                                             |
  100 ++               +E+                                                ++
      |            ----                                                    |
      |        -+E+                                                        |
   50 ++    +H+-+N+----+N+-----+N+------                                  ++
      |   +E+E+  +      +       +      +N+-----+N+-----+N+----+N+-----+N+  |
    0 ++--E------+------+-------+-------+-------+-------+------+-------+--++
          1      8      16      24      32      40      48     56      64
                                Number of threads

                             200K keys, 20 % updates

  300 ++--+------+------+-------+-------+-------+-------+------+-------+--++
      |   +      +      +       +       +       +       +      +       +   |
      |                                                              -+H+  |
  250 ++                                                         ----     ++
      |           9 ++------+------++                       --+H+  ---+E+  |
      |           8 ++     +H+--   ++                 -+H+----+E+--        |
  200 ++            |      +E+    --|             -----+E+--  +++         ++
      |           7 ++      + ---- ++       ---+H+---- +++ qht +-E--+      |
  150 ++          6 ++------N------++ -+H+-----+E+        clht +-H--+     ++
      |                     1     -----+E+--                ck +-N--+      |
      |                       -+H+----                                     |
  100 ++                  -----+E+                                        ++
      |                +E+--                                               |
      |            ----+++                                                 |
   50 ++       -+E+                                                       ++
      |     +E+ +++                                                        |
      |   +E+N+-+N+-----+       +       +       +       +      +       +   |
    0 ++--E------+------N-------N-------N-------N-------N------N-------N--++
          1      8      16      24      32      40      48     56      64
                                Number of threads

                            200K keys, 100 % updates       qht +-E--+
                                                          clht +-H--+
  160 ++--+------+------+-------+-------+-------+-------+---ck-+-N-----+--++
      |   +      +      +       +       +       +       +      +   ----H   |
  140 ++                                                      +H+--  -+E+ ++
      |                                                +++----   ----      |
  120 ++          8 ++------+------++                 -+H+    +E+         ++
      |           7 ++     +H+---- ++             ---- +++----             |
  100 ++            |      +E+      |  +++  ---+H+    -+E+                ++
      |           6 ++     +++     ++ -+H+--   +++----                     |
   80 ++          5 ++------N----------+E+-----+E+                        ++
      |                     1 -+H+---- +++                                 |
      |                   -----+E+                                         |
   60 ++               +H+---- +++                                        ++
      |            ----+E+                                                 |
   40 ++        +H+----                                                   ++
      |       --+E+                                                        |
   20 ++    +E+                                                           ++
      |  +EE+    +      +       +       +       +       +      +       +   |
    0 ++--+N-N---N------N-------N-------N-------N-------N------N-------N--++
          1      8      16      24      32      40      48     56      64
                                Number of threads

Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-13-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 17:11:16 -07:00
Emilio G. Cota
1a95404fbd qht: add test program
Acked-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-12-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 23:10:20 +00:00
Emilio G. Cota
2e11264aaf qht: QEMU's fast, resizable and scalable Hash Table
This is a fast, scalable chained hash table with optional auto-resizing, allowing
reads that are concurrent with reads, and reads/writes that are concurrent
with writes to separate buckets.

A hash table with these features will be necessary for the scalability
of the ongoing MTTCG work; before those changes arrive we can already
benefit from the single-threaded speedup that qht also provides.

Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-11-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 23:10:20 +00:00
Emilio G. Cota
ff9249b733 qdist: add test program
Acked-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-10-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 23:10:19 +00:00
Emilio G. Cota
bf3afd5f41 qdist: add module to represent frequency distributions of data
Sometimes it is useful to have a quick histogram to represent a certain
distribution -- for example, when investigating a performance regression
in a hash table due to inadequate hashing.

The appended allows us to easily represent a distribution using Unicode
characters. Further, the data structure keeping track of the distribution
is so simple that obtaining its values for off-line processing is trivial.

Example, taking the last 10 commits to QEMU:

 Characters in commit title  Count
-----------------------------------
                         39      1
                         48      1
                         53      1
                         54      2
                         57      1
                         61      1
                         67      1
                         78      1
                         80      1
qdist_init(&dist);
qdist_inc(&dist, 39);
[...]
qdist_inc(&dist, 80);

char *str = qdist_pr(&dist, 9, QDIST_PR_LABELS);
// -> [39.0,43.6)▂▂ █▂ ▂ ▄[75.4,80.0]
g_free(str);

char *str = qdist_pr(&dist, 4, QDIST_PR_LABELS);
// -> [39.0,49.2)▁█▁▁[69.8,80.0]
g_free(str);

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-9-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 23:10:19 +00:00
Emilio G. Cota
42bd32287f tb hash: hash phys_pc, pc, and flags with xxhash
For some workloads such as arm bootup, tb_phys_hash is performance-critical.
The is due to the high frequency of accesses to the hash table, originated
by (frequent) TLB flushes that wipe out the cpu-private tb_jmp_cache's.
More info:
  https://lists.nongnu.org/archive/html/qemu-devel/2016-03/msg05098.html

To dig further into this I modified an arm image booting debian jessie to
immediately shut down after boot. Analysis revealed that quite a bit of time
is unnecessarily spent in tb_phys_hash: the cause is poor hashing that
results in very uneven loading of chains in the hash table's buckets;
the longest observed chain had ~550 elements.

The appended addresses this with two changes:

1) Use xxhash as the hash table's hash function. xxhash is a fast,
   high-quality hashing function.

2) Feed the hashing function with not just tb_phys, but also pc and flags.

This improves performance over using just tb_phys for hashing, since that
resulted in some hash buckets having many TB's, while others getting very few;
with these changes, the longest observed chain on a single hash bucket is
brought down from ~550 to ~40.

Tests show that the other element checked for in tb_find_physical,
cs_base, is always a match when tb_phys+pc+flags are a match,
so hashing cs_base is wasteful. It could be that this is an ARM-only
thing, though. UPDATE:
On Tue, Apr 05, 2016 at 08:41:43 -0700, Richard Henderson wrote:
> The cs_base field is only used by i386 (in 16-bit modes), and sparc (for a TB
> consisting of only a delay slot).
> It may well still turn out to be reasonable to ignore cs_base for hashing.

BTW, after this change the hash table should not be called "tb_hash_phys"
anymore; this is addressed later in this series.

This change gives consistent bootup time improvements. I tested two
host machines:
- Intel Xeon E5-2690: 11.6% less time
- Intel i7-4790K: 19.2% less time

Increasing the number of hash buckets yields further improvements. However,
using a larger, fixed number of buckets can degrade performance for other
workloads that do not translate as many blocks (600K+ for debian-jessie arm
bootup). This is dealt with later in this series.

Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-8-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 23:10:19 +00:00
Emilio G. Cota
dc8b295d05 exec: add tb_hash_func5, derived from xxhash
This will be used by upcoming changes for hashing the tb hash.

Add this into a separate file to include the copyright notice from
xxhash.

Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-7-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 23:10:18 +00:00
Guillaume Delbergue
ac9a9eba1e qemu-thread: add simple test-and-set spinlock
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Signed-off-by: Guillaume Delbergue <guillaume.delbergue@greensocs.com>
[Rewritten. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Emilio's additions: use TAS instead of atomic_xchg; emit acquire/release
 barriers; return bool from trylock; call cpu_relax() while spinning;
 optimize for uncontended locks by acquiring the lock with TAS instead
 of TATAS; add qemu_spin_locked().]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-6-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 23:10:18 +00:00
Emilio G. Cota
462cda505f include/processor.h: define cpu_relax()
Taken from the linux kernel.

Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-5-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 23:10:17 +00:00
Emilio G. Cota
03719e44b6 seqlock: rename write_lock/unlock to write_begin/end
It is a more appropriate name, now that the mutex embedded
in the seqlock is gone.

Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-4-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 22:59:34 +00:00
Emilio G. Cota
ccdb3c1fc8 seqlock: remove optional mutex
This option is unused; besides, it bloats the struct when not needed.
Let's just let writers define their own locks elsewhere.

Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-3-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 22:59:34 +00:00
Emilio G. Cota
911a4d2215 compiler.h: add QEMU_ALIGNED() to enforce struct alignment
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1465412133-3029-2-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-11 22:59:33 +00:00
Peter Maydell
a93c1bdf0b Merge remote-tracking branch 'remotes/kraxel/tags/pull-ui-20160610-1' into staging
ui: misc bug fixes.

# gpg: Signature made Fri 10 Jun 2016 10:56:06 BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901  FE7D 4CB6 D8EE D3E8 7138

* remotes/kraxel/tags/pull-ui-20160610-1:
  console: ignore ui_info updates which don't actually update something
  ui/console-gl: Add support for big endian display surfaces
  gtk: fix vte version check
  ui: fix regression in printing VNC host/port on startup
  vnc: drop unused depth arg for set_pixel_format

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-10 15:47:17 +01:00
Gerd Hoffmann
1185fde40c console: ignore ui_info updates which don't actually update something
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1464597673-26464-1-git-send-email-kraxel@redhat.com
2016-06-10 11:16:18 +02:00
Thomas Huth
2c2311c545 ui/console-gl: Add support for big endian display surfaces
This is required for running QEMU on big endian hosts (like
PowerPC machines) that use RGB instead of BGR byte ordering.

Ticket: https://bugs.launchpad.net/qemu/+bug/1581796
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1465243261-26731-1-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-06-10 11:13:59 +02:00
Olaf Hering
4d5942332f gtk: fix vte version check
vte_terminal_set_encoding takes 3 args since 0.38.0.
This fixes commit fba958c6 ("gtk: implement set_echo")

Signed-off-by: Olaf Hering <olaf@aepfle.de>
Message-id: 20160608214352.32669-1-olaf@aepfle.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-06-10 11:13:15 +02:00
Daniel P. Berrange
83cf07b0b5 ui: fix regression in printing VNC host/port on startup
If VNC is chosen as the compile time default display backend,
QEMU will print the host/port it listens on at startup.
Previously this would look like

  VNC server running on '::1:5900'

but in 04d2529da2 the ':' was
accidentally replaced with a ';'. This the ':' back.

Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1465382576-25552-1-git-send-email-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-06-10 11:08:39 +02:00
Gerd Hoffmann
ec9fb41a9f vnc: drop unused depth arg for set_pixel_format
Spotted by Coverity.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1465204725-31562-1-git-send-email-kraxel@redhat.com
2016-06-10 11:08:19 +02:00
Peter Maydell
0c33682d5f target-i386: Move user-mode exception actions out of user-exec.c
The exception_action() function in user-exec.c is just a call to
cpu_loop_exit() for every target CPU except i386.  Since this
function is only called if the target's handle_mmu_fault() hook has
indicated an MMU fault, and that hook is only called from the
handle_cpu_signal() code path, we can simply move the x86-specific
setup into that hook, which allows us to remove the TARGET_I386
ifdef from user-exec.c.

Of the actions that were done by the call to raise_interrupt_err():
 * cpu_svm_check_intercept_param() is a no-op in user mode
 * check_exception() is a no-op since double faults are impossible
   for user-mode
 * assignments to cs->exception_index and env->error_code are no-ops
 * assigning to env->exception_next_eip is unnecessary because it
   is not used unless env->exception_is_int is true
 * cpu_loop_exit_restore() is equivalent to cpu_loop_exit() since
   pc is 0
which leaves just setting env_>exception_is_int as the action that
needs to be added to x86_cpu_handle_mmu_fault().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Message-id: 1463494687-25947-7-git-send-email-peter.maydell@linaro.org
2016-06-09 15:55:02 +01:00
Peter Maydell
3327182332 target-i386: Add comment about do_interrupt_user() next_eip argument
Add a comment to do_interrupt_user() along the same lines as the
existing one for do_interrupt_all() noting that the next_eip
argument is not used unless is_int is true or intno is EXCP_SYSCALL.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Message-id: 1463494687-25947-6-git-send-email-peter.maydell@linaro.org
2016-06-09 15:55:02 +01:00
Peter Maydell
a5852dc5de user-exec: Don't reextract sigmask from usercontext pointer
Extracting the old signal mask from the usercontext pointer passed to
a signal handler is a pain because it is OS and CPU dependent.
Since we've already done it once and passed it to handle_cpu_signal(),
there's no need to do it again in cpu_exit_tb_from_sighandler().
This then means we don't need to pass a usercontext pointer in to
handle_cpu_signal() at all.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Message-id: 1463494687-25947-5-git-send-email-peter.maydell@linaro.org
2016-06-09 15:55:02 +01:00
Peter Maydell
6886b98036 cpu-exec: Rename cpu_resume_from_signal() to cpu_loop_exit_noexc()
The function cpu_resume_from_signal() is now always called with a
NULL puc argument, and is rather misnamed since it is never called
from a signal handler. It is essentially forcing an exit to the
top level cpu loop but without raising any exception, so rename
it to cpu_loop_exit_noexc() and drop the useless unused argument.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Message-id: 1463494687-25947-4-git-send-email-peter.maydell@linaro.org
2016-06-09 15:55:02 +01:00
Peter Maydell
f213e72f23 user-exec: Push resume-from-signal code out to handle_cpu_signal()
Since the only caller of page_unprotect() which might cause it to
need to call cpu_resume_from_signal() is handle_cpu_signal() in
the user-mode code, push the longjump handling out to that function.

Since this is the only caller of cpu_resume_from_signal() which
passes a non-NULL puc argument, split the non-NULL handling into
a new cpu_exit_tb_from_sighandler() function. This allows us
to merge the softmmu and usermode implementations of the
cpu_resume_from_signal() function, which are now identical.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Message-id: 1463494687-25947-3-git-send-email-peter.maydell@linaro.org
2016-06-09 15:55:02 +01:00
Peter Maydell
75809229bb translate-all.c: Don't pass puc, locked to tb_invalidate_phys_page()
The user-mode-only function tb_invalidate_phys_page() is only
called from two places:
 * page_unprotect(), which passes in a non-zero pc, a puc pointer
   and the value 'true' for the locked argument
 * page_set_flags(), which passes in a zero pc, a NULL puc pointer
   and a 'false' locked argument

If the pc is non-zero then we may call cpu_resume_from_signal(),
which does a longjmp out of the calling code (and out of the
signal handler); this is to cover the case of a target CPU with
"precise self-modifying code" (currently only x86) executing
a store instruction which modifies code in the same TB as the
store itself. Rather than doing the longjump directly here,
return a flag to the caller which indicates whether the current
TB was modified, and move the longjump to page_unprotect.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Riku Voipio <riku.voipio@linaro.org>
Message-id: 1463494687-25947-2-git-send-email-peter.maydell@linaro.org
2016-06-09 15:55:02 +01:00
xiaoqiang zhao
9bbbf6497a hw/arm: virt uart fix
commit f0d1d2c115
("hw/char: QOM'ify pl011 model") break qemu-system-arm virt machine
if option '-machine secure=on' is provided.

The function create_uart is called twice. So make CharDriverState pointer
a parameter to create_uart instead of hardcoded.

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Tested-by: Jerome Forissier <jerome.forissier@linaro.org>
Message-id: 1465353045-26323-1-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-08 19:41:44 +01:00
Peter Maydell
b66e10e4c9 Merge remote-tracking branch 'remotes/riku/tags/pull-linux-user-20160608' into staging
linux-user pull request for June 2016

# gpg: Signature made Wed 08 Jun 2016 14:27:14 BST
# gpg:                using RSA key 0xB44890DEDE3C9BC0
# gpg: Good signature from "Riku Voipio <riku.voipio@iki.fi>"
# gpg:                 aka "Riku Voipio <riku.voipio@linaro.org>"

* remotes/riku/tags/pull-linux-user-20160608: (44 commits)
  linux-user: In fork_end(), remove correct CPUs from CPU list
  linux-user: Special-case ERESTARTSYS in target_strerror()
  linux-user: Make target_strerror() return 'const char *'
  linux-user: Correct signedness of target_flock l_start and l_len fields
  linux-user: Use safe_syscall wrapper for ioctl
  linux-user: Use safe_syscall wrapper for accept and accept4 syscalls
  linux-user: Use safe_syscall wrapper for semop
  linux-user: Use safe_syscall wrapper for epoll_wait syscalls
  linux-user: Use safe_syscall wrapper for poll and ppoll syscalls
  linux-user: Use safe_syscall wrapper for sleep syscalls
  linux-user: Use safe_syscall wrapper for rt_sigtimedwait syscall
  linux-user: Use safe_syscall wrapper for flock
  linux-user: Use safe_syscall wrapper for mq_timedsend and mq_timedreceive
  linux-user: Use safe_syscall wrapper for msgsnd and msgrcv
  linux-user: Use safe_syscall wrapper for send* and recv* syscalls
  linux-user: Use safe_syscall wrapper for connect syscall
  linux-user: Use safe_syscall wrapper for readv and writev syscalls
  linux-user: Fix error conversion in 64-bit fadvise syscall
  linux-user: Fix NR_fadvise64 and NR_fadvise64_64 for 32-bit guests
  linux-user: Fix handling of arm_fadvise64_64 syscall
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>

Conflicts:
	configure
	scripts/qemu-binfmt-conf.sh
2016-06-08 18:34:32 +01:00
Paolo Bonzini
23654d45f8 .travis.yml: disable Sparse testing
On travis-ci.org, all builds fail with
   /usr/include/features.h:324:11: error: unable to open bits/predefs.h

With "make docker-travis@ubuntu", they fail with
   /usr/include/features.h:374:13: error: unable to open sys/cdefs.h

With "make docker-travis@fedora", finally, they fail due to sparse
not being able to parse some #pragmas in glib headers.  Just kill
the thing from the CI builds.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[AJB: tweak title for my OCD]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2016-06-08 17:47:22 +01:00
Alex Bennée
4adb05d8bd .travis.yml: add trusty GCE target
If we want to run our docker based tests we'll need to do them on a
normal VM with docker support. Lets just enable the build on trusty for
now to check against a newer Ubuntu.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2016-06-08 17:47:14 +01:00
Stefan Hajnoczi
4ca94085f1 .travis.yml: add libnfs-dev for NFS block driver
Let's ensure that block/nfs.o is built in Travis.

This patch depends on the following build fixes:
1. block/nfs: add missing #include "qapi/error.h"
2. block/nfs: add missing #include "qemu/cutils.h"

This patch also depends on Travis adding libnfs-dev to the list of
approved packages.  This patch can be safely committed but will not do
anything until the Travis maintainers allow libnfs-dev to be installed.
Please see the GitHub Issue I raised here:
https://github.com/travis-ci/apt-package-whitelist/issues/2788

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
2016-06-08 17:47:10 +01:00
Peter Maydell
6f50f25c82 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches

# gpg: Signature made Wed 08 Jun 2016 09:31:38 BST
# gpg:                using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (31 commits)
  qemu-img bench: Add --flush-interval
  qemu-img bench: Implement -S (step size)
  qemu-img bench: Make start offset configurable
  qemu-img bench: Sequential writes
  qemu-img bench
  block: Don't emulate natively supported pwritev flags
  blockdev: clean up error handling in do_open_tray
  block: Fix bdrv_all_delete_snapshot() error handling
  qcow2: avoid extra flushes in qcow2
  raw-posix: Fetch max sectors for host block device
  block: assert that bs->request_alignment is a power of 2
  migration/block: Convert saving to BlockBackend
  migration/block: Convert load to BlockBackend
  block: Kill bdrv_co_write_zeroes()
  vmdk: Convert to bdrv_co_pwrite_zeroes()
  raw_bsd: Convert to bdrv_co_pwrite_zeroes()
  raw-posix: Convert to bdrv_co_pwrite_zeroes()
  qed: Convert to bdrv_co_pwrite_zeroes()
  gluster: Convert to bdrv_co_pwrite_zeroes()
  blkreplay: Convert to bdrv_co_pwrite_zeroes()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-08 17:17:16 +01:00
Peter Maydell
d36ebffe94 Merge remote-tracking branch 'remotes/famz/tags/pull-docker-20160608' into staging
Docker testing fixes by Paolo.

# gpg: Signature made Wed 08 Jun 2016 08:20:54 BST
# gpg:                using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021  AD56 CA35 624C 6A91 71C6

* remotes/famz/tags/pull-docker-20160608:
  tests/docker: build all targets in test-clang
  tests/docker: support travis test with fedora image
  tests/docker: remove unused feature "ccache"
  tests/docker: fix test-mingw
  tests/docker: make test-full build all targets, not none
  tests/docker: fix make-archive-maybe

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-08 16:31:53 +01:00
Peter Maydell
c1a3b8b745 Merge remote-tracking branch 'remotes/mdroth/tags/qga-pull-2016-07-07-tag' into staging
qemu-ga patch queue

* add unit tests for guest-exec command set

# gpg: Signature made Tue 07 Jun 2016 21:43:33 BST
# gpg:                using RSA key 0x3353C9CEF108B584
# gpg: Good signature from "Michael Roth <flukshun@gmail.com>"
# gpg:                 aka "Michael Roth <mdroth@utexas.edu>"
# gpg:                 aka "Michael Roth <mdroth@linux.vnet.ibm.com>"

* remotes/mdroth/tags/qga-pull-2016-07-07-tag:
  tests: start a /qga/guest-exec test

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-08 16:04:52 +01:00
Peter Maydell
c503a85599 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* max-ram-below-4g improvement (Gerd)
* escc fix (xiaoqiang)
* ESP fix (Prasad)
* scsi-disk tweaks/fix (me)
* Makefile dependency fixes (me)
* PKGVERSION improvement (Fam)
* -vnc man improvement (Robert)

# gpg: Signature made Tue 07 Jun 2016 18:06:22 BST
# gpg:                using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream:
  vnc: list the 'to' parameter of '-vnc' in the qemu man page
  scsi-disk: add missing break
  Makefile: Derive "PKGVERSION" from "git describe" by default
  Makefile: add dependency on scripts/hxtool
  Makefile: add dependency on scripts/make_device_config.sh
  Makefile: add dependency on scripts/create_config
  Makefile: Add a "FORCE" target
  scsi: megasas: null terminate bios version buffer
  scsi: mark TYPE_SCSI_DISK_BASE as abstract
  scsi: esp: check TI buffer index before read/write
  hw/char: QOM'ify escc.c (fix)
  pc: allow raising low memory via max-ram-below-4g option
  tests: Rename tests/Makefile to tests/Makefile.include

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-08 14:45:28 +01:00
Peter Maydell
014628a705 linux-user: In fork_end(), remove correct CPUs from CPU list
In fork_end(), we must fix the list of current CPUs to match the fact
that the child of the fork has only one thread. Unfortunately we were
removing the wrong CPUs from the list, which meant that if the child
subsequently did an exclusive operation it would deadlock in
start_exclusive() waiting for a sibling CPU which didn't exist.

In particular this could cause hangs doing git submodule init
operations, as reported in https://bugs.launchpad.net/qemu/+bug/955379
comment #47.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 12:06:57 +03:00
Peter Maydell
da2a34f7f9 linux-user: Special-case ERESTARTSYS in target_strerror()
Since TARGET_ERESTARTSYS and TARGET_ESIGRETURN are internal-to-QEMU
error numbers, handle them specially in target_strerror(), to avoid
confusing strace output like:

9521 rt_sigreturn(14,8,274886297808,8,0,268435456) = -1 errno=513 (Unknown error 513)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 12:06:57 +03:00
Peter Maydell
7dcdaeafe0 linux-user: Make target_strerror() return 'const char *'
Make target_strerror() return 'const char *' rather than just 'char *';
this will allow us to return constant strings from it for some special
cases.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
2016-06-08 12:06:57 +03:00
Peter Maydell
8efb2ed5ec linux-user: Correct signedness of target_flock l_start and l_len fields
The l_start and l_len fields in the various target_flock structures are
supposed to be '__kernel_off_t' or '__kernel_loff_t', which means they
should be signed, not unsigned. Correcting the structure definitions means
that __get_user() and __put_user() will correctly sign extend them if
the guest is using 32 bit offsets and the host is using 64 bit offsets.

This fixes failures in the LTP 'fcntl14' tests where it checks that
negative seek offsets work correctly.

We reindent the structures to drop hard tabs since we're touching 40%
of the fields anyway.

RV: long long -> abi_llong as suggested by Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 12:06:50 +03:00
Kevin Wolf
55d539c8f7 qemu-img bench: Add --flush-interval
This options allows to flush the image periodically during write tests.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-08 10:21:09 +02:00
Kevin Wolf
83de9be0dc qemu-img bench: Implement -S (step size)
With this new option, qemu-img bench can be told to advance the current
offset after each request by a different value than the buffer size.
This is useful for controlling the conditions for cluster allocation in
image formats (e.g. qcow2 cluster allocation with COW in front of the
request, or COW areas that aren't overwritten immediately).

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-08 10:21:09 +02:00
Kevin Wolf
d3199a31c7 qemu-img bench: Make start offset configurable
This patch adds an option the specify the offset of the first request
made by qemu-img bench. This allows to benchmark misaligned requests.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-08 10:21:09 +02:00
Kevin Wolf
b6495fa849 qemu-img bench: Sequential writes
This extends qemu-img bench with an option that makes it use sequential
writes instead of reads for the test run.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-08 10:21:09 +02:00
Kevin Wolf
b6133b8c68 qemu-img bench
This adds a qemu-img command that allows doing some simple benchmarks
for the block layer without involving guest devices and a real VM.

For the start, this implements only a test of sequential reads.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-08 10:21:09 +02:00
Kevin Wolf
515c2f431e block: Don't emulate natively supported pwritev flags
Drivers that implement .bdrv_co_pwritev() get the flags passed as an
argument to said function, but we also unconditionally emulate the flags
anyway. We shouldn't do that.

Fix this by clearing all flags that the driver supports natively after
it returns from .bdrv_co_pwritev().

Fixes: 4df863f3 ('block: Make supported_write_flags a per-bds property')
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-06-08 10:21:09 +02:00
Colin Lord
bf18bee547 blockdev: clean up error handling in do_open_tray
Returns negative error codes and accompanying error messages in cases where
the device has no tray or the tray is locked and isn't forced open. This
extra information should result in better flexibility in functions that
call do_open_tray.

Suggested by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Colin Lord <clord@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:09 +02:00
Kevin Wolf
2a9170bcd4 block: Fix bdrv_all_delete_snapshot() error handling
The code to exit the loop after bdrv_snapshot_delete_by_id_or_name()
returned failure was duplicated. The first copy of it was too early so
that the AioContext lock would not be freed. This patch removes it so
that only the second, correct copy remains.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-08 10:21:09 +02:00
Denis V. Lunev
f3c3b87dae qcow2: avoid extra flushes in qcow2
The problem with excessive flushing was found by a couple of performance
tests:
  - parallel directory tree creation (from 2 processes)
  - 32 cached writes + fsync at the end in a loop

For the first one results improved from 2.6 loops/sec to 3.5 loops/sec.
Each loop creates 10^3 directories with 10 files in each.

For the second one results improved from ~600 fsync/sec to ~1100
fsync/sec. Though, it was run on SSD so it probably won't show such
performance gain on rotational media.

qcow2_cache_flush() calls bdrv_flush() unconditionally after writing
cache entries of a particular cache. This can lead to as many as
2 additional fdatasyncs inside bdrv_flush.

We can simply skip all fdatasync calls inside qcow2_co_flush_to_os
as bdrv_flush for sure will do the job. These flushes are necessary to
keep the right order of writes to the different caches. Though this is
not necessary in the current code base as this ordering is ensured through
the flush in qcow2_cache_flush_dependency().

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Pavel Borzenkov <pborzenkov@virtuozzo.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:09 +02:00
Fam Zheng
6f6071745b raw-posix: Fetch max sectors for host block device
This is sometimes a useful value we should count in.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:09 +02:00
Peter Lieven
107d433cbb block: assert that bs->request_alignment is a power of 2
at least bdrv_co_preadv/pwritev expect this.

Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:09 +02:00
Kevin Wolf
ebd2f9e7db migration/block: Convert saving to BlockBackend
This creates a new BlockBackend for copying data from an images to the
migration stream on the source host. All I/O for block migration goes
through BlockBackend now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-08 10:21:08 +02:00
Kevin Wolf
ad2964b4ff migration/block: Convert load to BlockBackend
This converts the loading part of block migration to use BlockBackend
interfaces rather than accessing the BlockDriverState directly.

Note that this takes a lazy shortcut. We should really use a separate
BlockBackend that is configured for the migration rather than for the
guest (e.g. writethrough caching is unnecessary) and holds its own
reference to the BlockDriverState, but the impact isn't that big and we
didn't have a separate migration reference before either, so it must be
good enough, I guess...

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-08 10:21:08 +02:00
Eric Blake
c1499a5e73 block: Kill bdrv_co_write_zeroes()
Now that all drivers have been converted to a byte interface,
we no longer need a sector interface.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:08 +02:00
Eric Blake
a620f2ae15 vmdk: Convert to bdrv_co_pwrite_zeroes()
Another step on our continuing quest to switch to byte-based
interfaces.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:08 +02:00
Eric Blake
39ad937e16 raw_bsd: Convert to bdrv_co_pwrite_zeroes()
Another step on our continuing quest to switch to byte-based
interfaces.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:08 +02:00
Eric Blake
2ffa76c2bf raw-posix: Convert to bdrv_co_pwrite_zeroes()
Another step on our continuing quest to switch to byte-based
interfaces.

Signed-off-by: Eric Blake <eblake@redhat.com>
[ kwolf: Fixed up trace_paio_submit_co() call for qiov == NULL ]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:08 +02:00
Eric Blake
49a2e48348 qed: Convert to bdrv_co_pwrite_zeroes()
Another step on our continuing quest to switch to byte-based
interfaces.

Kill an abuse of the comma operator while at it (fortunately,
the semantics were still right).  Also, the test for requests
not aligned to clusters should be applied always, not just
when a backing file is present.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:08 +02:00
Eric Blake
e88a36ebad gluster: Convert to bdrv_co_pwrite_zeroes()
Another step on our continuing quest to switch to byte-based
interfaces.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:08 +02:00
Eric Blake
9c21a4220b blkreplay: Convert to bdrv_co_pwrite_zeroes()
Another step on our continuing quest to switch to byte-based
interfaces.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:08 +02:00
Eric Blake
5544b59f8e qcow2: Convert to bdrv_co_pwrite_zeroes()
Another step on our continuing quest to switch to byte-based
interfaces.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:08 +02:00
Eric Blake
94d047a35b iscsi: Convert to bdrv_co_pwrite_zeroes()
Another step on our continuing quest to switch to byte-based
interfaces.

As this is the first byte-based iscsi interface, convert
is_request_lun_aligned() into two versions, one for sectors
and one for bytes.  Also, change from outright -EINVAL failure
on an unaligned request, to instead failing with -ENOTSUP to
trigger a read-modify-write fallback, particularly since the
block layer should be honoring bs->request_alignment to avoid
-EINVAL on read/write requests.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:08 +02:00
Eric Blake
74021bc497 block: Switch bdrv_write_zeroes() to byte interface
Rename to bdrv_pwrite_zeroes() to let the compiler ensure we
cater to the updated semantics.  Do the same for bdrv_co_write_zeroes().

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:08 +02:00
Eric Blake
d05aa8bb4a block: Add .bdrv_co_pwrite_zeroes()
Update bdrv_co_do_write_zeroes() to be byte-based, and select
between the new byte-based bdrv_co_pwrite_zeroes() or the old
bdrv_co_write_zeroes().  The next patches will convert drivers,
then remove the old interface.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:08 +02:00
Eric Blake
cf081fca4e block: Track write zero limits in bytes
Another step towards removing sector-based interfaces: convert
the maximum write and minimum alignment values from sectors to
bytes.  Rename the variables to let the compiler check that all
users are converted to the new semantics.

The maximum remains an int as long as BDRV_REQUEST_MAX_SECTORS
is constrained by INT_MAX (this means that we can't even
support a 2G write_zeroes, but just under it) - changing
operation lengths to unsigned or to 64-bits is a much bigger
audit, and debatable if we even want to do it (since at the
core, a 32-bit platform will still have ssize_t as its
underlying limit on write()).

Meanwhile, alignment is changed to 'uint32_t', since it makes no
sense to have an alignment larger than the maximum write, and
less painful to use an unsigned type with well-defined behavior
in bit operations than to have to worry about what happens if
a driver mistakenly supplies a negative alignment.

Add an assert that no one was trying to use sectors to get a
write zeroes larger than 2G, and therefore that a later conversion
to bytes won't be impacted by keeping the limit at 32 bits.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:08 +02:00
Eric Blake
8b18474451 iscsi: Use block size as minimum zero/discard alignment
If hardware does not advertise a minimum zero/discard
alignment, we still want to guarantee that the block layer
will align requests to our blocks, rather than the arbitrary
512-byte BDRV sector size.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:08 +02:00
Eric Blake
ebb718a5c7 qcow2: Catch more unaligned write_zero into zero cluster
is_zero_cluster() and is_zero_cluster_top_locked() are used only
by qcow2_co_write_zeroes().  The former is too broad (we don't
care if the sectors we are about to overwrite are non-zero, only
that all other sectors in the cluster are zero), so it needs to
be called up to twice but with smaller limits - rename it along
with adding the neeeded parameter.  The latter can be inlined for
more compact code.

The testsuite change shows that we now have a sparser top file
when an unaligned write_zeroes overwrites the only portion of
the backing file with data.

Based on a patch proposal by Denis V. Lunev.

CC: Denis V. Lunev <den@openvz.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:08 +02:00
Eric Blake
31ad4fdf91 qemu-iotests: Test one more spot for optimizing write_zeroes
Add another test to 154, showing that we currently allocate a
data cluster in the top layer if any sector of the backing file
was allocated.  The next patch will optimize this case.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:08 +02:00
Denis V. Lunev
5a64e94251 qcow2: add tracepoints for qcow2_co_write_zeroes
This patch follows guidelines of all other tracepoints in qcow2, like ones
in qcow2_co_writev. I think that they should dump values in the same
quantities or be changed all together.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1463476543-3087-4-git-send-email-den@openvz.org>
[eblake: typo fix in commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:08 +02:00
Denis V. Lunev
ba142846b0 qcow2: simplify logic in qcow2_co_write_zeroes
Unaligned requests will occupy only one cluster. This is true since the
previous commit. Simplify the code taking this consideration into
account.

In other words, the caller is now buggy if it ever passes us an unaligned
request that crosses cluster boundaries (the only requests that can cross
boundaries will be aligned).

There are no other changes so far.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1463476543-3087-3-git-send-email-den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:08 +02:00
Denis V. Lunev
443668ca40 block: split write_zeroes always
We should split requests even if they are less than write_zeroes_alignment.
For example we can have the following request:
  offset 62k
  size   4k
  write_zeroes_alignment 64k
The original code sent 1 request covering 2 qcow2 clusters, and resulted
in both clusters being allocated. But by splitting the request, we can
cater to the case where one of the two clusters can be zeroed as a
whole, for only 1 cluster allocated after the operation.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Eric Blake <eblake@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1463476543-3087-2-git-send-email-den@openvz.org>

[eblake: Avoid exceeding nb_sectors, hoist alignment checks out of
loop, and update testsuite to show that patch works]

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-06-08 10:21:08 +02:00
Paolo Bonzini
d3a49cbed5 tests/docker: build all targets in test-clang
Warnings specific to clang may affect devices that are not build by
x86_64-softmmu and aarch64-softmmu.  Build all targets since that
is also what Peter does.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1465224417-141321-7-git-send-email-pbonzini@redhat.com
Signed-off-by: Fam Zheng <famz@redhat.com>
2016-06-08 15:19:30 +08:00
Paolo Bonzini
78465d74c2 tests/docker: support travis test with fedora image
Install sparse and PyYAML.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1465224417-141321-6-git-send-email-pbonzini@redhat.com
Signed-off-by: Fam Zheng <famz@redhat.com>
2016-06-08 15:19:30 +08:00
Paolo Bonzini
8080214dc8 tests/docker: remove unused feature "ccache"
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1465224417-141321-5-git-send-email-pbonzini@redhat.com
Signed-off-by: Fam Zheng <famz@redhat.com>
2016-06-08 15:19:30 +08:00
Paolo Bonzini
2346b12fc5 tests/docker: fix test-mingw
Add flex and bison for use in test-mingw, because test-mingw
uses the in-tree libdtc.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1465224417-141321-4-git-send-email-pbonzini@redhat.com
Signed-off-by: Fam Zheng <famz@redhat.com>
2016-06-08 15:19:30 +08:00
Paolo Bonzini
53735f0b82 tests/docker: make test-full build all targets, not none
Fix common.rc to avoid passing an empty --target-list= option to configure.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1465224417-141321-3-git-send-email-pbonzini@redhat.com
Signed-off-by: Fam Zheng <famz@redhat.com>
2016-06-08 15:19:30 +08:00
Paolo Bonzini
34c98c54c3 tests/docker: fix make-archive-maybe
make-archive-maybe expects an archive path relative
to $1, but receives a path relative to the current directory.  Redirect
the output outside the subshell to bypass the "cd $1".

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1465224417-141321-2-git-send-email-pbonzini@redhat.com
Signed-off-by: Fam Zheng <famz@redhat.com>
2016-06-08 15:19:30 +08:00
Peter Maydell
49ca6f3e24 linux-user: Use safe_syscall wrapper for ioctl
Use the safe_syscall wrapper to implement the ioctl syscall.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 10:13:47 +03:00
Peter Maydell
ff6dc13079 linux-user: Use safe_syscall wrapper for accept and accept4 syscalls
Use the safe_syscall wrapper for the accept and accept4 syscalls.
accept4 has been in the kernel since 2.6.28 so we can assume it
is always present.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 10:13:46 +03:00
Peter Maydell
ffb7ee796a linux-user: Use safe_syscall wrapper for semop
Use the safe_syscall wrapper for the semop syscall or IPC operation.
(We implement via the semtimedop syscall to make it easier to
implement the guest semtimedop syscall later.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 10:13:46 +03:00
Peter Maydell
227f02143f linux-user: Use safe_syscall wrapper for epoll_wait syscalls
Use the safe_syscall wrapper for epoll_wait and epoll_pwait syscalls.

Since we now directly use the host epoll_pwait syscall for both
epoll_wait and epoll_pwait, we don't need the configure machinery
to check whether glibc supports epoll_pwait(). (The kernel has
supported the syscall since 2.6.19 so we can assume it's always there.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 10:13:46 +03:00
Peter Maydell
a6130237b8 linux-user: Use safe_syscall wrapper for poll and ppoll syscalls
Use the safe_syscall wrapper for the poll and ppoll syscalls.
Since not all host architectures will have a poll syscall, we
have to rewrite the TARGET_NR_poll handling to use ppoll instead
(we can assume everywhere has ppoll by now).

We take the opportunity to switch to the code structure
already used in the implementation of epoll_wait and epoll_pwait,
which uses a switch() to avoid interleaving #if and if (),
and to stop using a variable with a leading '_' which is in
the implementation's namespace.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 10:13:46 +03:00
Peter Maydell
9e518226f4 linux-user: Use safe_syscall wrapper for sleep syscalls
Use the safe_syscall wrapper for the clock_nanosleep and nanosleep
syscalls.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 10:13:46 +03:00
Peter Maydell
b3f8233068 linux-user: Use safe_syscall wrapper for rt_sigtimedwait syscall
Use the safe_syscall wrapper for the rt_sigtimedwait syscall.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 10:13:46 +03:00
Peter Maydell
2a8459892f linux-user: Use safe_syscall wrapper for flock
Use the safe_syscall wrapper for the flock syscall.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 10:13:46 +03:00
Peter Maydell
d40ecd6618 linux-user: Use safe_syscall wrapper for mq_timedsend and mq_timedreceive
Use the safe_syscall wrapper for mq_timedsend and mq_timedreceive syscalls.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 10:13:46 +03:00
Peter Maydell
89f9fe4452 linux-user: Use safe_syscall wrapper for msgsnd and msgrcv
Use the safe_syscall wrapper for msgsnd and msgrcv syscalls.
This is made slightly awkward by some host architectures providing
only a single 'ipc' syscall rather than separate syscalls per
operation; we provide safe_msgsnd() and safe_msgrcv() as wrappers
around safe_ipc() to handle this if needed.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 10:13:45 +03:00
Peter Maydell
666875306e linux-user: Use safe_syscall wrapper for send* and recv* syscalls
Use the safe_syscall wrapper for the send, sendto, sendmsg, recv,
recvfrom and recvmsg syscalls.

RV: adjusted to apply
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 10:13:45 +03:00
Peter Maydell
2a3c761928 linux-user: Use safe_syscall wrapper for connect syscall
Use the safe_syscall wrapper for the connect syscall.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 10:13:45 +03:00
Peter Maydell
918c03ed9a linux-user: Use safe_syscall wrapper for readv and writev syscalls
Use the safe_syscall wrapper for readv and writev syscalls.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 10:13:45 +03:00
Peter Maydell
977d8241c1 linux-user: Fix error conversion in 64-bit fadvise syscall
Fix a missing host-to-target errno conversion in the 64-bit
fadvise syscall emulation.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 10:13:45 +03:00
Peter Maydell
badd3cd880 linux-user: Fix NR_fadvise64 and NR_fadvise64_64 for 32-bit guests
Fix errors in the implementation of NR_fadvise64 and NR_fadvise64_64
for 32-bit guests, which pass their off_t values in register pairs.
We can't use the 64-bit code path for this, so split out the 32-bit
cases, so that we can correctly handle the "only offset is 64-bit"
and "both offset and length are 64-bit" syscall flavours, and
"uses aligned register pairs" and "does not" flavours of target.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 10:13:45 +03:00
Peter Maydell
e0156a9dc4 linux-user: Fix handling of arm_fadvise64_64 syscall
32-bit ARM has an odd variant of the fadvise syscall which has
rearranged arguments, which we try to implement. Unfortunately we got
the rearrangement wrong.

This is a six-argument syscall whose arguments are:
 * fd
 * advise parameter
 * offset high half
 * offset low half
 * len high half
 * len low half

Stop trying to share code with the standard fadvise syscalls,
and just implement the syscall with the correct argument order.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 10:13:45 +03:00
Peter Maydell
9e024732f5 linux-user: provide frame information in x86-64 safe_syscall
Use cfi directives in the x86-64 safe_syscall to allow gdb to get
backtraces right from within it. (In particular this will be
quite a common situation if the user interrupts QEMU while it's
in a blocked safe-syscall: at the point of the syscall insn RBP
is in use for something else, and so gdb can't find the frame then
without assistance.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 10:13:45 +03:00
Peter Maydell
90c0f080fe linux-user: Avoid possible misalignment in target_to_host_siginfo()
Reimplement target_to_host_siginfo() to use __get_user(), which
handles possibly misaligned source guest structures correctly.

Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-08 10:13:32 +03:00
Marc-André Lureau
3dab9fa1ac tests: start a /qga/guest-exec test
Test a few guest-exec guest agent commands, added in qemu 2.5.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
2016-06-07 11:25:06 -05:00
Peter Maydell
6ed5546fa7 Merge remote-tracking branch 'remotes/mjt/tags/pull-trivial-patches-2016-06-07' into staging
trivial patches for 2016-06-07

# gpg: Signature made Tue 07 Jun 2016 16:20:52 BST
# gpg:                using RSA key 0xBEE59D74A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"

* remotes/mjt/tags/pull-trivial-patches-2016-06-07: (51 commits)
  hbitmap: Use DIV_ROUND_UP
  qemu-timer: Use DIV_ROUND_UP
  linux-user: Use DIV_ROUND_UP
  slirp: Use DIV_ROUND_UP
  usb: Use DIV_ROUND_UP
  rocker: Use DIV_ROUND_UP
  SPICE: Use DIV_ROUND_UP
  audio: Use DIV_ROUND_UP
  xen: Use DIV_ROUND_UP
  crypto: Use DIV_ROUND_UP
  block: Use DIV_ROUND_UP
  qed: Use DIV_ROUND_UP
  qcow/qcow2: Use DIV_ROUND_UP
  parallels: Use DIV_ROUND_UP
  coccinelle: use macro DIV_ROUND_UP instead of (((n) + (d) - 1) /(d))
  thunk: Rename args and fields in host-target bitmask conversion code
  thunk: Drop unused NO_THUNK_TYPE_SIZE guards
  qemu-common.h: Drop WORDS_ALIGNED define
  host-utils: Prefer 'false' for bool type
  docs/multi-thread-compression: Fix wrong command string
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-07 16:34:45 +01:00
Laurent Vivier
30f549c2f3 hbitmap: Use DIV_ROUND_UP
Replace (((n) + (d) - 1) /(d)) by DIV_ROUND_UP(n,d).

This patch is the result of coccinelle script
scripts/coccinelle/round.cocci

CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:25 +03:00
Laurent Vivier
5029b969d1 qemu-timer: Use DIV_ROUND_UP
Replace (((n) + (d) - 1) /(d)) by DIV_ROUND_UP(n,d).

This patch is the result of coccinelle script
scripts/coccinelle/round.cocci

CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:25 +03:00
Laurent Vivier
b1b2db29bd linux-user: Use DIV_ROUND_UP
Replace (((n) + (d) - 1) /(d)) by DIV_ROUND_UP(n,d).

This patch is the result of coccinelle script
scripts/coccinelle/round.cocci

CC: Riku Voipio <riku.voipio@iki.fi>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:25 +03:00
Laurent Vivier
806956834a slirp: Use DIV_ROUND_UP
Replace (((n) + (d) - 1) /(d)) by DIV_ROUND_UP(n,d).

This patch is the result of coccinelle script
scripts/coccinelle/round.cocci

CC: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:25 +03:00
Laurent Vivier
66c68a12ae usb: Use DIV_ROUND_UP
Replace (((n) + (d) - 1) /(d)) by DIV_ROUND_UP(n,d).

This patch is the result of coccinelle script
scripts/coccinelle/round.cocci

CC: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:25 +03:00
Laurent Vivier
df5d1c17b6 rocker: Use DIV_ROUND_UP
Replace (((n) + (d) - 1) /(d)) by DIV_ROUND_UP(n,d).

This patch is the result of coccinelle script
scripts/coccinelle/round.cocci

CC: Scott Feldman <sfeldma@gmail.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:25 +03:00
Laurent Vivier
5d61cafd0b SPICE: Use DIV_ROUND_UP
Replace (((n) + (d) - 1) /(d)) by DIV_ROUND_UP(n,d).

This patch is the result of coccinelle script
scripts/coccinelle/round.cocci

CC: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:25 +03:00
Laurent Vivier
b988a650b1 audio: Use DIV_ROUND_UP
Replace (((n) + (d) - 1) /(d)) by DIV_ROUND_UP(n,d).

This patch is the result of coccinelle script
scripts/coccinelle/round.cocci

CC: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:25 +03:00
Laurent Vivier
d0448de7f6 xen: Use DIV_ROUND_UP
Replace (((n) + (d) - 1) /(d)) by DIV_ROUND_UP(n,d).

This patch is the result of coccinelle script
scripts/coccinelle/round.cocci

CC: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:24 +03:00
Laurent Vivier
207ba7c885 crypto: Use DIV_ROUND_UP
Replace (((n) + (d) - 1) /(d)) by DIV_ROUND_UP(n,d).

This patch is the result of coccinelle script
scripts/coccinelle/round.cocci

CC: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:24 +03:00
Laurent Vivier
13385ae168 block: Use DIV_ROUND_UP
Replace (((n) + (d) - 1) /(d)) by DIV_ROUND_UP(n,d).

This patch is the result of coccinelle script
scripts/coccinelle/round.cocci

CC: qemu-block@nongnu.org
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:24 +03:00
Laurent Vivier
c41a73ffaf qed: Use DIV_ROUND_UP
Replace (((n) + (d) - 1) /(d)) by DIV_ROUND_UP(n,d).

This patch is the result of coccinelle script
scripts/coccinelle/round.cocci

CC: qemu-block@nongnu.org
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:24 +03:00
Laurent Vivier
d737b78cc1 qcow/qcow2: Use DIV_ROUND_UP
Replace (((n) + (d) - 1) /(d)) by DIV_ROUND_UP(n,d).

This patch is the result of coccinelle script
scripts/coccinelle/round.cocci

CC: qemu-block@nongnu.org
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:24 +03:00
Laurent Vivier
969401fe76 parallels: Use DIV_ROUND_UP
Replace (((n) + (d) - 1) /(d)) by DIV_ROUND_UP(n,d).

This patch is the result of coccinelle script
scripts/coccinelle/round.cocci

CC: qemu-block@nongnu.org
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:24 +03:00
Laurent Vivier
db718b4b15 coccinelle: use macro DIV_ROUND_UP instead of (((n) + (d) - 1) /(d))
sample from http://coccinellery.org/

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:24 +03:00
Peter Maydell
e0ca2ed562 thunk: Rename args and fields in host-target bitmask conversion code
The target_to_host_bitmask() and host_to_target_bitmask() functions
and the associated struct bitmask_transtbl are completely generic,
but for historical reasons the target related fields and parameters
are named 'x86' and the host related fields are named 'alpha'.
Rename them to 'target' and 'host'.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:24 +03:00
Peter Maydell
7a00217d1a thunk: Drop unused NO_THUNK_TYPE_SIZE guards
The thunk_type_size_array() and thunk_type_align_array() functions
are only provided if NO_THUNK_TYPE_SIZE is not defined. However
nothing in the codebase defines that, and so in fact these functions
are always present. Drop the unnecessary #ifdefs.

(Over a decade ago thunk.h used to be included by some softmmu
files, which defined NO_THUNK_TYPE_SIZE, but these includes are
long gone; see for instance commit f193c7979c2f7.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:24 +03:00
Peter Maydell
0d5c21f2b3 qemu-common.h: Drop WORDS_ALIGNED define
The WORDS_ALIGNED #define is not used anywhere, and hasn't been since
2013 when commit 612d590ebc rewrote the various ld<type>_<endian>_p
functions to not use it. Remove the #define and the comment describing it.
Also remove the line in the comment about TARGET_WORDS_ALIGNED, since
it has never actually existed.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:24 +03:00
Eric Blake
e52eeb468d host-utils: Prefer 'false' for bool type
Mixing '0' and 'bool' looks stupid.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:24 +03:00
Wei Jiangang
aa5982e0fd docs/multi-thread-compression: Fix wrong command string
s/info_migrate_capabilities/info migrate_capabilities

Signed-off-by: Wei Jiangang <weijg.fnst@cn.fujitsu.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Liang Li <liang.z.li@intel.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:24 +03:00
Peter Maydell
030c98aff1 all: Remove unnecessary glib.h includes
Remove glib.h includes, as it is provided by osdep.h.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:24 +03:00
Peter Maydell
36a2c2d6d3 qga: Remove unnecessary glib.h includes
Remove glib.h includes, as it is provided by osdep.h.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:24 +03:00
Peter Maydell
79ffb277ec tests: Remove unnecessary glib.h includes
Remove glib.h includes, as it is provided by osdep.h.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:24 +03:00
Peter Maydell
df891b9197 clean-includes: Add glib.h to list of unneeded includes
osdep.h pulls in glib.h via glib-compat.h, so add it to the list of
includes that we remove. (This then means we must avoid running
clean-includes on glib-compat.h or it will delete the glib.h include.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:24 +03:00
Igor Mammedov
20875332b0 pc: cleanup unused struct PcRomPciInfo
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:23 +03:00
Sameeh Jubran
b92233b329 e1000: Removing unnecessary if statement
Since mit_delay can never be 0 this if statement is
superfluous.

Signed-off-by: Sameeh Jubran <sameeh@daynix.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:23 +03:00
Steven Luo
9e87a691bd Fix configure test for PBKDF2 in nettle
On my Debian jessie system, including nettle/pbkdf2.h does not cause
NULL to be defined, which causes the test to fail to compile.  Include
stddef.h to bring in a definition of NULL.

Cc: qemu-trivial@nongnu.org
Cc: qemu-stable@nongnu.org
Signed-off-by: Steven Luo <steven+qemu@steven676.net>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:23 +03:00
Alberto Garcia
0bab0ebb17 docs: Fix a couple of typos in throttle.txt
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:23 +03:00
Peter Maydell
24a6e0633a hw: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:23 +03:00
Peter Maydell
2d7fedeb54 replay: Clean up includes
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.

This commit was created with scripts/clean-includes.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:23 +03:00
Cao jin
a8d38f3b02 fw_cfg: follow CODING_STYLE
Replace tab with 4 spaces; brace the indented statement.

Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Acked-by: Corey Minyard <cminyard@mvista.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:23 +03:00
Cao jin
d9d8d452da qdev: Clean up around properties
include:
1. remove unnecessary declaration of static function
2. fix inconsistency between comment and function name, and typo OOM->QOM
2. update comments of functions, use uniform format(GTK-Doc style)

Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:23 +03:00
Eric Blake
3b7c78c83a monitor: Typo fix
s/partinal/partial/

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:23 +03:00
Cao jin
0668a06b81 ICH9: fix typo
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:23 +03:00
Stefan Weil
bbd908025c scripts: Use $(..) instead of deprecated ..
This fixes these warnings from shellcheck:

    ^-- SC2006: Use $(..) instead of deprecated `..`

Update also a comment using the same pattern.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:23 +03:00
Stefan Weil
8913885761 configure: Use $(..) instead of deprecated ..
This fixes these warnings from shellcheck:

    ^-- SC2006: Use $(..) instead of deprecated `..`

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:23 +03:00
Peter Maydell
9640401389 qemu-options.hx: Specify the units for -machine kvm_shadow_mem
The -machine kvm_shadow_mem option takes a size in bytes; say
so explicitly in its documentation.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reported-by: Tobi (github.com/tobimensch)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:23 +03:00
James Clarke
6969ec6cfd Fix linking relocatable objects on Sparc
On Sparc, gcc implicitly passes --relax to the linker, but -r is
incompatible with this. Therefore, if --no-relax is supported, it should
be passed to the linker.

Signed-off-by: James Clarke <jrtc27@jrtc27.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:19:06 +03:00
Laurent Vivier
a2c5eaf7a9 ppc: Remove a potential overflow in muldiv64()
The coccinelle script:
scripts/coccinelle/overflow_muldiv64.cocci
gives us a list of potential overflows in muldiv64()
(the two first parameters are 64bit values).

This patch fixes one, as the fix seems obvious:

replace muldiv64(a, b, c) by muldiv64(b, a, c)
as "a" and "b" are 64bit values but a <= NANOSECONDS_PER_SECOND.
(10^9 -> 30bit value).

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:02:49 +03:00
Laurent Vivier
c00dc6750f replace muldiv64(a, b, c) by (uint64_t)a * b / c
When "a" and "b" are 32bit values, we don't have to cast
them to 128bit, 64bit is enough.

This patch is the result of coccinelle script
scripts/coccinelle/simplify_muldiv64.cocci

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
For xtensa PIC:
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:02:49 +03:00
Laurent Vivier
cd1f16f947 remove useless muldiv64()
muldiv64(a, 1, b) is like "a / b".

This patch is the result of coccinelle script
scripts/coccinelle/remove_muldiv64.cocci.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:02:49 +03:00
Laurent Vivier
3498686220 The only 64bit parameter of muldiv64() is the first one.
muldiv64() is "uint64_t muldiv64(uint64_t a, uint32_t b, uint32_t c)"

Some time it is used as muldiv64(uint32_t a, uint64_t b, uint32_t c)"

This patch is the result of coccinelle script
scripts/coccinelle/swap_muldiv64.cocci to reorder arguments.

Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:02:49 +03:00
Laurent Vivier
e9d5150739 scripts: add muldiv64() checking coccinelle scripts
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:02:49 +03:00
Peter Wu
96165b9eb4 gdbstub: set listen backlog to 1
Avoid possible connection drops on Linux (when tcp_syncookies is
disabled) or fallbacks to SYN cookies with the following kernel warning:

    TCP: request_sock_TCP: Possible SYN flooding on port 1234. Sending cookies.  Check SNMP counters.

Since Linux 4.4 (ef547f2ac16b "tcp: remove max_qlen_log"), a backlog of
zero is really treated as the "queue length for completely established
sockets waiting to be accepted" (listen(2)). This is apparently a valid
interpretation of an "implementation-defined minimum value" for a
backlog value of 0 (listen(3p)). Previous kernels would use 8 as
minimum value, but that is no longer the case.

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:02:49 +03:00
Jan Vesely
891f8dcd25 po/Makefile: call rm -f directly
Default variables are undefined in rules.mak and this is what the rest
of the build system uses.
Fixes make clean in ./po/

Signed-off-by: Jan Vesely <jano.vesely@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:02:49 +03:00
Stefan Weil
a5cbe92199 target-moxie: Remove unused struct elements
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:02:49 +03:00
Michael Tokarev
395fe5f241 fsdev: spelling fix
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:02:49 +03:00
Michael Tokarev
e35916ac0f qga: spelling fix
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-06-07 18:02:48 +03:00
Michael Tokarev
d33c8a7d46 docs: "specify" spell fix
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-06-07 18:02:48 +03:00
Michael Tokarev
a6210f5701 hw/ipmi: fix spelling
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Acked-by: Corey Minyard <cminyard@mvista.com>
2016-06-07 18:02:48 +03:00
Michael Tokarev
b34aee54aa s390x/virtio-ccw: fix spelling
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-06-07 18:02:48 +03:00
Peter Maydell
40eeb397c8 Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Tue 07 Jun 2016 15:26:09 BST
# gpg:                using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request:
  throttle: refuse iops-size without iops-total/read/write
  block: Drop bdrv_ioctl_bh_cb
  block: Move BlockRequest type to io.c
  block/io: optimize bdrv_co_pwritev for small requests
  iostatus: fix comments for block_job_iostatus_reset
  block/io: Remove unused bdrv_aio_write_zeroes()
  virtio: drop duplicate virtio_queue_get_id() function
  virtio-scsi: Remove op blocker for dataplane
  virtio-blk: Remove op blocker for dataplane
  blockdev-backup: Don't move target AioContext if it's attached
  blockdev-backup: Use bdrv_lookup_bs on target
  tests: avoid coroutine pool test crash

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-07 15:59:28 +01:00
Peter Maydell
79cecb3520 Merge remote-tracking branch 'remotes/mst/tags/for_upstream' into staging
pc, pci, virtio: new features, cleanups, fixes

This includes some infrastructure for ipmi smbios tables.
Beginning of acpi hotplug rework by Igor for supporting >255 CPUs.
Misc cleanups and fixes.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>

# gpg: Signature made Tue 07 Jun 2016 13:55:22 BST
# gpg:                using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg:                 aka "Michael S. Tsirkin <mst@redhat.com>"

* remotes/mst/tags/for_upstream: (25 commits)
  virtio: move bi-endian target support to a single location
  pc-dimm: introduce realize callback
  pc-dimm: get memory region from ->get_memory_region()
  acpi: make bios_linker_loader_add_checksum() API offset based
  acpi: make bios_linker_loader_add_pointer() API offset based
  tpm: apci: cleanup TCPA table initialization
  acpi: cleanup bios_linker_loader_cleanup()
  acpi: simplify bios_linker API by removing redundant 'table' argument
  acpi: convert linker from GArray to BIOSLinker structure
  pc: use AcpiDeviceIfClass.send_event to issue GPE events
  acpi: extend ACPI interface to provide send_event hook
  pc: Postpone SMBIOS table installation to post machine init
  ipmi: rework the fwinfo to be fetched from the interface
  tests: acpi: update tables with consolidated legacy cpu-hotplug AML
  pc: acpi: cpuhp-legacy: switch ProcessorID to possible_cpus idx
  pc: acpi: simplify build_legacy_cpu_hotplug_aml() signature
  pc: acpi: consolidate legacy CPU hotplug in one file
  pc: acpi: mark current CPU hotplug functions as legacy
  pc: acpi: cpu-hotplug: make AML CPU_foo defines local to cpu_hotplug_acpi_table.c
  pc: acpi: consolidate \GPE._E02 with the rest of CPU hotplug AML
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-07 15:30:25 +01:00
Eduardo Habkost
d5aebef884 docker: Don't use eval trick on Makefile
The eval trick for defining DOCKER_SRC_COPY doesn't do anything
useful, as DOCKER_SRC_COPY is immediately expanded just after it
is defined, and CUR_TIME is already defined using ":=". Simply
define it using ":=" so it is evaluated only once.

The eval trick was also triggering an weird error on Travis builds:
  qemu/tests/docker/Makefile.include:34: *** unterminated variable reference.  Stop.

The issue is not easily reproducible (maybe it's a bug in some
versions of Make), but it is avoided if removing the eval trick.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-07 15:00:02 +01:00
Stefan Hajnoczi
8860eabdee throttle: refuse iops-size without iops-total/read/write
In a similar vein to commit ee2bdc33c9
("throttle: refuse bps_max/iops_max without bps/iops") it is likely that
the user made a configuration error if iops-size has been set but no
iops limit has been set.

Print an error message so the user can check their throttling
configuration.  They should either remove iops-size if they don't want
any throttling or specify one of iops-total, iops-read, or iops-write.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1464828031-25601-1-git-send-email-stefanha@redhat.com
2016-06-07 14:40:51 +01:00
Fam Zheng
c8a9fd8071 block: Drop bdrv_ioctl_bh_cb
Similar to the "!drv || !drv->bdrv_aio_ioctl" case above, here it is
okay to set co.ret and return. As pointed out by Paolo, a BH will be
created as necessary by the caller (bdrv_co_maybe_schedule_bh).
Besides, as pointed out by Kevin, "data" was leaked before.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20160601015223.19277-1-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-07 14:40:51 +01:00
Eric Blake
41574268b7 block: Move BlockRequest type to io.c
I was thrown by the fact that the public type BlockRequest had
an anonymous union, but no obvious discriminator.  Turns out
that the only client of the second branch of the union was code
internal to io.c, now that commit 91c6e4b killed public
multiwrite, so move it into io.c and improve the comments.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1463699150-19445-1-git-send-email-eblake@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
2016-06-07 14:40:51 +01:00
Peter Lieven
117bc3fa22 block/io: optimize bdrv_co_pwritev for small requests
in a read-modify-write cycle a small request might cause
head and tail to fall into the same aligned block. Currently
QEMU reads the same block twice in this case which is
not necessary.

Signed-off-by: Peter Lieven <pl@kamp.de>
Message-id: 1464607873-28206-1-git-send-email-pl@kamp.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-07 14:40:51 +01:00
Changlong Xie
e3a4f91b4d iostatus: fix comments for block_job_iostatus_reset
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Message-id: 1464600491-23340-1-git-send-email-xiecl.fnst@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-07 14:40:51 +01:00
Kevin Wolf
a7944dfad0 block/io: Remove unused bdrv_aio_write_zeroes()
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1464599852-15392-1-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-07 14:40:51 +01:00
Stefan Hajnoczi
3a90c4ace2 virtio: drop duplicate virtio_queue_get_id() function
The virtio_queue_get_id() function is the lesser used duplicate of
virtio_get_queue_index().  Use the latter instead.

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1463767461-17922-1-git-send-email-stefanha@redhat.com
2016-06-07 14:40:51 +01:00
Fam Zheng
ef8875b549 virtio-scsi: Remove op blocker for dataplane
The previous patch dropped all op blockers from virtio-blk data plane.
The situation of virtio-scsi is exactly the same it can drop them too.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1463969978-24970-5-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-07 14:40:51 +01:00
Fam Zheng
3482958383 virtio-blk: Remove op blocker for dataplane
Block layer is prepared to unspecialize dataplane, an evidence is this
almost complete list of unblocked operations. It has all types except
two (actually three if DATAPLANE itself counts but blockdev.c makes sure
attaching twice is not possible): MIRROR_TARGET and BACKUP_TARGET.

blockdev-mirror refuses to start if target is attached, so the first is
not a problem.

By removing BACKUP_TARGET, blockdev-backup will become permissive to
write to a virtio-blk dataplane disk, but that is not worse than
non-dataplane given the latter is already possible. In either case,
blockdev.c always checks the target and source are on the same
AioContext, or bring them together if possible.

Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1463969978-24970-4-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-07 14:40:51 +01:00
Fam Zheng
efd7556708 blockdev-backup: Don't move target AioContext if it's attached
If the BDS is attached, it will want to stay on the AioContext where its
BlockBackend is. Don't call bdrv_set_aio_context in this case.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1463969978-24970-3-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-07 14:40:50 +01:00
Fam Zheng
0d97891312 blockdev-backup: Use bdrv_lookup_bs on target
This allows backing up to a BDS that has not been attached to any BB.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1463969978-24970-2-git-send-email-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-07 14:40:50 +01:00
Stefan Hajnoczi
271b385e7e tests: avoid coroutine pool test crash
Skip the test_co_queue test case if the coroutine pool is not enabled.
The test case does not work without the pool because it touches memory
belonging to a freed coroutine (on purpose).

Reported-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-id: 1463767231-13379-1-git-send-email-stefanha@redhat.com
2016-06-07 14:40:50 +01:00
Peter Maydell
a70dadc7f1 linux-user: Use both si_code and si_signo when converting siginfo_t
The siginfo_t struct includes a union. The correct way to identify
which fields of the union are relevant is complicated, because we
have to use a combination of the si_code and si_signo to figure out
which of the union's members are valid.  (Within the host kernel it
is always possible to tell, but the kernel carefully avoids giving
userspace the high 16 bits of si_code, so we don't have the
information to do this the easy way...) We therefore make our best
guess, bearing in mind that a guest can spoof most of the si_codes
via rt_sigqueueinfo() if it likes.  Once we have made our guess, we
record it in the top 16 bits of the si_code, so that tswap_siginfo()
later can use it.  tswap_siginfo() then strips these top bits out
before writing si_code to the guest (sign-extending the lower bits).

This fixes a bug where fields were sometimes wrong; in particular
the LTP kill10 test went into an infinite loop because its signal
handler got a si_pid value of 0 rather than the pid of the sending
process.

As part of this change, we switch to using __put_user() in the
tswap_siginfo code which writes out the byteswapped values to
the target memory, in case the target memory pointer is not
sufficiently aligned for the host CPU's requirements.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 16:39:08 +03:00
Timothy E Baldwin
7d92d34ee4 linux-user: Restart fork() if signals pending
If there is a signal pending during fork() the signal handler will
erroneously be called in both the parent and child, so handle any
pending signals first.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-20-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 16:39:08 +03:00
Peter Maydell
bef653d92e linux-user: Use safe_syscall for kill, tkill and tgkill syscalls
Use the safe_syscall wrapper for the kill, tkill and tgkill syscalls.
Without this, if a thread sent a SIGKILL to itself it could kill the
thread before we had a chance to process a signal that arrived just
before the SIGKILL, and that signal would get lost.

We drop all the ifdeffery for tkill and tgkill, because every guest
architecture we support implements them, and they've been in Linux
since 2003 so we can assume the host headers define the __NR_tkill
and __NR_tgkill constants.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 16:39:08 +03:00
Timothy E Baldwin
a0995886e2 linux-user: Restart exit() if signal pending
Without this a signal could vanish on thread exit.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-26-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 16:39:08 +03:00
Timothy E Baldwin
f59ec60610 linux-user: pause() should not pause if signal pending
Fix races between signal handling and the pause syscall by
reimplementing it using block_signals() and sigsuspend().
(Using safe_syscall(pause) would also work, except that the
pause syscall doesn't exist on all architectures.)

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-28-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: tweaked commit message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 16:39:07 +03:00
Timothy E Baldwin
ef6a778ea2 linux-user: Block signals during sigaction() handling
Block signals while emulating sigaction. This is a non-interruptible
syscall, and using block_signals() avoids races where the host
signal handler is invoked and tries to examine the signal handler
data structures while we are updating them.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-29-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: expanded commit message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 16:39:07 +03:00
Timothy E Baldwin
655ed67c2a linux-user: Queue synchronous signals separately
If a synchronous signal and an asynchronous signal arrive near simultaneously,
and the signal number of the asynchronous signal is lower than that of the
synchronous signal the the handler for the asynchronous would be called first,
and then the handler for the synchronous signal would be called within or
after the first handler with an incorrect context.

This is fixed by queuing synchronous signals separately. Note that this does
risk delaying a asynchronous signal until the synchronous signal handler
returns rather than handling the signal on another thread, but this seems
unlikely to cause problems for real guest programs and is unavoidable unless
we could guarantee to roll back and reexecute whatever guest instruction
caused the synchronous signal (which would be a bit odd if we've already
logged its execution, for instance, and would require careful analysis of
all guest CPUs to check it was possible in all cases).

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-24-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: added a comment]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 16:39:07 +03:00
Timothy E Baldwin
907f5fddaa linux-user: Remove real-time signal queuing
As host signals are now blocked whenever guest signals are blocked, the
queue of realtime signals is now in Linux. The QEMU queue is now
redundant and can be removed. (We already did not queue non-RT signals, and
none of the calls to queue_signal() except the one in host_signal_handler()
pass an RT signal number.)

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-23-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: minor commit message tweak]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 16:39:07 +03:00
Timothy E Baldwin
8fdb9fef3d linux-user: Remove redundant gdb_queuesig()
Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-22-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 16:39:07 +03:00
Timothy E Baldwin
c19c1578f8 linux-user: Remove redundant default action check in queue_signal()
Both queue_signal() and process_pending_signals() did check for default
actions of signals, this is redundant and also causes fatal and stopping
signals to incorrectly cause guest system calls to be interrupted.

The code in queue_signal() is removed.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-21-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 16:39:07 +03:00
Peter Maydell
3d3efba020 linux-user: Fix race between multiple signals
If multiple host signals are received in quick succession they would
be queued in TaskState then delivered to the guest in spite of
signals being supposed to be blocked by the guest signal handler's
sa_mask. Fix this by decoupling the guest signal mask from the
host signal mask, so we can have protected sections where all
host signals are blocked. In particular we block signals from
when host_signal_handler() queues a signal from the guest until
process_pending_signals() has unqueued it. We also block signals
while we are manipulating the guest signal mask in emulation of
sigprocmask and similar syscalls.

Blocking host signals also ensures the correct behaviour with respect
to multiple threads and the overrun count of timer related signals.
Alas blocking and queuing in qemu is still needed because of virtual
processor exceptions, SIGSEGV and SIGBUS.

Blocking signals inside process_pending_signals() protects against
concurrency problems that would otherwise happen if host_signal_handler()
ran and accessed the signal data structures while process_pending_signals()
was manipulating them.

Since we now track the guest signal mask separately from that
of the host, the sigsuspend system calls must track the signal
mask passed to them, because when we process signals as we leave
the sigsuspend the guest signal mask in force is that passed to
sigsuspend.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-19-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: make signal_pending a simple flag rather than a word with two flag bits;
 ensure we don't call block_signals() twice in sigreturn codepaths;
 document and assert() the guarantee that using do_sigprocmask() to
 get the current mask never fails;  use the qemu atomics.h functions
 rather than raw volatile variable access; add extra commentary and
 documentation; block SIGSEGV/SIGBUS in block_signals() and in
 process_pending_signals() because they can't occur synchronously here;
 check the right do_sigprocmask() call for errors in ssetmask syscall;
 expand commit message; fixed sigsuspend() hanging]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 16:39:07 +03:00
Peter Maydell
2fe4fba115 linux-user: Use safe_syscall for sigsuspend syscalls
Use the safe_syscall wrapper for sigsuspend syscalls. This
means that we will definitely deliver a signal that arrives
before we do the sigsuspend call, rather than blocking first
and delivering afterwards.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 16:39:06 +03:00
Peter Maydell
b28a1f333a linux-user: Define macro for size of host kernel sigset_t
Some host syscalls take an argument specifying the size of a
host kernel's sigset_t (which isn't necessarily the same as
that of the host libc's type of that name). Instead of hardcoding
_NSIG / 8 where we do this, define and use a SIGSET_T_SIZE macro.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 16:39:06 +03:00
Peter Maydell
9eede5b69f linux-user: Factor out uses of do_sigprocmask() from sigreturn code
All the architecture specific handlers for sigreturn include calls
to do_sigprocmask(SIGSETMASK, &set, NULL) to set the signal mask
from the uc_sigmask in the context being restored. Factor these
out into calls to a set_sigmask() function. The next patch will
want to add code which is not run when setting the signal mask
via do_sigreturn, and this change allows us to separate the two
cases.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 16:39:06 +03:00
Peter Maydell
7ec87e06c7 linux-user: Fix stray tab-indent
Fix a stray tab-indented linux in linux-user/signal.c.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 16:39:06 +03:00
Peter Maydell
e902d588dc linux-user: Move handle_pending_signal() to avoid need for declaration
Move the handle_pending_signal() function above process_pending_signals()
to avoid the need for a forward declaration. (Whitespace only change.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 16:39:06 +03:00
Peter Maydell
eb5525013a linux-user: Factor out handle_signal code from process_pending_signals()
Factor out the code to handle a single signal from the
process_pending_signals() function. The use of goto for flow control
is OK currently, but would get significantly uglier if extended to
allow running the handle_signal code multiple times.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 16:39:06 +03:00
Greg Kurz
c02d7030c3 virtio: move bi-endian target support to a single location
Paolo's recent cpu.h cleanups broke legacy virtio for ppc64 LE guests (and
arm BE guests as well, even if I have not verified that). Especially, commit
"33c11879fd42 qemu-common: push cpu.h inclusion out of qemu-common.h" has
the side-effect of silently hiding the TARGET_IS_BIENDIAN macro from the
virtio memory accessors, and thus fully disabling support of endian changing
targets.

To be sure this cannot happen again, let's gather all the bi-endian bits
where they belong in include/hw/virtio/virtio-access.h.

The changes in hw/virtio/vhost.c are safe because vhost_needs_vring_endian()
is not called on a hot path and non bi-endian targets will return false
anyway.

While here, also rename TARGET_IS_BIENDIAN to be more precise: it is only for
legacy virtio and bi-endian guests.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-07 15:39:28 +03:00
Xiao Guangrong
9f318f8f7e pc-dimm: introduce realize callback
nvdimm needs to  check if the backend memory is large enough to contain
label data and init its memory region when the device is realized, so
introduce realize callback which is called after common dimm has been
realize

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-07 15:39:28 +03:00
Xiao Guangrong
3c3e88a814 pc-dimm: get memory region from ->get_memory_region()
Curretly, the memory region of backed memory is all directly
mapped to guest's address space, however, it will be not true
for nvdimm device if we introduce nvdimm label which only can
be indirectly accessed by ACPI DSM method

Also it improves the comments a bit to reflect this fact

Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-06-07 15:39:28 +03:00
Igor Mammedov
28213cb6a6 acpi: make bios_linker_loader_add_checksum() API offset based
It should help to make clear that bios_linker works in terms
of offsets within a file. Also it should prevent mistakes
where user passes as arguments pointers to unrelated to file blobs.

While at it, considering that it's a ACPI checksum and
it's initial value must be 0, move checksum field zeroing
into bios_linker_loader_add_checksum() instead of doing it
at every call site manually before bios_linker_loader_add_checksum()
is called.

In addition add extra boundary checks.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-07 15:39:27 +03:00
Igor Mammedov
4678124bb9 acpi: make bios_linker_loader_add_pointer() API offset based
cleanup bios_linker_loader_add_pointer() API by switching
arguments to taking offsets relative to corresponding files
instead of doing pointer arithmetic on behalf of user which
were confusing.

Also make offset inside of source file explicit in API
so that user won't have to manually set it in
destination file blob and while at it add additional
boundary checks.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-07 15:39:27 +03:00
Igor Mammedov
9774ccf7cd tpm: apci: cleanup TCPA table initialization
At the time build_tpm_tcpa() is called the tcpalog size is
always 0, so log_area_start_address which is actually offset
from the start of ACPI_BUILD_TPMLOG_FILE is always 0.

Also as 'TCPA' is allocated 0 filled, there is no point
in calculating always 0 log_area_start_address and set
tcpa->log_area_start_address to it since the field should
always point to start of ACPI_BUILD_TPMLOG_FILE.
Make code easier to read dropping not needed offset
calculations.
While at that move tcpalog allocation closer to the code
that defines its size.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-07 15:39:27 +03:00
Igor Mammedov
8cc87c3179 acpi: cleanup bios_linker_loader_cleanup()
bios_linker_loader_cleanup() is called only from one place
and returned value is immediately freed wich makes returning
pointer from bios_linker_loader_cleanup() useless.

Cleanup bios_linker_loader_cleanup() by freeing
data there so that caller won't have to free it.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-07 15:39:27 +03:00
Igor Mammedov
ad9671b870 acpi: simplify bios_linker API by removing redundant 'table' argument
'table' argument in bios_linker_add_foo() commands is
a data blob of one of files also passed to the same API.
So instead of passing blob in every API call, add and keep
file name association with related blob at bios_linker_loader_alloc()
time.

And find blob by name looking up allocated file entries
inside of bios_linker_add_foo() commands.

It will:
 - make API less confusing,
 - enforce calling bios_linker_loader_alloc() before
   calling any bios_linker_add_foo()
 - make sure that blob is the correct one, i.e.
   associated with the right file name

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-07 15:39:27 +03:00
Igor Mammedov
0e9b9edae7 acpi: convert linker from GArray to BIOSLinker structure
Patch just changes type of of linker variables to
a structure, there aren't any functional changes.

Converting linker to a structure will allow to extend
it functionality in follow up patch adding sanity blob
checks.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-07 15:36:54 +03:00
Igor Mammedov
0058c08238 pc: use AcpiDeviceIfClass.send_event to issue GPE events
it reduces number of args passed in handlers by 1 and
a number of used proxy wrappers saving ~20LOC.
Also it allows to make cpu/mem hotplug code more
universal as it would allow ARM to reuse it without
rewrite by providing its own send_event callback
to trigger events usiong GPIO instead of GPE
as fixed hadrware ACPI model doen't have GPE at all.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-07 15:36:54 +03:00
Igor Mammedov
eaf23bf794 acpi: extend ACPI interface to provide send_event hook
send_event() hook will allow to send ACPI event in
a target specific way (GPE or GPIO based impl.)
it will also simplify proxy wrappers in piix4pm/ich9
that access ACPI regs and SCI which are part of
piix4pm/lcp_ich9 devices and call acpi_foo() API directly.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-06-07 15:36:54 +03:00
Corey Minyard
6d42eefad8 pc: Postpone SMBIOS table installation to post machine init
This is the same place that the ACPI SSDT table gets added, so that
devices can add themselves to the SMBIOS table.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-07 15:36:54 +03:00
Corey Minyard
15139b8ef0 ipmi: rework the fwinfo to be fetched from the interface
Instead of scanning IPMI devices from a fwinfo list, allow
the fwinfo to be fetched from the IPMI interface class.
Then the code looking for IPMI fwinfo can scan devices on a
bus and look for ones that implement the IPMI class.

This will let the ACPI scope be defined by the calling
code so the IPMI code doesn't have to know the scope.

Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-07 15:36:54 +03:00
Igor Mammedov
53c400a6ac tests: acpi: update tables with consolidated legacy cpu-hotplug AML
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-07 15:36:54 +03:00
Igor Mammedov
76bdd24ec0 pc: acpi: cpuhp-legacy: switch ProcessorID to possible_cpus idx
In legacy cpu-hotplug ProcessorID == APIC ID is used
in MADT and cpu-hotplug AML. It was fine as both
are 8bit and unique. Spec depricated Processor()
with corresponding ProcessorID and advises to use
Device() and UID instead of it.

However UID is just 32bit and it can't fit ARM's
arch_id(MPIDR) which is 64bit. Also in case of
sparse arch_id() distribution, managment/lookup
of maps by arch_id(APIC ID/MPIDR) becomes complex
and expensive.

In preparation to common CPU hotplug with ARM
and to simplify lookup in possible_cpus[] map
switch ProcessorID to possible_cpus index in
MADT.

Legacy cpu-hotplug considerations:
HW interface of it is APIC ID based bitmask so
it's impossible to change, also CPON package in
AML also APIC ID based as well all the methods.

To avoid massive rewrite of AML keep is so and
just break assumption that ProcessorID == APIC ID,
ammending CPU_MAT_METHOD to accept APIC ID and
possible_cpus index, it needs them both to patch
MADT entry template. Also switch to possible_cpus
index Processor(ProcessorID) AML.
That way changes to MADT/AML are minimal and kept
inside AML/MADT not affecting external interfaces.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-07 15:36:54 +03:00
Igor Mammedov
ebd8ea8244 pc: acpi: simplify build_legacy_cpu_hotplug_aml() signature
since IO block used by CPU hotplug is fixed size and
initialized it the same file as build_legacy_cpu_hotplug_aml()
just use ACPI_GPE_PROC_LEN directly instead of passing
it around in several files.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-06-07 15:36:54 +03:00
Igor Mammedov
672a287227 pc: acpi: consolidate legacy CPU hotplug in one file
Since AML part of CPU hotplug is tightly coupled with
its hardware part (IO port layout/protocol), move
build_legacy_cpu_hotplug_aml() to cpu_hotplug.c
and remove empty cpu_hotplug_acpi_table.c

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-06-07 15:36:54 +03:00
Igor Mammedov
96e3e12bff pc: acpi: mark current CPU hotplug functions as legacy
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-06-07 15:36:54 +03:00
Igor Mammedov
a630bb314c pc: acpi: cpu-hotplug: make AML CPU_foo defines local to cpu_hotplug_acpi_table.c
now as those defines are used only locally inside of
cpu_hotplug_acpi_table.c, move them out of header file.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-06-07 15:36:54 +03:00
Igor Mammedov
8edf77e497 pc: acpi: consolidate \GPE._E02 with the rest of CPU hotplug AML
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-06-07 15:36:54 +03:00
Igor Mammedov
7c2991fa11 pc: acpi: consolidate CPU hotplug AML
move the former SSDT part of CPU hoplug close to DSDT part.
AML is only moved but there isn't any functional change.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
2016-06-07 15:36:54 +03:00
Igor Mammedov
86958d2ddd pc: acpi: remove AML for empty/not used GPE handlers
ACPI spec requires GPE handlers only for GPE events
that hardware implements.
So remove AML for not supported by QEMU device model
events.

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-06-07 15:36:54 +03:00
Igor Mammedov
7bc6fd2464 acpi: add aml_refof()
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-06-07 15:36:54 +03:00
Igor Mammedov
e8977414a2 acpi: add aml_debug()
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-06-07 15:36:54 +03:00
Igor Mammedov
d19587db9e tests: acpi: report names of expected files in verbose mode
print expected file name if it doesn't exists if
verbose mode is enabled*. It helps to avoid running
bios-tables-test under debugger to figure out missing
file name.

*)
verbose mode is enabled if "V" env. variable is set

Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
2016-06-07 15:36:54 +03:00
Robert Ho
99a9a52a23 vnc: list the 'to' parameter of '-vnc' in the qemu man page
Signed-off-by: Robert Ho <robert.hu@intel.com>
Message-Id: <1464678190-9290-2-git-send-email-robert.hu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-07 14:14:39 +02:00
Paolo Bonzini
ed45cae391 scsi-disk: add missing break
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-07 14:14:39 +02:00
Fam Zheng
67a1de0d19 Makefile: Derive "PKGVERSION" from "git describe" by default
Currently, if not specified in "./configure", QEMU_PKGVERSION will be
empty. Write a rule in Makefile to generate a value from "git describe"
combined with a possible git tree cleanness suffix, and write into a new
header.

    $ cat qemu-version.h
    #define QEMU_PKGVERSION "-v2.6.0-557-gd6550e9-dirty"

Include the header in .c files where the macro is referenced. It's not
necessary to include it in all files, otherwise each time the content of
the file changes, all sources have to be recompiled.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1464774261-648-3-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-07 14:14:39 +02:00
Paolo Bonzini
077de81a4c Makefile: add dependency on scripts/hxtool
Make sure that the various documentation and C code files are rebuilt
whenever there is a change in the script that splits them out of
.hx files.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-07 14:14:38 +02:00
Paolo Bonzini
0ab0c99851 Makefile: add dependency on scripts/make_device_config.sh
Make sure that config-devices.mak is rebuilt whenever
there is a change in the scripts that generates it.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-07 14:14:38 +02:00
Paolo Bonzini
553350156d Makefile: add dependency on scripts/create_config
Make sure that config-host.h and config-target.h are rebuilt whenever
there is a change in the scripts that generates them; add the dependency
to the pattern rule as suggested by Peter.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-07 14:14:30 +02:00
Fam Zheng
d41d4da3c5 Makefile: Add a "FORCE" target
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1464774261-648-2-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-07 14:10:52 +02:00
Prasad J Pandit
844864fbae scsi: megasas: null terminate bios version buffer
While reading information via 'megasas_ctrl_get_info' routine,
a local bios version buffer isn't null terminated. Add the
terminating null byte to avoid any OOB access.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-07 14:09:05 +02:00
Peter Maydell
0601d6a411 Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160607' into staging
ppc patch queue for 2016-05-31

Latest patch queue for ppc.  Several significant things in here:
  * A bunch of patches from BenH fixing things in TCG
     - This should fix several regressions introduced by recent
       patches for better HV mode support
     - It also fixes some other bugs discovered along the way
  * Some fixes and cleanups for Mac machine types from Marc
    Cave-Ayland
  * Preliminary patches towards dynamic DMA window support from Alexey
    Kardashevskiy
      - This includes a patch to migration code code
  * Increase number of hotpluggable memory slots
      - Includes a change to KVM generic code, ACKed by Paolo
  * Another TCG fix for an SPE instruction

# gpg: Signature made Tue 07 Jun 2016 11:46:57 BST
# gpg:                using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.7-20160607: (26 commits)
  ppc: Do not take exceptions on unknown SPRs in privileged mode
  ppc: Add missing slbfee. instruction on ppc64 BookS processors
  ppc: Fix slbia decode
  ppc: Fix mtmsr decoding
  ppc: POWER7 has lq/stq instructions and stq need to check ISA
  ppc: POWER7 had ACOP and PID registers
  ppc: Batch TLB flushes on 32-bit 6xx/7xx/7xxx in hash mode
  ppc: Fix tlb invalidations on 6xx/7xx/7xxx 32-bit processors
  ppc: Properly tag the translation cache based on MMU mode
  dbdma: use DMA memory interface for memory accesses
  macio: use DMA memory interface for non-block ATAPI transfers
  target-ppc: fixup bitrot in mmu_helper.c debug statements
  spapr_pci: Drop cannot_instantiate_with_device_add_yet=false
  ppc: fix hrfid, tlbia and slbia privilege
  ppc: Fix hreg_store_msr() so that non-HV mode cannot alter MSR:HV
  ppc: Better figure out if processor has HV mode
  spapr: Introduce pseries-2.7 machine type
  spapr: Increase hotpluggable memory slots to 256
  spapr_pci: Add and export DMA resetting helper
  spapr_pci: Reset DMA config on PHB reset
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-07 12:54:25 +01:00
Laurent Vivier
575b22b1b7 linux-user: check if NETLINK_ROUTE is available
Some IFLA_* symbols can be missing in the host linux/if_link.h,
but as they are enums and not "#defines", check in "configure" if
last known  (IFLA_PROTO_DOWN) is available and if not, disable
management of NETLINK_ROUTE protocol.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 11:39:00 +03:00
Laurent Vivier
5ce9bb5937 linux-user: add netlink audit
This is, for instance, needed to log in a container.

Without this, the user cannot be identified and the console login
fails with "Login incorrect".

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 11:37:14 +03:00
Laurent Vivier
b265620bfb linux-user: support netlink protocol NETLINK_KOBJECT_UEVENT
This is the protocol used by udevd to manage kernel events.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 11:34:36 +03:00
Laurent Vivier
6c5b5645ae linux-user: add rtnetlink(7) support
rtnetlink is needed to use iproute package (ip addr, ip route)
and dhcp client.

Examples:

Without this patch:
    # ip link
    Cannot open netlink socket: Address family not supported by protocol
    # ip addr
    Cannot open netlink socket: Address family not supported by protocol
    # ip route
    Cannot open netlink socket: Address family not supported by protocol
    # dhclient eth0
    Cannot open netlink socket: Address family not supported by protocol
    Cannot open netlink socket: Address family not supported by protocol

With this patch:
    # ip link
    1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN mode DEFAULT
        link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    51: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP mode DEFAULT qlen 1000
        link/ether 00:16:3e:89:6b:d7 brd ff:ff:ff:ff:ff:ff
    # ip addr show eth0
    51: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP qlen 1000
        link/ether 00:16:3e:89:6b:d7 brd ff:ff:ff:ff:ff:ff
        inet 192.168.122.197/24 brd 192.168.122.255 scope global eth0
           valid_lft forever preferred_lft forever
        inet6 fe80::216:3eff:fe89:6bd7/64 scope link
           valid_lft forever preferred_lft forever
    # ip route
    default via 192.168.122.1 dev eth0
    192.168.122.0/24 dev eth0  proto kernel  scope link  src 192.168.122.197
    # ip addr flush eth0
    # ip addr add 192.168.122.10 dev eth0
    # ip addr show eth0
    51: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc fq_codel state UP qlen 1000
        link/ether 00:16:3e:89:6b:d7 brd ff:ff:ff:ff:ff:ff
        inet 192.168.122.10/32 scope global eth0
           valid_lft forever preferred_lft forever
    # ip route add 192.168.122.0/24 via 192.168.122.10
    # ip route
        192.168.122.0/24 via 192.168.122.10 dev eth0

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 11:33:36 +03:00
Laurent Vivier
3bef0451e6 linux-user: Fix qemu-binfmt-conf.sh to store config across reboot
Original qemu-binfmt-conf.sh is only able to write configuration
into /proc/sys/fs/binfmt_misc, and the configuration is lost on reboot.

This script can configure debian and systemd services to restore
configuration on reboot. Moreover, it is able to manage binfmt
credential and to configure the path of the interpreter.

List of supported CPU is:

i386 i486 alpha arm sparc32plus ppc ppc64 ppc64le
m68k mips mipsel mipsn32 mipsn32el mips64 mips64el
sh4 sh4eb s390x aarch64

Usage: qemu-binfmt-conf.sh [--qemu-path PATH][--debian][--systemd CPU]
                           [--help][--credential yes|no][--exportdir PATH]

       Configure binfmt_misc to use qemu interpreter

       --help:       display this usage
       --qemu-path:  set path to qemu interpreter (/usr/local/bin)
       --debian:     don't write into /proc,
                     instead generate update-binfmts templates
       --systemd:    don't write into /proc,
                     instead generate file for systemd-binfmt.service
                     for the given CPU
       --exportdir:  define where to write configuration files
                     (default: /etc/binfmt.d or /usr/share/binfmts)
       --credential: if yes, credential an security tokens are
                     calculated according to the binary to interpret

    To import templates with update-binfmts, use :

        sudo update-binfmts --importdir /usr/share/binfmts --import qemu-CPU

    To remove interpreter, use :

        sudo update-binfmts --package qemu-CPU --remove qemu-CPU /usr/local/bin

    With systemd, binfmt files are loaded by systemd-binfmt.service

    The environment variable HOST_ARCH allows to override 'uname' to generate
    configuration files for a different architecture than the current one.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-06-07 09:38:06 +03:00
Benjamin Herrenschmidt
4d6a0680fa ppc: Do not take exceptions on unknown SPRs in privileged mode
The architecture specifies that mtspr/mfspr on an unknown SPR number
should act as a nop in privileged mode.

I haven't removed the warning however as it can be useful for
diagnosing.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 13:10:45 +10:00
Benjamin Herrenschmidt
c76c22d51d ppc: Add missing slbfee. instruction on ppc64 BookS processors
Used to lookup SLB entries by address, for some reason it was missing.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 13:10:45 +10:00
Benjamin Herrenschmidt
2f9254d964 ppc: Fix slbia decode
Since at least the 2.05 architecture, the slbia instruction takes an
IH field in the opcode to provide some control on the effect of the
slbia on the ERATs (level-1 TLB).

We can safely ignore it as we always flush the whole qemu TLB but
we should allow the bits in the decode.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 13:10:45 +10:00
Benjamin Herrenschmidt
5e31867fbd ppc: Fix mtmsr decoding
We had code to handle the L bit in the opcode but we didn't
allow it in the decode mask.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 13:10:45 +10:00
Benjamin Herrenschmidt
dfdd3e4362 ppc: POWER7 has lq/stq instructions and stq need to check ISA
The PPC_64BX instruction flag is used for a couple of newer
instructions currently on POWER8 but our implementation for
them works for POWER7 too (and already does the proper checking
of what is permitted) with one exception: stq needs to check
the ISA version.

This fixes the latter and add the instructions to POWER7

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 13:10:44 +10:00
Benjamin Herrenschmidt
8eb0f56372 ppc: POWER7 had ACOP and PID registers
We only had them on POWER8, add them to POWER7 as well

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 13:10:44 +10:00
Benjamin Herrenschmidt
c5a8d8f32d ppc: Batch TLB flushes on 32-bit 6xx/7xx/7xxx in hash mode
This ports the existing 64-bit mechanism to 32-bit, thus series
of 64 tlbie's followed by a sync like some versions of Darwin
(ab)use will result in a single flush.

We apply a pending flush on any sync instruction though, as Darwin
doesn't use tlbsync on non-SMP systems.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 13:10:44 +10:00
Benjamin Herrenschmidt
3dcfb74fd4 ppc: Fix tlb invalidations on 6xx/7xx/7xxx 32-bit processors
The processor only uses some bits of the address and invalidates an
entire congruence class. Some OSes such as Darwin and HelenOS take
advantage of this and occasionally invalidate the entire TLB by just
doing a series of 64 consecutive tlbie for example.

Our code tries to be too smart here only invalidating a segment
congruence class (ie, allowing more address bits to be relevant
in the invalidation), this fails miserably on those OSes.

Instead don't bother, do like ppc64 and blow the whole tlb when tlbie
is executed.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 13:10:44 +10:00
Benjamin Herrenschmidt
f5d9c1089f ppc: Properly tag the translation cache based on MMU mode
We used to always flush the TLB when changing relocation mode in
MSR:IR and MSR:DR (ie. MMU on/off for Instructions and Data).

We don't anymore since we have split mmu_idx for instruction and data.

However, since we hard code the mmu_idx in the translated code, we
now need to also make sure MSR:IR and MSR:DR are part of the hflags
used to tag translated code, so that we use different translated
code for different MMU settings.

Darwin gets hurt by this problem.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 13:10:44 +10:00
Mark Cave-Ayland
8865588133 dbdma: use DMA memory interface for memory accesses
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Mark Cave-Ayland
ddd495e5e3 macio: use DMA memory interface for non-block ATAPI transfers
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Mark Cave-Ayland
9207113dcc target-ppc: fixup bitrot in mmu_helper.c debug statements
This fixes compilation of mmu_helper.c when all of the debug #defines at
the start of the file are enabled.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Markus Armbruster
679dd415bb spapr_pci: Drop cannot_instantiate_with_device_add_yet=false
It's become redundant since it was added in commit 09aa9a5 "spapr-pci:
enable adding PHB via -device".

Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Cédric Le Goater
1c7336c5d1 ppc: fix hrfid, tlbia and slbia privilege
commit 74693da988 ('ppc: tlbie, tlbia and tlbisync are HV only')
introduced some extra checks on the instruction privilege. slbia was
changed wrongly and hrfid, tlbia were forgotten.

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Benjamin Herrenschmidt
1c953ba57a ppc: Fix hreg_store_msr() so that non-HV mode cannot alter MSR:HV
This helper is only used by the various instructions that can alter
MSR and not interrupts. Add a comment to that effect to the interrupt
code as well in case somebody wants to change this

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Benjamin Herrenschmidt
932ccbdd48 ppc: Better figure out if processor has HV mode
We use an env. flag which is set to the initial value of MSR_HVB in
the msr_mask. We also adjust the POWER8 mask to set SHV.

Also use this to adjust ctx.hv so that it is *set* when the processor
doesn't have an HV mode (970 with Apple mode for example), thus enabling
hypervisor instructions/SPRs.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[clg: ctx.hv used to be defined only for the hypervisor kernel
      (HV=1|PR=0). It is now defined also when PR=1 and conditions are
      fixed accordingly.
      stripped unwanted tabs.]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Bharata B Rao
1ea1eefcbb spapr: Introduce pseries-2.7 machine type
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Bharata B Rao
71c9a3dd04 spapr: Increase hotpluggable memory slots to 256
KVM now supports 512 memslots on PowerPC (earlier it was 32). Allow half
of it (256) to be used as hotpluggable memory slots.

Instead of hard coding the max value, use the KVM supplied value if KVM
is enabled. Otherwise resort to the default value of 32.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Alexey Kardashevskiy
b3162f22cb spapr_pci: Add and export DMA resetting helper
This will be later used by the "ibm,reset-pe-dma-window" RTAS handler
which resets the DMA configuration to the defaults.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Alexey Kardashevskiy
acf1b6dd22 spapr_pci: Reset DMA config on PHB reset
LoPAPR dictates that during system reset all DMA windows must be removed
and the default DMA32 window must be created so does the patch.

At the moment there is just one window supported so no change in
behaviour is expected.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Alexey Kardashevskiy
b4b6eb771a spapr_iommu: Add root memory region
We are going to have multiple DMA windows at different offsets on
a PCI bus. For the sake of migration, we will have as many TCE table
objects pre-created as many windows supported.
So we need a way to map windows dynamically onto a PCI bus
when migration of a table is completed but at this stage a TCE table
object does not have access to a PHB to ask it to map a DMA window
backed by just migrated TCE table.

This adds a "root" memory region (UINT64_MAX long) to the TCE object.
This new region is mapped on a PCI bus with enabled overlapping as
there will be one root MR per TCE table, each of them mapped at 0.
The actual IOMMU memory region is a subregion of the root region and
a TCE table enables/disables this subregion and maps it at
the specific offset inside the root MR which is 1:1 mapping of
a PCI address space.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Alexey Kardashevskiy
a26fdf3934 spapr_iommu: Migrate full state
The source guest could have reallocated the default TCE table and
migrate bigger/smaller table. This adds reallocation in post_load()
if the default table size is different on source and destination.

This adds @bus_offset, @page_shift to the migration stream as
a subsection so when DDW is added, migration to older machines will
still be possible. As @bus_offset and @page_shift are not used yet,
this makes no change in behavior.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Alexey Kardashevskiy
df7625d422 spapr_iommu: Introduce "enabled" state for TCE table
Currently TCE tables are created once at start and their sizes never
change. We are going to change that by introducing a Dynamic DMA windows
support where DMA configuration may change during the guest execution.

This changes spapr_tce_new_table() to create an empty zero-size IOMMU
memory region (IOMMU MR). Only LIOBN is assigned by the time of creation.
It still will be called once at the owner object (VIO or PHB) creation.

This introduces an "enabled" state for TCE table objects, some
helper functions are added:
- spapr_tce_table_enable() receives TCE table parameters, stores in
sPAPRTCETable and allocates a guest view of the TCE table
(in the user space or KVM) and sets the correct size on the IOMMU MR;
- spapr_tce_table_disable() disposes the table and resets the IOMMU MR
size; it is made public as the following DDW code will be using it.

This changes the PHB reset handler to do the default DMA initialization
instead of spapr_phb_realize(). This does not make differenct now but
later with more than just one DMA window, we will have to remove them all
and create the default one on a system reset.

No visible change in behaviour is expected except the actual table
will be reallocated every reset. We might optimize this later.

The other way to implement this would be dynamically create/remove
the TCE table QOM objects but this would make migration impossible
as the migration code expects all QOM objects to exist at the receiver
so we have to have TCE table objects created when migration begins.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Alexey Kardashevskiy
705124ea6d vmstate: Define VARRAY with VMS_ALLOC
This allows dynamic allocation for migrating arrays.

Already existing VMSTATE_VARRAY_UINT32 requires an array to be
pre-allocated, however there are cases when the size is not known in
advance and there is no real need to enforce it.

This defines another variant of VMSTATE_VARRAY_UINT32 with WMS_ALLOC
flag which tells the receiving side to allocate memory for the array
before receiving the data.

The first user of it is a dynamic DMA window which existence and size
are totally dynamic.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Bharata B Rao
44f2e6c10e kvm: API to obtain max supported mem slots
Introduce kvm_get_max_memslots() API that can be used to obtain the
maximum number of memslots supported by KVM.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:45 +10:00
Talha Imran
a575d9ab2e target-ppc/fpu_helper: Fix efscmp* instructions handling
With specification at hand from the reference manual from Freescale
http://cache.nxp.com/files/32bit/doc/ref_manual/SPEPEM.pdf , I have found a fix
to efscmp* instructions handling in QEMU.

efscmp* instructions in QEMU set crD (Condition Register nibble) values as
(0b0100 << 2) = 0b10000 (consider the HELPER_SINGLE_SPE_CMP macro which left
shifts the value returned by efscmp* handler by 2 bits). A value of 0b10000 is
not correct according the to the reference manual.

The reference manual expects efscmp* instructions to return a value of 0bx1xx.
Please find attached a patch which disables left shifting in
HELPER_SINGLE_SPE_CMP macro. This macro is used by efscmp* and efstst*
instructions only. efstst* instruction handlers, in turn, call efscmp* handlers
too.

*Explanation:*
Traditionally, each crD (condition register nibble) consist of 4 bits, which is
set by comparisons as follows:
crD = W X Y Z
where
W = Less than
X = Greater than
Y = Equal to

However, efscmp* instructions being a special case return a binary result.
(efscmpeq will set the crD = 0bx1xx iff when op1 == op2 and 0bx0xx otherwise;
i.e. there is no notion of different crD values based on Less than, Greater
than and Equal to).

This effectively means that crD will store a "Greater than" comparison result
iff efscmp* instruction comparison is TRUE. Compiler exploits this feature by
checking for "Branch if Less than or Equal to" (ble instruction) OR "Branch if
Greater than" (bgt instruction) for Branch if FALSE OR Branch if TRUE
respectively after an efscmp* instruction. This can be seen in a assembly code
snippet below:

27          if (__real__ x != 3.0f || __imag__ x != 4.0f)
10000498:   lwz r10,8(r31)
1000049c:   lis r9,16448
100004a0:   efscmpeq cr7,r10,r9
100004a4:   ble- cr7,0x100004b8 <bar+60>  //jump to abort() call
100004a8:   lwz r10,12(r31)
100004ac:   lis r9,16512
100004b0:   efscmpeq cr7,r10,r9
100004b4:   bgt- cr7,0x100004bc <bar+64>  //skip abort() call
28            abort ();
100004b8:   bl 0x10000808 <abort>

Signed-off-by: Talha Imran <talha_imran@mentor.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-06-07 10:17:44 +10:00
Paolo Bonzini
6214a11ac1 scsi: mark TYPE_SCSI_DISK_BASE as abstract
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-06 19:00:24 +02:00
Prasad J Pandit
ff589551c8 scsi: esp: check TI buffer index before read/write
The 53C9X Fast SCSI Controller(FSC) comes with internal 16-byte
FIFO buffers. One is used to handle commands and other is for
information transfer. Three control variables 'ti_rptr',
'ti_wptr' and 'ti_size' are used to control r/w access to the
information transfer buffer ti_buf[TI_BUFSZ=16]. In that,

'ti_rptr' is used as read index, where read occurs.
'ti_wptr' is a write index, where write would occur.
'ti_size' indicates total bytes to be read from the buffer.

While reading/writing to this buffer, index could exceed its
size. Add check to avoid OOB r/w access.

Reported-by: Huawei PSIRT <psirt@huawei.com>
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1465230883-22303-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-06 18:57:15 +02:00
xiaoqiang zhao
4b3eec91b9 hw/char: QOM'ify escc.c (fix)
The previous commit e7c9136977
(hw/char: QOM'ify escc.c) cause qemu-system-ppc/ppc64
OpenBIOS to freeze on startup, this commit fix it.

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <1464767898-30526-1-git-send-email-zxq_yx_007@163.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-06 18:57:06 +02:00
Gerd Hoffmann
8156d48086 pc: allow raising low memory via max-ram-below-4g option
This patch extends the functionality of the max-ram-below-4g option
to also allow increasing lowmem.  Use case: Give as much memory as
possible to legacy non-PAE guests.

While being at it also rework the lowmem calculation logic and add a
longish comment describing how it works and what the compatibility
constrains are.

Note:  This is a incompatible change.  When setting max-ram-below-4g to
a value larger than 3.5G (or 3G with gigabyte alignment) it has no
effect on older qemu versions: qemu silently ignores it.  With the patch
applied it actually has an effect and changes the ram layout.  Highly
unlikely to hit in practive though as there is no reason start old qemu
versions that way.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1464857305-26675-1-git-send-email-kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-06 18:57:06 +02:00
Fam Zheng
46e7b70699 tests: Rename tests/Makefile to tests/Makefile.include
The file is only included from the top Makefile. Rename it to reflect
this more obviously.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1464747811-26917-1-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-06-06 18:57:05 +02:00
Peter Maydell
7646240580 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20160606-1' into staging
target-arm queue:
 * support instruction syndrome info for data aborts from A64 to EL2
 * add HSTR_EL2 register
 * fix incorrect ESR IL bits in various syndrome register cases
 * virt: fix limit of 64-bit ACPI/ECAM PCI MMIO range
 * gicv2: RAZ/WI non-sec access to sec interrupts
 * i2c: add aspeed i2c controller
 * virt: Reject gic-version=host for non-KVM (don't segv on aarch64 host)
 * xlnx-zynqmp: Add a secure prop to en/disable ARM Security Extensions
 * xlnx-zynqmp: Support KVM on AArch64 hosts
 * ptimer: Various fixes for awkward corner cases
 * char: QOMify various ARM UART models
 * char: get rid of qemu_char_get_next_serial
 * target-arm: Fix TTBR selecting logic on AArch32 Stage 2 translation
 * zynqmp: Add the ZCU102 board

# gpg: Signature made Mon 06 Jun 2016 17:01:11 BST
# gpg:                using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"

* remotes/pmaydell/tags/pull-target-arm-20160606-1: (25 commits)
  zynqmp: Add the ZCU102 board
  target-arm: Fix TTBR selecting logic on AArch32 Stage 2 translation
  char: get rid of qemu_char_get_next_serial
  hw/char: QOM'ify xilinx_uartlite model
  hw/char: QOM'ify stm32f2xx_usart model
  hw/char: QOM'ify digic-uart model
  hw/char: QOM'ify cadence_uart model
  hw/char: QOM'ify pl011 model
  hw/ptimer: Introduce ptimer_get_limit
  hw/ptimer: Support "on the fly" timer mode switch
  hw/ptimer: Update .delta on period/freq change
  hw/ptimer: Perform counter wrap around if timer already expired
  hw/ptimer: Fix issues caused by the adjusted timer limit value
  xlnx-zynqmp: Use the in kernel GIC model for KVM runs
  xlnx-zynqmp: Delay realization of GIC until post CPU realization
  xlnx-zynqmp: Make the RPU subsystem optional
  xlnx-zynqmp: Add a secure prop to en/disable ARM Security Extensions
  hw/arm/virt: Reject gic-version=host for non-KVM
  i2c: add aspeed i2c controller
  hw/intc/gic: RAZ/WI non-sec access to sec interrupts
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 17:02:42 +01:00
Alistair Francis
0c18c6c67e zynqmp: Add the ZCU102 board
Most Zynq UltraScale+ users will be targetting and using the ZCU102
board instead of the development focused EP108. To make our QEMU machine
names clearer add a ZCU102 machine model.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: cc82eec026b2febfca252d73362bb7084616c1ad.1464213234.git.alistair.francis@xilinx.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:32 +01:00
Sergey Sorokin
6e99f76261 target-arm: Fix TTBR selecting logic on AArch32 Stage 2 translation
Address size is 40-bit for the AArch32 stage 2 translation,
and t0sz can be negative (from -8 to 7),
so we need to adjust it to use the existing TTBR selecting logic.

Signed-off-by: Sergey Sorokin <afarallax@yandex.ru>
Message-id: 1464974151-1231644-1-git-send-email-afarallax@yandex.ru
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:32 +01:00
xiaoqiang zhao
e5fabad7cc char: get rid of qemu_char_get_next_serial
since there is no user of qemu_char_get_next_serial any more,
it's time to let it go away.

Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1465028065-5855-7-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:32 +01:00
xiaoqiang zhao
1b6d0781c2 hw/char: QOM'ify xilinx_uartlite model
* drop qemu_char_get_next_serial and use chardev prop
* create xilinx_uartlite_create wrapper function to create
  xilinx_uartlite device
* change affected board code to use the new way

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1465028065-5855-6-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:32 +01:00
xiaoqiang zhao
7bd43519da hw/char: QOM'ify stm32f2xx_usart model
* drop qemu_char_get_next_serial and use chardev prop
* change affected board code to use the new way

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1465028065-5855-5-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:32 +01:00
xiaoqiang zhao
746c3b3eba hw/char: QOM'ify digic-uart model
* drop qemu_char_get_next_serial and use chardev prop
* change affected board code to use the new way

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1465028065-5855-4-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:31 +01:00
xiaoqiang zhao
4be12ea09a hw/char: QOM'ify cadence_uart model
* drop qemu_char_get_next_serial and use chardev prop
* create cadence_uart_create wrapper function to create
  cadence_uart_device
* change affected board code to use the new way

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1465028065-5855-3-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:31 +01:00
xiaoqiang zhao
f0d1d2c115 hw/char: QOM'ify pl011 model
* drop qemu_char_get_next_serial and use chardev prop
* add pl011_create wrapper function to create pl011 uart device
* change affected board code to use the new way

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1465028065-5855-2-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:31 +01:00
Dmitry Osipenko
578c4b2f23 hw/ptimer: Introduce ptimer_get_limit
Currently ptimer users are used to store copy of the limit value, because
ptimer doesn't provide facility to retrieve the limit. Let's provide it.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 8f1fa9f90d8dbf8086fb02f3b4835eaeb4089cf6.1464367869.git.digetx@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:31 +01:00
Dmitry Osipenko
869e92b5c3 hw/ptimer: Support "on the fly" timer mode switch
Allow switching between periodic <-> oneshot modes while timer is running.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: f030be6e28fbd219e1e8d22297aee367bd9af5bb.1464367869.git.digetx@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:31 +01:00
Dmitry Osipenko
7ef6e3cf8d hw/ptimer: Update .delta on period/freq change
Delta value must be updated on period/freq change, otherwise running timer
would be restarted (counter reloaded with old delta). Only m68k/mcf520x
and arm/arm_timer devices are currently doing freq change correctly, i.e.
stopping the timer. Perform delta update to fix affected devices and
eliminate potential further mistakes.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 4987ef5fdc128bb9a744fd794d3f609135c6a39c.1464367869.git.digetx@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:30 +01:00
Dmitry Osipenko
5a50307b48 hw/ptimer: Perform counter wrap around if timer already expired
ptimer_get_count() might be called while QEMU timer already been expired.
In that case ptimer would return counter = 0, which might be undesirable
in case of polled timer. Do counter wrap around for periodic timer to keep
it distributed. In order to achieve more accurate emulation behaviour of
certain hardware, don't perform wrap around when in icount mode and return
counter = 0 in that case (that doesn't affect polled counter distribution).

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 4ce381c7d24d85d165ff251d2875d16a4b6a5c04.1464367869.git.digetx@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:30 +01:00
Dmitry Osipenko
e91171e302 hw/ptimer: Fix issues caused by the adjusted timer limit value
Multiple issues here related to the timer with a adjusted .limit value:

1) ptimer_get_count() returns incorrect counter value for the disabled
timer after loading the counter with a small value, because adjusted limit
value is used instead of the original.

For instance:
    1) ptimer_stop(t)
    2) ptimer_set_period(t, 1)
    3) ptimer_set_limit(t, 0, 1)
    4) ptimer_get_count(t) <-- would return 10000 instead of 0

2) ptimer_get_count() might return incorrect value for the timer running
with a adjusted limit value.

For instance:
    1) ptimer_stop(t)
    2) ptimer_set_period(t, 1)
    3) ptimer_set_limit(t, 10, 1)
    4) ptimer_run(t)
    5) ptimer_get_count(t) <-- might return value > 10

3) Neither ptimer_set_period() nor ptimer_set_freq() are adjusting the
limit value, so it is still possible to make timer timeout value
arbitrary small.

For instance:
    1) ptimer_set_period(t, 10000)
    2) ptimer_set_limit(t, 1, 0)
    3) ptimer_set_period(t, 1) <-- bypass limit correction

Fix all of the above issues by adjusting timer period instead of the limit.
Perform the adjustment for periodic timer only. Use the delta value instead
of the limit to make decision whether adjustment is required, as limit could
be altered while timer is running, resulting in incorrect value returned by
ptimer_get_count.

Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: cd141f74f5737480ec586b9c7d18cce1d69884e2.1464367869.git.digetx@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:30 +01:00
Edgar E. Iglesias
2a0ee672c9 xlnx-zynqmp: Use the in kernel GIC model for KVM runs
Use the in kernel GIC model when running with KVM enabled.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1464173555-12800-5-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:30 +01:00
Edgar E. Iglesias
0776d9679d xlnx-zynqmp: Delay realization of GIC until post CPU realization
Delay the realization of the GIC until after CPUs are
realized. This is needed for KVM as the in-kernel GIC
model will fail if it is realized with no available CPUs.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1464173555-12800-4-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:30 +01:00
Edgar E. Iglesias
6ed92b14f6 xlnx-zynqmp: Make the RPU subsystem optional
The way we currently model the RPU subsystem is of quite
limited use. In addition to that, it causes problems for
KVM and for GDB debugging.

Make the RPU optional by adding a has_rpu property and
default to having it disabled.

This changes the default setup from having the RPU to not
longer having it.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1464173555-12800-3-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:29 +01:00
Edgar E. Iglesias
37d42473d1 xlnx-zynqmp: Add a secure prop to en/disable ARM Security Extensions
Add a secure prop to en/disable ARM Security Extensions.
This is particularly useful for KVM runs.

Default to disabled to match the behavior of KVM.

This changes the default setup from having the ARM Security
Extensions to not longer having them.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1464173555-12800-2-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:29 +01:00
Cole Robinson
0bf8039dca hw/arm/virt: Reject gic-version=host for non-KVM
If you try to gic-version=host with TCG on a KVM aarch64 host,
qemu segfaults, since host requires KVM APIs.

Explicitly reject gic-version=host if KVM is not enabled

https://bugzilla.redhat.com/show_bug.cgi?id=1339977
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Message-id: b1b3b0dd143b7995a7f4062966b80a2cf3e3c71e.1464273085.git.crobinso@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:29 +01:00
Cédric Le Goater
1602001195 i2c: add aspeed i2c controller
The Aspeed AST2400 integrates a set of 14 I2C/SMBus bus controllers
directly connected to the APB bus. They can be programmed as master or
slave but the propopsed model only supports the master mode.

On the TODO list, we also have :

 - improve and harden the state machine.
 - bus recovery support (used by the Linux driver).
 - transfer mode state machine bits. this is not strictly necessary as
   it is mostly used for debug. The bus busy bit is deducted from the
   I2C core engine of qemu.
 - support of the pool buffer: 2048 bytes of internal SRAM (not used
   by the Linux driver).

Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1464704307-25178-1-git-send-email-clg@kaod.org
[PMM: removed unused functions aspeed_i2c_bus_get_state() and
 aspeed_i2c_bus_set_state()]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:29 +01:00
Jens Wiklander
fea8a08e16 hw/intc/gic: RAZ/WI non-sec access to sec interrupts
Treat non-secure accesses to registers and bits in registers of secure
interrupts as RAZ/WI.

Signed-off-by: Jens Wiklander <jens.wiklander@linaro.org>
Message-id: 1464273945-2055-1-git-send-email-jens.wiklander@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:29 +01:00
Ard Biesheuvel
e40c3d2e7f hw/arm/virt: fix limit of 64-bit ACPI/ECAM PCI MMIO range
Set the MMIO range limit field to 'base + size - 1' as required.

Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Reviewed-by: Shannon Zhao <shannon.zhao@linaro.org>
Message-id: 1463856217-17969-1-git-send-email-ard.biesheuvel@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:28 +01:00
Peter Maydell
78f1edb19f target-arm: Don't try to set ESR IL bit in arm_cpu_do_interrupt_aarch64()
Remove some incorrect code from arm_cpu_do_interrupt_aarch64()
which attempts to set the IL bit in the syndrome register based
on the value of env->thumb. This is wrong in several ways:
 * IL doesn't indicate Thumb-vs-ARM, it indicates instruction
   length (which may be 16 or 32 for Thumb and is always 32 for ARM)
 * not every syndrome format uses IL like this -- for some IL is
   always set, and for some it is always clear
 * the code is changing esr_el[new_el] even for interrupt entry,
   which is not supposed to modify ESR_ELx at all

Delete the code, and instead rely on the syndrome value in
env->exception.syndrome having already been set up with the
correct value of IL.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1463487258-27468-3-git-send-email-peter.maydell@linaro.org
2016-06-06 16:59:28 +01:00
Peter Maydell
04ce861ea5 target-arm: Set IL bit in syndromes for insn abort, watchpoint, swstep
For some exception syndrome types, the IL bit should always be set.
This includes the instruction abort, watchpoint and software step
syndrome types; add the missing ARM_EL_IL bit to the syndrome
values returned by syn_insn_abort(), syn_swstep() and syn_watchpoint().

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1463487258-27468-2-git-send-email-peter.maydell@linaro.org
2016-06-06 16:59:28 +01:00
Edgar E. Iglesias
aaa1f954d4 target-arm: A64: Create Instruction Syndromes for Data Aborts
Add support for generating the ISS (Instruction Specific Syndrome) for
Data Abort exceptions taken from AArch64.
These syndromes are used by hypervisors for example to trap and emulate
memory accesses.

We save the decoded data out-of-band with the TBs at translation time.
When exceptions hit, the extra data attached to the TB is used to
recreate the state needed to encode instruction syndromes.
This avoids the need to emit moves with every load/store.

Based on a suggestion from Peter Maydell.

Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1462464601-10888-2-git-send-email-edgar.iglesias@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:28 +01:00
Alistair Francis
2a5a9abd4b target-arm: Add the HSTR_EL2 register
Add the Hypervisor System Trap Register for EL2.

This register is used early in the Linux boot and without it the kernel
aborts with a "Synchronous Abort" error.

Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: ea5aae4b10283de4705b864fe9d4bd2eaddaacae.1463174342.git.alistair.francis@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 16:59:28 +01:00
Peter Maydell
280b2358cd Merge remote-tracking branch 'remotes/gkurz/tags/for-upstream' into staging
readdir_r() to readdir() conversion, various minor cleanups

# gpg: Signature made Mon 06 Jun 2016 10:52:52 BST
# gpg:                using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <gkurz@fr.ibm.com>"
# gpg:                 aka "Greg Kurz <groug@free.fr>"
# gpg:                 aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg:                 aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg:                 aka "Gregory Kurz (Cimai Technology) <gkurz@cimai.com>"
# gpg:                 aka "Gregory Kurz (Meiosys Technology) <gkurz@meiosys.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894  DBA2 02FC 3AEB 0101 DBC2

* remotes/gkurz/tags/for-upstream:
  9p: switch back to readdir()
  9p: add locking to V9fsDir
  9p: introduce the V9fsDir type
  9p: drop useless out: label
  9p: drop useless inclusion of hw/i386/pc.h
  9p/fsdev: remove obsolete references to virtio
  9p: some more cleanup in #include directives

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 15:17:52 +01:00
Peter Maydell
e854d0cf78 Merge remote-tracking branch 'remotes/kraxel/tags/pull-vga-20160606-1' into staging
virtio-gpu: scanout fix, live migration support
vmsvga: security fixes

# gpg: Signature made Mon 06 Jun 2016 08:05:00 BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-vga-20160606-1:
  virtio-gpu: add live migration support
  vmsvga: don't process more than 1024 fifo commands at once
  vmsvga: shadow fifo registers
  vmsvga: add more fifo checks
  vmsvga: move fifo sanity checks to vmsvga_fifo_length
  virtio-gpu: fix scanout rectangles

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 13:58:24 +01:00
Mark Cave-Ayland
890e48d7fc scsi-disk: fix reads from scsi-disk devices
Commit fcaafb1001 accidentally broke reads from
scsi-disk devices when being updated from its original form to use the new
byte-based block functions. Add the extra missing sector to offset conversion
in order to restore read functionality.

Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1464931021-25117-1-git-send-email-mark.cave-ayland@ilande.co.uk
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 13:23:41 +01:00
Peter Maydell
8625c3ffc8 Merge remote-tracking branch 'remotes/kraxel/tags/pull-audio-20160606-1' into staging
audio: pa volume fix, some qomifying.

# gpg: Signature made Mon 06 Jun 2016 08:01:21 BST
# gpg:                using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-audio-20160606-1:
  hw/audio: QOM'ify milkymist-ac97.c
  hw/audio: QOM'ify intel-hda
  hw/audio: QOM cleanup for intel-hda
  hw/audio: QOM'ify cs4231.c
  audio: pa: Set volume of recording stream instead of recording device

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 12:47:37 +01:00
Peter Maydell
b0491a1a17 Merge remote-tracking branch 'remotes/rth/tags/pull-tgt-20160605' into staging
Check address ranges for disassembly

# gpg: Signature made Sun 05 Jun 2016 17:30:28 BST
# gpg:                using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"

* remotes/rth/tags/pull-tgt-20160605:
  target-*: dfilter support for in_asm

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 12:04:59 +01:00
Peter Wu
5819e3e072 gdbstub: avoid busy loop while waiting for gdb
While waiting for a gdb response, or while sending an acknowledgement
there is not much to do, so do not mark the socket as non-blocking to
avoid a busy loop while paused at gdb. This only affects the user-mode
emulation (qemu-arm -g 1234 ./a.out).

Note that this issue was reported before at
https://lists.nongnu.org/archive/html/qemu-devel/2013-02/msg02277.html.

While at it, close the gdb client fd on EOF or error while reading.

Signed-off-by: Peter Wu <peter@lekensteyn.nl>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 11:15:54 +01:00
Greg Kurz
635324e83e 9p: switch back to readdir()
This patch changes the 9p code to use readdir() again instead of
readdir_r(), which is deprecated in glibc 2.24.

All the locking was put in place by a previous patch.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
2016-06-06 11:52:34 +02:00
Greg Kurz
7cde47d4a8 9p: add locking to V9fsDir
If several threads concurrently call readdir() with the same directory
stream pointer, it is possible that they all get a pointer to the same
dirent structure, whose content is overwritten each time readdir() is
called.

We must thus serialize accesses to the dirent structure.

This may be achieved with a mutex like below:

lock_mutex();

readdir();

// work with the dirent

unlock_mutex();

This patch adds all the locking, to prepare the switch to readdir().

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
2016-06-06 11:52:34 +02:00
Greg Kurz
f314ea4e30 9p: introduce the V9fsDir type
If we are to switch back to readdir(), we need a more complex type than
DIR * to be able to serialize concurrent accesses to the directory stream.

This patch introduces a placeholder type and fixes all users.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
2016-06-06 11:52:34 +02:00
Greg Kurz
8762a46d36 9p: drop useless out: label
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
2016-06-06 11:52:34 +02:00
Greg Kurz
beff62e683 9p: drop useless inclusion of hw/i386/pc.h
Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
2016-06-06 11:52:34 +02:00
Greg Kurz
af8b38b0d1 9p/fsdev: remove obsolete references to virtio
Most of the 9p code is now virtio agnostic. This patch does a final cleanup:
- drop references to Virtio from the header comments
- fix includes

Also drop a couple of leading empty lines while here.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
2016-06-06 11:52:34 +02:00
Greg Kurz
aae91ad9ae 9p: some more cleanup in #include directives
The "9p-attr.h" header isn't needed by 9p synth and virtio 9p.

While here, also drop last references to virtio from 9p synth since it is
now transport agnostic code.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
2016-06-06 11:52:34 +02:00
Dmitry Fleytman
de5dca1b79 e1000e: Fix build with gcc 4.6.3 and ust tracing
This patch fixes used-uninitialized false
positive while compiling with ust tracing
backend plus gcc 4.6.3:

hw/net/e1000e.c: In function ‘e1000e_io_write’:
hw/net/e1000e.c:170:39: error: ‘idx’ may be used uninitialized in this function [-Werror=uninitialized]
hw/net/e1000e.c: In function ‘e1000e_io_read’:
hw/net/e1000e.c:145:35: error: ‘idx’ may be used uninitialized in this function [-Werror=uninitialized]
cc1: all warnings being treated as errors
make: *** [hw/net/e1000e.o] Error 1

Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Message-id: 1465023763-10773-1-git-send-email-dmitry@daynix.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-06 09:42:54 +01:00
Gerd Hoffmann
0c244e50ee virtio-gpu: add live migration support
Store some additional state for cursor and resource backing storage,
so we can write out and reload things.  Implement vmsave+vmload for
2d mode.  Continue blocking live migration in 3d/virgl mode.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1464009727-7753-1-git-send-email-kraxel@redhat.com
2016-06-06 09:04:34 +02:00
Gerd Hoffmann
4e68a0ee17 vmsvga: don't process more than 1024 fifo commands at once
vmsvga_fifo_run is called in regular intervals (on each display update)
and will resume where it left off.  So we can simply exit the loop,
without having to worry about how processing will continue.

Fixes: CVE-2016-4453
Cc: qemu-stable@nongnu.org
Cc: P J P <ppandit@redhat.com>
Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1464592161-18348-5-git-send-email-kraxel@redhat.com
2016-06-06 09:04:29 +02:00
Gerd Hoffmann
7e486f7577 vmsvga: shadow fifo registers
The fifo is normal ram.  So kvm vcpu threads and qemu iothread can
access the fifo in parallel without syncronization.  Which in turn
implies we can't use the fifo pointers in-place because the guest
can try changing them underneath us.  So add shadows for them, to
make sure the guest can't modify them after we've applied sanity
checks.

Fixes: CVE-2016-4454
Cc: qemu-stable@nongnu.org
Cc: P J P <ppandit@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1464592161-18348-4-git-send-email-kraxel@redhat.com
2016-06-06 09:04:24 +02:00
Gerd Hoffmann
c2e3c54d39 vmsvga: add more fifo checks
Make sure all fifo ptrs are within range.

Fixes: CVE-2016-4454
Cc: qemu-stable@nongnu.org
Cc: P J P <ppandit@redhat.com>
Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1464592161-18348-3-git-send-email-kraxel@redhat.com
2016-06-06 09:04:19 +02:00
Gerd Hoffmann
5213602678 vmsvga: move fifo sanity checks to vmsvga_fifo_length
Sanity checks are applied when the fifo is enabled by the guest
(SVGA_REG_CONFIG_DONE write).  Which doesn't help much if the guest
changes the fifo registers afterwards.  Move the checks to
vmsvga_fifo_length so they are done each time qemu is about to read
from the fifo.

Fixes: CVE-2016-4454
Cc: qemu-stable@nongnu.org
Cc: P J P <ppandit@redhat.com>
Reported-by: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1464592161-18348-2-git-send-email-kraxel@redhat.com
2016-06-06 09:03:51 +02:00
Richard Henderson
4910e6e42e target-*: dfilter support for in_asm
The arm target was handled by 06486077, but other targets
were ignored.  This handles all the rest which actually support
disassembly (that is, skipping moxie and tilegx).

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-06-05 09:26:24 -07:00
Peter Maydell
6b3532b20b Merge remote-tracking branch 'remotes/kraxel/tags/pull-ui-20160603-1' into staging
vnc: keyboard delay, colormap support
ui: misc bugfixes

# gpg: Signature made Fri 03 Jun 2016 08:02:32 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-ui-20160603-1:
  vnc: add configurable keyboard delay
  sdl2: skip init without outputs
  vnc: Add support for color map
  SDL2: add bgrx pixel format
  gtk: fix unchecked vc dereference
  ui: spice: Exit if gl=on EGL init fails
  ui: egl: Replace fprintf with error_report

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-03 12:03:36 +01:00
Dmitry Fleytman
defbaec160 e1000e: Fix build with ust trace backend
ust trace backend has limitation of maximum 10
arguments per event. Traces with more arguments
cannot be compiled for this backend.

Trace e1000e_rx_rss_ip6 introduced by previous
commits has 11 arguments and fails to compile with
ust trace backend.

This patch fixes the problem by splitting this
tracepoint into two successive tracepoints with
smaller number of arguments.

For more information see comment regarding TP_ARGS
in lttng/tracepoint.h:

/*
* TP_ARGS takes tuples of type, argument separated by a comma.
* It can take up to 10 tuples (which means that less than 10 tuples is
* fine too).
* Each tuple is also separated by a comma.
*/

Build log generated by this problem:

In file included from ./trace/generated-tracers.h:9:0,
                 from /home/travis/build/qemu/qemu/include/trace.h:4,
                 from util/oslib-posix.c:36:
./trace/generated-ust-provider.h:16556:3: error: unknown type name ‘_TP_EXPROTO_Bool’
In file included from /home/travis/build/qemu/qemu/include/trace.h:4:0,
                 from util/oslib-posix.c:36:
./trace/generated-tracers.h: In function ‘trace_e1000e_rx_rss_ip6’:
./trace/generated-tracers.h:8379:431: error: expected string literal before ‘_SDT_ASM_OPERANDS_ipv6_enabled’
./trace/generated-tracers.h:8379:431: error: implicit declaration of function ‘__tracepoint_cb_qemu___e1000e_rx_rss_ip6’ [-Werror=implicit-function-declaration]
./trace/generated-tracers.h:8379:431: error: nested extern declaration of ‘__tracepoint_cb_qemu___e1000e_rx_rss_ip6’ [-Werror=nested-externs]
cc1: all warnings being treated as errors
make: *** [util/oslib-posix.o] Error 1
make: *** Waiting for unfinished jobs....
In file included from ./trace/generated-tracers.h:9:0,
                 from /home/travis/build/qemu/qemu/include/trace.h:4,
                 from util/hbitmap.c:16:
./trace/generated-ust-provider.h:16556:3: error: unknown type name ‘_TP_EXPROTO_Bool’
In file included from /home/travis/build/qemu/qemu/include/trace.h:4:0,
                 from util/hbitmap.c:16:
./trace/generated-tracers.h: In function ‘trace_e1000e_rx_rss_ip6’:
./trace/generated-tracers.h:8379:431: error: expected string literal before ‘_SDT_ASM_OPERANDS_ipv6_enabled’
./trace/generated-tracers.h:8379:431: error: implicit declaration of function ‘__tracepoint_cb_qemu___e1000e_rx_rss_ip6’ [-Werror=implicit-function-declaration]
./trace/generated-tracers.h:8379:431: error: nested extern declaration of ‘__tracepoint_cb_qemu___e1000e_rx_rss_ip6’ [-Werror=nested-externs]
cc1: all warnings being treated as errors
make: *** [util/hbitmap.o] Error 1

Signed-off-by: Dmitry Fleytman <dmitry@daynix.com>
Message-id: 1464894748-27803-1-git-send-email-dmitry@daynix.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-03 11:06:09 +01:00
xiaoqiang zhao
07b9098dfc hw/audio: QOM'ify milkymist-ac97.c
* Drop the old SysBus init function and use instance_init
* Move AUD_open_in / AUD_open_out function into realize stage

Acked-by: Michael Walle <michael@walle.cc>
Tested-by: Michael Walle <michael@walle.cc>
Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1463111220-30335-5-git-send-email-zxq_yx_007@163.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-06-03 11:13:38 +02:00
xiaoqiang zhao
bda8d9b8b1 hw/audio: QOM'ify intel-hda
* use DeviceClass::realize instead of DeviceClass::init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1463111220-30335-4-git-send-email-zxq_yx_007@163.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-06-03 11:13:38 +02:00
xiaoqiang zhao
e19202af79 hw/audio: QOM cleanup for intel-hda
drop the DO_UPCAST macro

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1463111220-30335-3-git-send-email-zxq_yx_007@163.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-06-03 11:13:38 +02:00
xiaoqiang zhao
ff2df541bb hw/audio: QOM'ify cs4231.c
Drop the old SysBus init function and use instance_init

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1463111220-30335-2-git-send-email-zxq_yx_007@163.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-06-03 11:13:38 +02:00
Peter Krempa
e58ff62d58 audio: pa: Set volume of recording stream instead of recording device
Since pulseaudio 1.0 it's possible to set the individual stream volume
rather than setting the device volume. With this, setting hardware mixer
of a emulated sound card doesn't mess up the volume configuration of the
host.

A side effect is that this limits compatible pulseaudio version to 1.0
which was released on 2011-09-27.

Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 78853815be2069971b89b3a2e3181837064dd8f3.1462962512.git.pkrempa@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-06-03 11:13:38 +02:00
Gerd Hoffmann
fa06e5cb7b virtio-gpu: fix scanout rectangles
Commit "ca58b45 ui/virtio-gpu: add and use qemu_create_displaysurface_pixman"
breaks scanouts which use a region of the underlying resource only.

So, we need another way to handle the underlying issue.  Lets create a
new pixman image, grab a reference on the pixman providing the
underlying storage, hook up a destroy callback which releases the
reference.  That way regions work again and releasing the backing
storage should still be impossible thanks to the extra reference we are
holding.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1464597655-26341-1-git-send-email-kraxel@redhat.com
2016-06-03 09:05:28 +02:00
Gerd Hoffmann
c5ce833344 vnc: add configurable keyboard delay
Limits the rate kbd events from the vnc server are forwarded to the
guest, so input devices which are typically low-bandwidth can keep
up even on bulky input.

v2: update documentation too.
v3: spell fixes.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Yang Hongyang <hongyang.yang@easystack.cn>
Message-id: 1464762150-25817-1-git-send-email-kraxel@redhat.com
2016-06-03 08:23:26 +02:00
Gerd Hoffmann
8efa5f29f8 sdl2: skip init without outputs
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Cole Robinson <crobinso@redhat.com>
Message-id: 1464790116-32405-1-git-send-email-kraxel@redhat.com
2016-06-03 08:23:26 +02:00
Alexander Graf
0c426e4534 vnc: Add support for color map
Our current VNC code does not handle color maps (aka non-true-color) at all
and aborts if a client requests them. There are 2 major issues with this:

 1) A VNC viewer on an 8-bit X11 system may request color maps
 2) RealVNC _always_ starts requesting color maps, then moves on to full color

In order to support these 2 use cases, let's just create a fake color map
that covers exactly our normal true color 8 bit color space. That way we don't
lose anything over a client that wants true color.

Reported-by: Sascha Wehnert <swehnert@suse.com>
Signed-off-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
Message-id: 1464099559-20789-1-git-send-email-den@openvz.org

Actually this is a very old patch originally submitted in 2013 by
Alexander. The situation is still the same with RealVNC, it does not
connect by default to QEMU VNC. The problem is that this client is
really popular. This is better to be kludged.

Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Gerd Hoffmann <kraxel@redhat.com>
CC: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-06-03 08:23:26 +02:00
Pavel Dovgalyuk
435deffefb SDL2: add bgrx pixel format
This patch adds support of b8g8r8x8 pixel format for SDL2.

Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-id: 20160517072848.4540.34695.stgit@PASHA-ISP
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-06-03 08:23:26 +02:00
Gerd Hoffmann
41cc5239f3 gtk: fix unchecked vc dereference
Spotted by Coverity.

Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1463737748-1062-1-git-send-email-kraxel@redhat.com
2016-06-03 08:23:26 +02:00
Cole Robinson
daafc661cc ui: spice: Exit if gl=on EGL init fails
The user explicitly requested spice GL, so if we know it isn't
going to work we should exit

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: e3789e35b16f9e3cc6f2652f91c52d88ba6d6936.1463588606.git.crobinso@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-06-03 08:23:26 +02:00
Cole Robinson
38a55bddcc ui: egl: Replace fprintf with error_report
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: c880920f6e40a506394d89dbbe1f67c63d359c17.1463588606.git.crobinso@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-06-03 08:23:26 +02:00
Peter Maydell
2c107d7684 Merge remote-tracking branch 'remotes/jasowang/tags/net-pull-request' into staging
# gpg: Signature made Thu 02 Jun 2016 07:23:18 BST using RSA key ID 398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F  3562 EF04 965B 398D 6211

* remotes/jasowang/tags/net-pull-request: (31 commits)
  Add ENET device to i.MX6 SOC.
  Add ENET/Gbps Ethernet support to FEC device
  i.MX: move FEC device to a register array structure.
  i.MX: Rename i.MX FEC defines to ENET_XXX
  i.MX: reset TX/RX descriptors when FEC is disabled.
  i.MX: Fix FEC code for ECR register reset value.
  i.MX: Fix FEC code for MDIO address selection
  i.MX: Fix FEC code for MDIO operation selection
  net: handle optional VLAN header in checksum computation.
  net: improve UDP/TCP checksum computation.
  e1000e: Introduce qtest for e1000e device
  net: Introduce e1000e device emulation
  e1000: Move out code that will be reused in e1000e
  e1000_regs: Add definitions for Intel 82574-specific bits
  vmxnet3: Use pci_dma_* API instead of cpu_physical_memory_*
  net_pkt: Extend packet abstraction as required by e1000e functionality
  rtl8139: Move more TCP definitions to common header
  net_pkt: Name vmxnet3 packet abstractions more generic
  vmxnet3: Use common MAC address tracing macros
  net: Add macros for MAC address tracing
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-02 14:26:57 +01:00
Peter Maydell
cbd614870f Merge remote-tracking branch 'remotes/famz/tags/pull-docker-20160601' into staging
v2: Fix warning due to include.
    Various temp dir/file changes.
    Don't use "find -executable" to be compatible with Mac.

# gpg: Signature made Wed 01 Jun 2016 10:30:33 BST using RSA key ID 6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021  AD56 CA35 624C 6A91 71C6

* remotes/famz/tags/pull-docker-20160601:
  .gitignore: Ignore docker source copy
  MAINTAINERS: Add tests/docker
  docker: Add EXTRA_CONFIGURE_OPTS
  docs: Add text for tests/docker in build-system.txt
  docker: Add travis tool
  docker: Add mingw test
  docker: Add clang test
  docker: Add full test
  docker: Add quick test
  docker: Add common.rc
  docker: Add test runner
  docker: Add images
  Makefile: Rules for docker testing
  Makefile: Always include rules.mak
  rules.mak: Add "COMMA" constant
  tests: Add utilities for docker testing

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-06-02 13:42:52 +01:00
Jean-Christophe Dubois
517b5e9a17 Add ENET device to i.MX6 SOC.
This adds the ENET device to the i.MX6 SOC.

This was tested by booting Linux on an Qemu i.MX6 instance and accessing
the internet from the linux guest.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:46 +08:00
Jean-Christophe Dubois
a699b410d7 Add ENET/Gbps Ethernet support to FEC device
The ENET device (present in i.MX6) is "derived" from FEC and backward
compatible with it.

This patch adds the necessary support of the added feature in the ENET
device to allow Linux to use it (on supported processors).

Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:46 +08:00
Jean-Christophe Dubois
db0de35268 i.MX: move FEC device to a register array structure.
This is to prepare for the ENET Gb device of the i.MX6.

Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:46 +08:00
Jean-Christophe Dubois
1bb3c37182 i.MX: Rename i.MX FEC defines to ENET_XXX
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:46 +08:00
Jean-Christophe Dubois
ff4b325f5e i.MX: reset TX/RX descriptors when FEC is disabled.
According to the FEC chapter of i.MX25 reference manual

RX adn TX descriptors are reseted when the FEC device is disabled through ECR.

Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:46 +08:00
Jean-Christophe Dubois
ccdb81d327 i.MX: Fix FEC code for ECR register reset value.
According to the FEC chapter of i.MX25 reference manual ECR register is
initialized at 0xf0000000 at reset time.

We fix the value.

Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:46 +08:00
Jean-Christophe Dubois
b413643a5c i.MX: Fix FEC code for MDIO address selection
According to the FEC chapter of i.MX25 reference manual

When writing to MMFR register, the MDIO device and adress are selected by
bit 27 to 23 and bit 22 to 18 respectively. This is a total of 10 bits
that need to be used by the Phy chip/address decoding function.

This patch fixes the number of bits used from 9 to 10.

Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:46 +08:00
Jean-Christophe Dubois
4816dc168b i.MX: Fix FEC code for MDIO operation selection
According to the FEC chapter of i.MX25 reference manual

When writing the MMFR register, bit 29 and 28 select the requested operation.
 * 10 means read operation with valid MII mgmt frame
 * 11 means read operation with non compliant MII mgmt frame
 * 01 means write operation with valid MII mgmt frame
 * 00 means write operation with non compliant MII mgmt frame

So while bit 28 does change beween read/write for valid MII mgmt frame, the
mening is inverted for non compliant MII mgmt frame.

Bit 29 on the other hand means read/write whatever the type of mgmt frame
involved.

So this patch change the operation selection from bit 28 to bit 29 as it is
more generic.

Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:46 +08:00
Jean-Christophe Dubois
ade6bad111 net: handle optional VLAN header in checksum computation.
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:46 +08:00
Jean-Christophe Dubois
50dbce6538 net: improve UDP/TCP checksum computation.
* based on Eth, UDP, TCP struct present in eth.h instead of hardcoded
   indexes and sizes.
 * based on various macros present in eth.h.

Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:30 +08:00
Dmitry Fleytman
7c375e2294 e1000e: Introduce qtest for e1000e device
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:30 +08:00
Dmitry Fleytman
6f3fbe4ed0 net: Introduce e1000e device emulation
This patch introduces emulation for the Intel 82574 adapter, AKA e1000e.

This implementation is derived from the e1000 emulation code, and
utilizes the TX/RX packet abstractions that were initially developed for
the vmxnet3 device. Although some parts of the introduced code may be
shared with e1000, the differences are substantial enough so that the
only shared resources for the two devices are the definitions in
hw/net/e1000_regs.h.

Similarly to vmxnet3, the new device uses virtio headers for task
offloads (for backends that support virtio extensions). Usage of
virtio headers may be forcibly disabled via a boolean device property
"vnet" (which is enabled by default). In such case task offloads
will be performed in software, in the same way it is done on
backends that do not support virtio headers.

The device code is split into two parts:

  1. hw/net/e1000e.c: QEMU-specific code for a network device;
  2. hw/net/e1000e_core.[hc]: Device emulation according to the spec.

The new device name is e1000e.

Intel specifications for the 82574 controller are available at:
http://www.intel.com/content/dam/doc/datasheet/82574l-gbe-controller-datasheet.pdf

Throughput measurement results (iperf2):

                Fedora 22 guest, TCP, RX
    4 ++------------------------------------------+
      |                                           |
      |                           X   X   X   X   X
  3.5 ++          X   X   X   X                   |
      |       X                                   |
      |                                           |
    3 ++                                          |
G     |   X                                       |
b     |                                           |
/ 2.5 ++                                          |
s     |                                           |
      |                                           |
    2 ++                                          |
      |                                           |
      |                                           |
  1.5 X+                                          |
      |                                           |
      +   +   +   +   +   +   +   +   +   +   +   +
    1 ++--+---+---+---+---+---+---+---+---+---+---+
     32  64  128 256 512  1   2   4   8  16  32  64
      B   B   B   B   B   KB  KB  KB  KB KB  KB  KB
                       Buffer size

               Fedora 22 guest, TCP, TX
  18 ++-------------------------------------------+
     |                        X                   |
  16 ++                           X   X   X   X   X
     |                   X                        |
  14 ++                                           |
     |                                            |
  12 ++                                           |
G    |               X                            |
b 10 ++                                           |
/    |                                            |
s  8 ++                                           |
     |                                            |
   6 ++          X                                |
     |                                            |
   4 ++                                           |
     |       X                                    |
   2 ++  X                                        |
     X   +   +   +   +   +    +   +   +   +   +   +
   0 ++--+---+---+---+---+----+---+---+---+---+---+
    32  64  128 256 512  1    2   4   8  16  32  64
     B   B   B   B   B   KB   KB  KB  KB KB  KB  KB
                       Buffer size

                Fedora 22 guest, UDP, RX
    3 ++------------------------------------------+
      |                                           X
      |                                           |
  2.5 ++                                          |
      |                                           |
      |                                           |
    2 ++                                 X        |
G     |                                           |
b     |                                           |
/ 1.5 ++                                          |
s     |                         X                 |
      |                                           |
    1 ++                                          |
      |                                           |
      |                 X                         |
  0.5 ++                                          |
      |        X                                  |
      X        +        +       +        +        +
    0 ++-------+--------+-------+--------+--------+
     32       64       128     256      512       1
      B        B         B       B        B      KB
                       Datagram size

                Fedora 22 guest, UDP, TX
    1 ++------------------------------------------+
      |                                           X
  0.9 ++                                          |
      |                                           |
  0.8 ++                                          |
  0.7 ++                                          |
      |                                           |
G 0.6 ++                                          |
b     |                                           |
/ 0.5 ++                                          |
s     |                                  X        |
  0.4 ++                                          |
      |                                           |
  0.3 ++                                          |
  0.2 ++                        X                 |
      |                                           |
  0.1 ++                X                         |
      X        X        +       +        +        +
    0 ++-------+--------+-------+--------+--------+
     32       64       128     256      512       1
      B        B         B       B        B      KB
                       Datagram size

              Windows 2012R2 guest, TCP, RX
  3.2 ++------------------------------------------+
      |                                   X       |
    3 ++                                          |
      |                                           |
  2.8 ++                                          |
      |                                           |
  2.6 ++                              X           |
G     |   X                   X   X           X   X
b 2.4 ++      X       X                           |
/     |                                           |
s 2.2 ++                                          |
      |                                           |
    2 ++                                          |
      |           X       X                       |
  1.8 ++                                          |
      |                                           |
  1.6 X+                                          |
      +   +   +   +   +   +   +   +   +   +   +   +
  1.4 ++--+---+---+---+---+---+---+---+---+---+---+
     32  64  128 256 512  1   2   4   8  16  32  64
      B   B   B   B   B   KB  KB  KB  KB KB  KB  KB
                       Buffer size

             Windows 2012R2 guest, TCP, TX
  14 ++-------------------------------------------+
     |                                            |
     |                                        X   X
  12 ++                                           |
     |                                            |
  10 ++                                           |
     |                                            |
G    |                                            |
b  8 ++                                           |
/    |                                    X       |
s  6 ++                                           |
     |                                            |
     |                                            |
   4 ++                               X           |
     |                                            |
   2 ++                                           |
     |           X   X            X               |
     +   X   X   +   +   X    X   +   +   +   +   +
   0 X+--+---+---+---+---+----+---+---+---+---+---+
    32  64  128 256 512  1    2   4   8  16  32  64
     B   B   B   B   B   KB   KB  KB  KB KB  KB  KB
                       Buffer size

              Windows 2012R2 guest, UDP, RX
  1.6 ++------------------------------------------X
      |                                           |
  1.4 ++                                          |
      |                                           |
  1.2 ++                                          |
      |                                  X        |
      |                                           |
G   1 ++                                          |
b     |                                           |
/ 0.8 ++                                          |
s     |                                           |
  0.6 ++                        X                 |
      |                                           |
  0.4 ++                                          |
      |                 X                         |
      |                                           |
  0.2 ++       X                                  |
      X        +        +       +        +        +
    0 ++-------+--------+-------+--------+--------+
     32       64       128     256      512       1
      B        B         B       B        B      KB
                       Datagram size

              Windows 2012R2 guest, UDP, TX
  0.6 ++------------------------------------------+
      |                                           X
      |                                           |
  0.5 ++                                          |
      |                                           |
      |                                           |
  0.4 ++                                          |
G     |                                           |
b     |                                           |
/ 0.3 ++                                 X        |
s     |                                           |
      |                                           |
  0.2 ++                                          |
      |                                           |
      |                         X                 |
  0.1 ++                                          |
      |                 X                         |
      X        X        +       +        +        +
    0 ++-------+--------+-------+--------+--------+
     32       64       128     256      512       1
      B        B         B       B        B      KB
                       Datagram size

Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:29 +08:00
Dmitry Fleytman
093454e21d e1000: Move out code that will be reused in e1000e
Code that will be shared moved to a separate files.

Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:29 +08:00
Dmitry Fleytman
06e7fa0ad7 e1000_regs: Add definitions for Intel 82574-specific bits
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:29 +08:00
Dmitry Fleytman
111710107d vmxnet3: Use pci_dma_* API instead of cpu_physical_memory_*
To make this device and network packets
abstractions ready for IOMMU.

Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:28 +08:00
Dmitry Fleytman
eb700029c7 net_pkt: Extend packet abstraction as required by e1000e functionality
This patch extends the TX/RX packet abstractions with features that will
be used by the e1000e device implementation.

Changes are:

  1. Support iovec lists for RX buffers
  2. Deeper RX packets parsing
  3. Loopback option for TX packets
  4. Extended VLAN headers handling
  5. RSS processing for RX packets

Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:28 +08:00
Dmitry Fleytman
66409b7c8b rtl8139: Move more TCP definitions to common header
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:28 +08:00
Dmitry Fleytman
605d52e62f net_pkt: Name vmxnet3 packet abstractions more generic
This patch drops "vmx" prefix from packet abstractions names
to emphasize the fact they are generic and not tied to any
specific network device.

These abstractions will be reused by e1000e emulation implementation
introduced by following patches so their names need generalization.

This patch (except renamed files, adjusted comments and changes in MAINTAINTERS)
was produced by:

git grep -lz 'vmxnet_tx_pkt' | xargs -0 perl -i'' -pE "s/vmxnet_tx_pkt/net_tx_pkt/g"
git grep -lz 'vmxnet_rx_pkt' | xargs -0 perl -i'' -pE "s/vmxnet_rx_pkt/net_rx_pkt/g"
git grep -lz 'VmxnetTxPkt' | xargs -0 perl -i'' -pE "s/VmxnetTxPkt/NetTxPkt/g"
git grep -lz 'VMXNET_TX_PKT' | xargs -0 perl -i'' -pE "s/VMXNET_TX_PKT/NET_TX_PKT/g"
git grep -lz 'VmxnetRxPkt' | xargs -0 perl -i'' -pE "s/VmxnetRxPkt/NetRxPkt/g"
git grep -lz 'VMXNET_RX_PKT' | xargs -0 perl -i'' -pE "s/VMXNET_RX_PKT/NET_RX_PKT/g"
sed -ie 's/VMXNET_/NET_/g' hw/net/vmxnet_rx_pkt.c
sed -ie 's/VMXNET_/NET_/g' hw/net/vmxnet_tx_pkt.c

Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:27 +08:00
Dmitry Fleytman
ab64787201 vmxnet3: Use common MAC address tracing macros
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:27 +08:00
Dmitry Fleytman
6d1d4939a6 net: Add macros for MAC address tracing
These macros will be used by future commits introducing
e1000e device emulation and by vmxnet3 tracing code.

Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:27 +08:00
Dmitry Fleytman
0478d1ddae net: Introduce Toeplitz hash calculator
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:27 +08:00
Dmitry Fleytman
a4b387e623 vmxnet3: Use generic function for DSN capability definition
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:26 +08:00
Dmitry Fleytman
b56b9285e4 pcie: Introduce function for DSN capability creation
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:26 +08:00
Dmitry Fleytman
6383292ac8 pcie: Add support for PCIe CAP v1
Added support for PCIe CAP v1, while reusing some of the existing v2
infrastructure.

Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:26 +08:00
Dmitry Fleytman
83f17ed278 pci: Introduce define for PM capability version 1.1
Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:09 +08:00
Dmitry Fleytman
3bdfaabbcf msix: make msix_clr_pending() visible for clients
This function will be used by e1000e device code.

Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:42:09 +08:00
Dmitry Fleytman
059a65f3ad pci: fix unaligned access in pci_xxx_quad()
Replace legacy cpu_to_le64w()/le64_to_cpup()
calls with stq_le_p()/ldq_le_p().

Motivation for this modification is that
follow up patches add utility function
pcie_dev_ser_num_init() for PCIe DSN
capability creation which uses
pci_set_quad() with a misaligned offset.

Signed-off-by: Dmitry Fleytman <dmitry.fleytman@ravellosystems.com>
Signed-off-by: Leonid Bloch <leonid.bloch@ravellosystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-02 10:16:53 +08:00
Fam Zheng
0bc7a6f307 .gitignore: Ignore docker source copy
Signed-off-by: Fam Zheng <famz@redhat.com>
2016-06-01 17:27:35 +08:00
Fam Zheng
8a49e97f45 MAINTAINERS: Add tests/docker
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-16-git-send-email-famz@redhat.com
2016-06-01 17:27:35 +08:00
Fam Zheng
35e0f959b5 docker: Add EXTRA_CONFIGURE_OPTS
Whatever passed in this variable will be appended to all
configure commands.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1464755128-32490-15-git-send-email-famz@redhat.com
2016-06-01 17:27:35 +08:00
Fam Zheng
dc2e7eebd8 docs: Add text for tests/docker in build-system.txt
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-14-git-send-email-famz@redhat.com
2016-06-01 17:27:35 +08:00
Fam Zheng
d5bd789198 docker: Add travis tool
The script is not prefixed with test- so it won't run with "make docker-test",
because it can take too long.

Run it with "make docker-travis@ubuntu".

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-13-git-send-email-famz@redhat.com
2016-06-01 17:27:35 +08:00
Fam Zheng
c4f0eed1f3 docker: Add mingw test
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-12-git-send-email-famz@redhat.com
2016-06-01 17:27:35 +08:00
Fam Zheng
c8908570dc docker: Add clang test
The (currently partially commented out) configure options are suggested
by John Snow <jsnow@redhat.com>.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1464755128-32490-11-git-send-email-famz@redhat.com
2016-06-01 17:27:35 +08:00
Fam Zheng
d710ac871c docker: Add full test
This builds all available targets.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1464755128-32490-10-git-send-email-famz@redhat.com
2016-06-01 17:27:35 +08:00
Fam Zheng
b7899d63c8 docker: Add quick test
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-9-git-send-email-famz@redhat.com
2016-06-01 17:27:35 +08:00
Fam Zheng
3568f98ca5 docker: Add common.rc
"requires" checks the "FEATURE" environment for specified prerequisits,
and skip the execution of test if not found.

"build_qemu" is the central routine to compile QEMU for tests to call.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-8-git-send-email-famz@redhat.com
2016-06-01 17:27:35 +08:00
Fam Zheng
b344aa9132 docker: Add test runner
It's better to have a launcher for all tests, to make it easier to
initialize and manage the environment.

If "DEBUG=1"  a shell prompt will show up before the test runs.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-7-git-send-email-famz@redhat.com
2016-06-01 17:27:35 +08:00
Fam Zheng
ca853f0c76 docker: Add images
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-6-git-send-email-famz@redhat.com
2016-06-01 17:27:35 +08:00
Fam Zheng
324027c24c Makefile: Rules for docker testing
This adds a group of make targets to run docker tests, all are available
in source tree without running ./configure.

The usage is shown with "make docker".

Besides the fixed ones, dynamic targets for building each image and
running each test in each image are generated automatically by make,
scanning $(SRC_PATH)/tests/docker/ files with specific patterns.

Alternative to manually list particular targets (docker-TEST@IMAGE)
set, you can control which tests/images to run by filtering variables,
TESTS= and IMAGES=, which are expressed in Makefile pattern syntax,
"foo% %bar ...". For example:

    $ make docker-test IMAGES="ubuntu fedora"

Unfortunately, it's impossible to propagate "-j $JOBS" into make in
containers, however since each combination is made a first class target
in the top Makefile, "make -j$N docker-test" still parallels the tests
coarsely.

Still, $J is made a magic variable to let all make invocations in
containers to use -j$J.

Instead of providing a live version of the source tree to the docker
container we snapshot it with git-archive. This ensures the tree is in a
pristine state for whatever operations the container is going to run on
them.

Uncommitted changes known to files known by the git index will be
included in the snapshot if there are any.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 1464755128-32490-5-git-send-email-famz@redhat.com
2016-06-01 17:27:34 +08:00
Fam Zheng
fb57c88102 Makefile: Always include rules.mak
When config-host.mak is not found it is safe to assume SRC_PATH is ".".
So, it is okay to move inclusion of ruls.mak out of the ifeq condition.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-4-git-send-email-famz@redhat.com
2016-06-01 17:25:50 +08:00
Fam Zheng
2f4e4dc237 rules.mak: Add "COMMA" constant
Using "," literal in $(call quiet-command, ...) arguments is awkward.
Add this constant to make it at least doable.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-3-git-send-email-famz@redhat.com
2016-06-01 17:25:50 +08:00
Fam Zheng
4485b04be9 tests: Add utilities for docker testing
docker.py is added with a number of useful subcommands to manager docker
images and instances for QEMU docker testing. Subcommands are:

run: A wrapper of "docker run" (or "sudo -n docker run" if necessary),
which takes care of killing and removing the running container at
SIGINT.

clean: Tear down all the containers including inactive ones that are
started by docker_run.

build: Compare an image from given dockerfile and rebuild it if they're
different.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 1464755128-32490-2-git-send-email-famz@redhat.com
2016-06-01 17:25:50 +08:00
Zhang Chen
16a3df403b net/net: Add SocketReadState for reuse codes
This function is from net/socket.c, move it to net.c and net.h.
Add SocketReadState to make others reuse net_fill_rstate().
suggestion from jason.

v4:
 - move 'rs->finalize = finalize' to rs_init()

v3:
 - remove SocketReadState init callback
 - put finalize callback to net_fill_rstate()

v2:
 - rename ReadState to SocketReadState
 - add SocketReadState init and finalize callback

v1:
 - init patch

Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-01 09:25:29 +08:00
Eduardo Habkost
d30300f771 net: vl: Move default_net to vl.c
All handling of defaults (default_* variables) is inside vl.c,
move default_net there too, so we can more easily refactor that
code later.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-06-01 09:25:29 +08:00
Peter Maydell
500acc9c41 Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160531' into staging
ppc patch queue for 2016-05-31

Here's another ppc patch queue.  This batch is all preliminaries
towards two significant features:

1) Full hypervisor-mode support for POWER8
    Patches 1-8 start fixing various bugs with TCG's handling of
    hypervisor mode

2) CPU hotplug support
    Patches 9-12 make some preliminary fixes towards implementing CPU
    hotplug on ppc64 (and other non-x86 platforms).  These patches are
    actually to generic code, not ppc, but are included here with
    Paolo's ACK.

# gpg: Signature made Tue 31 May 2016 01:39:44 BST using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.7-20160531:
  cpu: Add a sync version of cpu_remove()
  cpu: Reclaim vCPU objects
  exec: Do vmstate unregistration from cpu_exec_exit()
  exec: Remove cpu from cpus list during cpu_exec_exit()
  ppc: Add PPC_64H instruction flag to POWER7 and POWER8
  ppc: Get out of emulation on SMT "OR" ops
  ppc: Fix sign extension issue in mtmsr(d) emulation
  ppc: Change 'invalid' bit mask of tlbiel and tlbie
  ppc: tlbie, tlbia and tlbisync are HV only
  ppc: Do some batching of TCG tlb flushes
  ppc: Use split I/D mmu modes to avoid flushes on interrupts
  ppc: Remove MMU_MODEn_SUFFIX definitions

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-31 10:37:22 +01:00
Peter Maydell
07e070aac4 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* docs/atomics fixes and atomic_rcu_* optimization (Emilio)
* NBD bugfix (Eric)
* Memory fixes and cleanups (Paolo, Paul)
* scsi-block support for SCSI status, including persistent
  reservations (Paolo)
* kvm_stat moves to the Linux repository
* SCSI bug fixes (Peter, Prasad)
* Killing qemu_char_get_next_serial, non-ARM parts (Xiaoqiang)

# gpg: Signature made Sun 29 May 2016 08:11:20 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream: (30 commits)
  exec: hide mr->ram_addr from qemu_get_ram_ptr users
  memory: split memory_region_from_host from qemu_ram_addr_from_host
  exec: remove ram_addr argument from qemu_ram_block_from_host
  memory: remove qemu_get_ram_fd, qemu_set_ram_fd, qemu_ram_block_host_ptr
  scsi-generic: Merge block max xfer len in INQUIRY response
  scsi-block: always use SG_IO
  scsi-disk: introduce scsi_disk_req_check_error
  scsi-disk: add need_fua_emulation to SCSIDiskClass
  scsi-disk: introduce dma_readv and dma_writev
  scsi-disk: introduce a common base class
  xen-hvm: ignore background I/O sections
  docs/atomics: update comparison with Linux
  atomics: do not emit consume barrier for atomic_rcu_read
  atomics: emit an smp_read_barrier_depends() barrier only for Alpha and Thread Sanitizer
  docs/atomics: update atomic_read/set comparison with Linux
  bt: rewrite csrhci_write to avoid out-of-bounds writes
  block/iscsi: avoid potential overflow of acb->task->cdb
  scsi: megasas: check 'read_queue_head' index value
  scsi: megasas: initialise local configuration data buffer
  scsi: megasas: use appropriate property buffer size
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-31 09:29:23 +01:00
Bharata B Rao
2c579042e3 cpu: Add a sync version of cpu_remove()
This sync API will be used by the CPU hotplug code to wait for the CPU to
completely get removed before flagging the failure to the device_add
command.

Sync version of this call is needed to correctly recover from CPU
realization failures when ->plug() handler fails.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-30 14:17:05 +10:00
Gu Zheng
4c055ab54f cpu: Reclaim vCPU objects
In order to deal well with the kvm vcpus (which can not be removed without any
protection), we do not close KVM vcpu fd, just record and mark it as stopped
into a list, so that we can reuse it for the appending cpu hot-add request if
possible. It is also the approach that kvm guys suggested:
https://www.mail-archive.com/kvm@vger.kernel.org/msg102839.html

Signed-off-by: Chen Fan <chen.fan.fnst@cn.fujitsu.com>
Signed-off-by: Gu Zheng <guz.fnst@cn.fujitsu.com>
Signed-off-by: Zhu Guihua <zhugh.fnst@cn.fujitsu.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
               [- Explicit CPU_REMOVE() from qemu_kvm/tcg_destroy_vcpu()
                  isn't needed as it is done from cpu_exec_exit()
                - Use iothread mutex instead of global mutex during
                  destroy
                - Don't cleanup vCPU object from vCPU thread context
                  but leave it to the callers (device_add/device_del)]
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-30 14:03:59 +10:00
Bharata B Rao
9dfeca7c6b exec: Do vmstate unregistration from cpu_exec_exit()
cpu_exec_init() does vmstate_register for the CPU device. This needs to be
undone from cpu_exec_exit(). This change is needed to support CPU hot
removal.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
[dwg: added missing include to fix compile on some archs]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-30 14:03:29 +10:00
Bharata B Rao
1c59eb39cf exec: Remove cpu from cpus list during cpu_exec_exit()
CPUState *cpu gets added to the cpus list during cpu_exec_init(). It
should be removed from cpu_exec_exit().

cpu_exec_exit() is called from generic CPU::instance_finalize and some
archs like PowerPC call it from CPU unrealizefn. So ensure that we
dequeue the cpu only once.

Now -1 value for cpu->cpu_index indicates that we have already dequeued
the cpu for CONFIG_USER_ONLY case also.

Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-30 13:22:20 +10:00
Benjamin Herrenschmidt
4e0806110c ppc: Add PPC_64H instruction flag to POWER7 and POWER8
This will enable decoding of hrfid

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-30 13:20:04 +10:00
Benjamin Herrenschmidt
b68e60e6f0 ppc: Get out of emulation on SMT "OR" ops
Otherwise tight loops at smt_low for example, which OPAL does,
eat so much CPU that we can't boot a kernel anymore. With that,
I can boot 8 CPUs just fine with powernv.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-30 13:20:04 +10:00
Michael Neuling
c409bc5daf ppc: Fix sign extension issue in mtmsr(d) emulation
Signed-off-by: Michael Neuling <mikey@neuling.org>
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-30 13:20:04 +10:00
Benjamin Herrenschmidt
f9ef0527ff ppc: Change 'invalid' bit mask of tlbiel and tlbie
Otherwise it will trip on the forms used in recent architecture.

Ideally, we should have different handlers for different architecture
levels but our current implementation of TLB flushing is dumb enough
that this will do for now.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-30 13:20:04 +10:00
Benjamin Herrenschmidt
74693da988 ppc: tlbie, tlbia and tlbisync are HV only
Not that anything remotely recent supports tlbia but ...

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-30 13:20:04 +10:00
Benjamin Herrenschmidt
cd0c6f4735 ppc: Do some batching of TCG tlb flushes
On ppc64 especially, we flush the tlb on any slbie or tlbie instruction.

However, those instructions often come in bursts of 3 or more (context
switch will favor a series of slbie's for example to an slbia if the
SLB has less than a certain number of entries in it, and tlbie's can
happen in a series, with PAPR, H_BULK_REMOVE can remove up to 4 entries
at a time.

Doing a tlb_flush() each time is a waste of time. We end up doing a memset
of the whole TLB, reloading it for the next instruction, memset'ing again,
etc...

Those instructions don't have to take effect immediately. For slbie, they
can wait for the next context synchronizing event. For tlbie, the next
tlbsync.

This implements batching by keeping a flag that indicates that we have a
TLB in need of flushing. We check it on interrupts, rfi's, isync's and
tlbsync and flush the TLB if needed.

This reduces the number of tlb_flush() on a boot to a ubuntu installer
first dialog screen from roughly 360K down to 36K.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: added a 'CPUPPCState *' variable in h_remove() and
      h_bulk_remove() ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: removed spurious whitespace change, use 0/1 not true/false
      consistently, since tlb_need_flush has int type]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-30 13:20:04 +10:00
Benjamin Herrenschmidt
9fb0449114 ppc: Use split I/D mmu modes to avoid flushes on interrupts
We rework the way the MMU indices are calculated, providing separate
indices for I and D side based on MSR:IR and MSR:DR respectively,
and thus no longer need to flush the TLB on context changes. This also
adds correct support for HV as a separate address space.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-30 13:20:04 +10:00
Benjamin Herrenschmidt
5fd1111b20 ppc: Remove MMU_MODEn_SUFFIX definitions
We don't use the resulting accessors and this gets in the way of
the split I/D TLB work.

Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-30 13:20:04 +10:00
Paolo Bonzini
0878d0e11b exec: hide mr->ram_addr from qemu_get_ram_ptr users
Let users of qemu_get_ram_ptr and qemu_ram_ptr_length pass in an
address that is relative to the MemoryRegion.  This basically means
what address_space_translate returns.

Because the semantics of the second parameter change, rename the
function to qemu_map_ram_ptr.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:12 +02:00
Paolo Bonzini
07bdaa4196 memory: split memory_region_from_host from qemu_ram_addr_from_host
Move the old qemu_ram_addr_from_host to memory_region_from_host and
make it return an offset within the region.  For qemu_ram_addr_from_host
return the ram_addr_t directly, similar to what it was before
commit 1b5ec23 ("memory: return MemoryRegion from qemu_ram_addr_from_host",
2013-07-04).

Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:12 +02:00
Paolo Bonzini
f615f39616 exec: remove ram_addr argument from qemu_ram_block_from_host
Of the two callers, one does not use it, and the other can compute
it itself based on the other output argument (offset) and the RAMBlock.

Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:12 +02:00
Paolo Bonzini
4ff87573df memory: remove qemu_get_ram_fd, qemu_set_ram_fd, qemu_ram_block_host_ptr
Remove direct uses of ram_addr_t and optimize memory_region_{get,set}_fd
now that a MemoryRegion knows its RAMBlock directly.

Reviewed-by: Marc-André Lureau <marcandre.lureau@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:12 +02:00
Fam Zheng
063143d5b1 scsi-generic: Merge block max xfer len in INQUIRY response
The rationale is similar to the above mode sense response interception:
this is practically the only channel to communicate restraints from
elsewhere such as host and block driver.

The scsi bus we attach onto can have a larger max xfer len than what is
accepted by the host file system (guarding between the host scsi LUN and
QEMU), in which case the SG_IO we generate would get -EINVAL.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1464243305-10661-3-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:12 +02:00
Paolo Bonzini
8fdc7839e4 scsi-block: always use SG_IO
Using pread/pwrite or io_submit has the advantage of eliminating the
bounce buffer, but drops the SCSI status.  This keeps the guest from
seeing unit attention codes, as well as statuses such as RESERVATION
CONFLICT.  Because we know scsi-block operates on an SBC device we can
still use the DMA helpers with SG_IO; just remember to patch the CDBs
if the transfer is split into multiple segments.

This means that scsi-block will always use the thread-pool unfortunately,
instead of respecting aio=native.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:11 +02:00
Paolo Bonzini
5b956f415a scsi-disk: introduce scsi_disk_req_check_error
Commonize all the checks for canceled requests and errors.  The next patch
will add another case to check for, in order to handle passthrough commands.

There is no semantic change here; the only nontrivial modification is in
scsi_write_do_fua, where cancellation has been checked earlier by both
callers.  Thus, the check is replaced with an assertion.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:11 +02:00
Paolo Bonzini
94f8ba1125 scsi-disk: add need_fua_emulation to SCSIDiskClass
scsi-block will be able to do FUA just by passing the request through
to the LUN (which is also more efficient); there is no need to emulate
it like we do for scsi-disk.

Add a new method to distinguish this.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:11 +02:00
Paolo Bonzini
fcaafb1001 scsi-disk: introduce dma_readv and dma_writev
These are replacements for blk_aio_readv and blk_aio_writev that allow
customization of the data path.  They reuse the DMA helpers' DMAIOFunc
callback type, so that the same function can be used in either the
QEMUSGList or the bounce-buffered case.

This customization will be needed in the next patch to do zero-copy
SG_IO on scsi-block.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:11 +02:00
Paolo Bonzini
993935f315 scsi-disk: introduce a common base class
This will be the place to add DMAIOFuncs in the next patch.  There
are also a couple DeviceClass members that can be moved to the
abstract class's initialization function.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:11 +02:00
Paul Durrant
a8ff431679 xen-hvm: ignore background I/O sections
Since Xen will correctly handle accesses to unimplemented I/O ports (by
returning all 1's for reads and ignoring writes) there is no need for
QEMU to register backgroud I/O sections.

This patch therefore adds checks to xen_io_add/del so that sections with
memory-region ops pointing at 'unassigned_io_ops' are ignored.

Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1462811480-16295-1-git-send-email-paul.durrant@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:11 +02:00
Paolo Bonzini
a4a0e4b258 docs/atomics: update comparison with Linux
Over time, some differences between QEMU and Linux atomics are getting
smoothed.  In particular, Linux grew atomic_fetch_or (and in general
the differences regarding RMW operations were not described accurately)
and smp_load_acquire/smp_store_release.  Also, set_mb was renamed to
smp_store_mb().  Include these changes in the documentation.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:11 +02:00
Emilio G. Cota
15487aa132 atomics: do not emit consume barrier for atomic_rcu_read
Currently we emit a consume-load in atomic_rcu_read.  Because of
limitations in current compilers, this is overkill for non-Alpha hosts
and it is only useful to make Thread Sanitizer work.

This patch leaves the consume-load in atomic_rcu_read when
compiling with Thread Sanitizer enabled, and resorts to a
relaxed load + smp_read_barrier_depends otherwise.

On an RMO host architecture, such as aarch64, the performance
improvement of this change is easily measurable. For instance,
qht-bench performs an atomic_rcu_read on every lookup. Performance
before and after applying this patch:

$ tests/qht-bench -d 5 -n 1
Before: 9.78 MT/s
After:  10.96 MT/s

Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1464120374-8950-4-git-send-email-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:11 +02:00
Emilio G. Cota
c983895258 atomics: emit an smp_read_barrier_depends() barrier only for Alpha and Thread Sanitizer
For correctness, smp_read_barrier_depends() is only required to
emit a barrier on Alpha hosts. However, we are currently emitting
a consume fence unconditionally, and most compilers currently treat
consume and acquire fences as equivalent.

Fix it by keeping the consume fence if we're compiling with Thread
Sanitizer, since this might help prevent false warnings. Otherwise,
only emit the barrier for Alpha hosts. Note that we still guarantee
that smp_read_barrier_depends() is a compiler barrier.

Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1464120374-8950-3-git-send-email-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:11 +02:00
Emilio G. Cota
56ebe02203 docs/atomics: update atomic_read/set comparison with Linux
Recently Linux did a mass conversion of its atomic_read/set calls
so that they at least are READ/WRITE_ONCE. See Linux's commit
62e8a325 ("atomic, arch: Audit atomic_{read,set}()"). It seems though
that their documentation hasn't been updated to reflect this.

The appended updates our documentation to reflect the change, which
means there is effectively no difference between our atomic_read/set
and the current Linux implementation.

While at it, fix the statement that a barrier is implied by
atomic_read/set, which is incorrect. Volatile/atomic semantics prevent
transformations pertaining the variable they apply to; this, however,
has no effect on surrounding statements like barriers do. For more
details on this, see:
  https://gcc.gnu.org/onlinedocs/gcc/Volatiles.html

Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1464120374-8950-2-git-send-email-cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:11 +02:00
Paolo Bonzini
141af038dd bt: rewrite csrhci_write to avoid out-of-bounds writes
The usage of INT_MAX in this function confuses Coverity.  I think
the defect is bogus, however there is no protection against
getting more than sizeof(s->inpkt) bytes from the character device
backend.

Rewrite the function to only fill in as much data as needed from
buf into s->inpkt.  The plen variable is replaced by a simple
state machine and there is no need anymore to shift contents to
the beginning of s->inpkt.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:11 +02:00
Peter Lieven
a6b3167fa0 block/iscsi: avoid potential overflow of acb->task->cdb
at least in the path via virtio-blk the maximum size is not
restricted.

Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <1464080368-29584-1-git-send-email-pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:11 +02:00
Prasad J Pandit
b60bdd1f1e scsi: megasas: check 'read_queue_head' index value
While doing MegaRAID SAS controller command frame lookup, routine
'megasas_lookup_frame' uses 'read_queue_head' value as an index
into 'frames[MEGASAS_MAX_FRAMES=2048]' array. Limit its value
within array bounds to avoid any OOB access.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1464179110-18593-1-git-send-email-ppandit@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:11 +02:00
Prasad J Pandit
d37af74073 scsi: megasas: initialise local configuration data buffer
When reading MegaRAID SAS controller configuration via MegaRAID
Firmware Interface(MFI) commands, routine megasas_dcmd_cfg_read
uses an uninitialised local data buffer. Initialise this buffer
to avoid stack information leakage.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1464178304-12831-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:10 +02:00
Prasad J Pandit
1b85898025 scsi: megasas: use appropriate property buffer size
When setting MegaRAID SAS controller properties via MegaRAID
Firmware Interface(MFI) commands, a user supplied size parameter
is used to set property value. Use appropriate size value to avoid
OOB access issues.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1464172291-2856-2-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:10 +02:00
Prasad J Pandit
06630554cc scsi: mptsas: infinite loop while fetching requests
The LSI SAS1068 Host Bus Adapter emulator in Qemu, periodically
looks for requests and fetches them. A loop doing that in
mptsas_fetch_requests() could run infinitely if 's->state' was
not operational. Move check to avoid such a loop.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Cc: qemu-stable@nongnu.org
Message-Id: <1464077264-25473-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:10 +02:00
Prasad J Pandit
3e831b40e0 scsi: pvscsi: check command descriptor ring buffer size (CVE-2016-4952)
Vmware Paravirtual SCSI emulation uses command descriptors to
process SCSI commands. These descriptors come with their ring
buffers. A guest could set the ring buffer size to an arbitrary
value leading to OOB access issue. Add check to avoid it.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Cc: qemu-stable@nongnu.org
Message-Id: <1464000485-27041-1-git-send-email-ppandit@redhat.com>
Reviewed-by: Shmulik Ladkani <shmulik.ladkani@ravellosystems.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:10 +02:00
Paolo Bonzini
60b412dd18 kvm_stat: Remove
The source has moved to the Linux kernel tree.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:10 +02:00
Eric Blake
353ab96973 nbd: Don't trim unrequested bytes
Similar to commit df7b97ff, we are mishandling clients that
give an unaligned NBD_CMD_TRIM request, and potentially
trimming bytes that occur before their request; which in turn
can cause potential unintended data loss (unlikely in
practice, since most clients are sane and issue aligned trim
requests).  However, while we fixed read and write by switching
to the byte interfaces of blk_, we don't yet have a byte
interface for discard.  On the other hand, trim is advisory, so
rounding the user's request to simply ignore the first and last
unaligned sectors (or the entire request, if it is sub-sector
in length) is just fine.

CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1464173965-9694-1-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:10 +02:00
xiaoqiang zhao
e269fbe231 hw/char: QOM'ify milkymist-uart.c
drop the qemu_char_get_next_serial and use chardev prop instead

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-Id: <1464158344-12266-6-git-send-email-zxq_yx_007@163.com>
Tested-by: Michael Walle <michael@walle.cc>
Acked-by: Michael Walle <michael@walle.cc>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:10 +02:00
xiaoqiang zhao
7aaefcaf66 hw/char: QOM'ify lm32_uart.c
* Drop the old SysBus init function and use instance_init
* Call qemu_chr_add_handlers in the realize callback
* Use qdev chardev prop instead of qemu_char_get_next_serial
* Add lm32_uart_create function to create lm32 uart device

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-Id: <1464158344-12266-5-git-send-email-zxq_yx_007@163.com>
Tested-by: Michael Walle <michael@walle.cc>
Acked-by: Michael Walle <michael@walle.cc>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:10 +02:00
xiaoqiang zhao
c2ddaa62b6 hw/char: QOM'ify lm32_juart.c
* Drop the old SysBus init function
* Call qemu_chr_add_handlers in the realize callback
* Use qdev chardev prop instead of qemu_char_get_next_serial

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-Id: <1464158344-12266-4-git-send-email-zxq_yx_007@163.com>
Tested-by: Michael Walle <michael@walle.cc>
Acked-by: Michael Walle <michael@walle.cc>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:10 +02:00
xiaoqiang zhao
8290de92b8 hw/char: QOM'ify etraxfs_ser.c
* Drop the old SysBus init function and use instance_init
* Call qemu_chr_add_handlers in the realize callback
* Use qdev chardev prop instead of qemu_char_get_next_serial
* Add etraxfs_ser_create function to create etraxfs serial device

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-Id: <1464158344-12266-3-git-send-email-zxq_yx_007@163.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:10 +02:00
xiaoqiang zhao
e7c9136977 hw/char: QOM'ify escc.c
* Drop the old SysBus init function and use instance_init
* Call qemu_chr_add_handlers in the realize callback

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-Id: <1464158344-12266-2-git-send-email-zxq_yx_007@163.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:10 +02:00
Paolo Bonzini
b138e654a0 Revert "memory: Drop FlatRange.romd_mode"
This reverts commit 5b5660adf1,
as it breaks the UEFI guest firmware (known as ArmVirtPkg or AAVMF)
running in the "virt" machine type of "qemu-system-aarch64":

Contrary to the commit message, (a->mr == b->mr) does *not* imply
that (a->romd_mode == b->romd_mode): the pflash device model calls
memory_region_rom_device_set_romd() -- for switching between the above
modes --, and that function changes mr->romd_mode but the current
AddressSpaceDispatch's FlatRange keeps the old value.  Therefore
region_del/region_add are not called on the KVM MemoryListener.

Reported-by: Drew Jones <drjones@redhat.com>
Tested-by: Drew Jones <drjones@redhat.com>
Analyzed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-29 09:11:10 +02:00
Peter Maydell
d6550e9ed2 Merge remote-tracking branch 'remotes/riku/tags/pull-linux-user-20160527' into staging
linux-user pull request v2 for may 2016

# gpg: Signature made Fri 27 May 2016 12:51:10 BST using RSA key ID DE3C9BC0
# gpg: Good signature from "Riku Voipio <riku.voipio@iki.fi>"
# gpg:                 aka "Riku Voipio <riku.voipio@linaro.org>"

* remotes/riku/tags/pull-linux-user-20160527: (38 commits)
  linux-user,target-ppc: fix use of MSR_LE
  linux-user/signal.c: Use s390 target space address instead of host space
  linux-user/signal.c: Use target address instead of host address for microblaze restorer
  linux-user/signal.c: Generate opcode data for restorer in setup_rt_frame
  linux-user: arm: Remove ARM_cpsr and similar #defines
  linux-user: Use direct syscalls for setuid(), etc
  linux-user: x86_64: Don't use 16-bit UIDs
  linux-user: Use g_try_malloc() in do_msgrcv()
  linux-user: Handle msgrcv error case correctly
  linux-user: Handle negative values in timespec conversion
  linux-user: Use safe_syscall for futex syscall
  linux-user: Use safe_syscall for pselect, select syscalls
  linux-user: Use safe_syscall for execve syscall
  linux-user: Use safe_syscall for wait system calls
  linux-user: Use safe_syscall for open and openat system calls
  linux-user: Use safe_syscall for read and write system calls
  linux-user: Provide safe_syscall for fixing races between signals and syscalls
  linux-user: Add debug code to exercise restarting system calls
  linux-user: Support for restarting system calls for Microblaze targets
  linux-user: Set r14 on exit from microblaze syscall
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-27 14:05:48 +01:00
Laurent Vivier
49e55cbacf linux-user,target-ppc: fix use of MSR_LE
setup_frame()/setup_rt_frame()/restore_user_regs() are using
MSR_LE as the similar kernel functions do: as a bitmask.

But in QEMU, MSR_LE is a bit position, so change this
accordingly.

The previous code was doing nothing as MSR_LE is 0,
and "env->msr &= ~MSR_LE" doesn't change the value of msr.

And yes, a user process can change its endianness,
see linux kernel commit:

    fab5db9 [PATCH] powerpc: Implement support for setting little-endian mode via prctl

and prctl(2): PR_SET_ENDIAN, PR_GET_ENDIAN

Reviewed-by: Thomas Huth <huth@tuxfamily.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:50:40 +03:00
Chen Gang
5b1d59d0bb linux-user/signal.c: Use s390 target space address instead of host space
The return address is in target space, so the restorer address needs to
be target space, too.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
2016-05-27 14:50:40 +03:00
Chen Gang
166c97edd6 linux-user/signal.c: Use target address instead of host address for microblaze restorer
The return address is in target space, so the restorer address needs to
be target space, too.

Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:50:40 +03:00
Chen Gang
f1d9d1071c linux-user/signal.c: Generate opcode data for restorer in setup_rt_frame
Original implementation uses do_rt_sigreturn directly in host space,
when a guest program is in unwind procedure in guest space, it will get
an incorrect restore address, then causes unwind failure.

Also cleanup the original incorrect indentation.

Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Chen Gang <gang.chen.5i5j@gmail.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:50:39 +03:00
Peter Maydell
167e4cdc29 linux-user: arm: Remove ARM_cpsr and similar #defines
The #defines of ARM_cpsr and friends in linux-user/arm/target-syscall.h
can clash with versions in the system headers if building on an
ARM or AArch64 build (though this seems to be dependent on the version
of the system headers). The QEMU defines are not very useful (it's
not clear that they're intended for use with the target_pt_regs struct
rather than (say) the CPUARMState structure) and we only use them in one
function in elfload.c anyway. So just remove the #defines and directly
access regs->uregs[].

Reported-by: Christopher Covington <cov@codeaurora.org>
Tested-by: Christopher Covington <cov@codeaurora.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:50:39 +03:00
Peter Maydell
fd6f7798ac linux-user: Use direct syscalls for setuid(), etc
On Linux the setuid(), setgid(), etc system calls have different semantics
from the libc functions. The libc functions follow POSIX and update the
credentials for all threads in the process; the system calls update only
the thread which makes the call. (This impedance mismatch is worked around
in libc by signalling all threads to tell them to do a syscall, in a
byzantine and fragile way; see http://ewontfix.com/17/.)

Since in linux-user we are trying to emulate the system call semantics,
we must implement all these syscalls to directly call the underlying
host syscall, rather than calling the host libc function.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:50:39 +03:00
Peter Maydell
716f3fbef2 linux-user: x86_64: Don't use 16-bit UIDs
The 64-bit x86 syscall ABI uses 32-bit UIDs; only define
USE_UID16 for 32-bit x86.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:50:39 +03:00
Peter Maydell
415d847110 linux-user: Use g_try_malloc() in do_msgrcv()
In do_msgrcv() we want to allocate a message buffer, whose size
is passed to us by the guest. That means we could legitimately
fail, so use g_try_malloc() and handle the error case, in the same
way that do_msgsnd() does.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:50:39 +03:00
Peter Maydell
99874f6552 linux-user: Handle msgrcv error case correctly
The msgrcv ABI is a bit odd -- the msgsz argument is a size_t, which is
unsigned, but it must fail EINVAL if the value is negative when cast
to a long. We were incorrectly passing the value through an
"unsigned int", which meant that if the guest was 32-bit longs and
the host was 64-bit longs an input of 0xffffffff (which should trigger
EINVAL) would simply be passed to the host msgrcv() as 0xffffffff,
where it does not cause the host kernel to reject it.
Follow the same approach as do_msgsnd() in using a ssize_t and
doing the check for negative values by hand, so we correctly fail
in this corner case.

This fixes the msgrcv03 Linux Test Project test case, which otherwise
hangs.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:50:39 +03:00
Peter Maydell
c7e35da348 linux-user: Handle negative values in timespec conversion
In a struct timespec, both fields are signed longs. Converting
them from guest to host with code like
    host_ts->tv_sec = tswapal(target_ts->tv_sec);
mishandles negative values if the guest has 32-bit longs and
the host has 64-bit longs because tswapal()'s return type is
abi_ulong: the assignment will zero-extend into the host long
type rather than sign-extending it.

Make the conversion routines use __get_user() and __set_user()
instead: this automatically picks up the signedness of the
field type and does the correct kind of sign or zero extension.
It also handles the possibility that the target struct is not
sufficiently aligned for the host's requirements.

In particular, this fixes a hang when running the Linux Test Project
mq_timedsend01 and mq_timedreceive01 tests: one of the test cases
sets the timeout to -1 and expects an EINVAL failure, but we were
setting a very long timeout instead.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:50:39 +03:00
Peter Maydell
d509eeb13c linux-user: Use safe_syscall for futex syscall
Use the safe_syscall wrapper for the futex syscall.

In particular, this fixes hangs when using programs that link
against the Boehm garbage collector, including the Mono runtime.

(We don't change the sys_futex() call in the implementation of
the exit syscall, because as the FIXME comment there notes
that should be handled by disabling signals, since we can't
easily back out if the futex were to return ERESTARTSYS.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:50:39 +03:00
Peter Maydell
6df9d38d33 linux-user: Use safe_syscall for pselect, select syscalls
Use the safe_syscall wrapper for the pselect and select syscalls.
Since not every architecture has the select syscall, we now
have to implement select in terms of pselect, which means doing
timeval<->timespec conversion.

(Five years on from the initial patch that added pselect support
to QEMU and a decade after pselect6 went into the kernel, it seems
safe to not try to support hosts with header files which don't
define __NR_pselect6.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:50:38 +03:00
Timothy E Baldwin
ffdcbe223d linux-user: Use safe_syscall for execve syscall
Wrap execve() in the safe-syscall handling. Although execve() is not
an interruptible syscall, it is a special case: if we allow a signal
to happen before we make the host$ syscall then we will 'lose' it,
because at the point of execve the process leaves QEMU's control.  So
we use the safe syscall wrapper to ensure that we either take the
signal as a guest signal, or else it does not happen before the
execve completes and makes it the other program's problem.

The practical upshot is that without this SIGTERM could fail to
terminate the process.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-25-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: expanded commit message to explain in more detail why this is
 needed, and add comment about it too]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:50:38 +03:00
Timothy E Baldwin
4af80a3783 linux-user: Use safe_syscall for wait system calls
Use safe_syscall for waitpid, waitid and wait4 syscalls. Note that this
change allows us to implement support for waitid's fifth (rusage) argument
in future; for the moment we ignore it as we have done up til now.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-18-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: Adjust to new safe_syscall convention. Add fifth waitid syscall argument
 (which isn't present in the libc interface but is in the syscall ABI)]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:50:38 +03:00
Timothy E Baldwin
c10a07387b linux-user: Use safe_syscall for open and openat system calls
Restart open() and openat() if signals occur before,
or during with SA_RESTART.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-17-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: Adjusted to follow new -1-and-set-errno safe_syscall convention]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:50:38 +03:00
Timothy E Baldwin
50afd02b84 linux-user: Use safe_syscall for read and write system calls
Restart read() and write() if signals occur before, or during with SA_RESTART

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-15-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: Update to new safe_syscall() convention of setting errno]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:50:38 +03:00
Timothy E Baldwin
4d330cee37 linux-user: Provide safe_syscall for fixing races between signals and syscalls
If a signal is delivered immediately before a blocking system call the
handler will only be called after the system call returns, which may be a
long time later or never.

This is fixed by using a function (safe_syscall) that checks if a guest
signal is pending prior to making a system call, and if so does not call the
system call and returns -TARGET_ERESTARTSYS. If a signal is received between
the check and the system call host_signal_handler() rewinds execution to
before the check. This rewinding has the effect of closing the race window
so that safe_syscall will reliably either (a) go into the host syscall
with no unprocessed guest signals pending or or (b) return
-TARGET_ERESTARTSYS so that the caller can deal with the signals.
Implementing this requires a per-host-architecture assembly language
fragment.

This will also resolve the mishandling of the SA_RESTART flag where
we would restart a host system call and not call the guest signal handler
until the syscall finally completed -- syscall restarting now always
happens at the guest syscall level so the guest signal handler will run.
(The host syscall will never be restarted because if the host kernel
rewinds the PC to point at the syscall insn for a restart then our
host_signal_handler() will see this and arrange the guest PC rewind.)

This commit contains the infrastructure for implementing safe_syscall
and the assembly language fragment for x86-64, but does not change any
syscalls to use it.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-14-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM:
 * Avoid having an architecture if-ladder in configure by putting
   linux-user/host/$(ARCH) on the include path and including
   safe-syscall.inc.S from it
 * Avoid ifdef ladder in signal.c by creating new hostdep.h to hold
   host-architecture-specific things
 * Added copyright/license header to safe-syscall.inc.S
 * Rewrote commit message
 * Added comments to safe-syscall.inc.S
 * Changed calling convention of safe_syscall() to match syscall()
   (returns -1 and host error in errno on failure)
 * Added a long comment in qemu.h about how to use safe_syscall()
   to implement guest syscalls.
]
RV: squashed Peters "fixup! linux-user: compile on non-x86-64 hosts"
patch
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-27 14:49:51 +03:00
Timothy E Baldwin
71a8f7fece linux-user: Add debug code to exercise restarting system calls
If DEBUG_ERESTARTSYS is set restart all system calls once. This
is pure debug code for exercising the syscall restart code paths
in the per-architecture cpu main loops.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-10-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: Add comment and a commented-out #define next to the commented-out
 generic DEBUG #define; remove the check on TARGET_USE_ERESTARTSYS;
 tweak comment message]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:49:51 +03:00
Timothy E Baldwin
4134ecfeb9 linux-user: Support for restarting system calls for Microblaze targets
Update the Microblaze main loop and sigreturn code:
 * on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
 * set all guest CPU state within signal.c code on sigreturn
 * handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
   that the main loop should not touch any guest CPU state

Note that this in passing fixes a bug where we were corrupting
the guest r[3] on sigreturn with the guest's r[10] because
do_sigreturn() was returning env->regs[10] but the register for
syscall return values is env->regs[3].

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-11-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: Commit message tweaks; drop TARGET_USE_ERESTARTSYS define;
 drop whitespace changes]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:49:51 +03:00
Peter Maydell
d7749ab770 linux-user: Set r14 on exit from microblaze syscall
All syscall exits on microblaze result in r14 being equal to the
PC we return to, because the kernel syscall exit instruction "rtbd"
does this. (This is true even for sigreturn(); note that r14 is
not a userspace-usable register as the kernel may clobber it at
any point.)

Emulate the setting of r14 on exit; this isn't really a guest
visible change for valid guest code because r14 isn't reliably
observable anyway. However having the code and the comment helps
to explain why it's ok for the ERESTARTSYS handling not to undo
the changes to r14 that happen on syscall entry.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:49:50 +03:00
Peter Maydell
a9175169cc linux-user: Support for restarting system calls for tilegx targets
Update the tilegx main loop and sigreturn code:
 * on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
 * return -TARGET_QEMU_ESIGRETURN from sigreturn rather than current R_RE
 * handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
   that the main loop should not touch any guest CPU state

Note that this fixes a bug where a sigreturn which happened to have
an errno value in TILEGX_R_RE would incorrectly cause TILEGX_R_ERR
to get set.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:49:50 +03:00
Timothy E Baldwin
6205086558 linux-user: Support for restarting system calls for CRIS targets
Update the CRIS main loop and sigreturn code:
 * on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
 * set all guest CPU state within signal.c code on sigreturn
 * handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
   that the main loop should not touch any guest CPU state

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-34-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
[PMM: tweak commit message; drop TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:49:50 +03:00
Timothy E Baldwin
47405ab642 linux-user: Support for restarting system calls for S390 targets
Update the S390 main loop and sigreturn code:
 * on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
 * set all guest CPU state within signal.c code on sigreturn
 * handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
   that the main loop should not touch any guest CPU state

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-33-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweak commit message; remove stray double semicolon; drop
 TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:49:50 +03:00
Timothy E Baldwin
7ccb84a916 linux-user: Support for restarting system calls for M68K targets
Update the M68K main loop and sigreturn code:
 * on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
 * set all guest CPU state within signal.c code on sigreturn
 * handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
   that the main loop should not touch any guest CPU state

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-32-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweak commit message; drop TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:49:50 +03:00
Timothy E Baldwin
7fe7231a49 linux-user: Support for restarting system calls for OpenRISC targets
Update the OpenRISC main loop code:
 * on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
 * handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
   that the main loop should not touch any guest CPU state

(We don't implement sigreturn on this target so there is no
code there to update.)

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-31-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweak commit message; drop TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:49:50 +03:00
Timothy E Baldwin
256cb6af7f linux-user: Support for restarting system calls for UniCore32 targets
Update the UniCore32 main loop code:
 * on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
 * handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
   that the main loop should not touch any guest CPU state

(We don't support signals on this target so there is no sigreturn code
to update.)

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-30-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweak commit message; drop TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:49:50 +03:00
Timothy E Baldwin
338c858c94 linux-user: Support for restarting system calls for Alpha targets
Update the Alpha main loop and sigreturn code:
 * on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
 * handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
   that the main loop should not touch any guest CPU state

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-13-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweak commit message; drop TARGET_USE_ERESTARTSYS define;
 PC is env->pc, not env->ir[IR_PV]]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:49:50 +03:00
Timothy E Baldwin
ba41249678 linux-user: Support for restarting system calls for SH4 targets
Update the SH4 main loop and sigreturn code:
 * on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
 * set all guest CPU state within signal.c code on sigreturn
 * handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
   that the main loop should not touch any guest CPU state

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-12-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweak commit message; drop TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:49:49 +03:00
Timothy E Baldwin
c0bea68f9e linux-user: Support for restarting system calls for SPARC targets
Update the SPARC main loop and sigreturn code:
 * on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
 * set all guest CPU state within signal.c code on sigreturn
 * handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
   that the main loop should not touch any guest CPU state

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-9-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: Commit message tweaks; drop TARGET_USE_ERESTARTSYS define]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:49:49 +03:00
Timothy E Baldwin
6db9d00e2f linux-user: Support for restarting system calls for PPC targets
Update the PPC main loop code:
 * on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn

(We already handle TARGET_QEMU_ESIGRETURN.)

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-8-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweak commit message; drop TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:49:49 +03:00
Timothy E Baldwin
2eb3ae27ec linux-user: Support for restarting system calls for MIPS targets
Update the MIPS main loop code:
 * on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn

(We already handle TARGET_QEMU_ESIGRETURN.)

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-7-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweak commit message; drop TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:49:49 +03:00
Timothy E Baldwin
f0267ef711 linux-user: Support for restarting system calls for ARM targets
Update the 32-bit and 64-bit ARM main loop and sigreturn code:
 * on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
 * set all guest CPU state within signal.c code on sigreturn
 * handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
   that the main loop should not touch any guest CPU state

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-6-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: tweak commit message; drop TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:49:49 +03:00
Timothy E Baldwin
0284b03ba3 linux-user: Support for restarting system calls for x86 targets
Update the x86 main loop and sigreturn code:
 * on TARGET_ERESTARTSYS, wind guest PC backwards to repeat syscall insn
 * set all guest CPU state within signal.c code rather than passing it
   back out as the "return code" from do_sigreturn()
 * handle TARGET_QEMU_ESIGRETURN in the main loop as the indication
   that the main loop should not touch EAX

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-5-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: Commit message tweaks; drop TARGET_USE_ERESTARTSYS define]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:49:49 +03:00
Timothy E Baldwin
499b5d176a linux-user: Renumber TARGET_QEMU_ESIGRETURN, make it not arch-specific
Currently we define a QEMU-internal errno TARGET_QEMU_ESIGRETURN
only on the MIPS and PPC targets; move this to errno_defs.h
so it is available for all architectures, and renumber it to 513.
We pick 513 because this is safe from future use as a system call return
value: Linux uses it as ERESTART_NOINTR internally and never allows that
errno to escape to userspace.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-4-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: TARGET_ERESTARTSYS split out into preceding patch, add comment]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:49:49 +03:00
Timothy E Baldwin
14896d3281 linux-user: Define TARGET_ERESTART* errno values
Define TARGET_ERESTARTSYS; like the kernel, we will use this to
indicate that a guest system call should be restarted. We use
the same value the kernel does for this, 512.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
[PMM: split out from the patch which moves and renumbers
 TARGET_QEMU_ESIGRETURN, add comment on usage]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:49:49 +03:00
Timothy E Baldwin
da7c8647e5 linux-user: Reindent signal handling
Some of the signal handling was a mess with a mixture of tabs and 8 space
indents.

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-3-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: just rebased]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
2016-05-27 14:49:48 +03:00
Peter Maydell
a3ca7bb259 linux-user: Consistently return host errnos from do_openat()
The function do_openat() is not consistent about whether it is
returning a host errno or a guest errno in case of failure.
Standardise on returning -1 with errno set (ie caller has
to call get_errno()).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reported-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
2016-05-27 14:49:48 +03:00
Timothy E Baldwin
2466119c95 linux-user: Check array bounds in errno conversion
Check array bounds in host_to_target_errno() and target_to_host_errno().

Signed-off-by: Timothy Edward Baldwin <T.E.Baldwin99@members.leeds.ac.uk>
Message-id: 1441497448-32489-2-git-send-email-T.E.Baldwin99@members.leeds.ac.uk
[PMM: Add a lower-bound check, use braces on if(), tweak commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
2016-05-27 14:49:48 +03:00
Peter Maydell
34c99d7b93 Merge remote-tracking branch 'remotes/dgibson/tags/ppc-for-2.7-20160527' into staging
ppc patch queue for 2016-05-27 (first pull for qemu-2.7)

I'm back from holidays now, and have re-collated the ppc patch queue.
This is a first pull request against the qemu-2.7 branch, mostly
consisting of patches which were posted before the 2.6 freeze, but
weren't suitable for late inclusion in the 2.6 branch.

 * Assorted bugfixes and cleanups
 * Some preliminary patches towards dynamic DMA windows and CPU hotplug
 * Significant performance impovement for the spapr-llan device
 * Added myself to MAINTAINERS for ppc (overdue)

# gpg: Signature made Fri 27 May 2016 04:04:15 BST using RSA key ID 20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg:                 aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg:                 aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg:          It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E  87DC 6C38 CACA 20D9 B392

* remotes/dgibson/tags/ppc-for-2.7-20160527:
  MAINTAINERS: Add David Gibson as ppc maintainer
  spapr_iommu: Move table allocation to helpers
  spapr_iommu: Finish renaming vfio_accel to need_vfio
  spapr_pci: Use correct DMA LIOBN when composing the device tree
  spapr: ensure device trees are always associated with DRC
  PPC/KVM: early validation of vcpu id
  Added negative check for get_image_size()
  hw/net/spapr_llan: Provide counter with dropped rx frames to the guest
  hw/net/spapr_llan: Delay flushing of the RX queue while adding new RX buffers
  target-ppc: Cleanups to rldinm, rldnm, rldimi
  target-ppc: Use 32-bit rotate instead of deposit + 64-bit rotate
  target-ppc: Use movcond in isel
  target-ppc: Correct KVM synchronization for ppc_hash64_set_external_hpt()

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-27 10:11:11 +01:00
David Gibson
b4daafbd13 MAINTAINERS: Add David Gibson as ppc maintainer
I've been de facto co-maintainer of all ppc target related code for some
time.  Alex Graf isworking on other things and doesn't have a whole lot of
time for qemu ppc maintainership.  So, update the MAINTAINERS file to
reflect this.

Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexander Graf <agraf@suse.de>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
2016-05-27 12:59:41 +10:00
Alexey Kardashevskiy
fec5d3a1cd spapr_iommu: Move table allocation to helpers
At the moment presence of vfio-pci devices on a bus affect the way
the guest view table is allocated. If there is no vfio-pci on a PHB
and the host kernel supports KVM acceleration of H_PUT_TCE, a table
is allocated in KVM. However, if there is vfio-pci and we do yet not
KVM acceleration for these, the table has to be allocated by
the userspace. At the moment the table is allocated once at boot time
but next patches will reallocate it.

This moves kvmppc_create_spapr_tce/g_malloc0 and their counterparts
to helpers.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-27 09:40:23 +10:00
Alexey Kardashevskiy
f94819d601 spapr_iommu: Finish renaming vfio_accel to need_vfio
6a81dd17 "spapr_iommu: Rename vfio_accel parameter" renamed vfio_accel
flag everywhere but one spot was missed.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-27 09:40:23 +10:00
Alexey Kardashevskiy
eded5bac3b spapr_pci: Use correct DMA LIOBN when composing the device tree
The user could have picked LIOBN via the CLI but the device tree
rendering code would still use the value derived from the PHB index
(which is the default fallback if LIOBN is not set in the CLI).

This replaces SPAPR_PCI_LIOBN() with the actual DMA LIOBN value.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-27 09:40:23 +10:00
Jianjun Duan
5dd5238c0b spapr: ensure device trees are always associated with DRC
There are possible racing situations involving hotplug events and
guest migration. For cases where a hotplug event is migrated, or
the guest is in the process of fetching device tree at the time of
migration, we need to ensure the device tree is created and
associated with the corresponding DRC for devices that were
hotplugged on the source, but 'coldplugged' on the target.

Signed-off-by: Jianjun Duan <duanj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-27 09:40:23 +10:00
Greg Kurz
41264b385c PPC/KVM: early validation of vcpu id
The KVM API restricts vcpu ids to be < KVM_CAP_MAX_VCPUS. On PowerPC
targets, depending on the number of threads per core in the host and
in the guest, some topologies do generate higher vcpu ids actually.
When this happens, QEMU bails out with the following error:

kvm_init_vcpu failed: Invalid argument

The KVM_CREATE_VCPU ioctl has several EINVAL return paths, so it is
not possible to fully disambiguate.

This patch adds a check in the code that computes vcpu ids, so that
we can detect the error earlier, and print a friendlier message instead
of calling KVM_CREATE_VCPU with an obviously bogus vcpu id.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-27 09:40:23 +10:00
Zhou Jie
8afc22a20f Added negative check for get_image_size()
This patch adds check for negative return value from get_image_size(),
where it is missing. It avoids unnecessary two function calls.

Signed-off-by: Zhou Jie <zhoujie2011@cn.fujitsu.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-27 09:40:23 +10:00
Thomas Huth
5c29dd8c28 hw/net/spapr_llan: Provide counter with dropped rx frames to the guest
The last 8 bytes of the receive buffer list page (that has been supplied
by the guest with the H_REGISTER_LOGICAL_LAN call) contain a counter
for frames that have been dropped because there was no suitable receive
buffer available. This patch introduces code to use this field to
provide the information about dropped rx packets to the guest.
There it can be queried with "ethtool -S eth0 | grep rx_no_buffer".

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-27 09:40:23 +10:00
Thomas Huth
8836630f5d hw/net/spapr_llan: Delay flushing of the RX queue while adding new RX buffers
Currently, the spapr-vlan device is trying to flush the RX queue
after each RX buffer that has been added by the guest via the
H_ADD_LOGICAL_LAN_BUFFER hypercall. In case the receive buffer pool
was empty before, we only pass single packets to the guest this
way. This can cause very bad performance if a sender is trying
to stream fragmented UDP packets to the guest. For example when
using the UDP_STREAM test from netperf with UDP packets that are
much bigger than the MTU size, almost all UDP packets are dropped
in the guest since the chances are quite high that at least one of
the fragments got lost on the way.

When flushing the receive queue, it's much better if we'd have
a bunch of receive buffers available already, so that fragmented
packets can be passed to the guest in one go. To do this, the
spapr_vlan_receive() function should return 0 instead of -1 if there
are no more receive buffers available, so that receive_disabled = 1
gets temporarily set for the receive queue, and we have to delay
the queue flushing at the end of h_add_logical_lan_buffer() a little
bit by using a timer, so that the guest gets a chance to add multiple
RX buffers before we flush the queue again.

This improves the UDP_STREAM test with the spapr-vlan device a lot:
Running
 netserver -p 44444 -L <guestip> -f -D -4
in the guest, and
 netperf -p 44444 -L <hostip> -H <guestip> -t UDP_STREAM -l 60 -- -m 16384
in the host, I get the following values _without_ this patch:

Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

229376   16384   60.00     1738970      0    3798.83
229376           60.00          23              0.05

That "0.05" means that almost all UDP packets got lost/discarded
at the receiving side.
With this patch applied, the value look much better:

Socket  Message  Elapsed      Messages
Size    Size     Time         Okay Errors   Throughput
bytes   bytes    secs            #      #   10^6bits/sec

229376   16384   60.00     1789104      0    3908.35
229376           60.00       22818             49.85

Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-27 09:40:22 +10:00
Richard Henderson
a7b2c8b90a target-ppc: Cleanups to rldinm, rldnm, rldimi
Mirror the cleanups just done to rlwinm, rlwnm and rlwimi.
This adds use of deposit to rldimi.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-27 09:40:22 +10:00
Richard Henderson
63ae0915f8 target-ppc: Use 32-bit rotate instead of deposit + 64-bit rotate
A 32-bit rotate insn is more common on hosts than a deposit insn,
and if the host has neither the result is truely horrific.

At the same time, tidy up the temporaries within these functions,
drop the over-use of "likely", drop some checks for identity that
will also be checked by tcg-op.c functions, and special case mask
without rotate within rlwinm.

Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-27 09:40:22 +10:00
Richard Henderson
24f9cd951d target-ppc: Use movcond in isel
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-27 09:40:22 +10:00
David Gibson
319de6fe6e target-ppc: Correct KVM synchronization for ppc_hash64_set_external_hpt()
ppc_hash64_set_external_hpt() was added in e5c0d3c "target-ppc: Add helpers
for updating a CPU's SDR1 and external HPT".  This helper contains a
cpu_synchronize_state() since it may need to push state back to KVM
afterwards.

This turns out to break things when it is used in the reset path, which is
the only current user.  It appears that kvm_vcpu_dirty is not being set
early in the reset path, so the cpu_synchronize_state() is clobbering state
set up by the early part of the cpu reset path with stale state from KVM.

This may require some changes to the generic cpu reset path to fix
properly, but as a short term fix we can just remove the
cpu_synchronize_state() from ppc_hash64_set_external_hpt(), and require any
non-reset path callers to do that manually.

Reported-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
2016-05-27 09:40:22 +10:00
Peter Maydell
84cfc756d1 Merge remote-tracking branch 'remotes/awilliam/tags/vfio-update-20160526.1' into staging
VFIO updates 2016-05-26

 - Infrastructure and quirks to support IGD assignment (Alex Williamson)
 - Fixes to 128bit handling, IOMMU replay, IOMMU translation sanity
   checking (Alexey Kardashevskiy)

# gpg: Signature made Thu 26 May 2016 18:50:29 BST using RSA key ID 3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg:                 aka "Alex Williamson <alex@shazbot.org>"
# gpg:                 aka "Alex Williamson <alwillia@redhat.com>"
# gpg:                 aka "Alex Williamson <alex.l.williamson@gmail.com>"

* remotes/awilliam/tags/vfio-update-20160526.1:
  vfio: Check that IOMMU MR translates to system address space
  memory: Fix IOMMU replay base address
  vfio: Fix 128 bit handling when deleting region
  vfio/pci: Add IGD documentation
  vfio/pci: Add a separate option for IGD OpRegion support
  vfio/pci: Intel graphics legacy mode assignment
  vfio/pci: Setup BAR quirks after capabilities probing
  vfio/pci: Consolidate VGA setup
  vfio/pci: Fix return of vfio_populate_vga()
  vfio: Create device specific region info helper
  vfio: Enable sparse mmap capability

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-26 19:18:08 +01:00
Alexey Kardashevskiy
f1f9365019 vfio: Check that IOMMU MR translates to system address space
At the moment IOMMU MR only translate to the system memory.
However if some new code changes this, we will need clear indication why
it is not working so here is the check.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-05-26 11:12:09 -06:00
Alexey Kardashevskiy
d78c19b5cf memory: Fix IOMMU replay base address
Since a788f227 "memory: Allow replay of IOMMU mapping notifications"
when new VFIO listener is added, all existing IOMMU mappings are
replayed. However there is a problem that the base address of
an IOMMU memory region (IOMMU MR) is ignored which is not a problem
for the existing user (which is pseries) with its default 32bit DMA
window starting at 0 but it is if there is another DMA window.

This stores the IOMMU's offset_within_address_space and adjusts
the IOVA before calling vfio_dma_map/vfio_dma_unmap.

As the IOMMU notifier expects IOVA offset rather than the absolute
address, this also adjusts IOVA in sPAPR H_PUT_TCE handler before
calling notifier(s).

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-05-26 11:12:08 -06:00
Alexey Kardashevskiy
7a057b4fb9 vfio: Fix 128 bit handling when deleting region
7532d3cbf "vfio: Fix 128 bit handling" added support for 64bit IOMMU
memory regions when those are added to VFIO address space; however
removing code cannot cope with these as int128_get64() will fail on
1<<64.

This copies 128bit handling from region_add() to region_del().

Since the only machine type which is actually going to use 64bit IOMMU
is pseries and it never really removes them (instead it will dynamically
add/remove subregions), this should cause no behavioral change.

Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
2016-05-26 11:12:07 -06:00
Alex Williamson
0eb7342417 vfio/pci: Add IGD documentation
Document the usage modes, host primary graphics considerations, usage,
and fw_cfg ABI required for IGD assignment with vfio.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-26 11:12:05 -06:00
Alex Williamson
6ced0bba70 vfio/pci: Add a separate option for IGD OpRegion support
The IGD OpRegion is enabled automatically when running in legacy mode,
but it can sometimes be useful in universal passthrough mode as well.
Without an OpRegion, output spigots don't work, and even though Intel
doesn't officially support physical outputs in UPT mode, it's a
useful feature.  Note that if an OpRegion is enabled but a monitor is
not connected, some graphics features will be disabled in the guest
versus a headless system without an OpRegion, where they would work.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-26 11:12:03 -06:00
Alex Williamson
c4c45e943e vfio/pci: Intel graphics legacy mode assignment
Enable quirks to support SandyBridge and newer IGD devices as primary
VM graphics.  This requires new vfio-pci device specific regions added
in kernel v4.6 to expose the IGD OpRegion, the shadow ROM, and config
space access to the PCI host bridge and LPC/ISA bridge.  VM firmware
support, SeaBIOS only so far, is also required for reserving memory
regions for IGD specific use.  In order to enable this mode, IGD must
be assigned to the VM at PCI bus address 00:02.0, it must have a ROM,
it must be able to enable VGA, it must have or be able to create on
its own an LPC/ISA bridge of the proper type at PCI bus address
00:1f.0 (sorry, not compatible with Q35 yet), and it must have the
above noted vfio-pci kernel features and BIOS.  The intention is that
to enable this mode, a user simply needs to assign 00:02.0 from the
host to 00:02.0 in the VM:

  -device vfio-pci,host=0000:00:02.0,bus=pci.0,addr=02.0

and everything either happens automatically or it doesn't.  In the
case that it doesn't, we leave error reports, but assume the device
will operate in universal passthrough mode (UPT), which doesn't
require any of this, but has a much more narrow window of supported
devices, supported use cases, and supported guest drivers.

When using IGD in this mode, the VM firmware is required to reserve
some VM RAM for the OpRegion (on the order or several 4k pages) and
stolen memory for the GTT (up to 8MB for the latest GPUs).  An
additional option, x-igd-gms allows the user to specify some amount
of additional memory (value is number of 32MB chunks up to 512MB) that
is pre-allocated for graphics use.  TBH, I don't know of anything that
requires this or makes use of this memory, which is why we don't
allocate any by default, but the specification suggests this is not
actually a valid combination, so the option exists as a workaround.
Please report if it's actually necessary in some environment.

See code comments for further discussion about the actual operation
of the quirks necessary to assign these devices.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-26 11:12:01 -06:00
Alex Williamson
581406e0e3 vfio/pci: Setup BAR quirks after capabilities probing
Capability probing modifies wmask, which quirks may be interested in
changing themselves.  Apply our BAR quirks after the capability scan
to make this possible.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-26 11:12:00 -06:00
Alex Williamson
182bca4592 vfio/pci: Consolidate VGA setup
Combine VGA discovery and registration.  Quirks can have dependencies
on BARs, so the quirks push out until after we've scanned the BARs.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-26 11:11:58 -06:00
Alex Williamson
4225f2b670 vfio/pci: Fix return of vfio_populate_vga()
This function returns success if either we setup the VGA region or
the host vfio doesn't return enough regions to support the VGA index.
This latter case doesn't make any sense.  If we're asked to populate
VGA, fail if it doesn't exist and let the caller decide if that's
important.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-26 11:11:56 -06:00
Alex Williamson
e61a424f05 vfio: Create device specific region info helper
Given a device specific region type and sub-type, find it.  Also
cleanup return point on error in vfio_get_region_info() so that we
always return 0 with a valid pointer or -errno and NULL.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-26 11:04:50 -06:00
Alex Williamson
b53b0f696b vfio: Enable sparse mmap capability
The sparse mmap capability in a vfio region info allows vfio to tell
us which sub-areas of a region may be mmap'd.  Thus rather than
assuming a single mmap covers the entire region and later frobbing it
ourselves for things like the PCI MSI-X vector table, we can read that
directly from vfio.

Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-26 09:43:20 -06:00
Peter Maydell
aef11b8d33 Merge remote-tracking branch 'remotes/amit-migration/tags/migration-2.7-2' into staging
migration: add TLS support to the migration data channel

This is a big refactoring of the migration backend code - moving away from
QEMUFile to the new QIOChannel framework introduced here.  This brings a
good level of abstraction and reduction of many lines of code.

This series also adds the ability for many backends (all except RDMA) to
use TLS for encrypting the migration data between the endpoints.

# gpg: Signature made Thu 26 May 2016 07:07:08 BST using RSA key ID 657EF670
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg:                 aka "Amit Shah <amit@kernel.org>"
# gpg:                 aka "Amit Shah <amitshah@gmx.net>"

* remotes/amit-migration/tags/migration-2.7-2: (28 commits)
  migration: remove qemu_get_fd method from QEMUFile
  migration: remove support for non-iovec based write handlers
  migration: add support for encrypting data with TLS
  migration: define 'tls-creds' and 'tls-hostname' migration parameters
  migration: don't use an array for storing migrate parameters
  migration: move definition of struct QEMUFile back into qemu-file.c
  migration: delete QEMUFile stdio implementation
  migration: delete QEMUFile sockets implementation
  migration: delete QEMUSizedBuffer struct
  migration: delete QEMUFile buffer implementation
  migration: convert savevm to use QIOChannel for writing to files
  migration: convert RDMA to use QIOChannel interface
  migration: convert exec socket protocol to use QIOChannel
  migration: convert fd socket protocol to use QIOChannel
  migration: convert tcp socket protocol to use QIOChannel
  migration: rename unix.c to socket.c
  migration: convert unix socket protocol to use QIOChannel
  migration: convert post-copy to use QIOChannelBuffer
  migration: add reporting of errors for outgoing migration
  migration: add helpers for creating QEMUFile from a QIOChannel
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-26 16:09:27 +01:00
Peter Maydell
2c56d06baf Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches

# gpg: Signature made Wed 25 May 2016 18:32:40 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (31 commits)
  blockjob: Remove BlockJob.bs
  commit: Use BlockBackend for I/O
  backup: Use BlockBackend for I/O
  backup: Remove bs parameter from backup_do_cow()
  backup: Pack Notifier within BackupBlockJob
  backup: Don't leak BackupBlockJob in error path
  mirror: Use BlockBackend for I/O
  mirror: Allow target that already has a BlockBackend
  stream: Use BlockBackend for I/O
  block: Make blk_co_preadv/pwritev() public
  block: Convert block job core to BlockBackend
  block: Default to enabled write cache in blk_new()
  block: Cancel jobs first in bdrv_close_all()
  block: keep a list of block jobs
  block: Rename blk_write_zeroes()
  dma-helpers: change BlockBackend to opaque value in DMAIOFunc
  dma-helpers: change interface to byte-based
  block: Propagate .drained_begin/end callbacks
  block: Fix reconfiguring graph with drained nodes
  block: Make bdrv_drain() use bdrv_drained_begin/end()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-26 14:29:30 +01:00
Andreas Färber
a62c89117f qdev: Start disentangling bus from device
Move bus type and related APIs to a separate file bus.c.
This is a first step in breaking up qdev.c into more manageable chunks.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[AF: Rebased onto osdep.h]
Signed-off-by: Andreas Färber <afaerber@suse.de>
[PMM: added bus.o to link line for test-qdev-global-props]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-26 14:06:41 +01:00
Sergey Fedorov
c88c67e58b cpu-exec: Fix direct jump to TB spanning page
It is not safe to make a direct jump to a TB spanning two pages in
system emulation because the mapping for the second page can get changed
but we don't take care of direct jumps in this case.

However in user mode emulation, this is not the case because there's
only static address translation and TBs are always invalidated properly.

Fixes: 5b053a4a28 ("tcg: Clean up direct block chaining safety checks")

Reported-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Tested-by: Max Filippov <jcmvbkbc@gmail.com>
Message-id: 1463404380-29302-1-git-send-email-sergey.fedorov@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-26 13:14:29 +01:00
Peter Maydell
0533d3de60 Merge remote-tracking branch 'remotes/afaerber/tags/maintainers-for-peter' into staging
Andreas stepping down from most maintainer positions

# gpg: Signature made Wed 25 May 2016 16:53:45 BST using RSA key ID 3E7E013F
# gpg: Good signature from "Andreas Färber <afaerber@suse.de>"
# gpg:                 aka "Andreas Färber <afaerber@suse.com>"

* remotes/afaerber/tags/maintainers-for-peter:
  MAINTAINERS: Drop Andreas as CPU maintainer
  MAINTAINERS: Drop Andreas as 0.15 maintainer
  MAINTAINERS: Drop Andreas as PReP maintainer
  MAINTAINERS: Drop Andreas as Cocoa maintainer

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-26 12:41:12 +01:00
Daniel P. Berrange
12992c16d9 migration: remove qemu_get_fd method from QEMUFile
Now that there is a set_blocking callback in QEMUFileOps,
and all users needing non-blocking support have been
converted to QIOChannel, there is no longer any codepath
requiring the qemu_get_fd() method for QEMUFile. Remove it
to avoid further code being introduced with an expectation
of direct file handle access.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-29-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:32:21 +05:30
Daniel P. Berrange
11808bb0c4 migration: remove support for non-iovec based write handlers
All the remaining QEMUFile implementations provide an iovec
based write handler, so the put_buffer callback can be removed
to simplify the code.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-28-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:32:18 +05:30
Daniel P. Berrange
e122636562 migration: add support for encrypting data with TLS
This extends the migration_set_incoming_channel and
migration_set_outgoing_channel methods so that they
will automatically wrap the QIOChannel in a
QIOChannelTLS instance if TLS credentials are configured
in the migration parameters.

This allows TLS to work for tcp, unix, fd and exec
migration protocols. It does not (currently) work for
RDMA since it does not use these APIs, but it is
unlikely that TLS would be desired with RDMA anyway
since it would degrade the performance to that seen
with TCP defeating the purpose of using RDMA.

On the target host, QEMU would be launched with a set
of TLS credentials for a server endpoint

 $ qemu-system-x86_64 -monitor stdio -incoming defer \
    -object tls-creds-x509,dir=/home/berrange/security/qemutls,endpoint=server,id=tls0 \
    ...other args...

To enable incoming TLS migration 2 monitor commands are
then used

  (qemu) migrate_set_str_parameter tls-creds tls0
  (qemu) migrate_incoming tcp:myhostname:9000

On the source host, QEMU is launched in a similar
manner but using client endpoint credentials

 $ qemu-system-x86_64 -monitor stdio \
    -object tls-creds-x509,dir=/home/berrange/security/qemutls,endpoint=client,id=tls0 \
    ...other args...

To enable outgoing TLS migration 2 monitor commands are
then used

  (qemu) migrate_set_str_parameter tls-creds tls0
  (qemu) migrate tcp:otherhostname:9000

Thanks to earlier improvements to error reporting,
TLS errors can be seen 'info migrate' when doing a
detached migration. For example:

  (qemu) info migrate
  capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off x-postcopy-ram: off
  Migration status: failed
  total time: 0 milliseconds
  error description: TLS handshake failed: The TLS connection was non-properly terminated.

Or

  (qemu) info migrate
  capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off x-postcopy-ram: off
  Migration status: failed
  total time: 0 milliseconds
  error description: Certificate does not match the hostname localhost

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-27-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:32:13 +05:30
Daniel P. Berrange
69ef1f36b0 migration: define 'tls-creds' and 'tls-hostname' migration parameters
Define two new migration parameters to be used with TLS encryption.
The 'tls-creds' parameter provides the ID of an instance of the
'tls-creds' object type, or rather a subclass such as 'tls-creds-x509'.
Providing these credentials will enable use of TLS on the migration
data stream.

If using x509 certificates, together with a migration URI that does
not include a hostname, the 'tls-hostname' parameter provides the
hostname to use when verifying the server's x509 certificate. This
allows TLS to be used in combination with fd: and exec: protocols
where a TCP connection is established by a 3rd party outside of
QEMU.

NB, this requires changing the migrate_set_parameter method in the
HMP to accept a 's' (string) value instead of 'i' (integer). This
is backwards compatible, because the parsing of strings allows the
quotes to be optional, thus any integer is also a valid string.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-26-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:32:10 +05:30
Daniel P. Berrange
2594f56d4c migration: don't use an array for storing migrate parameters
The MigrateState struct uses an array for storing migration
parameters. This presumes that all future parameters will
be integers too, which is not going to be the case. There
is no functional reason why an array is used, if anything
it makes the code less clear. The QAPI schema already
defines a struct - MigrationParameters - capable of storing
all the individual parameters, so just use that instead of
an array.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-25-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:32:07 +05:30
Daniel P. Berrange
a24939f279 migration: move definition of struct QEMUFile back into qemu-file.c
Now that the memory buffer based QEMUFile impl is gone, there
is no need for any backend to be accessing internals of the
QEMUFile struct, so it can be moved back into qemu-file.c

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-24-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:32:05 +05:30
Daniel P. Berrange
7fdc61c75d migration: delete QEMUFile stdio implementation
Now that the exec migration backend and savevm have converted
to use the QIOChannel based QEMUFile, there is no user remaining
for the stdio based QEMUFile impl and it can be deleted.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-23-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:32:03 +05:30
Daniel P. Berrange
40946ae40b migration: delete QEMUFile sockets implementation
Now that the tcp, unix and fd migration backends have converted
to use the QIOChannel based QEMUFile, there is no user remaining
for the sockets based QEMUFile impl and it can be deleted.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-22-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:32:00 +05:30
Daniel P. Berrange
2a22b4f370 migration: delete QEMUSizedBuffer struct
Now that we don't have have a buffer based QemuFile
implementation, the QEMUSizedBuffer code is also
unused and can be deleted. A simpler buffer class
also exists in util/buffer.c which other code can
used as needed.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-21-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:58 +05:30
Daniel P. Berrange
8b7c5c0f52 migration: delete QEMUFile buffer implementation
The qemu_bufopen() method is no longer used, so the memory
buffer based QEMUFile backend can be deleted entirely.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-20-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:55 +05:30
Daniel P. Berrange
8925839f00 migration: convert savevm to use QIOChannel for writing to files
Convert the exec savevm code to use QIOChannel and QEMUFileChannel,
instead of the stdio APIs.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-19-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:53 +05:30
Daniel P. Berrange
6ddd2d76ca migration: convert RDMA to use QIOChannel interface
This converts the RDMA code to provide a subclass of QIOChannel
that uses RDMA for the data transport.

This implementation of RDMA does not correctly handle non-blocking
mode. Reads might block if there was not already some pending data
and writes will block until all data is sent. This flawed behaviour
was already present in the existing impl, so appears to not be a
critical problem at this time. It should be on the list of things
to fix in the future though.

The RDMA code would be much better off it it could be split up in
a generic RDMA layer, a QIOChannel impl based on RMDA, and then
the RMDA migration glue. This is left as a future exercise for
the brave.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-18-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:50 +05:30
Daniel P. Berrange
527792fae6 migration: convert exec socket protocol to use QIOChannel
Convert the exec socket migration protocol driver to use
QIOChannel and QEMUFileChannel, instead of the stdio
popen APIs. It can be unconditionally built because the
QIOChannelCommand class can report suitable error messages
on platforms which can't fork processes.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-17-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:47 +05:30
Daniel P. Berrange
64802ee57f migration: convert fd socket protocol to use QIOChannel
Convert the fd socket migration protocol driver to use
QIOChannel and QEMUFileChannel, instead of plain sockets
APIs. It can be unconditionally built because the
QIOChannel APIs it uses will take care to report suitable
error messages if needed.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-16-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:45 +05:30
Daniel P. Berrange
e65c67e4da migration: convert tcp socket protocol to use QIOChannel
Drop the current TCP socket migration driver and extend
the new generic socket driver to cope with the TCP address
format

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-15-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:42 +05:30
Daniel P. Berrange
6f860ae755 migration: rename unix.c to socket.c
The unix.c file will be nearly the same as the tcp.c file,
only differing in the initial SocketAddress creation code.
Rename unix.c to socket.c and refactor it a little to
prepare for merging the TCP code.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-14-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:40 +05:30
Daniel P. Berrange
d984464eb9 migration: convert unix socket protocol to use QIOChannel
Convert the unix socket migration protocol driver to use
QIOChannel and QEMUFileChannel, instead of plain sockets
APIs. It can be unconditionally built, since the socket
impl of QIOChannel will report a suitable error on platforms
where UNIX sockets are unavailable.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-13-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:37 +05:30
Daniel P. Berrange
61b67d473d migration: convert post-copy to use QIOChannelBuffer
The post-copy code does some I/O to/from an intermediate
in-memory buffer rather than direct to the underlying
I/O channel. Switch this code to use QIOChannelBuffer
instead of QEMUSizedBuffer.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-12-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:34 +05:30
Daniel P. Berrange
d59ce6f344 migration: add reporting of errors for outgoing migration
Currently if an application initiates an outgoing migration,
it may or may not, get an error reported back on failure. If
the error occurs synchronously to the 'migrate' command
execution, the client app will see the error message. This
is the case for DNS lookup failures. If the error occurs
asynchronously to the monitor command though, the error
will be thrown away and the client left guessing about
what went wrong. This is the case for failure to connect
to the TCP server (eg due to wrong port, or firewall
rules, or other similar errors).

In the future we'll be adding more scope for errors to
happen asynchronously with the TLS protocol handshake.
TLS errors are hard to diagnose even when they are well
reported, so discarding errors entirely will make it
impossible to debug TLS connection problems.

Management apps which do migration are already using
'query-migrate' / 'info migrate' to check up on progress
of background migration operations and to see their end
status. This is a fine place to also include the error
message when things go wrong.

This patch thus adds an 'error-desc' field to the
MigrationInfo struct, which will be populated when
the 'status' is set to 'failed':

(qemu) migrate -d tcp:localhost:9001
(qemu) info migrate
capabilities: xbzrle: off rdma-pin-all: off auto-converge: off zero-blocks: off compress: off events: off x-postcopy-ram: off
Migration status: failed (Error connecting to socket: Connection refused)
total time: 0 milliseconds

In the HMP, when doing non-detached migration, it is
also possible to display this error message directly
to the app.

(qemu) migrate tcp:localhost:9001
Error connecting to socket: Connection refused

Or with QMP

  {
    "execute": "query-migrate",
    "arguments": {}
  }
  {
    "return": {
      "status": "failed",
      "error-desc": "address resolution failed for myhost:9000: No address associated with hostname"
    }
  }

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1461751518-12128-11-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:30 +05:30
Daniel P. Berrange
48f07489ed migration: add helpers for creating QEMUFile from a QIOChannel
Currently creating a QEMUFile instance from a QIOChannel is
quite simple only requiring a single call to
qemu_fopen_channel_input or  qemu_fopen_channel_output
depending on the end of migration connection.

When QEMU gains TLS support, however, there will need to be
a TLS negotiation done inbetween creation of the QIOChannel
and creation of the final QEMUFile. Introduce some helper
methods that will encapsulate this logic, isolating the
migration protocol drivers from knowledge about TLS.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Acked-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1461751518-12128-10-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:27 +05:30
Daniel P. Berrange
a9cfeb33bb migration: introduce a new QEMUFile impl based on QIOChannel
Introduce a new QEMUFile implementation that is based on
the QIOChannel objects. This impl is different from existing
impls in that there is no file descriptor that can be made
available, as some channels may be based on higher level
protocols such as TLS.

Although the QIOChannel based implementation can trivially
provide a bi-directional stream, initially we have separate
functions for opening input & output directions to fit with
the expectation of the current QEMUFile interface.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1461751518-12128-9-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:24 +05:30
Daniel P. Berrange
9e4d2b98ee migration: force QEMUFile to blocking mode for outgoing migration
Instead of relying on the default QEMUFile I/O blocking flag
state, explicitly turn on blocking I/O for outgoing migration
since it takes place in a background thread.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1461751518-12128-8-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:21 +05:30
Daniel P. Berrange
06ad513532 migration: introduce set_blocking function in QEMUFileOps
Remove the assumption that every QEMUFile implementation has
a file descriptor available by introducing a new function
in QEMUFileOps to change the blocking state of a QEMUFile.

If not set, it will fallback to the original code using
the get_fd method.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1461751518-12128-7-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:19 +05:30
Daniel P. Berrange
0436e09f96 migration: split migration hooks out of QEMUFileOps
The QEMUFileOps struct contains the I/O subsystem callbacks
and the migration stage hooks. Split the hooks out into a
separate QEMUFileHooks struct to make it easier to refactor
the I/O side of QEMUFile without affecting the hooks.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1461751518-12128-6-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:16 +05:30
Daniel P. Berrange
baf51e7739 migration: ensure qemu_fflush() always writes full data amount
The QEMUFile writev_buffer / put_buffer functions are expected
to write out the full set of requested data, blocking until
complete. The qemu_fflush() caller does not expect to deal with
partial writes. Clarify the function comments and add a sanity
check to the code to catch mistaken implementations.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1461751518-12128-5-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:14 +05:30
Daniel P. Berrange
a8ec4437cd migration: remove use of qemu_bufopen from vmstate tests
Some of the test-vmstate.c test cases use a temporary file
while others use a memory buffer. To facilitate the future
removal of the qemu_bufopen() function, convert all the tests
to use a temporary file.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1461751518-12128-4-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:11 +05:30
Daniel P. Berrange
d656ec5ea8 io: avoid double-free when closing QIOChannelBuffer
The QIOChannelBuffer's close implementation will free
the internal data buffer. It failed to reset the pointer
to NULL though, so when the object is later finalized
it will free it a second time with predictable crash.

Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1461751518-12128-3-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:09 +05:30
Daniel P. Berrange
1fd791f007 s390: use FILE instead of QEMUFile for creating text file
The s390 skeys monitor command needs to write out a plain text
file. Currently it is using the QEMUFile class for this, but
work is ongoing to refactor QEMUFile and eliminate much code
related to it. The only feature qemu_fopen() gives over fopen()
is support for QEMU FD passing, but this can be achieved with
qemu_open() + fdopen() too. Switching to regular stdio FILE
APIs avoids the need to sprintf via an intermedia buffer which
slightly simplifies the code.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Message-Id: <1461751518-12128-2-git-send-email-berrange@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-26 11:31:05 +05:30
Kevin Wolf
b75536c9fa blockjob: Remove BlockJob.bs
There is a single remaining user in qemu-img, and another one in a test
case, both of which can be trivially converted to using BlockJob.blk
instead.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
4653456a5f commit: Use BlockBackend for I/O
This changes the commit block job to use the job's BlockBackend for
performing its I/O. job->bs isn't used by the commit code any more
afterwards.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
5c438bc68c backup: Use BlockBackend for I/O
This changes the backup block job to use the job's BlockBackend for
performing its I/O. job->bs isn't used by the backup code any more
afterwards.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
8543c27414 backup: Remove bs parameter from backup_do_cow()
Now that we pass the job to the function, bs is implied by that.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
2016-05-25 19:04:21 +02:00
John Snow
12b3e52e48 backup: Pack Notifier within BackupBlockJob
Instead of relying on peeking at bs->job, we want to explicitly get
a reference to the job that was involved in this notifier callback.

Pack the Notifier inside of the BackupBlockJob so we can use
container_of to get a reference back to the BackupBlockJob object.

This cuts out one more case where we rely unnecessarily on bs->job.

Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
91ab688379 backup: Don't leak BackupBlockJob in error path
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
e253f4b897 mirror: Use BlockBackend for I/O
This changes the mirror block job to use the job's BlockBackend for
performing its I/O. job->bs isn't used by the mirroring code any more
afterwards.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
b880481579 mirror: Allow target that already has a BlockBackend
We had to forbid mirroring to a target BDS that already had a BB
attached because the node swapping at job completion would add a second
BB and we didn't support multiple BBs on a single BDS at the time. Now
we do, so we can lift the restriction.

As we allow additional BlockBackends for the target, we must expect
other users to be sending requests. There may no requests be in flight
during the graph modification, so we have to drain those users now.

The core part of this patch is a revert of commit 40365552.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
03e35d820d stream: Use BlockBackend for I/O
This changes the streaming block job to use the job's BlockBackend for
performing the COR reads. job->bs isn't used by the streaming code any
more afterwards.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
1e98fefd95 block: Make blk_co_preadv/pwritev() public
Also add trace points now that the function can be directly called.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
b6d2e59995 block: Convert block job core to BlockBackend
This adds a new BlockBackend field to the BlockJob struct, which
coexists with the BlockDriverState while converting the individual jobs.

When creating a block job, a new BlockBackend is created on top of the
given BlockDriverState, and it is destroyed when the BlockJob ends. The
reference to the BDS is now held by the BlockBackend instead of calling
bdrv_ref/unref manually.

We have to be careful when we use bdrv_replace_in_backing_chain() in
block jobs because this changes the BDS that job->blk points to. At the
moment block jobs are too tightly coupled with their BDS, so that moving
a job to another BDS isn't easily possible; therefore, we need to just
manually undo this change afterwards.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
0c3169dffa block: Default to enabled write cache in blk_new()
The existing users of the function are:

1. blk_new_open(), which already enabled the write cache
2. Some test cases that don't care about the setting
3. blockdev_init() for empty drives, where the cache mode is overridden
   with the value from the options when a medium is inserted

Therefore, this patch doesn't change the current behaviour. It will be
convenient, however, for additional users of blk_new() (like block
jobs) if the most sensible WCE setting is the default.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
2016-05-25 19:04:21 +02:00
Kevin Wolf
a1a2af0756 block: Cancel jobs first in bdrv_close_all()
So far, bdrv_close_all() first removed all root BlockDriverStates of
BlockBackends and monitor owned BDSes, and then assumed that the
remaining BDSes must be related to jobs and cancelled these jobs.

This order doesn't work that well any more when block jobs use
BlockBackends internally because then they will lose their BDS before
being cancelled.

This patch changes bdrv_close_all() to first cancel all jobs and then
remove all root BDSes from the remaining BBs.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Alberto Garcia
a7112795c1 block: keep a list of block jobs
The current way to obtain the list of existing block jobs is to
iterate over all root nodes and check which ones own a job.

Since we want to be able to support block jobs in other nodes as well,
this patch keeps a list of jobs that is updated every time one is
created or destroyed.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Eric Blake
d004bd52aa block: Rename blk_write_zeroes()
Commit 983a1600 changed the semantics of blk_write_zeroes() to
be byte-based rather than sector-based, but did not change the
name, which is an open invitation for other code to misuse the
function.  Renaming to pwrite_zeroes() makes it more in line
with other byte-based interfaces, and will help make it easier
to track which remaining write_zeroes interfaces still need
conversion.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-25 19:04:21 +02:00
Paolo Bonzini
8a8e63ebdd dma-helpers: change BlockBackend to opaque value in DMAIOFunc
Callers of dma_blk_io have no way to pass extra data to the DMAIOFunc,
because the original callback and opaque are gone by the time DMAIOFunc
is called.  On the other hand, the BlockBackend is usually derived
from those extra data that you could pass to the DMAIOFunc (in the
next patch, that would be the SCSIRequest).

So change DMAIOFunc's prototype, decoupling it from blk_aio_readv
and blk_aio_writev's.  The new prototype loses the BlockBackend
and gains an extra opaque value which, in the case of dma_blk_readv
and dma_blk_writev, is of course used for the BlockBackend.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:11 +02:00
Paolo Bonzini
cbe0ed6247 dma-helpers: change interface to byte-based
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:11 +02:00
Kevin Wolf
20018e12cf block: Propagate .drained_begin/end callbacks
When draining intermediate nodes (i.e. nodes that aren't the root node
for at least one of their parents; with node references, the user can
always configure the graph to create this situation), we need to
propagate the .drained_begin/end callbacks all the way up to the root
for the drain to be effective.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-25 19:04:11 +02:00
Kevin Wolf
36fe13317b block: Fix reconfiguring graph with drained nodes
When changing the BlockDriverState that a BdrvChild points to while the
node is currently drained, we must call the .drained_end() parent
callback. Conversely, when this means attaching a new node that is
already drained, we need to call .drained_begin().

bdrv_root_attach_child() takes now an opaque parameter, which is needed
because the callbacks must also be called if we're attaching a new child
to the BlockBackend when the root node is already drained, and they need
a way to identify the BlockBackend. Previously, child->opaque was set
too late and the callbacks would still see it as NULL.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-25 19:04:10 +02:00
Kevin Wolf
6820643fdb block: Make bdrv_drain() use bdrv_drained_begin/end()
Until now, bdrv_drained_begin() used bdrv_drain() internally to drain
the queue. This is kind of backwards and caused quiescing code to be
duplicated because bdrv_drained_begin() had to ensure that no new
requests come in even after bdrv_drain() returns, whereas bdrv_drain()
had to have them because it could be called from other places.

Instead move the bdrv_drain() code to bdrv_drained_begin() and make
bdrv_drain() a simple wrapper around bdrv_drained_begin/end().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-25 19:04:10 +02:00
Kevin Wolf
e9740bc6d4 block: Introduce bdrv_replace_child()
This adds a common function that is called when attaching a new child to
a parent, removing a child from a parent and when reconfiguring the
graph so that an existing child points to a different node now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-25 19:04:10 +02:00
Max Reitz
109525ad6a block: Drop errp parameter from blk_new()
blk_new() cannot fail so its Error ** parameter has become superfluous.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:10 +02:00
Max Reitz
6b574e09b3 block: Drop bdrv_parent_cb_...() from bdrv_close()
bdrv_close() now asserts that the BDS's refcount is 0, therefore it
cannot have any parents and the bdrv_parent_cb_change_media() call is a
no-op.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:10 +02:00
Max Reitz
30f55fb81f block: Assert !bs->refcnt in bdrv_close()
The only caller of bdrv_close() left is bdrv_delete(). We may as well
assert that, in a way (there are some things in bdrv_close() that make
more sense under that assumption, such as the call to
bdrv_release_all_dirty_bitmaps() which in turn assumes that no frozen
bitmaps are attached to the BDS).

In addition, being called only in bdrv_delete() means that we can drop
bdrv_close()'s forward declaration at the top of block.c.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:10 +02:00
Max Reitz
5b3639371c block: Make bdrv_open() return a BDS
There are no callers to bdrv_open() or bdrv_open_inherit() left that
pass a pointer to a non-NULL BDS pointer as the first argument of these
functions, so we can finally drop that parameter and just make them
return the new BDS.

Generally, the following pattern is applied:

    bs = NULL;
    ret = bdrv_open(&bs, ..., &local_err);
    if (ret < 0) {
        error_propagate(errp, local_err);
        ...
    }

by

    bs = bdrv_open(..., errp);
    if (!bs) {
        ret = -EINVAL;
        ...
    }

Of course, there are only a few instances where the pattern is really
pure.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:10 +02:00
Max Reitz
9bddf75979 block: Drop bdrv_new_root()
It is unused now, so we may just as well drop it.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:10 +02:00
Max Reitz
28eb9b12f7 block: Drop blk_new_with_bs()
Its only caller is blk_new_open(), so we can just inline it there.

The bdrv_new_root() call is dropped in the process because we can just
let bdrv_open() create the BDS.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:10 +02:00
Max Reitz
21a699afc8 tests: Drop BDS from test-throttle.c
Now that throttling has been moved to the BlockBackend level, we do not
need to create a BDS along with the BB in the I/O throttling test.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:10 +02:00
Max Reitz
668361898e block: Let bdrv_open_inherit() return the snapshot
If bdrv_open_inherit() creates a snapshot BDS and *pbs is NULL, that
snapshot BDS should be returned instead of the BDS under it.

This has worked so far because (nearly) all users of BDRV_O_SNAPSHOT use
blk_new_open() to create the BDS tree. bdrv_append() (which is called by
bdrv_append_temp_snapshot()) redirects pointers from parents (i.e. the
BB in this case) to the newly appended child (i.e. the overlay),
therefore, while bdrv_open_inherit() did not return the root BDS, the BB
still pointed to it.

The only instance where BDRV_O_SNAPSHOT is used but blk_new_open() is
not is in blockdev_init() if no BDS tree is created, and instead
blk_new() is used and the flags are stored in the BB root state.
However, qmp_blockdev_change_medium() filters the BDRV_O_SNAPSHOT flag
before invoking bdrv_open(), so it will not have any effect.

In any case, it would be nicer if bdrv_open_inherit() could just always
return the root of the BDS tree that has been created.

To this end, bdrv_append_temp_snapshot() now returns the snapshot BDS
instead of just appending it on top of the snapshotted BDS. Also, it
calls bdrv_ref() before bdrv_append() (which bdrv_open_inherit() has to
undo if not returning the overlay).

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:10 +02:00
Max Reitz
506f8709ce block: Drop useless bdrv_new() call
bdrv_append_temp_snapshot() uses bdrv_new() to create an empty BDS
before invoking bdrv_open() on that BDS. This is probably a relict from
when it used to do some modifications on that empty BDS, but now that is
unnecessary, so we can just set bs_snapshot to NULL and let bdrv_open()
do the rest.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-25 19:04:10 +02:00
Kevin Wolf
88be7b4be4 block: Fix bdrv_next() memory leak
The bdrv_next() users all leaked the BdrvNextIterator after completing
the iteration. Simply changing bdrv_next() to free the iterator before
returning NULL at the end of list doesn't work because some callers exit
the loop before looking at all BDSes.

This patch moves the BdrvNextIterator from the heap to the stack of
the caller and switches to a bdrv_first()/bdrv_next() interface for
initialising the iterator.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-25 19:04:10 +02:00
Andreas Färber
12b0e69cd7 MAINTAINERS: Drop Andreas as CPU maintainer
Signed-off-by: Andreas Färber <afaerber@suse.de>
2016-05-25 17:44:15 +02:00
Andreas Färber
211b76d1db MAINTAINERS: Drop Andreas as 0.15 maintainer
Downgrade to orphan status, like all other remaining stable entries.

Signed-off-by: Andreas Färber <afaerber@suse.de>
2016-05-25 17:44:15 +02:00
Andreas Färber
9f38774da2 MAINTAINERS: Drop Andreas as PReP maintainer
Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2016-05-25 17:44:15 +02:00
Andreas Färber
aa373a1ec8 MAINTAINERS: Drop Andreas as Cocoa maintainer
Peter has taken over Cocoa maintainership.

Signed-off-by: Andreas Färber <andreas.faerber@web.de>
2016-05-25 17:14:56 +02:00
Prasad J Pandit
3af9187fc6 net: mipsnet: check packet length against buffer
When receiving packets over MIPSnet network device, it uses
receive buffer of size 1514 bytes. In case the controller
accepts large(MTU) packets, it could lead to memory corruption.
Add check to avoid it.

Reported by: Oleksandr Bazhaniuk <oleksandr.bazhaniuk@intel.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-05-25 15:46:07 +08:00
Zhou Jie
11196e95f0 net/tap: Allocating Large sized arrays to heap
net_init_tap has a huge stack usage of 8192 bytes approx.
Moving large arrays to heap to reduce stack usage.

Signed-off-by: Zhou Jie <zhoujie2011@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
2016-05-25 15:46:07 +08:00
Peter Maydell
287db79df8 Merge remote-tracking branch 'remotes/ehabkost/tags/x86-pull-request' into staging
X86 queue, 2016-05-23

# gpg: Signature made Mon 23 May 2016 23:48:27 BST using RSA key ID 984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"

* remotes/ehabkost/tags/x86-pull-request:
  target-i386: kvm: Eliminate kvm_msr_entry_set()
  target-i386: kvm: Simplify MSR setting functions
  target-i386: kvm: Simplify MSR array construction
  target-i386: kvm: Increase MSR_BUF_SIZE
  target-i386: kvm: Allocate kvm_msrs struct once per VCPU
  target-i386: Call cpu_exec_init() on realize
  target-i386: Move TCG initialization to realize time
  target-i386: Move TCG initialization check to tcg_x86_init()
  cpu: Eliminate cpudef_init(), cpudef_setup()
  target-i386: Set constant model_id for qemu64/qemu32/athlon
  pc: Set CPU model-id on compat_props for pc <= 2.4
  osdep: Move default qemu_hw_version() value to a macro
  target-i386: kvm: Use X86XSaveArea struct for xsave save/load
  target-i386: Use xsave structs for ext_save_area
  target-i386: Define structs for layout of xsave area

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-24 13:06:33 +01:00
Peter Maydell
99694362ee Merge remote-tracking branch 'remotes/amit-migration/tags/migration-2.7-1' into staging
migration fixes:

- ensure src block devices continue fine after a failed migration
- fail on migration blockers; helps 9p savevm/loadvm
- move autoconverge commands out of experimental state
- move the migration-specific qjson in migration/

# gpg: Signature made Mon 23 May 2016 18:15:09 BST using RSA key ID 657EF670
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg:                 aka "Amit Shah <amit@kernel.org>"
# gpg:                 aka "Amit Shah <amitshah@gmx.net>"

* remotes/amit-migration/tags/migration-2.7-1:
  migration: regain control of images when migration fails to complete
  savevm: fail if migration blockers are present
  migration: Promote improved autoconverge commands out of experimental state
  migration/qjson: Drop gratuitous use of QOM
  migration: Move qjson.[ch] to migration/

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-24 12:21:07 +01:00
Peter Maydell
b0f6ef8915 Merge remote-tracking branch 'remotes/amit-virtio-rng/tags/rng-2.7-1' into staging
rng: rename RndRandom to RndRandom

# gpg: Signature made Mon 23 May 2016 16:44:58 BST using RSA key ID 657EF670
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg:                 aka "Amit Shah <amit@kernel.org>"
# gpg:                 aka "Amit Shah <amitshah@gmx.net>"

* remotes/amit-virtio-rng/tags/rng-2.7-1:
  rng-random: rename RndRandom to RngRandom

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-24 11:38:22 +01:00
Peter Maydell
4c63a818de Merge remote-tracking branch 'remotes/xtensa/tags/20160523-opencores_eth' into staging
opencores_eth cleanups:
- use mii.h
- reduce stack usage in open_eth_start_xmit.

# gpg: Signature made Mon 23 May 2016 20:14:20 BST using RSA key ID F83FA044
# gpg: Good signature from "Max Filippov <max.filippov@cogentembedded.com>"
# gpg:                 aka "Max Filippov <jcmvbkbc@gmail.com>"

* remotes/xtensa/tags/20160523-opencores_eth:
  hw/net/opencores_eth: Allocating Large sized arrays to heap
  hw/net/opencores_eth: use mii.h

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-24 10:19:45 +01:00
Eduardo Habkost
1abc2cae46 target-i386: kvm: Eliminate kvm_msr_entry_set()
Inline the function inside kvm_msr_entry_add().

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23 19:47:37 -03:00
Eduardo Habkost
e25ffda7bd target-i386: kvm: Simplify MSR setting functions
Simplify kvm_put_tscdeadline_msr() and
kvm_put_msr_feature_control() using kvm_msr_buf and the
kvm_msr_entry_add() helper.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23 19:47:37 -03:00
Eduardo Habkost
9c600a8454 target-i386: kvm: Simplify MSR array construction
Add a helper function that appends new entries to the MSR buffer
and checks for the buffer size limit.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23 19:47:37 -03:00
Eduardo Habkost
d1138251bf target-i386: kvm: Increase MSR_BUF_SIZE
We are dangerously close to the array limits in kvm_put_msrs()
and kvm_get_msrs(): with the default mcg_cap configuration, we
can set up to 148 MSRs in kvm_put_msrs(), and if we allow mcg_cap
to be changed, we can write up to 236 MSRs.

Use 4096 bytes for the buffer, that can hold 255 kvm_msr_entry
structs.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23 19:47:37 -03:00
Eduardo Habkost
d71b62a165 target-i386: kvm: Allocate kvm_msrs struct once per VCPU
Instead of using 2400 bytes in the stack for 150 MSR entries in
kvm_get_msrs() and kvm_put_msrs(), allocate a buffer once for
each VCPU.

Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23 19:47:37 -03:00
Eduardo Habkost
42ecabaae1 target-i386: Call cpu_exec_init() on realize
QOM instance_init functions are not supposed to have any side-effects,
as new objects may be created at any moment for querying property
information (see qmp_device_list_properties()).

Calling cpu_exec_init() also affects QEMU's ability to handle errors
during CPU creation, as some actions done by cpu_exec_init() can't be
reverted.

Move cpu_exec_init() call to realize so a simple object_new() won't
trigger it, and so that it is called after some basic validation of CPU
parameters.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23 19:47:37 -03:00
Eduardo Habkost
57f2453ab4 target-i386: Move TCG initialization to realize time
QOM instance_init functions are not supposed to have any side-effects,
as new objects may be created at any moment for querying property
information (see qmp_device_list_properties()).

Move TCG initialization to realize time so it won't be called when just
doing object_new() on a X86CPU subclass.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23 19:47:37 -03:00
Eduardo Habkost
4fe15cdedf target-i386: Move TCG initialization check to tcg_x86_init()
Instead of requiring cpu.c to check if TCG was already initialized,
simply let the function be called multiple times.

Suggested-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23 19:47:37 -03:00
Eduardo Habkost
3e2c0e062f cpu: Eliminate cpudef_init(), cpudef_setup()
x86_cpudef_init() doesn't do anything anymore, cpudef_init(),
cpudef_setup(), and x86_cpudef_init() can be finally removed.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23 19:47:37 -03:00
Eduardo Habkost
9cf2cc3d82 target-i386: Set constant model_id for qemu64/qemu32/athlon
Newer PC machines don't set hw_version, and older machines set
model-id on compat_props explicitly, so we don't need the
x86_cpudef_setup() code that sets model_id using
qemu_hw_version() anymore.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23 19:47:32 -03:00
Zhou Jie
ea4d824168 hw/net/opencores_eth: Allocating Large sized arrays to heap
open_eth_start_xmit has a huge stack usage of 65536 bytes approx.
Moving large arrays to heap to reduce stack usage.

Reduce size of a buffer allocated on stack to 0x600 bytes, which is the
maximal frame length when HUGEN bit is not set in MODER, only allocate
buffer on heap when that is too small. Thus heap is not used in typical
use case.

Signed-off-by: Zhou Jie <zhoujie2011@cn.fujitsu.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-05-23 22:10:16 +03:00
Max Filippov
aa8e0ab975 hw/net/opencores_eth: use mii.h
Drop local definitions of MII registers and use constants from mii.h for
registers and register bits. No functional changes.

Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
2016-05-23 22:10:16 +03:00
Greg Kurz
fe904ea824 migration: regain control of images when migration fails to complete
We currently have an error path during migration that can cause
the source QEMU to abort:

migration_thread()
  migration_completion()
    runstate_is_running() ----------------> true if guest is running
    bdrv_inactivate_all() ----------------> inactivate images
    qemu_savevm_state_complete_precopy()
     ... qemu_fflush()
           socket_writev_buffer() --------> error because destination fails
         qemu_fflush() -------------------> set error on migration stream
  migration_completion() -----------------> set migrate state to FAILED
migration_thread() -----------------------> break migration loop
  vm_start() -----------------------------> restart guest with inactive
                                            images

and you get:

qemu-system-ppc64: socket_writev_buffer: Got err=104 for (32768/18446744073709551615)
qemu-system-ppc64: /home/greg/Work/qemu/qemu-master/block/io.c:1342:bdrv_co_do_pwritev: Assertion `!(bs->open_flags & 0x0800)' failed.
Aborted (core dumped)

If we try postcopy with a similar scenario, we also get the writev error
message but QEMU leaves the guest paused because entered_postcopy is true.

We could possibly do the same with precopy and leave the guest paused.
But since the historical default for migration errors is to restart the
source, this patch adds a call to bdrv_invalidate_cache_all() instead.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Message-Id: <146357896785.6003.11983081732454362715.stgit@bahia.huguette.org>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-23 22:19:36 +05:30
Eduardo Habkost
cd6c1b7057 pc: Set CPU model-id on compat_props for pc <= 2.4
Instead of relying on x86_cpudef_setup() calling
qemu_hw_version(), just make old machines set model-id explicitly
on compat_props for qemu64, qemu32, and athlon. This will allow
us to eliminate x86_cpudef_setup() later.

Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23 13:19:36 -03:00
Eduardo Habkost
d494352c2f osdep: Move default qemu_hw_version() value to a macro
The macro will be used by code that will stop calling
qemu_hw_version() at runtime and just need a constant value.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23 13:19:36 -03:00
Eduardo Habkost
86cd2ea071 target-i386: kvm: Use X86XSaveArea struct for xsave save/load
Instead of using offset macros and bit operations in a uint32_t
array, use the X86XSaveArea struct to perform the loading/saving
operations in kvm_put_xsave() and kvm_get_xsave().

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23 13:19:36 -03:00
Eduardo Habkost
ee1b09f695 target-i386: Use xsave structs for ext_save_area
This doesn't introduce any change in the code, as the offsets and
struct sizes match what was present in the table. This can be
validated by the QEMU_BUILD_BUG_ON lines on target-i386/cpu.h,
which ensures the struct sizes and offsets match the existing
values in ext_save_area.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23 13:19:36 -03:00
Eduardo Habkost
b503717d28 target-i386: Define structs for layout of xsave area
Add structs that define the layout of the xsave areas used by
Intel processors. Add some QEMU_BUILD_BUG_ON lines to ensure the
structs match the XSAVE_* macros in target-i386/kvm.c and the
offsets and sizes at target-i386/cpu.c:ext_save_areas.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-23 13:19:36 -03:00
Greg Kurz
24f3902b08 savevm: fail if migration blockers are present
QEMU has currently two ways to prevent migration to occur:
- migration blocker when it depends on runtime state
- VMStateDescription.unmigratable when migration is not supported at all

This patch gathers all the logic into a single function to be called from
both the savevm and the migrate paths.

This fixes a bug with 9p, at least, where savevm would succeed and the
following would happen in the guest after loadvm:

$ ls /host
ls: cannot access /host: Protocol error

With this patch:

(qemu) savevm foo
Migration is disabled when VirtFS export path '/' is mounted in the guest
using mount_tag 'host'

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <146239057139.11271.9011797645454781543.stgit@bahia.huguette.org>

[Update subject according to Paolo's suggestion - Amit]

Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-23 21:44:08 +05:30
Peter Maydell
c915854761 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
* NMI cleanups (Bandan)
* RAMBlock/Memory cleanups and fixes (Dominik, Gonglei, Fam, me)
* first part of linuxboot support for fw_cfg DMA (Richard)
* IOAPIC fix (Peter Xu)
* iSCSI SG_IO fix (Vadim)
* Various infrastructure bug fixes (Zhijian, Peter M., Stefan)
* CVE fixes (Prasad)

# gpg: Signature made Mon 23 May 2016 16:06:18 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream: (24 commits)
  cpus: call the core nmi injection function
  nmi: remove x86 specific nmi handling
  target-i386: add a generic x86 nmi handler
  coccinelle: add g_assert_cmp* to macro file
  iscsi: pass SCSI status back for SG_IO
  esp: check dma length before reading scsi command(CVE-2016-4441)
  esp: check command buffer length before write(CVE-2016-4439)
  scripts/signrom.py: Check for magic in option ROMs.
  scripts/signrom.py: Allow option ROM checksum script to write the size header.
  Remove config-devices.mak on 'make clean'
  cpus.c: Use pthread_sigmask() rather than sigprocmask()
  memory: remove unnecessary masking of MemoryRegion ram_addr
  memory: Drop FlatRange.romd_mode
  memory: Remove code for mr->may_overlap
  exec: adjust rcu_read_lock requirement
  memory: drop find_ram_block()
  vl: change runstate only if new state is different from current state
  ioapic: clear remote irr bit for edge-triggered interrupts
  ioapic: keep RO bits for IOAPIC entry
  target-i386: key sfence availability on CPUID_SSE, not CPUID_SSE2
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-23 16:15:52 +01:00
Bandan Das
1453e6627d cpus: call the core nmi injection function
We can call the common function here directly since
x86 specific actions will be taken care of by the arch
specific nmi handler

Signed-off-by: Bandan Das <bsd@redhat.com>
Message-Id: <1463761717-26558-4-git-send-email-bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:47 +02:00
Bandan Das
f7e981f295 nmi: remove x86 specific nmi handling
nmi_monitor_handle is wired to call the x86 nmi
handler. So, we can directly use it at call sites.

Signed-off-by: Bandan Das <bsd@redhat.com>
Message-Id: <1463761717-26558-3-git-send-email-bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:46 +02:00
Bandan Das
1255166b99 target-i386: add a generic x86 nmi handler
Instead of having x86 ifdefs in core nmi code, this
change adds a arch specific handler that the nmi common
code can call.

Signed-off-by: Bandan Das <bsd@redhat.com>
Message-Id: <1463761717-26558-2-git-send-email-bsd@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:46 +02:00
Paolo Bonzini
6ad978e9f4 coccinelle: add g_assert_cmp* to macro file
This helps applying semantic patches to unit tests.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:46 +02:00
Vadim Rozenfeld
644c6869d3 iscsi: pass SCSI status back for SG_IO
Signed-off-by: Vadim Rozenfeld <vrozenfe@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:46 +02:00
Prasad J Pandit
6c1fef6b59 esp: check dma length before reading scsi command(CVE-2016-4441)
The 53C9X Fast SCSI Controller(FSC) comes with an internal 16-byte
FIFO buffer. It is used to handle command and data transfer.
Routine get_cmd() uses DMA to read scsi commands into this buffer.
Add check to validate DMA length against buffer size to avoid any
overrun.

Fixes CVE-2016-4441.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Cc: qemu-stable@nongnu.org
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1463654371-11169-3-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:46 +02:00
Prasad J Pandit
c98c6c105f esp: check command buffer length before write(CVE-2016-4439)
The 53C9X Fast SCSI Controller(FSC) comes with an internal 16-byte
FIFO buffer. It is used to handle command and data transfer. While
writing to this command buffer 's->cmdbuf[TI_BUFSZ=16]', a check
was missing to validate input length. Add check to avoid OOB write
access.

Fixes CVE-2016-4439.

Reported-by: Li Qiang <liqiang6-s@360.cn>
Cc: qemu-stable@nongnu.org
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1463654371-11169-2-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:45 +02:00
Richard W.M. Jones
fd28938b7a scripts/signrom.py: Check for magic in option ROMs.
Because of the risk that compilers might not emit the asm() block at
the beginning of the option ROM, check that the ROM contains the
required magic signature.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Message-Id: <1463000807-18015-3-git-send-email-rjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:45 +02:00
Richard W.M. Jones
6f71b779c8 scripts/signrom.py: Allow option ROM checksum script to write the size header.
Modify the signrom.py script so that if the size byte in the header is
0 (ie. not set) then the script will set the size.  If the size byte
is non-zero then we do the same as before, so this doesn't require
changes to any existing ROM sourcecode.

Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Message-Id: <1463000807-18015-2-git-send-email-rjones@redhat.com>
2016-05-23 16:53:45 +02:00
Peter Maydell
168340b6ba Remove config-devices.mak on 'make clean'
Our dependency mechanism works like this:
 * on first build there is neither a .o nor a .d
 * we create the .d as a side effect of creating the .o
 * for rebuilds we know when we need to update the .o,
   which also updates the .d

This system requires that you're never in a situation where there is
a .o file but no .d (because then we will never realise we need to
build the .d, and we will not have the dependency information about
when to rebuild the .o).

This is working fine for our object files, but we also try to use it
for $TARGET/config-devices.mak (where the dependency file is
in $TARGET-config-devices.mak.d). Unfortunately "make clean" doesn't
remove config-devices.mak, which means that it puts us in the
forbidden situation of "object file exists but not its .d file".
This in turn means that we will fail to notice when we need to rebuild:
  mkdir build/depbug
  (cd build/depbug && '../../configure')
  make -C build/depbug -j8
  make -C build/depbug clean
  echo "CONFIG_CANARY = y" >> default-configs/arm-softmmu.mak
  make -C build/depbug
  grep CANARY build/depbug/aarch64-softmmu/config-devices.mak

The CANARY token should show up in config-devices.mak but does not.

Fix this bug by making "make clean" delete the config-devices.mak files.
config-all-devices.mak doesn't have the same problem since it has
no .d file, but delete it too, since it is created by "make" and
logically should be removed by "make clean".

(Note that it is important not to remove config-devices.mak until
after we have recursively run 'make clean' in the subdirectories.)

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1463484451-22979-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:45 +02:00
Peter Maydell
a2d1761da1 cpus.c: Use pthread_sigmask() rather than sigprocmask()
On Linux, sigprocmask() and pthread_sigmask() are in practice the
same thing (they only set the signal mask for the calling thread),
but the documentation states that the behaviour of sigprocmask() in a
multithreaded process is undefined. Use pthread_sigmask() instead
(which is what we do in almost all places in QEMU that alter the
signal mask already).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1463420039-29761-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:45 +02:00
Paolo Bonzini
e4e697940d memory: remove unnecessary masking of MemoryRegion ram_addr
mr->ram_block->offset is already aligned to both host and target size
(see qemu_ram_alloc_internal).  Remove further masking as it is
unnecessary.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:45 +02:00
Fam Zheng
5b5660adf1 memory: Drop FlatRange.romd_mode
Its value is alway set to mr->romd_mode, so the removed comparisons are
fully superseded by "a->mr == b->mr".

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1458900629-2334-3-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:44 +02:00
Fam Zheng
b613597819 memory: Remove code for mr->may_overlap
The collision check does nothing and hasn't been used. Remove the
variable together with related code.

Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <1458900629-2334-2-git-send-email-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:44 +02:00
Gonglei
ab0a995608 exec: adjust rcu_read_lock requirement
qemu_ram_unset_idstr() doesn't need rcu lock anymore,
meanwhile make the range of rcu lock in
qemu_ram_set_idstr() as small as possible.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <1462845901-89716-3-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:44 +02:00
Gonglei
fa53a0e53e memory: drop find_ram_block()
On the one hand, we have already qemu_get_ram_block() whose function
is similar. On the other hand, we can directly use mr->ram_block but
searching RAMblock by ram_addr which is a kind of waste.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1462845901-89716-2-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:44 +02:00
Li Zhijian
e92a2d9cb3 vl: change runstate only if new state is different from current state
Previously, qemu will abort at following scenario:
(qemu) stop
(qemu) system_reset
(qemu) system_reset
(qemu) 2016-04-13T20:54:38.979158Z qemu-system-x86_64: invalid runstate transition: 'prelaunch' -> 'prelaunch'

Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1460604352-18630-1-git-send-email-lizhijian@cn.fujitsu.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:44 +02:00
Peter Xu
ed1263c363 ioapic: clear remote irr bit for edge-triggered interrupts
This is to better emulate IOAPIC version 0x1X hardware. Linux kernel
leveraged this "feature" to do explicit EOI since EOI register is still
not introduced at that time. This will also fix the issue that level
triggered interrupts failed to work when IR enabled (tested with Linux
kernel version 4.5).

Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1462875682-1349-3-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:43 +02:00
Peter Xu
479c2a1cb7 ioapic: keep RO bits for IOAPIC entry
Currently IOAPIC RO bits can be written. To be better aligned with
hardware, we should let them read-only.

Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1462875682-1349-2-git-send-email-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:43 +02:00
Paolo Bonzini
14cb949a3e target-i386: key sfence availability on CPUID_SSE, not CPUID_SSE2
sfence was introduced before lfence and mfence.  This fixes Linux
2.4's measurement of checksumming speeds for the pIII_sse
algorithm:

md: linear personality registered as nr 1
md: raid0 personality registered as nr 2
md: raid1 personality registered as nr 3
md: raid5 personality registered as nr 4
raid5: measuring checksumming speed
   8regs     :   384.400 MB/sec
   32regs    :   259.200 MB/sec
invalid operand: 0000
CPU:    0
EIP:    0010:[<c0240b2a>]    Not tainted
EFLAGS: 00000246
eax: c15d8000   ebx: 00000000   ecx: 00000000   edx: c15d5000
esi: 8005003b   edi: 00000004   ebp: 00000000   esp: c15bdf50
ds: 0018   es: 0018   ss: 0018
Process swapper (pid: 1, stackpage=c15bd000)
Stack: 00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000
       00000000 00000000 00000000 00000000 00000000 00000000 00000000
00000000
       00000000 00000206 c0241c6c 00001000 c15d4000 c15d7000 c15d4000
c15d4000
Call Trace:    [<c0241c6c>] [<c0105000>] [<c0241db4>] [<c010503b>]
[<c0105000>]
  [<c0107416>] [<c0105030>]

Code: 0f ae f8 0f 10 04 24 0f 10 4c 24 10 0f 10 54 24 20 0f 10 5c
 <0>Kernel panic: Attempted to kill init!

Reported-by: Stefan Weil <sw@weilnetz.de>
Fixes: 121f315788
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:43 +02:00
Stefan Weil
5919e0328b configure: Allow builds with extra warnings
The clang compiler supports a useful compiler option -Weverything,
and GCC also has other warnings not enabled by -Wall.

If glib header files trigger a warning, however, testing glib with
-Werror will always fail. A size mismatch is also detected without
-Werror, so simply remove it.

Cc: qemu-stable@nongnu.org
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <1461879221-13338-1-git-send-email-sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:43 +02:00
Prasad J Pandit
691a02e2ce i386: kvmvapic: initialise imm32 variable
When processing Task Priorty Register(TPR) access, it could leak
automatic stack variable 'imm32' in patch_instruction().
Initialise the variable to avoid it.

Reported by: Donghai Zdh <donghai.zdh@alibaba-inc.com>
Cc: qemu-stable@nongnu.org
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1460013608-16670-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:43 +02:00
Pranith Kumar
dfc007f7f7 docs/atomics.txt: Update pointer to linux macro
Add a missing end brace and update doc to point to the latest access
macro. ACCESS_ONCE() is deprecated.

Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <1462198852-28694-1-git-send-email-bobby.prani@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:43 +02:00
Dominik Dingel
d2f39add72 exec.c: Ensure right alignment also for file backed ram
While in the anonymous ram case we already take care of the right alignment
such an alignment gurantee does not exist for file backed ram allocation.

Instead, pagesize is used for alignment. On s390 this is not enough for gmap,
as we need to satisfy an alignment up to segments.

Reported-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Dominik Dingel <dingel@linux.vnet.ibm.com>

Message-Id: <1461585338-45863-1-git-send-email-dingel@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-23 16:53:42 +02:00
Peter Maydell
2b5f477789 Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20160523-1' into staging
usb: add xen pvUSB backend, add num-ports check to ohci.

# gpg: Signature made Mon 23 May 2016 14:02:25 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-usb-20160523-1:
  usb/ohci: Fix crash with when specifying too many num-ports
  xen: add pvUSB backend
  xen: write information about supported backends
  xen: introduce dummy system device

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-23 15:53:02 +01:00
Peter Maydell
38629bf5e4 Merge remote-tracking branch 'remotes/kraxel/tags/pull-vga-20160523-1' into staging
vga: fix CVE-2016-3712 regression, misc virtio-gpu fixes.

# gpg: Signature made Mon 23 May 2016 13:30:26 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-vga-20160523-1:
  vga: add sr_vbe register set
  virtio-gpu: fix ui idx check
  virtio-gpu: use VIRTIO_GPU_MAX_SCANOUTS
  virtio-gpu: check max_outputs only
  virtio-gpu: check max_outputs value
  virtio-vga: propagate on gpu realized error
  virtio-gpu: check early scanout id

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-23 14:50:40 +01:00
Thomas Huth
d400fc018b usb/ohci: Fix crash with when specifying too many num-ports
QEMU currently crashes when an OHCI controller is instantiated with
too many ports, e.g. "-device pci-ohci,num-ports=100,masterbus=1".
Thus add a proper check in usb_ohci_init() to make sure that we
do not use more than OHCI_MAX_PORTS = 15 ports here.

Ticket: https://bugs.launchpad.net/qemu/+bug/1581308
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1463995387-11710-1-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-23 14:59:40 +02:00
Gerd Hoffmann
94ef4f337f vga: add sr_vbe register set
Commit "fd3c136 vga: make sure vga register setup for vbe stays intact
(CVE-2016-3712)." causes a regression.  The win7 installer is unhappy
because it can't freely modify vga registers any more while in vbe mode.

This patch introduces a new sr_vbe register set.  The vbe_update_vgaregs
will fill sr_vbe[] instead of sr[].  Normal vga register reads and
writes go to sr[].  Any sr register read access happens through a new
sr() helper function which will read from sr_vbe[] with vbe active and
from sr[] otherwise.

This way we can allow guests update sr[] registers as they want, without
allowing them disrupt vbe video modes that way.

Cc: qemu-stable@nongnu.org
Reported-by: Thomas Lamprecht <thomas@lamprecht.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1463475294-14119-1-git-send-email-kraxel@redhat.com
2016-05-23 14:28:25 +02:00
Juergen Gross
816ac92ef7 xen: add pvUSB backend
Add a backend for para-virtualized USB devices for xen domains.

The backend is using host-libusb to forward USB requests from a
domain via libusb to the real device(s) passed through.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Message-id: 1463062421-613-4-git-send-email-jgross@suse.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-23 13:30:03 +02:00
Marc-André Lureau
6b860806c0 virtio-gpu: fix ui idx check
Fix off-by-one value check (0 is the first scanout).

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1463653560-26958-7-git-send-email-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-23 13:30:03 +02:00
Juergen Gross
637c53ffcb xen: write information about supported backends
Add a Xenstore directory for each supported pv backend. This will allow
Xen tools to decide which backend type to use in case there are
multiple possibilities.

The information is added under
/local/domain/<backend-domid>/device-model/<domid>/backends
before the "running" state is written to Xenstore. Using a directory
for each backend enables us to add parameters for specific backends
in the future.

This interface is documented in the Xen source repository in the file
docs/misc/qemu-backends.txt

In order to reuse the Xenstore directory creation already present in
hw/xen/xen_devconfig.c move the related functions to
hw/xen/xen_backend.c where they fit better.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Message-id: 1463062421-613-3-git-send-email-jgross@suse.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-23 13:30:03 +02:00
Marc-André Lureau
acfc484650 virtio-gpu: use VIRTIO_GPU_MAX_SCANOUTS
The value is defined in virtio_gpu.h already (changing from 4 to 16).

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1463653560-26958-6-git-send-email-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-23 13:30:03 +02:00
Juergen Gross
9432e53a5b xen: introduce dummy system device
Introduce a new dummy system device serving as parent for virtual
buses. This will enable new pv backends to introduce virtual buses
which are removable again opposed to system buses which are meant
to stay once added.

Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Wei Liu <wei.liu2@citrix.com>
Message-id: 1463062421-613-2-git-send-email-jgross@suse.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-23 13:30:03 +02:00
Marc-André Lureau
2fe760554e virtio-gpu: check max_outputs only
The scanout id should not be above the configured num_scanouts.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1463653560-26958-5-git-send-email-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-23 13:30:03 +02:00
Marc-André Lureau
5e3d741c6a virtio-gpu: check max_outputs value
The value must be less than VIRTIO_GPU_MAX_SCANOUT.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1463653560-26958-4-git-send-email-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-23 13:30:03 +02:00
Marc-André Lureau
d0f0c8654a virtio-vga: propagate on gpu realized error
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1463653560-26958-3-git-send-email-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-23 13:30:03 +02:00
Marc-André Lureau
fe89fdebca virtio-gpu: check early scanout id
Before accessing the g->scanout array, in order to avoid potential
out-of-bounds access.

Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1463653560-26958-2-git-send-email-marcandre.lureau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-23 13:30:03 +02:00
Jason J. Herne
d85a31d1f4 migration: Promote improved autoconverge commands out of experimental state
The new autoconverge throttling commands have been tested for a release now. It
is time to move them out of the experimental state.

Signed-off-by: Jason J. Herne <jjherne@linux.vnet.ibm.com>
Message-Id: <1461262038-8197-1-git-send-email-jjherne@linux.vnet.ibm.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-23 16:05:09 +05:30
Peter Maydell
e081c24d30 Merge remote-tracking branch 'remotes/ehabkost/tags/machine-pull-request' into staging
Machine Core queue, 2016-05-20

# gpg: Signature made Fri 20 May 2016 21:26:49 BST using RSA key ID 984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"

* remotes/ehabkost/tags/machine-pull-request: (21 commits)
  Use &error_fatal when initializing crypto on qemu-{img,io,nbd}
  vl: Use &error_fatal when parsing monitor options
  vl: Use &error_fatal when parsing VNC options
  machine: add properties to compat_props incrementaly
  vl: Simplify global property registration
  vl: Make display_remote a local variable
  vl: Move DisplayType typedef to vl.c
  vl: Make display_type a local variable
  vl: Replace DT_NOGRAPHIC with machine option
  milkymist: Move DT_NOGRAPHIC check outside milkymist_tmu2_create()
  spice: Initialization stubs on qemu-spice.h
  gtk: Initialization stubs
  cocoa: cocoa_display_init() stub
  sdl: Initialization stubs
  curses: curses_display_init() stub
  vnc: Initialization stubs
  vl: Add DT_COCOA DisplayType value
  vl: Replace *_vga_available() functions with class_names field
  vl: Table-based select_vgahw()
  vl: Use exit(1) when requested VGA interface is unavailable
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-23 10:30:41 +01:00
Markus Armbruster
b72fe9e690 migration/qjson: Drop gratuitous use of QOM
All the use of QOM buys us here is the ability to destroy the thing
with object_unref(OBJECT(vmdesc)).  Not worth the notational overhead.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1462380558-2030-3-git-send-email-armbru@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-23 14:16:12 +05:30
Markus Armbruster
17b74b9867 migration: Move qjson.[ch] to migration/
Type QJSON lets you build JSON text.  Its interface mirrors (a subset
of) abstract JSON syntax.

QAPI output visitors also produce JSON text.  They assert their
preconditions and invariants, and therefore abort on incorrect use.

Contrastingly, QJSON does *not* detect incorrect use.  It happily
produces invalid JSON then.  This is what migration wants.

QJSON was designed for migration, and migration is its only user.
Move it to migration/ for proper coverage by MAINTAINERS, and to deter
accidental use outside migration.

[Pointed out by Eric: QJSON was added in commits 0457d07..b174257
 -- Amit]

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <1462380558-2030-2-git-send-email-armbru@redhat.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-23 14:16:09 +05:30
Wei Jiangang
cde6361534 rng-random: rename RndRandom to RngRandom
Usually, Random Number Generator is abbreviated to RNG/rng.
so replacing RndRandom with RngRandom seems more reasonable
and keep consistent with RngBackend.

Signed-off-by: Wei Jiangang <weijg.fnst@cn.fujitsu.com>
Reviewed-by: Pankaj Gupta <pagupta@redhat.com>
Message-Id: <1460684168-5403-1-git-send-email-weijg.fnst@cn.fujitsu.com>
Signed-off-by: Amit Shah <amit.shah@redhat.com>
2016-05-23 12:18:43 +05:30
Eduardo Habkost
e8f2d2722e Use &error_fatal when initializing crypto on qemu-{img,io,nbd}
In addition to making the code simpler, this will replace the
long error messages:
  cannot initialize crypto: Unable to initialize GNUTLS library: [...]
  cannot initialize crypto: Unable to initialize gcrypt
with shorter messages:
  Unable to initialize GNUTLS library: [...]
  Unable to initialize gcrypt

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:55 -03:00
Eduardo Habkost
822ac12df0 vl: Use &error_fatal when parsing monitor options
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:55 -03:00
Eduardo Habkost
7b1ee0f2b7 vl: Use &error_fatal when parsing VNC options
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:55 -03:00
Igor Mammedov
bacc344c54 machine: add properties to compat_props incrementaly
Switch to adding compat properties incrementaly instead of
completly overwriting compat_props per machine type.
That removes data duplication which we have due to nested
[PC|SPAPR]_COMPAT_* macros.

It also allows to set default device properties from
default foo_machine_options() hook, which will be used
in following patch for putting VMGENID device as
a function if ISA bridge on pc/q35 machines.

Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
[ehabkost: Fixed CCW_COMPAT_* and PC_COMPAT_0_* defines]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:54 -03:00
Eduardo Habkost
16714b1680 vl: Simplify global property registration
There's no need to use qdev_prop_register_global_list() and an
array, if we are registering a single GlobalProperty struct. Use
qdev_prop_register_global() instead.

Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:54 -03:00
Eduardo Habkost
1f0dfe02d4 vl: Make display_remote a local variable
The variable is used only inside main(), so it can be local.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:54 -03:00
Eduardo Habkost
0cb48c4678 vl: Move DisplayType typedef to vl.c
Now the type is only used inside vl.c and doesn't need to be in a
header file.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:54 -03:00
Eduardo Habkost
d29345d011 vl: Make display_type a local variable
Now display_type is only used inside main(), and don't need to be a
global variable.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:54 -03:00
Eduardo Habkost
cfc58cf373 vl: Replace DT_NOGRAPHIC with machine option
All DisplayType values are just UI options that don't affect any
hardware emulation code, except for DT_NOGRAPHIC. Replace
DT_NOGRAPHIC with DT_NONE plus a new "-machine graphics=on|off"
option, so hardware emulation code don't need to use the
display_type variable.

Cc: Michael Walle <michael@walle.cc>
Cc: Blue Swirl <blauwirbel@gmail.com>
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:54 -03:00
Eduardo Habkost
cf3dc71eb5 milkymist: Move DT_NOGRAPHIC check outside milkymist_tmu2_create()
DT_NOGRAPHIC handling will be moved to a MachineState field, and
it will be easier to change milkymist_init() to check that field.

Cc: Michael Walle <michael@walle.cc>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:54 -03:00
Eduardo Habkost
6f0c894c25 spice: Initialization stubs on qemu-spice.h
This reduces the number of CONFIG_SPICE #ifdefs in vl.c.

Cc: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:53 -03:00
Eduardo Habkost
19a2c6269f gtk: Initialization stubs
This reduces the number of CONFIG_GTK #ifdefs in vl.c.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:53 -03:00
Eduardo Habkost
e35ee7c1aa cocoa: cocoa_display_init() stub
One less #ifdef in vl.c.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:53 -03:00
Eduardo Habkost
476db0814d sdl: Initialization stubs
This reduces the number of CONFIG_SDL #ifdefs in vl.c.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:53 -03:00
Eduardo Habkost
674ec68693 curses: curses_display_init() stub
One less #ifdef in vl.c.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:53 -03:00
Eduardo Habkost
f8c75b2486 vnc: Initialization stubs
This reduces the number of CONFIG_VNC #ifdefs in the vl.c code.

The only user-visible difference is that this will make QEMU
complain about syntax when using "-display vnc" ("VNC requires a
display argument vnc=<display>") even if CONFIG_VNC is disabled.

Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:52 -03:00
Eduardo Habkost
7b7d2be50c vl: Add DT_COCOA DisplayType value
Instead of reusing DT_SDL for Cocoa, use DT_COCOA to indicate
that a Cocoa display was requested.

configure already ensures CONFIG_COCOA and CONFIG_SDL are never
set at the same time. The only case where DT_SDL is used outside
a #ifdef CONFIG_SDL block is in the no_frame/alt_grab/ctrl_grab
check. That means the only user-visible change is that we will
start printing a warning if the SDL-specific options are used in
Cocoa mode. This is a bugfix, because no_frame/alt_grab/ctrl_grab
are not used by Cocoa code.

Cc: Andreas Färber <andreas.faerber@web.de>
Cc: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Andreas Färber <andreas.faerber@web.de>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:52 -03:00
Eduardo Habkost
c2c7b22db1 vl: Replace *_vga_available() functions with class_names field
Instead of requiring a separate function for each VGA interface,
just enumerate the corresponding class names on struct
VGAInterfaceInfo.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:52 -03:00
Eduardo Habkost
8c9a2b71de vl: Table-based select_vgahw()
Instead of implementing separate check functions for each vga
interface type, add a table enumerating the possible VGA
interfaces.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:52 -03:00
Eduardo Habkost
4aeae8768a vl: Use exit(1) when requested VGA interface is unavailable
Instead of using exit(0), use exit(1) when an unavailable VGA
interface is used in the command-line to indicate it's an error.

Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:52 -03:00
Cao jin
07fcd59de6 pc-dimm: correct comment of MemoryHotplugState
correct comment and remove an unused macro. commit adcb4ee6
already correct its type

Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
2016-05-20 14:28:52 -03:00
Paolo Bonzini
65603e2fc1 tci: do not include exec/exec-all.h
TCI does not need the runtime definition in exec-all.h.  It only needs the
host-side definitions in tcg/tcg.h.  Now that cpu.h is not included
everywhere, this caused a failure because exec-all.h does need cpu.h
but does not include it itself.

Fix by including the intended header.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1463745452-25831-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-20 15:07:46 +01:00
Paolo Bonzini
22b31af26f aspeed: include qemu/log.h
This is not visible with the default "log" trace backend.  With other
backends however trace.h does not include qemu/log.h, resulting in
build failures.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1463745452-25831-2-git-send-email-pbonzini@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-20 13:09:22 +01:00
Peter Maydell
6bd8ab6889 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches

# gpg: Signature made Thu 19 May 2016 16:09:27 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (31 commits)
  qemu-iotests: Fix regression in 136 on aio_read invalid
  qemu-iotests: Simplify 109 with unaligned qemu-img compare
  qemu-io: Fix recent UI updates
  block: clarify error message for qmp-eject
  qemu-iotests: Some more write_zeroes tests
  qcow2: Fix write_zeroes with partially allocated backing file cluster
  qcow2: fix condition in is_zero_cluster
  block: Propagate AioContext change to all children
  block: Remove BlockDriverState.blk
  block: Don't return throttling info in query-named-block-nodes
  block: Avoid bs->blk in bdrv_next()
  block: Add bdrv_has_blk()
  block: Remove bdrv_aio_multiwrite()
  blockjob: Don't touch BDS iostatus
  blockjob: Don't set iostatus of target
  block: User BdrvChild callback for device name
  block: Use BdrvChild callbacks for change_media/resize
  block: Don't check throttled reqs in bdrv_requests_pending()
  Revert "block: Forbid I/O throttling on nodes with multiple parents for 2.6"
  block: Remove bdrv_move_feature_fields()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-19 16:54:12 +01:00
Kevin Wolf
7753da2351 Merge remote-tracking branch 'mreitz/tags/pull-block-for-kevin-2016-05-19' into queue-block
Block patches

# gpg: Signature made Thu May 19 16:58:53 2016 CEST using RSA key ID E838ACAD
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"

* mreitz/tags/pull-block-for-kevin-2016-05-19:
  qemu-iotests: Fix regression in 136 on aio_read invalid
  qemu-iotests: Simplify 109 with unaligned qemu-img compare
  qemu-io: Fix recent UI updates

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-19 16:59:46 +02:00
Eric Blake
37546ff28f qemu-iotests: Fix regression in 136 on aio_read invalid
Commit 093ea232 removed the ability for aio_read and aio_write
to artificially inflate the invalid statistics counters for
block devices, since it no longer flags unaligned offset or
length.  Add 'aio_read -i' and 'aio_write -i' to restore
the ability, and update test 136 to use it.

Reported-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 1463416983-28318-4-git-send-email-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-19 16:56:58 +02:00
Eric Blake
9e28bb26c2 qemu-iotests: Simplify 109 with unaligned qemu-img compare
For some time now, qemu-img compare has been able to compare
unaligned images.  So we no longer need test 109's hack of
resizing to sector boundaries before invoking compare.

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1463416983-28318-3-git-send-email-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-19 16:56:58 +02:00
Eric Blake
4ca1d3401b qemu-io: Fix recent UI updates
Commit 770e0e0e [*] tried to add 'writev -f', but didn't tweak
the getopt() call to actually let it work.  Likewise, commit
c2e001c missed implementing 'aio_write -u -z'.  The latter commit
also introduced a leak of ctx.

[*] does it sound "ech0e" in here? :)

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 1463416983-28318-2-git-send-email-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-19 16:56:58 +02:00
Peter Maydell
776efef324 Merge remote-tracking branch 'remotes/bonzini/tags/for-upstream' into staging
NEED_CPU_H cleanups, big enough to deserve their own pull request.

# gpg: Signature made Thu 19 May 2016 15:42:37 BST using RSA key ID 78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg:                 aka "Paolo Bonzini <pbonzini@redhat.com>"

* remotes/bonzini/tags/for-upstream: (52 commits)
  hw: clean up hw/hw.h includes
  hw: remove pio_addr_t
  cpu: move exec-all.h inclusion out of cpu.h
  exec: extract exec/tb-context.h
  hw: explicitly include qemu/log.h
  mips: move CP0 functions out of cpu.h
  arm: move arm_log_exception into .c file
  qemu-common: push cpu.h inclusion out of qemu-common.h
  acpi: do not use TARGET_PAGE_SIZE
  s390x: reorganize CSS bits between cpu.h and other headers
  dma: do not depend on kvm_enabled()
  gdbstub: remove unnecessary includes from gdbstub-xml.c
  qemu-common: stop including qemu/host-utils.h from qemu-common.h
  qemu-common: stop including qemu/bswap.h from qemu-common.h
  cpu: move endian-dependent load/store functions to cpu-all.h
  hw: cannot include hw/hw.h from user emulation
  hw: move CPU state serialization to migration/cpu.h
  hw: do not use VMSTATE_*TL
  include: poison symbols in osdep.h
  apic: move target-dependent definitions to cpu.h
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-19 15:55:08 +01:00
John Snow
3a3086b72a block: clarify error message for qmp-eject
If you use HMP's eject but the CDROM tray is locked, you may get a
confusing error message informing you that the "tray isn't open."

As this is the point of eject, we can do a little better and help
clarify that the tray was locked and that it (might) open up later,
so try again.

It's not ideal, but it makes the semantics of the (legacy) eject
command more understandable to end users when they try to use it.

Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-19 16:45:31 +02:00
Kevin Wolf
1ef7d01021 qemu-iotests: Some more write_zeroes tests
This covers some more write_zeroes cases which are relevant for the
recent qcow2 optimisations that check the allocation status of the
backing file for partial cluster write_zeroes requests.

This needs to be separate from 034 because we can only support qcow2 in
this test case for multiple reasons: We check the allocation status
after write_zeroes with 'qemu-img map' and the optimised behaviour that
produces zero clusters is only implemented in qcow2; second, the map
command returns offsets that are qcow2 specific; and finally, we also
use 512 byte clusters which aren't supported for formats like qed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-05-19 16:45:31 +02:00
Kevin Wolf
5efdf53227 qcow2: Fix write_zeroes with partially allocated backing file cluster
In order to correctly check whether a given cluster is read as zero, we
don't only need to check whether bdrv_get_block_status_above() sets
BDRV_BLOCK_ZERO, but also if all sectors for the whole cluster have the
same status.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
2016-05-19 16:45:31 +02:00
Denis V. Lunev
f575f145f4 qcow2: fix condition in is_zero_cluster
We should check for (res & BDRV_BLOCK_ZERO) only. The situation when we
will have !(res & BDRV_BLOCK_DATA) and will not have BDRV_BLOCK_ZERO is
not possible for images with bdi.unallocated_blocks_are_zero == true.

For those images where it's false, however, it can happen and we must
not consider the data zeroed then or we would corrupt the image.

Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-19 16:45:31 +02:00
Max Reitz
b97511c7bc block: Propagate AioContext change to all children
Instead of propagating any change of a BDS's AioContext only to its file
and backing children and letting driver-specific code do the rest, just
propagate it to all and drop the thus superfluous implementations of
bdrv_{at,de}tach_aio_context() in Quorum, blkverify and VMDK.

Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-19 16:45:31 +02:00
Kevin Wolf
1f0c461b82 block: Remove BlockDriverState.blk
This patch removes the remaining users of bs->blk, which will allow us
to have multiple BBs on top of a single BDS. In the meantime, all checks
that are currently in place to prevent the user from creating such
setups can be switched to bdrv_has_blk() instead of accessing BDS.blk.

Future patches can allow them and e.g. enable users to mirror to a block
device that already has a BlockBackend on it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19 16:45:31 +02:00
Kevin Wolf
79c719b755 block: Don't return throttling info in query-named-block-nodes
query-named-block-nodes should not return information that is related
to the attached BlockBackend rather than the node itself, so throttling
information needs to be removed from it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19 16:45:31 +02:00
Kevin Wolf
7c8eece45b block: Avoid bs->blk in bdrv_next()
We need to introduce a separate BdrvNextIterator struct that can keep
more state than just the current BDS in order to avoid using the bs->blk
pointer.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19 16:45:31 +02:00
Kevin Wolf
dde33812a8 block: Add bdrv_has_blk()
In many cases we just want to know whether a BDS has at least one BB
attached, without needing to know the exact BB that is attached. In
contrast to bs->blk, this is still a valid question when more than one
BB can be attached, so just answer it by checking the parents list.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19 16:45:31 +02:00
Kevin Wolf
91c6e4b7bb block: Remove bdrv_aio_multiwrite()
Since virtio-blk implements request merging itself these days, the only
remaining users are test cases for the function. That doesn't make the
function exactly useful any more.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-05-19 16:45:31 +02:00
Kevin Wolf
66a0fae438 blockjob: Don't touch BDS iostatus
Block jobs don't actually make use of the iostatus for their BDSes, but
they manage a separate block job iostatus. Still, they require that it
is enabled for the source BDS and they enable it automatically for the
target and set the error handling mode - which ends up never being used
by the job.

This patch removes all of the BDS iostatus handling from the block job,
which removes another few bs->blk accesses.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19 16:45:31 +02:00
Kevin Wolf
81e254dc83 blockjob: Don't set iostatus of target
When block job errors were introduced, we assigned the iostatus of the
target BDS "just in case". The field has never been accessible for the
user because the target isn't listed in query-block.

Before we can allow the user to have a second BlockBackend on the
target, we need to clean this up. If anything, we would want to set the
iostatus for the internal BB of the job (which we can always do later),
but certainly not for a separate BB which the job doesn't even use.

As a nice side effect, this gets us rid of another bs->blk use.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19 16:45:31 +02:00
Kevin Wolf
4c265bf9f4 block: User BdrvChild callback for device name
In order to get rid of bs->blk for bdrv_get_device_name() and
bdrv_get_device_or_node_name(), ask all parents for their name and
simply pick the first one.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19 16:45:31 +02:00
Kevin Wolf
5c8cab4808 block: Use BdrvChild callbacks for change_media/resize
We want to get rid of BlockDriverState.blk in order to allow multiple
BlockBackends per BDS. Converting the device callbacks in block.c (which
assume a single BlockBackend) to per-child callbacks gets us rid of the
first few instances.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
2016-05-19 16:45:31 +02:00
Kevin Wolf
cbe1beb7a1 block: Don't check throttled reqs in bdrv_requests_pending()
Checking whether there are throttled requests requires going to the
associated BlockBackend, which we want to avoid.

All users of bdrv_requests_pending() in block/io.c already call
bdrv_parent_drained_begin() first, which restarts all throttled
requests, so no throttled requests can be left here and this is removal
of dead code.

The remaining users (assertions during graph manipulation in block.c)
don't care about requests that are still queued in the BlockBackend and
haven't been issued for a BlockDriverState yet.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19 16:45:31 +02:00
Kevin Wolf
b26ded9a7d Revert "block: Forbid I/O throttling on nodes with multiple parents for 2.6"
This reverts commit 76b223200e.

Now that I/O throttling is fully done on the BlockBackend level, there
is no reason any more to block I/O throttling for nodes with multiple
parents as the parents don't influence each other any more.

Conflicts:
	block.c

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19 16:45:31 +02:00
Kevin Wolf
08e83aabe4 block: Remove bdrv_move_feature_fields()
bdrv_move_feature_fields() and swap_feature_fields() are empty now, they
can be removed.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19 16:45:31 +02:00
Kevin Wolf
7ca7f0f6db block: Decouple throttling from BlockDriverState
This moves the throttling related part of the BDS life cycle management
to BlockBackend. The throttling group reference is now kept even when no
medium is inserted.

With this commit, throttling isn't disabled and then re-enabled any more
during graph reconfiguration. This fixes the temporary breakage of I/O
throttling when used with live snapshots or block jobs that manipulate
the graph.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19 16:45:30 +02:00
Kevin Wolf
bb9aaecaf1 block/io: Quiesce parents between drained_begin/end
So far, bdrv_parent_drained_begin/end() was called for the duration of
the actual bdrv_drain() at the beginning of a drained section, but we
really should keep parents quiesced until the end of the drained
section.

This does not actually change behaviour at this point because the only
user of the .drained_begin/end BdrvChildRole callback is I/O throttling,
which already doesn't send any new requests after flushing its queue in
.drained_begin. The patch merely removes a trap for future users.

Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19 16:45:30 +02:00
Kevin Wolf
c2066af051 block: Drain throttling queue with BdrvChild callback
This removes the last part of I/O throttling from block/io.c and moves
it to the BlockBackend.

Instead of having knowledge about throttling inside io.c, we can call a
BdrvChild callback .drained_begin/end, which happens to drain the
throttled requests for BlockBackend parents.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19 16:45:30 +02:00
Kevin Wolf
22aa8b246a block: Introduce BdrvChild.opaque
BlockBackends use it to get a back pointer from BdrvChild to
BlockBackend in any BdrvChildRole callbacks.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19 16:45:30 +02:00
Kevin Wolf
97148076e8 block: Move I/O throttling configuration functions to BlockBackend
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19 16:45:30 +02:00
Kevin Wolf
441565b279 block: Move actual I/O throttling to BlockBackend
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19 16:45:30 +02:00
Kevin Wolf
27ccdd5259 block: Move throttling fields from BDS to BB
This patch changes where the throttling state is stored (used to be the
BlockDriverState, now it is the BlockBackend), but it doesn't actually
make it a BB level feature yet. For example, throttling is still
disabled when the BDS is detached from the BB.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19 16:45:30 +02:00
Kevin Wolf
49d2165d7d block: Convert throttle_group_get_name() to BlockBackend
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19 16:45:29 +02:00
Kevin Wolf
31dce3ccca block: throttle-groups: Use BlockBackend pointers internally
As a first step towards moving I/O throttling to the BlockBackend level,
this patch changes all pointers in struct ThrottleGroup from referencing
a BlockDriverState to referencing a BlockBackend.

This change is valid because we made sure that throttling can only be
enabled on BDSes which have a BB attached.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19 16:45:29 +02:00
Kevin Wolf
f2cd875d54 block: Introduce BlockBackendPublic
Some features, like I/O throttling, are implemented outside
block-backend.c, but still want to keep information in BlockBackend,
e.g. list entries that allow keeping a list of BlockBackends.

In order to avoid exposing the whole struct layout in the public header
file, this patch introduces an embedded public struct where such
information can be added and a pair of functions to convert between
BlockBackend and BlockBackendPublic.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19 16:45:29 +02:00
Kevin Wolf
a5614993d7 block: Make sure throttled BDSes always have a BB
It was already true in principle that a throttled BDS always has a BB
attached, except that the order of operations while attaching or
detaching a BDS to/from a BB wasn't careful enough.

This commit breaks graph manipulations while I/O throttling is enabled.
It would have been possible to keep things working with some temporary
hacks, but quite cumbersome, so it's not worth the hassle. We'll fix
things again in a minute.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-19 16:45:29 +02:00
Paolo Bonzini
df43d49cb8 hw: clean up hw/hw.h includes
Include qom/object.h and exec/memory.h instead of exec/ioport.h;
exec/ioport.h was almost everywhere required only for those two
includes, not for the content of the header itself.

Remove block/aio.h, everybody is already including it through
another path.

With this change, include/hw/hw.h is freed from qemu-common.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:30 +02:00
Paolo Bonzini
89a80e7400 hw: remove pio_addr_t
pio_addr_t is almost unused, because these days I/O ports are simply
accessed through the address space.  cpu_{in,out}[bwl] themselves are
almost unused; monitor.c and xen-hvm.c could use address_space_read/write
directly, since they have an integer size at hand.  This leaves qtest as
the only user of those functions.

On the other hand even portio_* functions use this type; the only
interesting use of pio_addr_t thus is include/hw/sysbus.h.  I guess I
could move it there, but I don't see much benefit in that either.  Using
uint32_t is enough and avoids the need to include ioport.h everywhere.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:30 +02:00
Paolo Bonzini
63c915526d cpu: move exec-all.h inclusion out of cpu.h
exec-all.h contains TCG-specific definitions.  It is not needed outside
TCG-specific files such as translate.c, exec.c or *helper.c.

One generic function had snuck into include/exec/exec-all.h; move it to
include/qom/cpu.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:29 +02:00
Paolo Bonzini
00f6da6a1a exec: extract exec/tb-context.h
TCG backends do not need most of exec-all.h; extract what they actually
need to a separate file or move it directly to tcg.h.  The next patch
will stop including exec-all.h from everywhere.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:29 +02:00
Paolo Bonzini
03dd024ff5 hw: explicitly include qemu/log.h
Move the inclusion out of hw/hw.h, most files do not need it.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:29 +02:00
Paolo Bonzini
e6623d88f4 mips: move CP0 functions out of cpu.h
These are here for historical reasons: they are needed from both gdbstub.c
and op_helper.c, and the latter was compiled with fixed AREG0.  It is
not needed anymore, so uninline them.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:29 +02:00
Paolo Bonzini
27a7ea8a1f arm: move arm_log_exception into .c file
Avoid need for qemu/log.h inclusion, and make the function static too.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:29 +02:00
Paolo Bonzini
33c11879fd qemu-common: push cpu.h inclusion out of qemu-common.h
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:29 +02:00
Paolo Bonzini
35c5a52d1d acpi: do not use TARGET_PAGE_SIZE
This is a #define used by the CPU.  NVDIMM can just use 4K
unconditionally.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:28 +02:00
Paolo Bonzini
bd3f16ac30 s390x: reorganize CSS bits between cpu.h and other headers
Move cpu_inject_* to the only C file where they are used.

Move ioinst.h declarations that need S390CPU to cpu.h, to make
ioinst.h independent of cpu.h.

Move channel declarations that only need SubchDev from cpu.h
to css.h, to make more channel users independent of cpu.h.

Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:28 +02:00
Paolo Bonzini
77ac58ddc6 dma: do not depend on kvm_enabled()
Memory barriers are needed also by Xen and, when the ioeventfd
bugs are fixed, by TCG as well.

sysemu/kvm.h is not anymore needed in sysemu/dma.h, move it to
the actual users.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:28 +02:00
Paolo Bonzini
da16384560 gdbstub: remove unnecessary includes from gdbstub-xml.c
gdbstub-xml.c defines a bunch of arrays of strings; there is no
need to include anything.  Keep osdep.h for consistency, but remove
the rest.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:28 +02:00
Paolo Bonzini
87776ab72b qemu-common: stop including qemu/host-utils.h from qemu-common.h
Move it to the actual users.  There are some inclusions of
qemu/host-utils.h in headers, but they are all necessary.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:28 +02:00
Paolo Bonzini
58369e22cf qemu-common: stop including qemu/bswap.h from qemu-common.h
Move it to the actual users.  There are still a few includes of
qemu/bswap.h in headers; removing them is left for future work.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:28 +02:00
Paolo Bonzini
a7d6039cb3 cpu: move endian-dependent load/store functions to cpu-all.h
Disentangle cpu-common.h and memory.h from NEED_CPU_H.  Prototypes are
not defined for !NEED_CPU_H, so remove them from poison.h too.  Only
macros need poisoning.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:28 +02:00
Paolo Bonzini
741da0d38b hw: cannot include hw/hw.h from user emulation
All qdev definitions are available from other headers, user-mode
emulation does not need hw/hw.h.

By considering system emulation only, it is simpler to disentangle
hw/hw.h from NEED_CPU_H.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:28 +02:00
Paolo Bonzini
1e00b8d57a hw: move CPU state serialization to migration/cpu.h
Remove usage of NEED_CPU_H from hw/hw.h.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:28 +02:00
Paolo Bonzini
cbd62f8616 hw: do not use VMSTATE_*TL
Reserve this to CPU state serialization.

Luckily, they were only used by sPAPR devices and these are ppc64
only.  So there is no change to migration format.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:28 +02:00
Paolo Bonzini
bdd902277c include: poison symbols in osdep.h
Ensure that all target-independent files ignore poisoned symbols,
and fix the fallout.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:28 +02:00
Paolo Bonzini
d613f8cc33 apic: move target-dependent definitions to cpu.h
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:28 +02:00
Paolo Bonzini
e81096b1c8 explicitly include linux/kvm.h
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:27 +02:00
Paolo Bonzini
3b3d264888 explicitly include hw/qdev-core.h
exec/cpu-all.h includes qom/cpu.h, which includes hw/qdev-core.h.
Explicit inclusion will keep things working when cpu.h will not be
included indirectly almost everywhere (either directly or through
qemu-common.h).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:27 +02:00
Paolo Bonzini
7d0c99a9d8 explicitly include qom/cpu.h
exec/cpu-all.h includes qom/cpu.h.  Explicit inclusion
will keep things working when cpu.h will not be included
indirectly almost everywhere (either directly or through
qemu-common.h).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:27 +02:00
Paolo Bonzini
8ea952d679 arm: remove useless cpu.h inclusion
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:27 +02:00
Paolo Bonzini
aa5a9e2484 ppc: use PowerPCCPU instead of CPUPPCState
This changes a cpu.h dependency for hw/ppc/ppc.h into a cpu-qom.h
dependency.  For it to compile we also need to clean up a few unused
definitions.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:27 +02:00
Paolo Bonzini
5a975d435a mips: use MIPSCPU instead of CPUMIPSState
This changes a cpu.h dependency into a cpu-qom.h dependency.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:27 +02:00
Paolo Bonzini
0774831d08 alpha: include cpu-qom.h in files that require AlphaCPU
This will keep things working when cpu.h will not be included
indirectly almost everywhere (either directly or through
qemu-common.h).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:27 +02:00
Paolo Bonzini
b4c1c6fc61 sh4: include cpu-qom.h in files that require SuperHCPU
This will keep things working when cpu.h will not be included
indirectly almost everywhere (either directly or through
qemu-common.h).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:27 +02:00
Paolo Bonzini
4669fcc7fa m68k: include cpu-qom.h in files that require M68KCPU
This will keep things working when cpu.h will not be included
indirectly almost everywhere (either directly or through
qemu-common.h).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:27 +02:00
Paolo Bonzini
16fd646182 arm: include cpu-qom.h in files that require ARMCPU
This will keep things working when cpu.h will not be included
indirectly almost everywhere (either directly or through
qemu-common.h).

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:27 +02:00
Paolo Bonzini
da37426169 target-xtensa: make cpu-qom.h not target specific
Make XtensaCPU an opaque type within cpu-qom.h, and move all definitions
of private methods, as well as all type definitions that require knowledge
of the layout to cpu.h.  Conversely, move all definitions needed to
define a class to cpu-qom.h.  This helps making files independent of
NEED_CPU_H if they only need to pass around CPU pointers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:42:26 +02:00
Paolo Bonzini
55b1142259 target-unicore32: make cpu-qom.h not target specific
Make UniCore32CPU an opaque type within cpu-qom.h, and move all
definitions of private methods, as well as all type definitions that
require knowledge of the layout to cpu.h.  This helps making files
independent of NEED_CPU_H if they only need to pass around CPU pointers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:41:34 +02:00
Paolo Bonzini
fc111b107a target-tricore: make cpu-qom.h not target specific
Make TriCoreCPU an opaque type within cpu-qom.h, and move all definitions
of private methods, as well as all type definitions that require knowledge
of the layout to cpu.h.  This helps making files independent of NEED_CPU_H
if they only need to pass around CPU pointers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:41:34 +02:00
Paolo Bonzini
d61d1b2061 target-sparc: make cpu-qom.h not target specific
Make SPARCCPU an opaque type within cpu-qom.h, and move all definitions
of private methods, as well as all type definitions that require knowledge
of the layout to cpu.h.  This helps making files independent of NEED_CPU_H
if they only need to pass around CPU pointers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:41:34 +02:00
Paolo Bonzini
e6005f66f9 target-sh4: make cpu-qom.h not target specific
Make SuperHCPU an opaque type within cpu-qom.h, and move all definitions
of private methods, as well as all type definitions that require knowledge
of the layout to cpu.h.  This helps making files independent of NEED_CPU_H
if they only need to pass around CPU pointers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:41:34 +02:00
Paolo Bonzini
a4a02f99ff target-s390x: make cpu-qom.h not target specific
Make S390XCPU an opaque type within cpu-qom.h, and move all definitions
of private methods, as well as all type definitions that require knowledge
of the layout to cpu.h.  This helps making files independent of NEED_CPU_H
if they only need to pass around CPU pointers.

Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:41:34 +02:00
Paolo Bonzini
2d34fe392c target-ppc: make cpu-qom.h not target specific
Make PowerPCCPU an opaque type within cpu-qom.h, and move all definitions
of private methods, as well as all type definitions that require knowledge
of the layout to cpu.h.  Conversely, move all definitions needed to define
a class to cpu-qom.h.  This helps making files independent of NEED_CPU_H
if they only need to pass around CPU pointers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 16:41:33 +02:00
Paolo Bonzini
c771dabf55 target-ppc: do not make PowerPCCPUClass depend on target-specific symbols
Just leave some members in even if they are unused on e.g.
32-bit PPC or user-mode emulation.  This avoids complications
when using PowerPCCPUClass in code that is compiled just
once (because it applies to both 32-bit and 64-bit PPC
for example) but still needs to peek at PPC-specific members.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 13:08:05 +02:00
Paolo Bonzini
b2305601d3 target-ppc: do not use target_ulong in cpu-qom.h
Bring the PowerPCCPUClass handle_mmu_fault method type into line with
the one in CPUClass.

Using vaddr also makes the cpu-qom.h file target independent.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 13:08:05 +02:00
Paolo Bonzini
416bf93686 target-mips: make cpu-qom.h not target specific
Make MIPSCPU an opaque type within cpu-qom.h, and move all definitions of
private methods, as well as all type definitions that require knowledge
of the layout to cpu.h.  This helps making files independent of NEED_CPU_H
if they only need to pass around CPU pointers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 13:08:05 +02:00
Paolo Bonzini
ffa3a3c6c1 target-microblaze: make cpu-qom.h not target specific
Make MicroBlazeCPU an opaque type within cpu-qom.h, and move all
definitions of private methods, as well as all type definitions that
require knowledge of the layout to cpu.h.  This helps making files
independent of NEED_CPU_H if they only need to pass around CPU pointers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 13:08:05 +02:00
Paolo Bonzini
a836b8fa00 target-m68k: make cpu-qom.h not target specific
Make M68KCPU an opaque type within cpu-qom.h, and move all definitions of
private methods, as well as all type definitions that require knowledge
of the layout to cpu.h.  This helps making files independent of NEED_CPU_H
if they only need to pass around CPU pointers.

Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 13:08:05 +02:00
Paolo Bonzini
6adb9c5474 target-lm32: make cpu-qom.h not target specific
Make LM32CPU an opaque type within cpu-qom.h, and move all definitions of
private methods, as well as all type definitions that require knowledge
of the layout to cpu.h.  This helps making files independent of NEED_CPU_H
if they only need to pass around CPU pointers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 13:08:04 +02:00
Paolo Bonzini
4da6f8d954 target-i386: make cpu-qom.h not target specific
Make X86CPU an opaque type within cpu-qom.h, and move all definitions of
private methods, as well as all type definitions that require knowledge
of the layout to cpu.h.  This helps making files independent of NEED_CPU_H
if they only need to pass around CPU pointers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 13:08:04 +02:00
Paolo Bonzini
28618ac652 target-cris: make cpu-qom.h not target specific
Make CRISCPU an opaque type within cpu-qom.h, and move all definitions of
private methods, as well as all type definitions that require knowledge
of the layout to cpu.h.  This helps making files independent of NEED_CPU_H
if they only need to pass around CPU pointers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 13:08:04 +02:00
Paolo Bonzini
74e755647c target-arm: make cpu-qom.h not target specific
Make ARMCPU an opaque type within cpu-qom.h, and move all definitions of
private methods, as well as all type definitions that require knowledge
of the layout to cpu.h.  This helps making files independent of NEED_CPU_H
if they only need to pass around CPU pointers.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 13:08:04 +02:00
Paolo Bonzini
1dc8e6b758 target-alpha: make cpu-qom.h not target specific
Make AlphaCPU an opaque type within cpu-qom.h, and move all definitions
of private methods, as well as all type definitions that require knowledge
of the layout to cpu.h.  This helps making files independent of NEED_CPU_H
if they only need to pass around CPU pointers.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 13:08:04 +02:00
Paolo Bonzini
347b1a5cc6 cpu: make cpu-qom.h only include-able from cpu.h
Make cpu-qom.h so that it is only included from cpu.h.  Then there
is no need for it to include cpu.h again.

Later we will make cpu-qom.h target independent and we will _want_
to include it from elsewhere, but for now reduce the number of cases
to handle.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 13:08:04 +02:00
Paolo Bonzini
f2937a33a5 log: do not use CONFIG_USER_ONLY
This decouples logging further from config-target.h

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 13:08:04 +02:00
Paolo Bonzini
4b4629d9d2 include: move CPU-related definitions out of qemu-common.h
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 13:08:04 +02:00
Paolo Bonzini
b01501db18 s390x: move .needed functions for subsections to machine.c
These functions are only used when defining subsections, so move
them there.

Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 13:07:34 +02:00
Paolo Bonzini
f115a19c40 scripts: add script to build QEMU and analyze inclusions
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2016-05-19 12:09:28 +02:00
Peter Maydell
8ec4fe0a4b Merge remote-tracking branch 'remotes/mjt/tags/pull-trivial-patches-2016-05-18' into staging
trivial patches for 2016-05-18

# gpg: Signature made Wed 18 May 2016 13:04:43 BST using RSA key ID A4C3D7DB
# gpg: Good signature from "Michael Tokarev <mjt@tls.msk.ru>"
# gpg:                 aka "Michael Tokarev <mjt@corpit.ru>"
# gpg:                 aka "Michael Tokarev <mjt@debian.org>"

* remotes/mjt/tags/pull-trivial-patches-2016-05-18:
  Fix some typos found by codespell
  9p: drop unused declaration from coth.h
  smbios: fix typo
  accel: make configure_accelerator return void
  configure: Use uniform description for devel packages
  ipack: Update e-mail address
  util: fix comment typos
  qdict: fix unbounded stack warning for qdict_array_entries
  Fix typo in variable name (found and fixed by codespell)
  vl: fix comment about when parsing cpu definitions
  loader: fix potential memory leak
  remove comment for nonexistent structure member
  s390: remove misleading comment

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-19 09:27:28 +01:00
Stefan Weil
cb8d4c8f54 Fix some typos found by codespell
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-05-18 15:04:27 +03:00
Greg Kurz
d506dc87b9 9p: drop unused declaration from coth.h
Commit "ebac1202c95a virtio-9p: use QEMU thread pool" dropped function
v9fs_init_worker_threads.

Signed-off-by: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-05-18 15:04:27 +03:00
Cao jin
cc2324d03d smbios: fix typo
The spec says: "on paragraph (16-byte) boundaries"

Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-05-18 15:04:27 +03:00
Wei Jiangang
bdc3f61dec accel: make configure_accelerator return void
Return the negated value of accel_initialised is meaningless,
and the caller vl doesn't check it.

Signed-off-by: Wei Jiangang <weijg.fnst@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-05-18 15:04:27 +03:00
Stefan Weil
3f3b5388d4 configure: Use uniform description for devel packages
As all other devel packages are written in the form "name devel",
use this form for libcap devel and libattr devel, too.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-05-18 15:04:27 +03:00
Alberto Garcia
b996aed510 ipack: Update e-mail address
I'm not really using the old one anymore.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-05-18 15:04:27 +03:00
Wei Jiangang
d43eda3d19 util: fix comment typos
Signed-off-by: Wei Jiangang <weijg.fnst@cn.fujitsu.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-05-18 15:04:27 +03:00
Peter Xu
de4905f4bc qdict: fix unbounded stack warning for qdict_array_entries
Here we use one g_strdup_printf() to replace the two stack allocated
array, considering it's more convenient, safe, and as long as it's
called rarely only when quorum device opens. This will remove the
unbound stack warning when compiling with "-Wstack-usage=1000000".

Reviewed-by:   Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-05-18 15:04:26 +03:00
Stefan Weil
1d817db3a0 Fix typo in variable name (found and fixed by codespell)
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-05-18 15:04:26 +03:00
Wei Jiangang
37a3e630d9 vl: fix comment about when parsing cpu definitions
machine->init() was replaced with machine_class->init()
in 958db90cd5.

Signed-off-by: Wei Jiangang <weijg.fnst@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-05-18 15:04:26 +03:00
Cao jin
ed2f3bc1fa loader: fix potential memory leak
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-05-18 15:04:26 +03:00
Cao jin
ec609656fc remove comment for nonexistent structure member
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-05-18 15:04:26 +03:00
Michael Tokarev
f35c1f66ad s390: remove misleading comment
The comment talks about a non-ELF object while the
example gives ELF object.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2016-05-18 15:04:26 +03:00
Peter Maydell
a257c74149 Merge remote-tracking branch 'remotes/cohuck/tags/s390x-20160517' into staging
First batch of s390x patches for 2.7:
- The new machine for 2.7
- Make use of the runtime instrumentation support introduced in
  the kernel
- Enhance our ipl (boot) process: We can now start from devices
  in subchannel sets > 0 as well. As a bonus, the conversion to
  diag308 in the bios allows us to get rid of the gr7 hack.
- Xiaoqiang Zhao's SCLP qomification patches
- Several fixes in the s390x pci implementation

# gpg: Signature made Tue 17 May 2016 15:35:32 BST using RSA key ID C6F02FAF
# gpg: Good signature from "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg:                 aka "Cornelia Huck <cornelia.huck@de.ibm.com>"

* remotes/cohuck/tags/s390x-20160517:
  s390x/pci: remove whitespace
  s390x/pci: add length checking for pci sclp handlers
  s390x/pci: enhance mpcifc_service_call
  s390x/pci: fix s390_pci_sclp_deconfigure
  s390x/pci: introduce S390PCIBusDevice.iommu_enabled
  s390x/pci: export pci_dereg_ioat and pci_dereg_irqs
  s390x/pci: separate s390_pcihost_iommu_configure function
  s390x/pci: separate s390_sclp_configure function
  s390x/pci: fix reg_irqs()
  hw/char: QOM'ify sclpconsole.c
  hw/char: QOM'ify sclpconsole-lm.c
  s390x/ipl: Remove redundant usage of gr7
  s390-ccw.img: rebuild image
  pc-bios/s390-ccw: Get device address via diag 308/6
  s390x/ipl: Add ssid field to IplParameterBlock
  s390x/ipl: Provide ipl parameter block
  s390x/ipl: Add type and length checks for IplParameterBlock values
  s390x/ipl: Extend the IplParameterBlock struct
  s390x: enable runtime instrumentation
  s390x: add compat machine for 2.7

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-17 16:49:11 +01:00
Yi Min Zhao
c26916942a s390x/pci: remove whitespace
Fix indentation of PciCfgSccb struct.

Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17 15:50:29 +02:00
Yi Min Zhao
3b40ea2957 s390x/pci: add length checking for pci sclp handlers
The configure/deconfigure sclp commands need a SCCB with a length of
at least 16. Indicate in the response code if this is not fulfilled.

Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17 15:50:29 +02:00
Yi Min Zhao
a6d9d4f26a s390x/pci: enhance mpcifc_service_call
Enhance error handling for mpcifc_service_call() to propagate errors
to guest by setting status codes or triggering program interrupts.

Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17 15:50:29 +02:00
Yi Min Zhao
259a4f0a76 s390x/pci: fix s390_pci_sclp_deconfigure
When deconfiguring a s390 pci device, we should deconfigure the
corresponding IOMMU memory region and the IRQs for the device.

Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17 15:50:29 +02:00
Yi Min Zhao
df6a050c82 s390x/pci: introduce S390PCIBusDevice.iommu_enabled
We introduce iommu_enabled field for S390PCIBusDevice struct to
track whether the iommu has been enabled for the device. This allows
us to stop temporarily changing ->configured while en/disabling the
iommu and to do conditional cleanup later.

Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17 15:50:29 +02:00
Yi Min Zhao
e141dbadfa s390x/pci: export pci_dereg_ioat and pci_dereg_irqs
dereg_irqs and dereg_ioat are needed by external functions. Let's
rename and export both of them in s390-pci-inst.h.

Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17 15:50:29 +02:00
Yi Min Zhao
715838881f s390x/pci: separate s390_pcihost_iommu_configure function
Split s390_pcihost_iommu_configure() into separate functions for
configuring and deconfiguring in order to make the code more readable.

Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17 15:50:29 +02:00
Yi Min Zhao
8f5cb69313 s390x/pci: separate s390_sclp_configure function
Split s390_sclp_configure() into separate functions for sclp
configuring and deconfiguring in order to make the code more readable.

Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17 15:50:29 +02:00
Yi Min Zhao
bac45d5147 s390x/pci: fix reg_irqs()
In reg_irqs(), present code assumes that map_indicator() always issues
successfully. Let's check it and return the error to caller in order to
inform guest.

Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17 15:50:29 +02:00
xiaoqiang zhao
3f6ec642ae hw/char: QOM'ify sclpconsole.c
Drop the DO_UPCAST macro

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-Id: <1459237645-17227-7-git-send-email-zxq_yx_007@163.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17 15:50:29 +02:00
xiaoqiang zhao
e563c59b6a hw/char: QOM'ify sclpconsole-lm.c
Drop the DO_UPCAST macro

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-Id: <1459237645-17227-6-git-send-email-zxq_yx_007@163.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17 15:50:29 +02:00
Alexander Yarygin
010d45d279 s390x/ipl: Remove redundant usage of gr7
We don't need to pass device address for pc-bios using gr7 anymore as
the pcbios completely relies on diag308 now, so we can remove it from
qemu. devno, ssid and cssid are migrated but the value was never reused,
so we can safely ignore these fields and migrate 0.

Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17 15:50:29 +02:00
Cornelia Huck
a388ac74de s390-ccw.img: rebuild image
Contains the following change:

pc-bios/s390-ccw: Get device address via diag 308/6

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17 15:50:29 +02:00
Alexander Yarygin
d046c51dad pc-bios/s390-ccw: Get device address via diag 308/6
To IPL from a device, pc-bios receives from qemu a device address via
general register 7. The better way to do it is to use diag308/6
instruction which returns so called
"IplParameterBlock". IplParameterBlock contains the device address for
IPL and additional parameters that can be used by pc-bios.

This patch allows pc-bios to get device address via diag308/6 and
doesn't use gr7 passed boot information anymore.

Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17 15:50:29 +02:00
Alexander Yarygin
3041e3bead s390x/ipl: Add ssid field to IplParameterBlock
Add the ssid field to the ipl parameter block struct and fill it when
necessary so the guest can use it.

Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17 15:50:29 +02:00
Alexander Yarygin
6aed958978 s390x/ipl: Provide ipl parameter block
Right now we return the ipl parameter block only if the guest
specified one. Let's fill in the parameter block when bootindex
parameter is available and not booting from an external kernel.

Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17 15:50:29 +02:00
Alexander Yarygin
9946a9113c s390x/ipl: Add type and length checks for IplParameterBlock values
We can check for valid type and lengths of the IplParameterBlock fields
when receiving the struct from the guest.

Length of the IplParameterBlock can be less than 4K. To play safe we can
read and write only required amount of data.

Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Reviewed-by: David Hildenband <dahi@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17 15:50:29 +02:00
Alexander Yarygin
04ca4b92ec s390x/ipl: Extend the IplParameterBlock struct
The IplParameterBlock struct currently has only 200 bytes filled, but it
can be up to 4K.

This patch converts the struct to union with a fully populated struct
inside it and second struct with old values.

For compatibility reasons we disable migration of the extended iplb
field for pre-2.7 machines. Also a guest still can read/write only the
first 200 bytes of IPLB for now.

Signed-off-by: Alexander Yarygin <yarygin@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <dahi@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17 15:50:29 +02:00
Fan Zhang
9700230b0d s390x: enable runtime instrumentation
Introduce run-time-instrumentation support when running under kvm for
virtio-ccw 2.7 machine and make sure older machines can not enable it.

The new ri_allowed field in the s390MachineClass serves as an indicator
whether the feature can be used by the machine and should therefore be
activated if available.

riccb_needed() is used to check whether riccb is needed or not in live
migration.

Signed-off-by: Fan Zhang <zhangfan@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17 15:50:29 +02:00
Cornelia Huck
946e55f3c7 s390x: add compat machine for 2.7
Also add some of the option cascading we were missing.

Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
2016-05-17 15:50:29 +02:00
Peter Maydell
5a3fd960f3 Merge remote-tracking branch 'remotes/stefanha/tags/tracing-pull-request' into staging
# gpg: Signature made Tue 17 May 2016 14:06:54 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/tracing-pull-request:
  hw/intc/arm_gic: add tracepoints

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-17 14:07:25 +01:00
Peter Maydell
3f5e34a45c Merge remote-tracking branch 'remotes/stefanha/tags/block-pull-request' into staging
# gpg: Signature made Tue 17 May 2016 01:19:39 BST using RSA key ID 81AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg:                 aka "Stefan Hajnoczi <stefanha@gmail.com>"

* remotes/stefanha/tags/block-pull-request:
  rfifolock: no need to get thread identifier when nesting

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-17 10:35:50 +01:00
Peter Maydell
c98e793711 Merge remote-tracking branch 'remotes/thibault/tags/samuel-thibault' into staging
slirp updates

# gpg: Signature made Mon 16 May 2016 20:22:36 BST using RSA key ID FB6B2F1D
# gpg: Good signature from "Samuel Thibault <samuel.thibault@gnu.org>"
# gpg:                 aka "Samuel Thibault <sthibault@debian.org>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@inria.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@labri.fr>"
# gpg:                 aka "Samuel Thibault <samuel.thibault@ens-lyon.org>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg:          There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 900C B024 B679 31D4 0F82  304B D017 8C76 7D06 9EE6
#      Subkey fingerprint: F632 74CD C630 0873 CB3D  29D9 E3E5 1CE8 FB6B 2F1D

* remotes/thibault/tags/samuel-thibault:
  slirp: Clean up osdep.h related header inclusions
  slirp: Remove some unused code from slirp.h
  slirp: Remove obsolete backward-compatibility cruft
  slirp: Clean up slirp_config.h

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-17 09:16:00 +01:00
Hollis Blanchard
2531088f6c hw/intc/arm_gic: add tracepoints
These are obviously critical to understanding interrupt delivery:
gic_enable_irq
gic_disable_irq
gic_set_irq (inbound irq from device models)
gic_update_set_irq (outbound irq to CPU)
gic_acknowledge_irq

The only one that I think might raise eyebrows is gic_update_bestirq, but I've
(sadly) debugged problems that ended up being caused by unexpected priorities.
Knowing that the GIC has an irq ready, but doesn't deliver to the CPU due to
priority, has also proven important.

Signed-off-by: Hollis Blanchard <hollis_blanchard@mentor.com>
Message-id: 1461252281-22399-1-git-send-email-hollis_blanchard@mentor.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-16 17:20:41 -07:00
Changlong Xie
de3e15a705 rfifolock: no need to get thread identifier when nesting
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Message-id: 1462874348-32396-1-git-send-email-xiecl.fnst@cn.fujitsu.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2016-05-16 15:29:44 -07:00
Thomas Huth
9892663dc4 slirp: Clean up osdep.h related header inclusions
qemu/osdep.h is included in some headers twice - one time
should be sufficient.
Also remove the inclusion of time.h since that is already
done by osdep.h, too (this makes scripts/clean-includes
happy again).

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2016-05-16 21:01:16 +02:00
Thomas Huth
2cdc848eb5 slirp: Remove some unused code from slirp.h
These hunks are apparently not used anymore, so let's delete them.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2016-05-16 21:00:31 +02:00
Thomas Huth
5469feadb1 slirp: Remove obsolete backward-compatibility cruft
The slirp code does not use index() and gethostid() anymore,
so these parts can be removed without problems.
memmove() and strerror() should be available on each of the
supported platforms nowadays, too, so these wrappers are also
not needed anymore.
And we certainly also do not support Ultrix anymore, so no
need to keep the code for this platform anymore.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2016-05-16 20:58:47 +02:00
Thomas Huth
cebee21aca slirp: Clean up slirp_config.h
There are a lot of unused #defines / #undefs in slirp_config.h,
which are apparently left-overs from the very early slirp code.
Since there is no more code that uses them, let's simply remove
them from our version of slirp.

Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
2016-05-16 20:57:00 +02:00
Peter Maydell
70f87e0f0a Merge remote-tracking branch 'remotes/kraxel/tags/pull-ui-20160513-1' into staging
gtk/sdl build tweaks
fix gtk 3.20 warnings
gtk clipboard support
spice-gl monitor config support
fix coverity warnings

# gpg: Signature made Fri 13 May 2016 13:30:39 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-ui-20160513-1:
  gtk: don't leak the GtkBorder with VTE 0.36
  gtk: update grab code for gtk 3.20
  spice: fix coverity complains
  egl-helpers: fix possible resource leak
  Changed malloc to g_malloc, free to g_free in ui/shader.c
  spice/gl: add & use qemu_spice_gl_monitor_config
  ui/gtk: copy to clipboard support
  ui: gtk: Fix some deprecation warnings
  ui: gtk: Fix a runtime warning on vte >= 0.37
  configure: support vte-2.91
  configure: report SDL version
  configure: report GTK version
  configure: add echo_version helper
  configure: error on unknown --with-sdlabi value
  configure: build SDL if only SDL2 available
  ui: sdl2: Release grab before opening console window
  ui: gtk: fix crash when terminal inner-border is NULL

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-13 13:39:38 +01:00
Peter Maydell
14fccfa91e Merge remote-tracking branch 'remotes/lalrae/tags/mips-20160513' into staging
MIPS patches 2016-05-13

Changes:
* fix zeroing CP0.WatchLo registers in soft reset
* QOMify Jazz led

# gpg: Signature made Fri 13 May 2016 11:04:04 BST using RSA key ID 0B29DA6B
# gpg: Good signature from "Leon Alrae <leon.alrae@imgtec.com>"

* remotes/lalrae/tags/mips-20160513:
  hw/display: QOM'ify jazz_led.c
  target-mips: fix call to memset in soft reset code

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-13 11:50:42 +01:00
Alberto Garcia
6978dc4adc gtk: don't leak the GtkBorder with VTE 0.36
When gtk_widget_style_get() is used to get the "inner-border" style
property, it returns a copy of the GtkBorder which must be freed by
the caller.

This patch also fixes a warning about the unused 'padding' structure
with VTE 0.36.

Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 1463127654-5171-1-git-send-email-berto@igalia.com
Cc: Cole Robinson <crobinso@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>

[ kraxel: adapted to changes in ui patch queue ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-13 12:40:12 +02:00
Peter Maydell
20c20318f9 Merge remote-tracking branch 'remotes/rth/tags/pull-tcg-20160512' into staging
queued 2.7 patches

# gpg: Signature made Fri 13 May 2016 01:08:20 BST using RSA key ID 4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg:                 aka "Richard Henderson <rth@redhat.com>"
# gpg:                 aka "Richard Henderson <rth@twiddle.net>"

* remotes/rth/tags/pull-tcg-20160512: (39 commits)
  cpu-exec: Clean up 'interrupt_request' reloading in cpu_handle_interrupt()
  cpu-exec: Remove unused 'x86_cpu' and 'env' from cpu_exec()
  cpu-exec: Move TB execution stuff out of cpu_exec()
  cpu-exec: Move interrupt handling out of cpu_exec()
  cpu-exec: Move exception handling out of cpu_exec()
  cpu-exec: Move halt handling out of cpu_exec()
  cpu-exec: Remove relic orphaned comment
  tcg: Remove needless CPUState::current_tb
  cpu-exec: Move TB chaining into tb_find_fast()
  tcg: Rework tb_invalidated_flag
  tcg: Clean up from 'next_tb'
  cpu-exec: elide more icount code if CONFIG_USER_ONLY
  tcg: reorganize tb_find_physical loop
  tcg: code_bitmap and code_write_count are not used by user-mode emulation
  tcg: Allow goto_tb to any target PC in user mode
  tcg: Clean up direct block chaining safety checks
  tcg: Clean up tb_jmp_unlink()
  tcg: Extract removing of jumps to TB from tb_phys_invalidate()
  tcg: Rename tb_jmp_remove() to tb_remove_from_jmp_list()
  tcg: Clarify thread safety check in tb_add_jump()
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-13 10:42:40 +01:00
xiaoqiang.zhao
7fe91a5b33 hw/display: QOM'ify jazz_led.c
* Drop the old SysBus init function and use instance_init
* Move graphic_console_init into realize stage

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-05-13 09:33:38 +01:00
Sergey Fedorov
8b1fe3f439 cpu-exec: Clean up 'interrupt_request' reloading in cpu_handle_interrupt()
Suggested-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1463071937-26607-1-git-send-email-sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:07:16 -10:00
Sergey Fedorov
ba048a4ae1 cpu-exec: Remove unused 'x86_cpu' and 'env' from cpu_exec()
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1462962111-32237-6-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:42 -10:00
Sergey Fedorov
928de9ee14 cpu-exec: Move TB execution stuff out of cpu_exec()
Simplify cpu_exec() by extracting TB execution code outside of
cpu_exec() into a new static inline function cpu_loop_exec_tb().

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1462962111-32237-5-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:42 -10:00
Sergey Fedorov
c385e6e497 cpu-exec: Move interrupt handling out of cpu_exec()
Simplify cpu_exec() by extracting interrupt handling code outside of
cpu_exec() into a new static inline function cpu_handle_interrupt().

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson  <rth@twiddle.net>
Message-Id: <1462962111-32237-4-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:42 -10:00
Sergey Fedorov
ea284766ec cpu-exec: Move exception handling out of cpu_exec()
Simplify cpu_exec() by extracting exception handling code out of
cpu_exec() into a new static inline function cpu_handle_exception().
Also make cpu_handle_debug_exception() inline as it is used only once.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1462962111-32237-3-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:42 -10:00
Sergey Fedorov
8b2d34e997 cpu-exec: Move halt handling out of cpu_exec()
Simplify cpu_exec() by extracting CPU halt state handling code out of
cpu_exec() into a new static inline function cpu_handle_halt().

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1462962111-32237-2-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:42 -10:00
Sergey Fedorov
c6f0d9f84c cpu-exec: Remove relic orphaned comment
This comment should have been deleted by commit 0ac087f1f3 ("removed
unused code") but somehow it is still here. There's no point to keep it.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1462286050-21778-1-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:42 -10:00
Sergey Fedorov
3213525f8a tcg: Remove needless CPUState::current_tb
This field was used for telling cpu_interrupt() to unlink a chain of TBs
being executed when it worked that way. Now, cpu_interrupt() don't do
this anymore. So we don't need this field anymore.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1462273462-14036-1-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:42 -10:00
Sergey Fedorov
a0522c7a55 cpu-exec: Move TB chaining into tb_find_fast()
Move tb_add_jump() call and surrounding code from cpu_exec() into
tb_find_fast(). That simplifies cpu_exec() a little by hiding the direct
chaining optimization details into tb_find_fast(). It also allows to
move tb_lock()/tb_unlock() pair into tb_find_fast(), putting it closer
to tb_find_slow() which also manipulates the lock.

Suggested-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
[rth: Fixed rebase typo in nochain test.]
2016-05-12 14:06:42 -10:00
Sergey Fedorov
6f789be56d tcg: Rework tb_invalidated_flag
'tb_invalidated_flag' was meant to catch two events:
 * some TB has been invalidated by tb_phys_invalidate();
 * the whole translation buffer has been flushed by tb_flush().

Then it was checked:
 * in cpu_exec() to ensure that the last executed TB can be safely
   linked to directly call the next one;
 * in cpu_exec_nocache() to decide if the original TB should be provided
   for further possible invalidation along with the temporarily
   generated TB.

It is always safe to patch an invalidated TB since it is not going to be
used anyway. It is also safe to call tb_phys_invalidate() for an already
invalidated TB. Thus, setting this flag in tb_phys_invalidate() is
simply unnecessary. Moreover, it can prevent from pretty proper linking
of TBs, if any arbitrary TB has been invalidated. So just don't touch it
in tb_phys_invalidate().

If this flag is only used to catch whether tb_flush() has been called
then rename it to 'tb_flushed'. Declare it as 'bool' and stick to using
only 'true' and 'false' to set its value. Also, instead of setting it in
tb_gen_code(), just after tb_flush() has been called, do it right inside
of tb_flush().

In cpu_exec(), this flag is used to track if tb_flush() has been called
and have made 'next_tb' (a reference to the last executed TB) invalid
for linking it to directly call the next TB. tb_flush() can be called
during the CPU execution loop from tb_gen_code(), during TB execution or
by another thread while 'tb_lock' is released. Catch for translation
buffer flush reliably by resetting this flag once before first TB lookup
and each time we find it set before trying to add a direct jump. Don't
touch in in tb_find_physical().

Each vCPU has its own execution loop in multithreaded mode and thus
should have its own copy of the flag to be able to reset it with its own
'next_tb' and don't affect any other vCPU execution thread. So make this
flag per-vCPU and move it to CPUState.

In cpu_exec_nocache(), we only need to check if tb_flush() has been
called from tb_gen_code() called by cpu_exec_nocache() itself. To do
this reliably, preserve the old value of the flag, reset it before
calling tb_gen_code(), check afterwards, and combine the saved value
back to the flag.

This patch is based on the patch "tcg: move tb_invalidated_flag to
CPUState" from Paolo Bonzini <pbonzini@redhat.com>.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:42 -10:00
Sergey Fedorov
819af24b9c tcg: Clean up from 'next_tb'
The value returned from tcg_qemu_tb_exec() is the value passed to the
corresponding tcg_gen_exit_tb() at translation time of the last TB
attempted to execute. It is a little confusing to store it in a variable
named 'next_tb'. In fact, it is a combination of 4-byte aligned pointer
and additional information in its two least significant bits. Break it
down right away into two variables named 'last_tb' and 'tb_exit' which
are a pointer to the last TB attempted to execute and the TB exit
reason, correspondingly. This simplifies the code and improves its
readability.

Correct a misleading documentation comment for tcg_qemu_tb_exec() and
fix logging in cpu_tb_exec(). Also rename a misleading 'next_tb' in
another couple of places.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:42 -10:00
Paolo Bonzini
7687bf52e5 cpu-exec: elide more icount code if CONFIG_USER_ONLY
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Alex Bennée: #ifndef replay code to match elided functions]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:42 -10:00
Alex Bennée
1279f323d6 tcg: reorganize tb_find_physical loop
Put some comments and improve code structure. This should help reading
the code.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
[Sergey Fedorov: provide commit message; bring back resetting of
tb_invalidated_flag]
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson  <rth@twiddle.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:42 -10:00
Paolo Bonzini
6fad459c91 tcg: code_bitmap and code_write_count are not used by user-mode emulation
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
[Sergey Fedorov: eliminate the field entirely in user-mode]
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Richard Henderson  <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
[rth: merged followup fixup]
Message-Id: <1462982777-4513-1-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:42 -10:00
Sergey Fedorov
90aa39a1cc tcg: Allow goto_tb to any target PC in user mode
In user mode, there's only a static address translation, TBs are always
invalidated properly and direct jumps are reset when mapping change.
Thus the destination address is always valid for direct jumps and
there's no need to restrict it to the pages the TB resides in.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Cc: Riku Voipio <riku.voipio@iki.fi>
Cc: Blue Swirl <blauwirbel@gmail.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:42 -10:00
Sergey Fedorov
5b053a4a28 tcg: Clean up direct block chaining safety checks
We don't take care of direct jumps when address mapping changes. Thus we
must be sure to generate direct jumps so that they always keep valid
even if address mapping changes. Luckily, we can only allow to execute a
TB if it was generated from the pages which match with current mapping.

Document tcg_gen_goto_tb() declaration and note the reason for
destination PC limitations.

Some targets with variable length instructions allow TB to straddle a
page boundary. However, we make sure that both of TB pages match the
current address mapping when looking up TBs. So it is safe to do direct
jumps into the both pages. Correct the checks for some of those targets.

Given that, we can safely patch a TB which spans two pages. Remove the
unnecessary check in cpu_exec() and allow such TBs to be patched.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
f9c5b66f48 tcg: Clean up tb_jmp_unlink()
Unify the code of this function with tb_jmp_remove_from_list(). Making
these functions similar improves their readability. Also this could be a
step towards making this function thread-safe.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
89bba49632 tcg: Extract removing of jumps to TB from tb_phys_invalidate()
Move the code for removing jumps to a TB out of tb_phys_invalidate() to
a separate static inline function tb_jmp_unlink(). This simplifies
tb_phys_invalidate() and improves code structure.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
133626783a tcg: Rename tb_jmp_remove() to tb_remove_from_jmp_list()
tb_jmp_remove() was only used to remove the TB from a list of all TBs
jumping to the same TB which is n-th jump destination of the given TB.
Put a comment briefly describing the function behavior and rename it to
better reflect its purpose.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
9962c478b1 tcg: Clarify thread safety check in tb_add_jump()
The check is to make sure that another thread hasn't already done the
same while we were outside of tb_lock. Mention this in a comment.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
901bc3deb4 tcg: Init TB's direct jumps before making it visible
Initialize TB's direct jump list data fields and reset the jumps before
tb_link_page() puts it into the physical hash table and the physical
page list. So TB is completely initialized before it becomes visible.

This is pure rearrangement of code to a more suitable place, though it
could be a preparation for relaxing the locking scheme in future.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
e90d96b158 tcg: Rearrange tb_link_page() to avoid forward declaration
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
c37e6d7e35 tcg: Use uintptr_t type for jmp_list_{next|first} fields of TB
These fields do not contain pure pointers to a TranslationBlock
structure. So uintptr_t is the most appropriate type for them.
Also put some asserts to assure that the two least significant bits of
the pointer are always zero before assigning it to jmp_list_first.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
f309101c26 tcg: Clean up direct block chaining data fields
Briefly describe in a comment how direct block chaining is done. It
should help in understanding of the following data fields.

Rename some fields in TranslationBlock and TCGContext structures to
better reflect their purpose (dropping excessive 'tb_' prefix in
TranslationBlock but keeping it in TCGContext):
   tb_next_offset  =>  jmp_reset_offset
   tb_jmp_offset   =>  jmp_insn_offset
   tb_next         =>  jmp_target_addr
   jmp_next        =>  jmp_list_next
   jmp_first       =>  jmp_list_first

Avoid using a magic constant as an invalid offset which is used to
indicate that there's no n-th jump generated.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Richard Henderson
7ba6a512ae translate-all: Adjust 256mb testing for mips64
Make sure we preserve the high 32-bits when masking for mips64.

Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Emilio G. Cota
8bdf499782 translate-all: add missing munmap of the code_gen guard page for MIPS
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1461283314-2353-2-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Emilio G. Cota
835154b6e2 translate-all: remove redundant setting of tcg_ctx.code_gen_buffer_size
The setting of tcg_ctx.code_gen_buffer_size is done by the only caller of
size_code_gen_buffer(), which is code_gen_alloc():

  $ git grep size_code_gen_buffer
  translate-all.c:static inline size_t size_code_gen_buffer(size_t tb_size)
  translate-all.c:    tcg_ctx.code_gen_buffer_size = size_code_gen_buffer(tb_size);

Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1461283314-2353-1-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
10b4f48555 tcg: Note requirement on atomic direct jump patching
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <1461341333-19646-12-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
c82460a560 tcg/mips: Make direct jump patching thread-safe
Ensure direct jump patching in MIPS is atomic by using
atomic_read()/atomic_set() for code patching.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1461341333-19646-11-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
[rth: Merged the deposit32 followup.]
[rth: Merged the following followup.]
Message-Id: <1462210518-26522-1-git-send-email-sergey.fedorov@linaro.org>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
84f79fb7c6 tcg/sparc: Make direct jump patching thread-safe
Ensure direct jump patching in SPARC is atomic by using
atomic_read()/atomic_set() for code patching.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <1461341333-19646-10-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
9e26911295 tcg/aarch64: Make direct jump patching thread-safe
Ensure direct jump patching in AArch64 is atomic by using
atomic_read()/atomic_set() for code patching.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1461341333-19646-9-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
7d14e0e2d6 tcg/arm: Make direct jump patching thread-safe
Ensure direct jump patching in ARM is atomic by using
atomic_read()/atomic_set() for code patching.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1461341333-19646-8-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
ed3d51ecd7 tcg/s390: Make direct jump patching thread-safe
Ensure direct jump patching in s390 is atomic by:
 * naturally aligning a location of direct jump address;
 * using atomic_read()/atomic_set() for code patching.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1461341333-19646-7-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
0d07abf05e tcg/i386: Make direct jump patching thread-safe
Ensure direct jump patching in i386 is atomic by:
 * naturally aligning a location of direct jump address;
 * using atomic_read()/atomic_set() for code patching.

tcg_out_nopn() implementation:
Suggested-by: Richard Henderson <rth@twiddle.net>.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1461341333-19646-6-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:41 -10:00
Sergey Fedorov
399f164857 tcg/ppc: Make direct jump patching thread-safe
Ensure direct jump patching in PPC is atomic by:
 * limiting translation buffer size in 32-bit mode to be addressable by
   Branch I-form instruction;
 * using atomic_read()/atomic_set() for code patching.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <1461341333-19646-5-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:40 -10:00
Sergey Fedorov
76442a939e tci: Make direct jump patching thread-safe
Ensure direct jump patching in TCI is atomic by:
 * naturally aligning a location of direct jump address;
 * using atomic_read()/atomic_set() to load/store the address.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1461341333-19646-4-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:40 -10:00
Sergey Fedorov
6b587d3cda include/qemu/osdep.h: Add macros for pointer alignment
These macros provide a convenient way to n-byte align pointers up and
down and check if a pointer is n-byte aligned.

Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1461341333-19646-3-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:40 -10:00
Sergey Fedorov
18a60a7614 include/qemu/osdep.h: Add a macro to check for alignment
Signed-off-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Sergey Fedorov <sergey.fedorov@linaro.org>
Message-Id: <1461341333-19646-2-git-send-email-sergey.fedorov@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
2016-05-12 14:06:40 -10:00
Emilio G. Cota
89fee74a0f tb: consistently use uint32_t for tb->flags
We are inconsistent with the type of tb->flags: usage varies loosely
between int and uint64_t. Settle to uint32_t everywhere, which is
superior to both: at least one target (aarch64) uses the most significant
bit in the u32, and uint64_t is wasteful.

Compile-tested for all targets.

Suggested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Suggested-by: Richard Henderson <rth@twiddle.net>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1460049562-23517-1-git-send-email-cota@braap.org>
2016-05-12 14:06:40 -10:00
Peter Maydell
f68419eee9 Merge remote-tracking branch 'remotes/kevin/tags/for-upstream' into staging
Block layer patches

# gpg: Signature made Thu 12 May 2016 14:37:05 BST using RSA key ID C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"

* remotes/kevin/tags/for-upstream: (69 commits)
  qemu-iotests: iotests: fail hard if not run via "check"
  block: enable testing of LUKS driver with block I/O tests
  block: add support for encryption secrets in block I/O tests
  block: add support for --image-opts in block I/O tests
  qemu-io: Add 'write -z -u' to test MAY_UNMAP flag
  qemu-io: Add 'write -f' to test FUA flag
  qemu-io: Allow unaligned access by default
  qemu-io: Use bool for command line flags
  qemu-io: Make 'open' subcommand more like command line
  qemu-io: Add missing option documentation
  qmp: add monitor command to add/remove a child
  quorum: implement bdrv_add_child() and bdrv_del_child()
  Add new block driver interface to add/delete a BDS's child
  qemu-img: check block status of backing file when converting.
  iotests: fix the redirection order in 083
  block: Inactivate all children
  block: Drop superfluous invalidating bs->file from drivers
  block: Invalidate all children
  nbd: Simplify client FUA handling
  block: Honor BDRV_REQ_FUA during write_zeroes
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 16:33:40 +01:00
Peter Maydell
e4f70d6358 Merge remote-tracking branch 'remotes/pmaydell/tags/pull-target-arm-20160512' into staging
target-arm queue:
 * blizzard, omap_lcdc: code cleanup to remove DEPTH != 32 dead code
 * QOMify various ARM devices
 * bcm2835_property: use cached values when querying framebuffer
 * hw/arm/nseries: don't allocate large sized array on the stack
 * fix LPAE descriptor address masking (only visible for EL2)
 * fix stage 2 exec permission handling for AArch32
 * first part of supporting syndrome info for data aborts to EL2
 * virt: NUMA support
 * work towards i.MX6 support
 * avoid unnecessary TLB flush on TCR_EL2, TCR_EL3 writes

# gpg: Signature made Thu 12 May 2016 14:29:14 BST using RSA key ID 14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg:                 aka "Peter Maydell <pmaydell@gmail.com>"
# gpg:                 aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"

* remotes/pmaydell/tags/pull-target-arm-20160512: (43 commits)
  hw/arm: QOM'ify versatilepb.c
  hw/arm: QOM'ify strongarm.c
  hw/arm: QOM'ify stellaris.c
  hw/arm: QOM'ify spitz.c
  hw/arm: QOM'ify pxa2xx_pic.c
  hw/arm: QOM'ify pxa2xx.c
  hw/arm: QOM'ify integratorcp.c
  hw/arm: QOM'ify highbank.c
  hw/arm: QOM'ify armv7m.c
  target-arm: Avoid unnecessary TLB flush on TCR_EL2, TCR_EL3 writes
  hw/display/blizzard: Remove blizzard_template.h
  hw/display/blizzard: Expand out macros
  i.MX: Add sabrelite i.MX6 emulation.
  i.MX: Add i.MX6 SOC implementation.
  i.MX: Add the Freescale SPI Controller
  FIFO: Add a FIFO32 implementation
  i.MX: Add i.MX6 System Reset Controller device.
  ARM: Factor out ARM on/off PSCI control functions
  ACPI: Virt: Generate SRAT table
  ACPI: move acpi_build_srat_memory to common place
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 15:55:45 +01:00
Gerd Hoffmann
a69fc693e9 gtk: update grab code for gtk 3.20
Fixes the remaining gtk 3.20 warnings.

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Cole Robinson <crobinso@redhat.com>
Tested-by: Cole Robinson <crobinso@redhat.com>
Message-id: 1463038146-13939-1-git-send-email-kraxel@redhat.com
2016-05-12 16:41:46 +02:00
Gonglei
28f4a7083d spice: fix coverity complains
Remove the unnecessary NULL check.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1463047028-123868-3-git-send-email-arei.gonglei@huawei.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-12 16:41:46 +02:00
Gonglei
f454f49c42 egl-helpers: fix possible resource leak
CID 1352419, using g_strdup_printf instead of asprintf.

Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1463047028-123868-2-git-send-email-arei.gonglei@huawei.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-12 16:41:46 +02:00
Md Haris Iqbal
42ddb8aa7c Changed malloc to g_malloc, free to g_free in ui/shader.c
Signed-off-by: Md Haris Iqbal <haris.phnx@gmail.com>
Message-id: 1459862499-4768-1-git-send-email-haris.phnx@gmail.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-12 16:41:46 +02:00
Gerd Hoffmann
39414ef4e9 spice/gl: add & use qemu_spice_gl_monitor_config
Cc: qemu-stable@nongnu.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
2016-05-12 16:41:46 +02:00
Michael S. Tsirkin
44b31e0bc4 ui/gtk: copy to clipboard support
This adds a menu item to copy current selection to clipboard.
Seems handy for copying out guest error messages.

Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Message-id: 1460924740-24513-1-git-send-email-mst@redhat.com

[ kraxel: fix build with CONFIG_VTE=n ]
[ kraxel: fix build with CONFIG_VTE=n, now for real ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-12 16:41:18 +02:00
Peter Maydell
6ddeeffffe Merge remote-tracking branch 'remotes/armbru/tags/pull-qapi-2016-05-12' into staging
QAPI patches for 2016-05-12

# gpg: Signature made Thu 12 May 2016 08:49:04 BST using RSA key ID EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg:                 aka "Markus Armbruster <armbru@pond.sub.org>"

* remotes/armbru/tags/pull-qapi-2016-05-12: (23 commits)
  qapi: Change visit_type_FOO() to no longer return partial objects
  qapi: Simplify semantics of visit_next_list()
  qapi: Fix string input visitor handling of invalid list
  tests/string-input-visitor: Add negative integer tests
  qapi: Split visit_end_struct() into pieces
  qmp: Tighten output visitor rules
  qmp: Don't reuse qmp visitor after grabbing output
  spapr_drc: Expose 'null' in qom-get when there is no fdt
  qmp: Support explicit null during visits
  qapi: Add visit_type_null() visitor
  tests: Add check-qnull
  qapi: Document visitor interfaces, add assertions
  qmp-input: Refactor when list is advanced
  qmp-input: Require struct push to visit members of top dict
  qom: Wrap prop visit in visit_start_struct
  qapi-commands: Wrap argument visit in visit_start_struct
  qmp-input: Don't consume input when checking has_member
  qapi: Use strict QMP input visitor in more places
  qapi: Consolidate QMP input visitor creation
  qmp-input: Clean up stack handling
  ...

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 15:06:38 +01:00
Kevin Wolf
efc2645f71 Merge remote-tracking branch 'mreitz/tags/pull-block-for-kevin-2016-05-12' into queue-block
Block patches for 2.7

# gpg: Signature made Thu May 12 15:34:13 2016 CEST using RSA key ID E838ACAD
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"

* mreitz/tags/pull-block-for-kevin-2016-05-12:
  qemu-iotests: iotests: fail hard if not run via "check"
  block: enable testing of LUKS driver with block I/O tests
  block: add support for encryption secrets in block I/O tests
  block: add support for --image-opts in block I/O tests
  qemu-io: Add 'write -z -u' to test MAY_UNMAP flag
  qemu-io: Add 'write -f' to test FUA flag
  qemu-io: Allow unaligned access by default
  qemu-io: Use bool for command line flags
  qemu-io: Make 'open' subcommand more like command line
  qemu-io: Add missing option documentation
  qmp: add monitor command to add/remove a child
  quorum: implement bdrv_add_child() and bdrv_del_child()
  Add new block driver interface to add/delete a BDS's child
  qemu-img: check block status of backing file when converting.
  iotests: fix the redirection order in 083

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:35:20 +02:00
Peter Maydell
f83b70f701 Merge remote-tracking branch 'remotes/kraxel/tags/pull-usb-20160511-1' into staging
usb: misc fixes

# gpg: Signature made Wed 11 May 2016 12:18:25 BST using RSA key ID D3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg:                 aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg:                 aka "Gerd Hoffmann (private) <kraxel@gmail.com>"

* remotes/kraxel/tags/pull-usb-20160511-1:
  usb: Support compilation without poll.h
  usb-mtp: fix usb_mtp_get_device_info so that libmtp on the guest doesn't complain
  usb:xhci: no DMA on HC reset

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 14:34:35 +01:00
Sascha Silbe
5a8fabf333 qemu-iotests: iotests: fail hard if not run via "check"
Running an iotests-based Python test directly might appear to work,
but may fail in subtle ways and is insecure:

- It creates files with predictable file names in a world-writable
  location (/var/tmp).

- Tests expect the environment to be set up by check. E.g. 041 and 055
  may take the wrong code paths if QEMU_DEFAULT_MACHINE is not
  set. This can lead to false negatives.

Instead fail hard and tell the user we want to be run via "check".

The actual environment expected by the tests is currently only defined
by the implementation of "check". We use two of the environment
variables set by "check" as indication of whether we're being run via
"check". Anyone writing their own test runner (replacing "check") will
need to replicate the full environment (in a broader sense, not just
environment variables) provided by "check" anyway, including setting
the two environment variables we check. Whereas a regular developer
just trying to invoke the tests usually won't have both of these
defined in their environment so we can catch their mistake and give
out useful advice.

Signed-off-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Reviewed-by: Bo Tu <tubo@linux.vnet.ibm.com>
Message-id: 1461094442-16014-1-git-send-email-silbe@linux.vnet.ibm.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 15:33:24 +02:00
Daniel P. Berrange
4e9b25fb05 block: enable testing of LUKS driver with block I/O tests
This adds support for testing the LUKS driver with the block
I/O test framework.

   cd tests/qemu-io-tests
   ./check -luks

A handful of test cases are modified to work with luks

 - 004 - whitelist luks format
 - 012 - use TEST_IMG_FILE instead of TEST_IMG for file ops
 - 048 - use TEST_IMG_FILE instead of TEST_IMG for file ops.
         don't assume extended image contents is all zeros,
         explicitly initialize with zeros
         Make file size smaller to avoid having to decrypt
         1 GB of data.
 - 052 - don't assume initial image contents is all zeros,
         explicitly initialize with zeros
 - 100 - don't assume initial image contents is all zeros,
         explicitly initialize with zeros

With this patch applied, the results are as follows:

  Passed: 001 002 003 004 005 008 009 010 011 012 021 032 043
          047 048 049 052 087 100 134 143
  Failed: 033 120 140 145
 Skipped: 007 013 014 015 017 018 019 020 022 023 024 025 026
          027 028 029 030 031 034 035 036 037 038 039 040 041
          042 043 044 045 046 047 049 050 051 053 054 055 056
          057 058 059 060 061 062 063 064 065 066 067 068 069
          070 071 072 073 074 075 076 077 078 079 080 081 082
          083 084 085 086 087 088 089 090 091 092 093 094 095
          096 097 098 099 101 102 103 104 105 107 108 109 110
          111 112 113 114 115 116 117 118 119 121 122 123 124
          128 129 130 131 132 133 134 135 136 137 138 139 141
          142 144 146 148 150 152

The reasons for the failed tests are:

 - 033 - needs adapting to use image opts syntax with blkdebug
         and test image in order to correctly set align property
 - 120 - needs adapting to use correct -drive syntax for luks
 - 140 - needs adapting to use correct -drive syntax for luks
 - 145 - needs adapting to use correct -drive syntax for luks

The vast majority of skipped tests are exercising code that is
qcow2 specific, though a couple could probably be usefully
enabled for luks too.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1462896689-18450-4-git-send-email-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 15:33:24 +02:00
Daniel P. Berrange
b7e875b2f9 block: add support for encryption secrets in block I/O tests
The LUKS block driver tests will require the ability to specify
encryption secrets with block devices. This requires using the
--object argument to qemu-img/qemu-io to create a 'secret'
object.

When the IMGKEYSECRET env variable is set, it provides the
password to be associated with a secret called 'keysec0'

The _qemu_img_wrapper function isn't modified as that needs
to cope with differing syntax for subcommands, so can't be
made to use the image opts syntax unconditionally.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1462896689-18450-3-git-send-email-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 15:33:24 +02:00
Daniel P. Berrange
076003f526 block: add support for --image-opts in block I/O tests
Currently all block tests use the traditional syntax for images
just specifying a filename. To support the LUKS driver without
resorting to JSON, the tests need to be able to use the new
--image-opts argument to qemu-img and qemu-io.

This introduces a new env variable IMGOPTSSYNTAX. If this is
set to 'true', then qemu-img/qemu-io should use --image-opts.

Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1462896689-18450-2-git-send-email-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 15:33:24 +02:00
Eric Blake
c2e001cc82 qemu-io: Add 'write -z -u' to test MAY_UNMAP flag
Make it easier to control whether the BDRV_REQ_MAY_UNMAP flag
can be passed through a write_zeroes command, by adding the '-u'
flag to qemu-io 'write -z' and 'aio_write -z'.  To be useful,
the device has to be opened with BDRV_O_UNMAP (done by default
in qemu-io, but can be made explicit with '-d unmap').

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1462677405-4752-7-git-send-email-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 15:33:24 +02:00
Eric Blake
770e0e0e80 qemu-io: Add 'write -f' to test FUA flag
Make it easier to test block drivers with BDRV_REQ_FUA in
.supported_write_flags, by adding the '-f' flag to qemu-io to
conditionally pass the flag through to specific writes ('write',
'write -z', 'writev', 'aio_write', 'aio_write -z'). You'll want
to use 'qemu-io -t none' to actually make -f useful (as
otherwise, the default writethrough mode automatically sets the
FUA bit on every write).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 1462677405-4752-6-git-send-email-eblake@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 15:33:24 +02:00
Eric Blake
093ea232b0 qemu-io: Allow unaligned access by default
There's no reason to require the user to specify a flag just so
they can pass in unaligned numbers.  Keep 'read -p' and 'write -p'
as no-ops so that I don't have to hunt down and update all users
of qemu-io, but otherwise make their behavior default as 'read' and
'write'.  Also fix 'write -z', 'readv', 'writev', 'writev',
'aio_read', 'aio_write', and 'aio_write -z'.  For now, 'read -b',
'write -b', and 'write -c' still require alignment (and 'multiwrite',
but that's slated to die soon).

qemu-iotest 23 is updated to match, as the only test that was
previously explicitly expecting an error on an unaligned request.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 1462677405-4752-5-git-send-email-eblake@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 15:33:24 +02:00
Eric Blake
dc38852aaa qemu-io: Use bool for command line flags
We require a C99 compiler; let's use it to express what we
really mean.

(Yes, we now have an instance of 'if (bool + bool + bool > 1)',
which, although semantically valid C, looks ugly; it gets
cleaned up later.)

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 1462677405-4752-4-git-send-email-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 15:33:24 +02:00
Eric Blake
b8d970f1a9 qemu-io: Make 'open' subcommand more like command line
The command line defaults to BDRV_O_UNMAP, but can use
-d to reset it.  Meanwhile, the 'open' subcommand was
defaulting to no discards, with no way to set it.

The command line has both -n and -tMODE to set a variety
of cache modes, but the 'open' subcommand had only -n.

The 'open' subcommand had no way to set BDRV_O_NATIVE_AIO.

Note that the 'reopen' subcommand uses '-c' where the
command line and 'open' use -t.  Making that consistent
would be a separate patch.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 1462677405-4752-3-git-send-email-eblake@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 15:33:24 +02:00
Eric Blake
e4e12bb26d qemu-io: Add missing option documentation
The Usage: summary is missing several options, but rather than
having to maintain it, it's simpler to just state [OPTIONS],
since the options are spelled out below.

Commit 499afa2 added --image-opts, but forgot to document it in
--help.  Likewise for commit 9e8f183 and -d/--discard.

Commit e3aff4f6 put "-o/--offset" in the long opts, but it has
never been honored.

Add a note that '-n' is short for '-t none'.

Commit 9a2d77ad killed the -C option, but forgot to undocument
it for the 'open' subcommand.

Finally, commit 10d9d75 removed -g/--growable, but forgot to
cull it from the valid short options.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 1462677405-4752-2-git-send-email-eblake@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 15:33:23 +02:00
Wen Congyang
7f82159769 qmp: add monitor command to add/remove a child
The new QMP command name is x-blockdev-change. It's just for adding/removing
quorum's child now, and doesn't support all kinds of children, all kinds of
operations, nor all block drivers. So it is experimental now.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1462865799-19402-4-git-send-email-xiecl.fnst@cn.fujitsu.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 15:33:23 +02:00
Wen Congyang
98292c61bc quorum: implement bdrv_add_child() and bdrv_del_child()
Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Message-id: 1462865799-19402-3-git-send-email-xiecl.fnst@cn.fujitsu.com
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 15:33:23 +02:00
Wen Congyang
e06018ad28 Add new block driver interface to add/delete a BDS's child
In some cases, we want to take a quorum child offline, and take
another child online.

Signed-off-by: Wen Congyang <wency@cn.fujitsu.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1462865799-19402-2-git-send-email-xiecl.fnst@cn.fujitsu.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 15:33:23 +02:00
Ren Kimura
263a6f4c3a qemu-img: check block status of backing file when converting.
When converting images, check the block status of its backing file chain
to avoid needlessly reading zeros.

Signed-off-by: Ren Kimura <rkx1209dev@gmail.com>
Message-id: 1461773098-20356-1-git-send-email-rkx1209dev@gmail.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 15:33:23 +02:00
Wei Jiangang
9036e87c74 iotests: fix the redirection order in 083
It should redirect stdout to /dev/null first,
then redirect stderr to whatever stdout currently points at.

Signed-off-by: Wei Jiangang <weijg.fnst@cn.fujitsu.com>
Message-id: 1461665601-14908-1-git-send-email-weijg.fnst@cn.fujitsu.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
2016-05-12 15:33:23 +02:00
Fam Zheng
aad0b7a0bf block: Inactivate all children
Currently we only inactivate the top BDS. Actually bdrv_inactivate
should be the opposite of bdrv_invalidate_cache.

Recurse into the whole subtree instead.

Because a node may have multiple parents, and because once
BDRV_O_INACTIVE is set for a node, further writes are not allowed, we
cannot interleave flag settings and .bdrv_inactivate calls (that may
submit write to other nodes in a graph) within a single pass. Therefore
two passes are used here.

Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Fam Zheng
c9e9e9c66c block: Drop superfluous invalidating bs->file from drivers
Now they are invalidated by the block layer, so it's not necessary to
do this in block drivers' implementations of .bdrv_invalidate_cache.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Fam Zheng
0d1c5c9160 block: Invalidate all children
Currently we only recurse to bs->file, which will miss the children in quorum
and VMDK.

Recurse into the whole subtree to avoid that.

Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Eric Blake
52a4650574 nbd: Simplify client FUA handling
Now that the block layer honors per-bds FUA support, we don't
have to duplicate the fallback flush at the NBD layer.  The
static function nbd_co_writev_flags() is no longer needed, and
the driver can just directly use nbd_client_co_writev().

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Eric Blake
465fe887cc block: Honor BDRV_REQ_FUA during write_zeroes
The block layer has a couple of cases where it can lose
Force Unit Access semantics when writing a large block of
zeroes, such that the request returns before the zeroes
have been guaranteed to land on underlying media.

SCSI does not support FUA during WRITESAME(10/16); FUA is only
supported if it falls back to WRITE(10/16).  But where the
underlying device is new enough to not need a fallback, it
means that any upper layer request with FUA semantics was
silently ignoring BDRV_REQ_FUA.

Conversely, NBD has situations where it can support FUA but not
ZERO_WRITE; when that happens, the generic block layer fallback
to bdrv_driver_pwritev() (or the older bdrv_co_writev() in qemu
2.6) was losing the FUA flag.

The problem of losing flags unrelated to ZERO_WRITE has been
latent in bdrv_co_do_write_zeroes() since commit aa7bfbff, but
back then, it did not matter because there was no FUA flag.  It
became observable when commit 93f5e6d8 paved the way for flags
that can impact correctness, when we should have been using
bdrv_co_writev_flags() with modified flags.  Compare to commit
9eeb6dd, which got flag manipulation right in
bdrv_co_do_zero_pwritev().

Symptoms: I tested with qemu-io with default writethrough cache
(which is supposed to use FUA semantics on every write), and
targetted an NBD client connected to a server that intentionally
did not advertise NBD_FLAG_SEND_FUA.  When doing 'write 0 512',
the NBD client sent two operations (NBD_CMD_WRITE then
NBD_CMD_FLUSH) to get the fallback FUA semantics; but when doing
'write -z 0 512', the NBD client sent only NBD_CMD_WRITE.

The fix is do to a cleanup bdrv_co_flush() at the end of the
operation if any step in the middle relied on a BDS that does
not natively support FUA for that step (note that we don't
need to flush after every operation, if the operation is broken
into chunks based on bounce-buffer sizing).  Each BDS gains a
new flag .supported_zero_flags, which parallels the use of
.supported_write_flags but only when accessing a zero write
operation (the flags MUST be different, because of SCSI having
different semantics based on WRITE vs. WRITESAME; and also
because BDRV_REQ_MAY_UNMAP only makes sense on zero writes).

Also fix some documentation to describe -ENOTSUP semantics,
particularly since iscsi depends on those semantics.

Down the road, we may want to add a driver where its
.bdrv_co_pwritev() honors all three of BDRV_REQ_FUA,
BDRV_REQ_ZERO_WRITE, and BDRV_REQ_MAY_UNMAP, and advertise
this via bs->supported_write_flags for blocks opened by that
driver; such a driver should NOT supply .bdrv_co_write_zeroes
nor .supported_zero_flags.  But none of the drivers touched
in this patch want to do that (the act of writing zeroes is
different enough from normal writes to deserve a second
callback).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Eric Blake
4df863f336 block: Make supported_write_flags a per-bds property
Pre-patch, .supported_write_flags lives at the driver level, which
means we are blindly declaring that all block devices using a
given driver will either equally support FUA, or that we need a
fallback at the block layer.  But there are drivers where FUA
support is a per-block decision: the NBD block driver is dependent
on the remote server advertising NBD_FLAG_SEND_FUA (and has
fallback code to duplicate the flush that the block layer would do
if NBD had not set .supported_write_flags); and the iscsi block
driver is dependent on the mode sense bits advertised by the
underlying device (and is currently silently ignoring FUA requests
if the underlying device does not support FUA).

The fix is to make supported flags as a per-BDS option, set during
.bdrv_open().  This patch moves the variable and fixes NBD and iscsi
to set it only conditionally; later patches will then further
simplify the NBD driver to quit duplicating work done at the block
layer, as well as tackle the fact that SCSI does not support FUA
semantics on WRITESAME(10/16) but only on WRITE(10/16).

Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Denis V. Lunev
2928abce6d qcow2: improve qcow2_co_write_zeroes()
There is a possibility that qcow2_co_write_zeroes() will be called
with the partial block. This could be synthetically triggered with
    qemu-io -c "write -z 32k 4k"
and can happen in the real life in qemu-nbd. The latter happens under
the following conditions:
    (1) qemu-nbd is started with --detect-zeroes=on and is connected to the
        kernel NBD client
    (2) third party program opens kernel NBD device with O_DIRECT
    (3) third party program performs write operation with memory buffer
        not aligned to the page
In this case qcow2_co_write_zeroes() is unable to perform the operation
and mark entire cluster as zeroed and returns ENOTSUP. Thus the caller
switches to non-optimized version and writes real zeroes to the disk.

The patch creates a shortcut. If the block is read as zeroes, f.e. if
it is unallocated, the request is extended to cover full block.
User-visible situation with this block is not changed. Before the patch
the block is filled in the image with real zeroes. After that patch the
block is marked as zeroed in metadata. Thus any subsequent changes in
backing store chain are not affected.

Kevin, thank you for a cool suggestion.

Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Eric Blake
7b1deac84e block: Kill unused sector-based blk_* functions
Now that there are no remaining clients, we can drop the
sector-based blk_read(), blk_write(), blk_aio_readv(), and
blk_aio_writev().  Sadly, there are still remaining
sector-based interfaces, such as blk_*discard(), or
blk_write_compressed(); those will have to wait for another
day.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Eric Blake
7b3f9712e1 qemu-io: Switch to byte-based block access
qemu-io is the last user of several sector-based interfaces.
This patch upgrades to the new interfaces under the hood,
then deletes the resulting dead code.  Note that for maximum
back-compat, while the -p option is no longer required to get
blk_pread(), it is still needed to allow for unaligned access;
this is because qemu-iotest 23 relies on qemu-io rejecting
unaligned accesses without -p.  A later patch may clean up the
interface to be more user-friendly, but it's better to separate
what's done under the hood from what the user sees.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Eric Blake
9166920a0b qemu-img: Switch to byte-based block access
Sector-based blk_write() should die; switch to byte-based
blk_pwrite() instead.  Likewise for blk_read().

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Eric Blake
bd31c214c3 nbd: Switch to byte-based block access
Sector-based blk_read() should die; switch to byte-based
blk_pread() instead.

Add a constant for our magic number 512, to make it obvious
that this size will NOT change even if BDRV_SECTOR_SIZE does,
even though the two happen to be the same for now.  Split
assignments from conditionals to keep checkpatch.pl happy.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Eric Blake
26a122d3d4 atapi: Switch to byte-based block access
Sector-based blk_read() should die; switch to byte-based
blk_pread() instead.

Add new defines ATAPI_SECTOR_BITS and ATAPI_SECTOR_SIZE to
use anywhere we were previously scaling BDRV_SECTOR_* by 4,
for better legibility.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Eric Blake
243e6f69c1 m25p80: Switch to byte-based block access
Sector-based blk_read() should die; switch to byte-based
blk_pread() instead.

Likewise for blk_aio_readv() and blk_aio_writev().

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Eric Blake
12c125cba9 sd: Switch to byte-based block access
Sector-based blk_write() should die; switch to byte-based
blk_pwrite() instead.  Likewise for blk_read().

Greatly simplifies the code, now that we let the block layer
take care of alignment and read-modify-write on our behalf :)
In fact, we no longer need to include 'buf' in the migration
stream (although we do have to ensure that the stream remains
compatible).

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Eric Blake
098e732dbe pflash: Switch to byte-based block access
Sector-based blk_write() should die; switch to byte-based
blk_pwrite() instead.  Likewise for blk_read().

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Eric Blake
441692ddd8 onenand: Switch to byte-based block access
Sector-based blk_write() should die; switch to byte-based
blk_pwrite() instead.  Likewise for blk_read().

This particular device picks its size during onenand_initfn(),
and can be at most 0x80000000 bytes; therefore, shifting an
'int sec' request to get back to a byte offset should never
overflow 32 bits.  But adding assertions to document that point
should not hurt.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Eric Blake
9fc0d361cc nand: Switch to byte-based block access
Sector-based blk_write() should die; switch to byte-based
blk_pwrite() instead.  Likewise for blk_read().

This file is doing some complex computations to map various
flash page sizes (256, 512, and 2048) atop generic uses of
512-byte sector operations.  Perhaps someone will want to tidy
up the file for fewer gymnastics in managing addresses and
offsets, and less wasteful visits of 256-byte pages, but it
was out of scope for this series, where I just went with the
mechanical conversion.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Eric Blake
a7a5b7c0fc fdc: Switch to byte-based block access
Sector-based blk_write() should die; switch to byte-based
blk_pwrite() instead.  Likewise for blk_read().

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Eric Blake
d00000f901 xen_disk: Switch to byte-based aio block access
Sector-based blk_aio_readv() and blk_aio_writev() should die; switch
to byte-based blk_aio_preadv() and blk_aio_pwritev() instead.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Eric Blake
b5772fdde4 virtio: Switch to byte-based aio block access
Sector-based blk_aio_readv() and blk_aio_writev() should die; switch
to byte-based blk_aio_preadv() and blk_aio_pwritev() instead.

The trace is modified at the same time, and nb_sectors is now
unused.  Fix a comment typo while in the vicinity.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Eric Blake
03c90063cc scsi-disk: Switch to byte-based aio block access
Sector-based blk_aio_readv() and blk_aio_writev() should die; switch
to byte-based blk_aio_preadv() and blk_aio_pwritev() instead.

As part of the cleanup, scsi_init_iovec() no longer needs to return
a value, and reword a comment.

[ kwolf: Fix read accounting change ]

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:09 +02:00
Eric Blake
d4f510eb3f ide: Switch to byte-based aio block access
Sector-based blk_aio_readv() and blk_aio_writev() should die; switch
to byte-based blk_aio_preadv() and blk_aio_pwritev() instead.

The patch had to touch multiple files at once, because dma_blk_io()
takes pointers to the functions, and ide_issue_trim() piggybacks on
the same interface (while ignoring offset under the hood).

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:08 +02:00
Eric Blake
60cb2fa7eb block: Introduce byte-based aio read/write
blk_aio_readv() and blk_aio_writev() are annoying in that they
can't access sub-sector granularity, and cannot pass flags.
Also, they require the caller to pass redundant information
about the size of the I/O (qiov->size in bytes must match
nb_sectors in sectors).

Add new blk_aio_preadv() and blk_aio_pwritev() functions to fix
the flaws. The next few patches will upgrade callers, then
finally delete the old interfaces.

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:08 +02:00
Eric Blake
983a160050 block: Switch blk_*write_zeroes() to byte interface
Sector-based blk_write() should die; convert the one-off
variant blk_write_zeroes() to use an offset/count interface
instead.  Likewise for blk_co_write_zeroes() and
blk_aio_write_zeroes().

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:08 +02:00
Eric Blake
b7d17f9fa4 block: Switch blk_read_unthrottled() to byte interface
Sector-based blk_read() should die; convert the one-off
variant blk_read_unthrottled().

Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:08 +02:00
Eric Blake
8341f00dc2 block: Allow BDRV_REQ_FUA through blk_pwrite()
We have several block drivers that understand BDRV_REQ_FUA,
and emulate it in the block layer for the rest by a full flush.
But without a way to actually request BDRV_REQ_FUA during a
pass-through blk_pwrite(), FUA-aware block drivers like NBD are
forced to repeat the emulation logic of a full flush regardless
of whether the backend they are writing to could do it more
efficiently.

This patch just wires up a flags argument; followup patches
will actually make use of it in the NBD driver and in qemu-io.

Signed-off-by: Eric Blake <eblake@redhat.com>
Acked-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:08 +02:00
Kevin Wolf
0e01b76e7c qemu-io: Fix memory leak in 'aio_write -z'
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
2016-05-12 15:22:08 +02:00
Janne Karhunen
f249924e96 Allow users to specify the vmdk virtual hardware version.
Vmdk images have metadata to indicate the vmware virtual
hardware version image was created/tested to run with.
Allow users to specify that version via new 'hwversion'
option.

[ kwolf: Adjust qemu-iotests common.filter ]

Signed-off-by: Janne Karhunen <Janne.Karhunen@gmail.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:08 +02:00
Zhou Jie
ed79f37d9b block: always compile-check debug prints
Files with conditional debug statements should ensure that the printf is
always compiled. This prevents bitrot of the format string of the debug
statement. And switch debug output to stderr.

Signed-off-by: Zhou Jie <zhoujie2011@cn.fujitsu.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:08 +02:00
Wei Jiangang
547cb1574e block: Fix typo in comment
s/imlement/implement/

Signed-off-by: Wei Jiangang <weijg.fnst@cn.fujitsu.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:08 +02:00
Kevin Wolf
e3ddef25e9 block: Remove BlockDriver.bdrv_read/write
There are no block drivers left that implement the old .bdrv_read/write
interface, so it can be removed now. This gets us rid of the
corresponding emulation functions, too.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12 15:22:08 +02:00
Kevin Wolf
4575eb496d vvfat: Implement .bdrv_co_preadv/pwritev interfaces
This doesn't really convert any of the actual vvfat logic to use
vectored I/O (and it's doubtful whether that would make sense), but
instead just adapts the wrappers to the modern interface.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12 15:22:08 +02:00
Kevin Wolf
513b0f026b vpc: Implement .bdrv_co_pwritev() interface
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12 15:22:08 +02:00
Kevin Wolf
d46b7cc680 vpc: Implement .bdrv_co_preadv() interface
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12 15:22:08 +02:00
Kevin Wolf
37b1d7d8c9 vmdk: Implement .bdrv_co_pwritev() interface
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12 15:22:08 +02:00
Kevin Wolf
f10cc24359 vmdk: Implement .bdrv_co_preadv() interface
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12 15:22:08 +02:00
Kevin Wolf
a844a2b0d4 vmdk: Add vmdk_find_offset_in_cluster()
This is a byte granularity version of vmdk_find_index_in_cluster().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12 15:22:08 +02:00
Kevin Wolf
fde9d56f5b vdi: Implement .bdrv_co_pwritev() interface
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12 15:22:08 +02:00
Kevin Wolf
0865bb6f04 vdi: Implement .bdrv_co_preadv() interface
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12 15:22:08 +02:00
Kevin Wolf
3edf1e73d5 dmg: Implement .bdrv_co_preadv() interface
This implements .bdrv_co_preadv() for the cloop block driver. While
updating the error paths, change -1 to a valid -errno code.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12 15:22:08 +02:00
Kevin Wolf
5cd230819e cloop: Implement .bdrv_co_preadv() interface
This implements .bdrv_co_preadv() for the cloop block driver. While
updating the error paths, change -1 to a valid -errno code.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12 15:22:08 +02:00
Kevin Wolf
3b8fd33011 bochs: Implement .bdrv_co_preadv() interface
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12 15:22:08 +02:00
Kevin Wolf
3fb06697ae block: Introduce .bdrv_co_preadv/pwritev BlockDriver function
Many parts of the block layer are already byte granularity. The block
driver interface, however, was still missing an interface that allows
making use of this. This patch introduces a new BlockDriver interface,
which is based on coroutines, vectored, has flags and uses a byte
granularity. This is now the preferred interface for new drivers.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12 15:22:08 +02:00
Kevin Wolf
cab3a3563c block: Rename bdrv_co_do_preadv/writev to bdrv_co_preadv/writev
It used to be an internal helper function just for implementing
bdrv_co_do_readv/writev(), but now that it's a public interface, it
deserves a name without "do" in it.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12 15:22:08 +02:00
Kevin Wolf
0884447382 block: Support AIO drivers in bdrv_driver_preadv/pwritev()
Instead of registering emulation functions as .bdrv_co_writev, just
directly check whether the function is there or not, and use the AIO
interface if it isn't. This makes the read/write functions more
consistent with how things are done in other places (flush, discard,
etc.)

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12 15:22:07 +02:00
Kevin Wolf
78a07294d5 block: Introduce bdrv_driver_pwritev()
This is a function that simply calls into the block driver for doing a
write, providing the byte granularity interface we want to eventually
have everywhere, and using whatever interface that driver supports.

This one is a bit more interesting than the version for reads: It adds
support for .bdrv_co_writev_flags() everywhere, so that drivers
implementing this function can drop .bdrv_co_writev() now.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12 15:22:07 +02:00
Kevin Wolf
166fe96051 block: Introduce bdrv_driver_preadv()
This is a function that simply calls into the block driver for doing a
read, providing the byte granularity interface we want to eventually
have everywhere, and using whatever interface that driver supports.

For now, this is just a wrapper for calling bs->drv->bdrv_co_readv().

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
2016-05-12 15:22:07 +02:00
Paolo Bonzini
dd7f7ed104 linux-aio: make it more type safe
Replace void* with an opaque LinuxAioState type.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:07 +02:00
Paolo Bonzini
6b98bd6495 block: plug whole tree at once, introduce bdrv_io_unplugged_begin/end
Extract the handling of io_plug "depth" from linux-aio.c and let the
main bdrv_drain loop do nothing but wait on I/O.

Like the two newly introduced functions, bdrv_io_plug and bdrv_io_unplug
now operate on all children.  The visit order is now symmetrical between
plug and unplug, making it possible for formats to implement plug/unplug.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:07 +02:00
Paolo Bonzini
ce0f141259 block: introduce bdrv_no_throttling_begin/end
Extract the handling of throttling from bdrv_flush_io_queue.  These
new functions will soon become BdrvChildRole callbacks, as they can
be generalized to "beginning of drain" and "end of drain".

Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:07 +02:00
Paolo Bonzini
b6e84c97ed block: extract bdrv_drain_poll/bdrv_co_yield_to_drain from bdrv_drain/bdrv_co_drain
Do not call bdrv_drain_recurse twice in bdrv_co_drain.  A small
tweak to the logic in Fam's patch, which is harmless since no
one implements bdrv_drain anyway.  But better get it right.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:07 +02:00
Paolo Bonzini
a72f641407 block: move restarting of throttled reqs to block/throttle-groups.c
We want to remove throttled_reqs from block/io.c.  This is the easy
part---hide the handling of throttled_reqs during disable/enable of
throttling within throttle-groups.c.

Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:07 +02:00
Paolo Bonzini
733bbc8cea block: make bdrv_start_throttled_reqs return void
The return value is unused and I am not sure why it would be useful.

Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:07 +02:00
Kevin Wolf
90c78624f1 block: Don't disable I/O throttling on sync requests
We had to disable I/O throttling with synchronous requests because we
didn't use to run timers in nested event loops when the code was
introduced. This isn't true any more, and throttling works just fine
even when using the synchronous API.

The removed code is in fact dead code since commit a8823a3b ('block: Use
blk_co_pwritev() for blk_write()') because I/O throttling can only be
set on the top layer, but BlockBackend always uses the coroutine
interface now instead of using the sync API emulation in block.c.

Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-Id: <1458660792-3035-2-git-send-email-kwolf@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
2016-05-12 15:22:07 +02:00
xiaoqiang.zhao
0bc91ab3bb hw/arm: QOM'ify versatilepb.c
Drop the use of old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:42:12 +01:00
xiaoqiang.zhao
5a67508c7a hw/arm: QOM'ify strongarm.c
Drop the use of old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:42:11 +01:00
xiaoqiang.zhao
15c4fff5d8 hw/arm: QOM'ify stellaris.c
* Drop the use of old SysBus init function and use instance_init
* Use DeviceClass::vmsd instead of 'vmstate_register' function

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:42:10 +01:00
xiaoqiang zhao
f68575c956 hw/arm: QOM'ify spitz.c
Drop the use of old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:42:09 +01:00
xiaoqiang.zhao
08ba3fde1d hw/arm: QOM'ify pxa2xx_pic.c
Remove the empty 'pxa2xx_pic_initfn' and it's
setup code in the 'pxa2xx_pic_class_init'

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:42:08 +01:00
xiaoqiang.zhao
16fb31a382 hw/arm: QOM'ify pxa2xx.c
Drop the use of old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:42:07 +01:00
xiaoqiang.zhao
a1f42e0c9a hw/arm: QOM'ify integratorcp.c
* Drop the use of old SysBus init function and use instance_init
* Remove the empty 'icp_pic_class_init' from Typeinfo

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:42:06 +01:00
xiaoqiang.zhao
ff7a27c15a hw/arm: QOM'ify highbank.c
Drop the use of old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:42:06 +01:00
xiaoqiang.zhao
3f5ab25490 hw/arm: QOM'ify armv7m.c
Drop the use of old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:40:48 +01:00
Peter Maydell
6459b94c26 target-arm: Avoid unnecessary TLB flush on TCR_EL2, TCR_EL3 writes
The TCR_EL2 and TCR_EL3 regdefs were incorrectly using the
vmsa_tcr_el1_write function for writes. Since these registers don't
have the A1 bit that TCR_EL1 does, we don't need to do a tlb_flush()
when they are written. Remove the unnecessary .writefn and also the
harmless but unneeded .raw_writefn and .resetfn definitions.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Sergey Fedorov <sergey.fedorov@linaro.org>
2016-05-12 13:22:30 +01:00
Peter Maydell
4274d821ff hw/display/blizzard: Remove blizzard_template.h
We no longer need to do the "multiply include this header" trick with
blizzard_template.h, and it is only used in a single .c file, so just
put its contents inline in blizzard.c.

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1462371352-21498-3-git-send-email-peter.maydell@linaro.org
2016-05-12 13:22:30 +01:00
Peter Maydell
5c8759087d hw/display/blizzard: Expand out macros
Now that we can assume that only depth 32 is possible, there's no need
for the COPY_PIXEL1 and PIXEL_TYPE macros, and the SKIP_PIXEL, COPY_PIXEL
and SWAP_WORDS macros aren't used at all. Expand out COPY_PIXEL1 and
PIXEL_TYPE where they are used, delete the unused macro definitions, and
expand out the uses of glue(name_prefix, DEPTH).

Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1462371352-21498-2-git-send-email-peter.maydell@linaro.org
2016-05-12 13:22:29 +01:00
Jean-Christophe DUBOIS
3a0f31bcb8 i.MX: Add sabrelite i.MX6 emulation.
The sabrelite supports one SPI FLASH memory on SPI1

Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:29 +01:00
Jean-Christophe DUBOIS
ec46eaa83a i.MX: Add i.MX6 SOC implementation.
For now we only support the following devices:
* up to 4 Cortex A9 cores
* A9 MPCORE (SCU, GIC, TWD)
* 5 i.MX UARTs
* 2 EPIT timers
* 1 GPT timer
* 3 I2C controllers
* 7 GPIO controllers
* 6 SDHC controllers
* 5 SPI controllers
* 1 CCM device
* 1 SRC device
* various ROM/RAM areas.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:29 +01:00
Jean-Christophe DUBOIS
c906a3a015 i.MX: Add the Freescale SPI Controller
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:29 +01:00
Jean-Christophe DUBOIS
53374b16a2 FIFO: Add a FIFO32 implementation
This one is build on top of the existing FIFO8

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:29 +01:00
Jean-Christophe DUBOIS
1983057470 i.MX: Add i.MX6 System Reset Controller device.
This controller is also present in i.MX5X devices but they are not
yet emulated by QEMU.

Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:28 +01:00
Jean-Christophe DUBOIS
825482adde ARM: Factor out ARM on/off PSCI control functions
Split ARM on/off function from PSCI support code.

This will allow to reuse these functions in other code.

Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:28 +01:00
Shannon Zhao
2b302e1e3c ACPI: Virt: Generate SRAT table
To support NUMA, it needs to generate SRAT ACPI table.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1461667229-9216-6-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:28 +01:00
Shannon Zhao
64b831367b ACPI: move acpi_build_srat_memory to common place
Move acpi_build_srat_memory to common place so that it could be reused
by ARM. Rename it to build_srat_memory.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1461667229-9216-5-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:28 +01:00
Shannon Zhao
ea9fcbd7d0 ACPI: Fix the definition of proximity in AcpiSratMemoryAffinity
ACPI spec says that Proximity Domain is an "Integer that represents
the proximity domain to which the processor belongs". So define it as a
uint32_t.

Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1461667229-9216-4-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:28 +01:00
Shannon Zhao
e6e400d54f ACPI: Add GICC Affinity Structure
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1461667229-9216-3-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:27 +01:00
Shannon Zhao
9695200ad8 ARM: Virt: Set numa-node-id for cpu and memory nodes
Generate memory nodes according to NUMA topology. Set numa-node-id
property for cpu and memory nodes.

Signed-off-by: Shannon Zhao <shannon.zhao@linaro.org>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1461667229-9216-2-git-send-email-zhaoshenglong@huawei.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:27 +01:00
xiaoqiang zhao
3c09d6caad hw/display: QOM'ify exynos4210_fimd.c
* Drop the old SysBus init function and use instance_init
* Move graphic_console_init into realize stage

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Message-id: 1462417489-28603-2-git-send-email-zxq_yx_007@163.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:27 +01:00
Edgar E. Iglesias
cd694521ca target-arm/translate-a64.c: Unify some of the ldst_reg decoding
The various load/store variants under disas_ldst_reg can all reuse the
same decoding for opc, size, rt and is_vector.

This patch unifies the decoding in preparation for generating
instruction syndromes for data aborts.
This will allow us to reduce the number of places to hook in updates
to the load/store state needed to generate the insn syndromes.

No functional change.

Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1461931684-1867-7-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:27 +01:00
Edgar E. Iglesias
026a19c312 target-arm/translate-a64.c: Use extract32 in disas_ldst_reg_imm9
Use extract32 instead of open coding the bit masking when decoding
is_signed and is_extended. This streamlines the decoding with some
of the other ldst variants.

No functional change.

Reviewed-by: Sergey Fedorov <serge.fdrv@gmail.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1461931684-1867-6-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:27 +01:00
Peter Maydell
094d028a79 target-arm: Split data abort syndrome generator
Split the data abort syndrome generator into two versions:
One with a valid Instruction Specific Syndrome (ISS) and another without.

The following new flags are supported by the syndrome generator
with ISS:
* isv - Instruction syndrome valid
* sas - Syndrome access size
* sse - Syndrome sign extend
* srt - Syndrome register transfer
* sf  - Sixty-Four bit register width
* ar  - Acquire/Release

These flags are not yet used, so this patch has no functional change
except that we will now correctly set the IL bit in data abort
syndromes without ISS information.

Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1461931684-1867-5-git-send-email-edgar.iglesias@gmail.com>
[PMM: squashed in with patch which was just adding the IL bit]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:26 +01:00
Edgar E. Iglesias
25caa94c4a gen-icount: Use tcg_set_insn_param
Use tcg_set_insn_param() instead of directly accessing internal
tcg data structures to update an insn param.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1461931684-1867-3-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:26 +01:00
Edgar E. Iglesias
1d41478fd4 tcg: Add tcg_set_insn_param
Add tcg_set_insn_param as a mechanism to modify an insn
parameter after emiting the insn. This is useful for icount
and also for embedding fault information for a specific insn.

Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1461931684-1867-2-git-send-email-edgar.iglesias@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:26 +01:00
Sergey Sorokin
dddb522341 target-arm: Fix descriptor address masking in ARM address translation
There is a bug in ARM address translation regime with a long-descriptor
format. On the descriptor reading its address is formed from an index
which is a part of the input address. And on the first iteration this index
is incorrectly masked with 'grainsize' mask. But it can be wider according
to pseudo-code.
On the other hand on the iterations other than first the descriptor address
is formed from the previous level descriptor by masking with 'descaddrmask'
value. It always clears just 12 lower bits, but it must clear 'grainsize'
lower bits instead according to pseudo-code.
The patch fixes both cases.

Signed-off-by: Sergey Sorokin <afarallax@yandex.ru>
Message-id: 1460996853-22117-1-git-send-email-afarallax@yandex.ru
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:26 +01:00
Sergey Sorokin
dfda68377e target-arm: Stage 2 permission fault was fixed in AArch32 state
As described in AArch32.CheckS2Permission an instruction fetch fails if
XN bit is set or there is no read permission for the address.

Signed-off-by: Sergey Sorokin <afarallax@yandex.ru>
Message-id: 1461002400-3187-1-git-send-email-afarallax@yandex.ru
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:26 +01:00
Zhou Jie
0b062eb090 hw/arm/nseries: Allocating Large sized arrays to heap
n8x0_init has a huge stack usage of 65536 bytes approx.
Moving large arrays to heap to reduce stack usage.

Signed-off-by: Zhou Jie <zhoujie2011@cn.fujitsu.com>
Message-id: 1461651308-894-1-git-send-email-zhoujie2011@cn.fujitsu.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:25 +01:00
Sylvain Garrigues
27a5dc7be6 bcm2835_property: use cached values when querying framebuffer
As the framebuffer settings are copied into the result message before it is
reconfigured, inconsistent behavior can happen when, for instance, you set with
a single message the width, height, and depth, and ask at the same time to
allocate the buffer and get the pitch and the size.

In this case, the reported pitch and size would be incorrect as they were
computed with the initial values of width, height and depth, not the ones the
client requested.

Signed-off-by: Sylvain Garrigues <sylvain@sylvaingarrigues.com>
Reviewed-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 1461325343-24995-1-git-send-email-sylvain@sylvaingarrigues.com
[PMM: folded a couple of long lines]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:25 +01:00
xiaoqiang zhao
0a750e2a78 hw/intc: QOM'ify omap_intc.c
* Split the old SysBus init into an instance_init and a
  DeviceClass::realize function
* Drop the old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:25 +01:00
xiaoqiang.zhao
22c70d8a6a hw/intc: QOM'ify grlib_irqmp.c
* Split the old SysBus init into an instance_init and a
  DeviceClass::realize function
* Drop the old SysBus init function

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: corrected "can not" to "cannot" in error message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:25 +01:00
xiaoqiang.zhao
c09008d2d3 hw/intc: QOM'ify slavio_intctl.c
Drop the old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:25 +01:00
xiaoqiang.zhao
e3be8b4f4f hw/intc: QOM'ify pl190.c
Drop the old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:25 +01:00
xiaoqiang.zhao
f777bda60f hw/intc: QOM'ify imx_avic.c
Drop the old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:24 +01:00
xiaoqiang.zhao
68d71616c0 hw/intc: QOM'ify exynos4210_gic.c
* Drop the old SysBus init function and use instance_init
* Split the exynos4210_irq_gate_init into an instance_init
  and a DeviceClass::realize function

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:24 +01:00
xiaoqiang.zhao
d3d5a6febd hw/intc: QOM'ify exynos4210_combiner.c
Drop the old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:24 +01:00
xiaoqiang.zhao
b46818e9e7 hw/intc: QOM'ify etraxfs_pic.c
Drop the old SysBus init function and use instance_init

Signed-off-by: xiaoqiang zhao <zxq_yx_007@163.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:24 +01:00
Pooja Dhannawat
ea644cf343 omap_lcdc: Remove support for DEPTH != 32
surface_bits_per_pixel() always returns 32
so, removing other dead code which is
based on DEPTH !== 32

Signed-off-by: Pooja Dhannawat <dhannawatpooja1@gmail.com>
Message-id: 1459260142-9144-1-git-send-email-dhannawatpooja1@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:24 +01:00
Pooja Dhannawat
5c87c4089a blizzard: Remove support for DEPTH != 32
Removing support for DEPTH != 32 from blizzard template header
and file that includes it, as macro DEPTH == 32 only used.

Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Pooja Dhannawat <dhannawatpooja1@gmail.com>
Message-id: 1458971873-2768-1-git-send-email-dhannawatpooja1@gmail.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 13:22:24 +01:00
Peter Maydell
26617924e9 Open 2.7 development tree
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
2016-05-12 12:35:25 +01:00
Aurelien Jarno
9d989c732b target-mips: fix call to memset in soft reset code
Recent versions of GCC report the following error when compiling
target-mips/helper.c:

  qemu/target-mips/helper.c:542:9: warning: ‘memset’ used with length
  equal to number of elements without multiplication by element size
  [-Wmemset-elt-size]

This is indeed correct and due to a wrong usage of sizeof(). Fix that.

Cc: Stefan Weil <sw@weilnetz.de>
Cc: Leon Alrae <leon.alrae@imgtec.com>
Cc: qemu-stable@nongnu.org
LP: https://bugs.launchpad.net/qemu/+bug/1577841
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Leon Alrae <leon.alrae@imgtec.com>
Signed-off-by: Leon Alrae <leon.alrae@imgtec.com>
2016-05-12 11:01:05 +01:00
Eric Blake
68ab47e4b4 qapi: Change visit_type_FOO() to no longer return partial objects
Returning a partial object on error is an invitation for a careless
caller to leak memory.  We already fixed things in an earlier
patch to guarantee NULL if visit_start fails ("qapi: Guarantee
NULL obj on input visitor callback error"), but that does not
help the case where visit_start succeeds but some other failure
happens before visit_end, such that we leak a partially constructed
object outside visit_type_FOO(). As no one outside the testsuite
was actually relying on these semantics, it is cleaner to just
document and guarantee that ALL pointer-based visit_type_FOO()
functions always leave a safe value in *obj during an input visitor
(either the new object on success, or NULL if an error is
encountered), so callers can now unconditionally use
qapi_free_FOO() to clean up regardless of whether an error occurred.

The decision is done by adding visit_is_input(), then updating the
generated code to check if additional cleanup is needed based on
the type of visitor in use.

Note that we still leave *obj unchanged after a scalar-based
visit_type_FOO(); I did not feel like auditing all uses of
visit_type_Enum() to see if the callers would tolerate a specific
sentinel value (not to mention having to decide whether it would
be better to use 0 or ENUM__MAX as that sentinel).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-25-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:55 +02:00
Eric Blake
d9f62dde13 qapi: Simplify semantics of visit_next_list()
The semantics of the list visit are somewhat baroque, with the
following pseudocode when FooList is used:

start()
for (prev = head; cur = next(prev); prev = &cur) {
    visit(&cur->value)
}

Note that these semantics (advance before visit) requires that
the first call to next() return the list head, while all other
calls return the next element of the list; that is, every visitor
implementation is required to track extra state to decide whether
to return the input as-is, or to advance.  It also requires an
argument of 'GenericList **' to next(), solely because the first
iteration might need to modify the caller's GenericList head, so
that all other calls have to do a layer of dereferencing.

Thankfully, we only have two uses of list visits in the entire
code base: one in spapr_drc (which completely avoids
visit_next_list(), feeding in integers from a different source
than uint8List), and one in qapi-visit.py.  That is, all other
list visitors are generated in qapi-visit.c, and share the same
paradigm based on a qapi FooList type, so we can refactor how
lists are laid out with minimal churn among clients.

We can greatly simplify things by hoisting the special case
into the start() routine, and flipping the order in the loop
to visit before advance:

start(head)
for (tail = *head; tail; tail = next(tail)) {
    visit(&tail->value)
}

With the simpler semantics, visitors have less state to track,
the argument to next() is reduced to 'GenericList *', and it
also becomes obvious whether an input visitor is allocating a
FooList during visit_start_list() (rather than the old way of
not knowing if an allocation happened until the first
visit_next_list()).  As a minor drawback, we now allocate in
two functions instead of one, and have to pass the size to
both functions (unless we were to tweak the input visitors to
cache the size to start_list for reuse during next_list, but
that defeats the goal of less visitor state).

The signature of visit_start_list() is chosen to match
visit_start_struct(), with the new parameters after 'name'.

The spapr_drc case is a virtual visit, done by passing NULL for
list, similarly to how NULL is passed to visit_start_struct()
when a qapi type is not used in those visits.  It was easy to
provide these semantics for qmp-output and dealloc visitors,
and a bit harder for qmp-input (several prerequisite patches
refactored things to make this patch straightforward).  But it
turned out that the string and opts visitors munge enough other
state during visit_next_list() to make it easier to just
document and require a GenericList visit for now; an assertion
will remind us to adjust things if we need the semantics in the
future.

Several pre-requisite cleanup patches made the reshuffling of
the various visitors easier; particularly the qmp input visitor.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-24-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:55 +02:00
Eric Blake
74f24cb630 qapi: Fix string input visitor handling of invalid list
As shown in the previous commit, the string input visitor was
treating bogus input as an empty list rather than an error.
Fix parse_str() to set errp, then the callers to exit early if
an error was reported.

Meanwhile, fix the testsuite to use the generated
qapi_free_int16List() instead of rolling our own, and to
validate the fixed behavior, while at the same time documenting
one more change that we'd like to make in a later patch (a
failed visit_start_list should guarantee a NULL pointer,
regardless of what things were on input).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-23-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:55 +02:00
Markus Armbruster
7337468385 tests/string-input-visitor: Add negative integer tests
Add two negative tests, one for int and one for int16List.  The latter
exposes a bug: nonsensical input results in an empty list instead of
an error.

Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1461325048-14122-1-git-send-email-armbru@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-22-git-send-email-eblake@redhat.com>
2016-05-12 09:47:55 +02:00
Eric Blake
15c2f669e3 qapi: Split visit_end_struct() into pieces
As mentioned in previous patches, we want to call visit_end_struct()
functions unconditionally, so that visitors can release resources
tied up since the matching visit_start_struct() without also having
to worry about error priority if more than one error occurs.

Even though error_propagate() can be safely used to ignore a second
error during cleanup caused by a first error, it is simpler if the
cleanup cannot set an error.  So, split out the error checking
portion (basically, input visitors checking for unvisited keys) into
a new function visit_check_struct(), which can be safely skipped if
any earlier errors are encountered, and leave the cleanup portion
(which never fails, but must be called unconditionally if
visit_start_struct() succeeded) in visit_end_struct().

Generated code in qapi-visit.c has diffs resembling:

|@@ -59,10 +59,12 @@ void visit_type_ACPIOSTInfo(Visitor *v,
|         goto out_obj;
|     }
|     visit_type_ACPIOSTInfo_members(v, obj, &err);
|-    error_propagate(errp, err);
|-    err = NULL;
|+    if (err) {
|+        goto out_obj;
|+    }
|+    visit_check_struct(v, &err);
| out_obj:
|-    visit_end_struct(v, &err);
|+    visit_end_struct(v);
| out:

and in qapi-event.c:

@@ -47,7 +47,10 @@ void qapi_event_send_acpi_device_ost(ACP
|         goto out;
|     }
|     visit_type_q_obj_ACPI_DEVICE_OST_arg_members(v, &param, &err);
|-    visit_end_struct(v, err ? NULL : &err);
|+    if (!err) {
|+        visit_check_struct(v, &err);
|+    }
|+    visit_end_struct(v);
|     if (err) {
|         goto out;

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-20-git-send-email-eblake@redhat.com>
[Conflict with a doc fixup resolved]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:55 +02:00
Eric Blake
56a6f02b8c qmp: Tighten output visitor rules
Tighten assertions in the QMP output visitor, so that:

- qmp_output_get_qobject() can only be called after pairing a
visit_end_* for every visit_start_* (rather than allowing it on
a partially built object)

- qmp_output_get_qobject() cannot be called unless at least one
visit_type_* or visit_start/visit_end pair has occurred since
creation/reset (the accidental return of NULL fixed by commit
ab8bf1d7 would have been much easier to diagnose)

- ensure that we are encountering the expected object or list
type, to provide protection against mismatched push(struct)/
pop(list) or push(list)/pop(struct), similar to the qmp-input
protection added in commit bdd8e6b5.

- ensure that except for the root, 'name' is non-null inside a
dict, and NULL inside a list (this may need changing later if
we add "name.0" support for better error messages for a list,
but for now it makes sure all users are at least consistent)

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-19-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:55 +02:00
Eric Blake
f2ff429bfa qmp: Don't reuse qmp visitor after grabbing output
The testsuite was the only client that attempted to reuse a
QmpOutputVisitor for a second visit after encountering an
error and/or calling qmp_output_get_qobject() on a first
visit.  The next patch is about to tighten the semantics to
be one-shot usage of the visitor, like all other visitors
(which will enable further simplifications down the road).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1462854006-24658-1-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:55 +02:00
Eric Blake
a543a554cf spapr_drc: Expose 'null' in qom-get when there is no fdt
Now that the QMP output visitor supports an explicit null
output, we should utilize it to make it easier to diagnose
the difference between a missing fdt ('null') vs. a
present-but-empty one ('{}').

(Note that this reverts the behavior of commit ab8bf1d, taking
us back to the behavior of commit 6c2f9a1 [which in turn
stemmed from a crash fix in 1d10b44]; but that this time,
the change is intentional and not an accidental side-effect.)

Signed-off-by: Eric Blake <eblake@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <1461879932-9020-17-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Eric Blake
3df016f185 qmp: Support explicit null during visits
Implement the new type_null() callback for the qmp input and
output visitors. While we don't yet have a use for this in QAPI
input (the generator will need some tweaks first), some
potential usages have already been discussed on the list.
Meanwhile, the output visitor could already output explicit null
via type_any, but this gives us finer control.

At any rate, it's easy to test that we can round-trip an explicit
null through manual use of visit_type_null() wrapped by a virtual
visit_start_struct() walk, even if we can't do the visit in a
QAPI type.  Repurpose the test_visitor_out_empty test,
particularly since a future patch will tighten semantics to
forbid use of qmp_output_get_qobject() without at least one
intervening visit_type_*.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-16-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Eric Blake
3bc97fd592 qapi: Add visit_type_null() visitor
Right now, qmp-output-visitor happens to produce a QNull result
if nothing is actually visited between the creation of the visitor
and the request for the resulting QObject.  A stronger protocol
would require that a QMP output visit MUST visit something.  But
to still be able to produce a JSON 'null' output, we need a new
visitor function that states our intentions.  Yes, we could say
that such a visit must go through visit_type_any(), but that
feels clunky.

So this patch introduces the new visit_type_null() interface and
its no-op interface in the dealloc visitor, and stubs in the
qmp visitors (the next patch will finish the implementation).
For the visitors that will not implement the callback, document
the situation. The code in qapi-visit-core unconditionally
dereferences the callback pointer, so that a segfault will inform
a developer if they need to implement the callback for their
choice of visitor.

Note that JSON has a primitive null type, with the single value
null; likewise with the QNull type for QObject; but for QAPI,
we just have the 'null' value without a null type.  We may
eventually want to add more support in QAPI for null (most likely,
we'd use it via an alternate type that permits 'null' or an
object); but we'll create that usage when we need it.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-15-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Eric Blake
7d7a337ec3 tests: Add check-qnull
Add a new test, for checking reference counting of qnull(). As
part of the new file, move a previous reference counting change
added in commit a861564 to a more logical place.

Note that while most of the check-q*.c leave visitor stuff to
the test-qmp-*-visitor.c, in this case we actually want the
visitor tests in our new file because we are validating the
reference count of qnull_, which is an internal detail that
test-qmp-*-visitor should not be peeking into (or put another
way, qnull() is the only special case where we don't have
independent allocation of a QObject, so none of the other
visitor tests require the layering violation present in this
test).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-14-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Eric Blake
adfb264c9e qapi: Document visitor interfaces, add assertions
The visitor interface for mapping between QObject/QemuOpts/string
and QAPI is scandalously under-documented, making changes to visitor
core, individual visitors, and users of visitors difficult to
coordinate.  Among other questions: when is it safe to pass NULL,
vs. when a string must be provided; which visitors implement which
callbacks; the difference between concrete and virtual visits.

Correct this by retrofitting proper contracts, and document where some
of the interface warts remain (for example, we may want to modify
visit_end_* to require the same 'obj' as the visit_start counterpart,
so the dealloc visitor can be simplified).  Later patches in this
series will tackle some, but not all, of these warts.

Add assertions to (partially) enforce the contract.  Some of these
were only made possible by recent cleanup commits.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-13-git-send-email-eblake@redhat.com>
[Doc fix from Eric squashed in]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Eric Blake
fcf3cb2178 qmp-input: Refactor when list is advanced
In the QMP input visitor, visiting a list traverses two objects:
the QAPI GenericList of the caller (which gets advanced in
visit_next_list() regardless of this patch), and the QList input
that we are converting to QAPI.  For consistency with QDict
visits, we want to consume elements from the input QList during
the visit_type_FOO() for the list element; that is, we want ALL
the code for consuming an input to live in qmp_input_get_object(),
rather than having it split according to whether we are visiting
a dict or a list.  Making qmp_input_get_object() the common point
of consumption will make it easier for a later patch to refactor
visit_start_list() to cover the GenericList * head of a QAPI list,
and in turn will get rid of the 'first' flag (which lived in
qmp_input_next_list() pre-patch, and is hoisted to StackObject
by this patch).

This patch is therefore altering the post-condition use of 'entry',
while keeping what gets visited unchanged, from:

        start_list next_list type_ELT ... next_list type_ELT next_list end_list
 visits                      1st elt                last elt
 entry  NULL       1st elt   1st elt      last elt  last elt NULL      gone

where type_ELT() returns (entry ? entry : 1st elt) and next_list() steps
entry

to this usage:

        start_list next_list type_ELT ... next_list type_ELT next_list end_list
 visits                      1st elt                last elt
 entry  1st elt    1nd elt   2nd elt      last elt  NULL     NULL      gone

where type_ELT() steps entry and returns the old entry, and next_list()
leaves entry alone.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-12-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Eric Blake
ce140b1769 qmp-input: Require struct push to visit members of top dict
Don't embed the root of the visit into the stack of current
containers being visited.  That way, we no longer get confused
on whether the first visit of a dictionary is to the dictionary
itself or to one of the members of the dictionary, based on
whether the caller passed name=NULL; and makes the QMP Input
visitor like other visitors where the value of 'name' is now
ignored on the root visit.  (We may someday want to revisit
the rules on what 'name' should be on a top-level visit,
rather than just ignoring it; but that would be the topic of
another patch).

An audit of all qmp_input_visitor_new() call sites shows that
there were only two places where callers had previously been
visiting to a QDict with a non-NULL name to bypass a call to
visit_start_struct(), and those were fixed in prior patches.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-11-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Eric Blake
ad739706bb qom: Wrap prop visit in visit_start_struct
The qmp-input visitor was allowing callers to play rather fast
and loose: when visiting a QDict, you could grab members of the
root dictionary without first pushing into the dict; the final
such culprit was the QOM code for converting to and from object
properties.  But we are about to tighten the input visitor, at
which point user_creatable_add_type() as called with a QMP input
visitor via qmp_object_add() MUST follow the same paradigms as
everyone else, of pushing into the struct before grabbing its
keys.

The use of 'err ? NULL : &err' is temporary; a later patch will
clean that up when it splits visit_end_struct().

Furthermore, note that both callers always pass qdict, so we can
convert the conditional into an assert and reduce indentation.

The change has no impact to the testsuite now, but is required to
avoid a failure in tests/test-netfilter once qmp-input is made
stricter to detect inconsistent 'name' arguments on the root visit.

Since user_creatable_add_type() is also called with OptsVisitor
through user_creatable_add_opts(), we must also check that there
is no negative impact there; both pre- and post-patch, we see:

$ ./x86_64-softmmu/qemu-system-x86_64 -nographic -nodefaults -qmp stdio -object secret,id=sec0,data=letmein,format=raw,foo=bar
qemu-system-x86_64: -object secret,id=sec0,data=letmein,format=raw,foo=bar: Property '.foo' not found

That is, the only new checking that the new visit_end_struct() can
perform is for excess input, but we already catch excess input
earlier in object_property_set().

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-10-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Eric Blake
ed84153519 qapi-commands: Wrap argument visit in visit_start_struct
The qmp-input visitor was allowing callers to play rather fast
and loose: when visiting a QDict, you could grab members of the
root dictionary without first pushing into the dict; among the
culprit callers was the generated marshal code on the 'arguments'
dictionary of a QMP command.  But we are about to tighten the
input visitor, at which point the generated marshal code MUST
follow the same paradigms as everyone else, of pushing into the
struct before grabbing its keys.

Generated code grows as follows:

|@@ -515,7 +641,12 @@ void qmp_marshal_blockdev_backup(QDict *
|     BlockdevBackup arg = {0};
|
|     v = qmp_input_get_visitor(qiv);
|+    visit_start_struct(v, NULL, NULL, 0, &err);
|+    if (err) {
|+        goto out;
|+    }
|     visit_type_BlockdevBackup_members(v, &arg, &err);
|+    visit_end_struct(v, err ? NULL : &err);
|     if (err) {
|         goto out;
|     }
|@@ -527,7 +715,9 @@ out:
|     qmp_input_visitor_cleanup(qiv);
|     qdv = qapi_dealloc_visitor_new();
|     v = qapi_dealloc_get_visitor(qdv);
|+    visit_start_struct(v, NULL, NULL, 0, NULL);
|     visit_type_BlockdevBackup_members(v, &arg, NULL);
|+    visit_end_struct(v, NULL);
|     qapi_dealloc_visitor_cleanup(qdv);
| }

The use of 'err ? NULL : &err' is temporary; a later patch will
clean that up when it splits visit_end_struct().

Prior to this patch, the fact that there was no final
visit_end_struct() meant that even though we are using a strict
input visit, the marshalling code was not detecting excess input
at the top level (only in nested levels).  Fortunately, we have
code in monitor.c:qmp_check_client_args() that also checks for
no excess arguments at the top level.  But as the generated code
is more compact than the manual check, a later patch will clean
up monitor.c to drop the redundancy added here.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-9-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Eric Blake
e5826a2fd7 qmp-input: Don't consume input when checking has_member
Commit e8316d7 mistakenly passed consume=true within
qmp_input_optional() when checking if an optional member was
present, but the mistake was silently ignored since the code
happily let us extract a member more than once.  Fix
qmp_input_optional() to not consume anything, then tighten up
the input visitor to ensure that a member is consumed exactly
once (all generated code follows this pattern; and the new
assert will catch any hand-written code that tries to visit
the same key more than once).

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-8-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Eric Blake
240f64b6dc qapi: Use strict QMP input visitor in more places
The following uses of a QMP input visitor should be strict
(that is, excess keys in QDict input should be flagged if not
converted to QAPI):

- Testsuite code unrelated to explicitly testing non-strict
mode (test-qmp-commands, test-visitor-serialization); since
we want more code to be strict by default, having more tests
of strict mode doesn't hurt

- Code used for cloning QAPI objects (replay-input.c,
qemu-sockets.c); we are reparsing a QObject just barely
produced by the qmp output visitor and which therefore should
not have any garbage, so while it is extra work to be strict,
it validates that our clone is correct [note that a later patch
series will simplify these two uses by creating an actual
clone visitor that is much more efficient than a
generate/reparse cycle]

- qmp_object_add(), which calls into user_creatable_add_type().
Since command line parsing for '-object' uses the same
user_creatable_add_type() through the OptsVisitor, and that is
always strict, we want to ensure that any nested dictionaries
would be treated the same in QMP and from the command line (I
don't actually know if such nested dictionaries exist).  Note
that on this code change, strictness only matters for nested
dictionaries (if even possible), since we already flag excess
input at the top level during an earlier object_property_set()
on an unknown key, whether from QemuOpts:

$ ./x86_64-softmmu/qemu-system-x86_64 -nographic -nodefaults -qmp stdio -object secret,id=sec0,data=letmein,format=raw,foo=bar
qemu-system-x86_64: -object secret,id=sec0,data=letmein,format=raw,foo=bar: Property '.foo' not found

or from QMP:

$ ./x86_64-softmmu/qemu-system-x86_64 -nographic -nodefaults -qmp stdio
{"QMP": {"version": {"qemu": {"micro": 93, "minor": 5, "major": 2}, "package": ""}, "capabilities": []}}
{"execute":"qmp_capabilities"}
{"return": {}}
{"execute":"object-add","arguments":{"qom-type":"secret","id":"sec0","props":{"format":"raw","data":"letmein","foo":"bar"}}}
{"error": {"class": "GenericError", "desc": "Property '.foo' not found"}}

The only remaining uses of non-strict input visits are:

- QMP 'qom-set' (which eventually executes
object_property_set_qobject()) - mark it as something to revisit
in the future (I didn't want to spend any more time on this patch
auditing if we have any QOM dictionary properties that might be
impacted, and couldn't easily prove whether this code path is
shared with anything else).

- test-qmp-input-visitor: explicit tests of non-strict mode. If
we later get rid of users that don't need strictness, then this
test should be merged with test-qmp-input-strict

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-7-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Eric Blake
fc471c18d5 qapi: Consolidate QMP input visitor creation
Rather than having two separate ways to create a QMP input
visitor, where the safer approach has the more verbose name,
it is better to consolidate things into a single function
where the caller must explicitly choose whether to be strict
or to ignore excess input.  This patch is the strictly
mechanical conversion; the next patch will then audit which
uses can be made stricter.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-6-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Eric Blake
b471d012e5 qmp-input: Clean up stack handling
Management of the top of stack was a bit verbose; creating a
temporary variable and adding some comments makes the existing
code more legible before the next few patches improve things.
No semantic changes other than asserting that we are always
visiting a QObject, and not a NULL value.  In particular, the
check for 'name && qobject_type(qobj) == QTYPE_QDICT)' is a
bit overkill (a dict visit should always have a name); a later
patch revisits that, while this patch is only changing one
layer of indentation due to dropping 'if (qobj)'.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-5-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Eric Blake
42a502a7a6 qmp: Drop dead command->type
Ever since QMP was first added back in commit 43c20a43, we have
never had any QmpCommandType other than QCT_NORMAL.  It's
pointless to carry around the cruft.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-4-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Eric Blake
e58d695e6c qapi: Guarantee NULL obj on input visitor callback error
Our existing input visitors were not very consistent on errors in a
function taking 'TYPE **obj'.  These are start_struct(),
start_alternate(), type_str(), and type_any().  next_list() is
similar, but can't fail (see commit 08f9541).  While all of them set
'*obj' to allocated storage on success, it was not obvious whether
'*obj' was guaranteed safe on failure, or whether it was left
uninitialized.  But a future patch wants to guarantee that
visit_type_FOO() does not leak a partially-constructed obj back to
the caller; it is easier to implement this if we can reliably state
that input visitors assign '*obj' regardless of success or failure,
and that on failure *obj is NULL.  Add assertions to enforce
consistency in the final setting of err vs. *obj.

The opts-visitor start_struct() doesn't set an error, but it
also was doing a weird check for 0 size; all callers pass in
non-zero size if obj is non-NULL.

The testsuite has at least one spot where we no longer need
to pre-initialize a variable prior to a visit; valgrind confirms
that the test is still fine with the cleanup.

A later patch will document the design constraint implemented
here.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-3-git-send-email-eblake@redhat.com>
[visit_start_alternate()'s assertion tightened, commit message tweaked]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Eric Blake
983f52d4b3 qapi-visit: Add visitor.type classification
We have three classes of QAPI visitors: input, output, and dealloc.
Currently, all implementations of these visitors have one thing in
common based on their visitor type: the implementation used for the
visit_type_enum() callback.  But since we plan to add more such
common behavior, in relation to documenting and further refining
the semantics, it makes more sense to have the visitor
implementations advertise which class they belong to, so the common
qapi-visit-core code can use that information in multiple places.

A later patch will better document the types of visitors directly
in visitor.h.

For this patch, knowing the class of a visitor implementation lets
us make input_type_enum() and output_type_enum() become static
functions, by replacing the callback function Visitor.type_enum()
with the simpler enum member Visitor.type.  Share a common
assertion in qapi-visit-core as part of the refactoring.

Move comments in opts-visitor.c to match the refactored layout.

Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1461879932-9020-2-git-send-email-eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
2016-05-12 09:47:54 +02:00
Stefan Weil
a277c3e094 usb: Support compilation without poll.h
This is a hack to support compilation with Mingw-w64 which provides
a libusb-1.0 package, but no poll.h.

Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-id: 1458630800-10088-1-git-send-email-sw@weilnetz.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-11 10:37:39 +02:00
Isaac Lozano
1f66fe5778 usb-mtp: fix usb_mtp_get_device_info so that libmtp on the guest doesn't complain
If an application uses libmtp on the guest system,
it will complain with the warning message:
LIBMTP WARNING: VendorExtensionID: ffffffff
LIBMTP WARNING: VendorExtensionDesc: (null)
LIBMTP WARNING: this typically means the device is PTP (i.e. a camera) but
not a MTP device at all. Trying to continue anyway.

This is because libmtp expects a MTP Vendor Extension ID of 0x00000006 and a
MTP Version of 0x0064. These numbers are taken from Microsoft's MTP Vendor
Extension Identification Message page and are what most physical devices
show.

Signed-off-by: Isaac Lozano <109lozanoi@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1460892593-5908-1-git-send-email-109lozanoi@gmail.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-11 10:33:49 +02:00
Roman Kagan
491d68d938 usb:xhci: no DMA on HC reset
This patch is a rough fix to a memory corruption we are observing when
running VMs with xhci USB controller and OVMF firmware.

Specifically, on the following call chain

xhci_reset
  xhci_disable_slot
    xhci_disable_ep
      xhci_set_ep_state

QEMU overwrites guest memory using stale guest addresses.

This doesn't happen when the guest (firmware) driver sets up xhci for
the first time as there are no slots configured yet.  However when the
firmware hands over the control to the OS some slots and endpoints are
already set up with their context in the guest RAM.  Now the OS' driver
resets the controller again and xhci_set_ep_state then reads and writes
that memory which is now owned by the OS.

As a quick fix, skip calling xhci_set_ep_state in xhci_disable_ep if the
device context base address array pointer is zero (indicating we're in
the HC reset and no DMA is possible).

Cc: qemu-stable@nongnu.org
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-id: 1462384435-1034-1-git-send-email-rkagan@virtuozzo.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-11 10:29:28 +02:00
Cole Robinson
bb732ee78c ui: gtk: Fix some deprecation warnings
All device manager APIs are deprecated now. Much of our usage is
just to get the current pointer, so centralize that logic and use
the new seat APIs

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Message-id: d6dec24220a4e1449a0172119c10c48e145c0f6f.1462557436.git.crobinso@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-11 08:02:41 +02:00
Cole Robinson
84e2dc4bf3 ui: gtk: Fix a runtime warning on vte >= 0.37
inner-border was dropped in vte API 2.91, in favor of the standard
padding style

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Message-id: 60a6cdc337d611d902f53907e66a8f37ea374d65.1462557436.git.crobinso@redhat.com

[ kraxel: Fix warning with old vte version. ]

Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-11 08:02:41 +02:00
Cole Robinson
c6feff9e09 configure: support vte-2.91
vte >= 0.37 expores API version 2.91, which is where all the active
development is. qemu builds and runs fine with that version, so use it
if it's available.

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Message-id: b4f0375647f7b368d3dbd3834aee58cb0253566a.1462557436.git.crobinso@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-11 08:02:40 +02:00
Cole Robinson
d6a6dba359 configure: report SDL version
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Message-id: 98e4a3b98dc824bfaff96db43b172272c780c15f.1462557436.git.crobinso@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-11 08:02:40 +02:00
Cole Robinson
f2a4e54828 configure: report GTK version
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Message-id: 4c464e20d69fdcf21927ceed31a8d749b4af0c49.1462557436.git.crobinso@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-11 08:02:40 +02:00
Cole Robinson
02d34f62fd configure: add echo_version helper
Simplifies printing library versions, dependent on if the library
was even found

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Message-id: 3c9ab16123e06bb4109771ef6ee8acd82d449ba0.1462557436.git.crobinso@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-11 08:02:40 +02:00
Cole Robinson
e07047cfd7 configure: error on unknown --with-sdlabi value
I accidentally tried --with-sdlabi="1.0", and it failed much later in
a weird way. Instead, throw an error if the value isn't in our
whitelist.

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Message-id: 60e4822e17697d257a914df03bdb9fff4b4c0490.1462557436.git.crobinso@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-11 08:02:40 +02:00
Cole Robinson
ee8466d0ea configure: build SDL if only SDL2 available
Right now if SDL2 is installed but not SDL1, default configure will
entirely disable SDL. Check upfront for SDL2 using pkg-config, but
still prefer SDL1 if both versions are installed.

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Message-id: c9e570b5964d128a3595efe3170129a3da459776.1462557436.git.crobinso@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-11 08:02:40 +02:00
Cole Robinson
56f289f383 ui: sdl2: Release grab before opening console window
sdl 2.0.4 currently has a bug which causes our UI shortcuts to fire
rapidly in succession:

  https://bugzilla.libsdl.org/show_bug.cgi?id=3287

It's a toss up whether ctrl+alt+f or ctrl+alt+2 will fire an
odd or even number of times, thus determining whether the action
succeeds or fails.

Opening monitor/serial windows is doubly broken, since it will often
lock the UI trying to grab the pointer:

  0x00007fffef3720a5 in SDL_Delay_REAL () at /lib64/libSDL2-2.0.so.0
  0x00007fffef3688ba in X11_SetWindowGrab () at /lib64/libSDL2-2.0.so.0
  0x00007fffef2f2da7 in SDL_SendWindowEvent () at /lib64/libSDL2-2.0.so.0
  0x00007fffef2f080b in SDL_SetKeyboardFocus () at /lib64/libSDL2-2.0.so.0
  0x00007fffef35d784 in X11_DispatchFocusIn.isra.8 () at /lib64/libSDL2-2.0.so.0
  0x00007fffef35dbce in X11_DispatchEvent () at /lib64/libSDL2-2.0.so.0
  0x00007fffef35ee4a in X11_PumpEvents () at /lib64/libSDL2-2.0.so.0
  0x00007fffef2eea6a in SDL_PumpEvents_REAL () at /lib64/libSDL2-2.0.so.0
  0x00007fffef2eeab5 in SDL_WaitEventTimeout_REAL () at /lib64/libSDL2-2.0.so.0
  0x000055555597eed0 in sdl2_poll_events (scon=0x55555876f928) at ui/sdl2.c:593

We can work around that hang by ungrabbing the pointer before launching
a new window. This roughly matches what our sdl1 code does

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Message-id: 31c9ab6540b031f7a614c59edcecea9877685612.1462557436.git.crobinso@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-11 08:02:40 +02:00
Cole Robinson
4fd811a6bd ui: gtk: fix crash when terminal inner-border is NULL
VTE terminal inner-border can be NULL. The vte-0.36 (API 2.90)
code checks for the condition too so I assume it's not just a bug

Fixes a crash on Fedora 24 with gtk 3.20

Signed-off-by: Cole Robinson <crobinso@redhat.com>
Message-id: 2b2e85d403e8760ea53afd735a170500d5c17716.1462557436.git.crobinso@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
2016-05-11 08:02:40 +02:00
1440 changed files with 66843 additions and 25560 deletions

2
.gitignore vendored
View File

@@ -5,6 +5,7 @@
/config-target.*
/config.status
/config-temp
/trace-events-all
/trace/generated-tracers.h
/trace/generated-tracers.c
/trace/generated-tracers-dtrace.h
@@ -108,4 +109,5 @@
cscope.*
tags
TAGS
docker-src.*
*~

View File

@@ -17,6 +17,7 @@ addons:
- libgtk-3-dev
- libiscsi-dev
- liblttng-ust-dev
- libnfs-dev
- libncurses5-dev
- libnss3-dev
- libpixman-1-dev
@@ -63,9 +64,6 @@ script:
- make -j3 && ${TEST_CMD}
matrix:
include:
# Sparse is GCC only
- env: CONFIG="--enable-sparse"
compiler: gcc
# gprof/gcov are GCC features
- env: CONFIG="--enable-gprof --enable-gcov --disable-pie"
compiler: gcc
@@ -88,3 +86,13 @@ matrix:
- env: CONFIG=""
os: osx
compiler: clang
- env: CONFIG=""
sudo: required
addons:
dist: trusty
compiler: gcc
before_install:
- sudo apt-get update -qq
- sudo apt-get build-dep -qq qemu
- wget -O - http://people.linaro.org/~alex.bennee/qemu-submodule-git-seed.tar.xz | tar -xvJ
- git submodule update --init --recursive

View File

@@ -165,6 +165,7 @@ F: hw/openrisc/
F: tests/tcg/openrisc/
PowerPC
M: David Gibson <david@gibson.dropbear.id.au>
M: Alexander Graf <agraf@suse.de>
L: qemu-ppc@nongnu.org
S: Maintained
@@ -188,8 +189,8 @@ F: hw/sh4/
F: disas/sh4.c
SPARC
M: Blue Swirl <blauwirbel@gmail.com>
M: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
M: Artyom Tarasenko <atar4qemu@gmail.com>
S: Maintained
F: target-sparc/
F: hw/sparc/
@@ -597,7 +598,7 @@ F: hw/pci-host/grackle.c
F: hw/misc/macio/
PReP
M: Andreas Färber <andreas.faerber@web.de>
L: qemu-devel@nongnu.org
L: qemu-ppc@nongnu.org
S: Odd Fixes
F: hw/ppc/prep.c
@@ -779,6 +780,7 @@ F: hw/ipack/
PCI
M: Michael S. Tsirkin <mst@redhat.com>
M: Marcel Apfelbaum <marcel@redhat.com>
S: Supported
F: include/hw/pci/*
F: hw/misc/pci-testdev.c
@@ -886,12 +888,13 @@ F: include/hw/virtio/
virtio-9p
M: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
M: Greg Kurz <gkurz@linux.vnet.ibm.com>
M: Greg Kurz <groug@kaod.org>
S: Supported
F: hw/9pfs/
F: fsdev/
F: tests/virtio-9p-test.c
T: git git://github.com/kvaneesh/QEMU.git
T: git git://github.com/gkurz/qemu.git 9p-next
virtio-blk
M: Stefan Hajnoczi <stefanha@redhat.com>
@@ -953,6 +956,14 @@ S: Maintained
F: hw/*/xilinx_*
F: include/hw/xilinx.h
Network packet abstractions
M: Dmitry Fleytman <dmitry@daynix.com>
S: Maintained
F: include/net/eth.h
F: net/eth.c
F: hw/net/net_rx_pkt*
F: hw/net/net_tx_pkt*
Vmware
M: Dmitry Fleytman <dmitry@daynix.com>
S: Maintained
@@ -972,6 +983,16 @@ F: hw/acpi/nvdimm.c
F: hw/mem/nvdimm.c
F: include/hw/mem/nvdimm.h
e1000x
M: Dmitry Fleytman <dmitry@daynix.com>
S: Maintained
F: hw/net/e1000x*
e1000e
M: Dmitry Fleytman <dmitry@daynix.com>
S: Maintained
F: hw/net/e1000e*
Subsystems
----------
Audio
@@ -1046,7 +1067,7 @@ S: Supported
F: scripts/coverity-model.c
CPU
M: Andreas Färber <afaerber@suse.de>
L: qemu-devel@nongnu.org
S: Supported
F: qom/cpu.c
F: include/qom/cpu.h
@@ -1105,7 +1126,6 @@ F: ui/
F: include/ui/
Cocoa graphics
M: Andreas Färber <andreas.faerber@web.de>
M: Peter Maydell <peter.maydell@linaro.org>
S: Odd Fixes
F: ui/cocoa.m
@@ -1242,7 +1262,6 @@ F: docs/tracing.txt
T: git git://github.com/stefanha/qemu.git tracing
Checkpatch
M: Blue Swirl <blauwirbel@gmail.com>
S: Odd Fixes
F: scripts/checkpatch.pl
@@ -1315,8 +1334,7 @@ F: thunk.c
F: user-exec.c
BSD user
M: Blue Swirl <blauwirbel@gmail.com>
S: Maintained
S: Orphan
F: bsd-user/
Linux user
@@ -1379,8 +1397,7 @@ F: tcg/s390/
F: disas/s390.c
SPARC target
M: Blue Swirl <blauwirbel@gmail.com>
S: Maintained
S: Odd Fixes
F: tcg/sparc/
F: disas/sparc.c
@@ -1400,9 +1417,8 @@ S: Orphan
Stable 0.15
L: qemu-stable@nongnu.org
M: Andreas Färber <afaerber@suse.de>
T: git git://git.qemu-project.org/qemu-stable-0.15.git
S: Supported
S: Orphan
Stable 0.14
L: qemu-stable@nongnu.org
@@ -1616,3 +1632,10 @@ Build system architecture
M: Daniel P. Berrange <berrange@redhat.com>
S: Odd Fixes
F: docs/build-system.txt
Docker testing
--------------
Docker based testing framework and cases
M: Fam Zheng <famz@redhat.com>
S: Maintained
F: tests/docker/

View File

@@ -6,7 +6,7 @@ BUILD_DIR=$(CURDIR)
# Before including a proper config-host.mak, assume we are in the source tree
SRC_PATH=.
UNCHECKED_GOALS := %clean TAGS cscope ctags
UNCHECKED_GOALS := %clean TAGS cscope ctags docker docker-%
# All following code might depend on configuration variables
ifneq ($(wildcard config-host.mak),)
@@ -30,7 +30,6 @@ CONFIG_ALL=y
-include config-all-devices.mak
-include config-all-disas.mak
include $(SRC_PATH)/rules.mak
config-host.mak: $(SRC_PATH)/configure
@echo $@ is out-of-date, running configure
@# TODO: The next lines include code which supports a smooth
@@ -49,7 +48,9 @@ ifneq ($(filter-out $(UNCHECKED_GOALS),$(MAKECMDGOALS)),$(if $(MAKECMDGOALS),,fa
endif
endif
GENERATED_HEADERS = config-host.h qemu-options.def
include $(SRC_PATH)/rules.mak
GENERATED_HEADERS = qemu-version.h config-host.h qemu-options.def
GENERATED_HEADERS += qmp-commands.h qapi-types.h qapi-visit.h qapi-event.h
GENERATED_SOURCES += qmp-marshal.c qapi-types.c qapi-visit.c qapi-event.c
GENERATED_HEADERS += qmp-introspect.h
@@ -81,7 +82,7 @@ Makefile: ;
configure: ;
.PHONY: all clean cscope distclean dvi html info install install-doc \
pdf recurse-all speed test dist msi
pdf recurse-all speed test dist msi FORCE
$(call set-vpath, $(SRC_PATH))
@@ -92,9 +93,6 @@ HELPERS-$(CONFIG_LINUX) = qemu-bridge-helper$(EXESUF)
ifdef BUILD_DOCS
DOCS=qemu-doc.html qemu-tech.html qemu.1 qemu-img.1 qemu-nbd.8 qemu-ga.8
DOCS+=qmp-commands.txt
ifdef CONFIG_LINUX
DOCS+=kvm_stat.1
endif
ifdef CONFIG_VIRTFS
DOCS+=fsdev/virtfs-proxy-helper.1
endif
@@ -119,7 +117,7 @@ endif
-include $(SUBDIR_DEVICES_MAK_DEP)
%/config-devices.mak: default-configs/%.mak
%/config-devices.mak: default-configs/%.mak $(SRC_PATH)/scripts/make_device_config.sh
$(call quiet-command, \
$(SHELL) $(SRC_PATH)/scripts/make_device_config.sh $< $*-config-devices.mak.d $@ > $@.tmp, " GEN $@.tmp")
$(call quiet-command, if test -f $@; then \
@@ -164,14 +162,34 @@ dummy := $(call unnest-vars,, \
common-obj-m)
ifneq ($(wildcard config-host.mak),)
include $(SRC_PATH)/tests/Makefile
include $(SRC_PATH)/tests/Makefile.include
endif
all: $(DOCS) $(TOOLS) $(HELPERS-y) recurse-all modules
qemu-version.h: FORCE
$(call quiet-command, \
(cd $(SRC_PATH); \
printf '#define QEMU_PKGVERSION '; \
if test -n "$(PKGVERSION)"; then \
printf '"$(PKGVERSION)"\n'; \
else \
if test -d .git; then \
printf '" ('; \
git describe --match 'v*' 2>/dev/null | tr -d '\n'; \
if ! git diff-index --quiet HEAD &>/dev/null; then \
printf -- '-dirty'; \
fi; \
printf ')"\n'; \
else \
printf '""\n'; \
fi; \
fi) > $@.tmp)
$(call quiet-command, cmp --quiet $@ $@.tmp || mv $@.tmp $@)
config-host.h: config-host.h-timestamp
config-host.h-timestamp: config-host.mak
qemu-options.def: $(SRC_PATH)/qemu-options.hx
qemu-options.def: $(SRC_PATH)/qemu-options.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@," GEN $@")
SUBDIR_RULES=$(patsubst %,subdir-%, $(TARGET_DIRS))
@@ -243,7 +261,7 @@ qemu-bridge-helper$(EXESUF): qemu-bridge-helper.o libqemuutil.a libqemustub.a
fsdev/virtfs-proxy-helper$(EXESUF): fsdev/virtfs-proxy-helper.o fsdev/9p-marshal.o fsdev/9p-iov-marshal.o libqemuutil.a libqemustub.a
fsdev/virtfs-proxy-helper$(EXESUF): LIBS += -lcap
qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx
qemu-img-cmds.h: $(SRC_PATH)/qemu-img-cmds.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@," GEN $@")
qemu-ga$(EXESUF): LIBS = $(LIBS_QGA)
@@ -356,6 +374,7 @@ clean:
if test -d $$d; then $(MAKE) -C $$d $@ || exit 1; fi; \
rm -f $$d/qemu-options.def; \
done
rm -f $(SUBDIR_DEVICES_MAK) config-all-devices.mak
VERSION ?= $(shell cat VERSION)
@@ -468,7 +487,7 @@ endif
set -e; for x in $(KEYMAPS); do \
$(INSTALL_DATA) $(SRC_PATH)/pc-bios/keymaps/$$x "$(DESTDIR)$(qemu_datadir)/keymaps"; \
done
$(INSTALL_DATA) $(SRC_PATH)/trace-events "$(DESTDIR)$(qemu_datadir)/trace-events"
$(INSTALL_DATA) $(BUILD_DIR)/trace-events-all "$(DESTDIR)$(qemu_datadir)/trace-events-all"
for d in $(TARGET_DIRS); do \
$(MAKE) $(SUBDIR_MAKEFLAGS) TARGET_DIR=$$d/ -C $$d $@ || exit 1 ; \
done
@@ -479,12 +498,12 @@ test speed: all
.PHONY: ctags
ctags:
rm -f $@
rm -f tags
find "$(SRC_PATH)" -name '*.[hc]' -exec ctags --append {} +
.PHONY: TAGS
TAGS:
rm -f $@
rm -f TAGS
find "$(SRC_PATH)" -name '*.[hc]' -exec etags --append {} +
cscope:
@@ -525,19 +544,19 @@ TEXIFLAG=$(if $(V),,--quiet)
%.pdf: %.texi
$(call quiet-command,texi2pdf $(TEXIFLAG) -I . $<," GEN $@")
qemu-options.texi: $(SRC_PATH)/qemu-options.hx
qemu-options.texi: $(SRC_PATH)/qemu-options.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@," GEN $@")
qemu-monitor.texi: $(SRC_PATH)/hmp-commands.hx
qemu-monitor.texi: $(SRC_PATH)/hmp-commands.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@," GEN $@")
qemu-monitor-info.texi: $(SRC_PATH)/hmp-commands-info.hx
qemu-monitor-info.texi: $(SRC_PATH)/hmp-commands-info.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@," GEN $@")
qmp-commands.txt: $(SRC_PATH)/qmp-commands.hx
qmp-commands.txt: $(SRC_PATH)/qmp-commands.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -q < $< > $@," GEN $@")
qemu-img-cmds.texi: $(SRC_PATH)/qemu-img-cmds.hx
qemu-img-cmds.texi: $(SRC_PATH)/qemu-img-cmds.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -t < $< > $@," GEN $@")
qemu.1: qemu-doc.texi qemu-options.texi qemu-monitor.texi qemu-monitor-info.texi
@@ -545,8 +564,9 @@ qemu.1: qemu-doc.texi qemu-options.texi qemu-monitor.texi qemu-monitor-info.texi
perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< qemu.pod && \
$(POD2MAN) --section=1 --center=" " --release=" " qemu.pod > $@, \
" GEN $@")
qemu.1: qemu-option-trace.texi
qemu-img.1: qemu-img.texi qemu-img-cmds.texi
qemu-img.1: qemu-img.texi qemu-option-trace.texi qemu-img-cmds.texi
$(call quiet-command, \
perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< qemu-img.pod && \
$(POD2MAN) --section=1 --center=" " --release=" " qemu-img.pod > $@, \
@@ -558,7 +578,7 @@ fsdev/virtfs-proxy-helper.1: fsdev/virtfs-proxy-helper.texi
$(POD2MAN) --section=1 --center=" " --release=" " fsdev/virtfs-proxy-helper.pod > $@, \
" GEN $@")
qemu-nbd.8: qemu-nbd.texi
qemu-nbd.8: qemu-nbd.texi qemu-option-trace.texi
$(call quiet-command, \
perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< qemu-nbd.pod && \
$(POD2MAN) --section=8 --center=" " --release=" " qemu-nbd.pod > $@, \
@@ -570,19 +590,13 @@ qemu-ga.8: qemu-ga.texi
$(POD2MAN) --section=8 --center=" " --release=" " qemu-ga.pod > $@, \
" GEN $@")
kvm_stat.1: scripts/kvm/kvm_stat.texi
$(call quiet-command, \
perl -Ww -- $(SRC_PATH)/scripts/texi2pod.pl $< kvm_stat.pod && \
$(POD2MAN) --section=1 --center=" " --release=" " kvm_stat.pod > $@, \
" GEN $@")
dvi: qemu-doc.dvi qemu-tech.dvi
html: qemu-doc.html qemu-tech.html
info: qemu-doc.info qemu-tech.info
pdf: qemu-doc.pdf qemu-tech.pdf
qemu-doc.dvi qemu-doc.html qemu-doc.info qemu-doc.pdf: \
qemu-img.texi qemu-nbd.texi qemu-options.texi \
qemu-img.texi qemu-nbd.texi qemu-options.texi qemu-option-trace.texi \
qemu-monitor.texi qemu-img-cmds.texi qemu-ga.texi \
qemu-monitor-info.texi
@@ -651,3 +665,5 @@ endif
# Include automatically generated dependency files
# Dependencies in Makefile.objs files come from our recursive subdir rules
-include $(wildcard *.d tests/*.d)
include $(SRC_PATH)/tests/docker/Makefile.include

View File

@@ -52,7 +52,6 @@ common-obj-$(CONFIG_LINUX) += fsdev/
common-obj-y += migration/
common-obj-y += qemu-char.o #aio.o
common-obj-y += page_cache.o
common-obj-y += qjson.o
common-obj-$(CONFIG_SPICE) += spice-qemu-char.o
@@ -116,3 +115,46 @@ qga-vss-dll-obj-y = qga/
# contrib
ivshmem-client-obj-y = contrib/ivshmem-client/
ivshmem-server-obj-y = contrib/ivshmem-server/
######################################################################
trace-events-y = trace-events
trace-events-y += util/trace-events
trace-events-y += crypto/trace-events
trace-events-y += io/trace-events
trace-events-y += migration/trace-events
trace-events-y += block/trace-events
trace-events-y += hw/block/trace-events
trace-events-y += hw/char/trace-events
trace-events-y += hw/intc/trace-events
trace-events-y += hw/net/trace-events
trace-events-y += hw/virtio/trace-events
trace-events-y += hw/audio/trace-events
trace-events-y += hw/misc/trace-events
trace-events-y += hw/usb/trace-events
trace-events-y += hw/scsi/trace-events
trace-events-y += hw/nvram/trace-events
trace-events-y += hw/display/trace-events
trace-events-y += hw/input/trace-events
trace-events-y += hw/timer/trace-events
trace-events-y += hw/dma/trace-events
trace-events-y += hw/sparc/trace-events
trace-events-y += hw/sd/trace-events
trace-events-y += hw/isa/trace-events
trace-events-y += hw/i386/trace-events
trace-events-y += hw/9pfs/trace-events
trace-events-y += hw/ppc/trace-events
trace-events-y += hw/pci/trace-events
trace-events-y += hw/s390x/trace-events
trace-events-y += hw/vfio/trace-events
trace-events-y += hw/acpi/trace-events
trace-events-y += hw/arm/trace-events
trace-events-y += hw/alpha/trace-events
trace-events-y += ui/trace-events
trace-events-y += audio/trace-events
trace-events-y += net/trace-events
trace-events-y += target-sparc/trace-events
trace-events-y += target-s390x/trace-events
trace-events-y += target-ppc/trace-events
trace-events-y += qom/trace-events
trace-events-y += linux-user/trace-events

View File

@@ -48,7 +48,7 @@ else
TARGET_TYPE=system
endif
$(QEMU_PROG).stp-installed: $(SRC_PATH)/trace-events
$(QEMU_PROG).stp-installed: $(BUILD_DIR)/trace-events-all
$(call quiet-command,$(TRACETOOL) \
--format=stap \
--backends=$(TRACE_BACKENDS) \
@@ -57,7 +57,7 @@ $(QEMU_PROG).stp-installed: $(SRC_PATH)/trace-events
--target-type=$(TARGET_TYPE) \
< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG).stp-installed")
$(QEMU_PROG).stp: $(SRC_PATH)/trace-events
$(QEMU_PROG).stp: $(BUILD_DIR)/trace-events-all
$(call quiet-command,$(TRACETOOL) \
--format=stap \
--backends=$(TRACE_BACKENDS) \
@@ -66,7 +66,7 @@ $(QEMU_PROG).stp: $(SRC_PATH)/trace-events
--target-type=$(TARGET_TYPE) \
< $< > $@," GEN $(TARGET_DIR)$(QEMU_PROG).stp")
$(QEMU_PROG)-simpletrace.stp: $(SRC_PATH)/trace-events
$(QEMU_PROG)-simpletrace.stp: $(BUILD_DIR)/trace-events-all
$(call quiet-command,$(TRACETOOL) \
--format=simpletrace-stap \
--backends=$(TRACE_BACKENDS) \
@@ -108,7 +108,9 @@ obj-$(CONFIG_LIBDECNUMBER) += libdecnumber/dpd/decimal128.o
ifdef CONFIG_LINUX_USER
QEMU_CFLAGS+=-I$(SRC_PATH)/linux-user/$(TARGET_ABI_DIR) -I$(SRC_PATH)/linux-user
QEMU_CFLAGS+=-I$(SRC_PATH)/linux-user/$(TARGET_ABI_DIR) \
-I$(SRC_PATH)/linux-user/host/$(ARCH) \
-I$(SRC_PATH)/linux-user
obj-y += linux-user/
obj-y += gdbstub.o thunk.o user-exec.o
@@ -201,13 +203,13 @@ endif
gdbstub-xml.c: $(TARGET_XML_FILES) $(SRC_PATH)/scripts/feature_to_c.sh
$(call quiet-command,rm -f $@ && $(SHELL) $(SRC_PATH)/scripts/feature_to_c.sh $@ $(TARGET_XML_FILES)," GEN $(TARGET_DIR)$@")
hmp-commands.h: $(SRC_PATH)/hmp-commands.hx
hmp-commands.h: $(SRC_PATH)/hmp-commands.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@," GEN $(TARGET_DIR)$@")
hmp-commands-info.h: $(SRC_PATH)/hmp-commands-info.hx
hmp-commands-info.h: $(SRC_PATH)/hmp-commands-info.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@," GEN $(TARGET_DIR)$@")
qmp-commands-old.h: $(SRC_PATH)/qmp-commands.hx
qmp-commands-old.h: $(SRC_PATH)/qmp-commands.hx $(SRC_PATH)/scripts/hxtool
$(call quiet-command,sh $(SRC_PATH)/scripts/hxtool -h < $< > $@," GEN $(TARGET_DIR)$@")
clean:

View File

@@ -1 +1 @@
2.6.0
2.6.50

View File

@@ -77,7 +77,7 @@ static int accel_init_machine(AccelClass *acc, MachineState *ms)
return ret;
}
int configure_accelerator(MachineState *ms)
void configure_accelerator(MachineState *ms)
{
const char *p;
char buf[10];
@@ -128,8 +128,6 @@ int configure_accelerator(MachineState *ms)
if (init_failed) {
fprintf(stderr, "Back to %s accelerator.\n", acc->name);
}
return !accel_initialised;
}

View File

@@ -22,6 +22,8 @@
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "cpu.h"
#include "sysemu/sysemu.h"
#include "sysemu/arch_init.h"
#include "hw/pci/pci.h"
@@ -272,13 +274,6 @@ void do_smbios_option(QemuOpts *opts)
#endif
}
void cpudef_init(void)
{
#if defined(cpudef_setup)
cpudef_setup(); /* parse cpu definitions in target config file */
#endif
}
int kvm_available(void)
{
#ifdef CONFIG_KVM

View File

@@ -1131,8 +1131,6 @@ static void audio_timer (void *opaque)
*/
int AUD_write (SWVoiceOut *sw, void *buf, int size)
{
int bytes;
if (!sw) {
/* XXX: Consider options */
return size;
@@ -1143,14 +1141,11 @@ int AUD_write (SWVoiceOut *sw, void *buf, int size)
return 0;
}
bytes = sw->hw->pcm_ops->write (sw, buf, size);
return bytes;
return sw->hw->pcm_ops->write(sw, buf, size);
}
int AUD_read (SWVoiceIn *sw, void *buf, int size)
{
int bytes;
if (!sw) {
/* XXX: Consider options */
return size;
@@ -1161,8 +1156,7 @@ int AUD_read (SWVoiceIn *sw, void *buf, int size)
return 0;
}
bytes = sw->hw->pcm_ops->read (sw, buf, size);
return bytes;
return sw->hw->pcm_ops->read(sw, buf, size);
}
int AUD_get_buffer_size_out (SWVoiceOut *sw)

View File

@@ -24,6 +24,7 @@
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "qemu/bswap.h"
#include "audio.h"
#define AUDIO_CAP "mixeng"
@@ -270,7 +271,7 @@ f_sample *mixeng_clip[2][2][2][3] = {
* August 21, 1998
* Copyright 1998 Fabrice Bellard.
*
* [Rewrote completly the code of Lance Norskog And Sundry
* [Rewrote completely the code of Lance Norskog And Sundry
* Contributors with a more efficient algorithm.]
*
* This source code is freely redistributable and may be used for

View File

@@ -23,6 +23,7 @@
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "qemu/host-utils.h"
#include "audio.h"
#include "qemu/timer.h"

View File

@@ -22,7 +22,6 @@
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include <sys/mman.h>
#include <sys/ioctl.h>
#include <sys/soundcard.h>
#include "qemu-common.h"
@@ -898,7 +897,7 @@ static struct audio_option oss_options[] = {
.name = "EXCLUSIVE",
.tag = AUD_OPT_BOOL,
.valp = &glob_conf.exclusive,
.descr = "Open device in exclusive mode (vmix wont work)"
.descr = "Open device in exclusive mode (vmix won't work)"
},
#ifdef USE_DSP_POLICY
{

View File

@@ -781,23 +781,22 @@ static int qpa_ctl_in (HWVoiceIn *hw, int cmd, ...)
pa_threaded_mainloop_lock (g->mainloop);
/* FIXME: use the upcoming "set_source_output_{volume,mute}" */
op = pa_context_set_source_volume_by_index (g->context,
pa_stream_get_device_index (pa->stream),
op = pa_context_set_source_output_volume (g->context,
pa_stream_get_index (pa->stream),
&v, NULL, NULL);
if (!op) {
qpa_logerr (pa_context_errno (g->context),
"set_source_volume() failed\n");
"set_source_output_volume() failed\n");
} else {
pa_operation_unref(op);
}
op = pa_context_set_source_mute_by_index (g->context,
op = pa_context_set_source_output_mute (g->context,
pa_stream_get_index (pa->stream),
sw->vol.mute, NULL, NULL);
if (!op) {
qpa_logerr (pa_context_errno (g->context),
"set_source_mute() failed\n");
"set_source_output_mute() failed\n");
} else {
pa_operation_unref (op);
}

View File

@@ -19,6 +19,7 @@
#include "qemu/osdep.h"
#include "hw/hw.h"
#include "qemu/host-utils.h"
#include "qemu/error-report.h"
#include "qemu/timer.h"
#include "ui/qemu-spice.h"

17
audio/trace-events Normal file
View File

@@ -0,0 +1,17 @@
# See docs/trace-events.txt for syntax documentation.
# audio/alsaaudio.c
alsa_revents(int revents) "revents = %d"
alsa_pollout(int i, int fd) "i = %d fd = %d"
alsa_set_handler(int events, int index, int fd, int err) "events=%#x index=%d fd=%d err=%d"
alsa_wrote_zero(int len) "Failed to write %d frames (wrote zero)"
alsa_read_zero(long len) "Failed to read %ld frames (read zero)"
alsa_xrun_out(void) "Recovering from playback xrun"
alsa_xrun_in(void) "Recovering from capture xrun"
alsa_resume_out(void) "Resuming suspended output stream"
alsa_resume_in(void) "Resuming suspended input stream"
alsa_no_frames(int state) "No frames available and ALSA state is %d"
# audio/ossaudio.c
oss_version(int version) "OSS version = %#x"
oss_invalid_available_size(int size, int bufsize) "Invalid available size, size=%d bufsize=%d"

View File

@@ -22,7 +22,7 @@
* THE SOFTWARE.
*/
#include "qemu/osdep.h"
#include "hw/hw.h"
#include "qemu/host-utils.h"
#include "qemu/timer.h"
#include "audio.h"

View File

@@ -17,7 +17,7 @@
#include "qapi/qmp/qerror.h"
#include "qemu/main-loop.h"
struct RndRandom
struct RngRandom
{
RngBackend parent;
@@ -34,7 +34,7 @@ struct RndRandom
static void entropy_available(void *opaque)
{
RndRandom *s = RNG_RANDOM(opaque);
RngRandom *s = RNG_RANDOM(opaque);
while (!QSIMPLEQ_EMPTY(&s->parent.requests)) {
RngRequest *req = QSIMPLEQ_FIRST(&s->parent.requests);
@@ -57,7 +57,7 @@ static void entropy_available(void *opaque)
static void rng_random_request_entropy(RngBackend *b, RngRequest *req)
{
RndRandom *s = RNG_RANDOM(b);
RngRandom *s = RNG_RANDOM(b);
if (QSIMPLEQ_EMPTY(&s->parent.requests)) {
/* If there are no pending requests yet, we need to
@@ -68,7 +68,7 @@ static void rng_random_request_entropy(RngBackend *b, RngRequest *req)
static void rng_random_opened(RngBackend *b, Error **errp)
{
RndRandom *s = RNG_RANDOM(b);
RngRandom *s = RNG_RANDOM(b);
if (s->filename == NULL) {
error_setg(errp, QERR_INVALID_PARAMETER_VALUE,
@@ -83,7 +83,7 @@ static void rng_random_opened(RngBackend *b, Error **errp)
static char *rng_random_get_filename(Object *obj, Error **errp)
{
RndRandom *s = RNG_RANDOM(obj);
RngRandom *s = RNG_RANDOM(obj);
return g_strdup(s->filename);
}
@@ -92,7 +92,7 @@ static void rng_random_set_filename(Object *obj, const char *filename,
Error **errp)
{
RngBackend *b = RNG_BACKEND(obj);
RndRandom *s = RNG_RANDOM(obj);
RngRandom *s = RNG_RANDOM(obj);
if (b->opened) {
error_setg(errp, QERR_PERMISSION_DENIED);
@@ -105,7 +105,7 @@ static void rng_random_set_filename(Object *obj, const char *filename,
static void rng_random_init(Object *obj)
{
RndRandom *s = RNG_RANDOM(obj);
RngRandom *s = RNG_RANDOM(obj);
object_property_add_str(obj, "filename",
rng_random_get_filename,
@@ -118,7 +118,7 @@ static void rng_random_init(Object *obj)
static void rng_random_finalize(Object *obj)
{
RndRandom *s = RNG_RANDOM(obj);
RngRandom *s = RNG_RANDOM(obj);
if (s->fd != -1) {
qemu_set_fd_handler(s->fd, NULL, NULL, NULL);
@@ -139,7 +139,7 @@ static void rng_random_class_init(ObjectClass *klass, void *data)
static const TypeInfo rng_random_info = {
.name = TYPE_RNG_RANDOM,
.parent = TYPE_RNG_BACKEND,
.instance_size = sizeof(RndRandom),
.instance_size = sizeof(RngRandom),
.class_init = rng_random_class_init,
.instance_init = rng_random_init,
.instance_finalize = rng_random_finalize,

625
block.c

File diff suppressed because it is too large Load Diff

View File

@@ -974,11 +974,9 @@ err_exit2:
static int64_t qemu_archipelago_getlength(BlockDriverState *bs)
{
int64_t ret;
BDRVArchipelagoState *s = bs->opaque;
ret = archipelago_volume_info(s);
return ret;
return archipelago_volume_info(s);
}
static int qemu_archipelago_truncate(BlockDriverState *bs, int64_t offset)

View File

@@ -36,7 +36,7 @@ typedef struct CowRequest {
typedef struct BackupBlockJob {
BlockJob common;
BlockDriverState *target;
BlockBackend *target;
/* bitmap for sync=incremental */
BdrvDirtyBitmap *sync_bitmap;
MirrorSyncMode sync_mode;
@@ -47,6 +47,7 @@ typedef struct BackupBlockJob {
uint64_t sectors_read;
unsigned long *done_bitmap;
int64_t cluster_size;
NotifierWithReturn before_write;
QLIST_HEAD(, CowRequest) inflight_reqs;
} BackupBlockJob;
@@ -93,12 +94,12 @@ static void cow_request_end(CowRequest *req)
qemu_co_queue_restart_all(&req->wait_queue);
}
static int coroutine_fn backup_do_cow(BlockDriverState *bs,
static int coroutine_fn backup_do_cow(BackupBlockJob *job,
int64_t sector_num, int nb_sectors,
bool *error_is_read,
bool is_write_notifier)
{
BackupBlockJob *job = (BackupBlockJob *)bs->job;
BlockBackend *blk = job->common.blk;
CowRequest cow_request;
struct iovec iov;
QEMUIOVector bounce_qiov;
@@ -131,20 +132,15 @@ static int coroutine_fn backup_do_cow(BlockDriverState *bs,
start * sectors_per_cluster);
if (!bounce_buffer) {
bounce_buffer = qemu_blockalign(bs, job->cluster_size);
bounce_buffer = blk_blockalign(blk, job->cluster_size);
}
iov.iov_base = bounce_buffer;
iov.iov_len = n * BDRV_SECTOR_SIZE;
qemu_iovec_init_external(&bounce_qiov, &iov, 1);
if (is_write_notifier) {
ret = bdrv_co_readv_no_serialising(bs,
start * sectors_per_cluster,
n, &bounce_qiov);
} else {
ret = bdrv_co_readv(bs, start * sectors_per_cluster, n,
&bounce_qiov);
}
ret = blk_co_preadv(blk, start * job->cluster_size,
bounce_qiov.size, &bounce_qiov,
is_write_notifier ? BDRV_REQ_NO_SERIALISING : 0);
if (ret < 0) {
trace_backup_do_cow_read_fail(job, start, ret);
if (error_is_read) {
@@ -154,13 +150,11 @@ static int coroutine_fn backup_do_cow(BlockDriverState *bs,
}
if (buffer_is_zero(iov.iov_base, iov.iov_len)) {
ret = bdrv_co_write_zeroes(job->target,
start * sectors_per_cluster,
n, BDRV_REQ_MAY_UNMAP);
ret = blk_co_pwrite_zeroes(job->target, start * job->cluster_size,
bounce_qiov.size, BDRV_REQ_MAY_UNMAP);
} else {
ret = bdrv_co_writev(job->target,
start * sectors_per_cluster, n,
&bounce_qiov);
ret = blk_co_pwritev(job->target, start * job->cluster_size,
bounce_qiov.size, &bounce_qiov, 0);
}
if (ret < 0) {
trace_backup_do_cow_write_fail(job, start, ret);
@@ -197,14 +191,16 @@ static int coroutine_fn backup_before_write_notify(
NotifierWithReturn *notifier,
void *opaque)
{
BackupBlockJob *job = container_of(notifier, BackupBlockJob, before_write);
BdrvTrackedRequest *req = opaque;
int64_t sector_num = req->offset >> BDRV_SECTOR_BITS;
int nb_sectors = req->bytes >> BDRV_SECTOR_BITS;
assert(req->bs == blk_bs(job->common.blk));
assert((req->offset & (BDRV_SECTOR_SIZE - 1)) == 0);
assert((req->bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
return backup_do_cow(req->bs, sector_num, nb_sectors, NULL, true);
return backup_do_cow(job, sector_num, nb_sectors, NULL, true);
}
static void backup_set_speed(BlockJob *job, int64_t speed, Error **errp)
@@ -218,19 +214,10 @@ static void backup_set_speed(BlockJob *job, int64_t speed, Error **errp)
ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
}
static void backup_iostatus_reset(BlockJob *job)
{
BackupBlockJob *s = container_of(job, BackupBlockJob, common);
if (s->target->blk) {
blk_iostatus_reset(s->target->blk);
}
}
static void backup_cleanup_sync_bitmap(BackupBlockJob *job, int ret)
{
BdrvDirtyBitmap *bm;
BlockDriverState *bs = job->common.bs;
BlockDriverState *bs = blk_bs(job->common.blk);
if (ret < 0 || block_job_is_cancelled(&job->common)) {
/* Merge the successor back into the parent, delete nothing. */
@@ -259,24 +246,31 @@ static void backup_abort(BlockJob *job)
}
}
static void backup_attached_aio_context(BlockJob *job, AioContext *aio_context)
{
BackupBlockJob *s = container_of(job, BackupBlockJob, common);
blk_set_aio_context(s->target, aio_context);
}
static const BlockJobDriver backup_job_driver = {
.instance_size = sizeof(BackupBlockJob),
.job_type = BLOCK_JOB_TYPE_BACKUP,
.set_speed = backup_set_speed,
.iostatus_reset = backup_iostatus_reset,
.commit = backup_commit,
.abort = backup_abort,
.instance_size = sizeof(BackupBlockJob),
.job_type = BLOCK_JOB_TYPE_BACKUP,
.set_speed = backup_set_speed,
.commit = backup_commit,
.abort = backup_abort,
.attached_aio_context = backup_attached_aio_context,
};
static BlockErrorAction backup_error_action(BackupBlockJob *job,
bool read, int error)
{
if (read) {
return block_job_error_action(&job->common, job->common.bs,
job->on_source_error, true, error);
return block_job_error_action(&job->common, job->on_source_error,
true, error);
} else {
return block_job_error_action(&job->common, job->target,
job->on_target_error, false, error);
return block_job_error_action(&job->common, job->on_target_error,
false, error);
}
}
@@ -289,7 +283,7 @@ static void backup_complete(BlockJob *job, void *opaque)
BackupBlockJob *s = container_of(job, BackupBlockJob, common);
BackupCompleteData *data = opaque;
bdrv_unref(s->target);
blk_unref(s->target);
block_job_completed(job, data->ret);
g_free(data);
@@ -331,7 +325,6 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
int64_t end;
int64_t last_cluster = -1;
int64_t sectors_per_cluster = cluster_size_sectors(job);
BlockDriverState *bs = job->common.bs;
HBitmapIter hbi;
granularity = bdrv_dirty_bitmap_granularity(job->sync_bitmap);
@@ -353,7 +346,7 @@ static int coroutine_fn backup_run_incremental(BackupBlockJob *job)
if (yield_and_check(job)) {
return ret;
}
ret = backup_do_cow(bs, cluster * sectors_per_cluster,
ret = backup_do_cow(job, cluster * sectors_per_cluster,
sectors_per_cluster, &error_is_read,
false);
if ((ret < 0) &&
@@ -386,12 +379,8 @@ static void coroutine_fn backup_run(void *opaque)
{
BackupBlockJob *job = opaque;
BackupCompleteData *data;
BlockDriverState *bs = job->common.bs;
BlockDriverState *target = job->target;
BlockdevOnError on_target_error = job->on_target_error;
NotifierWithReturn before_write = {
.notify = backup_before_write_notify,
};
BlockDriverState *bs = blk_bs(job->common.blk);
BlockBackend *target = job->target;
int64_t start, end;
int64_t sectors_per_cluster = cluster_size_sectors(job);
int ret = 0;
@@ -404,20 +393,14 @@ static void coroutine_fn backup_run(void *opaque)
job->done_bitmap = bitmap_new(end);
if (target->blk) {
blk_set_on_error(target->blk, on_target_error, on_target_error);
blk_iostatus_enable(target->blk);
}
bdrv_add_before_write_notifier(bs, &before_write);
job->before_write.notify = backup_before_write_notify;
bdrv_add_before_write_notifier(bs, &job->before_write);
if (job->sync_mode == MIRROR_SYNC_MODE_NONE) {
while (!block_job_is_cancelled(&job->common)) {
/* Yield until the job is cancelled. We just let our before_write
* notify callback service CoW requests. */
job->common.busy = false;
qemu_coroutine_yield();
job->common.busy = true;
block_job_yield(&job->common);
}
} else if (job->sync_mode == MIRROR_SYNC_MODE_INCREMENTAL) {
ret = backup_run_incremental(job);
@@ -461,7 +444,7 @@ static void coroutine_fn backup_run(void *opaque)
}
}
/* FULL sync mode we copy the whole drive. */
ret = backup_do_cow(bs, start * sectors_per_cluster,
ret = backup_do_cow(job, start * sectors_per_cluster,
sectors_per_cluster, &error_is_read, false);
if (ret < 0) {
/* Depending on error action, fail now or retry cluster */
@@ -477,17 +460,14 @@ static void coroutine_fn backup_run(void *opaque)
}
}
notifier_with_return_remove(&before_write);
notifier_with_return_remove(&job->before_write);
/* wait until pending backup_do_cow() calls have completed */
qemu_co_rwlock_wrlock(&job->flush_rwlock);
qemu_co_rwlock_unlock(&job->flush_rwlock);
g_free(job->done_bitmap);
if (target->blk) {
blk_iostatus_disable(target->blk);
}
bdrv_op_unblock_all(target, job->common.blocker);
bdrv_op_unblock_all(blk_bs(target), job->common.blocker);
data = g_malloc(sizeof(*data));
data->ret = ret;
@@ -504,24 +484,17 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
{
int64_t len;
BlockDriverInfo bdi;
BackupBlockJob *job = NULL;
int ret;
assert(bs);
assert(target);
assert(cb);
if (bs == target) {
error_setg(errp, "Source and target cannot be the same");
return;
}
if ((on_source_error == BLOCKDEV_ON_ERROR_STOP ||
on_source_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
(!bs->blk || !blk_iostatus_is_enabled(bs->blk))) {
error_setg(errp, QERR_INVALID_PARAMETER, "on-source-error");
return;
}
if (!bdrv_is_inserted(bs)) {
error_setg(errp, "Device is not inserted: %s",
bdrv_get_device_name(bs));
@@ -568,15 +541,16 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
goto error;
}
BackupBlockJob *job = block_job_create(&backup_job_driver, bs, speed,
cb, opaque, errp);
job = block_job_create(&backup_job_driver, bs, speed, cb, opaque, errp);
if (!job) {
goto error;
}
job->target = blk_new();
blk_insert_bs(job->target, target);
job->on_source_error = on_source_error;
job->on_target_error = on_target_error;
job->target = target;
job->sync_mode = sync_mode;
job->sync_bitmap = sync_mode == MIRROR_SYNC_MODE_INCREMENTAL ?
sync_bitmap : NULL;
@@ -584,7 +558,7 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
/* If there is no backing file on the target, we cannot rely on COW if our
* backup cluster size is smaller than the target cluster size. Even for
* targets with a backing file, try to avoid COW if possible. */
ret = bdrv_get_info(job->target, &bdi);
ret = bdrv_get_info(target, &bdi);
if (ret < 0 && !target->backing) {
error_setg_errno(errp, -ret,
"Couldn't determine the cluster size of the target image, "
@@ -610,4 +584,8 @@ void backup_start(BlockDriverState *bs, BlockDriverState *target,
if (sync_bitmap) {
bdrv_reclaim_dirty_bitmap(bs, sync_bitmap, NULL);
}
if (job) {
blk_unref(job->target);
block_job_unref(&job->common);
}
}

View File

@@ -103,11 +103,11 @@ static int coroutine_fn blkreplay_co_writev(BlockDriverState *bs,
return ret;
}
static int coroutine_fn blkreplay_co_write_zeroes(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
static int coroutine_fn blkreplay_co_pwrite_zeroes(BlockDriverState *bs,
int64_t offset, int count, BdrvRequestFlags flags)
{
uint64_t reqid = request_id++;
int ret = bdrv_co_write_zeroes(bs->file->bs, sector_num, nb_sectors, flags);
int ret = bdrv_co_pwrite_zeroes(bs->file->bs, offset, count, flags);
block_request_create(reqid, bs, qemu_coroutine_self());
qemu_coroutine_yield();
@@ -147,7 +147,7 @@ static BlockDriver bdrv_blkreplay = {
.bdrv_co_readv = blkreplay_co_readv,
.bdrv_co_writev = blkreplay_co_writev,
.bdrv_co_write_zeroes = blkreplay_co_write_zeroes,
.bdrv_co_pwrite_zeroes = blkreplay_co_pwrite_zeroes,
.bdrv_co_discard = blkreplay_co_discard,
.bdrv_co_flush = blkreplay_co_flush,
};

View File

@@ -293,22 +293,6 @@ static bool blkverify_recurse_is_first_non_filter(BlockDriverState *bs,
return bdrv_recurse_is_first_non_filter(s->test_file->bs, candidate);
}
/* Propagate AioContext changes to ->test_file */
static void blkverify_detach_aio_context(BlockDriverState *bs)
{
BDRVBlkverifyState *s = bs->opaque;
bdrv_detach_aio_context(s->test_file->bs);
}
static void blkverify_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
BDRVBlkverifyState *s = bs->opaque;
bdrv_attach_aio_context(s->test_file->bs, new_context);
}
static void blkverify_refresh_filename(BlockDriverState *bs, QDict *options)
{
BDRVBlkverifyState *s = bs->opaque;
@@ -356,9 +340,6 @@ static BlockDriver bdrv_blkverify = {
.bdrv_aio_writev = blkverify_aio_writev,
.bdrv_aio_flush = blkverify_aio_flush,
.bdrv_attach_aio_context = blkverify_attach_aio_context,
.bdrv_detach_aio_context = blkverify_detach_aio_context,
.is_filter = true,
.bdrv_recurse_is_first_non_filter = blkverify_recurse_is_first_non_filter,
};

View File

@@ -1,7 +1,7 @@
/*
* QEMU Block backends
*
* Copyright (C) 2014 Red Hat, Inc.
* Copyright (C) 2014-2016 Red Hat, Inc.
*
* Authors:
* Markus Armbruster <armbru@redhat.com>,
@@ -19,6 +19,7 @@
#include "sysemu/sysemu.h"
#include "qapi-event.h"
#include "qemu/id.h"
#include "trace.h"
/* Number of coroutines to reserve per attached device model */
#define COROUTINE_POOL_RESERVATION 64
@@ -34,6 +35,7 @@ struct BlockBackend {
DriveInfo *legacy_dinfo; /* null unless created by drive_new() */
QTAILQ_ENTRY(BlockBackend) link; /* for block_backends */
QTAILQ_ENTRY(BlockBackend) monitor_link; /* for monitor_block_backends */
BlockBackendPublic public;
void *dev; /* attached device model, if any */
/* TODO change to DeviceState when all users are qdevified */
@@ -74,6 +76,7 @@ static const AIOCBInfo block_backend_aiocb_info = {
};
static void drive_info_del(DriveInfo *dinfo);
static BlockBackend *bdrv_first_blk(BlockDriverState *bs);
/* All BlockBackends */
static QTAILQ_HEAD(, BlockBackend) block_backends =
@@ -90,9 +93,26 @@ static void blk_root_inherit_options(int *child_flags, QDict *child_options,
/* We're not supposed to call this function for root nodes */
abort();
}
static void blk_root_drained_begin(BdrvChild *child);
static void blk_root_drained_end(BdrvChild *child);
static void blk_root_change_media(BdrvChild *child, bool load);
static void blk_root_resize(BdrvChild *child);
static const char *blk_root_get_name(BdrvChild *child)
{
return blk_name(child->opaque);
}
static const BdrvChildRole child_root = {
.inherit_options = blk_root_inherit_options,
.inherit_options = blk_root_inherit_options,
.change_media = blk_root_change_media,
.resize = blk_root_resize,
.get_name = blk_root_get_name,
.drained_begin = blk_root_drained_begin,
.drained_end = blk_root_drained_end,
};
/*
@@ -100,40 +120,26 @@ static const BdrvChildRole child_root = {
* Store an error through @errp on failure, unless it's null.
* Return the new BlockBackend on success, null on failure.
*/
BlockBackend *blk_new(Error **errp)
BlockBackend *blk_new(void)
{
BlockBackend *blk;
blk = g_new0(BlockBackend, 1);
blk->refcnt = 1;
blk_set_enable_write_cache(blk, true);
qemu_co_queue_init(&blk->public.throttled_reqs[0]);
qemu_co_queue_init(&blk->public.throttled_reqs[1]);
notifier_list_init(&blk->remove_bs_notifiers);
notifier_list_init(&blk->insert_bs_notifiers);
QTAILQ_INSERT_TAIL(&block_backends, blk, link);
return blk;
}
/*
* Create a new BlockBackend with a new BlockDriverState attached.
* Otherwise just like blk_new(), which see.
*/
BlockBackend *blk_new_with_bs(Error **errp)
{
BlockBackend *blk;
BlockDriverState *bs;
blk = blk_new(errp);
if (!blk) {
return NULL;
}
bs = bdrv_new_root();
blk->root = bdrv_root_attach_child(bs, "root", &child_root);
bs->blk = blk;
return blk;
}
/*
* Calls blk_new_with_bs() and then calls bdrv_open() on the BlockDriverState.
* Creates a new BlockBackend, opens a new BlockDriverState, and connects both.
*
* Just as with bdrv_open(), after having called this function the reference to
* @options belongs to the block layer (even on failure).
@@ -148,21 +154,16 @@ BlockBackend *blk_new_open(const char *filename, const char *reference,
QDict *options, int flags, Error **errp)
{
BlockBackend *blk;
int ret;
BlockDriverState *bs;
blk = blk_new_with_bs(errp);
if (!blk) {
QDECREF(options);
return NULL;
}
ret = bdrv_open(&blk->root->bs, filename, reference, options, flags, errp);
if (ret < 0) {
blk = blk_new();
bs = bdrv_open(filename, reference, options, flags, errp);
if (!bs) {
blk_unref(blk);
return NULL;
}
blk_set_enable_write_cache(blk, true);
blk->root = bdrv_root_attach_child(bs, "root", &child_root, blk);
return blk;
}
@@ -177,10 +178,6 @@ static void blk_delete(BlockBackend *blk)
}
assert(QLIST_EMPTY(&blk->remove_bs_notifiers.notifiers));
assert(QLIST_EMPTY(&blk->insert_bs_notifiers.notifiers));
if (blk->root_state.throttle_state) {
g_free(blk->root_state.throttle_group);
throttle_group_unref(blk->root_state.throttle_state);
}
QTAILQ_REMOVE(&block_backends, blk, link);
drive_info_del(blk->legacy_dinfo);
block_acct_cleanup(&blk->stats);
@@ -267,28 +264,45 @@ BlockBackend *blk_next(BlockBackend *blk)
: QTAILQ_FIRST(&monitor_block_backends);
}
/*
* Iterates over all BlockDriverStates which are attached to a BlockBackend.
* This function is for use by bdrv_next().
*
* @bs must be NULL or a BDS that is attached to a BB.
*/
BlockDriverState *blk_next_root_bs(BlockDriverState *bs)
/* Iterates over all top-level BlockDriverStates, i.e. BDSs that are owned by
* the monitor or attached to a BlockBackend */
BlockDriverState *bdrv_next(BdrvNextIterator *it)
{
BlockBackend *blk;
BlockDriverState *bs;
if (bs) {
assert(bs->blk);
blk = bs->blk;
} else {
blk = NULL;
/* First, return all root nodes of BlockBackends. In order to avoid
* returning a BDS twice when multiple BBs refer to it, we only return it
* if the BB is the first one in the parent list of the BDS. */
if (it->phase == BDRV_NEXT_BACKEND_ROOTS) {
do {
it->blk = blk_all_next(it->blk);
bs = it->blk ? blk_bs(it->blk) : NULL;
} while (it->blk && (bs == NULL || bdrv_first_blk(bs) != it->blk));
if (bs) {
return bs;
}
it->phase = BDRV_NEXT_MONITOR_OWNED;
}
/* Then return the monitor-owned BDSes without a BB attached. Ignore all
* BDSes that are attached to a BlockBackend here; they have been handled
* by the above block already */
do {
blk = blk_all_next(blk);
} while (blk && !blk->root);
it->bs = bdrv_next_monitor_owned(it->bs);
bs = it->bs;
} while (bs && bdrv_has_blk(bs));
return blk ? blk->root->bs : NULL;
return bs;
}
BlockDriverState *bdrv_first(BdrvNextIterator *it)
{
*it = (BdrvNextIterator) {
.phase = BDRV_NEXT_BACKEND_ROOTS,
};
return bdrv_next(it);
}
/*
@@ -375,6 +389,26 @@ BlockDriverState *blk_bs(BlockBackend *blk)
return blk->root ? blk->root->bs : NULL;
}
static BlockBackend *bdrv_first_blk(BlockDriverState *bs)
{
BdrvChild *child;
QLIST_FOREACH(child, &bs->parents, next_parent) {
if (child->role == &child_root) {
return child->opaque;
}
}
return NULL;
}
/*
* Returns true if @bs has an associated BlockBackend.
*/
bool bdrv_has_blk(BlockDriverState *bs)
{
return bdrv_first_blk(bs) != NULL;
}
/*
* Return @blk's DriveInfo if any, else null.
*/
@@ -410,18 +444,34 @@ BlockBackend *blk_by_legacy_dinfo(DriveInfo *dinfo)
abort();
}
/*
* Returns a pointer to the publicly accessible fields of @blk.
*/
BlockBackendPublic *blk_get_public(BlockBackend *blk)
{
return &blk->public;
}
/*
* Returns a BlockBackend given the associated @public fields.
*/
BlockBackend *blk_by_public(BlockBackendPublic *public)
{
return container_of(public, BlockBackend, public);
}
/*
* Disassociates the currently associated BlockDriverState from @blk.
*/
void blk_remove_bs(BlockBackend *blk)
{
assert(blk->root->bs->blk == blk);
notifier_list_notify(&blk->remove_bs_notifiers, blk);
if (blk->public.throttle_state) {
throttle_timers_detach_aio_context(&blk->public.throttle_timers);
}
blk_update_root_state(blk);
blk->root->bs->blk = NULL;
bdrv_root_unref_child(blk->root);
blk->root = NULL;
}
@@ -431,12 +481,14 @@ void blk_remove_bs(BlockBackend *blk)
*/
void blk_insert_bs(BlockBackend *blk, BlockDriverState *bs)
{
assert(!blk->root && !bs->blk);
bdrv_ref(bs);
blk->root = bdrv_root_attach_child(bs, "root", &child_root);
bs->blk = blk;
blk->root = bdrv_root_attach_child(bs, "root", &child_root, blk);
notifier_list_notify(&blk->insert_bs_notifiers, blk);
if (blk->public.throttle_state) {
throttle_timers_attach_aio_context(
&blk->public.throttle_timers, bdrv_get_aio_context(bs));
}
}
/*
@@ -525,6 +577,11 @@ void blk_dev_change_media_cb(BlockBackend *blk, bool load)
}
}
static void blk_root_change_media(BdrvChild *child, bool load)
{
blk_dev_change_media_cb(child->opaque, load);
}
/*
* Does @blk's attached device model have removable media?
* %true if no device model is attached.
@@ -579,8 +636,10 @@ bool blk_dev_is_medium_locked(BlockBackend *blk)
/*
* Notify @blk's attached device model of a backend size change.
*/
void blk_dev_resize_cb(BlockBackend *blk)
static void blk_root_resize(BdrvChild *child)
{
BlockBackend *blk = child->opaque;
if (blk->dev_ops && blk->dev_ops->resize_cb) {
blk->dev_ops->resize_cb(blk->dev_opaque);
}
@@ -683,34 +742,50 @@ static int blk_check_request(BlockBackend *blk, int64_t sector_num,
nb_sectors * BDRV_SECTOR_SIZE);
}
static int coroutine_fn blk_co_preadv(BlockBackend *blk, int64_t offset,
unsigned int bytes, QEMUIOVector *qiov,
BdrvRequestFlags flags)
{
int ret = blk_check_byte_request(blk, offset, bytes);
if (ret < 0) {
return ret;
}
return bdrv_co_do_preadv(blk_bs(blk), offset, bytes, qiov, flags);
}
static int coroutine_fn blk_co_pwritev(BlockBackend *blk, int64_t offset,
unsigned int bytes, QEMUIOVector *qiov,
BdrvRequestFlags flags)
int coroutine_fn blk_co_preadv(BlockBackend *blk, int64_t offset,
unsigned int bytes, QEMUIOVector *qiov,
BdrvRequestFlags flags)
{
int ret;
trace_blk_co_preadv(blk, blk_bs(blk), offset, bytes, flags);
ret = blk_check_byte_request(blk, offset, bytes);
if (ret < 0) {
return ret;
}
/* throttling disk I/O */
if (blk->public.throttle_state) {
throttle_group_co_io_limits_intercept(blk, bytes, false);
}
return bdrv_co_preadv(blk_bs(blk), offset, bytes, qiov, flags);
}
int coroutine_fn blk_co_pwritev(BlockBackend *blk, int64_t offset,
unsigned int bytes, QEMUIOVector *qiov,
BdrvRequestFlags flags)
{
int ret;
trace_blk_co_pwritev(blk, blk_bs(blk), offset, bytes, flags);
ret = blk_check_byte_request(blk, offset, bytes);
if (ret < 0) {
return ret;
}
/* throttling disk I/O */
if (blk->public.throttle_state) {
throttle_group_co_io_limits_intercept(blk, bytes, true);
}
if (!blk->enable_write_cache) {
flags |= BDRV_REQ_FUA;
}
return bdrv_co_do_pwritev(blk_bs(blk), offset, bytes, qiov, flags);
return bdrv_co_pwritev(blk_bs(blk), offset, bytes, qiov, flags);
}
typedef struct BlkRwCo {
@@ -772,55 +847,27 @@ static int blk_prw(BlockBackend *blk, int64_t offset, uint8_t *buf,
return rwco.ret;
}
static int blk_rw(BlockBackend *blk, int64_t sector_num, uint8_t *buf,
int nb_sectors, CoroutineEntry co_entry,
BdrvRequestFlags flags)
int blk_pread_unthrottled(BlockBackend *blk, int64_t offset, uint8_t *buf,
int count)
{
if (nb_sectors < 0 || nb_sectors > BDRV_REQUEST_MAX_SECTORS) {
return -EINVAL;
}
return blk_prw(blk, sector_num << BDRV_SECTOR_BITS, buf,
nb_sectors << BDRV_SECTOR_BITS, co_entry, flags);
}
int blk_read(BlockBackend *blk, int64_t sector_num, uint8_t *buf,
int nb_sectors)
{
return blk_rw(blk, sector_num, buf, nb_sectors, blk_read_entry, 0);
}
int blk_read_unthrottled(BlockBackend *blk, int64_t sector_num, uint8_t *buf,
int nb_sectors)
{
BlockDriverState *bs = blk_bs(blk);
bool enabled;
int ret;
ret = blk_check_request(blk, sector_num, nb_sectors);
ret = blk_check_byte_request(blk, offset, count);
if (ret < 0) {
return ret;
}
enabled = bs->io_limits_enabled;
bs->io_limits_enabled = false;
ret = blk_read(blk, sector_num, buf, nb_sectors);
bs->io_limits_enabled = enabled;
blk_root_drained_begin(blk->root);
ret = blk_pread(blk, offset, buf, count);
blk_root_drained_end(blk->root);
return ret;
}
int blk_write(BlockBackend *blk, int64_t sector_num, const uint8_t *buf,
int nb_sectors)
int blk_pwrite_zeroes(BlockBackend *blk, int64_t offset,
int count, BdrvRequestFlags flags)
{
return blk_rw(blk, sector_num, (uint8_t*) buf, nb_sectors,
blk_write_entry, 0);
}
int blk_write_zeroes(BlockBackend *blk, int64_t sector_num,
int nb_sectors, BdrvRequestFlags flags)
{
return blk_rw(blk, sector_num, NULL, nb_sectors, blk_write_entry,
flags | BDRV_REQ_ZERO_WRITE);
return blk_prw(blk, offset, NULL, count, blk_write_entry,
flags | BDRV_REQ_ZERO_WRITE);
}
static void error_callback_bh(void *opaque)
@@ -932,18 +979,12 @@ static void blk_aio_write_entry(void *opaque)
blk_aio_complete(acb);
}
BlockAIOCB *blk_aio_write_zeroes(BlockBackend *blk, int64_t sector_num,
int nb_sectors, BdrvRequestFlags flags,
BlockCompletionFunc *cb, void *opaque)
BlockAIOCB *blk_aio_pwrite_zeroes(BlockBackend *blk, int64_t offset,
int count, BdrvRequestFlags flags,
BlockCompletionFunc *cb, void *opaque)
{
if (nb_sectors < 0 || nb_sectors > BDRV_REQUEST_MAX_SECTORS) {
return blk_abort_aio_request(blk, cb, opaque, -EINVAL);
}
return blk_aio_prwv(blk, sector_num << BDRV_SECTOR_BITS,
nb_sectors << BDRV_SECTOR_BITS, NULL,
blk_aio_write_entry, flags | BDRV_REQ_ZERO_WRITE,
cb, opaque);
return blk_aio_prwv(blk, offset, count, NULL, blk_aio_write_entry,
flags | BDRV_REQ_ZERO_WRITE, cb, opaque);
}
int blk_pread(BlockBackend *blk, int64_t offset, void *buf, int count)
@@ -955,9 +996,11 @@ int blk_pread(BlockBackend *blk, int64_t offset, void *buf, int count)
return count;
}
int blk_pwrite(BlockBackend *blk, int64_t offset, const void *buf, int count)
int blk_pwrite(BlockBackend *blk, int64_t offset, const void *buf, int count,
BdrvRequestFlags flags)
{
int ret = blk_prw(blk, offset, (void*) buf, count, blk_write_entry, 0);
int ret = blk_prw(blk, offset, (void *) buf, count, blk_write_entry,
flags);
if (ret < 0) {
return ret;
}
@@ -991,30 +1034,20 @@ int64_t blk_nb_sectors(BlockBackend *blk)
return bdrv_nb_sectors(blk_bs(blk));
}
BlockAIOCB *blk_aio_readv(BlockBackend *blk, int64_t sector_num,
QEMUIOVector *iov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
if (nb_sectors < 0 || nb_sectors > BDRV_REQUEST_MAX_SECTORS) {
return blk_abort_aio_request(blk, cb, opaque, -EINVAL);
}
assert(nb_sectors << BDRV_SECTOR_BITS == iov->size);
return blk_aio_prwv(blk, sector_num << BDRV_SECTOR_BITS, iov->size, iov,
blk_aio_read_entry, 0, cb, opaque);
}
BlockAIOCB *blk_aio_writev(BlockBackend *blk, int64_t sector_num,
QEMUIOVector *iov, int nb_sectors,
BlockAIOCB *blk_aio_preadv(BlockBackend *blk, int64_t offset,
QEMUIOVector *qiov, BdrvRequestFlags flags,
BlockCompletionFunc *cb, void *opaque)
{
if (nb_sectors < 0 || nb_sectors > BDRV_REQUEST_MAX_SECTORS) {
return blk_abort_aio_request(blk, cb, opaque, -EINVAL);
}
return blk_aio_prwv(blk, offset, qiov->size, qiov,
blk_aio_read_entry, flags, cb, opaque);
}
assert(nb_sectors << BDRV_SECTOR_BITS == iov->size);
return blk_aio_prwv(blk, sector_num << BDRV_SECTOR_BITS, iov->size, iov,
blk_aio_write_entry, 0, cb, opaque);
BlockAIOCB *blk_aio_pwritev(BlockBackend *blk, int64_t offset,
QEMUIOVector *qiov, BdrvRequestFlags flags,
BlockCompletionFunc *cb, void *opaque)
{
return blk_aio_prwv(blk, offset, qiov->size, qiov,
blk_aio_write_entry, flags, cb, opaque);
}
BlockAIOCB *blk_aio_flush(BlockBackend *blk,
@@ -1049,20 +1082,6 @@ void blk_aio_cancel_async(BlockAIOCB *acb)
bdrv_aio_cancel_async(acb);
}
int blk_aio_multiwrite(BlockBackend *blk, BlockRequest *reqs, int num_reqs)
{
int i, ret;
for (i = 0; i < num_reqs; i++) {
ret = blk_check_request(blk, reqs[i].sector, reqs[i].nb_sectors);
if (ret < 0) {
return ret;
}
}
return bdrv_aio_multiwrite(blk_bs(blk), reqs, num_reqs);
}
int blk_ioctl(BlockBackend *blk, unsigned long int req, void *buf)
{
if (!blk_is_available(blk)) {
@@ -1375,7 +1394,14 @@ void blk_set_aio_context(BlockBackend *blk, AioContext *new_context)
BlockDriverState *bs = blk_bs(blk);
if (bs) {
if (blk->public.throttle_state) {
throttle_timers_detach_aio_context(&blk->public.throttle_timers);
}
bdrv_set_aio_context(bs, new_context);
if (blk->public.throttle_state) {
throttle_timers_attach_aio_context(&blk->public.throttle_timers,
new_context);
}
}
}
@@ -1444,15 +1470,10 @@ void *blk_aio_get(const AIOCBInfo *aiocb_info, BlockBackend *blk,
return qemu_aio_get(aiocb_info, blk_bs(blk), cb, opaque);
}
int coroutine_fn blk_co_write_zeroes(BlockBackend *blk, int64_t sector_num,
int nb_sectors, BdrvRequestFlags flags)
int coroutine_fn blk_co_pwrite_zeroes(BlockBackend *blk, int64_t offset,
int count, BdrvRequestFlags flags)
{
if (nb_sectors < 0 || nb_sectors > BDRV_REQUEST_MAX_SECTORS) {
return -EINVAL;
}
return blk_co_pwritev(blk, sector_num << BDRV_SECTOR_BITS,
nb_sectors << BDRV_SECTOR_BITS, NULL,
return blk_co_pwritev(blk, offset, count, NULL,
flags | BDRV_REQ_ZERO_WRITE);
}
@@ -1545,19 +1566,6 @@ void blk_update_root_state(BlockBackend *blk)
blk->root_state.open_flags = blk->root->bs->open_flags;
blk->root_state.read_only = blk->root->bs->read_only;
blk->root_state.detect_zeroes = blk->root->bs->detect_zeroes;
if (blk->root_state.throttle_group) {
g_free(blk->root_state.throttle_group);
throttle_group_unref(blk->root_state.throttle_state);
}
if (blk->root->bs->throttle_state) {
const char *name = throttle_group_get_name(blk->root->bs);
blk->root_state.throttle_group = g_strdup(name);
blk->root_state.throttle_state = throttle_group_incref(name);
} else {
blk->root_state.throttle_group = NULL;
blk->root_state.throttle_state = NULL;
}
}
/*
@@ -1568,9 +1576,6 @@ void blk_update_root_state(BlockBackend *blk)
void blk_apply_root_state(BlockBackend *blk, BlockDriverState *bs)
{
bs->detect_zeroes = blk->root_state.detect_zeroes;
if (blk->root_state.throttle_group) {
bdrv_io_limits_enable(bs, blk->root_state.throttle_group);
}
}
/*
@@ -1633,3 +1638,62 @@ int blk_flush_all(void)
return result;
}
/* throttling disk I/O limits */
void blk_set_io_limits(BlockBackend *blk, ThrottleConfig *cfg)
{
throttle_group_config(blk, cfg);
}
void blk_io_limits_disable(BlockBackend *blk)
{
assert(blk->public.throttle_state);
bdrv_drained_begin(blk_bs(blk));
throttle_group_unregister_blk(blk);
bdrv_drained_end(blk_bs(blk));
}
/* should be called before blk_set_io_limits if a limit is set */
void blk_io_limits_enable(BlockBackend *blk, const char *group)
{
assert(!blk->public.throttle_state);
throttle_group_register_blk(blk, group);
}
void blk_io_limits_update_group(BlockBackend *blk, const char *group)
{
/* this BB is not part of any group */
if (!blk->public.throttle_state) {
return;
}
/* this BB is a part of the same group than the one we want */
if (!g_strcmp0(throttle_group_get_name(blk), group)) {
return;
}
/* need to change the group this bs belong to */
blk_io_limits_disable(blk);
blk_io_limits_enable(blk, group);
}
static void blk_root_drained_begin(BdrvChild *child)
{
BlockBackend *blk = child->opaque;
/* Note that blk->root may not be accessible here yet if we are just
* attaching to a BlockDriverState that is drained. Use child instead. */
if (blk->public.io_limits_disabled++ == 0) {
throttle_group_restart_blk(blk);
}
}
static void blk_root_drained_end(BdrvChild *child)
{
BlockBackend *blk = child->opaque;
assert(blk->public.io_limits_disabled);
--blk->public.io_limits_disabled;
}

View File

@@ -27,6 +27,7 @@
#include "qemu-common.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "qemu/bswap.h"
/**************************************************************/
@@ -104,6 +105,7 @@ static int bochs_open(BlockDriverState *bs, QDict *options, int flags,
int ret;
bs->read_only = 1; // no write support yet
bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O supported */
ret = bdrv_pread(bs->file->bs, 0, &bochs, sizeof(bochs));
if (ret < 0) {
@@ -221,38 +223,52 @@ static int64_t seek_to_sector(BlockDriverState *bs, int64_t sector_num)
return bitmap_offset + (512 * (s->bitmap_blocks + extent_offset));
}
static int bochs_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
static int coroutine_fn
bochs_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
{
BDRVBochsState *s = bs->opaque;
uint64_t sector_num = offset >> BDRV_SECTOR_BITS;
int nb_sectors = bytes >> BDRV_SECTOR_BITS;
uint64_t bytes_done = 0;
QEMUIOVector local_qiov;
int ret;
assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
qemu_iovec_init(&local_qiov, qiov->niov);
qemu_co_mutex_lock(&s->lock);
while (nb_sectors > 0) {
int64_t block_offset = seek_to_sector(bs, sector_num);
if (block_offset < 0) {
return block_offset;
} else if (block_offset > 0) {
ret = bdrv_pread(bs->file->bs, block_offset, buf, 512);
ret = block_offset;
goto fail;
}
qemu_iovec_reset(&local_qiov);
qemu_iovec_concat(&local_qiov, qiov, bytes_done, 512);
if (block_offset > 0) {
ret = bdrv_co_preadv(bs->file->bs, block_offset, 512,
&local_qiov, 0);
if (ret < 0) {
return ret;
goto fail;
}
} else {
memset(buf, 0, 512);
qemu_iovec_memset(&local_qiov, 0, 0, 512);
}
nb_sectors--;
sector_num++;
buf += 512;
bytes_done += 512;
}
return 0;
}
static coroutine_fn int bochs_co_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
int ret;
BDRVBochsState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = bochs_read(bs, sector_num, buf, nb_sectors);
ret = 0;
fail:
qemu_co_mutex_unlock(&s->lock);
qemu_iovec_destroy(&local_qiov);
return ret;
}
@@ -267,7 +283,7 @@ static BlockDriver bdrv_bochs = {
.instance_size = sizeof(BDRVBochsState),
.bdrv_probe = bochs_probe,
.bdrv_open = bochs_open,
.bdrv_read = bochs_co_read,
.bdrv_co_preadv = bochs_co_preadv,
.bdrv_close = bochs_close,
};

View File

@@ -26,6 +26,7 @@
#include "qemu-common.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "qemu/bswap.h"
#include <zlib.h>
/* Maximum compressed block size */
@@ -66,6 +67,7 @@ static int cloop_open(BlockDriverState *bs, QDict *options, int flags,
int ret;
bs->read_only = 1;
bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O supported */
/* read header */
ret = bdrv_pread(bs->file->bs, 128, &s->block_size, 4);
@@ -229,33 +231,38 @@ static inline int cloop_read_block(BlockDriverState *bs, int block_num)
return 0;
}
static int cloop_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
static int coroutine_fn
cloop_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
{
BDRVCloopState *s = bs->opaque;
int i;
uint64_t sector_num = offset >> BDRV_SECTOR_BITS;
int nb_sectors = bytes >> BDRV_SECTOR_BITS;
int ret, i;
assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
qemu_co_mutex_lock(&s->lock);
for (i = 0; i < nb_sectors; i++) {
void *data;
uint32_t sector_offset_in_block =
((sector_num + i) % s->sectors_per_block),
block_num = (sector_num + i) / s->sectors_per_block;
if (cloop_read_block(bs, block_num) != 0) {
return -1;
ret = -EIO;
goto fail;
}
memcpy(buf + i * 512,
s->uncompressed_block + sector_offset_in_block * 512, 512);
}
return 0;
}
static coroutine_fn int cloop_co_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
int ret;
BDRVCloopState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = cloop_read(bs, sector_num, buf, nb_sectors);
data = s->uncompressed_block + sector_offset_in_block * 512;
qemu_iovec_from_buf(qiov, i * 512, data, 512);
}
ret = 0;
fail:
qemu_co_mutex_unlock(&s->lock);
return ret;
}
@@ -273,7 +280,7 @@ static BlockDriver bdrv_cloop = {
.instance_size = sizeof(BDRVCloopState),
.bdrv_probe = cloop_probe,
.bdrv_open = cloop_open,
.bdrv_read = cloop_co_read,
.bdrv_co_preadv = cloop_co_preadv,
.bdrv_close = cloop_close,
};

View File

@@ -36,28 +36,36 @@ typedef struct CommitBlockJob {
BlockJob common;
RateLimit limit;
BlockDriverState *active;
BlockDriverState *top;
BlockDriverState *base;
BlockBackend *top;
BlockBackend *base;
BlockdevOnError on_error;
int base_flags;
int orig_overlay_flags;
char *backing_file_str;
} CommitBlockJob;
static int coroutine_fn commit_populate(BlockDriverState *bs,
BlockDriverState *base,
static int coroutine_fn commit_populate(BlockBackend *bs, BlockBackend *base,
int64_t sector_num, int nb_sectors,
void *buf)
{
int ret = 0;
QEMUIOVector qiov;
struct iovec iov = {
.iov_base = buf,
.iov_len = nb_sectors * BDRV_SECTOR_SIZE,
};
ret = bdrv_read(bs, sector_num, buf, nb_sectors);
if (ret) {
qemu_iovec_init_external(&qiov, &iov, 1);
ret = blk_co_preadv(bs, sector_num * BDRV_SECTOR_SIZE,
qiov.size, &qiov, 0);
if (ret < 0) {
return ret;
}
ret = bdrv_write(base, sector_num, buf, nb_sectors);
if (ret) {
ret = blk_co_pwritev(base, sector_num * BDRV_SECTOR_SIZE,
qiov.size, &qiov, 0);
if (ret < 0) {
return ret;
}
@@ -73,8 +81,8 @@ static void commit_complete(BlockJob *job, void *opaque)
CommitBlockJob *s = container_of(job, CommitBlockJob, common);
CommitCompleteData *data = opaque;
BlockDriverState *active = s->active;
BlockDriverState *top = s->top;
BlockDriverState *base = s->base;
BlockDriverState *top = blk_bs(s->top);
BlockDriverState *base = blk_bs(s->base);
BlockDriverState *overlay_bs;
int ret = data->ret;
@@ -94,6 +102,8 @@ static void commit_complete(BlockJob *job, void *opaque)
bdrv_reopen(overlay_bs, s->orig_overlay_flags, NULL);
}
g_free(s->backing_file_str);
blk_unref(s->top);
blk_unref(s->base);
block_job_completed(&s->common, ret);
g_free(data);
}
@@ -102,8 +112,6 @@ static void coroutine_fn commit_run(void *opaque)
{
CommitBlockJob *s = opaque;
CommitCompleteData *data;
BlockDriverState *top = s->top;
BlockDriverState *base = s->base;
int64_t sector_num, end;
int ret = 0;
int n = 0;
@@ -111,27 +119,27 @@ static void coroutine_fn commit_run(void *opaque)
int bytes_written = 0;
int64_t base_len;
ret = s->common.len = bdrv_getlength(top);
ret = s->common.len = blk_getlength(s->top);
if (s->common.len < 0) {
goto out;
}
ret = base_len = bdrv_getlength(base);
ret = base_len = blk_getlength(s->base);
if (base_len < 0) {
goto out;
}
if (base_len < s->common.len) {
ret = bdrv_truncate(base, s->common.len);
ret = blk_truncate(s->base, s->common.len);
if (ret) {
goto out;
}
}
end = s->common.len >> BDRV_SECTOR_BITS;
buf = qemu_blockalign(top, COMMIT_BUFFER_SIZE);
buf = blk_blockalign(s->top, COMMIT_BUFFER_SIZE);
for (sector_num = 0; sector_num < end; sector_num += n) {
uint64_t delay_ns = 0;
@@ -146,7 +154,8 @@ wait:
break;
}
/* Copy if allocated above the base */
ret = bdrv_is_allocated_above(top, base, sector_num,
ret = bdrv_is_allocated_above(blk_bs(s->top), blk_bs(s->base),
sector_num,
COMMIT_BUFFER_SIZE / BDRV_SECTOR_SIZE,
&n);
copy = (ret == 1);
@@ -158,7 +167,7 @@ wait:
goto wait;
}
}
ret = commit_populate(top, base, sector_num, n, buf);
ret = commit_populate(s->top, s->base, sector_num, n, buf);
bytes_written += n * BDRV_SECTOR_SIZE;
}
if (ret < 0) {
@@ -214,13 +223,6 @@ void commit_start(BlockDriverState *bs, BlockDriverState *base,
BlockDriverState *overlay_bs;
Error *local_err = NULL;
if ((on_error == BLOCKDEV_ON_ERROR_STOP ||
on_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
(!bs->blk || !blk_iostatus_is_enabled(bs->blk))) {
error_setg(errp, "Invalid parameter combination");
return;
}
assert(top != bs);
if (top == base) {
error_setg(errp, "Invalid files for merge: top and base are the same");
@@ -234,6 +236,11 @@ void commit_start(BlockDriverState *bs, BlockDriverState *base,
return;
}
s = block_job_create(&commit_job_driver, bs, speed, cb, opaque, errp);
if (!s) {
return;
}
orig_base_flags = bdrv_get_flags(base);
orig_overlay_flags = bdrv_get_flags(overlay_bs);
@@ -250,18 +257,18 @@ void commit_start(BlockDriverState *bs, BlockDriverState *base,
bdrv_reopen_multiple(reopen_queue, &local_err);
if (local_err != NULL) {
error_propagate(errp, local_err);
block_job_unref(&s->common);
return;
}
}
s = block_job_create(&commit_job_driver, bs, speed, cb, opaque, errp);
if (!s) {
return;
}
s->base = blk_new();
blk_insert_bs(s->base, base);
s->top = blk_new();
blk_insert_bs(s->top, top);
s->base = base;
s->top = top;
s->active = bs;
s->base_flags = orig_base_flags;

View File

@@ -91,7 +91,7 @@ static ssize_t block_crypto_write_func(QCryptoBlock *block,
struct BlockCryptoCreateData *data = opaque;
ssize_t ret;
ret = blk_pwrite(data->blk, offset, buf, buflen);
ret = blk_pwrite(data->blk, offset, buf, buflen, 0);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not write encryption header");
return ret;
@@ -196,7 +196,6 @@ block_crypto_open_opts_init(QCryptoBlockFormat format,
OptsVisitor *ov;
QCryptoBlockOpenOptions *ret = NULL;
Error *local_err = NULL;
Error *end_err = NULL;
ret = g_new0(QCryptoBlockOpenOptions, 1);
ret->format = format;
@@ -219,9 +218,11 @@ block_crypto_open_opts_init(QCryptoBlockFormat format,
error_setg(&local_err, "Unsupported block format %d", format);
break;
}
if (!local_err) {
visit_check_struct(opts_get_visitor(ov), &local_err);
}
visit_end_struct(opts_get_visitor(ov), &end_err);
error_propagate(&local_err, end_err);
visit_end_struct(opts_get_visitor(ov));
out:
if (local_err) {
@@ -242,7 +243,6 @@ block_crypto_create_opts_init(QCryptoBlockFormat format,
OptsVisitor *ov;
QCryptoBlockCreateOptions *ret = NULL;
Error *local_err = NULL;
Error *end_err = NULL;
ret = g_new0(QCryptoBlockCreateOptions, 1);
ret->format = format;
@@ -265,9 +265,11 @@ block_crypto_create_opts_init(QCryptoBlockFormat format,
error_setg(&local_err, "Unsupported block format %d", format);
break;
}
if (!local_err) {
visit_check_struct(opts_get_visitor(ov), &local_err);
}
visit_end_struct(opts_get_visitor(ov), &end_err);
error_propagate(&local_err, end_err);
visit_end_struct(opts_get_visitor(ov));
out:
if (local_err) {

View File

@@ -36,10 +36,16 @@
// #define DEBUG_VERBOSE
#ifdef DEBUG_CURL
#define DPRINTF(fmt, ...) do { printf(fmt, ## __VA_ARGS__); } while (0)
#define DEBUG_CURL_PRINT 1
#else
#define DPRINTF(fmt, ...) do { } while (0)
#define DEBUG_CURL_PRINT 0
#endif
#define DPRINTF(fmt, ...) \
do { \
if (DEBUG_CURL_PRINT) { \
fprintf(stderr, fmt, ## __VA_ARGS__); \
} \
} while (0)
#if LIBCURL_VERSION_NUM >= 0x071000
/* The multi interface timer callback was introduced in 7.16.0 */

View File

@@ -32,7 +32,6 @@
#ifdef CONFIG_BZIP2
#include <bzlib.h>
#endif
#include <glib.h>
enum {
/* Limit chunk sizes to prevent unreasonable amounts of memory being used
@@ -440,6 +439,8 @@ static int dmg_open(BlockDriverState *bs, QDict *options, int flags,
int ret;
bs->read_only = 1;
bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O supported */
s->n_chunks = 0;
s->offsets = s->lengths = s->sectors = s->sectorcounts = NULL;
/* used by dmg_read_mish_block to keep track of the current I/O position */
@@ -659,38 +660,42 @@ static inline int dmg_read_chunk(BlockDriverState *bs, uint64_t sector_num)
return 0;
}
static int dmg_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
static int coroutine_fn
dmg_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
{
BDRVDMGState *s = bs->opaque;
int i;
uint64_t sector_num = offset >> BDRV_SECTOR_BITS;
int nb_sectors = bytes >> BDRV_SECTOR_BITS;
int ret, i;
assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
qemu_co_mutex_lock(&s->lock);
for (i = 0; i < nb_sectors; i++) {
uint32_t sector_offset_in_chunk;
void *data;
if (dmg_read_chunk(bs, sector_num + i) != 0) {
return -1;
ret = -EIO;
goto fail;
}
/* Special case: current chunk is all zeroes. Do not perform a memcpy as
* s->uncompressed_chunk may be too small to cover the large all-zeroes
* section. dmg_read_chunk is called to find s->current_chunk */
if (s->types[s->current_chunk] == 2) { /* all zeroes block entry */
memset(buf + i * 512, 0, 512);
qemu_iovec_memset(qiov, i * 512, 0, 512);
continue;
}
sector_offset_in_chunk = sector_num + i - s->sectors[s->current_chunk];
memcpy(buf + i * 512,
s->uncompressed_chunk + sector_offset_in_chunk * 512, 512);
data = s->uncompressed_chunk + sector_offset_in_chunk * 512;
qemu_iovec_from_buf(qiov, i * 512, data, 512);
}
return 0;
}
static coroutine_fn int dmg_co_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
int ret;
BDRVDMGState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = dmg_read(bs, sector_num, buf, nb_sectors);
ret = 0;
fail:
qemu_co_mutex_unlock(&s->lock);
return ret;
}
@@ -715,7 +720,7 @@ static BlockDriver bdrv_dmg = {
.instance_size = sizeof(BDRVDMGState),
.bdrv_probe = dmg_probe,
.bdrv_open = dmg_open,
.bdrv_read = dmg_co_read,
.bdrv_co_preadv = dmg_co_preadv,
.bdrv_close = dmg_close,
};

View File

@@ -24,6 +24,8 @@ typedef struct GlusterAIOCB {
typedef struct BDRVGlusterState {
struct glfs *glfs;
struct glfs_fd *fd;
bool supports_seek_data;
int debug_level;
} BDRVGlusterState;
typedef struct GlusterConf {
@@ -32,6 +34,7 @@ typedef struct GlusterConf {
char *volname;
char *image;
char *transport;
int debug_level;
} GlusterConf;
static void qemu_gluster_gconf_free(GlusterConf *gconf)
@@ -194,11 +197,7 @@ static struct glfs *qemu_gluster_init(GlusterConf *gconf, const char *filename,
goto out;
}
/*
* TODO: Use GF_LOG_ERROR instead of hard code value of 4 here when
* GlusterFS makes GF_LOG_* macros available to libgfapi users.
*/
ret = glfs_set_logging(glfs, "-", 4);
ret = glfs_set_logging(glfs, "-", gconf->debug_level);
if (ret < 0) {
goto out;
}
@@ -256,16 +255,26 @@ static void gluster_finish_aiocb(struct glfs_fd *fd, ssize_t ret, void *arg)
qemu_bh_schedule(acb->bh);
}
#define GLUSTER_OPT_FILENAME "filename"
#define GLUSTER_OPT_DEBUG "debug"
#define GLUSTER_DEBUG_DEFAULT 4
#define GLUSTER_DEBUG_MAX 9
/* TODO Convert to fine grained options */
static QemuOptsList runtime_opts = {
.name = "gluster",
.head = QTAILQ_HEAD_INITIALIZER(runtime_opts.head),
.desc = {
{
.name = "filename",
.name = GLUSTER_OPT_FILENAME,
.type = QEMU_OPT_STRING,
.help = "URL to the gluster image",
},
{
.name = GLUSTER_OPT_DEBUG,
.type = QEMU_OPT_NUMBER,
.help = "Gluster log level, valid range is 0-9",
},
{ /* end of list */ }
},
};
@@ -287,6 +296,28 @@ static void qemu_gluster_parse_flags(int bdrv_flags, int *open_flags)
}
}
/*
* Do SEEK_DATA/HOLE to detect if it is functional. Older broken versions of
* gfapi incorrectly return the current offset when SEEK_DATA/HOLE is used.
* - Corrected versions return -1 and set errno to EINVAL.
* - Versions that support SEEK_DATA/HOLE correctly, will return -1 and set
* errno to ENXIO when SEEK_DATA is called with a position of EOF.
*/
static bool qemu_gluster_test_seek(struct glfs_fd *fd)
{
off_t ret, eof;
eof = glfs_lseek(fd, 0, SEEK_END);
if (eof < 0) {
/* this should never occur */
return false;
}
/* this should always fail with ENXIO if SEEK_DATA is supported */
ret = glfs_lseek(fd, eof, SEEK_DATA);
return (ret < 0) && (errno == ENXIO);
}
static int qemu_gluster_open(BlockDriverState *bs, QDict *options,
int bdrv_flags, Error **errp)
{
@@ -306,8 +337,17 @@ static int qemu_gluster_open(BlockDriverState *bs, QDict *options,
goto out;
}
filename = qemu_opt_get(opts, "filename");
filename = qemu_opt_get(opts, GLUSTER_OPT_FILENAME);
s->debug_level = qemu_opt_get_number(opts, GLUSTER_OPT_DEBUG,
GLUSTER_DEBUG_DEFAULT);
if (s->debug_level < 0) {
s->debug_level = 0;
} else if (s->debug_level > GLUSTER_DEBUG_MAX) {
s->debug_level = GLUSTER_DEBUG_MAX;
}
gconf->debug_level = s->debug_level;
s->glfs = qemu_gluster_init(gconf, filename, errp);
if (!s->glfs) {
ret = -errno;
@@ -338,6 +378,8 @@ static int qemu_gluster_open(BlockDriverState *bs, QDict *options,
ret = -errno;
}
s->supports_seek_data = qemu_gluster_test_seek(s->fd);
out:
qemu_opts_del(opts);
qemu_gluster_gconf_free(gconf);
@@ -363,6 +405,7 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
BlockReopenQueue *queue, Error **errp)
{
int ret = 0;
BDRVGlusterState *s;
BDRVGlusterReopenState *reop_s;
GlusterConf *gconf = NULL;
int open_flags = 0;
@@ -370,6 +413,8 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
assert(state != NULL);
assert(state->bs != NULL);
s = state->bs->opaque;
state->opaque = g_new0(BDRVGlusterReopenState, 1);
reop_s = state->opaque;
@@ -377,6 +422,7 @@ static int qemu_gluster_reopen_prepare(BDRVReopenState *state,
gconf = g_new0(GlusterConf, 1);
gconf->debug_level = s->debug_level;
reop_s->glfs = qemu_gluster_init(gconf, state->bs->filename, errp);
if (reop_s->glfs == NULL) {
ret = -errno;
@@ -454,14 +500,12 @@ static void qemu_gluster_reopen_abort(BDRVReopenState *state)
}
#ifdef CONFIG_GLUSTERFS_ZEROFILL
static coroutine_fn int qemu_gluster_co_write_zeroes(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
static coroutine_fn int qemu_gluster_co_pwrite_zeroes(BlockDriverState *bs,
int64_t offset, int size, BdrvRequestFlags flags)
{
int ret;
GlusterAIOCB acb;
BDRVGlusterState *s = bs->opaque;
off_t size = nb_sectors * BDRV_SECTOR_SIZE;
off_t offset = sector_num * BDRV_SECTOR_SIZE;
acb.size = size;
acb.ret = 0;
@@ -512,6 +556,14 @@ static int qemu_gluster_create(const char *filename,
char *tmp = NULL;
GlusterConf *gconf = g_new0(GlusterConf, 1);
gconf->debug_level = qemu_opt_get_number_del(opts, GLUSTER_OPT_DEBUG,
GLUSTER_DEBUG_DEFAULT);
if (gconf->debug_level < 0) {
gconf->debug_level = 0;
} else if (gconf->debug_level > GLUSTER_DEBUG_MAX) {
gconf->debug_level = GLUSTER_DEBUG_MAX;
}
glfs = qemu_gluster_init(gconf, filename, errp);
if (!glfs) {
ret = -errno;
@@ -729,6 +781,159 @@ static int qemu_gluster_has_zero_init(BlockDriverState *bs)
return 0;
}
/*
* Find allocation range in @bs around offset @start.
* May change underlying file descriptor's file offset.
* If @start is not in a hole, store @start in @data, and the
* beginning of the next hole in @hole, and return 0.
* If @start is in a non-trailing hole, store @start in @hole and the
* beginning of the next non-hole in @data, and return 0.
* If @start is in a trailing hole or beyond EOF, return -ENXIO.
* If we can't find out, return a negative errno other than -ENXIO.
*
* (Shamefully copied from raw-posix.c, only miniscule adaptions.)
*/
static int find_allocation(BlockDriverState *bs, off_t start,
off_t *data, off_t *hole)
{
BDRVGlusterState *s = bs->opaque;
off_t offs;
if (!s->supports_seek_data) {
return -ENOTSUP;
}
/*
* SEEK_DATA cases:
* D1. offs == start: start is in data
* D2. offs > start: start is in a hole, next data at offs
* D3. offs < 0, errno = ENXIO: either start is in a trailing hole
* or start is beyond EOF
* If the latter happens, the file has been truncated behind
* our back since we opened it. All bets are off then.
* Treating like a trailing hole is simplest.
* D4. offs < 0, errno != ENXIO: we learned nothing
*/
offs = glfs_lseek(s->fd, start, SEEK_DATA);
if (offs < 0) {
return -errno; /* D3 or D4 */
}
assert(offs >= start);
if (offs > start) {
/* D2: in hole, next data at offs */
*hole = start;
*data = offs;
return 0;
}
/* D1: in data, end not yet known */
/*
* SEEK_HOLE cases:
* H1. offs == start: start is in a hole
* If this happens here, a hole has been dug behind our back
* since the previous lseek().
* H2. offs > start: either start is in data, next hole at offs,
* or start is in trailing hole, EOF at offs
* Linux treats trailing holes like any other hole: offs ==
* start. Solaris seeks to EOF instead: offs > start (blech).
* If that happens here, a hole has been dug behind our back
* since the previous lseek().
* H3. offs < 0, errno = ENXIO: start is beyond EOF
* If this happens, the file has been truncated behind our
* back since we opened it. Treat it like a trailing hole.
* H4. offs < 0, errno != ENXIO: we learned nothing
* Pretend we know nothing at all, i.e. "forget" about D1.
*/
offs = glfs_lseek(s->fd, start, SEEK_HOLE);
if (offs < 0) {
return -errno; /* D1 and (H3 or H4) */
}
assert(offs >= start);
if (offs > start) {
/*
* D1 and H2: either in data, next hole at offs, or it was in
* data but is now in a trailing hole. In the latter case,
* all bets are off. Treating it as if it there was data all
* the way to EOF is safe, so simply do that.
*/
*data = start;
*hole = offs;
return 0;
}
/* D1 and H1 */
return -EBUSY;
}
/*
* Returns the allocation status of the specified sectors.
*
* If 'sector_num' is beyond the end of the disk image the return value is 0
* and 'pnum' is set to 0.
*
* 'pnum' is set to the number of sectors (including and immediately following
* the specified sector) that are known to be in the same
* allocated/unallocated state.
*
* 'nb_sectors' is the max value 'pnum' should be set to. If nb_sectors goes
* beyond the end of the disk image it will be clamped.
*
* (Based on raw_co_get_block_status() from raw-posix.c.)
*/
static int64_t coroutine_fn qemu_gluster_co_get_block_status(
BlockDriverState *bs, int64_t sector_num, int nb_sectors, int *pnum,
BlockDriverState **file)
{
BDRVGlusterState *s = bs->opaque;
off_t start, data = 0, hole = 0;
int64_t total_size;
int ret = -EINVAL;
if (!s->fd) {
return ret;
}
start = sector_num * BDRV_SECTOR_SIZE;
total_size = bdrv_getlength(bs);
if (total_size < 0) {
return total_size;
} else if (start >= total_size) {
*pnum = 0;
return 0;
} else if (start + nb_sectors * BDRV_SECTOR_SIZE > total_size) {
nb_sectors = DIV_ROUND_UP(total_size - start, BDRV_SECTOR_SIZE);
}
ret = find_allocation(bs, start, &data, &hole);
if (ret == -ENXIO) {
/* Trailing hole */
*pnum = nb_sectors;
ret = BDRV_BLOCK_ZERO;
} else if (ret < 0) {
/* No info available, so pretend there are no holes */
*pnum = nb_sectors;
ret = BDRV_BLOCK_DATA;
} else if (data == start) {
/* On a data extent, compute sectors to the end of the extent,
* possibly including a partial sector at EOF. */
*pnum = MIN(nb_sectors, DIV_ROUND_UP(hole - start, BDRV_SECTOR_SIZE));
ret = BDRV_BLOCK_DATA;
} else {
/* On a hole, compute sectors to the beginning of the next extent. */
assert(hole == start);
*pnum = MIN(nb_sectors, (data - start) / BDRV_SECTOR_SIZE);
ret = BDRV_BLOCK_ZERO;
}
*file = bs;
return ret | BDRV_BLOCK_OFFSET_VALID | start;
}
static QemuOptsList qemu_gluster_create_opts = {
.name = "qemu-gluster-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(qemu_gluster_create_opts.head),
@@ -743,6 +948,11 @@ static QemuOptsList qemu_gluster_create_opts = {
.type = QEMU_OPT_STRING,
.help = "Preallocation mode (allowed values: off, full)"
},
{
.name = GLUSTER_OPT_DEBUG,
.type = QEMU_OPT_NUMBER,
.help = "Gluster log level, valid range is 0-9",
},
{ /* end of list */ }
}
};
@@ -769,8 +979,9 @@ static BlockDriver bdrv_gluster = {
.bdrv_co_discard = qemu_gluster_co_discard,
#endif
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
.bdrv_co_pwrite_zeroes = qemu_gluster_co_pwrite_zeroes,
#endif
.bdrv_co_get_block_status = qemu_gluster_co_get_block_status,
.create_opts = &qemu_gluster_create_opts,
};
@@ -796,8 +1007,9 @@ static BlockDriver bdrv_gluster_tcp = {
.bdrv_co_discard = qemu_gluster_co_discard,
#endif
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
.bdrv_co_pwrite_zeroes = qemu_gluster_co_pwrite_zeroes,
#endif
.bdrv_co_get_block_status = qemu_gluster_co_get_block_status,
.create_opts = &qemu_gluster_create_opts,
};
@@ -823,8 +1035,9 @@ static BlockDriver bdrv_gluster_unix = {
.bdrv_co_discard = qemu_gluster_co_discard,
#endif
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
.bdrv_co_pwrite_zeroes = qemu_gluster_co_pwrite_zeroes,
#endif
.bdrv_co_get_block_status = qemu_gluster_co_get_block_status,
.create_opts = &qemu_gluster_create_opts,
};
@@ -850,8 +1063,9 @@ static BlockDriver bdrv_gluster_rdma = {
.bdrv_co_discard = qemu_gluster_co_discard,
#endif
#ifdef CONFIG_GLUSTERFS_ZEROFILL
.bdrv_co_write_zeroes = qemu_gluster_co_write_zeroes,
.bdrv_co_pwrite_zeroes = qemu_gluster_co_pwrite_zeroes,
#endif
.bdrv_co_get_block_status = qemu_gluster_co_get_block_status,
.create_opts = &qemu_gluster_create_opts,
};

1265
block/io.c

File diff suppressed because it is too large Load Diff

View File

@@ -401,18 +401,26 @@ static int64_t sector_qemu2lun(int64_t sector, IscsiLun *iscsilun)
return sector * BDRV_SECTOR_SIZE / iscsilun->block_size;
}
static bool is_request_lun_aligned(int64_t sector_num, int nb_sectors,
IscsiLun *iscsilun)
static bool is_byte_request_lun_aligned(int64_t offset, int count,
IscsiLun *iscsilun)
{
if ((sector_num * BDRV_SECTOR_SIZE) % iscsilun->block_size ||
(nb_sectors * BDRV_SECTOR_SIZE) % iscsilun->block_size) {
error_report("iSCSI misaligned request: "
"iscsilun->block_size %u, sector_num %" PRIi64
", nb_sectors %d",
iscsilun->block_size, sector_num, nb_sectors);
return 0;
if (offset % iscsilun->block_size || count % iscsilun->block_size) {
error_report("iSCSI misaligned request: "
"iscsilun->block_size %u, offset %" PRIi64
", count %d",
iscsilun->block_size, offset, count);
return false;
}
return 1;
return true;
}
static bool is_sector_request_lun_aligned(int64_t sector_num, int nb_sectors,
IscsiLun *iscsilun)
{
assert(nb_sectors <= BDRV_REQUEST_MAX_SECTORS);
return is_byte_request_lun_aligned(sector_num << BDRV_SECTOR_BITS,
nb_sectors << BDRV_SECTOR_BITS,
iscsilun);
}
static unsigned long *iscsi_allocationmap_init(IscsiLun *iscsilun)
@@ -456,9 +464,12 @@ iscsi_co_writev_flags(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
struct IscsiTask iTask;
uint64_t lba;
uint32_t num_sectors;
bool fua;
bool fua = flags & BDRV_REQ_FUA;
if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
if (fua) {
assert(iscsilun->dpofua);
}
if (!is_sector_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
return -EINVAL;
}
@@ -472,7 +483,6 @@ iscsi_co_writev_flags(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
num_sectors = sector_qemu2lun(nb_sectors, iscsilun);
iscsi_co_init_iscsitask(iscsilun, &iTask);
retry:
fua = iscsilun->dpofua && (flags & BDRV_REQ_FUA);
if (iscsilun->use_16_for_rw) {
iTask.task = iscsi_write16_task(iscsilun->iscsi, iscsilun->lun, lba,
NULL, num_sectors * iscsilun->block_size,
@@ -513,13 +523,6 @@ retry:
return 0;
}
static int coroutine_fn
iscsi_co_writev(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
QEMUIOVector *iov)
{
return iscsi_co_writev_flags(bs, sector_num, nb_sectors, iov, 0);
}
static bool iscsi_allocationmap_is_allocated(IscsiLun *iscsilun,
int64_t sector_num, int nb_sectors)
@@ -546,7 +549,7 @@ static int64_t coroutine_fn iscsi_co_get_block_status(BlockDriverState *bs,
iscsi_co_init_iscsitask(iscsilun, &iTask);
if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
if (!is_sector_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
ret = -EINVAL;
goto out;
}
@@ -643,7 +646,7 @@ static int coroutine_fn iscsi_co_readv(BlockDriverState *bs,
uint64_t lba;
uint32_t num_sectors;
if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
if (!is_sector_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
return -EINVAL;
}
@@ -658,7 +661,8 @@ static int coroutine_fn iscsi_co_readv(BlockDriverState *bs,
int64_t ret;
int pnum;
BlockDriverState *file;
ret = iscsi_co_get_block_status(bs, sector_num, INT_MAX, &pnum, &file);
ret = iscsi_co_get_block_status(bs, sector_num,
BDRV_REQUEST_MAX_SECTORS, &pnum, &file);
if (ret < 0) {
return ret;
}
@@ -766,6 +770,7 @@ iscsi_aio_ioctl_cb(struct iscsi_context *iscsi, int status,
acb->ioh->driver_status = 0;
acb->ioh->host_status = 0;
acb->ioh->resid = 0;
acb->ioh->status = status;
#define SG_ERR_DRIVER_SENSE 0x08
@@ -837,6 +842,13 @@ static BlockAIOCB *iscsi_aio_ioctl(BlockDriverState *bs,
return &acb->common;
}
if (acb->ioh->cmd_len > SCSI_CDB_MAX_SIZE) {
error_report("iSCSI: ioctl error CDB exceeds max size (%d > %d)",
acb->ioh->cmd_len, SCSI_CDB_MAX_SIZE);
qemu_aio_unref(acb);
return NULL;
}
acb->task = malloc(sizeof(struct scsi_task));
if (acb->task == NULL) {
error_report("iSCSI: Failed to allocate task for scsi command. %s",
@@ -923,7 +935,7 @@ coroutine_fn iscsi_co_discard(BlockDriverState *bs, int64_t sector_num,
struct IscsiTask iTask;
struct unmap_list list;
if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
if (!is_sector_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
return -EINVAL;
}
@@ -974,8 +986,8 @@ retry:
}
static int
coroutine_fn iscsi_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, BdrvRequestFlags flags)
coroutine_fn iscsi_co_pwrite_zeroes(BlockDriverState *bs, int64_t offset,
int count, BdrvRequestFlags flags)
{
IscsiLun *iscsilun = bs->opaque;
struct IscsiTask iTask;
@@ -983,8 +995,8 @@ coroutine_fn iscsi_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
uint32_t nb_blocks;
bool use_16_for_ws = iscsilun->use_16_for_rw;
if (!is_request_lun_aligned(sector_num, nb_sectors, iscsilun)) {
return -EINVAL;
if (!is_byte_request_lun_aligned(offset, count, iscsilun)) {
return -ENOTSUP;
}
if (flags & BDRV_REQ_MAY_UNMAP) {
@@ -1005,8 +1017,8 @@ coroutine_fn iscsi_co_write_zeroes(BlockDriverState *bs, int64_t sector_num,
return -ENOTSUP;
}
lba = sector_qemu2lun(sector_num, iscsilun);
nb_blocks = sector_qemu2lun(nb_sectors, iscsilun);
lba = offset / iscsilun->block_size;
nb_blocks = count / iscsilun->block_size;
if (iscsilun->zeroblock == NULL) {
iscsilun->zeroblock = g_try_malloc0(iscsilun->block_size);
@@ -1062,9 +1074,11 @@ retry:
}
if (flags & BDRV_REQ_MAY_UNMAP) {
iscsi_allocationmap_clear(iscsilun, sector_num, nb_sectors);
iscsi_allocationmap_clear(iscsilun, offset >> BDRV_SECTOR_BITS,
count >> BDRV_SECTOR_BITS);
} else {
iscsi_allocationmap_set(iscsilun, sector_num, nb_sectors);
iscsi_allocationmap_set(iscsilun, offset >> BDRV_SECTOR_BITS,
count >> BDRV_SECTOR_BITS);
}
return 0;
@@ -1555,6 +1569,10 @@ static int iscsi_open(BlockDriverState *bs, QDict *options, int flags,
task = NULL;
iscsi_modesense_sync(iscsilun);
if (iscsilun->dpofua) {
bs->supported_write_flags = BDRV_REQ_FUA;
}
bs->supported_zero_flags = BDRV_REQ_MAY_UNMAP;
/* Check the write protect flag of the LUN if we want to write */
if (iscsilun->type == TYPE_DISK && (flags & BDRV_O_RDWR) &&
@@ -1704,15 +1722,19 @@ static void iscsi_refresh_limits(BlockDriverState *bs, Error **errp)
}
bs->bl.discard_alignment =
sector_limits_lun2qemu(iscsilun->bl.opt_unmap_gran, iscsilun);
} else {
bs->bl.discard_alignment = iscsilun->block_size >> BDRV_SECTOR_BITS;
}
if (iscsilun->bl.max_ws_len < 0xffffffff) {
bs->bl.max_write_zeroes =
sector_limits_lun2qemu(iscsilun->bl.max_ws_len, iscsilun);
if (iscsilun->bl.max_ws_len < 0xffffffff / iscsilun->block_size) {
bs->bl.max_pwrite_zeroes =
iscsilun->bl.max_ws_len * iscsilun->block_size;
}
if (iscsilun->lbp.lbpws) {
bs->bl.write_zeroes_alignment =
sector_limits_lun2qemu(iscsilun->bl.opt_unmap_gran, iscsilun);
bs->bl.pwrite_zeroes_alignment =
iscsilun->bl.opt_unmap_gran * iscsilun->block_size;
} else {
bs->bl.pwrite_zeroes_alignment = iscsilun->block_size;
}
bs->bl.opt_transfer_length =
sector_limits_lun2qemu(iscsilun->bl.opt_xfer_len, iscsilun);
@@ -1845,11 +1867,9 @@ static BlockDriver bdrv_iscsi = {
.bdrv_co_get_block_status = iscsi_co_get_block_status,
.bdrv_co_discard = iscsi_co_discard,
.bdrv_co_write_zeroes = iscsi_co_write_zeroes,
.bdrv_co_pwrite_zeroes = iscsi_co_pwrite_zeroes,
.bdrv_co_readv = iscsi_co_readv,
.bdrv_co_writev = iscsi_co_writev,
.bdrv_co_writev_flags = iscsi_co_writev_flags,
.supported_write_flags = BDRV_REQ_FUA,
.bdrv_co_flush_to_disk = iscsi_co_flush,
#ifdef __linux__

View File

@@ -11,8 +11,10 @@
#include "qemu-common.h"
#include "block/aio.h"
#include "qemu/queue.h"
#include "block/block.h"
#include "block/raw-aio.h"
#include "qemu/event_notifier.h"
#include "qemu/coroutine.h"
#include <libaio.h>
@@ -30,7 +32,8 @@
struct qemu_laiocb {
BlockAIOCB common;
struct qemu_laio_state *ctx;
Coroutine *co;
LinuxAioState *ctx;
struct iocb iocb;
ssize_t ret;
size_t nbytes;
@@ -46,7 +49,7 @@ typedef struct {
QSIMPLEQ_HEAD(, qemu_laiocb) pending;
} LaioQueue;
struct qemu_laio_state {
struct LinuxAioState {
io_context_t ctx;
EventNotifier e;
@@ -60,7 +63,7 @@ struct qemu_laio_state {
int event_max;
};
static void ioq_submit(struct qemu_laio_state *s);
static void ioq_submit(LinuxAioState *s);
static inline ssize_t io_event_ret(struct io_event *ev)
{
@@ -70,8 +73,7 @@ static inline ssize_t io_event_ret(struct io_event *ev)
/*
* Completes an AIO request (calls the callback and frees the ACB).
*/
static void qemu_laio_process_completion(struct qemu_laio_state *s,
struct qemu_laiocb *laiocb)
static void qemu_laio_process_completion(struct qemu_laiocb *laiocb)
{
int ret;
@@ -89,9 +91,14 @@ static void qemu_laio_process_completion(struct qemu_laio_state *s,
}
}
}
laiocb->common.cb(laiocb->common.opaque, ret);
qemu_aio_unref(laiocb);
laiocb->ret = ret;
if (laiocb->co) {
qemu_coroutine_enter(laiocb->co, NULL);
} else {
laiocb->common.cb(laiocb->common.opaque, ret);
qemu_aio_unref(laiocb);
}
}
/* The completion BH fetches completed I/O requests and invokes their
@@ -99,7 +106,7 @@ static void qemu_laio_process_completion(struct qemu_laio_state *s,
*
* The function is somewhat tricky because it supports nested event loops, for
* example when a request callback invokes aio_poll(). In order to do this,
* the completion events array and index are kept in qemu_laio_state. The BH
* the completion events array and index are kept in LinuxAioState. The BH
* reschedules itself as long as there are completions pending so it will
* either be called again in a nested event loop or will be called after all
* events have been completed. When there are no events left to complete, the
@@ -107,7 +114,7 @@ static void qemu_laio_process_completion(struct qemu_laio_state *s,
*/
static void qemu_laio_completion_bh(void *opaque)
{
struct qemu_laio_state *s = opaque;
LinuxAioState *s = opaque;
/* Fetch more completion events when empty */
if (s->event_idx == s->event_max) {
@@ -136,20 +143,22 @@ static void qemu_laio_completion_bh(void *opaque)
laiocb->ret = io_event_ret(&s->events[s->event_idx]);
s->event_idx++;
qemu_laio_process_completion(s, laiocb);
qemu_laio_process_completion(laiocb);
}
if (!s->io_q.plugged && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
ioq_submit(s);
}
qemu_bh_cancel(s->completion_bh);
}
static void qemu_laio_completion_cb(EventNotifier *e)
{
struct qemu_laio_state *s = container_of(e, struct qemu_laio_state, e);
LinuxAioState *s = container_of(e, LinuxAioState, e);
if (event_notifier_test_and_clear(&s->e)) {
qemu_bh_schedule(s->completion_bh);
qemu_laio_completion_bh(s);
}
}
@@ -185,7 +194,7 @@ static void ioq_init(LaioQueue *io_q)
io_q->blocked = false;
}
static void ioq_submit(struct qemu_laio_state *s)
static void ioq_submit(LinuxAioState *s)
{
int ret, len;
struct qemu_laiocb *aiocb;
@@ -216,45 +225,27 @@ static void ioq_submit(struct qemu_laio_state *s)
s->io_q.blocked = (s->io_q.n > 0);
}
void laio_io_plug(BlockDriverState *bs, void *aio_ctx)
void laio_io_plug(BlockDriverState *bs, LinuxAioState *s)
{
struct qemu_laio_state *s = aio_ctx;
s->io_q.plugged++;
assert(!s->io_q.plugged);
s->io_q.plugged = 1;
}
void laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug)
void laio_io_unplug(BlockDriverState *bs, LinuxAioState *s)
{
struct qemu_laio_state *s = aio_ctx;
assert(s->io_q.plugged > 0 || !unplug);
if (unplug && --s->io_q.plugged > 0) {
return;
}
assert(s->io_q.plugged);
s->io_q.plugged = 0;
if (!s->io_q.blocked && !QSIMPLEQ_EMPTY(&s->io_q.pending)) {
ioq_submit(s);
}
}
BlockAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque, int type)
static int laio_do_submit(int fd, struct qemu_laiocb *laiocb, off_t offset,
int type)
{
struct qemu_laio_state *s = aio_ctx;
struct qemu_laiocb *laiocb;
struct iocb *iocbs;
off_t offset = sector_num * 512;
laiocb = qemu_aio_get(&laio_aiocb_info, bs, cb, opaque);
laiocb->nbytes = nb_sectors * 512;
laiocb->ctx = s;
laiocb->ret = -EINPROGRESS;
laiocb->is_read = (type == QEMU_AIO_READ);
laiocb->qiov = qiov;
iocbs = &laiocb->iocb;
LinuxAioState *s = laiocb->ctx;
struct iocb *iocbs = &laiocb->iocb;
QEMUIOVector *qiov = laiocb->qiov;
switch (type) {
case QEMU_AIO_WRITE:
@@ -267,7 +258,7 @@ BlockAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
default:
fprintf(stderr, "%s: invalid AIO request type 0x%x.\n",
__func__, type);
goto out_free_aiocb;
return -EIO;
}
io_set_eventfd(&laiocb->iocb, event_notifier_get_fd(&s->e));
@@ -277,33 +268,71 @@ BlockAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
(!s->io_q.plugged || s->io_q.n >= MAX_QUEUED_IO)) {
ioq_submit(s);
}
return &laiocb->common;
out_free_aiocb:
qemu_aio_unref(laiocb);
return NULL;
return 0;
}
void laio_detach_aio_context(void *s_, AioContext *old_context)
int coroutine_fn laio_co_submit(BlockDriverState *bs, LinuxAioState *s, int fd,
uint64_t offset, QEMUIOVector *qiov, int type)
{
struct qemu_laio_state *s = s_;
int ret;
struct qemu_laiocb laiocb = {
.co = qemu_coroutine_self(),
.nbytes = qiov->size,
.ctx = s,
.is_read = (type == QEMU_AIO_READ),
.qiov = qiov,
};
ret = laio_do_submit(fd, &laiocb, offset, type);
if (ret < 0) {
return ret;
}
qemu_coroutine_yield();
return laiocb.ret;
}
BlockAIOCB *laio_submit(BlockDriverState *bs, LinuxAioState *s, int fd,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque, int type)
{
struct qemu_laiocb *laiocb;
off_t offset = sector_num * BDRV_SECTOR_SIZE;
int ret;
laiocb = qemu_aio_get(&laio_aiocb_info, bs, cb, opaque);
laiocb->nbytes = nb_sectors * BDRV_SECTOR_SIZE;
laiocb->ctx = s;
laiocb->ret = -EINPROGRESS;
laiocb->is_read = (type == QEMU_AIO_READ);
laiocb->qiov = qiov;
ret = laio_do_submit(fd, laiocb, offset, type);
if (ret < 0) {
qemu_aio_unref(laiocb);
return NULL;
}
return &laiocb->common;
}
void laio_detach_aio_context(LinuxAioState *s, AioContext *old_context)
{
aio_set_event_notifier(old_context, &s->e, false, NULL);
qemu_bh_delete(s->completion_bh);
}
void laio_attach_aio_context(void *s_, AioContext *new_context)
void laio_attach_aio_context(LinuxAioState *s, AioContext *new_context)
{
struct qemu_laio_state *s = s_;
s->completion_bh = aio_bh_new(new_context, qemu_laio_completion_bh, s);
aio_set_event_notifier(new_context, &s->e, false,
qemu_laio_completion_cb);
}
void *laio_init(void)
LinuxAioState *laio_init(void)
{
struct qemu_laio_state *s;
LinuxAioState *s;
s = g_malloc0(sizeof(*s));
if (event_notifier_init(&s->e, false) < 0) {
@@ -325,10 +354,8 @@ out_free_state:
return NULL;
}
void laio_cleanup(void *s_)
void laio_cleanup(LinuxAioState *s)
{
struct qemu_laio_state *s = s_;
event_notifier_cleanup(&s->e);
if (io_destroy(s->ctx) != 0) {

View File

@@ -20,7 +20,6 @@
#include "qapi/qmp/qerror.h"
#include "qemu/ratelimit.h"
#include "qemu/bitmap.h"
#include "qemu/error-report.h"
#define SLICE_TIME 100000000ULL /* ns */
#define MAX_IN_FLIGHT 16
@@ -36,7 +35,7 @@ typedef struct MirrorBuffer {
typedef struct MirrorBlockJob {
BlockJob common;
RateLimit limit;
BlockDriverState *target;
BlockBackend *target;
BlockDriverState *base;
/* The name of the graph node to replace */
char *replaces;
@@ -45,6 +44,7 @@ typedef struct MirrorBlockJob {
/* Used to block operations on the drive-mirror-replace target */
Error *replace_blocker;
bool is_none_mode;
BlockMirrorBackingMode backing_mode;
BlockdevOnError on_source_error, on_target_error;
bool synced;
bool should_complete;
@@ -80,11 +80,11 @@ static BlockErrorAction mirror_error_action(MirrorBlockJob *s, bool read,
{
s->synced = false;
if (read) {
return block_job_error_action(&s->common, s->common.bs,
s->on_source_error, true, error);
return block_job_error_action(&s->common, s->on_source_error,
true, error);
} else {
return block_job_error_action(&s->common, s->target,
s->on_target_error, false, error);
return block_job_error_action(&s->common, s->on_target_error,
false, error);
}
}
@@ -157,8 +157,8 @@ static void mirror_read_complete(void *opaque, int ret)
mirror_iteration_done(op, ret);
return;
}
bdrv_aio_writev(s->target, op->sector_num, &op->qiov, op->nb_sectors,
mirror_write_complete, op);
blk_aio_pwritev(s->target, op->sector_num * BDRV_SECTOR_SIZE, &op->qiov,
0, mirror_write_complete, op);
}
static inline void mirror_clip_sectors(MirrorBlockJob *s,
@@ -186,8 +186,9 @@ static int mirror_cow_align(MirrorBlockJob *s,
need_cow |= !test_bit((*sector_num + *nb_sectors - 1) / chunk_sectors,
s->cow_bitmap);
if (need_cow) {
bdrv_round_to_clusters(s->target, *sector_num, *nb_sectors,
&align_sector_num, &align_nb_sectors);
bdrv_round_sectors_to_clusters(blk_bs(s->target), *sector_num,
*nb_sectors, &align_sector_num,
&align_nb_sectors);
}
if (align_nb_sectors > max_sectors) {
@@ -217,23 +218,29 @@ static inline void mirror_wait_for_io(MirrorBlockJob *s)
}
/* Submit async read while handling COW.
* Returns: nb_sectors if no alignment is necessary, or
* Returns: The number of sectors copied after and including sector_num,
* excluding any sectors copied prior to sector_num due to alignment.
* This will be nb_sectors if no alignment is necessary, or
* (new_end - sector_num) if tail is rounded up or down due to
* alignment or buffer limit.
*/
static int mirror_do_read(MirrorBlockJob *s, int64_t sector_num,
int nb_sectors)
{
BlockDriverState *source = s->common.bs;
BlockBackend *source = s->common.blk;
int sectors_per_chunk, nb_chunks;
int ret = nb_sectors;
int ret;
MirrorOp *op;
int max_sectors;
sectors_per_chunk = s->granularity >> BDRV_SECTOR_BITS;
max_sectors = sectors_per_chunk * s->max_iov;
/* We can only handle as much as buf_size at a time. */
nb_sectors = MIN(s->buf_size >> BDRV_SECTOR_BITS, nb_sectors);
nb_sectors = MIN(max_sectors, nb_sectors);
assert(nb_sectors);
ret = nb_sectors;
if (s->cow_bitmap) {
ret += mirror_cow_align(s, &sector_num, &nb_sectors);
@@ -274,7 +281,7 @@ static int mirror_do_read(MirrorBlockJob *s, int64_t sector_num,
s->sectors_in_flight += nb_sectors;
trace_mirror_one_iteration(s, sector_num, nb_sectors);
bdrv_aio_readv(source, sector_num, &op->qiov, nb_sectors,
blk_aio_preadv(source, sector_num * BDRV_SECTOR_SIZE, &op->qiov, 0,
mirror_read_complete, op);
return ret;
}
@@ -296,10 +303,11 @@ static void mirror_do_zero_or_discard(MirrorBlockJob *s,
s->in_flight++;
s->sectors_in_flight += nb_sectors;
if (is_discard) {
bdrv_aio_discard(s->target, sector_num, op->nb_sectors,
mirror_write_complete, op);
blk_aio_discard(s->target, sector_num, op->nb_sectors,
mirror_write_complete, op);
} else {
bdrv_aio_write_zeroes(s->target, sector_num, op->nb_sectors,
blk_aio_pwrite_zeroes(s->target, sector_num * BDRV_SECTOR_SIZE,
op->nb_sectors * BDRV_SECTOR_SIZE,
s->unmap ? BDRV_REQ_MAY_UNMAP : 0,
mirror_write_complete, op);
}
@@ -307,7 +315,7 @@ static void mirror_do_zero_or_discard(MirrorBlockJob *s,
static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
{
BlockDriverState *source = s->common.bs;
BlockDriverState *source = blk_bs(s->common.blk);
int64_t sector_num, first_chunk;
uint64_t delay_ns = 0;
/* At least the first dirty chunk is mirrored in one iteration. */
@@ -325,10 +333,12 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
first_chunk = sector_num / sectors_per_chunk;
while (test_bit(first_chunk, s->in_flight_bitmap)) {
trace_mirror_yield_in_flight(s, first_chunk, s->in_flight);
trace_mirror_yield_in_flight(s, sector_num, s->in_flight);
mirror_wait_for_io(s);
}
block_job_pause_point(&s->common);
/* Find the number of consective dirty chunks following the first dirty
* one, and wait for in flight requests in them. */
while (nb_chunks * sectors_per_chunk < (s->buf_size >> BDRV_SECTOR_BITS)) {
@@ -384,8 +394,9 @@ static uint64_t coroutine_fn mirror_iteration(MirrorBlockJob *s)
} else if (ret >= 0 && !(ret & BDRV_BLOCK_DATA)) {
int64_t target_sector_num;
int target_nb_sectors;
bdrv_round_to_clusters(s->target, sector_num, io_sectors,
&target_sector_num, &target_nb_sectors);
bdrv_round_sectors_to_clusters(blk_bs(s->target), sector_num,
io_sectors, &target_sector_num,
&target_nb_sectors);
if (target_sector_num == sector_num &&
target_nb_sectors == io_sectors) {
mirror_method = ret & BDRV_BLOCK_ZERO ?
@@ -449,7 +460,8 @@ static void mirror_exit(BlockJob *job, void *opaque)
MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
MirrorExitData *data = opaque;
AioContext *replace_aio_context = NULL;
BlockDriverState *src = s->common.bs;
BlockDriverState *src = blk_bs(s->common.blk);
BlockDriverState *target_bs = blk_bs(s->target);
/* Make sure that the source BDS doesn't go away before we called
* block_job_completed(). */
@@ -461,26 +473,25 @@ static void mirror_exit(BlockJob *job, void *opaque)
}
if (s->should_complete && data->ret == 0) {
BlockDriverState *to_replace = s->common.bs;
BlockDriverState *to_replace = src;
if (s->to_replace) {
to_replace = s->to_replace;
}
/* This was checked in mirror_start_job(), but meanwhile one of the
* nodes could have been newly attached to a BlockBackend. */
if (to_replace->blk && s->target->blk) {
error_report("block job: Can't create node with two BlockBackends");
data->ret = -EINVAL;
goto out;
if (bdrv_get_flags(target_bs) != bdrv_get_flags(to_replace)) {
bdrv_reopen(target_bs, bdrv_get_flags(to_replace), NULL);
}
if (bdrv_get_flags(s->target) != bdrv_get_flags(to_replace)) {
bdrv_reopen(s->target, bdrv_get_flags(to_replace), NULL);
}
bdrv_replace_in_backing_chain(to_replace, s->target);
/* The mirror job has no requests in flight any more, but we need to
* drain potential other users of the BDS before changing the graph. */
bdrv_drained_begin(target_bs);
bdrv_replace_in_backing_chain(to_replace, target_bs);
bdrv_drained_end(target_bs);
/* We just changed the BDS the job BB refers to */
blk_remove_bs(job->blk);
blk_insert_bs(job->blk, src);
}
out:
if (s->to_replace) {
bdrv_op_unblock_all(s->to_replace, s->replace_blocker);
error_free(s->replace_blocker);
@@ -490,8 +501,8 @@ out:
aio_context_release(replace_aio_context);
}
g_free(s->replaces);
bdrv_op_unblock_all(s->target, s->common.blocker);
bdrv_unref(s->target);
bdrv_op_unblock_all(target_bs, s->common.blocker);
blk_unref(s->target);
block_job_completed(&s->common, data->ret);
g_free(data);
bdrv_drained_end(src);
@@ -505,7 +516,8 @@ static void coroutine_fn mirror_run(void *opaque)
{
MirrorBlockJob *s = opaque;
MirrorExitData *data;
BlockDriverState *bs = s->common.bs;
BlockDriverState *bs = blk_bs(s->common.blk);
BlockDriverState *target_bs = blk_bs(s->target);
int64_t sector_num, end, length;
uint64_t last_pause_ns;
BlockDriverInfo bdi;
@@ -541,18 +553,18 @@ static void coroutine_fn mirror_run(void *opaque)
* the destination do COW. Instead, we copy sectors around the
* dirty data if needed. We need a bitmap to do that.
*/
bdrv_get_backing_filename(s->target, backing_filename,
bdrv_get_backing_filename(target_bs, backing_filename,
sizeof(backing_filename));
if (!bdrv_get_info(s->target, &bdi) && bdi.cluster_size) {
if (!bdrv_get_info(target_bs, &bdi) && bdi.cluster_size) {
target_cluster_size = bdi.cluster_size;
}
if (backing_filename[0] && !s->target->backing
if (backing_filename[0] && !target_bs->backing
&& s->granularity < target_cluster_size) {
s->buf_size = MAX(s->buf_size, target_cluster_size);
s->cow_bitmap = bitmap_new(length);
}
s->target_cluster_sectors = target_cluster_size >> BDRV_SECTOR_BITS;
s->max_iov = MIN(s->common.bs->bl.max_iov, s->target->bl.max_iov);
s->max_iov = MIN(bs->bl.max_iov, target_bs->bl.max_iov);
end = s->bdev_length / BDRV_SECTOR_SIZE;
s->buf = qemu_try_blockalign(bs, s->buf_size);
@@ -567,7 +579,7 @@ static void coroutine_fn mirror_run(void *opaque)
if (!s->is_none_mode) {
/* First part, loop on the sectors and initialize the dirty bitmap. */
BlockDriverState *base = s->base;
bool mark_all_dirty = s->base == NULL && !bdrv_has_zero_init(s->target);
bool mark_all_dirty = s->base == NULL && !bdrv_has_zero_init(target_bs);
for (sector_num = 0; sector_num < end; ) {
/* Just to make sure we are not exceeding int limit. */
@@ -578,6 +590,8 @@ static void coroutine_fn mirror_run(void *opaque)
if (now - last_pause_ns > SLICE_TIME) {
last_pause_ns = now;
block_job_sleep_ns(&s->common, QEMU_CLOCK_REALTIME, 0);
} else {
block_job_pause_point(&s->common);
}
if (block_job_is_cancelled(&s->common)) {
@@ -609,6 +623,8 @@ static void coroutine_fn mirror_run(void *opaque)
goto immediate_exit;
}
block_job_pause_point(&s->common);
cnt = bdrv_get_dirty_count(s->dirty_bitmap);
/* s->common.offset contains the number of bytes already processed so
* far, cnt is the number of dirty sectors remaining and
@@ -637,7 +653,7 @@ static void coroutine_fn mirror_run(void *opaque)
should_complete = false;
if (s->in_flight == 0 && cnt == 0) {
trace_mirror_before_flush(s);
ret = bdrv_flush(s->target);
ret = blk_flush(s->target);
if (ret < 0) {
if (mirror_error_action(s, false, -ret) ==
BLOCK_ERROR_ACTION_REPORT) {
@@ -710,15 +726,12 @@ immediate_exit:
g_free(s->cow_bitmap);
g_free(s->in_flight_bitmap);
bdrv_release_dirty_bitmap(bs, s->dirty_bitmap);
if (s->target->blk) {
blk_iostatus_disable(s->target->blk);
}
data = g_malloc(sizeof(*data));
data->ret = ret;
/* Before we switch to target in mirror_exit, make sure data doesn't
* change. */
bdrv_drained_begin(s->common.bs);
bdrv_drained_begin(bs);
if (qemu_get_aio_context() == bdrv_get_aio_context(bs)) {
/* FIXME: virtio host notifiers run on iohandler_ctx, therefore the
* above bdrv_drained_end isn't enough to quiesce it. This is ugly, we
@@ -739,32 +752,30 @@ static void mirror_set_speed(BlockJob *job, int64_t speed, Error **errp)
ratelimit_set_speed(&s->limit, speed / BDRV_SECTOR_SIZE, SLICE_TIME);
}
static void mirror_iostatus_reset(BlockJob *job)
{
MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
if (s->target->blk) {
blk_iostatus_reset(s->target->blk);
}
}
static void mirror_complete(BlockJob *job, Error **errp)
{
MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
Error *local_err = NULL;
int ret;
BlockDriverState *src, *target;
src = blk_bs(job->blk);
target = blk_bs(s->target);
ret = bdrv_open_backing_file(s->target, NULL, "backing", &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
return;
}
if (!s->synced) {
error_setg(errp, QERR_BLOCK_JOB_NOT_READY, job->id);
return;
}
/* check the target bs is not blocked and block all operations on it */
if (s->backing_mode == MIRROR_OPEN_BACKING_CHAIN) {
int ret;
assert(!target->backing);
ret = bdrv_open_backing_file(target, NULL, "backing", errp);
if (ret < 0) {
return;
}
}
/* block all operations on to_replace bs */
if (s->replaces) {
AioContext *replace_aio_context;
@@ -785,31 +796,57 @@ static void mirror_complete(BlockJob *job, Error **errp)
aio_context_release(replace_aio_context);
}
if (s->backing_mode == MIRROR_SOURCE_BACKING_CHAIN) {
BlockDriverState *backing = s->is_none_mode ? src : s->base;
if (backing_bs(target) != backing) {
bdrv_set_backing_hd(target, backing);
}
}
s->should_complete = true;
block_job_enter(&s->common);
}
/* There is no matching mirror_resume() because mirror_run() will begin
* iterating again when the job is resumed.
*/
static void coroutine_fn mirror_pause(BlockJob *job)
{
MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
mirror_drain(s);
}
static void mirror_attached_aio_context(BlockJob *job, AioContext *new_context)
{
MirrorBlockJob *s = container_of(job, MirrorBlockJob, common);
blk_set_aio_context(s->target, new_context);
}
static const BlockJobDriver mirror_job_driver = {
.instance_size = sizeof(MirrorBlockJob),
.job_type = BLOCK_JOB_TYPE_MIRROR,
.set_speed = mirror_set_speed,
.iostatus_reset= mirror_iostatus_reset,
.complete = mirror_complete,
.instance_size = sizeof(MirrorBlockJob),
.job_type = BLOCK_JOB_TYPE_MIRROR,
.set_speed = mirror_set_speed,
.complete = mirror_complete,
.pause = mirror_pause,
.attached_aio_context = mirror_attached_aio_context,
};
static const BlockJobDriver commit_active_job_driver = {
.instance_size = sizeof(MirrorBlockJob),
.job_type = BLOCK_JOB_TYPE_COMMIT,
.set_speed = mirror_set_speed,
.iostatus_reset
= mirror_iostatus_reset,
.complete = mirror_complete,
.instance_size = sizeof(MirrorBlockJob),
.job_type = BLOCK_JOB_TYPE_COMMIT,
.set_speed = mirror_set_speed,
.complete = mirror_complete,
.pause = mirror_pause,
.attached_aio_context = mirror_attached_aio_context,
};
static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
const char *replaces,
int64_t speed, uint32_t granularity,
int64_t buf_size,
BlockMirrorBackingMode backing_mode,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
bool unmap,
@@ -819,7 +856,6 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
bool is_none_mode, BlockDriverState *base)
{
MirrorBlockJob *s;
BlockDriverState *replaced_bs;
if (granularity == 0) {
granularity = bdrv_get_default_bitmap_granularity(target);
@@ -827,13 +863,6 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
assert ((granularity & (granularity - 1)) == 0);
if ((on_source_error == BLOCKDEV_ON_ERROR_STOP ||
on_source_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
(!bs->blk || !blk_iostatus_is_enabled(bs->blk))) {
error_setg(errp, QERR_INVALID_PARAMETER, "on-source-error");
return;
}
if (buf_size < 0) {
error_setg(errp, "Invalid parameter 'buf-size'");
return;
@@ -843,31 +872,19 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
buf_size = DEFAULT_MIRROR_BUF_SIZE;
}
/* We can't support this case as long as the block layer can't handle
* multiple BlockBackends per BlockDriverState. */
if (replaces) {
replaced_bs = bdrv_lookup_bs(replaces, replaces, errp);
if (replaced_bs == NULL) {
return;
}
} else {
replaced_bs = bs;
}
if (replaced_bs->blk && target->blk) {
error_setg(errp, "Can't create node with two BlockBackends");
return;
}
s = block_job_create(driver, bs, speed, cb, opaque, errp);
if (!s) {
return;
}
s->target = blk_new();
blk_insert_bs(s->target, target);
s->replaces = g_strdup(replaces);
s->on_source_error = on_source_error;
s->on_target_error = on_target_error;
s->target = target;
s->is_none_mode = is_none_mode;
s->backing_mode = backing_mode;
s->base = base;
s->granularity = granularity;
s->buf_size = ROUND_UP(buf_size, granularity);
@@ -876,16 +893,13 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
s->dirty_bitmap = bdrv_create_dirty_bitmap(bs, granularity, NULL, errp);
if (!s->dirty_bitmap) {
g_free(s->replaces);
blk_unref(s->target);
block_job_unref(&s->common);
return;
}
bdrv_op_block_all(s->target, s->common.blocker);
bdrv_op_block_all(target, s->common.blocker);
if (s->target->blk) {
blk_set_on_error(s->target->blk, on_target_error, on_target_error);
blk_iostatus_enable(s->target->blk);
}
s->common.co = qemu_coroutine_create(mirror_run);
trace_mirror_start(bs, s, s->common.co, opaque);
qemu_coroutine_enter(s->common.co, s);
@@ -894,7 +908,8 @@ static void mirror_start_job(BlockDriverState *bs, BlockDriverState *target,
void mirror_start(BlockDriverState *bs, BlockDriverState *target,
const char *replaces,
int64_t speed, uint32_t granularity, int64_t buf_size,
MirrorSyncMode mode, BlockdevOnError on_source_error,
MirrorSyncMode mode, BlockMirrorBackingMode backing_mode,
BlockdevOnError on_source_error,
BlockdevOnError on_target_error,
bool unmap,
BlockCompletionFunc *cb,
@@ -910,7 +925,7 @@ void mirror_start(BlockDriverState *bs, BlockDriverState *target,
is_none_mode = mode == MIRROR_SYNC_MODE_NONE;
base = mode == MIRROR_SYNC_MODE_TOP ? backing_bs(bs) : NULL;
mirror_start_job(bs, target, replaces,
speed, granularity, buf_size,
speed, granularity, buf_size, backing_mode,
on_source_error, on_target_error, unmap, cb, opaque, errp,
&mirror_job_driver, is_none_mode, base);
}
@@ -957,8 +972,7 @@ void commit_active_start(BlockDriverState *bs, BlockDriverState *base,
}
}
bdrv_ref(base);
mirror_start_job(bs, base, NULL, speed, 0, 0,
mirror_start_job(bs, base, NULL, speed, 0, 0, MIRROR_LEAVE_BACKING_CHAIN,
on_error, on_error, false, cb, opaque, &local_err,
&commit_active_job_driver, false, base);
if (local_err) {

View File

@@ -243,15 +243,15 @@ static int nbd_co_readv_1(BlockDriverState *bs, int64_t sector_num,
static int nbd_co_writev_1(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov,
int offset, int *flags)
int offset, int flags)
{
NbdClientSession *client = nbd_get_client_session(bs);
struct nbd_request request = { .type = NBD_CMD_WRITE };
struct nbd_reply reply;
ssize_t ret;
if ((*flags & BDRV_REQ_FUA) && (client->nbdflags & NBD_FLAG_SEND_FUA)) {
*flags &= ~BDRV_REQ_FUA;
if (flags & BDRV_REQ_FUA) {
assert(client->nbdflags & NBD_FLAG_SEND_FUA);
request.type |= NBD_CMD_FLAG_FUA;
}
@@ -291,7 +291,7 @@ int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
}
int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov, int *flags)
int nb_sectors, QEMUIOVector *qiov, int flags)
{
int offset = 0;
int ret;
@@ -414,6 +414,9 @@ int nbd_client_init(BlockDriverState *bs,
logout("Failed to negotiate with the NBD server\n");
return ret;
}
if (client->nbdflags & NBD_FLAG_SEND_FUA) {
bs->supported_write_flags = BDRV_REQ_FUA;
}
qemu_co_mutex_init(&client->send_mutex);
qemu_co_mutex_init(&client->free_sema);

View File

@@ -48,7 +48,7 @@ int nbd_client_co_discard(BlockDriverState *bs, int64_t sector_num,
int nb_sectors);
int nbd_client_co_flush(BlockDriverState *bs);
int nbd_client_co_writev(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov, int *flags);
int nb_sectors, QEMUIOVector *qiov, int flags);
int nbd_client_co_readv(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov);

View File

@@ -355,31 +355,6 @@ static int nbd_co_readv(BlockDriverState *bs, int64_t sector_num,
return nbd_client_co_readv(bs, sector_num, nb_sectors, qiov);
}
static int nbd_co_writev_flags(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov, int flags)
{
int ret;
ret = nbd_client_co_writev(bs, sector_num, nb_sectors, qiov, &flags);
if (ret < 0) {
return ret;
}
/* The flag wasn't sent to the server, so we need to emulate it with an
* explicit flush */
if (flags & BDRV_REQ_FUA) {
ret = nbd_client_co_flush(bs);
}
return ret;
}
static int nbd_co_writev(BlockDriverState *bs, int64_t sector_num,
int nb_sectors, QEMUIOVector *qiov)
{
return nbd_co_writev_flags(bs, sector_num, nb_sectors, qiov, 0);
}
static int nbd_co_flush(BlockDriverState *bs)
{
return nbd_client_co_flush(bs);
@@ -476,9 +451,7 @@ static BlockDriver bdrv_nbd = {
.bdrv_parse_filename = nbd_parse_filename,
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_co_writev_flags = nbd_co_writev_flags,
.supported_write_flags = BDRV_REQ_FUA,
.bdrv_co_writev_flags = nbd_client_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,
@@ -496,9 +469,7 @@ static BlockDriver bdrv_nbd_tcp = {
.bdrv_parse_filename = nbd_parse_filename,
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_co_writev_flags = nbd_co_writev_flags,
.supported_write_flags = BDRV_REQ_FUA,
.bdrv_co_writev_flags = nbd_client_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,
@@ -516,9 +487,7 @@ static BlockDriver bdrv_nbd_unix = {
.bdrv_parse_filename = nbd_parse_filename,
.bdrv_file_open = nbd_open,
.bdrv_co_readv = nbd_co_readv,
.bdrv_co_writev = nbd_co_writev,
.bdrv_co_writev_flags = nbd_co_writev_flags,
.supported_write_flags = BDRV_REQ_FUA,
.bdrv_co_writev_flags = nbd_client_co_writev,
.bdrv_close = nbd_close,
.bdrv_co_flush_to_os = nbd_co_flush,
.bdrv_co_discard = nbd_co_discard,

View File

@@ -1,7 +1,7 @@
/*
* QEMU Block driver for native access to files on NFS shares
*
* Copyright (c) 2014 Peter Lieven <pl@kamp.de>
* Copyright (c) 2014-2016 Peter Lieven <pl@kamp.de>
*
* Permission is hereby granted, free of charge, to any person obtaining a copy
* of this software and associated documentation files (the "Software"), to deal
@@ -38,6 +38,7 @@
#include <nfsc/libnfs.h>
#define QEMU_NFS_MAX_READAHEAD_SIZE 1048576
#define QEMU_NFS_MAX_PAGECACHE_SIZE (8388608 / NFS_BLKSIZE)
#define QEMU_NFS_MAX_DEBUG_LEVEL 2
typedef struct NFSClient {
@@ -47,6 +48,7 @@ typedef struct NFSClient {
bool has_zero_init;
AioContext *aio_context;
blkcnt_t st_blocks;
bool cache_used;
} NFSClient;
typedef struct NFSRPC {
@@ -278,7 +280,7 @@ static void nfs_file_close(BlockDriverState *bs)
}
static int64_t nfs_client_open(NFSClient *client, const char *filename,
int flags, Error **errp)
int flags, Error **errp, int open_flags)
{
int ret = -EINVAL, i;
struct stat st;
@@ -330,12 +332,38 @@ static int64_t nfs_client_open(NFSClient *client, const char *filename,
nfs_set_tcp_syncnt(client->context, val);
#ifdef LIBNFS_FEATURE_READAHEAD
} else if (!strcmp(qp->p[i].name, "readahead")) {
if (open_flags & BDRV_O_NOCACHE) {
error_setg(errp, "Cannot enable NFS readahead "
"if cache.direct = on");
goto fail;
}
if (val > QEMU_NFS_MAX_READAHEAD_SIZE) {
error_report("NFS Warning: Truncating NFS readahead"
" size to %d", QEMU_NFS_MAX_READAHEAD_SIZE);
val = QEMU_NFS_MAX_READAHEAD_SIZE;
}
nfs_set_readahead(client->context, val);
#ifdef LIBNFS_FEATURE_PAGECACHE
nfs_set_pagecache_ttl(client->context, 0);
#endif
client->cache_used = true;
#endif
#ifdef LIBNFS_FEATURE_PAGECACHE
nfs_set_pagecache_ttl(client->context, 0);
} else if (!strcmp(qp->p[i].name, "pagecache")) {
if (open_flags & BDRV_O_NOCACHE) {
error_setg(errp, "Cannot enable NFS pagecache "
"if cache.direct = on");
goto fail;
}
if (val > QEMU_NFS_MAX_PAGECACHE_SIZE) {
error_report("NFS Warning: Truncating NFS pagecache"
" size to %d pages", QEMU_NFS_MAX_PAGECACHE_SIZE);
val = QEMU_NFS_MAX_PAGECACHE_SIZE;
}
nfs_set_pagecache(client->context, val);
nfs_set_pagecache_ttl(client->context, 0);
client->cache_used = true;
#endif
#ifdef LIBNFS_FEATURE_DEBUG
} else if (!strcmp(qp->p[i].name, "debug")) {
@@ -418,7 +446,7 @@ static int nfs_file_open(BlockDriverState *bs, QDict *options, int flags,
}
ret = nfs_client_open(client, qemu_opt_get(opts, "filename"),
(flags & BDRV_O_RDWR) ? O_RDWR : O_RDONLY,
errp);
errp, bs->open_flags);
if (ret < 0) {
goto out;
}
@@ -454,7 +482,7 @@ static int nfs_file_create(const char *url, QemuOpts *opts, Error **errp)
total_size = ROUND_UP(qemu_opt_get_size_del(opts, BLOCK_OPT_SIZE, 0),
BDRV_SECTOR_SIZE);
ret = nfs_client_open(client, url, O_CREAT, errp);
ret = nfs_client_open(client, url, O_CREAT, errp, 0);
if (ret < 0) {
goto out;
}
@@ -516,6 +544,12 @@ static int nfs_reopen_prepare(BDRVReopenState *state,
return -EACCES;
}
if ((state->flags & BDRV_O_NOCACHE) && client->cache_used) {
error_setg(errp, "Cannot disable cache if libnfs readahead or"
" pagecache is enabled");
return -EINVAL;
}
/* Update cache for read-only reopens */
if (!(state->flags & BDRV_O_RDWR)) {
ret = nfs_fstat(client->context, client->fh, &st);
@@ -530,6 +564,15 @@ static int nfs_reopen_prepare(BDRVReopenState *state,
return 0;
}
#ifdef LIBNFS_FEATURE_PAGECACHE
static void nfs_invalidate_cache(BlockDriverState *bs,
Error **errp)
{
NFSClient *client = bs->opaque;
nfs_pagecache_invalidate(client->context, client->fh);
}
#endif
static BlockDriver bdrv_nfs = {
.format_name = "nfs",
.protocol_name = "nfs",
@@ -553,6 +596,10 @@ static BlockDriver bdrv_nfs = {
.bdrv_detach_aio_context = nfs_detach_aio_context,
.bdrv_attach_aio_context = nfs_attach_aio_context,
#ifdef LIBNFS_FEATURE_PAGECACHE
.bdrv_invalidate_cache = nfs_invalidate_cache,
#endif
};
static void nfs_block_init(void)

View File

@@ -12,6 +12,8 @@
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qapi/qmp/qdict.h"
#include "qapi/qmp/qstring.h"
#include "block/block_int.h"
#define NULL_OPT_LATENCY "latency-ns"
@@ -223,6 +225,20 @@ static int64_t coroutine_fn null_co_get_block_status(BlockDriverState *bs,
}
}
static void null_refresh_filename(BlockDriverState *bs, QDict *opts)
{
QINCREF(opts);
qdict_del(opts, "filename");
if (!qdict_size(opts)) {
snprintf(bs->exact_filename, sizeof(bs->exact_filename), "%s://",
bs->drv->format_name);
}
qdict_put(opts, "driver", qstring_from_str(bs->drv->format_name));
bs->full_open_options = opts;
}
static BlockDriver bdrv_null_co = {
.format_name = "null-co",
.protocol_name = "null-co",
@@ -238,6 +254,8 @@ static BlockDriver bdrv_null_co = {
.bdrv_reopen_prepare = null_reopen_prepare,
.bdrv_co_get_block_status = null_co_get_block_status,
.bdrv_refresh_filename = null_refresh_filename,
};
static BlockDriver bdrv_null_aio = {
@@ -255,6 +273,8 @@ static BlockDriver bdrv_null_aio = {
.bdrv_reopen_prepare = null_reopen_prepare,
.bdrv_co_get_block_status = null_co_get_block_status,
.bdrv_refresh_filename = null_refresh_filename,
};
static void bdrv_null_init(void)

View File

@@ -33,6 +33,7 @@
#include "block/block_int.h"
#include "sysemu/block-backend.h"
#include "qemu/module.h"
#include "qemu/bswap.h"
#include "qemu/bitmap.h"
#include "qapi/util.h"
@@ -203,13 +204,15 @@ static int64_t allocate_clusters(BlockDriverState *bs, int64_t sector_num,
return -EINVAL;
}
to_allocate = (sector_num + *pnum + s->tracks - 1) / s->tracks - idx;
to_allocate = DIV_ROUND_UP(sector_num + *pnum, s->tracks) - idx;
space = to_allocate * s->tracks;
if (s->data_end + space > bdrv_getlength(bs->file->bs) >> BDRV_SECTOR_BITS) {
int ret;
space += s->prealloc_size;
if (s->prealloc_mode == PRL_PREALLOC_MODE_FALLOCATE) {
ret = bdrv_write_zeroes(bs->file->bs, s->data_end, space, 0);
ret = bdrv_pwrite_zeroes(bs->file->bs,
s->data_end << BDRV_SECTOR_BITS,
space << BDRV_SECTOR_BITS, 0);
} else {
ret = bdrv_truncate(bs->file->bs,
(s->data_end + space) << BDRV_SECTOR_BITS);
@@ -512,11 +515,12 @@ static int parallels_create(const char *filename, QemuOpts *opts, Error **errp)
memset(tmp, 0, sizeof(tmp));
memcpy(tmp, &header, sizeof(header));
ret = blk_pwrite(file, 0, tmp, BDRV_SECTOR_SIZE);
ret = blk_pwrite(file, 0, tmp, BDRV_SECTOR_SIZE, 0);
if (ret < 0) {
goto exit;
}
ret = blk_write_zeroes(file, 1, bat_sectors - 1, 0);
ret = blk_pwrite_zeroes(file, BDRV_SECTOR_SIZE,
(bat_sectors - 1) << BDRV_SECTOR_BITS, 0);
if (ret < 0) {
goto exit;
}

View File

@@ -67,10 +67,10 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
info->backing_file_depth = bdrv_get_backing_file_depth(bs);
info->detect_zeroes = bs->detect_zeroes;
if (bs->throttle_state) {
if (blk && blk_get_public(blk)->throttle_state) {
ThrottleConfig cfg;
throttle_group_get_config(bs, &cfg);
throttle_group_get_config(blk, &cfg);
info->bps = cfg.buckets[THROTTLE_BPS_TOTAL].avg;
info->bps_rd = cfg.buckets[THROTTLE_BPS_READ].avg;
@@ -118,7 +118,7 @@ BlockDeviceInfo *bdrv_block_device_info(BlockBackend *blk,
info->iops_size = cfg.op_size;
info->has_group = true;
info->group = g_strdup(throttle_group_get_name(bs));
info->group = g_strdup(throttle_group_get_name(blk));
}
info->write_threshold = bdrv_write_threshold_get(bs);

View File

@@ -28,6 +28,7 @@
#include "block/block_int.h"
#include "sysemu/block-backend.h"
#include "qemu/module.h"
#include "qemu/bswap.h"
#include <zlib.h>
#include "qapi/qmp/qerror.h"
#include "crypto/cipher.h"
@@ -161,10 +162,16 @@ static int qcow_open(BlockDriverState *bs, QDict *options, int flags,
if (s->crypt_method_header) {
if (bdrv_uses_whitelist() &&
s->crypt_method_header == QCOW_CRYPT_AES) {
error_report("qcow built-in AES encryption is deprecated");
error_printf("Support for it will be removed in a future release.\n"
"You can use 'qemu-img convert' to switch to an\n"
"unencrypted qcow image, or a LUKS raw image.\n");
error_setg(errp,
"Use of AES-CBC encrypted qcow images is no longer "
"supported in system emulators");
error_append_hint(errp,
"You can use 'qemu-img convert' to convert your "
"image to an alternative supported format, such "
"as unencrypted qcow, or raw with the LUKS "
"format instead.\n");
ret = -ENOSYS;
goto fail;
}
bs->encrypted = 1;
@@ -853,24 +860,24 @@ static int qcow_create(const char *filename, QemuOpts *opts, Error **errp)
}
/* write all the data */
ret = blk_pwrite(qcow_blk, 0, &header, sizeof(header));
ret = blk_pwrite(qcow_blk, 0, &header, sizeof(header), 0);
if (ret != sizeof(header)) {
goto exit;
}
if (backing_file) {
ret = blk_pwrite(qcow_blk, sizeof(header),
backing_file, backing_filename_len);
backing_file, backing_filename_len, 0);
if (ret != backing_filename_len) {
goto exit;
}
}
tmp = g_malloc0(BDRV_SECTOR_SIZE);
for (i = 0; i < ((sizeof(uint64_t)*l1_size + BDRV_SECTOR_SIZE - 1)/
BDRV_SECTOR_SIZE); i++) {
ret = blk_pwrite(qcow_blk, header_size +
BDRV_SECTOR_SIZE*i, tmp, BDRV_SECTOR_SIZE);
for (i = 0; i < DIV_ROUND_UP(sizeof(uint64_t) * l1_size, BDRV_SECTOR_SIZE);
i++) {
ret = blk_pwrite(qcow_blk, header_size + BDRV_SECTOR_SIZE * i,
tmp, BDRV_SECTOR_SIZE, 0);
if (ret != BDRV_SECTOR_SIZE) {
g_free(tmp);
goto exit;

View File

@@ -24,11 +24,6 @@
/* Needed for CONFIG_MADVISE */
#include "qemu/osdep.h"
#if defined(CONFIG_MADVISE) || defined(CONFIG_POSIX_MADVISE)
#include <sys/mman.h>
#endif
#include "block/block_int.h"
#include "qemu-common.h"
#include "qcow2.h"
@@ -226,7 +221,7 @@ static int qcow2_cache_entry_flush(BlockDriverState *bs, Qcow2Cache *c, int i)
return 0;
}
int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c)
int qcow2_cache_write(BlockDriverState *bs, Qcow2Cache *c)
{
BDRVQcow2State *s = bs->opaque;
int result = 0;
@@ -242,8 +237,15 @@ int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c)
}
}
return result;
}
int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c)
{
int result = qcow2_cache_write(bs, c);
if (result == 0) {
ret = bdrv_flush(bs->file->bs);
int ret = bdrv_flush(bs->file->bs);
if (ret < 0) {
result = ret;
}

View File

@@ -29,6 +29,7 @@
#include "qemu-common.h"
#include "block/block_int.h"
#include "block/qcow2.h"
#include "qemu/bswap.h"
#include "trace.h"
int qcow2_grow_l1_table(BlockDriverState *bs, uint64_t min_size,
@@ -153,11 +154,9 @@ static int l2_load(BlockDriverState *bs, uint64_t l2_offset,
uint64_t **l2_table)
{
BDRVQcow2State *s = bs->opaque;
int ret;
ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset, (void**) l2_table);
return ret;
return qcow2_cache_get(bs, s->l2_table_cache, l2_offset,
(void **)l2_table);
}
/*
@@ -389,22 +388,18 @@ int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
return 0;
}
static int coroutine_fn copy_sectors(BlockDriverState *bs,
uint64_t start_sect,
uint64_t cluster_offset,
int n_start, int n_end)
static int coroutine_fn do_perform_cow(BlockDriverState *bs,
uint64_t src_cluster_offset,
uint64_t cluster_offset,
int offset_in_cluster,
int bytes)
{
BDRVQcow2State *s = bs->opaque;
QEMUIOVector qiov;
struct iovec iov;
int n, ret;
int ret;
n = n_end - n_start;
if (n <= 0) {
return 0;
}
iov.iov_len = n * BDRV_SECTOR_SIZE;
iov.iov_len = bytes;
iov.iov_base = qemu_try_blockalign(bs, iov.iov_len);
if (iov.iov_base == NULL) {
return -ENOMEM;
@@ -423,17 +418,21 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
* interface. This avoids double I/O throttling and request tracking,
* which can lead to deadlock when block layer copy-on-read is enabled.
*/
ret = bs->drv->bdrv_co_readv(bs, start_sect + n_start, n, &qiov);
ret = bs->drv->bdrv_co_preadv(bs, src_cluster_offset + offset_in_cluster,
bytes, &qiov, 0);
if (ret < 0) {
goto out;
}
if (bs->encrypted) {
Error *err = NULL;
int64_t sector = (cluster_offset + offset_in_cluster)
>> BDRV_SECTOR_BITS;
assert(s->cipher);
if (qcow2_encrypt_sectors(s, start_sect + n_start,
iov.iov_base, iov.iov_base, n,
true, &err) < 0) {
assert((offset_in_cluster & ~BDRV_SECTOR_MASK) == 0);
assert((bytes & ~BDRV_SECTOR_MASK) == 0);
if (qcow2_encrypt_sectors(s, sector, iov.iov_base, iov.iov_base,
bytes >> BDRV_SECTOR_BITS, true, &err) < 0) {
ret = -EIO;
error_free(err);
goto out;
@@ -441,14 +440,14 @@ static int coroutine_fn copy_sectors(BlockDriverState *bs,
}
ret = qcow2_pre_write_overlap_check(bs, 0,
cluster_offset + n_start * BDRV_SECTOR_SIZE, n * BDRV_SECTOR_SIZE);
cluster_offset + offset_in_cluster, bytes);
if (ret < 0) {
goto out;
}
BLKDBG_EVENT(bs->file, BLKDBG_COW_WRITE);
ret = bdrv_co_writev(bs->file->bs, (cluster_offset >> 9) + n_start, n,
&qiov);
ret = bdrv_co_pwritev(bs->file->bs, cluster_offset + offset_in_cluster,
bytes, &qiov, 0);
if (ret < 0) {
goto out;
}
@@ -463,47 +462,44 @@ out:
/*
* get_cluster_offset
*
* For a given offset of the disk image, find the cluster offset in
* qcow2 file. The offset is stored in *cluster_offset.
* For a given offset of the virtual disk, find the cluster type and offset in
* the qcow2 file. The offset is stored in *cluster_offset.
*
* on entry, *num is the number of contiguous sectors we'd like to
* access following offset.
* On entry, *bytes is the maximum number of contiguous bytes starting at
* offset that we are interested in.
*
* on exit, *num is the number of contiguous sectors we can read.
* On exit, *bytes is the number of bytes starting at offset that have the same
* cluster type and (if applicable) are stored contiguously in the image file.
* Compressed clusters are always returned one by one.
*
* Returns the cluster type (QCOW2_CLUSTER_*) on success, -errno in error
* cases.
*/
int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
int *num, uint64_t *cluster_offset)
unsigned int *bytes, uint64_t *cluster_offset)
{
BDRVQcow2State *s = bs->opaque;
unsigned int l2_index;
uint64_t l1_index, l2_offset, *l2_table;
int l1_bits, c;
unsigned int index_in_cluster, nb_clusters;
uint64_t nb_available, nb_needed;
unsigned int offset_in_cluster, nb_clusters;
uint64_t bytes_available, bytes_needed;
int ret;
index_in_cluster = (offset >> 9) & (s->cluster_sectors - 1);
nb_needed = *num + index_in_cluster;
offset_in_cluster = offset_into_cluster(s, offset);
bytes_needed = (uint64_t) *bytes + offset_in_cluster;
l1_bits = s->l2_bits + s->cluster_bits;
/* compute how many bytes there are between the offset and
* the end of the l1 entry
*/
/* compute how many bytes there are between the start of the cluster
* containing offset and the end of the l1 entry */
bytes_available = (1ULL << l1_bits) - (offset & ((1ULL << l1_bits) - 1))
+ offset_in_cluster;
nb_available = (1ULL << l1_bits) - (offset & ((1ULL << l1_bits) - 1));
/* compute the number of available sectors */
nb_available = (nb_available >> 9) + index_in_cluster;
if (nb_needed > nb_available) {
nb_needed = nb_available;
if (bytes_needed > bytes_available) {
bytes_needed = bytes_available;
}
assert(nb_needed <= INT_MAX);
assert(bytes_needed <= INT_MAX);
*cluster_offset = 0;
@@ -541,7 +537,7 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
*cluster_offset = be64_to_cpu(l2_table[l2_index]);
/* nb_needed <= INT_MAX, thus nb_clusters <= INT_MAX, too */
nb_clusters = size_to_clusters(s, nb_needed << 9);
nb_clusters = size_to_clusters(s, bytes_needed);
ret = qcow2_get_cluster_type(*cluster_offset);
switch (ret) {
@@ -588,13 +584,14 @@ int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
qcow2_cache_put(bs, s->l2_table_cache, (void**) &l2_table);
nb_available = (c * s->cluster_sectors);
bytes_available = (c * s->cluster_size);
out:
if (nb_available > nb_needed)
nb_available = nb_needed;
if (bytes_available > bytes_needed) {
bytes_available = bytes_needed;
}
*num = nb_available - index_in_cluster;
*bytes = bytes_available - offset_in_cluster;
return ret;
@@ -740,14 +737,12 @@ static int perform_cow(BlockDriverState *bs, QCowL2Meta *m, Qcow2COWRegion *r)
BDRVQcow2State *s = bs->opaque;
int ret;
if (r->nb_sectors == 0) {
if (r->nb_bytes == 0) {
return 0;
}
qemu_co_mutex_unlock(&s->lock);
ret = copy_sectors(bs, m->offset / BDRV_SECTOR_SIZE, m->alloc_offset,
r->offset / BDRV_SECTOR_SIZE,
r->offset / BDRV_SECTOR_SIZE + r->nb_sectors);
ret = do_perform_cow(bs, m->offset, m->alloc_offset, r->offset, r->nb_bytes);
qemu_co_mutex_lock(&s->lock);
if (ret < 0) {
@@ -809,13 +804,14 @@ int qcow2_alloc_cluster_link_l2(BlockDriverState *bs, QCowL2Meta *m)
assert(l2_index + m->nb_clusters <= s->l2_size);
for (i = 0; i < m->nb_clusters; i++) {
/* if two concurrent writes happen to the same unallocated cluster
* each write allocates separate cluster and writes data concurrently.
* The first one to complete updates l2 table with pointer to its
* cluster the second one has to do RMW (which is done above by
* copy_sectors()), update l2 table with its cluster pointer and free
* old cluster. This is what this loop does */
if(l2_table[l2_index + i] != 0)
* each write allocates separate cluster and writes data concurrently.
* The first one to complete updates l2 table with pointer to its
* cluster the second one has to do RMW (which is done above by
* perform_cow()), update l2 table with its cluster pointer and free
* old cluster. This is what this loop does */
if (l2_table[l2_index + i] != 0) {
old_cluster[j++] = l2_table[l2_index + i];
}
l2_table[l2_index + i] = cpu_to_be64((cluster_offset +
(i << s->cluster_bits)) | QCOW_OFLAG_COPIED);
@@ -1197,25 +1193,20 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
/*
* Save info needed for meta data update.
*
* requested_sectors: Number of sectors from the start of the first
* requested_bytes: Number of bytes from the start of the first
* newly allocated cluster to the end of the (possibly shortened
* before) write request.
*
* avail_sectors: Number of sectors from the start of the first
* avail_bytes: Number of bytes from the start of the first
* newly allocated to the end of the last newly allocated cluster.
*
* nb_sectors: The number of sectors from the start of the first
* nb_bytes: The number of bytes from the start of the first
* newly allocated cluster to the end of the area that the write
* request actually writes to (excluding COW at the end)
*/
int requested_sectors =
(*bytes + offset_into_cluster(s, guest_offset))
>> BDRV_SECTOR_BITS;
int avail_sectors = nb_clusters
<< (s->cluster_bits - BDRV_SECTOR_BITS);
int alloc_n_start = offset_into_cluster(s, guest_offset)
>> BDRV_SECTOR_BITS;
int nb_sectors = MIN(requested_sectors, avail_sectors);
uint64_t requested_bytes = *bytes + offset_into_cluster(s, guest_offset);
int avail_bytes = MIN(INT_MAX, nb_clusters << s->cluster_bits);
int nb_bytes = MIN(requested_bytes, avail_bytes);
QCowL2Meta *old_m = *m;
*m = g_malloc0(sizeof(**m));
@@ -1226,23 +1217,21 @@ static int handle_alloc(BlockDriverState *bs, uint64_t guest_offset,
.alloc_offset = alloc_cluster_offset,
.offset = start_of_cluster(s, guest_offset),
.nb_clusters = nb_clusters,
.nb_available = nb_sectors,
.cow_start = {
.offset = 0,
.nb_sectors = alloc_n_start,
.nb_bytes = offset_into_cluster(s, guest_offset),
},
.cow_end = {
.offset = nb_sectors * BDRV_SECTOR_SIZE,
.nb_sectors = avail_sectors - nb_sectors,
.offset = nb_bytes,
.nb_bytes = avail_bytes - nb_bytes,
},
};
qemu_co_queue_init(&(*m)->dependent_requests);
QLIST_INSERT_HEAD(&s->cluster_allocs, *m, next_in_flight);
*host_offset = alloc_cluster_offset + offset_into_cluster(s, guest_offset);
*bytes = MIN(*bytes, (nb_sectors * BDRV_SECTOR_SIZE)
- offset_into_cluster(s, guest_offset));
*bytes = MIN(*bytes, nb_bytes - offset_into_cluster(s, guest_offset));
assert(*bytes != 0);
return 1;
@@ -1274,7 +1263,8 @@ fail:
* Return 0 on success and -errno in error cases
*/
int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
int *num, uint64_t *host_offset, QCowL2Meta **m)
unsigned int *bytes, uint64_t *host_offset,
QCowL2Meta **m)
{
BDRVQcow2State *s = bs->opaque;
uint64_t start, remaining;
@@ -1282,13 +1272,11 @@ int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
uint64_t cur_bytes;
int ret;
trace_qcow2_alloc_clusters_offset(qemu_coroutine_self(), offset, *num);
assert((offset & ~BDRV_SECTOR_MASK) == 0);
trace_qcow2_alloc_clusters_offset(qemu_coroutine_self(), offset, *bytes);
again:
start = offset;
remaining = (uint64_t)*num << BDRV_SECTOR_BITS;
remaining = *bytes;
cluster_offset = 0;
*host_offset = 0;
cur_bytes = 0;
@@ -1374,8 +1362,8 @@ again:
}
}
*num -= remaining >> BDRV_SECTOR_BITS;
assert(*num > 0);
*bytes -= remaining;
assert(*bytes > 0);
assert(*host_offset != 0);
return 0;
@@ -1764,8 +1752,7 @@ static int expand_zero_clusters_in_l1(BlockDriverState *bs, uint64_t *l1_table,
goto fail;
}
ret = bdrv_write_zeroes(bs->file->bs, offset / BDRV_SECTOR_SIZE,
s->cluster_sectors, 0);
ret = bdrv_pwrite_zeroes(bs->file->bs, offset, s->cluster_size, 0);
if (ret < 0) {
if (!preallocated) {
qcow2_free_clusters(bs, offset, s->cluster_size,
@@ -1867,8 +1854,8 @@ int qcow2_expand_zero_clusters(BlockDriverState *bs,
}
for (i = 0; i < s->nb_snapshots; i++) {
int l1_sectors = (s->snapshots[i].l1_size * sizeof(uint64_t) +
BDRV_SECTOR_SIZE - 1) / BDRV_SECTOR_SIZE;
int l1_sectors = DIV_ROUND_UP(s->snapshots[i].l1_size *
sizeof(uint64_t), BDRV_SECTOR_SIZE);
l1_table = g_realloc(l1_table, l1_sectors * BDRV_SECTOR_SIZE);

View File

@@ -28,6 +28,7 @@
#include "block/block_int.h"
#include "block/qcow2.h"
#include "qemu/range.h"
#include "qemu/bswap.h"
static int64_t alloc_clusters_noref(BlockDriverState *bs, uint64_t size);
static int QEMU_WARN_UNUSED_RESULT update_refcount(BlockDriverState *bs,
@@ -217,13 +218,10 @@ static int load_refcount_block(BlockDriverState *bs,
void **refcount_block)
{
BDRVQcow2State *s = bs->opaque;
int ret;
BLKDBG_EVENT(bs->file, BLKDBG_REFBLOCK_LOAD);
ret = qcow2_cache_get(bs, s->refcount_block_cache, refcount_block_offset,
refcount_block);
return ret;
return qcow2_cache_get(bs, s->refcount_block_cache, refcount_block_offset,
refcount_block);
}
/*
@@ -489,14 +487,12 @@ static int alloc_refcount_block(BlockDriverState *bs,
uint64_t table_clusters =
size_to_clusters(s, table_size * sizeof(uint64_t));
blocks_clusters = 1 +
((table_clusters + s->refcount_block_size - 1)
/ s->refcount_block_size);
DIV_ROUND_UP(table_clusters, s->refcount_block_size);
uint64_t meta_clusters = table_clusters + blocks_clusters;
last_table_size = table_size;
table_size = next_refcount_table_size(s, blocks_used +
((meta_clusters + s->refcount_block_size - 1)
/ s->refcount_block_size));
DIV_ROUND_UP(meta_clusters, s->refcount_block_size));
} while (last_table_size != table_size);

View File

@@ -26,6 +26,7 @@
#include "qapi/error.h"
#include "block/block_int.h"
#include "block/qcow2.h"
#include "qemu/bswap.h"
#include "qemu/error-report.h"
#include "qemu/cutils.h"

View File

@@ -36,6 +36,7 @@
#include "trace.h"
#include "qemu/option_int.h"
#include "qemu/cutils.h"
#include "qemu/bswap.h"
/*
Differences with QCOW:
@@ -967,13 +968,22 @@ static int qcow2_open(BlockDriverState *bs, QDict *options, int flags,
if (s->crypt_method_header) {
if (bdrv_uses_whitelist() &&
s->crypt_method_header == QCOW_CRYPT_AES) {
error_report("qcow2 built-in AES encryption is deprecated");
error_printf("Support for it will be removed in a future release.\n"
"You can use 'qemu-img convert' to switch to an\n"
"unencrypted qcow2 image, or a LUKS raw image.\n");
error_setg(errp,
"Use of AES-CBC encrypted qcow2 images is no longer "
"supported in system emulators");
error_append_hint(errp,
"You can use 'qemu-img convert' to convert your "
"image to an alternative supported format, such "
"as unencrypted qcow2, or raw with the LUKS "
"format instead.\n");
ret = -ENOSYS;
goto fail;
}
bs->encrypted = 1;
/* Encryption works on a sector granularity */
bs->request_alignment = BDRV_SECTOR_SIZE;
}
s->l2_bits = s->cluster_bits - 3; /* L2 is always one cluster */
@@ -1192,7 +1202,7 @@ static void qcow2_refresh_limits(BlockDriverState *bs, Error **errp)
{
BDRVQcow2State *s = bs->opaque;
bs->bl.write_zeroes_alignment = s->cluster_sectors;
bs->bl.pwrite_zeroes_alignment = s->cluster_size;
}
static int qcow2_set_key(BlockDriverState *bs, const char *key)
@@ -1330,16 +1340,20 @@ static int64_t coroutine_fn qcow2_co_get_block_status(BlockDriverState *bs,
BDRVQcow2State *s = bs->opaque;
uint64_t cluster_offset;
int index_in_cluster, ret;
unsigned int bytes;
int64_t status = 0;
*pnum = nb_sectors;
bytes = MIN(INT_MAX, nb_sectors * BDRV_SECTOR_SIZE);
qemu_co_mutex_lock(&s->lock);
ret = qcow2_get_cluster_offset(bs, sector_num << 9, pnum, &cluster_offset);
ret = qcow2_get_cluster_offset(bs, sector_num << 9, &bytes,
&cluster_offset);
qemu_co_mutex_unlock(&s->lock);
if (ret < 0) {
return ret;
}
*pnum = bytes >> BDRV_SECTOR_BITS;
if (cluster_offset != 0 && ret != QCOW2_CLUSTER_COMPRESSED &&
!s->cipher) {
index_in_cluster = sector_num & (s->cluster_sectors - 1);
@@ -1357,28 +1371,34 @@ static int64_t coroutine_fn qcow2_co_get_block_status(BlockDriverState *bs,
/* handle reading after the end of the backing file */
int qcow2_backing_read1(BlockDriverState *bs, QEMUIOVector *qiov,
int64_t sector_num, int nb_sectors)
int64_t offset, int bytes)
{
uint64_t bs_size = bs->total_sectors * BDRV_SECTOR_SIZE;
int n1;
if ((sector_num + nb_sectors) <= bs->total_sectors)
return nb_sectors;
if (sector_num >= bs->total_sectors)
n1 = 0;
else
n1 = bs->total_sectors - sector_num;
qemu_iovec_memset(qiov, 512 * n1, 0, 512 * (nb_sectors - n1));
if ((offset + bytes) <= bs_size) {
return bytes;
}
if (offset >= bs_size) {
n1 = 0;
} else {
n1 = bs_size - offset;
}
qemu_iovec_memset(qiov, n1, 0, bytes - n1);
return n1;
}
static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
int remaining_sectors, QEMUIOVector *qiov)
static coroutine_fn int qcow2_co_preadv(BlockDriverState *bs, uint64_t offset,
uint64_t bytes, QEMUIOVector *qiov,
int flags)
{
BDRVQcow2State *s = bs->opaque;
int index_in_cluster, n1;
int offset_in_cluster, n1;
int ret;
int cur_nr_sectors; /* number of sectors in current iteration */
unsigned int cur_bytes; /* number of bytes in current iteration */
uint64_t cluster_offset = 0;
uint64_t bytes_done = 0;
QEMUIOVector hd_qiov;
@@ -1388,26 +1408,24 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
qemu_co_mutex_lock(&s->lock);
while (remaining_sectors != 0) {
while (bytes != 0) {
/* prepare next request */
cur_nr_sectors = remaining_sectors;
cur_bytes = MIN(bytes, INT_MAX);
if (s->cipher) {
cur_nr_sectors = MIN(cur_nr_sectors,
QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors);
cur_bytes = MIN(cur_bytes,
QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
}
ret = qcow2_get_cluster_offset(bs, sector_num << 9,
&cur_nr_sectors, &cluster_offset);
ret = qcow2_get_cluster_offset(bs, offset, &cur_bytes, &cluster_offset);
if (ret < 0) {
goto fail;
}
index_in_cluster = sector_num & (s->cluster_sectors - 1);
offset_in_cluster = offset_into_cluster(s, offset);
qemu_iovec_reset(&hd_qiov);
qemu_iovec_concat(&hd_qiov, qiov, bytes_done,
cur_nr_sectors * 512);
qemu_iovec_concat(&hd_qiov, qiov, bytes_done, cur_bytes);
switch (ret) {
case QCOW2_CLUSTER_UNALLOCATED:
@@ -1415,18 +1433,17 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
if (bs->backing) {
/* read from the base image */
n1 = qcow2_backing_read1(bs->backing->bs, &hd_qiov,
sector_num, cur_nr_sectors);
offset, cur_bytes);
if (n1 > 0) {
QEMUIOVector local_qiov;
qemu_iovec_init(&local_qiov, hd_qiov.niov);
qemu_iovec_concat(&local_qiov, &hd_qiov, 0,
n1 * BDRV_SECTOR_SIZE);
qemu_iovec_concat(&local_qiov, &hd_qiov, 0, n1);
BLKDBG_EVENT(bs->file, BLKDBG_READ_BACKING_AIO);
qemu_co_mutex_unlock(&s->lock);
ret = bdrv_co_readv(bs->backing->bs, sector_num,
n1, &local_qiov);
ret = bdrv_co_preadv(bs->backing->bs, offset, n1,
&local_qiov, 0);
qemu_co_mutex_lock(&s->lock);
qemu_iovec_destroy(&local_qiov);
@@ -1437,12 +1454,12 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
}
} else {
/* Note: in this case, no need to wait */
qemu_iovec_memset(&hd_qiov, 0, 0, 512 * cur_nr_sectors);
qemu_iovec_memset(&hd_qiov, 0, 0, cur_bytes);
}
break;
case QCOW2_CLUSTER_ZERO:
qemu_iovec_memset(&hd_qiov, 0, 0, 512 * cur_nr_sectors);
qemu_iovec_memset(&hd_qiov, 0, 0, cur_bytes);
break;
case QCOW2_CLUSTER_COMPRESSED:
@@ -1453,8 +1470,8 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
}
qemu_iovec_from_buf(&hd_qiov, 0,
s->cluster_cache + index_in_cluster * 512,
512 * cur_nr_sectors);
s->cluster_cache + offset_in_cluster,
cur_bytes);
break;
case QCOW2_CLUSTER_NORMAL:
@@ -1481,34 +1498,34 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
}
}
assert(cur_nr_sectors <=
QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors);
assert(cur_bytes <= QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
qemu_iovec_reset(&hd_qiov);
qemu_iovec_add(&hd_qiov, cluster_data,
512 * cur_nr_sectors);
qemu_iovec_add(&hd_qiov, cluster_data, cur_bytes);
}
BLKDBG_EVENT(bs->file, BLKDBG_READ_AIO);
qemu_co_mutex_unlock(&s->lock);
ret = bdrv_co_readv(bs->file->bs,
(cluster_offset >> 9) + index_in_cluster,
cur_nr_sectors, &hd_qiov);
ret = bdrv_co_preadv(bs->file->bs,
cluster_offset + offset_in_cluster,
cur_bytes, &hd_qiov, 0);
qemu_co_mutex_lock(&s->lock);
if (ret < 0) {
goto fail;
}
if (bs->encrypted) {
assert(s->cipher);
assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
assert((cur_bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
Error *err = NULL;
if (qcow2_encrypt_sectors(s, sector_num, cluster_data,
cluster_data, cur_nr_sectors, false,
&err) < 0) {
if (qcow2_encrypt_sectors(s, offset >> BDRV_SECTOR_BITS,
cluster_data, cluster_data,
cur_bytes >> BDRV_SECTOR_BITS,
false, &err) < 0) {
error_free(err);
ret = -EIO;
goto fail;
}
qemu_iovec_from_buf(qiov, bytes_done,
cluster_data, 512 * cur_nr_sectors);
qemu_iovec_from_buf(qiov, bytes_done, cluster_data, cur_bytes);
}
break;
@@ -1518,9 +1535,9 @@ static coroutine_fn int qcow2_co_readv(BlockDriverState *bs, int64_t sector_num,
goto fail;
}
remaining_sectors -= cur_nr_sectors;
sector_num += cur_nr_sectors;
bytes_done += cur_nr_sectors * 512;
bytes -= cur_bytes;
offset += cur_bytes;
bytes_done += cur_bytes;
}
ret = 0;
@@ -1533,23 +1550,21 @@ fail:
return ret;
}
static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
int64_t sector_num,
int remaining_sectors,
QEMUIOVector *qiov)
static coroutine_fn int qcow2_co_pwritev(BlockDriverState *bs, uint64_t offset,
uint64_t bytes, QEMUIOVector *qiov,
int flags)
{
BDRVQcow2State *s = bs->opaque;
int index_in_cluster;
int offset_in_cluster;
int ret;
int cur_nr_sectors; /* number of sectors in current iteration */
unsigned int cur_bytes; /* number of sectors in current iteration */
uint64_t cluster_offset;
QEMUIOVector hd_qiov;
uint64_t bytes_done = 0;
uint8_t *cluster_data = NULL;
QCowL2Meta *l2meta = NULL;
trace_qcow2_writev_start_req(qemu_coroutine_self(), sector_num,
remaining_sectors);
trace_qcow2_writev_start_req(qemu_coroutine_self(), offset, bytes);
qemu_iovec_init(&hd_qiov, qiov->niov);
@@ -1557,22 +1572,21 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
qemu_co_mutex_lock(&s->lock);
while (remaining_sectors != 0) {
while (bytes != 0) {
l2meta = NULL;
trace_qcow2_writev_start_part(qemu_coroutine_self());
index_in_cluster = sector_num & (s->cluster_sectors - 1);
cur_nr_sectors = remaining_sectors;
if (bs->encrypted &&
cur_nr_sectors >
QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors - index_in_cluster) {
cur_nr_sectors =
QCOW_MAX_CRYPT_CLUSTERS * s->cluster_sectors - index_in_cluster;
offset_in_cluster = offset_into_cluster(s, offset);
cur_bytes = MIN(bytes, INT_MAX);
if (bs->encrypted) {
cur_bytes = MIN(cur_bytes,
QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size
- offset_in_cluster);
}
ret = qcow2_alloc_cluster_offset(bs, sector_num << 9,
&cur_nr_sectors, &cluster_offset, &l2meta);
ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
&cluster_offset, &l2meta);
if (ret < 0) {
goto fail;
}
@@ -1580,8 +1594,7 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
assert((cluster_offset & 511) == 0);
qemu_iovec_reset(&hd_qiov);
qemu_iovec_concat(&hd_qiov, qiov, bytes_done,
cur_nr_sectors * 512);
qemu_iovec_concat(&hd_qiov, qiov, bytes_done, cur_bytes);
if (bs->encrypted) {
Error *err = NULL;
@@ -1600,8 +1613,9 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
QCOW_MAX_CRYPT_CLUSTERS * s->cluster_size);
qemu_iovec_to_buf(&hd_qiov, 0, cluster_data, hd_qiov.size);
if (qcow2_encrypt_sectors(s, sector_num, cluster_data,
cluster_data, cur_nr_sectors,
if (qcow2_encrypt_sectors(s, offset >> BDRV_SECTOR_BITS,
cluster_data, cluster_data,
cur_bytes >>BDRV_SECTOR_BITS,
true, &err) < 0) {
error_free(err);
ret = -EIO;
@@ -1609,13 +1623,11 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
}
qemu_iovec_reset(&hd_qiov);
qemu_iovec_add(&hd_qiov, cluster_data,
cur_nr_sectors * 512);
qemu_iovec_add(&hd_qiov, cluster_data, cur_bytes);
}
ret = qcow2_pre_write_overlap_check(bs, 0,
cluster_offset + index_in_cluster * BDRV_SECTOR_SIZE,
cur_nr_sectors * BDRV_SECTOR_SIZE);
cluster_offset + offset_in_cluster, cur_bytes);
if (ret < 0) {
goto fail;
}
@@ -1623,10 +1635,10 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
qemu_co_mutex_unlock(&s->lock);
BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
trace_qcow2_writev_data(qemu_coroutine_self(),
(cluster_offset >> 9) + index_in_cluster);
ret = bdrv_co_writev(bs->file->bs,
(cluster_offset >> 9) + index_in_cluster,
cur_nr_sectors, &hd_qiov);
cluster_offset + offset_in_cluster);
ret = bdrv_co_pwritev(bs->file->bs,
cluster_offset + offset_in_cluster,
cur_bytes, &hd_qiov, 0);
qemu_co_mutex_lock(&s->lock);
if (ret < 0) {
goto fail;
@@ -1652,10 +1664,10 @@ static coroutine_fn int qcow2_co_writev(BlockDriverState *bs,
l2meta = next;
}
remaining_sectors -= cur_nr_sectors;
sector_num += cur_nr_sectors;
bytes_done += cur_nr_sectors * 512;
trace_qcow2_writev_done_part(qemu_coroutine_self(), cur_nr_sectors);
bytes -= cur_bytes;
offset += cur_bytes;
bytes_done += cur_bytes;
trace_qcow2_writev_done_part(qemu_coroutine_self(), cur_bytes);
}
ret = 0;
@@ -1757,13 +1769,6 @@ static void qcow2_invalidate_cache(BlockDriverState *bs, Error **errp)
qcow2_close(bs);
bdrv_invalidate_cache(bs->file->bs, &local_err);
if (local_err) {
error_propagate(errp, local_err);
bs->drv = NULL;
return;
}
memset(s, 0, sizeof(BDRVQcow2State));
options = qdict_clone_shallow(bs->options);
@@ -2004,19 +2009,19 @@ static int qcow2_change_backing_file(BlockDriverState *bs,
static int preallocate(BlockDriverState *bs)
{
uint64_t nb_sectors;
uint64_t bytes;
uint64_t offset;
uint64_t host_offset = 0;
int num;
unsigned int cur_bytes;
int ret;
QCowL2Meta *meta;
nb_sectors = bdrv_nb_sectors(bs);
bytes = bdrv_getlength(bs);
offset = 0;
while (nb_sectors) {
num = MIN(nb_sectors, INT_MAX >> BDRV_SECTOR_BITS);
ret = qcow2_alloc_cluster_offset(bs, offset, &num,
while (bytes) {
cur_bytes = MIN(bytes, INT_MAX);
ret = qcow2_alloc_cluster_offset(bs, offset, &cur_bytes,
&host_offset, &meta);
if (ret < 0) {
return ret;
@@ -2042,8 +2047,8 @@ static int preallocate(BlockDriverState *bs)
/* TODO Preallocate data if requested */
nb_sectors -= num;
offset += num << BDRV_SECTOR_BITS;
bytes -= cur_bytes;
offset += cur_bytes;
}
/*
@@ -2052,11 +2057,9 @@ static int preallocate(BlockDriverState *bs)
* EOF). Extend the image to the last allocated sector.
*/
if (host_offset != 0) {
uint8_t buf[BDRV_SECTOR_SIZE];
memset(buf, 0, BDRV_SECTOR_SIZE);
ret = bdrv_write(bs->file->bs,
(host_offset >> BDRV_SECTOR_BITS) + num - 1,
buf, 1);
uint8_t data = 0;
ret = bdrv_pwrite(bs->file->bs, (host_offset + cur_bytes) - 1,
&data, 1);
if (ret < 0) {
return ret;
}
@@ -2207,7 +2210,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
cpu_to_be64(QCOW2_COMPAT_LAZY_REFCOUNTS);
}
ret = blk_pwrite(blk, 0, header, cluster_size);
ret = blk_pwrite(blk, 0, header, cluster_size, 0);
g_free(header);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not write qcow2 header");
@@ -2217,7 +2220,7 @@ static int qcow2_create2(const char *filename, int64_t total_size,
/* Write a refcount table with one refcount block */
refcount_table = g_malloc0(2 * cluster_size);
refcount_table[0] = cpu_to_be64(2 * cluster_size);
ret = blk_pwrite(blk, cluster_size, refcount_table, 2 * cluster_size);
ret = blk_pwrite(blk, cluster_size, refcount_table, 2 * cluster_size, 0);
g_free(refcount_table);
if (ret < 0) {
@@ -2400,9 +2403,7 @@ static int qcow2_create(const char *filename, QemuOpts *opts, Error **errp)
ret = qcow2_create2(filename, size, backing_file, backing_fmt, flags,
cluster_size, prealloc, opts, version, refcount_order,
&local_err);
if (local_err) {
error_propagate(errp, local_err);
}
error_propagate(errp, local_err);
finish:
g_free(backing_file);
@@ -2411,21 +2412,67 @@ finish:
return ret;
}
static coroutine_fn int qcow2_co_write_zeroes(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
static bool is_zero_sectors(BlockDriverState *bs, int64_t start,
uint32_t count)
{
int nr;
BlockDriverState *file;
int64_t res;
if (!count) {
return true;
}
res = bdrv_get_block_status_above(bs, NULL, start, count,
&nr, &file);
return res >= 0 && (res & BDRV_BLOCK_ZERO) && nr == count;
}
static coroutine_fn int qcow2_co_pwrite_zeroes(BlockDriverState *bs,
int64_t offset, int count, BdrvRequestFlags flags)
{
int ret;
BDRVQcow2State *s = bs->opaque;
/* Emulate misaligned zero writes */
if (sector_num % s->cluster_sectors || nb_sectors % s->cluster_sectors) {
return -ENOTSUP;
uint32_t head = offset % s->cluster_size;
uint32_t tail = (offset + count) % s->cluster_size;
trace_qcow2_pwrite_zeroes_start_req(qemu_coroutine_self(), offset, count);
if (head || tail) {
int64_t cl_start = (offset - head) >> BDRV_SECTOR_BITS;
uint64_t off;
unsigned int nr;
assert(head + count <= s->cluster_size);
/* check whether remainder of cluster already reads as zero */
if (!(is_zero_sectors(bs, cl_start,
DIV_ROUND_UP(head, BDRV_SECTOR_SIZE)) &&
is_zero_sectors(bs, (offset + count) >> BDRV_SECTOR_BITS,
DIV_ROUND_UP(-tail & (s->cluster_size - 1),
BDRV_SECTOR_SIZE)))) {
return -ENOTSUP;
}
qemu_co_mutex_lock(&s->lock);
/* We can have new write after previous check */
offset = cl_start << BDRV_SECTOR_BITS;
count = s->cluster_size;
nr = s->cluster_size;
ret = qcow2_get_cluster_offset(bs, offset, &nr, &off);
if (ret != QCOW2_CLUSTER_UNALLOCATED && ret != QCOW2_CLUSTER_ZERO) {
qemu_co_mutex_unlock(&s->lock);
return -ENOTSUP;
}
} else {
qemu_co_mutex_lock(&s->lock);
}
trace_qcow2_pwrite_zeroes(qemu_coroutine_self(), offset, count);
/* Whatever is left can use real zero clusters */
qemu_co_mutex_lock(&s->lock);
ret = qcow2_zero_clusters(bs, sector_num << BDRV_SECTOR_BITS,
nb_sectors);
ret = qcow2_zero_clusters(bs, offset, count >> BDRV_SECTOR_BITS);
qemu_co_mutex_unlock(&s->lock);
return ret;
@@ -2616,8 +2663,8 @@ static int make_completely_empty(BlockDriverState *bs)
/* After this call, neither the in-memory nor the on-disk refcount
* information accurately describe the actual references */
ret = bdrv_write_zeroes(bs->file->bs, s->l1_table_offset / BDRV_SECTOR_SIZE,
l1_clusters * s->cluster_sectors, 0);
ret = bdrv_pwrite_zeroes(bs->file->bs, s->l1_table_offset,
l1_clusters * s->cluster_size, 0);
if (ret < 0) {
goto fail_broken_refcounts;
}
@@ -2630,9 +2677,8 @@ static int make_completely_empty(BlockDriverState *bs)
* overwrite parts of the existing refcount and L1 table, which is not
* an issue because the dirty flag is set, complete data loss is in fact
* desired and partial data loss is consequently fine as well */
ret = bdrv_write_zeroes(bs->file->bs, s->cluster_size / BDRV_SECTOR_SIZE,
(2 + l1_clusters) * s->cluster_size /
BDRV_SECTOR_SIZE, 0);
ret = bdrv_pwrite_zeroes(bs->file->bs, s->cluster_size,
(2 + l1_clusters) * s->cluster_size, 0);
/* This call (even if it failed overall) may have overwritten on-disk
* refcount structures; in that case, the in-memory refcount information
* will probably differ from the on-disk information which makes the BDS
@@ -2774,14 +2820,14 @@ static coroutine_fn int qcow2_co_flush_to_os(BlockDriverState *bs)
int ret;
qemu_co_mutex_lock(&s->lock);
ret = qcow2_cache_flush(bs, s->l2_table_cache);
ret = qcow2_cache_write(bs, s->l2_table_cache);
if (ret < 0) {
qemu_co_mutex_unlock(&s->lock);
return ret;
}
if (qcow2_need_accurate_refcounts(s)) {
ret = qcow2_cache_flush(bs, s->refcount_block_cache);
ret = qcow2_cache_write(bs, s->refcount_block_cache);
if (ret < 0) {
qemu_co_mutex_unlock(&s->lock);
return ret;
@@ -2861,36 +2907,20 @@ static int qcow2_save_vmstate(BlockDriverState *bs, QEMUIOVector *qiov,
int64_t pos)
{
BDRVQcow2State *s = bs->opaque;
int64_t total_sectors = bs->total_sectors;
bool zero_beyond_eof = bs->zero_beyond_eof;
int ret;
BLKDBG_EVENT(bs->file, BLKDBG_VMSTATE_SAVE);
bs->zero_beyond_eof = false;
ret = bdrv_pwritev(bs, qcow2_vm_state_offset(s) + pos, qiov);
bs->zero_beyond_eof = zero_beyond_eof;
/* bdrv_co_do_writev will have increased the total_sectors value to include
* the VM state - the VM state is however not an actual part of the block
* device, therefore, we need to restore the old value. */
bs->total_sectors = total_sectors;
return ret;
return bs->drv->bdrv_co_pwritev(bs, qcow2_vm_state_offset(s) + pos,
qiov->size, qiov, 0);
}
static int qcow2_load_vmstate(BlockDriverState *bs, uint8_t *buf,
int64_t pos, int size)
static int qcow2_load_vmstate(BlockDriverState *bs, QEMUIOVector *qiov,
int64_t pos)
{
BDRVQcow2State *s = bs->opaque;
bool zero_beyond_eof = bs->zero_beyond_eof;
int ret;
BLKDBG_EVENT(bs->file, BLKDBG_VMSTATE_LOAD);
bs->zero_beyond_eof = false;
ret = bdrv_pread(bs, qcow2_vm_state_offset(s) + pos, buf, size);
bs->zero_beyond_eof = zero_beyond_eof;
return ret;
return bs->drv->bdrv_co_preadv(bs, qcow2_vm_state_offset(s) + pos,
qiov->size, qiov, 0);
}
/*
@@ -3329,11 +3359,11 @@ BlockDriver bdrv_qcow2 = {
.bdrv_co_get_block_status = qcow2_co_get_block_status,
.bdrv_set_key = qcow2_set_key,
.bdrv_co_readv = qcow2_co_readv,
.bdrv_co_writev = qcow2_co_writev,
.bdrv_co_preadv = qcow2_co_preadv,
.bdrv_co_pwritev = qcow2_co_pwritev,
.bdrv_co_flush_to_os = qcow2_co_flush_to_os,
.bdrv_co_write_zeroes = qcow2_co_write_zeroes,
.bdrv_co_pwrite_zeroes = qcow2_co_pwrite_zeroes,
.bdrv_co_discard = qcow2_co_discard,
.bdrv_truncate = qcow2_truncate,
.bdrv_write_compressed = qcow2_write_compressed,

View File

@@ -302,8 +302,8 @@ typedef struct Qcow2COWRegion {
*/
uint64_t offset;
/** Number of sectors to copy */
int nb_sectors;
/** Number of bytes to copy */
int nb_bytes;
} Qcow2COWRegion;
/**
@@ -318,12 +318,6 @@ typedef struct QCowL2Meta
/** Host offset of the first newly allocated cluster */
uint64_t alloc_offset;
/**
* Number of sectors from the start of the first allocated cluster to
* the end of the (possibly shortened) request
*/
int nb_available;
/** Number of newly allocated clusters */
int nb_clusters;
@@ -471,8 +465,7 @@ static inline uint64_t l2meta_cow_start(QCowL2Meta *m)
static inline uint64_t l2meta_cow_end(QCowL2Meta *m)
{
return m->offset + m->cow_end.offset
+ (m->cow_end.nb_sectors << BDRV_SECTOR_BITS);
return m->offset + m->cow_end.offset + m->cow_end.nb_bytes;
}
static inline uint64_t refcount_diff(uint64_t r1, uint64_t r2)
@@ -544,9 +537,10 @@ int qcow2_encrypt_sectors(BDRVQcow2State *s, int64_t sector_num,
int nb_sectors, bool enc, Error **errp);
int qcow2_get_cluster_offset(BlockDriverState *bs, uint64_t offset,
int *num, uint64_t *cluster_offset);
unsigned int *bytes, uint64_t *cluster_offset);
int qcow2_alloc_cluster_offset(BlockDriverState *bs, uint64_t offset,
int *num, uint64_t *host_offset, QCowL2Meta **m);
unsigned int *bytes, uint64_t *host_offset,
QCowL2Meta **m);
uint64_t qcow2_alloc_compressed_cluster_offset(BlockDriverState *bs,
uint64_t offset,
int compressed_size);
@@ -583,6 +577,7 @@ int qcow2_cache_destroy(BlockDriverState* bs, Qcow2Cache *c);
void qcow2_cache_entry_mark_dirty(BlockDriverState *bs, Qcow2Cache *c,
void *table);
int qcow2_cache_flush(BlockDriverState *bs, Qcow2Cache *c);
int qcow2_cache_write(BlockDriverState *bs, Qcow2Cache *c);
int qcow2_cache_set_dependency(BlockDriverState *bs, Qcow2Cache *c,
Qcow2Cache *dependency);
void qcow2_cache_depends_on_flush(Qcow2Cache *c);

View File

@@ -234,8 +234,7 @@ int qed_check(BDRVQEDState *s, BdrvCheckResult *result, bool fix)
}
check.result->bfi.total_clusters =
(s->header.image_size + s->header.cluster_size - 1) /
s->header.cluster_size;
DIV_ROUND_UP(s->header.image_size, s->header.cluster_size);
ret = qed_check_l1_table(&check, s->l1_table);
if (ret == 0) {
/* Only check for leaks if entire image was scanned successfully */

View File

@@ -16,6 +16,7 @@
#include "trace.h"
#include "qemu/sockets.h" /* for EINPROGRESS on Windows */
#include "qed.h"
#include "qemu/bswap.h"
typedef struct {
GenericCB gencb;

View File

@@ -15,6 +15,7 @@
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu/timer.h"
#include "qemu/bswap.h"
#include "trace.h"
#include "qed.h"
#include "qapi/qmp/qerror.h"
@@ -142,8 +143,7 @@ static void qed_write_header(BDRVQEDState *s, BlockCompletionFunc cb,
* them, and write back.
*/
int nsectors = (sizeof(QEDHeader) + BDRV_SECTOR_SIZE - 1) /
BDRV_SECTOR_SIZE;
int nsectors = DIV_ROUND_UP(sizeof(QEDHeader), BDRV_SECTOR_SIZE);
size_t len = nsectors * BDRV_SECTOR_SIZE;
QEDWriteHeaderCB *write_header_cb = gencb_alloc(sizeof(*write_header_cb),
cb, opaque);
@@ -517,7 +517,7 @@ static void bdrv_qed_refresh_limits(BlockDriverState *bs, Error **errp)
{
BDRVQEDState *s = bs->opaque;
bs->bl.write_zeroes_alignment = s->header.cluster_size >> BDRV_SECTOR_BITS;
bs->bl.pwrite_zeroes_alignment = s->header.cluster_size;
}
/* We have nothing to do for QED reopen, stubs just return
@@ -601,18 +601,18 @@ static int qed_create(const char *filename, uint32_t cluster_size,
}
qed_header_cpu_to_le(&header, &le_header);
ret = blk_pwrite(blk, 0, &le_header, sizeof(le_header));
ret = blk_pwrite(blk, 0, &le_header, sizeof(le_header), 0);
if (ret < 0) {
goto out;
}
ret = blk_pwrite(blk, sizeof(le_header), backing_file,
header.backing_filename_size);
header.backing_filename_size, 0);
if (ret < 0) {
goto out;
}
l1_table = g_malloc0(l1_size);
ret = blk_pwrite(blk, header.l1_table_offset, l1_table, l1_size);
ret = blk_pwrite(blk, header.l1_table_offset, l1_table, l1_size, 0);
if (ret < 0) {
goto out;
}
@@ -1418,7 +1418,7 @@ typedef struct {
bool done;
} QEDWriteZeroesCB;
static void coroutine_fn qed_co_write_zeroes_cb(void *opaque, int ret)
static void coroutine_fn qed_co_pwrite_zeroes_cb(void *opaque, int ret)
{
QEDWriteZeroesCB *cb = opaque;
@@ -1429,10 +1429,10 @@ static void coroutine_fn qed_co_write_zeroes_cb(void *opaque, int ret)
}
}
static int coroutine_fn bdrv_qed_co_write_zeroes(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors,
BdrvRequestFlags flags)
static int coroutine_fn bdrv_qed_co_pwrite_zeroes(BlockDriverState *bs,
int64_t offset,
int count,
BdrvRequestFlags flags)
{
BlockAIOCB *blockacb;
BDRVQEDState *s = bs->opaque;
@@ -1440,25 +1440,22 @@ static int coroutine_fn bdrv_qed_co_write_zeroes(BlockDriverState *bs,
QEMUIOVector qiov;
struct iovec iov;
/* Refuse if there are untouched backing file sectors */
if (bs->backing) {
if (qed_offset_into_cluster(s, sector_num * BDRV_SECTOR_SIZE) != 0) {
return -ENOTSUP;
}
if (qed_offset_into_cluster(s, nb_sectors * BDRV_SECTOR_SIZE) != 0) {
return -ENOTSUP;
}
/* Fall back if the request is not aligned */
if (qed_offset_into_cluster(s, offset) ||
qed_offset_into_cluster(s, count)) {
return -ENOTSUP;
}
/* Zero writes start without an I/O buffer. If a buffer becomes necessary
* then it will be allocated during request processing.
*/
iov.iov_base = NULL,
iov.iov_len = nb_sectors * BDRV_SECTOR_SIZE,
iov.iov_base = NULL;
iov.iov_len = count;
qemu_iovec_init_external(&qiov, &iov, 1);
blockacb = qed_aio_setup(bs, sector_num, &qiov, nb_sectors,
qed_co_write_zeroes_cb, &cb,
blockacb = qed_aio_setup(bs, offset >> BDRV_SECTOR_BITS, &qiov,
count >> BDRV_SECTOR_BITS,
qed_co_pwrite_zeroes_cb, &cb,
QED_AIOCB_WRITE | QED_AIOCB_ZERO);
if (!blockacb) {
return -EIO;
@@ -1594,12 +1591,6 @@ static void bdrv_qed_invalidate_cache(BlockDriverState *bs, Error **errp)
bdrv_qed_close(bs);
bdrv_invalidate_cache(bs->file->bs, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
memset(s, 0, sizeof(BDRVQEDState));
ret = bdrv_qed_open(bs, NULL, bs->open_flags, &local_err);
if (local_err) {
@@ -1669,7 +1660,7 @@ static BlockDriver bdrv_qed = {
.bdrv_co_get_block_status = bdrv_qed_co_get_block_status,
.bdrv_aio_readv = bdrv_qed_aio_readv,
.bdrv_aio_writev = bdrv_qed_aio_writev,
.bdrv_co_write_zeroes = bdrv_qed_co_write_zeroes,
.bdrv_co_pwrite_zeroes = bdrv_qed_co_pwrite_zeroes,
.bdrv_truncate = bdrv_qed_truncate,
.bdrv_getlength = bdrv_qed_getlength,
.bdrv_get_info = bdrv_qed_get_info,

View File

@@ -14,6 +14,7 @@
*/
#include "qemu/osdep.h"
#include "qemu/cutils.h"
#include "block/block_int.h"
#include "qapi/qmp/qbool.h"
#include "qapi/qmp/qdict.h"
@@ -67,6 +68,9 @@ typedef struct QuorumVotes {
typedef struct BDRVQuorumState {
BdrvChild **children; /* children BlockDriverStates */
int num_children; /* children count */
unsigned next_child_index; /* the index of the next child that should
* be added
*/
int threshold; /* if less than threshold children reads gave the
* same result a quorum error occurs.
*/
@@ -747,21 +751,6 @@ static int64_t quorum_getlength(BlockDriverState *bs)
return result;
}
static void quorum_invalidate_cache(BlockDriverState *bs, Error **errp)
{
BDRVQuorumState *s = bs->opaque;
Error *local_err = NULL;
int i;
for (i = 0; i < s->num_children; i++) {
bdrv_invalidate_cache(s->children[i]->bs, &local_err);
if (local_err) {
error_propagate(errp, local_err);
return;
}
}
}
static coroutine_fn int quorum_co_flush(BlockDriverState *bs)
{
BDRVQuorumState *s = bs->opaque;
@@ -898,9 +887,9 @@ static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
ret = -EINVAL;
goto exit;
}
if (s->num_children < 2) {
if (s->num_children < 1) {
error_setg(&local_err,
"Number of provided children must be greater than 1");
"Number of provided children must be 1 or more");
ret = -EINVAL;
goto exit;
}
@@ -964,6 +953,7 @@ static int quorum_open(BlockDriverState *bs, QDict *options, int flags,
opened[i] = true;
}
s->next_child_index = s->num_children;
g_free(opened);
goto exit;
@@ -981,9 +971,7 @@ close_exit:
exit:
qemu_opts_del(opts);
/* propagate error */
if (local_err) {
error_propagate(errp, local_err);
}
error_propagate(errp, local_err);
return ret;
}
@@ -999,25 +987,70 @@ static void quorum_close(BlockDriverState *bs)
g_free(s->children);
}
static void quorum_detach_aio_context(BlockDriverState *bs)
static void quorum_add_child(BlockDriverState *bs, BlockDriverState *child_bs,
Error **errp)
{
BDRVQuorumState *s = bs->opaque;
int i;
BdrvChild *child;
char indexstr[32];
int ret;
for (i = 0; i < s->num_children; i++) {
bdrv_detach_aio_context(s->children[i]->bs);
assert(s->num_children <= INT_MAX / sizeof(BdrvChild *));
if (s->num_children == INT_MAX / sizeof(BdrvChild *) ||
s->next_child_index == UINT_MAX) {
error_setg(errp, "Too many children");
return;
}
ret = snprintf(indexstr, 32, "children.%u", s->next_child_index);
if (ret < 0 || ret >= 32) {
error_setg(errp, "cannot generate child name");
return;
}
s->next_child_index++;
bdrv_drained_begin(bs);
/* We can safely add the child now */
bdrv_ref(child_bs);
child = bdrv_attach_child(bs, child_bs, indexstr, &child_format);
s->children = g_renew(BdrvChild *, s->children, s->num_children + 1);
s->children[s->num_children++] = child;
bdrv_drained_end(bs);
}
static void quorum_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
static void quorum_del_child(BlockDriverState *bs, BdrvChild *child,
Error **errp)
{
BDRVQuorumState *s = bs->opaque;
int i;
for (i = 0; i < s->num_children; i++) {
bdrv_attach_aio_context(s->children[i]->bs, new_context);
if (s->children[i] == child) {
break;
}
}
/* we have checked it in bdrv_del_child() */
assert(i < s->num_children);
if (s->num_children <= s->threshold) {
error_setg(errp,
"The number of children cannot be lower than the vote threshold %d",
s->threshold);
return;
}
bdrv_drained_begin(bs);
/* We can safely remove this child now */
memmove(&s->children[i], &s->children[i + 1],
(s->num_children - i - 1) * sizeof(BdrvChild *));
s->children = g_renew(BdrvChild *, s->children, --s->num_children);
bdrv_unref_child(bs, child);
bdrv_drained_end(bs);
}
static void quorum_refresh_filename(BlockDriverState *bs, QDict *options)
@@ -1070,10 +1103,9 @@ static BlockDriver bdrv_quorum = {
.bdrv_aio_readv = quorum_aio_readv,
.bdrv_aio_writev = quorum_aio_writev,
.bdrv_invalidate_cache = quorum_invalidate_cache,
.bdrv_detach_aio_context = quorum_detach_aio_context,
.bdrv_attach_aio_context = quorum_attach_aio_context,
.bdrv_add_child = quorum_add_child,
.bdrv_del_child = quorum_del_child,
.is_filter = true,
.bdrv_recurse_is_first_non_filter = quorum_recurse_is_first_non_filter,

View File

@@ -15,6 +15,7 @@
#ifndef QEMU_RAW_AIO_H
#define QEMU_RAW_AIO_H
#include "qemu/coroutine.h"
#include "qemu/iov.h"
/* AIO request types */
@@ -35,15 +36,18 @@
/* linux-aio.c - Linux native implementation */
#ifdef CONFIG_LINUX_AIO
void *laio_init(void);
void laio_cleanup(void *s);
BlockAIOCB *laio_submit(BlockDriverState *bs, void *aio_ctx, int fd,
typedef struct LinuxAioState LinuxAioState;
LinuxAioState *laio_init(void);
void laio_cleanup(LinuxAioState *s);
int coroutine_fn laio_co_submit(BlockDriverState *bs, LinuxAioState *s, int fd,
uint64_t offset, QEMUIOVector *qiov, int type);
BlockAIOCB *laio_submit(BlockDriverState *bs, LinuxAioState *s, int fd,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque, int type);
void laio_detach_aio_context(void *s, AioContext *old_context);
void laio_attach_aio_context(void *s, AioContext *new_context);
void laio_io_plug(BlockDriverState *bs, void *aio_ctx);
void laio_io_unplug(BlockDriverState *bs, void *aio_ctx, bool unplug);
void laio_detach_aio_context(LinuxAioState *s, AioContext *old_context);
void laio_attach_aio_context(LinuxAioState *s, AioContext *new_context);
void laio_io_plug(BlockDriverState *bs, LinuxAioState *s);
void laio_io_unplug(BlockDriverState *bs, LinuxAioState *s);
#endif
#ifdef _WIN32

View File

@@ -139,7 +139,7 @@ typedef struct BDRVRawState {
#ifdef CONFIG_LINUX_AIO
int use_aio;
void *aio_ctx;
LinuxAioState *aio_ctx;
#endif
#ifdef CONFIG_XFS
bool is_xfs:1;
@@ -398,7 +398,7 @@ static void raw_attach_aio_context(BlockDriverState *bs,
}
#ifdef CONFIG_LINUX_AIO
static int raw_set_aio(void **aio_ctx, int *use_aio, int bdrv_flags)
static int raw_set_aio(LinuxAioState **aio_ctx, int *use_aio, int bdrv_flags)
{
int ret = -1;
assert(aio_ctx != NULL);
@@ -517,6 +517,7 @@ static int raw_open_common(BlockDriverState *bs, QDict *options,
s->has_discard = true;
s->has_write_zeroes = true;
bs->supported_zero_flags = BDRV_REQ_MAY_UNMAP;
if ((bs->open_flags & BDRV_O_NOCACHE) != 0) {
s->needs_alignment = true;
}
@@ -581,15 +582,9 @@ static int raw_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVRawState *s = bs->opaque;
Error *local_err = NULL;
int ret;
s->type = FTYPE_FILE;
ret = raw_open_common(bs, options, flags, 0, &local_err);
if (local_err) {
error_propagate(errp, local_err);
}
return ret;
return raw_open_common(bs, options, flags, 0, errp);
}
static int raw_reopen_prepare(BDRVReopenState *state,
@@ -728,9 +723,33 @@ static void raw_reopen_abort(BDRVReopenState *state)
state->opaque = NULL;
}
static int hdev_get_max_transfer_length(int fd)
{
#ifdef BLKSECTGET
int max_sectors = 0;
if (ioctl(fd, BLKSECTGET, &max_sectors) == 0) {
return max_sectors;
} else {
return -errno;
}
#else
return -ENOSYS;
#endif
}
static void raw_refresh_limits(BlockDriverState *bs, Error **errp)
{
BDRVRawState *s = bs->opaque;
struct stat st;
if (!fstat(s->fd, &st)) {
if (S_ISBLK(st.st_mode)) {
int ret = hdev_get_max_transfer_length(s->fd);
if (ret >= 0) {
bs->bl.max_transfer_length = ret;
}
}
}
raw_probe_alignment(bs, s->fd, errp);
bs->bl.min_mem_alignment = s->buf_align;
@@ -1251,8 +1270,8 @@ static int aio_worker(void *arg)
}
static int paio_submit_co(BlockDriverState *bs, int fd,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
int type)
int64_t offset, QEMUIOVector *qiov,
int count, int type)
{
RawPosixAIOData *acb = g_new(RawPosixAIOData, 1);
ThreadPool *pool;
@@ -1261,16 +1280,16 @@ static int paio_submit_co(BlockDriverState *bs, int fd,
acb->aio_type = type;
acb->aio_fildes = fd;
acb->aio_nbytes = nb_sectors * BDRV_SECTOR_SIZE;
acb->aio_offset = sector_num * BDRV_SECTOR_SIZE;
acb->aio_nbytes = count;
acb->aio_offset = offset;
if (qiov) {
acb->aio_iov = qiov->iov;
acb->aio_niov = qiov->niov;
assert(qiov->size == acb->aio_nbytes);
assert(qiov->size == count);
}
trace_paio_submit_co(sector_num, nb_sectors, type);
trace_paio_submit_co(offset, count, type);
pool = aio_get_thread_pool(bdrv_get_aio_context(bs));
return thread_pool_submit_co(pool, aio_worker, acb);
}
@@ -1300,14 +1319,13 @@ static BlockAIOCB *paio_submit(BlockDriverState *bs, int fd,
return thread_pool_submit_aio(pool, aio_worker, acb, cb, opaque);
}
static BlockAIOCB *raw_aio_submit(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque, int type)
static int coroutine_fn raw_co_prw(BlockDriverState *bs, uint64_t offset,
uint64_t bytes, QEMUIOVector *qiov, int type)
{
BDRVRawState *s = bs->opaque;
if (fd_open(bs) < 0)
return NULL;
return -EIO;
/*
* Check if the underlying device requires requests to be aligned,
@@ -1320,14 +1338,28 @@ static BlockAIOCB *raw_aio_submit(BlockDriverState *bs,
type |= QEMU_AIO_MISALIGNED;
#ifdef CONFIG_LINUX_AIO
} else if (s->use_aio) {
return laio_submit(bs, s->aio_ctx, s->fd, sector_num, qiov,
nb_sectors, cb, opaque, type);
assert(qiov->size == bytes);
return laio_co_submit(bs, s->aio_ctx, s->fd, offset, qiov, type);
#endif
}
}
return paio_submit(bs, s->fd, sector_num, qiov, nb_sectors,
cb, opaque, type);
return paio_submit_co(bs, s->fd, offset, qiov, bytes, type);
}
static int coroutine_fn raw_co_preadv(BlockDriverState *bs, uint64_t offset,
uint64_t bytes, QEMUIOVector *qiov,
int flags)
{
return raw_co_prw(bs, offset, bytes, qiov, QEMU_AIO_READ);
}
static int coroutine_fn raw_co_pwritev(BlockDriverState *bs, uint64_t offset,
uint64_t bytes, QEMUIOVector *qiov,
int flags)
{
assert(flags == 0);
return raw_co_prw(bs, offset, bytes, qiov, QEMU_AIO_WRITE);
}
static void raw_aio_plug(BlockDriverState *bs)
@@ -1345,37 +1377,11 @@ static void raw_aio_unplug(BlockDriverState *bs)
#ifdef CONFIG_LINUX_AIO
BDRVRawState *s = bs->opaque;
if (s->use_aio) {
laio_io_unplug(bs, s->aio_ctx, true);
laio_io_unplug(bs, s->aio_ctx);
}
#endif
}
static void raw_aio_flush_io_queue(BlockDriverState *bs)
{
#ifdef CONFIG_LINUX_AIO
BDRVRawState *s = bs->opaque;
if (s->use_aio) {
laio_io_unplug(bs, s->aio_ctx, false);
}
#endif
}
static BlockAIOCB *raw_aio_readv(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
return raw_aio_submit(bs, sector_num, qiov, nb_sectors,
cb, opaque, QEMU_AIO_READ);
}
static BlockAIOCB *raw_aio_writev(BlockDriverState *bs,
int64_t sector_num, QEMUIOVector *qiov, int nb_sectors,
BlockCompletionFunc *cb, void *opaque)
{
return raw_aio_submit(bs, sector_num, qiov, nb_sectors,
cb, opaque, QEMU_AIO_WRITE);
}
static BlockAIOCB *raw_aio_flush(BlockDriverState *bs,
BlockCompletionFunc *cb, void *opaque)
{
@@ -1877,17 +1883,17 @@ static coroutine_fn BlockAIOCB *raw_aio_discard(BlockDriverState *bs,
cb, opaque, QEMU_AIO_DISCARD);
}
static int coroutine_fn raw_co_write_zeroes(
BlockDriverState *bs, int64_t sector_num,
int nb_sectors, BdrvRequestFlags flags)
static int coroutine_fn raw_co_pwrite_zeroes(
BlockDriverState *bs, int64_t offset,
int count, BdrvRequestFlags flags)
{
BDRVRawState *s = bs->opaque;
if (!(flags & BDRV_REQ_MAY_UNMAP)) {
return paio_submit_co(bs, s->fd, sector_num, NULL, nb_sectors,
return paio_submit_co(bs, s->fd, offset, NULL, count,
QEMU_AIO_WRITE_ZEROES);
} else if (s->discard_zeroes) {
return paio_submit_co(bs, s->fd, sector_num, NULL, nb_sectors,
return paio_submit_co(bs, s->fd, offset, NULL, count,
QEMU_AIO_DISCARD);
}
return -ENOTSUP;
@@ -1940,16 +1946,15 @@ BlockDriver bdrv_file = {
.bdrv_create = raw_create,
.bdrv_has_zero_init = bdrv_has_zero_init_1,
.bdrv_co_get_block_status = raw_co_get_block_status,
.bdrv_co_write_zeroes = raw_co_write_zeroes,
.bdrv_co_pwrite_zeroes = raw_co_pwrite_zeroes,
.bdrv_aio_readv = raw_aio_readv,
.bdrv_aio_writev = raw_aio_writev,
.bdrv_co_preadv = raw_co_preadv,
.bdrv_co_pwritev = raw_co_pwritev,
.bdrv_aio_flush = raw_aio_flush,
.bdrv_aio_discard = raw_aio_discard,
.bdrv_refresh_limits = raw_refresh_limits,
.bdrv_io_plug = raw_aio_plug,
.bdrv_io_unplug = raw_aio_unplug,
.bdrv_flush_io_queue = raw_aio_flush_io_queue,
.bdrv_truncate = raw_truncate,
.bdrv_getlength = raw_getlength,
@@ -2225,9 +2230,7 @@ hdev_open_Mac_error:
ret = raw_open_common(bs, options, flags, 0, &local_err);
if (ret < 0) {
if (local_err) {
error_propagate(errp, local_err);
}
error_propagate(errp, local_err);
#if defined(__APPLE__) && defined(__MACH__)
if (*bsd_path) {
filename = bsd_path;
@@ -2303,8 +2306,8 @@ static coroutine_fn BlockAIOCB *hdev_aio_discard(BlockDriverState *bs,
cb, opaque, QEMU_AIO_DISCARD|QEMU_AIO_BLKDEV);
}
static coroutine_fn int hdev_co_write_zeroes(BlockDriverState *bs,
int64_t sector_num, int nb_sectors, BdrvRequestFlags flags)
static coroutine_fn int hdev_co_pwrite_zeroes(BlockDriverState *bs,
int64_t offset, int count, BdrvRequestFlags flags)
{
BDRVRawState *s = bs->opaque;
int rc;
@@ -2314,10 +2317,10 @@ static coroutine_fn int hdev_co_write_zeroes(BlockDriverState *bs,
return rc;
}
if (!(flags & BDRV_REQ_MAY_UNMAP)) {
return paio_submit_co(bs, s->fd, sector_num, NULL, nb_sectors,
return paio_submit_co(bs, s->fd, offset, NULL, count,
QEMU_AIO_WRITE_ZEROES|QEMU_AIO_BLKDEV);
} else if (s->discard_zeroes) {
return paio_submit_co(bs, s->fd, sector_num, NULL, nb_sectors,
return paio_submit_co(bs, s->fd, offset, NULL, count,
QEMU_AIO_DISCARD|QEMU_AIO_BLKDEV);
}
return -ENOTSUP;
@@ -2389,16 +2392,15 @@ static BlockDriver bdrv_host_device = {
.bdrv_reopen_abort = raw_reopen_abort,
.bdrv_create = hdev_create,
.create_opts = &raw_create_opts,
.bdrv_co_write_zeroes = hdev_co_write_zeroes,
.bdrv_co_pwrite_zeroes = hdev_co_pwrite_zeroes,
.bdrv_aio_readv = raw_aio_readv,
.bdrv_aio_writev = raw_aio_writev,
.bdrv_co_preadv = raw_co_preadv,
.bdrv_co_pwritev = raw_co_pwritev,
.bdrv_aio_flush = raw_aio_flush,
.bdrv_aio_discard = hdev_aio_discard,
.bdrv_refresh_limits = raw_refresh_limits,
.bdrv_io_plug = raw_aio_plug,
.bdrv_io_unplug = raw_aio_unplug,
.bdrv_flush_io_queue = raw_aio_flush_io_queue,
.bdrv_truncate = raw_truncate,
.bdrv_getlength = raw_getlength,
@@ -2433,17 +2435,11 @@ static int cdrom_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
BDRVRawState *s = bs->opaque;
Error *local_err = NULL;
int ret;
s->type = FTYPE_CD;
/* open will not fail even if no CD is inserted, so add O_NONBLOCK */
ret = raw_open_common(bs, options, flags, O_NONBLOCK, &local_err);
if (local_err) {
error_propagate(errp, local_err);
}
return ret;
return raw_open_common(bs, options, flags, O_NONBLOCK, errp);
}
static int cdrom_probe_device(const char *filename)
@@ -2522,13 +2518,13 @@ static BlockDriver bdrv_host_cdrom = {
.bdrv_create = hdev_create,
.create_opts = &raw_create_opts,
.bdrv_aio_readv = raw_aio_readv,
.bdrv_aio_writev = raw_aio_writev,
.bdrv_co_preadv = raw_co_preadv,
.bdrv_co_pwritev = raw_co_pwritev,
.bdrv_aio_flush = raw_aio_flush,
.bdrv_refresh_limits = raw_refresh_limits,
.bdrv_io_plug = raw_aio_plug,
.bdrv_io_unplug = raw_aio_unplug,
.bdrv_flush_io_queue = raw_aio_flush_io_queue,
.bdrv_truncate = raw_truncate,
.bdrv_getlength = raw_getlength,
@@ -2561,9 +2557,7 @@ static int cdrom_open(BlockDriverState *bs, QDict *options, int flags,
ret = raw_open_common(bs, options, flags, 0, &local_err);
if (ret) {
if (local_err) {
error_propagate(errp, local_err);
}
error_propagate(errp, local_err);
return ret;
}
@@ -2658,13 +2652,12 @@ static BlockDriver bdrv_host_cdrom = {
.bdrv_create = hdev_create,
.create_opts = &raw_create_opts,
.bdrv_aio_readv = raw_aio_readv,
.bdrv_aio_writev = raw_aio_writev,
.bdrv_co_preadv = raw_co_preadv,
.bdrv_co_pwritev = raw_co_pwritev,
.bdrv_aio_flush = raw_aio_flush,
.bdrv_refresh_limits = raw_refresh_limits,
.bdrv_io_plug = raw_aio_plug,
.bdrv_io_unplug = raw_aio_unplug,
.bdrv_flush_io_queue = raw_aio_flush_io_queue,
.bdrv_truncate = raw_truncate,
.bdrv_getlength = raw_getlength,

View File

@@ -105,8 +105,8 @@ raw_co_writev_flags(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
}
BLKDBG_EVENT(bs->file, BLKDBG_WRITE_AIO);
ret = bdrv_co_do_pwritev(bs->file->bs, sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE, qiov, flags);
ret = bdrv_co_pwritev(bs->file->bs, sector_num * BDRV_SECTOR_SIZE,
nb_sectors * BDRV_SECTOR_SIZE, qiov, flags);
fail:
if (qiov == &local_qiov) {
@@ -116,13 +116,6 @@ fail:
return ret;
}
static int coroutine_fn
raw_co_writev(BlockDriverState *bs, int64_t sector_num, int nb_sectors,
QEMUIOVector *qiov)
{
return raw_co_writev_flags(bs, sector_num, nb_sectors, qiov, 0);
}
static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors, int *pnum,
@@ -134,11 +127,11 @@ static int64_t coroutine_fn raw_co_get_block_status(BlockDriverState *bs,
(sector_num << BDRV_SECTOR_BITS);
}
static int coroutine_fn raw_co_write_zeroes(BlockDriverState *bs,
int64_t sector_num, int nb_sectors,
BdrvRequestFlags flags)
static int coroutine_fn raw_co_pwrite_zeroes(BlockDriverState *bs,
int64_t offset, int count,
BdrvRequestFlags flags)
{
return bdrv_co_write_zeroes(bs->file->bs, sector_num, nb_sectors, flags);
return bdrv_co_pwrite_zeroes(bs->file->bs, offset, count, flags);
}
static int coroutine_fn raw_co_discard(BlockDriverState *bs,
@@ -197,20 +190,15 @@ static int raw_has_zero_init(BlockDriverState *bs)
static int raw_create(const char *filename, QemuOpts *opts, Error **errp)
{
Error *local_err = NULL;
int ret;
ret = bdrv_create_file(filename, opts, &local_err);
if (local_err) {
error_propagate(errp, local_err);
}
return ret;
return bdrv_create_file(filename, opts, errp);
}
static int raw_open(BlockDriverState *bs, QDict *options, int flags,
Error **errp)
{
bs->sg = bs->file->bs->sg;
bs->supported_write_flags = BDRV_REQ_FUA;
bs->supported_zero_flags = BDRV_REQ_FUA | BDRV_REQ_MAY_UNMAP;
if (bs->probed && !bdrv_is_read_only(bs)) {
fprintf(stderr,
@@ -256,10 +244,8 @@ BlockDriver bdrv_raw = {
.bdrv_close = &raw_close,
.bdrv_create = &raw_create,
.bdrv_co_readv = &raw_co_readv,
.bdrv_co_writev = &raw_co_writev,
.bdrv_co_writev_flags = &raw_co_writev_flags,
.supported_write_flags = BDRV_REQ_FUA,
.bdrv_co_write_zeroes = &raw_co_write_zeroes,
.bdrv_co_pwrite_zeroes = &raw_co_pwrite_zeroes,
.bdrv_co_discard = &raw_co_discard,
.bdrv_co_get_block_status = &raw_co_get_block_status,
.bdrv_truncate = &raw_truncate,

View File

@@ -290,7 +290,8 @@ static int qemu_rbd_set_conf(rados_t cluster, const char *conf,
if (only_read_conf_file) {
ret = rados_conf_read_file(cluster, value);
if (ret < 0) {
error_setg(errp, "error reading conf file %s", value);
error_setg_errno(errp, -ret, "error reading conf file %s",
value);
break;
}
}
@@ -299,7 +300,7 @@ static int qemu_rbd_set_conf(rados_t cluster, const char *conf,
} else if (!only_read_conf_file) {
ret = rados_conf_set(cluster, name, value);
if (ret < 0) {
error_setg(errp, "invalid conf option %s", name);
error_setg_errno(errp, -ret, "invalid conf option %s", name);
ret = -EINVAL;
break;
}
@@ -354,9 +355,10 @@ static int qemu_rbd_create(const char *filename, QemuOpts *opts, Error **errp)
}
clientname = qemu_rbd_parse_clientname(conf, clientname_buf);
if (rados_create(&cluster, clientname) < 0) {
error_setg(errp, "error initializing");
return -EIO;
ret = rados_create(&cluster, clientname);
if (ret < 0) {
error_setg_errno(errp, -ret, "error initializing");
return ret;
}
if (strstr(conf, "conf=") == NULL) {
@@ -381,21 +383,27 @@ static int qemu_rbd_create(const char *filename, QemuOpts *opts, Error **errp)
return -EIO;
}
if (rados_connect(cluster) < 0) {
error_setg(errp, "error connecting");
ret = rados_connect(cluster);
if (ret < 0) {
error_setg_errno(errp, -ret, "error connecting");
rados_shutdown(cluster);
return -EIO;
return ret;
}
if (rados_ioctx_create(cluster, pool, &io_ctx) < 0) {
error_setg(errp, "error opening pool %s", pool);
ret = rados_ioctx_create(cluster, pool, &io_ctx);
if (ret < 0) {
error_setg_errno(errp, -ret, "error opening pool %s", pool);
rados_shutdown(cluster);
return -EIO;
return ret;
}
ret = rbd_create(io_ctx, name, bytes, &obj_order);
rados_ioctx_destroy(io_ctx);
rados_shutdown(cluster);
if (ret < 0) {
error_setg_errno(errp, -ret, "error rbd create");
return ret;
}
return ret;
}
@@ -500,7 +508,7 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
clientname = qemu_rbd_parse_clientname(conf, clientname_buf);
r = rados_create(&s->cluster, clientname);
if (r < 0) {
error_setg(errp, "error initializing");
error_setg_errno(errp, -r, "error initializing");
goto failed_opts;
}
@@ -546,19 +554,19 @@ static int qemu_rbd_open(BlockDriverState *bs, QDict *options, int flags,
r = rados_connect(s->cluster);
if (r < 0) {
error_setg(errp, "error connecting");
error_setg_errno(errp, -r, "error connecting");
goto failed_shutdown;
}
r = rados_ioctx_create(s->cluster, pool, &s->io_ctx);
if (r < 0) {
error_setg(errp, "error opening pool %s", pool);
error_setg_errno(errp, -r, "error opening pool %s", pool);
goto failed_shutdown;
}
r = rbd_open(s->io_ctx, s->name, &s->image, s->snap);
if (r < 0) {
error_setg(errp, "error reading header from %s", s->name);
error_setg_errno(errp, -r, "error reading header from %s", s->name);
goto failed_open;
}
@@ -875,10 +883,8 @@ static int qemu_rbd_snap_rollback(BlockDriverState *bs,
const char *snapshot_name)
{
BDRVRBDState *s = bs->opaque;
int r;
r = rbd_snap_rollback(s->image, snapshot_name);
return r;
return rbd_snap_rollback(s->image, snapshot_name);
}
static int qemu_rbd_snap_list(BlockDriverState *bs,

View File

@@ -294,13 +294,16 @@ static inline size_t count_data_objs(const struct SheepdogInode *inode)
#undef DPRINTF
#ifdef DEBUG_SDOG
#define DPRINTF(fmt, args...) \
do { \
fprintf(stdout, "%s %d: " fmt, __func__, __LINE__, ##args); \
} while (0)
#define DEBUG_SDOG_PRINT 1
#else
#define DPRINTF(fmt, args...)
#define DEBUG_SDOG_PRINT 0
#endif
#define DPRINTF(fmt, args...) \
do { \
if (DEBUG_SDOG_PRINT) { \
fprintf(stderr, "%s %d: " fmt, __func__, __LINE__, ##args); \
} \
} while (0)
typedef struct SheepdogAIOCB SheepdogAIOCB;
@@ -1678,7 +1681,7 @@ static int sd_prealloc(const char *filename, Error **errp)
if (ret < 0) {
goto out;
}
ret = blk_pwrite(blk, idx * buf_size, buf, buf_size);
ret = blk_pwrite(blk, idx * buf_size, buf, buf_size, 0);
if (ret < 0) {
goto out;
}
@@ -2781,12 +2784,19 @@ static int sd_save_vmstate(BlockDriverState *bs, QEMUIOVector *qiov,
return ret;
}
static int sd_load_vmstate(BlockDriverState *bs, uint8_t *data,
int64_t pos, int size)
static int sd_load_vmstate(BlockDriverState *bs, QEMUIOVector *qiov,
int64_t pos)
{
BDRVSheepdogState *s = bs->opaque;
void *buf;
int ret;
return do_load_save_vmstate(s, data, pos, size, 1);
buf = qemu_blockalign(bs, qiov->size);
ret = do_load_save_vmstate(s, buf, pos, qiov->size, 1);
qemu_iovec_from_buf(qiov, 0, buf, qiov->size);
qemu_vfree(buf);
return ret;
}

View File

@@ -358,9 +358,7 @@ int bdrv_snapshot_load_tmp_by_id_or_name(BlockDriverState *bs,
ret = bdrv_snapshot_load_tmp(bs, NULL, id_or_name, &local_err);
}
if (local_err) {
error_propagate(errp, local_err);
}
error_propagate(errp, local_err);
return ret;
}
@@ -373,9 +371,10 @@ int bdrv_snapshot_load_tmp_by_id_or_name(BlockDriverState *bs,
bool bdrv_all_can_snapshot(BlockDriverState **first_bad_bs)
{
bool ok = true;
BlockDriverState *bs = NULL;
BlockDriverState *bs;
BdrvNextIterator it;
while (ok && (bs = bdrv_next(bs))) {
for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
AioContext *ctx = bdrv_get_aio_context(bs);
aio_context_acquire(ctx);
@@ -383,8 +382,12 @@ bool bdrv_all_can_snapshot(BlockDriverState **first_bad_bs)
ok = bdrv_can_snapshot(bs);
}
aio_context_release(ctx);
if (!ok) {
goto fail;
}
}
fail:
*first_bad_bs = bs;
return ok;
}
@@ -393,10 +396,11 @@ int bdrv_all_delete_snapshot(const char *name, BlockDriverState **first_bad_bs,
Error **err)
{
int ret = 0;
BlockDriverState *bs = NULL;
BlockDriverState *bs;
BdrvNextIterator it;
QEMUSnapshotInfo sn1, *snapshot = &sn1;
while (ret == 0 && (bs = bdrv_next(bs))) {
for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
AioContext *ctx = bdrv_get_aio_context(bs);
aio_context_acquire(ctx);
@@ -405,8 +409,12 @@ int bdrv_all_delete_snapshot(const char *name, BlockDriverState **first_bad_bs,
ret = bdrv_snapshot_delete_by_id_or_name(bs, name, err);
}
aio_context_release(ctx);
if (ret < 0) {
goto fail;
}
}
fail:
*first_bad_bs = bs;
return ret;
}
@@ -415,9 +423,10 @@ int bdrv_all_delete_snapshot(const char *name, BlockDriverState **first_bad_bs,
int bdrv_all_goto_snapshot(const char *name, BlockDriverState **first_bad_bs)
{
int err = 0;
BlockDriverState *bs = NULL;
BlockDriverState *bs;
BdrvNextIterator it;
while (err == 0 && (bs = bdrv_next(bs))) {
for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
AioContext *ctx = bdrv_get_aio_context(bs);
aio_context_acquire(ctx);
@@ -425,8 +434,12 @@ int bdrv_all_goto_snapshot(const char *name, BlockDriverState **first_bad_bs)
err = bdrv_snapshot_goto(bs, name);
}
aio_context_release(ctx);
if (err < 0) {
goto fail;
}
}
fail:
*first_bad_bs = bs;
return err;
}
@@ -435,9 +448,10 @@ int bdrv_all_find_snapshot(const char *name, BlockDriverState **first_bad_bs)
{
QEMUSnapshotInfo sn;
int err = 0;
BlockDriverState *bs = NULL;
BlockDriverState *bs;
BdrvNextIterator it;
while (err == 0 && (bs = bdrv_next(bs))) {
for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
AioContext *ctx = bdrv_get_aio_context(bs);
aio_context_acquire(ctx);
@@ -445,8 +459,12 @@ int bdrv_all_find_snapshot(const char *name, BlockDriverState **first_bad_bs)
err = bdrv_snapshot_find(bs, &sn, name);
}
aio_context_release(ctx);
if (err < 0) {
goto fail;
}
}
fail:
*first_bad_bs = bs;
return err;
}
@@ -457,9 +475,10 @@ int bdrv_all_create_snapshot(QEMUSnapshotInfo *sn,
BlockDriverState **first_bad_bs)
{
int err = 0;
BlockDriverState *bs = NULL;
BlockDriverState *bs;
BdrvNextIterator it;
while (err == 0 && (bs = bdrv_next(bs))) {
for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
AioContext *ctx = bdrv_get_aio_context(bs);
aio_context_acquire(ctx);
@@ -471,23 +490,32 @@ int bdrv_all_create_snapshot(QEMUSnapshotInfo *sn,
err = bdrv_snapshot_create(bs, sn);
}
aio_context_release(ctx);
if (err < 0) {
goto fail;
}
}
fail:
*first_bad_bs = bs;
return err;
}
BlockDriverState *bdrv_all_find_vmstate_bs(void)
{
bool not_found = true;
BlockDriverState *bs = NULL;
BlockDriverState *bs;
BdrvNextIterator it;
while (not_found && (bs = bdrv_next(bs))) {
for (bs = bdrv_first(&it); bs; bs = bdrv_next(&it)) {
AioContext *ctx = bdrv_get_aio_context(bs);
bool found;
aio_context_acquire(ctx);
not_found = !bdrv_can_snapshot(bs);
found = bdrv_can_snapshot(bs);
aio_context_release(ctx);
if (found) {
break;
}
}
return bs;
}

View File

@@ -39,7 +39,7 @@ typedef struct StreamBlockJob {
char *backing_file_str;
} StreamBlockJob;
static int coroutine_fn stream_populate(BlockDriverState *bs,
static int coroutine_fn stream_populate(BlockBackend *blk,
int64_t sector_num, int nb_sectors,
void *buf)
{
@@ -52,7 +52,8 @@ static int coroutine_fn stream_populate(BlockDriverState *bs,
qemu_iovec_init_external(&qiov, &iov, 1);
/* Copy-on-read the unallocated clusters */
return bdrv_co_copy_on_readv(bs, sector_num, nb_sectors, &qiov);
return blk_co_preadv(blk, sector_num * BDRV_SECTOR_SIZE, qiov.size, &qiov,
BDRV_REQ_COPY_ON_READ);
}
typedef struct {
@@ -64,6 +65,7 @@ static void stream_complete(BlockJob *job, void *opaque)
{
StreamBlockJob *s = container_of(job, StreamBlockJob, common);
StreamCompleteData *data = opaque;
BlockDriverState *bs = blk_bs(job->blk);
BlockDriverState *base = s->base;
if (!block_job_is_cancelled(&s->common) && data->reached_end &&
@@ -75,8 +77,8 @@ static void stream_complete(BlockJob *job, void *opaque)
base_fmt = base->drv->format_name;
}
}
data->ret = bdrv_change_backing_file(job->bs, base_id, base_fmt);
bdrv_set_backing_hd(job->bs, base);
data->ret = bdrv_change_backing_file(bs, base_id, base_fmt);
bdrv_set_backing_hd(bs, base);
}
g_free(s->backing_file_str);
@@ -88,7 +90,8 @@ static void coroutine_fn stream_run(void *opaque)
{
StreamBlockJob *s = opaque;
StreamCompleteData *data;
BlockDriverState *bs = s->common.bs;
BlockBackend *blk = s->common.blk;
BlockDriverState *bs = blk_bs(blk);
BlockDriverState *base = s->base;
int64_t sector_num = 0;
int64_t end = -1;
@@ -159,12 +162,11 @@ wait:
goto wait;
}
}
ret = stream_populate(bs, sector_num, n, buf);
ret = stream_populate(blk, sector_num, n, buf);
}
if (ret < 0) {
BlockErrorAction action =
block_job_error_action(&s->common, s->common.bs, s->on_error,
true, -ret);
block_job_error_action(&s->common, s->on_error, true, -ret);
if (action == BLOCK_ERROR_ACTION_STOP) {
n = 0;
continue;
@@ -224,13 +226,6 @@ void stream_start(BlockDriverState *bs, BlockDriverState *base,
{
StreamBlockJob *s;
if ((on_error == BLOCKDEV_ON_ERROR_STOP ||
on_error == BLOCKDEV_ON_ERROR_ENOSPC) &&
(!bs->blk || !blk_iostatus_is_enabled(bs->blk))) {
error_setg(errp, QERR_INVALID_PARAMETER, "on-error");
return;
}
s = block_job_create(&stream_job_driver, bs, speed, cb, opaque, errp);
if (!s) {
return;

View File

@@ -23,13 +23,14 @@
*/
#include "qemu/osdep.h"
#include "sysemu/block-backend.h"
#include "block/throttle-groups.h"
#include "qemu/queue.h"
#include "qemu/thread.h"
#include "sysemu/qtest.h"
/* The ThrottleGroup structure (with its ThrottleState) is shared
* among different BlockDriverState and it's independent from
* among different BlockBackends and it's independent from
* AioContext, so in order to use it from different threads it needs
* its own locking.
*
@@ -39,26 +40,26 @@
* The whole ThrottleGroup structure is private and invisible to
* outside users, that only use it through its ThrottleState.
*
* In addition to the ThrottleGroup structure, BlockDriverState has
* In addition to the ThrottleGroup structure, BlockBackendPublic has
* fields that need to be accessed by other members of the group and
* therefore also need to be protected by this lock. Once a BDS is
* registered in a group those fields can be accessed by other threads
* any time.
* therefore also need to be protected by this lock. Once a
* BlockBackend is registered in a group those fields can be accessed
* by other threads any time.
*
* Again, all this is handled internally and is mostly transparent to
* the outside. The 'throttle_timers' field however has an additional
* constraint because it may be temporarily invalid (see for example
* bdrv_set_aio_context()). Therefore in this file a thread will
* access some other BDS's timers only after verifying that that BDS
* has throttled requests in the queue.
* access some other BlockBackend's timers only after verifying that
* that BlockBackend has throttled requests in the queue.
*/
typedef struct ThrottleGroup {
char *name; /* This is constant during the lifetime of the group */
QemuMutex lock; /* This lock protects the following four fields */
ThrottleState ts;
QLIST_HEAD(, BlockDriverState) head;
BlockDriverState *tokens[2];
QLIST_HEAD(, BlockBackendPublic) head;
BlockBackend *tokens[2];
bool any_timer_armed[2];
/* These two are protected by the global throttle_groups_lock */
@@ -132,93 +133,98 @@ void throttle_group_unref(ThrottleState *ts)
qemu_mutex_unlock(&throttle_groups_lock);
}
/* Get the name from a BlockDriverState's ThrottleGroup. The name (and
* the pointer) is guaranteed to remain constant during the lifetime
* of the group.
/* Get the name from a BlockBackend's ThrottleGroup. The name (and the pointer)
* is guaranteed to remain constant during the lifetime of the group.
*
* @bs: a BlockDriverState that is member of a throttling group
* @blk: a BlockBackend that is member of a throttling group
* @ret: the name of the group.
*/
const char *throttle_group_get_name(BlockDriverState *bs)
const char *throttle_group_get_name(BlockBackend *blk)
{
ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
BlockBackendPublic *blkp = blk_get_public(blk);
ThrottleGroup *tg = container_of(blkp->throttle_state, ThrottleGroup, ts);
return tg->name;
}
/* Return the next BlockDriverState in the round-robin sequence,
* simulating a circular list.
/* Return the next BlockBackend in the round-robin sequence, simulating a
* circular list.
*
* This assumes that tg->lock is held.
*
* @bs: the current BlockDriverState
* @ret: the next BlockDriverState in the sequence
* @blk: the current BlockBackend
* @ret: the next BlockBackend in the sequence
*/
static BlockDriverState *throttle_group_next_bs(BlockDriverState *bs)
static BlockBackend *throttle_group_next_blk(BlockBackend *blk)
{
ThrottleState *ts = bs->throttle_state;
BlockBackendPublic *blkp = blk_get_public(blk);
ThrottleState *ts = blkp->throttle_state;
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
BlockDriverState *next = QLIST_NEXT(bs, round_robin);
BlockBackendPublic *next = QLIST_NEXT(blkp, round_robin);
if (!next) {
return QLIST_FIRST(&tg->head);
next = QLIST_FIRST(&tg->head);
}
return next;
return blk_by_public(next);
}
/* Return the next BlockDriverState in the round-robin sequence with
* pending I/O requests.
/* Return the next BlockBackend in the round-robin sequence with pending I/O
* requests.
*
* This assumes that tg->lock is held.
*
* @bs: the current BlockDriverState
* @blk: the current BlockBackend
* @is_write: the type of operation (read/write)
* @ret: the next BlockDriverState with pending requests, or bs
* if there is none.
* @ret: the next BlockBackend with pending requests, or blk if there is
* none.
*/
static BlockDriverState *next_throttle_token(BlockDriverState *bs,
bool is_write)
static BlockBackend *next_throttle_token(BlockBackend *blk, bool is_write)
{
ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
BlockDriverState *token, *start;
BlockBackendPublic *blkp = blk_get_public(blk);
ThrottleGroup *tg = container_of(blkp->throttle_state, ThrottleGroup, ts);
BlockBackend *token, *start;
start = token = tg->tokens[is_write];
/* get next bs round in round robin style */
token = throttle_group_next_bs(token);
while (token != start && !token->pending_reqs[is_write]) {
token = throttle_group_next_bs(token);
token = throttle_group_next_blk(token);
while (token != start && !blkp->pending_reqs[is_write]) {
token = throttle_group_next_blk(token);
}
/* If no IO are queued for scheduling on the next round robin token
* then decide the token is the current bs because chances are
* the current bs get the current request queued.
*/
if (token == start && !token->pending_reqs[is_write]) {
token = bs;
if (token == start && !blkp->pending_reqs[is_write]) {
token = blk;
}
return token;
}
/* Check if the next I/O request for a BlockDriverState needs to be
* throttled or not. If there's no timer set in this group, set one
* and update the token accordingly.
/* Check if the next I/O request for a BlockBackend needs to be throttled or
* not. If there's no timer set in this group, set one and update the token
* accordingly.
*
* This assumes that tg->lock is held.
*
* @bs: the current BlockDriverState
* @blk: the current BlockBackend
* @is_write: the type of operation (read/write)
* @ret: whether the I/O request needs to be throttled or not
*/
static bool throttle_group_schedule_timer(BlockDriverState *bs,
bool is_write)
static bool throttle_group_schedule_timer(BlockBackend *blk, bool is_write)
{
ThrottleState *ts = bs->throttle_state;
ThrottleTimers *tt = &bs->throttle_timers;
BlockBackendPublic *blkp = blk_get_public(blk);
ThrottleState *ts = blkp->throttle_state;
ThrottleTimers *tt = &blkp->throttle_timers;
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
bool must_wait;
if (blkp->io_limits_disabled) {
return false;
}
/* Check if any of the timers in this group is already armed */
if (tg->any_timer_armed[is_write]) {
return true;
@@ -226,9 +232,9 @@ static bool throttle_group_schedule_timer(BlockDriverState *bs,
must_wait = throttle_schedule_timer(ts, tt, is_write);
/* If a timer just got armed, set bs as the current token */
/* If a timer just got armed, set blk as the current token */
if (must_wait) {
tg->tokens[is_write] = bs;
tg->tokens[is_write] = blk;
tg->any_timer_armed[is_write] = true;
}
@@ -239,18 +245,19 @@ static bool throttle_group_schedule_timer(BlockDriverState *bs,
*
* This assumes that tg->lock is held.
*
* @bs: the current BlockDriverState
* @blk: the current BlockBackend
* @is_write: the type of operation (read/write)
*/
static void schedule_next_request(BlockDriverState *bs, bool is_write)
static void schedule_next_request(BlockBackend *blk, bool is_write)
{
ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
BlockBackendPublic *blkp = blk_get_public(blk);
ThrottleGroup *tg = container_of(blkp->throttle_state, ThrottleGroup, ts);
bool must_wait;
BlockDriverState *token;
BlockBackend *token;
/* Check if there's any pending request to schedule next */
token = next_throttle_token(bs, is_write);
if (!token->pending_reqs[is_write]) {
token = next_throttle_token(blk, is_write);
if (!blkp->pending_reqs[is_write]) {
return;
}
@@ -259,12 +266,12 @@ static void schedule_next_request(BlockDriverState *bs, bool is_write)
/* If it doesn't have to wait, queue it for immediate execution */
if (!must_wait) {
/* Give preference to requests from the current bs */
/* Give preference to requests from the current blk */
if (qemu_in_coroutine() &&
qemu_co_queue_next(&bs->throttled_reqs[is_write])) {
token = bs;
qemu_co_queue_next(&blkp->throttled_reqs[is_write])) {
token = blk;
} else {
ThrottleTimers *tt = &token->throttle_timers;
ThrottleTimers *tt = &blkp->throttle_timers;
int64_t now = qemu_clock_get_ns(tt->clock_type);
timer_mod(tt->timers[is_write], now + 1);
tg->any_timer_armed[is_write] = true;
@@ -277,53 +284,67 @@ static void schedule_next_request(BlockDriverState *bs, bool is_write)
* if necessary, and schedule the next request using a round robin
* algorithm.
*
* @bs: the current BlockDriverState
* @blk: the current BlockBackend
* @bytes: the number of bytes for this I/O
* @is_write: the type of operation (read/write)
*/
void coroutine_fn throttle_group_co_io_limits_intercept(BlockDriverState *bs,
void coroutine_fn throttle_group_co_io_limits_intercept(BlockBackend *blk,
unsigned int bytes,
bool is_write)
{
bool must_wait;
BlockDriverState *token;
BlockBackend *token;
ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
BlockBackendPublic *blkp = blk_get_public(blk);
ThrottleGroup *tg = container_of(blkp->throttle_state, ThrottleGroup, ts);
qemu_mutex_lock(&tg->lock);
/* First we check if this I/O has to be throttled. */
token = next_throttle_token(bs, is_write);
token = next_throttle_token(blk, is_write);
must_wait = throttle_group_schedule_timer(token, is_write);
/* Wait if there's a timer set or queued requests of this type */
if (must_wait || bs->pending_reqs[is_write]) {
bs->pending_reqs[is_write]++;
if (must_wait || blkp->pending_reqs[is_write]) {
blkp->pending_reqs[is_write]++;
qemu_mutex_unlock(&tg->lock);
qemu_co_queue_wait(&bs->throttled_reqs[is_write]);
qemu_co_queue_wait(&blkp->throttled_reqs[is_write]);
qemu_mutex_lock(&tg->lock);
bs->pending_reqs[is_write]--;
blkp->pending_reqs[is_write]--;
}
/* The I/O will be executed, so do the accounting */
throttle_account(bs->throttle_state, is_write, bytes);
throttle_account(blkp->throttle_state, is_write, bytes);
/* Schedule the next request */
schedule_next_request(bs, is_write);
schedule_next_request(blk, is_write);
qemu_mutex_unlock(&tg->lock);
}
void throttle_group_restart_blk(BlockBackend *blk)
{
BlockBackendPublic *blkp = blk_get_public(blk);
int i;
for (i = 0; i < 2; i++) {
while (qemu_co_enter_next(&blkp->throttled_reqs[i])) {
;
}
}
}
/* Update the throttle configuration for a particular group. Similar
* to throttle_config(), but guarantees atomicity within the
* throttling group.
*
* @bs: a BlockDriverState that is member of the group
* @blk: a BlockBackend that is a member of the group
* @cfg: the configuration to set
*/
void throttle_group_config(BlockDriverState *bs, ThrottleConfig *cfg)
void throttle_group_config(BlockBackend *blk, ThrottleConfig *cfg)
{
ThrottleTimers *tt = &bs->throttle_timers;
ThrottleState *ts = bs->throttle_state;
BlockBackendPublic *blkp = blk_get_public(blk);
ThrottleTimers *tt = &blkp->throttle_timers;
ThrottleState *ts = blkp->throttle_state;
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
qemu_mutex_lock(&tg->lock);
/* throttle_config() cancels the timers */
@@ -335,18 +356,22 @@ void throttle_group_config(BlockDriverState *bs, ThrottleConfig *cfg)
}
throttle_config(ts, tt, cfg);
qemu_mutex_unlock(&tg->lock);
qemu_co_enter_next(&blkp->throttled_reqs[0]);
qemu_co_enter_next(&blkp->throttled_reqs[1]);
}
/* Get the throttle configuration from a particular group. Similar to
* throttle_get_config(), but guarantees atomicity within the
* throttling group.
*
* @bs: a BlockDriverState that is member of the group
* @blk: a BlockBackend that is a member of the group
* @cfg: the configuration will be written here
*/
void throttle_group_get_config(BlockDriverState *bs, ThrottleConfig *cfg)
void throttle_group_get_config(BlockBackend *blk, ThrottleConfig *cfg)
{
ThrottleState *ts = bs->throttle_state;
BlockBackendPublic *blkp = blk_get_public(blk);
ThrottleState *ts = blkp->throttle_state;
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
qemu_mutex_lock(&tg->lock);
throttle_get_config(ts, cfg);
@@ -356,12 +381,13 @@ void throttle_group_get_config(BlockDriverState *bs, ThrottleConfig *cfg)
/* ThrottleTimers callback. This wakes up a request that was waiting
* because it had been throttled.
*
* @bs: the BlockDriverState whose request had been throttled
* @blk: the BlockBackend whose request had been throttled
* @is_write: the type of operation (read/write)
*/
static void timer_cb(BlockDriverState *bs, bool is_write)
static void timer_cb(BlockBackend *blk, bool is_write)
{
ThrottleState *ts = bs->throttle_state;
BlockBackendPublic *blkp = blk_get_public(blk);
ThrottleState *ts = blkp->throttle_state;
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
bool empty_queue;
@@ -371,13 +397,13 @@ static void timer_cb(BlockDriverState *bs, bool is_write)
qemu_mutex_unlock(&tg->lock);
/* Run the request that was waiting for this timer */
empty_queue = !qemu_co_enter_next(&bs->throttled_reqs[is_write]);
empty_queue = !qemu_co_enter_next(&blkp->throttled_reqs[is_write]);
/* If the request queue was empty then we have to take care of
* scheduling the next one */
if (empty_queue) {
qemu_mutex_lock(&tg->lock);
schedule_next_request(bs, is_write);
schedule_next_request(blk, is_write);
qemu_mutex_unlock(&tg->lock);
}
}
@@ -392,17 +418,17 @@ static void write_timer_cb(void *opaque)
timer_cb(opaque, true);
}
/* Register a BlockDriverState in the throttling group, also
* initializing its timers and updating its throttle_state pointer to
* point to it. If a throttling group with that name does not exist
* yet, it will be created.
/* Register a BlockBackend in the throttling group, also initializing its
* timers and updating its throttle_state pointer to point to it. If a
* throttling group with that name does not exist yet, it will be created.
*
* @bs: the BlockDriverState to insert
* @blk: the BlockBackend to insert
* @groupname: the name of the group
*/
void throttle_group_register_bs(BlockDriverState *bs, const char *groupname)
void throttle_group_register_blk(BlockBackend *blk, const char *groupname)
{
int i;
BlockBackendPublic *blkp = blk_get_public(blk);
ThrottleState *ts = throttle_group_incref(groupname);
ThrottleGroup *tg = container_of(ts, ThrottleGroup, ts);
int clock_type = QEMU_CLOCK_REALTIME;
@@ -412,67 +438,67 @@ void throttle_group_register_bs(BlockDriverState *bs, const char *groupname)
clock_type = QEMU_CLOCK_VIRTUAL;
}
bs->throttle_state = ts;
blkp->throttle_state = ts;
qemu_mutex_lock(&tg->lock);
/* If the ThrottleGroup is new set this BlockDriverState as the token */
/* If the ThrottleGroup is new set this BlockBackend as the token */
for (i = 0; i < 2; i++) {
if (!tg->tokens[i]) {
tg->tokens[i] = bs;
tg->tokens[i] = blk;
}
}
QLIST_INSERT_HEAD(&tg->head, bs, round_robin);
QLIST_INSERT_HEAD(&tg->head, blkp, round_robin);
throttle_timers_init(&bs->throttle_timers,
bdrv_get_aio_context(bs),
throttle_timers_init(&blkp->throttle_timers,
blk_get_aio_context(blk),
clock_type,
read_timer_cb,
write_timer_cb,
bs);
blk);
qemu_mutex_unlock(&tg->lock);
}
/* Unregister a BlockDriverState from its group, removing it from the
* list, destroying the timers and setting the throttle_state pointer
* to NULL.
/* Unregister a BlockBackend from its group, removing it from the list,
* destroying the timers and setting the throttle_state pointer to NULL.
*
* The BlockDriverState must not have pending throttled requests, so
* the caller has to drain them first.
* The BlockBackend must not have pending throttled requests, so the caller has
* to drain them first.
*
* The group will be destroyed if it's empty after this operation.
*
* @bs: the BlockDriverState to remove
* @blk: the BlockBackend to remove
*/
void throttle_group_unregister_bs(BlockDriverState *bs)
void throttle_group_unregister_blk(BlockBackend *blk)
{
ThrottleGroup *tg = container_of(bs->throttle_state, ThrottleGroup, ts);
BlockBackendPublic *blkp = blk_get_public(blk);
ThrottleGroup *tg = container_of(blkp->throttle_state, ThrottleGroup, ts);
int i;
assert(bs->pending_reqs[0] == 0 && bs->pending_reqs[1] == 0);
assert(qemu_co_queue_empty(&bs->throttled_reqs[0]));
assert(qemu_co_queue_empty(&bs->throttled_reqs[1]));
assert(blkp->pending_reqs[0] == 0 && blkp->pending_reqs[1] == 0);
assert(qemu_co_queue_empty(&blkp->throttled_reqs[0]));
assert(qemu_co_queue_empty(&blkp->throttled_reqs[1]));
qemu_mutex_lock(&tg->lock);
for (i = 0; i < 2; i++) {
if (tg->tokens[i] == bs) {
BlockDriverState *token = throttle_group_next_bs(bs);
/* Take care of the case where this is the last bs in the group */
if (token == bs) {
if (tg->tokens[i] == blk) {
BlockBackend *token = throttle_group_next_blk(blk);
/* Take care of the case where this is the last blk in the group */
if (token == blk) {
token = NULL;
}
tg->tokens[i] = token;
}
}
/* remove the current bs from the list */
QLIST_REMOVE(bs, round_robin);
throttle_timers_destroy(&bs->throttle_timers);
/* remove the current blk from the list */
QLIST_REMOVE(blkp, round_robin);
throttle_timers_destroy(&blkp->throttle_timers);
qemu_mutex_unlock(&tg->lock);
throttle_group_unref(&tg->ts);
bs->throttle_state = NULL;
blkp->throttle_state = NULL;
}
static void throttle_groups_init(void)

116
block/trace-events Normal file
View File

@@ -0,0 +1,116 @@
# See docs/trace-events.txt for syntax documentation.
# block.c
bdrv_open_common(void *bs, const char *filename, int flags, const char *format_name) "bs %p filename \"%s\" flags %#x format_name \"%s\""
bdrv_lock_medium(void *bs, bool locked) "bs %p locked %d"
# block/block-backend.c
blk_co_preadv(void *blk, void *bs, int64_t offset, unsigned int bytes, int flags) "blk %p bs %p offset %"PRId64" bytes %u flags %x"
blk_co_pwritev(void *blk, void *bs, int64_t offset, unsigned int bytes, int flags) "blk %p bs %p offset %"PRId64" bytes %u flags %x"
# block/io.c
bdrv_aio_discard(void *bs, int64_t sector_num, int nb_sectors, void *opaque) "bs %p sector_num %"PRId64" nb_sectors %d opaque %p"
bdrv_aio_flush(void *bs, void *opaque) "bs %p opaque %p"
bdrv_aio_readv(void *bs, int64_t sector_num, int nb_sectors, void *opaque) "bs %p sector_num %"PRId64" nb_sectors %d opaque %p"
bdrv_aio_writev(void *bs, int64_t sector_num, int nb_sectors, void *opaque) "bs %p sector_num %"PRId64" nb_sectors %d opaque %p"
bdrv_co_readv(void *bs, int64_t sector_num, int nb_sector) "bs %p sector_num %"PRId64" nb_sectors %d"
bdrv_co_writev(void *bs, int64_t sector_num, int nb_sector) "bs %p sector_num %"PRId64" nb_sectors %d"
bdrv_co_pwrite_zeroes(void *bs, int64_t offset, int count, int flags) "bs %p offset %"PRId64" count %d flags %#x"
bdrv_co_do_copy_on_readv(void *bs, int64_t offset, unsigned int bytes, int64_t cluster_offset, unsigned int cluster_bytes) "bs %p offset %"PRId64" bytes %u cluster_offset %"PRId64" cluster_bytes %u"
# block/stream.c
stream_one_iteration(void *s, int64_t sector_num, int nb_sectors, int is_allocated) "s %p sector_num %"PRId64" nb_sectors %d is_allocated %d"
stream_start(void *bs, void *base, void *s, void *co, void *opaque) "bs %p base %p s %p co %p opaque %p"
# block/commit.c
commit_one_iteration(void *s, int64_t sector_num, int nb_sectors, int is_allocated) "s %p sector_num %"PRId64" nb_sectors %d is_allocated %d"
commit_start(void *bs, void *base, void *top, void *s, void *co, void *opaque) "bs %p base %p top %p s %p co %p opaque %p"
# block/mirror.c
mirror_start(void *bs, void *s, void *co, void *opaque) "bs %p s %p co %p opaque %p"
mirror_restart_iter(void *s, int64_t cnt) "s %p dirty count %"PRId64
mirror_before_flush(void *s) "s %p"
mirror_before_drain(void *s, int64_t cnt) "s %p dirty count %"PRId64
mirror_before_sleep(void *s, int64_t cnt, int synced, uint64_t delay_ns) "s %p dirty count %"PRId64" synced %d delay %"PRIu64"ns"
mirror_one_iteration(void *s, int64_t sector_num, int nb_sectors) "s %p sector_num %"PRId64" nb_sectors %d"
mirror_iteration_done(void *s, int64_t sector_num, int nb_sectors, int ret) "s %p sector_num %"PRId64" nb_sectors %d ret %d"
mirror_yield(void *s, int64_t cnt, int buf_free_count, int in_flight) "s %p dirty count %"PRId64" free buffers %d in_flight %d"
mirror_yield_in_flight(void *s, int64_t sector_num, int in_flight) "s %p sector_num %"PRId64" in_flight %d"
mirror_yield_buf_busy(void *s, int nb_chunks, int in_flight) "s %p requested chunks %d in_flight %d"
mirror_break_buf_busy(void *s, int nb_chunks, int in_flight) "s %p requested chunks %d in_flight %d"
# block/backup.c
backup_do_cow_enter(void *job, int64_t start, int64_t sector_num, int nb_sectors) "job %p start %"PRId64" sector_num %"PRId64" nb_sectors %d"
backup_do_cow_return(void *job, int64_t sector_num, int nb_sectors, int ret) "job %p sector_num %"PRId64" nb_sectors %d ret %d"
backup_do_cow_skip(void *job, int64_t start) "job %p start %"PRId64
backup_do_cow_process(void *job, int64_t start) "job %p start %"PRId64
backup_do_cow_read_fail(void *job, int64_t start, int ret) "job %p start %"PRId64" ret %d"
backup_do_cow_write_fail(void *job, int64_t start, int ret) "job %p start %"PRId64" ret %d"
# blockdev.c
qmp_block_job_cancel(void *job) "job %p"
qmp_block_job_pause(void *job) "job %p"
qmp_block_job_resume(void *job) "job %p"
qmp_block_job_complete(void *job) "job %p"
block_job_cb(void *bs, void *job, int ret) "bs %p job %p ret %d"
qmp_block_stream(void *bs, void *job) "bs %p job %p"
# block/raw-win32.c
# block/raw-posix.c
paio_submit_co(int64_t offset, int count, int type) "offset %"PRId64" count %d type %d"
paio_submit(void *acb, void *opaque, int64_t sector_num, int nb_sectors, int type) "acb %p opaque %p sector_num %"PRId64" nb_sectors %d type %d"
# block/qcow2.c
qcow2_writev_start_req(void *co, int64_t offset, int bytes) "co %p offset %" PRIx64 " bytes %d"
qcow2_writev_done_req(void *co, int ret) "co %p ret %d"
qcow2_writev_start_part(void *co) "co %p"
qcow2_writev_done_part(void *co, int cur_bytes) "co %p cur_bytes %d"
qcow2_writev_data(void *co, uint64_t offset) "co %p offset %" PRIx64
qcow2_pwrite_zeroes_start_req(void *co, int64_t offset, int count) "co %p offset %" PRIx64 " count %d"
qcow2_pwrite_zeroes(void *co, int64_t offset, int count) "co %p offset %" PRIx64 " count %d"
# block/qcow2-cluster.c
qcow2_alloc_clusters_offset(void *co, uint64_t offset, int bytes) "co %p offset %" PRIx64 " bytes %d"
qcow2_handle_copied(void *co, uint64_t guest_offset, uint64_t host_offset, uint64_t bytes) "co %p guest_offset %" PRIx64 " host_offset %" PRIx64 " bytes %" PRIx64
qcow2_handle_alloc(void *co, uint64_t guest_offset, uint64_t host_offset, uint64_t bytes) "co %p guest_offset %" PRIx64 " host_offset %" PRIx64 " bytes %" PRIx64
qcow2_do_alloc_clusters_offset(void *co, uint64_t guest_offset, uint64_t host_offset, int nb_clusters) "co %p guest_offset %" PRIx64 " host_offset %" PRIx64 " nb_clusters %d"
qcow2_cluster_alloc_phys(void *co) "co %p"
qcow2_cluster_link_l2(void *co, int nb_clusters) "co %p nb_clusters %d"
qcow2_l2_allocate(void *bs, int l1_index) "bs %p l1_index %d"
qcow2_l2_allocate_get_empty(void *bs, int l1_index) "bs %p l1_index %d"
qcow2_l2_allocate_write_l2(void *bs, int l1_index) "bs %p l1_index %d"
qcow2_l2_allocate_write_l1(void *bs, int l1_index) "bs %p l1_index %d"
qcow2_l2_allocate_done(void *bs, int l1_index, int ret) "bs %p l1_index %d ret %d"
# block/qcow2-cache.c
qcow2_cache_get(void *co, int c, uint64_t offset, bool read_from_disk) "co %p is_l2_cache %d offset %" PRIx64 " read_from_disk %d"
qcow2_cache_get_replace_entry(void *co, int c, int i) "co %p is_l2_cache %d index %d"
qcow2_cache_get_read(void *co, int c, int i) "co %p is_l2_cache %d index %d"
qcow2_cache_get_done(void *co, int c, int i) "co %p is_l2_cache %d index %d"
qcow2_cache_flush(void *co, int c) "co %p is_l2_cache %d"
qcow2_cache_entry_flush(void *co, int c, int i) "co %p is_l2_cache %d index %d"
# block/qed-l2-cache.c
qed_alloc_l2_cache_entry(void *l2_cache, void *entry) "l2_cache %p entry %p"
qed_unref_l2_cache_entry(void *entry, int ref) "entry %p ref %d"
qed_find_l2_cache_entry(void *l2_cache, void *entry, uint64_t offset, int ref) "l2_cache %p entry %p offset %"PRIu64" ref %d"
# block/qed-table.c
qed_read_table(void *s, uint64_t offset, void *table) "s %p offset %"PRIu64" table %p"
qed_read_table_cb(void *s, void *table, int ret) "s %p table %p ret %d"
qed_write_table(void *s, uint64_t offset, void *table, unsigned int index, unsigned int n) "s %p offset %"PRIu64" table %p index %u n %u"
qed_write_table_cb(void *s, void *table, int flush, int ret) "s %p table %p flush %d ret %d"
# block/qed.c
qed_need_check_timer_cb(void *s) "s %p"
qed_start_need_check_timer(void *s) "s %p"
qed_cancel_need_check_timer(void *s) "s %p"
qed_aio_complete(void *s, void *acb, int ret) "s %p acb %p ret %d"
qed_aio_setup(void *s, void *acb, int64_t sector_num, int nb_sectors, void *opaque, int flags) "s %p acb %p sector_num %"PRId64" nb_sectors %d opaque %p flags %#x"
qed_aio_next_io(void *s, void *acb, int ret, uint64_t cur_pos) "s %p acb %p ret %d cur_pos %"PRIu64
qed_aio_read_data(void *s, void *acb, int ret, uint64_t offset, size_t len) "s %p acb %p ret %d offset %"PRIu64" len %zu"
qed_aio_write_data(void *s, void *acb, int ret, uint64_t offset, size_t len) "s %p acb %p ret %d offset %"PRIu64" len %zu"
qed_aio_write_prefill(void *s, void *acb, uint64_t start, size_t len, uint64_t offset) "s %p acb %p start %"PRIu64" len %zu offset %"PRIu64
qed_aio_write_postfill(void *s, void *acb, uint64_t start, size_t len, uint64_t offset) "s %p acb %p start %"PRIu64" len %zu offset %"PRIu64
qed_aio_write_main(void *s, void *acb, int ret, uint64_t offset, size_t len) "s %p acb %p ret %d offset %"PRIu64" len %zu"

View File

@@ -54,6 +54,7 @@
#include "block/block_int.h"
#include "sysemu/block-backend.h"
#include "qemu/module.h"
#include "qemu/bswap.h"
#include "migration/migration.h"
#include "qemu/coroutine.h"
#include "qemu/cutils.h"
@@ -557,98 +558,109 @@ static int64_t coroutine_fn vdi_co_get_block_status(BlockDriverState *bs,
return BDRV_BLOCK_DATA | BDRV_BLOCK_OFFSET_VALID | offset;
}
static int vdi_co_read(BlockDriverState *bs,
int64_t sector_num, uint8_t *buf, int nb_sectors)
static int coroutine_fn
vdi_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
{
BDRVVdiState *s = bs->opaque;
QEMUIOVector local_qiov;
uint32_t bmap_entry;
uint32_t block_index;
uint32_t sector_in_block;
uint32_t n_sectors;
uint32_t offset_in_block;
uint32_t n_bytes;
uint64_t bytes_done = 0;
int ret = 0;
logout("\n");
while (ret >= 0 && nb_sectors > 0) {
block_index = sector_num / s->block_sectors;
sector_in_block = sector_num % s->block_sectors;
n_sectors = s->block_sectors - sector_in_block;
if (n_sectors > nb_sectors) {
n_sectors = nb_sectors;
}
qemu_iovec_init(&local_qiov, qiov->niov);
logout("will read %u sectors starting at sector %" PRIu64 "\n",
n_sectors, sector_num);
while (ret >= 0 && bytes > 0) {
block_index = offset / s->block_size;
offset_in_block = offset % s->block_size;
n_bytes = MIN(bytes, s->block_size - offset_in_block);
logout("will read %u bytes starting at offset %" PRIu64 "\n",
n_bytes, offset);
/* prepare next AIO request */
bmap_entry = le32_to_cpu(s->bmap[block_index]);
if (!VDI_IS_ALLOCATED(bmap_entry)) {
/* Block not allocated, return zeros, no need to wait. */
memset(buf, 0, n_sectors * SECTOR_SIZE);
qemu_iovec_memset(qiov, bytes_done, 0, n_bytes);
ret = 0;
} else {
uint64_t offset = s->header.offset_data / SECTOR_SIZE +
(uint64_t)bmap_entry * s->block_sectors +
sector_in_block;
ret = bdrv_read(bs->file->bs, offset, buf, n_sectors);
}
logout("%u sectors read\n", n_sectors);
uint64_t data_offset = s->header.offset_data +
(uint64_t)bmap_entry * s->block_size +
offset_in_block;
nb_sectors -= n_sectors;
sector_num += n_sectors;
buf += n_sectors * SECTOR_SIZE;
qemu_iovec_reset(&local_qiov);
qemu_iovec_concat(&local_qiov, qiov, bytes_done, n_bytes);
ret = bdrv_co_preadv(bs->file->bs, data_offset, n_bytes,
&local_qiov, 0);
}
logout("%u bytes read\n", n_bytes);
bytes -= n_bytes;
offset += n_bytes;
bytes_done += n_bytes;
}
qemu_iovec_destroy(&local_qiov);
return ret;
}
static int vdi_co_write(BlockDriverState *bs,
int64_t sector_num, const uint8_t *buf, int nb_sectors)
static int coroutine_fn
vdi_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
{
BDRVVdiState *s = bs->opaque;
QEMUIOVector local_qiov;
uint32_t bmap_entry;
uint32_t block_index;
uint32_t sector_in_block;
uint32_t n_sectors;
uint32_t offset_in_block;
uint32_t n_bytes;
uint32_t bmap_first = VDI_UNALLOCATED;
uint32_t bmap_last = VDI_UNALLOCATED;
uint8_t *block = NULL;
uint64_t bytes_done = 0;
int ret = 0;
logout("\n");
while (ret >= 0 && nb_sectors > 0) {
block_index = sector_num / s->block_sectors;
sector_in_block = sector_num % s->block_sectors;
n_sectors = s->block_sectors - sector_in_block;
if (n_sectors > nb_sectors) {
n_sectors = nb_sectors;
}
qemu_iovec_init(&local_qiov, qiov->niov);
logout("will write %u sectors starting at sector %" PRIu64 "\n",
n_sectors, sector_num);
while (ret >= 0 && bytes > 0) {
block_index = offset / s->block_size;
offset_in_block = offset % s->block_size;
n_bytes = MIN(bytes, s->block_size - offset_in_block);
logout("will write %u bytes starting at offset %" PRIu64 "\n",
n_bytes, offset);
/* prepare next AIO request */
bmap_entry = le32_to_cpu(s->bmap[block_index]);
if (!VDI_IS_ALLOCATED(bmap_entry)) {
/* Allocate new block and write to it. */
uint64_t offset;
uint64_t data_offset;
bmap_entry = s->header.blocks_allocated;
s->bmap[block_index] = cpu_to_le32(bmap_entry);
s->header.blocks_allocated++;
offset = s->header.offset_data / SECTOR_SIZE +
(uint64_t)bmap_entry * s->block_sectors;
data_offset = s->header.offset_data +
(uint64_t)bmap_entry * s->block_size;
if (block == NULL) {
block = g_malloc(s->block_size);
bmap_first = block_index;
}
bmap_last = block_index;
/* Copy data to be written to new block and zero unused parts. */
memset(block, 0, sector_in_block * SECTOR_SIZE);
memcpy(block + sector_in_block * SECTOR_SIZE,
buf, n_sectors * SECTOR_SIZE);
memset(block + (sector_in_block + n_sectors) * SECTOR_SIZE, 0,
(s->block_sectors - n_sectors - sector_in_block) * SECTOR_SIZE);
memset(block, 0, offset_in_block);
qemu_iovec_to_buf(qiov, bytes_done, block + offset_in_block,
n_bytes);
memset(block + offset_in_block + n_bytes, 0,
s->block_size - n_bytes - offset_in_block);
/* Note that this coroutine does not yield anywhere from reading the
* bmap entry until here, so in regards to all the coroutines trying
@@ -658,12 +670,12 @@ static int vdi_co_write(BlockDriverState *bs,
* acquire the lock and thus the padded cluster is written before
* the other coroutines can write to the affected area. */
qemu_co_mutex_lock(&s->write_lock);
ret = bdrv_write(bs->file->bs, offset, block, s->block_sectors);
ret = bdrv_pwrite(bs->file->bs, data_offset, block, s->block_size);
qemu_co_mutex_unlock(&s->write_lock);
} else {
uint64_t offset = s->header.offset_data / SECTOR_SIZE +
(uint64_t)bmap_entry * s->block_sectors +
sector_in_block;
uint64_t data_offset = s->header.offset_data +
(uint64_t)bmap_entry * s->block_size +
offset_in_block;
qemu_co_mutex_lock(&s->write_lock);
/* This lock is only used to make sure the following write operation
* is executed after the write issued by the coroutine allocating
@@ -674,16 +686,23 @@ static int vdi_co_write(BlockDriverState *bs,
* that that write operation has returned (there may be other writes
* in flight, but they do not concern this very operation). */
qemu_co_mutex_unlock(&s->write_lock);
ret = bdrv_write(bs->file->bs, offset, buf, n_sectors);
qemu_iovec_reset(&local_qiov);
qemu_iovec_concat(&local_qiov, qiov, bytes_done, n_bytes);
ret = bdrv_co_pwritev(bs->file->bs, data_offset, n_bytes,
&local_qiov, 0);
}
nb_sectors -= n_sectors;
sector_num += n_sectors;
buf += n_sectors * SECTOR_SIZE;
bytes -= n_bytes;
offset += n_bytes;
bytes_done += n_bytes;
logout("%u sectors written\n", n_sectors);
logout("%u bytes written\n", n_bytes);
}
qemu_iovec_destroy(&local_qiov);
logout("finished data write\n");
if (ret < 0) {
return ret;
@@ -694,6 +713,7 @@ static int vdi_co_write(BlockDriverState *bs,
VdiHeader *header = (VdiHeader *) block;
uint8_t *base;
uint64_t offset;
uint32_t n_sectors;
logout("now writing modified header\n");
assert(VDI_IS_ALLOCATED(bmap_first));
@@ -808,7 +828,7 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
vdi_header_print(&header);
#endif
vdi_header_to_le(&header);
ret = blk_pwrite(blk, offset, &header, sizeof(header));
ret = blk_pwrite(blk, offset, &header, sizeof(header), 0);
if (ret < 0) {
error_setg(errp, "Error writing header to %s", filename);
goto exit;
@@ -829,7 +849,7 @@ static int vdi_create(const char *filename, QemuOpts *opts, Error **errp)
bmap[i] = VDI_UNALLOCATED;
}
}
ret = blk_pwrite(blk, offset, bmap, bmap_size);
ret = blk_pwrite(blk, offset, bmap, bmap_size, 0);
if (ret < 0) {
error_setg(errp, "Error writing bmap to %s", filename);
goto exit;
@@ -903,9 +923,9 @@ static BlockDriver bdrv_vdi = {
.bdrv_co_get_block_status = vdi_co_get_block_status,
.bdrv_make_empty = vdi_make_empty,
.bdrv_read = vdi_co_read,
.bdrv_co_preadv = vdi_co_preadv,
#if defined(CONFIG_VDI_WRITE)
.bdrv_write = vdi_co_write,
.bdrv_co_pwritev = vdi_co_pwritev,
#endif
.bdrv_get_info = vdi_get_info,

View File

@@ -18,6 +18,7 @@
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "block/block_int.h"
#include "qemu/bswap.h"
#include "block/vhdx.h"
#include <uuid/uuid.h>

View File

@@ -23,6 +23,7 @@
#include "block/block_int.h"
#include "qemu/error-report.h"
#include "qemu/module.h"
#include "qemu/bswap.h"
#include "block/vhdx.h"

View File

@@ -22,11 +22,11 @@
#include "sysemu/block-backend.h"
#include "qemu/module.h"
#include "qemu/crc32c.h"
#include "qemu/bswap.h"
#include "block/vhdx.h"
#include "migration/migration.h"
#include <uuid/uuid.h>
#include <glib.h>
/* Options for VHDX creation */
@@ -1856,13 +1856,14 @@ static int vhdx_create(const char *filename, QemuOpts *opts, Error **errp)
creator = g_utf8_to_utf16("QEMU v" QEMU_VERSION, -1, NULL,
&creator_items, NULL);
signature = cpu_to_le64(VHDX_FILE_SIGNATURE);
ret = blk_pwrite(blk, VHDX_FILE_ID_OFFSET, &signature, sizeof(signature));
ret = blk_pwrite(blk, VHDX_FILE_ID_OFFSET, &signature, sizeof(signature),
0);
if (ret < 0) {
goto delete_and_exit;
}
if (creator) {
ret = blk_pwrite(blk, VHDX_FILE_ID_OFFSET + sizeof(signature),
creator, creator_items * sizeof(gunichar2));
creator, creator_items * sizeof(gunichar2), 0);
if (ret < 0) {
goto delete_and_exit;
}

View File

@@ -30,10 +30,10 @@
#include "qapi/qmp/qerror.h"
#include "qemu/error-report.h"
#include "qemu/module.h"
#include "qemu/bswap.h"
#include "migration/migration.h"
#include "qemu/cutils.h"
#include <zlib.h>
#include <glib.h>
#define VMDK3_MAGIC (('C' << 24) | ('O' << 16) | ('W' << 8) | 'D')
#define VMDK4_MAGIC (('K' << 24) | ('D' << 16) | ('M' << 8) | 'V')
@@ -997,9 +997,9 @@ static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)
for (i = 0; i < s->num_extents; i++) {
if (!s->extents[i].flat) {
bs->bl.write_zeroes_alignment =
MAX(bs->bl.write_zeroes_alignment,
s->extents[i].cluster_sectors);
bs->bl.pwrite_zeroes_alignment =
MAX(bs->bl.pwrite_zeroes_alignment,
s->extents[i].cluster_sectors << BDRV_SECTOR_BITS);
}
}
}
@@ -1016,27 +1016,26 @@ static void vmdk_refresh_limits(BlockDriverState *bs, Error **errp)
*/
static int get_whole_cluster(BlockDriverState *bs,
VmdkExtent *extent,
uint64_t cluster_sector_num,
uint64_t sector_num,
uint64_t skip_start_sector,
uint64_t skip_end_sector)
uint64_t cluster_offset,
uint64_t offset,
uint64_t skip_start_bytes,
uint64_t skip_end_bytes)
{
int ret = VMDK_OK;
int64_t cluster_bytes;
uint8_t *whole_grain;
/* For COW, align request sector_num to cluster start */
sector_num = QEMU_ALIGN_DOWN(sector_num, extent->cluster_sectors);
cluster_bytes = extent->cluster_sectors << BDRV_SECTOR_BITS;
offset = QEMU_ALIGN_DOWN(offset, cluster_bytes);
whole_grain = qemu_blockalign(bs, cluster_bytes);
if (!bs->backing) {
memset(whole_grain, 0, skip_start_sector << BDRV_SECTOR_BITS);
memset(whole_grain + (skip_end_sector << BDRV_SECTOR_BITS), 0,
cluster_bytes - (skip_end_sector << BDRV_SECTOR_BITS));
memset(whole_grain, 0, skip_start_bytes);
memset(whole_grain + skip_end_bytes, 0, cluster_bytes - skip_end_bytes);
}
assert(skip_end_sector <= extent->cluster_sectors);
assert(skip_end_bytes <= cluster_bytes);
/* we will be here if it's first write on non-exist grain(cluster).
* try to read from parent image, if exist */
if (bs->backing && !vmdk_is_cid_valid(bs)) {
@@ -1045,42 +1044,43 @@ static int get_whole_cluster(BlockDriverState *bs,
}
/* Read backing data before skip range */
if (skip_start_sector > 0) {
if (skip_start_bytes > 0) {
if (bs->backing) {
ret = bdrv_read(bs->backing->bs, sector_num,
whole_grain, skip_start_sector);
ret = bdrv_pread(bs->backing->bs, offset, whole_grain,
skip_start_bytes);
if (ret < 0) {
ret = VMDK_ERROR;
goto exit;
}
}
ret = bdrv_write(extent->file->bs, cluster_sector_num, whole_grain,
skip_start_sector);
ret = bdrv_pwrite(extent->file->bs, cluster_offset, whole_grain,
skip_start_bytes);
if (ret < 0) {
ret = VMDK_ERROR;
goto exit;
}
}
/* Read backing data after skip range */
if (skip_end_sector < extent->cluster_sectors) {
if (skip_end_bytes < cluster_bytes) {
if (bs->backing) {
ret = bdrv_read(bs->backing->bs, sector_num + skip_end_sector,
whole_grain + (skip_end_sector << BDRV_SECTOR_BITS),
extent->cluster_sectors - skip_end_sector);
ret = bdrv_pread(bs->backing->bs, offset + skip_end_bytes,
whole_grain + skip_end_bytes,
cluster_bytes - skip_end_bytes);
if (ret < 0) {
ret = VMDK_ERROR;
goto exit;
}
}
ret = bdrv_write(extent->file->bs, cluster_sector_num + skip_end_sector,
whole_grain + (skip_end_sector << BDRV_SECTOR_BITS),
extent->cluster_sectors - skip_end_sector);
ret = bdrv_pwrite(extent->file->bs, cluster_offset + skip_end_bytes,
whole_grain + skip_end_bytes,
cluster_bytes - skip_end_bytes);
if (ret < 0) {
ret = VMDK_ERROR;
goto exit;
}
}
ret = VMDK_OK;
exit:
qemu_vfree(whole_grain);
return ret;
@@ -1142,8 +1142,8 @@ static int get_cluster_offset(BlockDriverState *bs,
uint64_t offset,
bool allocate,
uint64_t *cluster_offset,
uint64_t skip_start_sector,
uint64_t skip_end_sector)
uint64_t skip_start_bytes,
uint64_t skip_end_bytes)
{
unsigned int l1_index, l2_offset, l2_index;
int min_index, i, j;
@@ -1230,10 +1230,8 @@ static int get_cluster_offset(BlockDriverState *bs,
* This problem may occur because of insufficient space on host disk
* or inappropriate VM shutdown.
*/
ret = get_whole_cluster(bs, extent,
cluster_sector,
offset >> BDRV_SECTOR_BITS,
skip_start_sector, skip_end_sector);
ret = get_whole_cluster(bs, extent, cluster_sector * BDRV_SECTOR_SIZE,
offset, skip_start_bytes, skip_end_bytes);
if (ret) {
return ret;
}
@@ -1259,15 +1257,24 @@ static VmdkExtent *find_extent(BDRVVmdkState *s,
return NULL;
}
static inline uint64_t vmdk_find_offset_in_cluster(VmdkExtent *extent,
int64_t offset)
{
uint64_t extent_begin_offset, extent_relative_offset;
uint64_t cluster_size = extent->cluster_sectors * BDRV_SECTOR_SIZE;
extent_begin_offset =
(extent->end_sector - extent->sectors) * BDRV_SECTOR_SIZE;
extent_relative_offset = offset - extent_begin_offset;
return extent_relative_offset % cluster_size;
}
static inline uint64_t vmdk_find_index_in_cluster(VmdkExtent *extent,
int64_t sector_num)
{
uint64_t index_in_cluster, extent_begin_sector, extent_relative_sector_num;
extent_begin_sector = extent->end_sector - extent->sectors;
extent_relative_sector_num = sector_num - extent_begin_sector;
index_in_cluster = extent_relative_sector_num % extent->cluster_sectors;
return index_in_cluster;
uint64_t offset;
offset = vmdk_find_offset_in_cluster(extent, sector_num * BDRV_SECTOR_SIZE);
return offset / BDRV_SECTOR_SIZE;
}
static int64_t coroutine_fn vmdk_co_get_block_status(BlockDriverState *bs,
@@ -1319,38 +1326,57 @@ static int64_t coroutine_fn vmdk_co_get_block_status(BlockDriverState *bs,
}
static int vmdk_write_extent(VmdkExtent *extent, int64_t cluster_offset,
int64_t offset_in_cluster, const uint8_t *buf,
int nb_sectors, int64_t sector_num)
int64_t offset_in_cluster, QEMUIOVector *qiov,
uint64_t qiov_offset, uint64_t n_bytes,
uint64_t offset)
{
int ret;
VmdkGrainMarker *data = NULL;
uLongf buf_len;
const uint8_t *write_buf = buf;
int write_len = nb_sectors * 512;
QEMUIOVector local_qiov;
struct iovec iov;
int64_t write_offset;
int64_t write_end_sector;
if (extent->compressed) {
void *compressed_data;
if (!extent->has_marker) {
ret = -EINVAL;
goto out;
}
buf_len = (extent->cluster_sectors << 9) * 2;
data = g_malloc(buf_len + sizeof(VmdkGrainMarker));
if (compress(data->data, &buf_len, buf, nb_sectors << 9) != Z_OK ||
buf_len == 0) {
compressed_data = g_malloc(n_bytes);
qemu_iovec_to_buf(qiov, qiov_offset, compressed_data, n_bytes);
ret = compress(data->data, &buf_len, compressed_data, n_bytes);
g_free(compressed_data);
if (ret != Z_OK || buf_len == 0) {
ret = -EINVAL;
goto out;
}
data->lba = sector_num;
data->size = buf_len;
write_buf = (uint8_t *)data;
write_len = buf_len + sizeof(VmdkGrainMarker);
}
write_offset = cluster_offset + offset_in_cluster,
ret = bdrv_pwrite(extent->file->bs, write_offset, write_buf, write_len);
write_end_sector = DIV_ROUND_UP(write_offset + write_len, BDRV_SECTOR_SIZE);
data->lba = offset >> BDRV_SECTOR_BITS;
data->size = buf_len;
n_bytes = buf_len + sizeof(VmdkGrainMarker);
iov = (struct iovec) {
.iov_base = data,
.iov_len = n_bytes,
};
qemu_iovec_init_external(&local_qiov, &iov, 1);
} else {
qemu_iovec_init(&local_qiov, qiov->niov);
qemu_iovec_concat(&local_qiov, qiov, qiov_offset, n_bytes);
}
write_offset = cluster_offset + offset_in_cluster,
ret = bdrv_co_pwritev(extent->file->bs, write_offset, n_bytes,
&local_qiov, 0);
write_end_sector = DIV_ROUND_UP(write_offset + n_bytes, BDRV_SECTOR_SIZE);
if (extent->compressed) {
extent->next_cluster_sector = write_end_sector;
@@ -1359,19 +1385,21 @@ static int vmdk_write_extent(VmdkExtent *extent, int64_t cluster_offset,
write_end_sector);
}
if (ret != write_len) {
ret = ret < 0 ? ret : -EIO;
if (ret < 0) {
goto out;
}
ret = 0;
out:
g_free(data);
if (!extent->compressed) {
qemu_iovec_destroy(&local_qiov);
}
return ret;
}
static int vmdk_read_extent(VmdkExtent *extent, int64_t cluster_offset,
int64_t offset_in_cluster, uint8_t *buf,
int nb_sectors)
int64_t offset_in_cluster, QEMUIOVector *qiov,
int bytes)
{
int ret;
int cluster_bytes, buf_bytes;
@@ -1383,14 +1411,13 @@ static int vmdk_read_extent(VmdkExtent *extent, int64_t cluster_offset,
if (!extent->compressed) {
ret = bdrv_pread(extent->file->bs,
cluster_offset + offset_in_cluster,
buf, nb_sectors * 512);
if (ret == nb_sectors * 512) {
return 0;
} else {
return -EIO;
ret = bdrv_co_preadv(extent->file->bs,
cluster_offset + offset_in_cluster, bytes,
qiov, 0);
if (ret < 0) {
return ret;
}
return 0;
}
cluster_bytes = extent->cluster_sectors * 512;
/* Read two clusters in case GrainMarker + compressed data > one cluster */
@@ -1422,11 +1449,11 @@ static int vmdk_read_extent(VmdkExtent *extent, int64_t cluster_offset,
}
if (offset_in_cluster < 0 ||
offset_in_cluster + nb_sectors * 512 > buf_len) {
offset_in_cluster + bytes > buf_len) {
ret = -EINVAL;
goto out;
}
memcpy(buf, uncomp_buf + offset_in_cluster, nb_sectors * 512);
qemu_iovec_from_buf(qiov, 0, uncomp_buf + offset_in_cluster, bytes);
ret = 0;
out:
@@ -1435,64 +1462,73 @@ static int vmdk_read_extent(VmdkExtent *extent, int64_t cluster_offset,
return ret;
}
static int vmdk_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
static int coroutine_fn
vmdk_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
{
BDRVVmdkState *s = bs->opaque;
int ret;
uint64_t n, index_in_cluster;
uint64_t n_bytes, offset_in_cluster;
VmdkExtent *extent = NULL;
QEMUIOVector local_qiov;
uint64_t cluster_offset;
uint64_t bytes_done = 0;
while (nb_sectors > 0) {
extent = find_extent(s, sector_num, extent);
qemu_iovec_init(&local_qiov, qiov->niov);
qemu_co_mutex_lock(&s->lock);
while (bytes > 0) {
extent = find_extent(s, offset >> BDRV_SECTOR_BITS, extent);
if (!extent) {
return -EIO;
ret = -EIO;
goto fail;
}
ret = get_cluster_offset(bs, extent, NULL,
sector_num << 9, false, &cluster_offset,
0, 0);
index_in_cluster = vmdk_find_index_in_cluster(extent, sector_num);
n = extent->cluster_sectors - index_in_cluster;
if (n > nb_sectors) {
n = nb_sectors;
}
offset, false, &cluster_offset, 0, 0);
offset_in_cluster = vmdk_find_offset_in_cluster(extent, offset);
n_bytes = MIN(bytes, extent->cluster_sectors * BDRV_SECTOR_SIZE
- offset_in_cluster);
if (ret != VMDK_OK) {
/* if not allocated, try to read from parent image, if exist */
if (bs->backing && ret != VMDK_ZEROED) {
if (!vmdk_is_cid_valid(bs)) {
return -EINVAL;
ret = -EINVAL;
goto fail;
}
ret = bdrv_read(bs->backing->bs, sector_num, buf, n);
qemu_iovec_reset(&local_qiov);
qemu_iovec_concat(&local_qiov, qiov, bytes_done, n_bytes);
ret = bdrv_co_preadv(bs->backing->bs, offset, n_bytes,
&local_qiov, 0);
if (ret < 0) {
return ret;
goto fail;
}
} else {
memset(buf, 0, 512 * n);
qemu_iovec_memset(qiov, bytes_done, 0, n_bytes);
}
} else {
ret = vmdk_read_extent(extent,
cluster_offset, index_in_cluster * 512,
buf, n);
qemu_iovec_reset(&local_qiov);
qemu_iovec_concat(&local_qiov, qiov, bytes_done, n_bytes);
ret = vmdk_read_extent(extent, cluster_offset, offset_in_cluster,
&local_qiov, n_bytes);
if (ret) {
return ret;
goto fail;
}
}
nb_sectors -= n;
sector_num += n;
buf += n * 512;
bytes -= n_bytes;
offset += n_bytes;
bytes_done += n_bytes;
}
return 0;
}
static coroutine_fn int vmdk_co_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
int ret;
BDRVVmdkState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = vmdk_read(bs, sector_num, buf, nb_sectors);
ret = 0;
fail:
qemu_co_mutex_unlock(&s->lock);
qemu_iovec_destroy(&local_qiov);
return ret;
}
@@ -1506,38 +1542,38 @@ static coroutine_fn int vmdk_co_read(BlockDriverState *bs, int64_t sector_num,
*
* Returns: error code with 0 for success.
*/
static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors,
bool zeroed, bool zero_dry_run)
static int vmdk_pwritev(BlockDriverState *bs, uint64_t offset,
uint64_t bytes, QEMUIOVector *qiov,
bool zeroed, bool zero_dry_run)
{
BDRVVmdkState *s = bs->opaque;
VmdkExtent *extent = NULL;
int ret;
int64_t index_in_cluster, n;
int64_t offset_in_cluster, n_bytes;
uint64_t cluster_offset;
uint64_t bytes_done = 0;
VmdkMetaData m_data;
if (sector_num > bs->total_sectors) {
error_report("Wrong offset: sector_num=0x%" PRIx64
if (DIV_ROUND_UP(offset, BDRV_SECTOR_SIZE) > bs->total_sectors) {
error_report("Wrong offset: offset=0x%" PRIx64
" total_sectors=0x%" PRIx64,
sector_num, bs->total_sectors);
offset, bs->total_sectors);
return -EIO;
}
while (nb_sectors > 0) {
extent = find_extent(s, sector_num, extent);
while (bytes > 0) {
extent = find_extent(s, offset >> BDRV_SECTOR_BITS, extent);
if (!extent) {
return -EIO;
}
index_in_cluster = vmdk_find_index_in_cluster(extent, sector_num);
n = extent->cluster_sectors - index_in_cluster;
if (n > nb_sectors) {
n = nb_sectors;
}
ret = get_cluster_offset(bs, extent, &m_data, sector_num << 9,
offset_in_cluster = vmdk_find_offset_in_cluster(extent, offset);
n_bytes = MIN(bytes, extent->cluster_sectors * BDRV_SECTOR_SIZE
- offset_in_cluster);
ret = get_cluster_offset(bs, extent, &m_data, offset,
!(extent->compressed || zeroed),
&cluster_offset,
index_in_cluster, index_in_cluster + n);
&cluster_offset, offset_in_cluster,
offset_in_cluster + n_bytes);
if (extent->compressed) {
if (ret == VMDK_OK) {
/* Refuse write to allocated cluster for streamOptimized */
@@ -1546,7 +1582,7 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
return -EIO;
} else {
/* allocate */
ret = get_cluster_offset(bs, extent, &m_data, sector_num << 9,
ret = get_cluster_offset(bs, extent, &m_data, offset,
true, &cluster_offset, 0, 0);
}
}
@@ -1556,9 +1592,9 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
if (zeroed) {
/* Do zeroed write, buf is ignored */
if (extent->has_zero_grain &&
index_in_cluster == 0 &&
n >= extent->cluster_sectors) {
n = extent->cluster_sectors;
offset_in_cluster == 0 &&
n_bytes >= extent->cluster_sectors * BDRV_SECTOR_SIZE) {
n_bytes = extent->cluster_sectors * BDRV_SECTOR_SIZE;
if (!zero_dry_run) {
/* update L2 tables */
if (vmdk_L2update(extent, &m_data, VMDK_GTE_ZEROED)
@@ -1570,9 +1606,8 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
return -ENOTSUP;
}
} else {
ret = vmdk_write_extent(extent,
cluster_offset, index_in_cluster * 512,
buf, n, sector_num);
ret = vmdk_write_extent(extent, cluster_offset, offset_in_cluster,
qiov, bytes_done, n_bytes, offset);
if (ret) {
return ret;
}
@@ -1585,9 +1620,9 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
}
}
}
nb_sectors -= n;
sector_num += n;
buf += n * 512;
bytes -= n_bytes;
offset += n_bytes;
bytes_done += n_bytes;
/* update CID on the first write every time the virtual disk is
* opened */
@@ -1602,43 +1637,84 @@ static int vmdk_write(BlockDriverState *bs, int64_t sector_num,
return 0;
}
static coroutine_fn int vmdk_co_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
static int coroutine_fn
vmdk_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
{
int ret;
BDRVVmdkState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = vmdk_write(bs, sector_num, buf, nb_sectors, false, false);
ret = vmdk_pwritev(bs, offset, bytes, qiov, false, false);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
typedef struct VmdkWriteCompressedCo {
BlockDriverState *bs;
int64_t sector_num;
const uint8_t *buf;
int nb_sectors;
int ret;
} VmdkWriteCompressedCo;
static void vmdk_co_write_compressed(void *opaque)
{
VmdkWriteCompressedCo *co = opaque;
QEMUIOVector local_qiov;
uint64_t offset = co->sector_num * BDRV_SECTOR_SIZE;
uint64_t bytes = co->nb_sectors * BDRV_SECTOR_SIZE;
struct iovec iov = (struct iovec) {
.iov_base = (uint8_t*) co->buf,
.iov_len = bytes,
};
qemu_iovec_init_external(&local_qiov, &iov, 1);
co->ret = vmdk_pwritev(co->bs, offset, bytes, &local_qiov, false, false);
}
static int vmdk_write_compressed(BlockDriverState *bs,
int64_t sector_num,
const uint8_t *buf,
int nb_sectors)
{
BDRVVmdkState *s = bs->opaque;
if (s->num_extents == 1 && s->extents[0].compressed) {
return vmdk_write(bs, sector_num, buf, nb_sectors, false, false);
Coroutine *co;
AioContext *aio_context = bdrv_get_aio_context(bs);
VmdkWriteCompressedCo data = {
.bs = bs,
.sector_num = sector_num,
.buf = buf,
.nb_sectors = nb_sectors,
.ret = -EINPROGRESS,
};
co = qemu_coroutine_create(vmdk_co_write_compressed);
qemu_coroutine_enter(co, &data);
while (data.ret == -EINPROGRESS) {
aio_poll(aio_context, true);
}
return data.ret;
} else {
return -ENOTSUP;
}
}
static int coroutine_fn vmdk_co_write_zeroes(BlockDriverState *bs,
int64_t sector_num,
int nb_sectors,
BdrvRequestFlags flags)
static int coroutine_fn vmdk_co_pwrite_zeroes(BlockDriverState *bs,
int64_t offset,
int bytes,
BdrvRequestFlags flags)
{
int ret;
BDRVVmdkState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
/* write zeroes could fail if sectors not aligned to cluster, test it with
* dry_run == true before really updating image */
ret = vmdk_write(bs, sector_num, NULL, nb_sectors, true, true);
ret = vmdk_pwritev(bs, offset, bytes, NULL, true, true);
if (!ret) {
ret = vmdk_write(bs, sector_num, NULL, nb_sectors, true, false);
ret = vmdk_pwritev(bs, offset, bytes, NULL, true, false);
}
qemu_co_mutex_unlock(&s->lock);
return ret;
@@ -1728,12 +1804,12 @@ static int vmdk_create_extent(const char *filename, int64_t filesize,
header.check_bytes[3] = 0xa;
/* write all the data */
ret = blk_pwrite(blk, 0, &magic, sizeof(magic));
ret = blk_pwrite(blk, 0, &magic, sizeof(magic), 0);
if (ret < 0) {
error_setg(errp, QERR_IO_ERROR);
goto exit;
}
ret = blk_pwrite(blk, sizeof(magic), &header, sizeof(header));
ret = blk_pwrite(blk, sizeof(magic), &header, sizeof(header), 0);
if (ret < 0) {
error_setg(errp, QERR_IO_ERROR);
goto exit;
@@ -1753,7 +1829,7 @@ static int vmdk_create_extent(const char *filename, int64_t filesize,
gd_buf[i] = cpu_to_le32(tmp);
}
ret = blk_pwrite(blk, le64_to_cpu(header.rgd_offset) * BDRV_SECTOR_SIZE,
gd_buf, gd_buf_size);
gd_buf, gd_buf_size, 0);
if (ret < 0) {
error_setg(errp, QERR_IO_ERROR);
goto exit;
@@ -1765,7 +1841,7 @@ static int vmdk_create_extent(const char *filename, int64_t filesize,
gd_buf[i] = cpu_to_le32(tmp);
}
ret = blk_pwrite(blk, le64_to_cpu(header.gd_offset) * BDRV_SECTOR_SIZE,
gd_buf, gd_buf_size);
gd_buf, gd_buf_size, 0);
if (ret < 0) {
error_setg(errp, QERR_IO_ERROR);
goto exit;
@@ -1829,8 +1905,8 @@ static int vmdk_create(const char *filename, QemuOpts *opts, Error **errp)
int64_t total_size = 0, filesize;
char *adapter_type = NULL;
char *backing_file = NULL;
char *hw_version = NULL;
char *fmt = NULL;
int flags = 0;
int ret = 0;
bool flat, split, compress;
GString *ext_desc_lines;
@@ -1861,7 +1937,7 @@ static int vmdk_create(const char *filename, QemuOpts *opts, Error **errp)
"# The Disk Data Base\n"
"#DDB\n"
"\n"
"ddb.virtualHWVersion = \"%d\"\n"
"ddb.virtualHWVersion = \"%s\"\n"
"ddb.geometry.cylinders = \"%" PRId64 "\"\n"
"ddb.geometry.heads = \"%" PRIu32 "\"\n"
"ddb.geometry.sectors = \"63\"\n"
@@ -1878,8 +1954,20 @@ static int vmdk_create(const char *filename, QemuOpts *opts, Error **errp)
BDRV_SECTOR_SIZE);
adapter_type = qemu_opt_get_del(opts, BLOCK_OPT_ADAPTER_TYPE);
backing_file = qemu_opt_get_del(opts, BLOCK_OPT_BACKING_FILE);
hw_version = qemu_opt_get_del(opts, BLOCK_OPT_HWVERSION);
if (qemu_opt_get_bool_del(opts, BLOCK_OPT_COMPAT6, false)) {
flags |= BLOCK_FLAG_COMPAT6;
if (strcmp(hw_version, "undefined")) {
error_setg(errp,
"compat6 cannot be enabled with hwversion set");
ret = -EINVAL;
goto exit;
}
g_free(hw_version);
hw_version = g_strdup("6");
}
if (strcmp(hw_version, "undefined") == 0) {
g_free(hw_version);
hw_version = g_strdup("4");
}
fmt = qemu_opt_get_del(opts, BLOCK_OPT_SUBFMT);
if (qemu_opt_get_bool_del(opts, BLOCK_OPT_ZEROED_GRAIN, false)) {
@@ -2001,7 +2089,7 @@ static int vmdk_create(const char *filename, QemuOpts *opts, Error **errp)
fmt,
parent_desc_line,
ext_desc_lines->str,
(flags & BLOCK_FLAG_COMPAT6 ? 6 : 4),
hw_version,
total_size /
(int64_t)(63 * number_heads * BDRV_SECTOR_SIZE),
number_heads,
@@ -2028,7 +2116,7 @@ static int vmdk_create(const char *filename, QemuOpts *opts, Error **errp)
blk_set_allow_write_beyond_eof(new_blk, true);
ret = blk_pwrite(new_blk, desc_offset, desc, desc_len);
ret = blk_pwrite(new_blk, desc_offset, desc, desc_len, 0);
if (ret < 0) {
error_setg_errno(errp, -ret, "Could not write description");
goto exit;
@@ -2047,6 +2135,7 @@ exit:
}
g_free(adapter_type);
g_free(backing_file);
g_free(hw_version);
g_free(fmt);
g_free(desc);
g_free(path);
@@ -2250,27 +2339,6 @@ static int vmdk_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
return 0;
}
static void vmdk_detach_aio_context(BlockDriverState *bs)
{
BDRVVmdkState *s = bs->opaque;
int i;
for (i = 0; i < s->num_extents; i++) {
bdrv_detach_aio_context(s->extents[i].file->bs);
}
}
static void vmdk_attach_aio_context(BlockDriverState *bs,
AioContext *new_context)
{
BDRVVmdkState *s = bs->opaque;
int i;
for (i = 0; i < s->num_extents; i++) {
bdrv_attach_aio_context(s->extents[i].file->bs, new_context);
}
}
static QemuOptsList vmdk_create_opts = {
.name = "vmdk-create-opts",
.head = QTAILQ_HEAD_INITIALIZER(vmdk_create_opts.head),
@@ -2297,6 +2365,12 @@ static QemuOptsList vmdk_create_opts = {
.help = "VMDK version 6 image",
.def_value_str = "off"
},
{
.name = BLOCK_OPT_HWVERSION,
.type = QEMU_OPT_STRING,
.help = "VMDK hardware version",
.def_value_str = "undefined"
},
{
.name = BLOCK_OPT_SUBFMT,
.type = QEMU_OPT_STRING,
@@ -2321,10 +2395,10 @@ static BlockDriver bdrv_vmdk = {
.bdrv_open = vmdk_open,
.bdrv_check = vmdk_check,
.bdrv_reopen_prepare = vmdk_reopen_prepare,
.bdrv_read = vmdk_co_read,
.bdrv_write = vmdk_co_write,
.bdrv_co_preadv = vmdk_co_preadv,
.bdrv_co_pwritev = vmdk_co_pwritev,
.bdrv_write_compressed = vmdk_write_compressed,
.bdrv_co_write_zeroes = vmdk_co_write_zeroes,
.bdrv_co_pwrite_zeroes = vmdk_co_pwrite_zeroes,
.bdrv_close = vmdk_close,
.bdrv_create = vmdk_create,
.bdrv_co_flush_to_disk = vmdk_co_flush,
@@ -2334,8 +2408,6 @@ static BlockDriver bdrv_vmdk = {
.bdrv_get_specific_info = vmdk_get_specific_info,
.bdrv_refresh_limits = vmdk_refresh_limits,
.bdrv_get_info = vmdk_get_info,
.bdrv_detach_aio_context = vmdk_detach_aio_context,
.bdrv_attach_aio_context = vmdk_attach_aio_context,
.supports_backing = true,
.create_opts = &vmdk_create_opts,

View File

@@ -29,6 +29,7 @@
#include "sysemu/block-backend.h"
#include "qemu/module.h"
#include "migration/migration.h"
#include "qemu/bswap.h"
#if defined(CONFIG_UUID)
#include <uuid/uuid.h>
#endif
@@ -454,22 +455,21 @@ static int vpc_reopen_prepare(BDRVReopenState *state,
* The parameter write must be 1 if the offset will be used for a write
* operation (the block bitmaps is updated then), 0 otherwise.
*/
static inline int64_t get_sector_offset(BlockDriverState *bs,
int64_t sector_num, int write)
static inline int64_t get_image_offset(BlockDriverState *bs, uint64_t offset,
bool write)
{
BDRVVPCState *s = bs->opaque;
uint64_t offset = sector_num * 512;
uint64_t bitmap_offset, block_offset;
uint32_t pagetable_index, pageentry_index;
uint32_t pagetable_index, offset_in_block;
pagetable_index = offset / s->block_size;
pageentry_index = (offset % s->block_size) / 512;
offset_in_block = offset % s->block_size;
if (pagetable_index >= s->max_table_entries || s->pagetable[pagetable_index] == 0xffffffff)
return -1; /* not allocated */
bitmap_offset = 512 * (uint64_t) s->pagetable[pagetable_index];
block_offset = bitmap_offset + s->bitmap_size + (512 * pageentry_index);
block_offset = bitmap_offset + s->bitmap_size + offset_in_block;
/* We must ensure that we don't write to any sectors which are marked as
unused in the bitmap. We get away with setting all bits in the block
@@ -487,6 +487,12 @@ static inline int64_t get_sector_offset(BlockDriverState *bs,
return block_offset;
}
static inline int64_t get_sector_offset(BlockDriverState *bs,
int64_t sector_num, bool write)
{
return get_image_offset(bs, sector_num * BDRV_SECTOR_SIZE, write);
}
/*
* Writes the footer to the end of the image file. This is needed when the
* file grows as it overwrites the old footer
@@ -513,7 +519,7 @@ static int rewrite_footer(BlockDriverState* bs)
*
* Returns the sectors' offset in the image file on success and < 0 on error
*/
static int64_t alloc_block(BlockDriverState* bs, int64_t sector_num)
static int64_t alloc_block(BlockDriverState* bs, int64_t offset)
{
BDRVVPCState *s = bs->opaque;
int64_t bat_offset;
@@ -522,14 +528,13 @@ static int64_t alloc_block(BlockDriverState* bs, int64_t sector_num)
uint8_t bitmap[s->bitmap_size];
/* Check if sector_num is valid */
if ((sector_num < 0) || (sector_num > bs->total_sectors))
return -1;
if ((offset < 0) || (offset > bs->total_sectors * BDRV_SECTOR_SIZE)) {
return -EINVAL;
}
/* Write entry into in-memory BAT */
index = (sector_num * 512) / s->block_size;
if (s->pagetable[index] != 0xFFFFFFFF)
return -1;
index = offset / s->block_size;
assert(s->pagetable[index] == 0xFFFFFFFF);
s->pagetable[index] = s->free_data_block_offset / 512;
/* Initialize the block's bitmap */
@@ -553,11 +558,11 @@ static int64_t alloc_block(BlockDriverState* bs, int64_t sector_num)
if (ret < 0)
goto fail;
return get_sector_offset(bs, sector_num, 0);
return get_image_offset(bs, offset, false);
fail:
s->free_data_block_offset -= (s->block_size + s->bitmap_size);
return -1;
return ret;
}
static int vpc_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
@@ -573,104 +578,105 @@ static int vpc_get_info(BlockDriverState *bs, BlockDriverInfo *bdi)
return 0;
}
static int vpc_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
static int coroutine_fn
vpc_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
{
BDRVVPCState *s = bs->opaque;
int ret;
int64_t offset;
int64_t sectors, sectors_per_block;
int64_t image_offset;
int64_t n_bytes;
int64_t bytes_done = 0;
VHDFooter *footer = (VHDFooter *) s->footer_buf;
QEMUIOVector local_qiov;
if (be32_to_cpu(footer->type) == VHD_FIXED) {
return bdrv_read(bs->file->bs, sector_num, buf, nb_sectors);
return bdrv_co_preadv(bs->file->bs, offset, bytes, qiov, 0);
}
while (nb_sectors > 0) {
offset = get_sector_offset(bs, sector_num, 0);
sectors_per_block = s->block_size >> BDRV_SECTOR_BITS;
sectors = sectors_per_block - (sector_num % sectors_per_block);
if (sectors > nb_sectors) {
sectors = nb_sectors;
}
qemu_co_mutex_lock(&s->lock);
qemu_iovec_init(&local_qiov, qiov->niov);
if (offset == -1) {
memset(buf, 0, sectors * BDRV_SECTOR_SIZE);
while (bytes > 0) {
image_offset = get_image_offset(bs, offset, false);
n_bytes = MIN(bytes, s->block_size - (offset % s->block_size));
if (image_offset == -1) {
qemu_iovec_memset(qiov, bytes_done, 0, n_bytes);
} else {
ret = bdrv_pread(bs->file->bs, offset, buf,
sectors * BDRV_SECTOR_SIZE);
if (ret != sectors * BDRV_SECTOR_SIZE) {
return -1;
qemu_iovec_reset(&local_qiov);
qemu_iovec_concat(&local_qiov, qiov, bytes_done, n_bytes);
ret = bdrv_co_preadv(bs->file->bs, image_offset, n_bytes,
&local_qiov, 0);
if (ret < 0) {
goto fail;
}
}
nb_sectors -= sectors;
sector_num += sectors;
buf += sectors * BDRV_SECTOR_SIZE;
bytes -= n_bytes;
offset += n_bytes;
bytes_done += n_bytes;
}
return 0;
}
static coroutine_fn int vpc_co_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
{
int ret;
BDRVVPCState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = vpc_read(bs, sector_num, buf, nb_sectors);
ret = 0;
fail:
qemu_iovec_destroy(&local_qiov);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
static int vpc_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
static int coroutine_fn
vpc_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
{
BDRVVPCState *s = bs->opaque;
int64_t offset;
int64_t sectors, sectors_per_block;
int64_t image_offset;
int64_t n_bytes;
int64_t bytes_done = 0;
int ret;
VHDFooter *footer = (VHDFooter *) s->footer_buf;
QEMUIOVector local_qiov;
if (be32_to_cpu(footer->type) == VHD_FIXED) {
return bdrv_write(bs->file->bs, sector_num, buf, nb_sectors);
}
while (nb_sectors > 0) {
offset = get_sector_offset(bs, sector_num, 1);
sectors_per_block = s->block_size >> BDRV_SECTOR_BITS;
sectors = sectors_per_block - (sector_num % sectors_per_block);
if (sectors > nb_sectors) {
sectors = nb_sectors;
}
if (offset == -1) {
offset = alloc_block(bs, sector_num);
if (offset < 0)
return -1;
}
ret = bdrv_pwrite(bs->file->bs, offset, buf,
sectors * BDRV_SECTOR_SIZE);
if (ret != sectors * BDRV_SECTOR_SIZE) {
return -1;
}
nb_sectors -= sectors;
sector_num += sectors;
buf += sectors * BDRV_SECTOR_SIZE;
return bdrv_co_pwritev(bs->file->bs, offset, bytes, qiov, 0);
}
return 0;
}
static coroutine_fn int vpc_co_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
{
int ret;
BDRVVPCState *s = bs->opaque;
qemu_co_mutex_lock(&s->lock);
ret = vpc_write(bs, sector_num, buf, nb_sectors);
qemu_iovec_init(&local_qiov, qiov->niov);
while (bytes > 0) {
image_offset = get_image_offset(bs, offset, true);
n_bytes = MIN(bytes, s->block_size - (offset % s->block_size));
if (image_offset == -1) {
image_offset = alloc_block(bs, offset);
if (image_offset < 0) {
ret = image_offset;
goto fail;
}
}
qemu_iovec_reset(&local_qiov);
qemu_iovec_concat(&local_qiov, qiov, bytes_done, n_bytes);
ret = bdrv_co_pwritev(bs->file->bs, image_offset, n_bytes,
&local_qiov, 0);
if (ret < 0) {
goto fail;
}
bytes -= n_bytes;
offset += n_bytes;
bytes_done += n_bytes;
}
ret = 0;
fail:
qemu_iovec_destroy(&local_qiov);
qemu_co_mutex_unlock(&s->lock);
return ret;
}
@@ -783,13 +789,13 @@ static int create_dynamic_disk(BlockBackend *blk, uint8_t *buf,
block_size = 0x200000;
num_bat_entries = (total_sectors + block_size / 512) / (block_size / 512);
ret = blk_pwrite(blk, offset, buf, HEADER_SIZE);
ret = blk_pwrite(blk, offset, buf, HEADER_SIZE, 0);
if (ret < 0) {
goto fail;
}
offset = 1536 + ((num_bat_entries * 4 + 511) & ~511);
ret = blk_pwrite(blk, offset, buf, HEADER_SIZE);
ret = blk_pwrite(blk, offset, buf, HEADER_SIZE, 0);
if (ret < 0) {
goto fail;
}
@@ -799,7 +805,7 @@ static int create_dynamic_disk(BlockBackend *blk, uint8_t *buf,
memset(buf, 0xFF, 512);
for (i = 0; i < (num_bat_entries * 4 + 511) / 512; i++) {
ret = blk_pwrite(blk, offset, buf, 512);
ret = blk_pwrite(blk, offset, buf, 512, 0);
if (ret < 0) {
goto fail;
}
@@ -826,7 +832,7 @@ static int create_dynamic_disk(BlockBackend *blk, uint8_t *buf,
/* Write the header */
offset = 512;
ret = blk_pwrite(blk, offset, buf, 1024);
ret = blk_pwrite(blk, offset, buf, 1024, 0);
if (ret < 0) {
goto fail;
}
@@ -848,7 +854,7 @@ static int create_fixed_disk(BlockBackend *blk, uint8_t *buf,
return ret;
}
ret = blk_pwrite(blk, total_size - HEADER_SIZE, buf, HEADER_SIZE);
ret = blk_pwrite(blk, total_size - HEADER_SIZE, buf, HEADER_SIZE, 0);
if (ret < 0) {
return ret;
}
@@ -1056,8 +1062,8 @@ static BlockDriver bdrv_vpc = {
.bdrv_reopen_prepare = vpc_reopen_prepare,
.bdrv_create = vpc_create,
.bdrv_read = vpc_co_read,
.bdrv_write = vpc_co_write,
.bdrv_co_preadv = vpc_co_preadv,
.bdrv_co_pwritev = vpc_co_pwritev,
.bdrv_co_get_block_status = vpc_co_get_block_status,
.bdrv_get_info = vpc_get_info,

View File

@@ -27,6 +27,7 @@
#include "qapi/error.h"
#include "block/block_int.h"
#include "qemu/module.h"
#include "qemu/bswap.h"
#include "migration/migration.h"
#include "qapi/qmp/qint.h"
#include "qapi/qmp/qbool.h"
@@ -113,15 +114,12 @@ static inline int array_ensure_allocated(array_t* array, int index)
static inline void* array_get_next(array_t* array) {
unsigned int next = array->next;
void* result;
if (array_ensure_allocated(array, next) < 0)
return NULL;
array->next = next + 1;
result = array_get(array, next);
return result;
return array_get(array, next);
}
static inline void* array_insert(array_t* array,unsigned int index,unsigned int count) {
@@ -1179,6 +1177,7 @@ static int vvfat_open(BlockDriverState *bs, QDict *options, int flags,
bs->read_only = 0;
}
bs->request_alignment = BDRV_SECTOR_SIZE; /* No sub-sector I/O supported */
bs->total_sectors = cyls * heads * secs;
if (init_directories(s, dirname, heads, secs, errp)) {
@@ -1421,14 +1420,31 @@ DLOG(fprintf(stderr, "sector %d not allocated\n", (int)sector_num));
return 0;
}
static coroutine_fn int vvfat_co_read(BlockDriverState *bs, int64_t sector_num,
uint8_t *buf, int nb_sectors)
static int coroutine_fn
vvfat_co_preadv(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
{
int ret;
BDRVVVFATState *s = bs->opaque;
uint64_t sector_num = offset >> BDRV_SECTOR_BITS;
int nb_sectors = bytes >> BDRV_SECTOR_BITS;
void *buf;
assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
buf = g_try_malloc(bytes);
if (bytes && buf == NULL) {
return -ENOMEM;
}
qemu_co_mutex_lock(&s->lock);
ret = vvfat_read(bs, sector_num, buf, nb_sectors);
qemu_co_mutex_unlock(&s->lock);
qemu_iovec_from_buf(qiov, 0, buf, bytes);
g_free(buf);
return ret;
}
@@ -1941,8 +1957,7 @@ DLOG(fprintf(stderr, "check direntry %d:\n", i); print_direntry(direntries + i))
/* check file size with FAT */
cluster_count = get_cluster_count_for_direntry(s, direntries + i, path2);
if (cluster_count !=
(le32_to_cpu(direntries[i].size) + s->cluster_size
- 1) / s->cluster_size) {
DIV_ROUND_UP(le32_to_cpu(direntries[i].size), s->cluster_size)) {
DLOG(fprintf(stderr, "Cluster count mismatch\n"));
goto fail;
}
@@ -2880,14 +2895,31 @@ DLOG(checkpoint());
return 0;
}
static coroutine_fn int vvfat_co_write(BlockDriverState *bs, int64_t sector_num,
const uint8_t *buf, int nb_sectors)
static int coroutine_fn
vvfat_co_pwritev(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
{
int ret;
BDRVVVFATState *s = bs->opaque;
uint64_t sector_num = offset >> BDRV_SECTOR_BITS;
int nb_sectors = bytes >> BDRV_SECTOR_BITS;
void *buf;
assert((offset & (BDRV_SECTOR_SIZE - 1)) == 0);
assert((bytes & (BDRV_SECTOR_SIZE - 1)) == 0);
buf = g_try_malloc(bytes);
if (bytes && buf == NULL) {
return -ENOMEM;
}
qemu_iovec_to_buf(qiov, 0, buf, bytes);
qemu_co_mutex_lock(&s->lock);
ret = vvfat_write(bs, sector_num, buf, nb_sectors);
qemu_co_mutex_unlock(&s->lock);
g_free(buf);
return ret;
}
@@ -2904,8 +2936,10 @@ static int64_t coroutine_fn vvfat_co_get_block_status(BlockDriverState *bs,
return BDRV_BLOCK_DATA;
}
static int write_target_commit(BlockDriverState *bs, int64_t sector_num,
const uint8_t* buffer, int nb_sectors) {
static int coroutine_fn
write_target_commit(BlockDriverState *bs, uint64_t offset, uint64_t bytes,
QEMUIOVector *qiov, int flags)
{
BDRVVVFATState* s = *((BDRVVVFATState**) bs->opaque);
return try_commit(s);
}
@@ -2918,7 +2952,7 @@ static void write_target_close(BlockDriverState *bs) {
static BlockDriver vvfat_write_target = {
.format_name = "vvfat_write_target",
.bdrv_write = write_target_commit,
.bdrv_co_pwritev = write_target_commit,
.bdrv_close = write_target_close,
};
@@ -2960,12 +2994,12 @@ static int enable_write_target(BDRVVVFATState *s, Error **errp)
goto err;
}
s->qcow = NULL;
options = qdict_new();
qdict_put(options, "driver", qstring_from_str("qcow"));
ret = bdrv_open(&s->qcow, s->qcow_filename, NULL, options,
BDRV_O_RDWR | BDRV_O_NO_FLUSH, errp);
if (ret < 0) {
s->qcow = bdrv_open(s->qcow_filename, NULL, options,
BDRV_O_RDWR | BDRV_O_NO_FLUSH, errp);
if (!s->qcow) {
ret = -EINVAL;
goto err;
}
@@ -3014,8 +3048,8 @@ static BlockDriver bdrv_vvfat = {
.bdrv_file_open = vvfat_open,
.bdrv_close = vvfat_close,
.bdrv_read = vvfat_co_read,
.bdrv_write = vvfat_co_write,
.bdrv_co_preadv = vvfat_co_preadv,
.bdrv_co_pwritev = vvfat_co_pwritev,
.bdrv_co_get_block_status = vvfat_co_get_block_status,
};

View File

@@ -56,6 +56,8 @@
static QTAILQ_HEAD(, BlockDriverState) monitor_bdrv_states =
QTAILQ_HEAD_INITIALIZER(monitor_bdrv_states);
static int do_open_tray(const char *device, bool force, Error **errp);
static const char *const if_name[IF_COUNT] = {
[IF_NONE] = "none",
[IF_IDE] = "ide",
@@ -73,7 +75,7 @@ static int if_max_devs[IF_COUNT] = {
* Do not change these numbers! They govern how drive option
* index maps to unit and bus. That mapping is ABI.
*
* All controllers used to imlement if=T drives need to support
* All controllers used to implement if=T drives need to support
* if_max_devs[T] units, for any T with if_max_devs[T] != 0.
* Otherwise, some index values map to "impossible" bus, unit
* values.
@@ -567,25 +569,12 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
if ((!file || !*file) && !qdict_size(bs_opts)) {
BlockBackendRootState *blk_rs;
blk = blk_new(errp);
if (!blk) {
goto early_err;
}
blk = blk_new();
blk_rs = blk_get_root_state(blk);
blk_rs->open_flags = bdrv_flags;
blk_rs->read_only = !(bdrv_flags & BDRV_O_RDWR);
blk_rs->detect_zeroes = detect_zeroes;
if (throttle_enabled(&cfg)) {
if (!throttling_group) {
throttling_group = blk_name(blk);
}
blk_rs->throttle_group = g_strdup(throttling_group);
blk_rs->throttle_state = throttle_group_incref(throttling_group);
blk_rs->throttle_state->cfg = cfg;
}
QDECREF(bs_opts);
} else {
if (file && !*file) {
@@ -611,15 +600,6 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
bs->detect_zeroes = detect_zeroes;
/* disk I/O throttling */
if (throttle_enabled(&cfg)) {
if (!throttling_group) {
throttling_group = blk_name(blk);
}
bdrv_io_limits_enable(bs, throttling_group);
bdrv_set_io_limits(bs, &cfg);
}
if (bdrv_key_required(bs)) {
autostart = 0;
}
@@ -633,6 +613,15 @@ static BlockBackend *blockdev_init(const char *file, QDict *bs_opts,
}
}
/* disk I/O throttling */
if (throttle_enabled(&cfg)) {
if (!throttling_group) {
throttling_group = blk_name(blk);
}
blk_io_limits_enable(blk, throttling_group);
blk_set_io_limits(blk, &cfg);
}
blk_set_enable_write_cache(blk, !writethrough);
blk_set_on_error(blk, on_read_error, on_write_error);
@@ -666,7 +655,6 @@ static BlockDriverState *bds_tree_init(QDict *bs_opts, Error **errp)
QemuOpts *opts;
Error *local_error = NULL;
BlockdevDetectZeroesOptions detect_zeroes;
int ret;
int bdrv_flags = 0;
opts = qemu_opts_create(&qemu_root_bds_opts, NULL, 1, errp);
@@ -697,9 +685,8 @@ static BlockDriverState *bds_tree_init(QDict *bs_opts, Error **errp)
bdrv_flags |= BDRV_O_INACTIVE;
}
bs = NULL;
ret = bdrv_open(&bs, NULL, NULL, bs_opts, bdrv_flags, errp);
if (ret < 0) {
bs = bdrv_open(NULL, NULL, bs_opts, bdrv_flags, errp);
if (!bs) {
goto fail_no_bs_opts;
}
@@ -1652,7 +1639,7 @@ typedef struct ExternalSnapshotState {
static void external_snapshot_prepare(BlkActionState *common,
Error **errp)
{
int flags = 0, ret;
int flags = 0;
QDict *options = NULL;
Error *local_err = NULL;
/* Device and node name of the image to generate the snapshot from */
@@ -1777,17 +1764,16 @@ static void external_snapshot_prepare(BlkActionState *common,
flags |= BDRV_O_NO_BACKING;
}
assert(state->new_bs == NULL);
ret = bdrv_open(&state->new_bs, new_image_file, snapshot_ref, options,
flags, errp);
state->new_bs = bdrv_open(new_image_file, snapshot_ref, options, flags,
errp);
/* We will manually add the backing_hd field to the bs later */
if (ret != 0) {
if (!state->new_bs) {
return;
}
if (state->new_bs->blk != NULL) {
if (bdrv_has_blk(state->new_bs)) {
error_setg(errp, "The snapshot is already in use by %s",
blk_name(state->new_bs->blk));
bdrv_get_parent_name(state->new_bs));
return;
}
@@ -2293,12 +2279,18 @@ exit:
void qmp_eject(const char *device, bool has_force, bool force, Error **errp)
{
Error *local_err = NULL;
int rc;
qmp_blockdev_open_tray(device, has_force, force, &local_err);
if (local_err) {
if (!has_force) {
force = false;
}
rc = do_open_tray(device, force, &local_err);
if (rc && rc != -ENOSYS) {
error_propagate(errp, local_err);
return;
}
error_free(local_err);
qmp_x_blockdev_remove_medium(device, errp);
}
@@ -2327,35 +2319,41 @@ void qmp_block_passwd(bool has_device, const char *device,
aio_context_release(aio_context);
}
void qmp_blockdev_open_tray(const char *device, bool has_force, bool force,
Error **errp)
/*
* Attempt to open the tray of @device.
* If @force, ignore its tray lock.
* Else, if the tray is locked, don't open it, but ask the guest to open it.
* On error, store an error through @errp and return -errno.
* If @device does not exist, return -ENODEV.
* If it has no removable media, return -ENOTSUP.
* If it has no tray, return -ENOSYS.
* If the guest was asked to open the tray, return -EINPROGRESS.
* Else, return 0.
*/
static int do_open_tray(const char *device, bool force, Error **errp)
{
BlockBackend *blk;
bool locked;
if (!has_force) {
force = false;
}
blk = blk_by_name(device);
if (!blk) {
error_set(errp, ERROR_CLASS_DEVICE_NOT_FOUND,
"Device '%s' not found", device);
return;
return -ENODEV;
}
if (!blk_dev_has_removable_media(blk)) {
error_setg(errp, "Device '%s' is not removable", device);
return;
return -ENOTSUP;
}
if (!blk_dev_has_tray(blk)) {
/* Ignore this command on tray-less devices */
return;
error_setg(errp, "Device '%s' does not have a tray", device);
return -ENOSYS;
}
if (blk_dev_is_tray_open(blk)) {
return;
return 0;
}
locked = blk_dev_is_medium_locked(blk);
@@ -2366,6 +2364,31 @@ void qmp_blockdev_open_tray(const char *device, bool has_force, bool force,
if (!locked || force) {
blk_dev_change_media_cb(blk, false);
}
if (locked && !force) {
error_setg(errp, "Device '%s' is locked and force was not specified, "
"wait for tray to open and try again", device);
return -EINPROGRESS;
}
return 0;
}
void qmp_blockdev_open_tray(const char *device, bool has_force, bool force,
Error **errp)
{
Error *local_err = NULL;
int rc;
if (!has_force) {
force = false;
}
rc = do_open_tray(device, force, &local_err);
if (rc && rc != -ENOSYS && rc != -EINPROGRESS) {
error_propagate(errp, local_err);
return;
}
error_free(local_err);
}
void qmp_blockdev_close_tray(const char *device, Error **errp)
@@ -2503,9 +2526,9 @@ void qmp_x_blockdev_insert_medium(const char *device, const char *node_name,
return;
}
if (bs->blk) {
if (bdrv_has_blk(bs)) {
error_setg(errp, "Node '%s' is already in use by '%s'", node_name,
blk_name(bs->blk));
bdrv_get_parent_name(bs));
return;
}
@@ -2520,7 +2543,8 @@ void qmp_blockdev_change_medium(const char *device, const char *filename,
{
BlockBackend *blk;
BlockDriverState *medium_bs = NULL;
int bdrv_flags, ret;
int bdrv_flags;
int rc;
QDict *options = NULL;
Error *err = NULL;
@@ -2564,25 +2588,24 @@ void qmp_blockdev_change_medium(const char *device, const char *filename,
qdict_put(options, "driver", qstring_from_str(format));
}
assert(!medium_bs);
ret = bdrv_open(&medium_bs, filename, NULL, options, bdrv_flags, errp);
if (ret < 0) {
medium_bs = bdrv_open(filename, NULL, options, bdrv_flags, errp);
if (!medium_bs) {
goto fail;
}
blk_apply_root_state(blk, medium_bs);
bdrv_add_key(medium_bs, NULL, &err);
if (err) {
error_propagate(errp, err);
goto fail;
}
qmp_blockdev_open_tray(device, false, false, &err);
if (err) {
rc = do_open_tray(device, false, &err);
if (rc && rc != -ENOSYS) {
error_propagate(errp, err);
goto fail;
}
error_free(err);
err = NULL;
qmp_x_blockdev_remove_medium(device, &err);
if (err) {
@@ -2596,6 +2619,8 @@ void qmp_blockdev_change_medium(const char *device, const char *filename,
goto fail;
}
blk_apply_root_state(blk, medium_bs);
qmp_blockdev_close_tray(device, errp);
fail:
@@ -2661,13 +2686,6 @@ void qmp_block_set_io_throttle(const char *device, int64_t bps, int64_t bps_rd,
goto out;
}
/* The BlockBackend must be the only parent */
assert(QLIST_FIRST(&bs->parents));
if (QLIST_NEXT(QLIST_FIRST(&bs->parents), next_parent)) {
error_setg(errp, "Cannot throttle device with multiple parents");
goto out;
}
throttle_config_init(&cfg);
cfg.buckets[THROTTLE_BPS_TOTAL].avg = bps;
cfg.buckets[THROTTLE_BPS_READ].avg = bps_rd;
@@ -2726,16 +2744,16 @@ void qmp_block_set_io_throttle(const char *device, int64_t bps, int64_t bps_rd,
if (throttle_enabled(&cfg)) {
/* Enable I/O limits if they're not enabled yet, otherwise
* just update the throttling group. */
if (!bs->throttle_state) {
bdrv_io_limits_enable(bs, has_group ? group : device);
if (!blk_get_public(blk)->throttle_state) {
blk_io_limits_enable(blk, has_group ? group : device);
} else if (has_group) {
bdrv_io_limits_update_group(bs, group);
blk_io_limits_update_group(blk, group);
}
/* Set the new throttling configuration */
bdrv_set_io_limits(bs, &cfg);
} else if (bs->throttle_state) {
blk_set_io_limits(blk, &cfg);
} else if (blk_get_public(blk)->throttle_state) {
/* If all throttling settings are set to 0, disable I/O limits */
bdrv_io_limits_disable(bs);
blk_io_limits_disable(blk);
}
out:
@@ -3186,7 +3204,6 @@ static void do_drive_backup(const char *device, const char *target,
Error *local_err = NULL;
int flags;
int64_t size;
int ret;
if (!has_speed) {
speed = 0;
@@ -3270,10 +3287,8 @@ static void do_drive_backup(const char *device, const char *target,
qdict_put(options, "driver", qstring_from_str(format));
}
target_bs = NULL;
ret = bdrv_open(&target_bs, target, NULL, options, flags, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
target_bs = bdrv_open(target, NULL, options, flags, errp);
if (!target_bs) {
goto out;
}
@@ -3291,8 +3306,8 @@ static void do_drive_backup(const char *device, const char *target,
backup_start(bs, target_bs, speed, sync, bmap,
on_source_error, on_target_error,
block_job_cb, bs, txn, &local_err);
bdrv_unref(target_bs);
if (local_err != NULL) {
bdrv_unref(target_bs);
error_propagate(errp, local_err);
goto out;
}
@@ -3333,7 +3348,7 @@ void do_blockdev_backup(const char *device, const char *target,
BlockdevOnError on_target_error,
BlockJobTxn *txn, Error **errp)
{
BlockBackend *blk, *target_blk;
BlockBackend *blk;
BlockDriverState *bs;
BlockDriverState *target_bs;
Error *local_err = NULL;
@@ -3364,24 +3379,25 @@ void do_blockdev_backup(const char *device, const char *target,
}
bs = blk_bs(blk);
target_blk = blk_by_name(target);
if (!target_blk) {
error_setg(errp, "Device '%s' not found", target);
target_bs = bdrv_lookup_bs(target, target, errp);
if (!target_bs) {
goto out;
}
if (!blk_is_available(target_blk)) {
error_setg(errp, "Device '%s' has no medium", target);
goto out;
if (bdrv_get_aio_context(target_bs) != aio_context) {
if (!bdrv_has_blk(target_bs)) {
/* The target BDS is not attached, we can safely move it to another
* AioContext. */
bdrv_set_aio_context(target_bs, aio_context);
} else {
error_setg(errp, "Target is attached to a different thread from "
"source.");
goto out;
}
}
target_bs = blk_bs(target_blk);
bdrv_ref(target_bs);
bdrv_set_aio_context(target_bs, aio_context);
backup_start(bs, target_bs, speed, sync, NULL, on_source_error,
on_target_error, block_job_cb, bs, txn, &local_err);
if (local_err != NULL) {
bdrv_unref(target_bs);
error_propagate(errp, local_err);
}
out:
@@ -3410,6 +3426,7 @@ static void blockdev_mirror_common(BlockDriverState *bs,
BlockDriverState *target,
bool has_replaces, const char *replaces,
enum MirrorSyncMode sync,
BlockMirrorBackingMode backing_mode,
bool has_speed, int64_t speed,
bool has_granularity, uint32_t granularity,
bool has_buf_size, int64_t buf_size,
@@ -3457,10 +3474,6 @@ static void blockdev_mirror_common(BlockDriverState *bs,
if (bdrv_op_is_blocked(target, BLOCK_OP_TYPE_MIRROR_TARGET, errp)) {
return;
}
if (target->blk) {
error_setg(errp, "Cannot mirror to an attached block device");
return;
}
if (!bs->backing && sync == MIRROR_SYNC_MODE_TOP) {
sync = MIRROR_SYNC_MODE_FULL;
@@ -3471,7 +3484,7 @@ static void blockdev_mirror_common(BlockDriverState *bs,
*/
mirror_start(bs, target,
has_replaces ? replaces : NULL,
speed, granularity, buf_size, sync,
speed, granularity, buf_size, sync, backing_mode,
on_source_error, on_target_error, unmap,
block_job_cb, bs, errp);
}
@@ -3494,11 +3507,11 @@ void qmp_drive_mirror(const char *device, const char *target,
BlockBackend *blk;
BlockDriverState *source, *target_bs;
AioContext *aio_context;
BlockMirrorBackingMode backing_mode;
Error *local_err = NULL;
QDict *options = NULL;
int flags;
int64_t size;
int ret;
blk = blk_by_name(device);
if (!blk) {
@@ -3568,6 +3581,12 @@ void qmp_drive_mirror(const char *device, const char *target,
}
}
if (mode == NEW_IMAGE_MODE_ABSOLUTE_PATHS) {
backing_mode = MIRROR_SOURCE_BACKING_CHAIN;
} else {
backing_mode = MIRROR_OPEN_BACKING_CHAIN;
}
if ((sync == MIRROR_SYNC_MODE_FULL || !source)
&& mode != NEW_IMAGE_MODE_EXISTING)
{
@@ -3607,18 +3626,16 @@ void qmp_drive_mirror(const char *device, const char *target,
/* Mirroring takes care of copy-on-write using the source's backing
* file.
*/
target_bs = NULL;
ret = bdrv_open(&target_bs, target, NULL, options,
flags | BDRV_O_NO_BACKING, &local_err);
if (ret < 0) {
error_propagate(errp, local_err);
target_bs = bdrv_open(target, NULL, options, flags | BDRV_O_NO_BACKING,
errp);
if (!target_bs) {
goto out;
}
bdrv_set_aio_context(target_bs, aio_context);
blockdev_mirror_common(bs, target_bs,
has_replaces, replaces, sync,
has_replaces, replaces, sync, backing_mode,
has_speed, speed,
has_granularity, granularity,
has_buf_size, buf_size,
@@ -3626,10 +3643,8 @@ void qmp_drive_mirror(const char *device, const char *target,
has_on_target_error, on_target_error,
has_unmap, unmap,
&local_err);
if (local_err) {
error_propagate(errp, local_err);
bdrv_unref(target_bs);
}
bdrv_unref(target_bs);
error_propagate(errp, local_err);
out:
aio_context_release(aio_context);
}
@@ -3650,6 +3665,7 @@ void qmp_blockdev_mirror(const char *device, const char *target,
BlockBackend *blk;
BlockDriverState *target_bs;
AioContext *aio_context;
BlockMirrorBackingMode backing_mode = MIRROR_LEAVE_BACKING_CHAIN;
Error *local_err = NULL;
blk = blk_by_name(device);
@@ -3672,11 +3688,10 @@ void qmp_blockdev_mirror(const char *device, const char *target,
aio_context = bdrv_get_aio_context(bs);
aio_context_acquire(aio_context);
bdrv_ref(target_bs);
bdrv_set_aio_context(target_bs, aio_context);
blockdev_mirror_common(bs, target_bs,
has_replaces, replaces, sync,
has_replaces, replaces, sync, backing_mode,
has_speed, speed,
has_granularity, granularity,
has_buf_size, buf_size,
@@ -3684,10 +3699,7 @@ void qmp_blockdev_mirror(const char *device, const char *target,
has_on_target_error, on_target_error,
true, true,
&local_err);
if (local_err) {
error_propagate(errp, local_err);
bdrv_unref(target_bs);
}
error_propagate(errp, local_err);
aio_context_release(aio_context);
}
@@ -3795,6 +3807,7 @@ void qmp_block_job_resume(const char *device, Error **errp)
job->user_paused = false;
trace_qmp_block_job_resume(job);
block_job_iostatus_reset(job);
block_job_resume(job);
aio_context_release(aio_context);
}
@@ -3897,9 +3910,7 @@ void qmp_change_backing_file(const char *device,
if (ro) {
bdrv_reopen(image_bs, open_flags, &local_err);
if (local_err) {
error_propagate(errp, local_err); /* will preserve prior errp */
}
error_propagate(errp, local_err);
}
out:
@@ -4046,15 +4057,15 @@ void qmp_x_blockdev_del(bool has_id, const char *id,
bs = blk_bs(blk);
aio_context = blk_get_aio_context(blk);
} else {
blk = NULL;
bs = bdrv_find_node(node_name);
if (!bs) {
error_setg(errp, "Cannot find node %s", node_name);
return;
}
blk = bs->blk;
if (blk) {
if (bdrv_has_blk(bs)) {
error_setg(errp, "Node %s is in use by %s",
node_name, blk_name(blk));
node_name, bdrv_get_parent_name(bs));
return;
}
aio_context = bdrv_get_aio_context(bs);
@@ -4092,24 +4103,76 @@ out:
aio_context_release(aio_context);
}
static BdrvChild *bdrv_find_child(BlockDriverState *parent_bs,
const char *child_name)
{
BdrvChild *child;
QLIST_FOREACH(child, &parent_bs->children, next) {
if (strcmp(child->name, child_name) == 0) {
return child;
}
}
return NULL;
}
void qmp_x_blockdev_change(const char *parent, bool has_child,
const char *child, bool has_node,
const char *node, Error **errp)
{
BlockDriverState *parent_bs, *new_bs = NULL;
BdrvChild *p_child;
parent_bs = bdrv_lookup_bs(parent, parent, errp);
if (!parent_bs) {
return;
}
if (has_child == has_node) {
if (has_child) {
error_setg(errp, "The parameters child and node are in conflict");
} else {
error_setg(errp, "Either child or node must be specified");
}
return;
}
if (has_child) {
p_child = bdrv_find_child(parent_bs, child);
if (!p_child) {
error_setg(errp, "Node '%s' does not have child '%s'",
parent, child);
return;
}
bdrv_del_child(parent_bs, p_child, errp);
}
if (has_node) {
new_bs = bdrv_find_node(node);
if (!new_bs) {
error_setg(errp, "Node '%s' not found", node);
return;
}
bdrv_add_child(parent_bs, new_bs, errp);
}
}
BlockJobInfoList *qmp_query_block_jobs(Error **errp)
{
BlockJobInfoList *head = NULL, **p_next = &head;
BlockDriverState *bs;
BlockJob *job;
for (bs = bdrv_next(NULL); bs; bs = bdrv_next(bs)) {
AioContext *aio_context = bdrv_get_aio_context(bs);
for (job = block_job_next(NULL); job; job = block_job_next(job)) {
BlockJobInfoList *elem = g_new0(BlockJobInfoList, 1);
AioContext *aio_context = blk_get_aio_context(job->blk);
aio_context_acquire(aio_context);
if (bs->job) {
BlockJobInfoList *elem = g_new0(BlockJobInfoList, 1);
elem->value = block_job_query(bs->job);
*p_next = elem;
p_next = &elem->next;
}
elem->value = block_job_query(job);
aio_context_release(aio_context);
*p_next = elem;
p_next = &elem->next;
}
return head;

View File

@@ -50,17 +50,75 @@ struct BlockJobTxn {
int refcnt;
};
static QLIST_HEAD(, BlockJob) block_jobs = QLIST_HEAD_INITIALIZER(block_jobs);
BlockJob *block_job_next(BlockJob *job)
{
if (!job) {
return QLIST_FIRST(&block_jobs);
}
return QLIST_NEXT(job, job_list);
}
/* Normally the job runs in its BlockBackend's AioContext. The exception is
* block_job_defer_to_main_loop() where it runs in the QEMU main loop. Code
* that supports both cases uses this helper function.
*/
static AioContext *block_job_get_aio_context(BlockJob *job)
{
return job->deferred_to_main_loop ?
qemu_get_aio_context() :
blk_get_aio_context(job->blk);
}
static void block_job_attached_aio_context(AioContext *new_context,
void *opaque)
{
BlockJob *job = opaque;
if (job->driver->attached_aio_context) {
job->driver->attached_aio_context(job, new_context);
}
block_job_resume(job);
}
static void block_job_detach_aio_context(void *opaque)
{
BlockJob *job = opaque;
/* In case the job terminates during aio_poll()... */
block_job_ref(job);
block_job_pause(job);
if (!job->paused) {
/* If job is !job->busy this kicks it into the next pause point. */
block_job_enter(job);
}
while (!job->paused && !job->completed) {
aio_poll(block_job_get_aio_context(job), true);
}
block_job_unref(job);
}
void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
int64_t speed, BlockCompletionFunc *cb,
void *opaque, Error **errp)
{
BlockBackend *blk;
BlockJob *job;
assert(cb);
if (bs->job) {
error_setg(errp, QERR_DEVICE_IN_USE, bdrv_get_device_name(bs));
return NULL;
}
bdrv_ref(bs);
blk = blk_new();
blk_insert_bs(blk, bs);
job = g_malloc0(driver->instance_size);
error_setg(&job->blocker, "block device is in use by block job: %s",
BlockJobType_lookup[driver->job_type]);
@@ -69,13 +127,18 @@ void *block_job_create(const BlockJobDriver *driver, BlockDriverState *bs,
job->driver = driver;
job->id = g_strdup(bdrv_get_device_name(bs));
job->bs = bs;
job->blk = blk;
job->cb = cb;
job->opaque = opaque;
job->busy = true;
job->refcnt = 1;
bs->job = job;
QLIST_INSERT_HEAD(&block_jobs, job, job_list);
blk_add_aio_context_notifier(blk, block_job_attached_aio_context,
block_job_detach_aio_context, job);
/* Only set speed when necessary to avoid NotSupported error */
if (speed != 0) {
Error *local_err = NULL;
@@ -98,11 +161,16 @@ void block_job_ref(BlockJob *job)
void block_job_unref(BlockJob *job)
{
if (--job->refcnt == 0) {
job->bs->job = NULL;
bdrv_op_unblock_all(job->bs, job->blocker);
bdrv_unref(job->bs);
BlockDriverState *bs = blk_bs(job->blk);
bs->job = NULL;
bdrv_op_unblock_all(bs, job->blocker);
blk_remove_aio_context_notifier(job->blk,
block_job_attached_aio_context,
block_job_detach_aio_context, job);
blk_unref(job->blk);
error_free(job->blocker);
g_free(job->id);
QLIST_REMOVE(job, job_list);
g_free(job);
}
}
@@ -140,7 +208,7 @@ static void block_job_completed_txn_abort(BlockJob *job)
txn->aborting = true;
/* We are the first failed job. Cancel other jobs. */
QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
ctx = bdrv_get_aio_context(other_job->bs);
ctx = blk_get_aio_context(other_job->blk);
aio_context_acquire(ctx);
}
QLIST_FOREACH(other_job, &txn->jobs, txn_list) {
@@ -157,7 +225,7 @@ static void block_job_completed_txn_abort(BlockJob *job)
assert(other_job->completed);
}
QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
ctx = bdrv_get_aio_context(other_job->bs);
ctx = blk_get_aio_context(other_job->blk);
block_job_completed_single(other_job);
aio_context_release(ctx);
}
@@ -179,7 +247,7 @@ static void block_job_completed_txn_success(BlockJob *job)
}
/* We are the last completed job, commit the transaction. */
QLIST_FOREACH_SAFE(other_job, &txn->jobs, txn_list, next) {
ctx = bdrv_get_aio_context(other_job->bs);
ctx = blk_get_aio_context(other_job->blk);
aio_context_acquire(ctx);
assert(other_job->ret == 0);
block_job_completed_single(other_job);
@@ -189,9 +257,7 @@ static void block_job_completed_txn_success(BlockJob *job)
void block_job_completed(BlockJob *job, int ret)
{
BlockDriverState *bs = job->bs;
assert(bs->job == job);
assert(blk_bs(job->blk)->job == job);
assert(!job->completed);
job->completed = true;
job->ret = ret;
@@ -236,11 +302,37 @@ void block_job_pause(BlockJob *job)
job->pause_count++;
}
bool block_job_is_paused(BlockJob *job)
static bool block_job_should_pause(BlockJob *job)
{
return job->pause_count > 0;
}
void coroutine_fn block_job_pause_point(BlockJob *job)
{
if (!block_job_should_pause(job)) {
return;
}
if (block_job_is_cancelled(job)) {
return;
}
if (job->driver->pause) {
job->driver->pause(job);
}
if (block_job_should_pause(job) && !block_job_is_cancelled(job)) {
job->paused = true;
job->busy = false;
qemu_coroutine_yield(); /* wait for block_job_resume() */
job->busy = true;
job->paused = false;
}
if (job->driver->resume) {
job->driver->resume(job);
}
}
void block_job_resume(BlockJob *job)
{
assert(job->pause_count > 0);
@@ -253,7 +345,6 @@ void block_job_resume(BlockJob *job)
void block_job_enter(BlockJob *job)
{
block_job_iostatus_reset(job);
if (job->co && !job->busy) {
qemu_coroutine_enter(job->co, NULL);
}
@@ -262,6 +353,7 @@ void block_job_enter(BlockJob *job)
void block_job_cancel(BlockJob *job)
{
job->cancelled = true;
block_job_iostatus_reset(job);
block_job_enter(job);
}
@@ -282,11 +374,10 @@ static int block_job_finish_sync(BlockJob *job,
void (*finish)(BlockJob *, Error **errp),
Error **errp)
{
BlockDriverState *bs = job->bs;
Error *local_err = NULL;
int ret;
assert(bs->job == job);
assert(blk_bs(job->blk)->job == job);
block_job_ref(job);
finish(job, &local_err);
@@ -296,9 +387,7 @@ static int block_job_finish_sync(BlockJob *job,
return -EBUSY;
}
while (!job->completed) {
aio_poll(job->deferred_to_main_loop ? qemu_get_aio_context() :
bdrv_get_aio_context(bs),
true);
aio_poll(block_job_get_aio_context(job), true);
}
ret = (job->cancelled && job->ret == 0) ? -ECANCELED : job->ret;
block_job_unref(job);
@@ -318,6 +407,19 @@ int block_job_cancel_sync(BlockJob *job)
return block_job_finish_sync(job, &block_job_cancel_err, NULL);
}
void block_job_cancel_sync_all(void)
{
BlockJob *job;
AioContext *aio_context;
while ((job = QLIST_FIRST(&block_jobs))) {
aio_context = blk_get_aio_context(job->blk);
aio_context_acquire(aio_context);
block_job_cancel_sync(job);
aio_context_release(aio_context);
}
}
int block_job_complete_sync(BlockJob *job, Error **errp)
{
return block_job_finish_sync(job, &block_job_complete, errp);
@@ -333,12 +435,12 @@ void block_job_sleep_ns(BlockJob *job, QEMUClockType type, int64_t ns)
}
job->busy = false;
if (block_job_is_paused(job)) {
qemu_coroutine_yield();
} else {
co_aio_sleep_ns(bdrv_get_aio_context(job->bs), type, ns);
if (!block_job_should_pause(job)) {
co_aio_sleep_ns(blk_get_aio_context(job->blk), type, ns);
}
job->busy = true;
block_job_pause_point(job);
}
void block_job_yield(BlockJob *job)
@@ -351,8 +453,12 @@ void block_job_yield(BlockJob *job)
}
job->busy = false;
qemu_coroutine_yield();
if (!block_job_should_pause(job)) {
qemu_coroutine_yield();
}
job->busy = true;
block_job_pause_point(job);
}
BlockJobInfo *block_job_query(BlockJob *job)
@@ -411,8 +517,7 @@ void block_job_event_ready(BlockJob *job)
job->speed, &error_abort);
}
BlockErrorAction block_job_error_action(BlockJob *job, BlockDriverState *bs,
BlockdevOnError on_err,
BlockErrorAction block_job_error_action(BlockJob *job, BlockdevOnError on_err,
int is_read, int error)
{
BlockErrorAction action;
@@ -443,9 +548,6 @@ BlockErrorAction block_job_error_action(BlockJob *job, BlockDriverState *bs,
job->user_paused = true;
block_job_pause(job);
block_job_iostatus_set_err(job, error);
if (bs->blk && bs != job->bs) {
blk_iostatus_set_err(bs->blk, error);
}
}
return action;
}
@@ -469,7 +571,7 @@ static void block_job_defer_to_main_loop_bh(void *opaque)
aio_context_acquire(data->aio_context);
/* Fetch BDS AioContext again, in case it has changed */
aio_context = bdrv_get_aio_context(data->job->bs);
aio_context = blk_get_aio_context(data->job->blk);
aio_context_acquire(aio_context);
data->job->deferred_to_main_loop = false;
@@ -489,7 +591,7 @@ void block_job_defer_to_main_loop(BlockJob *job,
BlockJobDeferToMainLoopData *data = g_malloc(sizeof(*data));
data->job = job;
data->bh = qemu_bh_new(block_job_defer_to_main_loop_bh, data);
data->aio_context = bdrv_get_aio_context(job->bs);
data->aio_context = blk_get_aio_context(job->blk);
data->fn = fn;
data->opaque = opaque;
job->deferred_to_main_loop = true;

View File

@@ -28,6 +28,7 @@
#include "qapi/visitor.h"
#include "qemu/error-report.h"
#include "hw/hw.h"
#include "hw/qdev-core.h"
typedef struct FWBootEntry FWBootEntry;
@@ -301,9 +302,7 @@ static void device_set_bootindex(Object *obj, Visitor *v, const char *name,
add_boot_device_path(*prop->bootindex, prop->dev, prop->suffix);
out:
if (local_err) {
error_propagate(errp, local_err);
}
error_propagate(errp, local_err);
}
static void property_release_bootindex(Object *obj, const char *name,

View File

@@ -1,7 +1,6 @@
/* This is the Linux kernel elf-loading code, ported into user space */
#include "qemu/osdep.h"
#include <sys/mman.h>
#include "qemu.h"
#include "disas/disas.h"

View File

@@ -18,13 +18,14 @@
*/
#include "qemu/osdep.h"
#include <machine/trap.h>
#include <sys/mman.h>
#include "qapi/error.h"
#include "qemu.h"
#include "qemu/path.h"
#include "qemu/help_option.h"
/* For tb_lock */
#include "cpu.h"
#include "exec/exec-all.h"
#include "tcg.h"
#include "qemu/timer.h"
#include "qemu/envlist.h"
@@ -752,9 +753,6 @@ int main(int argc, char **argv)
}
cpu_model = NULL;
#if defined(cpudef_setup)
cpudef_setup(); /* parse cpu definitions in target config file (TBD) */
#endif
optind = 1;
for(;;) {
@@ -849,7 +847,8 @@ int main(int argc, char **argv)
}
/* init debug */
qemu_set_log_filename(log_file);
qemu_log_needs_buffers();
qemu_set_log_filename(log_file, &error_fatal);
if (log_mask) {
int mask;

View File

@@ -17,7 +17,6 @@
* along with this program; if not, see <http://www.gnu.org/licenses/>.
*/
#include "qemu/osdep.h"
#include <sys/mman.h>
#include "qemu.h"
#include "qemu-common.h"

View File

@@ -19,6 +19,7 @@
#include "cpu.h"
#include "exec/exec-all.h"
#include "exec/cpu_ldst.h"
#undef DEBUG_REMAP

View File

@@ -19,7 +19,6 @@
#include "qemu/osdep.h"
#include "qemu/cutils.h"
#include "qemu/path.h"
#include <sys/mman.h>
#include <sys/syscall.h>
#include <sys/param.h>
#include <sys/sysctl.h>
@@ -316,12 +315,14 @@ abi_long do_freebsd_syscall(void *cpu_env, int num, abi_long arg1,
abi_long arg5, abi_long arg6, abi_long arg7,
abi_long arg8)
{
CPUState *cpu = ENV_GET_CPU(cpu_env);
abi_long ret;
void *p;
#ifdef DEBUG
gemu_log("freebsd syscall %d\n", num);
#endif
trace_guest_user_syscall(cpu, num, arg1, arg2, arg3, arg4, arg5, arg6, arg7, arg8);
if(do_strace)
print_freebsd_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
@@ -401,6 +402,7 @@ abi_long do_freebsd_syscall(void *cpu_env, int num, abi_long arg1,
#endif
if (do_strace)
print_freebsd_syscall_ret(num, ret);
trace_guest_user_syscall_ret(cpu, num, ret);
return ret;
efault:
ret = -TARGET_EFAULT;
@@ -411,12 +413,14 @@ abi_long do_netbsd_syscall(void *cpu_env, int num, abi_long arg1,
abi_long arg2, abi_long arg3, abi_long arg4,
abi_long arg5, abi_long arg6)
{
CPUState *cpu = ENV_GET_CPU(cpu_env);
abi_long ret;
void *p;
#ifdef DEBUG
gemu_log("netbsd syscall %d\n", num);
#endif
trace_guest_user_syscall(cpu, num, arg1, arg2, arg3, arg4, arg5, arg6, 0, 0);
if(do_strace)
print_netbsd_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
@@ -473,6 +477,7 @@ abi_long do_netbsd_syscall(void *cpu_env, int num, abi_long arg1,
#endif
if (do_strace)
print_netbsd_syscall_ret(num, ret);
trace_guest_user_syscall_ret(cpu, num, ret);
return ret;
efault:
ret = -TARGET_EFAULT;
@@ -483,12 +488,14 @@ abi_long do_openbsd_syscall(void *cpu_env, int num, abi_long arg1,
abi_long arg2, abi_long arg3, abi_long arg4,
abi_long arg5, abi_long arg6)
{
CPUState *cpu = ENV_GET_CPU(cpu_env);
abi_long ret;
void *p;
#ifdef DEBUG
gemu_log("openbsd syscall %d\n", num);
#endif
trace_guest_user_syscall(cpu, num, arg1, arg2, arg3, arg4, arg5, arg6, 0, 0);
if(do_strace)
print_openbsd_syscall(num, arg1, arg2, arg3, arg4, arg5, arg6);
@@ -545,6 +552,7 @@ abi_long do_openbsd_syscall(void *cpu_env, int num, abi_long arg1,
#endif
if (do_strace)
print_openbsd_syscall_ret(num, ret);
trace_guest_user_syscall_ret(cpu, num, ret);
return ret;
efault:
ret = -TARGET_EFAULT;

366
configure vendored
View File

@@ -31,6 +31,7 @@ TMPCXX="${TMPDIR1}/${TMPB}.cxx"
TMPL="${TMPDIR1}/${TMPB}.lo"
TMPA="${TMPDIR1}/lib${TMPB}.la"
TMPE="${TMPDIR1}/${TMPB}.exe"
TMPMO="${TMPDIR1}/${TMPB}.mo"
rm -f config.log
@@ -164,7 +165,7 @@ have_backend () {
}
# default parameters
source_path=`dirname "$0"`
source_path=$(dirname "$0")
cpu=""
iasl="iasl"
interp_prefix="/usr/gnemul/qemu-%M"
@@ -207,7 +208,7 @@ fdt=""
netmap="no"
pixman=""
sdl=""
sdlabi="1.2"
sdlabi=""
virtfs=""
vnc="yes"
sparse="no"
@@ -269,7 +270,6 @@ aix="no"
blobs="yes"
pkgversion=""
pie=""
zero_malloc=""
qom_cast_debug="yes"
trace_backends="log"
trace_file="trace"
@@ -323,7 +323,7 @@ jemalloc="no"
# parse CC options first
for opt do
optarg=`expr "x$opt" : 'x[^=]*=\(.*\)'`
optarg=$(expr "x$opt" : 'x[^=]*=\(.*\)')
case "$opt" in
--cross-prefix=*) cross_prefix="$optarg"
;;
@@ -398,7 +398,7 @@ if test "$debug_info" = "yes"; then
fi
# make source path absolute
source_path=`cd "$source_path"; pwd`
source_path=$(cd "$source_path"; pwd)
# running configure in the source tree?
# we know that's the case if configure is there.
@@ -443,7 +443,7 @@ elif check_define __sun__ ; then
elif check_define __HAIKU__ ; then
targetos='Haiku'
else
targetos=`uname -s`
targetos=$(uname -s)
fi
# Some host OSes need non-standard checks for which CPU to use.
@@ -461,7 +461,7 @@ Darwin)
fi
;;
SunOS)
# `uname -m` returns i86pc even on an x86_64 box, so default based on isainfo
# $(uname -m) returns i86pc even on an x86_64 box, so default based on isainfo
if test -z "$cpu" && test "$(isainfo -k)" = "amd64"; then
cpu="x86_64"
fi
@@ -507,7 +507,7 @@ elif check_define __aarch64__ ; then
elif check_define __hppa__ ; then
cpu="hppa"
else
cpu=`uname -m`
cpu=$(uname -m)
fi
ARCH=
@@ -627,7 +627,7 @@ SunOS)
ld="gld"
smbd="${SMBD-/usr/sfw/sbin/smbd}"
needs_libsunmath="no"
solarisrev=`uname -r | cut -f2 -d.`
solarisrev=$(uname -r | cut -f2 -d.)
if [ "$cpu" = "i386" -o "$cpu" = "x86_64" ] ; then
if test "$solarisrev" -le 9 ; then
if test -f /opt/SUNWspro/prod/lib/libsunmath.so.1; then
@@ -722,7 +722,7 @@ fi
werror=""
for opt do
optarg=`expr "x$opt" : 'x[^=]*=\(.*\)'`
optarg=$(expr "x$opt" : 'x[^=]*=\(.*\)')
case "$opt" in
--help|-h) show_help=yes
;;
@@ -846,9 +846,9 @@ for opt do
;;
--audio-drv-list=*) audio_drv_list="$optarg"
;;
--block-drv-rw-whitelist=*|--block-drv-whitelist=*) block_drv_rw_whitelist=`echo "$optarg" | sed -e 's/,/ /g'`
--block-drv-rw-whitelist=*|--block-drv-whitelist=*) block_drv_rw_whitelist=$(echo "$optarg" | sed -e 's/,/ /g')
;;
--block-drv-ro-whitelist=*) block_drv_ro_whitelist=`echo "$optarg" | sed -e 's/,/ /g'`
--block-drv-ro-whitelist=*) block_drv_ro_whitelist=$(echo "$optarg" | sed -e 's/,/ /g')
;;
--enable-debug-tcg) debug_tcg="yes"
;;
@@ -943,7 +943,7 @@ for opt do
;;
--enable-cocoa)
cocoa="yes" ;
audio_drv_list="coreaudio `echo $audio_drv_list | sed s,coreaudio,,g`"
audio_drv_list="coreaudio $(echo $audio_drv_list | sed s,coreaudio,,g)"
;;
--disable-system) softmmu="no"
;;
@@ -1216,6 +1216,13 @@ esac
QEMU_CFLAGS="$CPU_CFLAGS $QEMU_CFLAGS"
EXTRA_CFLAGS="$CPU_CFLAGS $EXTRA_CFLAGS"
# For user-mode emulation the host arch has to be one we explicitly
# support, even if we're using TCI.
if [ "$ARCH" = "unknown" ]; then
bsd_user="no"
linux_user="no"
fi
default_target_list=""
mak_wilds=""
@@ -1380,7 +1387,6 @@ fi
if test "$ARCH" = "unknown"; then
if test "$tcg_interpreter" = "yes" ; then
echo "Unsupported CPU = $cpu, will use TCG with TCI (experimental)"
ARCH=tci
else
error_exit "Unsupported CPU = $cpu, try --enable-tcg-interpreter"
fi
@@ -1388,11 +1394,9 @@ fi
# Consult white-list to determine whether to enable werror
# by default. Only enable by default for git builds
z_version=`cut -f3 -d. $source_path/VERSION`
if test -z "$werror" ; then
if test -d "$source_path/.git" -a \
"$linux" = "yes" ; then
\( "$linux" = "yes" -o "$mingw32" = "yes" \) ; then
werror="yes"
else
werror="no"
@@ -1617,7 +1621,7 @@ if test "$solaris" = "yes" ; then
"install fileutils from www.blastwave.org using pkg-get -i fileutils" \
"to get ginstall which is used by default (which lives in /opt/csw/bin)"
fi
if test "`path_of $install`" = "/usr/sbin/install" ; then
if test "$(path_of $install)" = "/usr/sbin/install" ; then
error_exit "Solaris /usr/sbin/install is not an appropriate install program." \
"try ginstall from the GNU fileutils available from www.blastwave.org" \
"using pkg-get -i fileutils, or use --install=/usr/ucb/install"
@@ -1636,7 +1640,7 @@ fi
if test -z "${target_list+xxx}" ; then
target_list="$default_target_list"
else
target_list=`echo "$target_list" | sed -e 's/,/ /g'`
target_list=$(echo "$target_list" | sed -e 's/,/ /g')
fi
# Check that we recognised the target name; this allows a more
@@ -1781,14 +1785,23 @@ fi
# avx2 optimization requirement check
cat > $TMPC << EOF
static void bar(void) {}
#pragma GCC push_options
#pragma GCC target("avx2")
#include <cpuid.h>
#include <immintrin.h>
static int bar(void *a) {
return _mm256_movemask_epi8(_mm256_cmpeq_epi8(*(__m256i *)a, (__m256i){0}));
}
static void *bar_ifunc(void) {return (void*) bar;}
static void foo(void) __attribute__((ifunc("bar_ifunc")));
int main(void) { foo(); return 0; }
int foo(void *a) __attribute__((ifunc("bar_ifunc")));
int main(int argc, char *argv[]) { return foo(argv[0]);}
EOF
if compile_prog "-mavx2" "" ; then
if readelf --syms $TMPE |grep "IFUNC.*foo" >/dev/null 2>&1; then
avx2_opt="yes"
if compile_object "" ; then
if has readelf; then
if readelf --syms $TMPO 2>/dev/null |grep -q "IFUNC.*foo"; then
avx2_opt="yes"
fi
fi
fi
@@ -1879,6 +1892,9 @@ if test "$seccomp" != "no" ; then
arm|aarch64)
libseccomp_minver="2.2.3"
;;
ppc|ppc64)
libseccomp_minver="2.3.0"
;;
*)
libseccomp_minver=""
;;
@@ -1886,8 +1902,8 @@ if test "$seccomp" != "no" ; then
if test "$libseccomp_minver" != "" &&
$pkg_config --atleast-version=$libseccomp_minver libseccomp ; then
libs_softmmu="$libs_softmmu `$pkg_config --libs libseccomp`"
QEMU_CFLAGS="$QEMU_CFLAGS `$pkg_config --cflags libseccomp`"
libs_softmmu="$libs_softmmu $($pkg_config --libs libseccomp)"
QEMU_CFLAGS="$QEMU_CFLAGS $($pkg_config --cflags libseccomp)"
seccomp="yes"
else
if test "$seccomp" = "yes" ; then
@@ -2127,8 +2143,8 @@ fi
x11_cflags=
x11_libs=-lX11
if $pkg_config --exists "x11"; then
x11_cflags=`$pkg_config --cflags x11`
x11_libs=`$pkg_config --libs x11`
x11_cflags=$($pkg_config --cflags x11)
x11_libs=$($pkg_config --libs x11)
fi
##########################################
@@ -2155,8 +2171,9 @@ if test "$gtk" != "no"; then
gtkversion="2.18.0"
fi
if $pkg_config --exists "$gtkpackage >= $gtkversion"; then
gtk_cflags=`$pkg_config --cflags $gtkpackage`
gtk_libs=`$pkg_config --libs $gtkpackage`
gtk_cflags=$($pkg_config --cflags $gtkpackage)
gtk_libs=$($pkg_config --libs $gtkpackage)
gtk_version=$($pkg_config --modversion $gtkpackage)
if $pkg_config --exists "$gtkx11package >= $gtkversion"; then
gtk_cflags="$gtk_cflags $x11_cflags"
gtk_libs="$gtk_libs $x11_libs"
@@ -2194,8 +2211,8 @@ gnutls_gcrypt=no
gnutls_nettle=no
if test "$gnutls" != "no"; then
if gnutls_works; then
gnutls_cflags=`$pkg_config --cflags gnutls`
gnutls_libs=`$pkg_config --libs gnutls`
gnutls_cflags=$($pkg_config --cflags gnutls)
gnutls_libs=$($pkg_config --libs gnutls)
libs_softmmu="$gnutls_libs $libs_softmmu"
libs_tools="$gnutls_libs $libs_tools"
QEMU_CFLAGS="$QEMU_CFLAGS $gnutls_cflags"
@@ -2219,7 +2236,7 @@ if test "$gnutls" != "no"; then
gnutls_gcrypt=no
gnutls_nettle=yes
elif $pkg_config --exists 'gnutls >= 2.12'; then
case `$pkg_config --libs --static gnutls` in
case $($pkg_config --libs --static gnutls) in
*gcrypt*)
gnutls_gcrypt=yes
gnutls_nettle=no
@@ -2280,7 +2297,7 @@ has_libgcrypt_config() {
if test -n "$cross_prefix"
then
host=`libgcrypt-config --host`
host=$(libgcrypt-config --host)
if test "$host-" != $cross_prefix
then
return 1
@@ -2292,8 +2309,8 @@ has_libgcrypt_config() {
if test "$gcrypt" != "no"; then
if has_libgcrypt_config; then
gcrypt_cflags=`libgcrypt-config --cflags`
gcrypt_libs=`libgcrypt-config --libs`
gcrypt_cflags=$(libgcrypt-config --cflags)
gcrypt_libs=$(libgcrypt-config --libs)
# Debian has remove -lgpg-error from libgcrypt-config
# as it "spreads unnecessary dependencies" which in
# turn breaks static builds...
@@ -2333,15 +2350,16 @@ fi
if test "$nettle" != "no"; then
if $pkg_config --exists "nettle"; then
nettle_cflags=`$pkg_config --cflags nettle`
nettle_libs=`$pkg_config --libs nettle`
nettle_version=`$pkg_config --modversion nettle`
nettle_cflags=$($pkg_config --cflags nettle)
nettle_libs=$($pkg_config --libs nettle)
nettle_version=$($pkg_config --modversion nettle)
libs_softmmu="$nettle_libs $libs_softmmu"
libs_tools="$nettle_libs $libs_tools"
QEMU_CFLAGS="$QEMU_CFLAGS $nettle_cflags"
nettle="yes"
cat > $TMPC << EOF
#include <stddef.h>
#include <nettle/pbkdf2.h>
int main(void) {
pbkdf2_hmac_sha256(8, NULL, 1000, 8, NULL, 8, NULL);
@@ -2372,8 +2390,8 @@ tasn1=yes
tasn1_cflags=""
tasn1_libs=""
if $pkg_config --exists "libtasn1"; then
tasn1_cflags=`$pkg_config --cflags libtasn1`
tasn1_libs=`$pkg_config --libs libtasn1`
tasn1_cflags=$($pkg_config --cflags libtasn1)
tasn1_libs=$($pkg_config --libs libtasn1)
else
tasn1=no
fi
@@ -2392,20 +2410,25 @@ fi
if test "$vte" != "no"; then
if test "$gtkabi" = "3.0"; then
vtepackage="vte-2.90"
vteversion="0.32.0"
vteminversion="0.32.0"
if $pkg_config --exists "vte-2.91"; then
vtepackage="vte-2.91"
else
vtepackage="vte-2.90"
fi
else
vtepackage="vte"
vteversion="0.24.0"
vteminversion="0.24.0"
fi
if $pkg_config --exists "$vtepackage >= $vteversion"; then
vte_cflags=`$pkg_config --cflags $vtepackage`
vte_libs=`$pkg_config --libs $vtepackage`
if $pkg_config --exists "$vtepackage >= $vteminversion"; then
vte_cflags=$($pkg_config --cflags $vtepackage)
vte_libs=$($pkg_config --libs $vtepackage)
vteversion=$($pkg_config --modversion $vtepackage)
libs_softmmu="$vte_libs $libs_softmmu"
vte="yes"
elif test "$vte" = "yes"; then
if test "$gtkabi" = "3.0"; then
feature_not_found "vte" "Install libvte-2.90 devel"
feature_not_found "vte" "Install libvte-2.90/2.91 devel"
else
feature_not_found "vte" "Install libvte devel"
fi
@@ -2420,25 +2443,37 @@ fi
# Look for sdl configuration program (pkg-config or sdl-config). Try
# sdl-config even without cross prefix, and favour pkg-config over sdl-config.
if test "$sdlabi" = ""; then
if $pkg_config --exists "sdl"; then
sdlabi=1.2
elif $pkg_config --exists "sdl2"; then
sdlabi=2.0
else
sdlabi=1.2
fi
fi
if test $sdlabi = "2.0"; then
sdl_config=$sdl2_config
sdlname=sdl2
sdlconfigname=sdl2_config
else
elif test $sdlabi = "1.2"; then
sdlname=sdl
sdlconfigname=sdl_config
else
error_exit "Unknown sdlabi $sdlabi, must be 1.2 or 2.0"
fi
if test "`basename $sdl_config`" != $sdlconfigname && ! has ${sdl_config}; then
if test "$(basename $sdl_config)" != $sdlconfigname && ! has ${sdl_config}; then
sdl_config=$sdlconfigname
fi
if $pkg_config $sdlname --exists; then
sdlconfig="$pkg_config $sdlname"
_sdlversion=`$sdlconfig --modversion 2>/dev/null | sed 's/[^0-9]//g'`
sdlversion=$($sdlconfig --modversion 2>/dev/null)
elif has ${sdl_config}; then
sdlconfig="$sdl_config"
_sdlversion=`$sdlconfig --version | sed 's/[^0-9]//g'`
sdlversion=$($sdlconfig --version)
else
if test "$sdl" = "yes" ; then
feature_not_found "sdl" "Install SDL devel"
@@ -2456,14 +2491,14 @@ if test "$sdl" != "no" ; then
#undef main /* We don't want SDL to override our main() */
int main( void ) { return SDL_Init (SDL_INIT_VIDEO); }
EOF
sdl_cflags=`$sdlconfig --cflags 2> /dev/null`
sdl_cflags=$($sdlconfig --cflags 2>/dev/null)
if test "$static" = "yes" ; then
sdl_libs=`$sdlconfig --static-libs 2>/dev/null`
sdl_libs=$($sdlconfig --static-libs 2>/dev/null)
else
sdl_libs=`$sdlconfig --libs 2> /dev/null`
sdl_libs=$($sdlconfig --libs 2>/dev/null)
fi
if compile_prog "$sdl_cflags" "$sdl_libs" ; then
if test "$_sdlversion" -lt 121 ; then
if test $(echo $sdlversion | sed 's/[^0-9]//g') -lt 121 ; then
sdl_too_old=yes
else
sdl=yes
@@ -2472,8 +2507,8 @@ EOF
# static link with sdl ? (note: sdl.pc's --static --libs is broken)
if test "$sdl" = "yes" -a "$static" = "yes" ; then
if test $? = 0 && echo $sdl_libs | grep -- -laa > /dev/null; then
sdl_libs="$sdl_libs `aalib-config --static-libs 2>/dev/null`"
sdl_cflags="$sdl_cflags `aalib-config --cflags 2>/dev/null`"
sdl_libs="$sdl_libs $(aalib-config --static-libs 2>/dev/null)"
sdl_cflags="$sdl_cflags $(aalib-config --cflags 2>/dev/null)"
fi
if compile_prog "$sdl_cflags" "$sdl_libs" ; then
:
@@ -2590,8 +2625,8 @@ int main(void) {
}
EOF
if $pkg_config libpng --exists; then
vnc_png_cflags=`$pkg_config libpng --cflags`
vnc_png_libs=`$pkg_config libpng --libs`
vnc_png_cflags=$($pkg_config libpng --cflags)
vnc_png_libs=$($pkg_config libpng --libs)
else
vnc_png_cflags=""
vnc_png_libs="-lpng"
@@ -2785,7 +2820,7 @@ EOF
fi
}
audio_drv_list=`echo "$audio_drv_list" | sed -e 's/,/ /g'`
audio_drv_list=$(echo "$audio_drv_list" | sed -e 's/,/ /g')
for drv in $audio_drv_list; do
case $drv in
alsa)
@@ -2795,8 +2830,8 @@ for drv in $audio_drv_list; do
;;
pa)
audio_drv_probe $drv pulse/mainloop.h "-lpulse" \
"pa_mainloop *m = 0; pa_mainloop_free (m); return 0;"
audio_drv_probe $drv pulse/pulseaudio.h "-lpulse" \
"pa_context_set_source_output_volume(NULL, 0, NULL, NULL, NULL); return 0;"
libs_softmmu="-lpulse $libs_softmmu"
audio_pt_int="yes"
;;
@@ -2897,8 +2932,8 @@ if test "$curl" != "no" ; then
#include <curl/curl.h>
int main(void) { curl_easy_init(); curl_multi_setopt(0, 0, 0); return 0; }
EOF
curl_cflags=`$curlconfig --cflags 2>/dev/null`
curl_libs=`$curlconfig --libs 2>/dev/null`
curl_cflags=$($curlconfig --cflags 2>/dev/null)
curl_libs=$($curlconfig --libs 2>/dev/null)
if compile_prog "$curl_cflags" "$curl_libs" ; then
curl=yes
else
@@ -2916,8 +2951,8 @@ if test "$bluez" != "no" ; then
#include <bluetooth/bluetooth.h>
int main(void) { return bt_error(0); }
EOF
bluez_cflags=`$pkg_config --cflags bluez 2> /dev/null`
bluez_libs=`$pkg_config --libs bluez 2> /dev/null`
bluez_cflags=$($pkg_config --cflags bluez 2>/dev/null)
bluez_libs=$($pkg_config --libs bluez 2>/dev/null)
if compile_prog "$bluez_cflags" "$bluez_libs" ; then
bluez=yes
libs_softmmu="$bluez_libs $libs_softmmu"
@@ -2940,8 +2975,8 @@ fi
for i in $glib_modules; do
if $pkg_config --atleast-version=$glib_req_ver $i; then
glib_cflags=`$pkg_config --cflags $i`
glib_libs=`$pkg_config --libs $i`
glib_cflags=$($pkg_config --cflags $i)
glib_libs=$($pkg_config --libs $i)
CFLAGS="$glib_cflags $CFLAGS"
LIBS="$glib_libs $LIBS"
libs_qga="$glib_libs $libs_qga"
@@ -2967,7 +3002,7 @@ int main(void) {
}
EOF
if ! compile_prog "-Werror $CFLAGS" "$LIBS" ; then
if ! compile_prog "$CFLAGS" "$LIBS" ; then
error_exit "sizeof(size_t) doesn't match GLIB_SIZEOF_SIZE_T."\
"You probably need to set PKG_CONFIG_LIBDIR"\
"to point to the right pkg-config files for your"\
@@ -3030,8 +3065,8 @@ if test "$pixman" = "none"; then
pixman_libs=
elif test "$pixman" = "system"; then
# pixman version has been checked above
pixman_cflags=`$pkg_config --cflags pixman-1`
pixman_libs=`$pkg_config --libs pixman-1`
pixman_cflags=$($pkg_config --cflags pixman-1)
pixman_libs=$($pkg_config --libs pixman-1)
else
if test ! -d ${source_path}/pixman/pixman; then
error_exit "pixman >= 0.21.8 not present. Your options:" \
@@ -3147,8 +3182,8 @@ fi
min_libssh2_version=1.2.8
if test "$libssh2" != "no" ; then
if $pkg_config --atleast-version=$min_libssh2_version libssh2; then
libssh2_cflags=`$pkg_config libssh2 --cflags`
libssh2_libs=`$pkg_config libssh2 --libs`
libssh2_cflags=$($pkg_config libssh2 --cflags)
libssh2_libs=$($pkg_config libssh2 --libs)
libssh2=yes
else
if test "$libssh2" = "yes" ; then
@@ -3399,8 +3434,8 @@ fi
if test "$glusterfs" != "no" ; then
if $pkg_config --atleast-version=3 glusterfs-api; then
glusterfs="yes"
glusterfs_cflags=`$pkg_config --cflags glusterfs-api`
glusterfs_libs=`$pkg_config --libs glusterfs-api`
glusterfs_cflags=$($pkg_config --cflags glusterfs-api)
glusterfs_libs=$($pkg_config --libs glusterfs-api)
if $pkg_config --atleast-version=4 glusterfs-api; then
glusterfs_xlator_opt="yes"
fi
@@ -3780,8 +3815,8 @@ if compile_prog "" "" ; then
epoll=yes
fi
# epoll_create1 and epoll_pwait are later additions
# so we must check separately for their presence
# epoll_create1 is a later addition
# so we must check separately for its presence
epoll_create1=no
cat > $TMPC << EOF
#include <sys/epoll.h>
@@ -3803,20 +3838,6 @@ if compile_prog "" "" ; then
epoll_create1=yes
fi
epoll_pwait=no
cat > $TMPC << EOF
#include <sys/epoll.h>
int main(void)
{
epoll_pwait(0, 0, 0, 0, 0);
return 0;
}
EOF
if compile_prog "" "" ; then
epoll_pwait=yes
fi
# check for sendfile support
sendfile=no
cat > $TMPC << EOF
@@ -4171,24 +4192,6 @@ if compile_prog "" "" ; then
posix_madvise=yes
fi
##########################################
# check if we have usable SIGEV_THREAD_ID
sigev_thread_id=no
cat > $TMPC << EOF
#include <signal.h>
int main(void) {
struct sigevent ev;
ev.sigev_notify = SIGEV_THREAD_ID;
ev._sigev_un._tid = 0;
asm volatile("" : : "g"(&ev));
return 0;
}
EOF
if compile_prog "" "" ; then
sigev_thread_id=yes
fi
##########################################
# check if trace backend exists
@@ -4207,12 +4210,12 @@ int main(void) { return 0; }
EOF
if compile_prog "" "" ; then
if $pkg_config lttng-ust --exists; then
lttng_ust_libs=`$pkg_config --libs lttng-ust`
lttng_ust_libs=$($pkg_config --libs lttng-ust)
else
lttng_ust_libs="-llttng-ust"
fi
if $pkg_config liburcu-bp --exists; then
urcu_bp_libs=`$pkg_config --libs liburcu-bp`
urcu_bp_libs=$($pkg_config --libs liburcu-bp)
else
urcu_bp_libs="-lurcu-bp"
fi
@@ -4508,6 +4511,38 @@ if compile_prog "" "" ; then
have_fsxattr=yes
fi
##########################################
# check if rtnetlink.h exists and is useful
have_rtnetlink=no
cat > $TMPC << EOF
#include <linux/rtnetlink.h>
int main(void) {
return IFLA_PROTO_DOWN;
}
EOF
if compile_prog "" "" ; then
have_rtnetlink=yes
fi
#################################################
# Sparc implicitly links with --relax, which is
# incompatible with -r, so --no-relax should be
# given. It does no harm to give it on other
# platforms too.
# Note: the prototype is needed since QEMU_CFLAGS
# contains -Wmissing-prototypes
cat > $TMPC << EOF
extern int foo(void);
int foo(void) { return 0; }
EOF
if ! compile_object ""; then
error_exit "Failed to compile object file for LD_REL_FLAGS test"
fi
if do_cc -nostdlib -Wl,-r -Wl,--no-relax -o $TMPMO $TMPO; then
LD_REL_FLAGS="-Wl,--no-relax"
fi
##########################################
# End of CC checks
# After here, no more $cc or $ld runs
@@ -4536,16 +4571,6 @@ if test "$libnfs" != "no" ; then
fi
fi
# Disable zero malloc errors for official releases unless explicitly told to
# enable/disable
if test -z "$zero_malloc" ; then
if test "$z_version" = "50" ; then
zero_malloc="no"
else
zero_malloc="yes"
fi
fi
# Now we've finished running tests it's OK to add -Werror to the compiler flags
if test "$werror" = "yes"; then
QEMU_CFLAGS="-Werror $QEMU_CFLAGS"
@@ -4593,7 +4618,7 @@ if test "$softmmu" = yes ; then
tools="$tools fsdev/virtfs-proxy-helper\$(EXESUF)"
else
if test "$virtfs" = yes; then
error_exit "VirtFS is supported only on Linux and requires libcap-devel and libattr-devel"
error_exit "VirtFS is supported only on Linux and requires libcap devel and libattr devel"
fi
virtfs=no
fi
@@ -4652,10 +4677,10 @@ if test "$guest_agent_msi" = "yes"; then
fi
if test "$QEMU_GA_VERSION" = ""; then
QEMU_GA_VERSION=`cat $source_path/VERSION`
QEMU_GA_VERSION=$(cat $source_path/VERSION)
fi
QEMU_GA_MSI_MINGW_DLL_PATH="-D Mingw_dlls=`$pkg_config --variable=prefix glib-2.0`/bin"
QEMU_GA_MSI_MINGW_DLL_PATH="-D Mingw_dlls=$($pkg_config --variable=prefix glib-2.0)/bin"
case "$cpu" in
x86_64)
@@ -4686,7 +4711,7 @@ if test "$cpu" = "s390x" ; then
fi
# Probe for the need for relocating the user-only binary.
if test "$pie" = "no" ; then
if ( [ "$linux_user" = yes ] || [ "$bsd_user" = yes ] ) && [ "$pie" = no ]; then
textseg_addr=
case "$cpu" in
arm | i386 | ppc* | s390* | sparc* | x86_64 | x32)
@@ -4708,6 +4733,16 @@ EOF
# In case ld does not support -Ttext-segment, edit the default linker
# script via sed to set the .text start addr. This is needed on FreeBSD
# at least.
if ! $ld --verbose >/dev/null 2>&1; then
error_exit \
"We need to link the QEMU user mode binaries at a" \
"specific text address. Unfortunately your linker" \
"doesn't support either the -Ttext-segment option or" \
"printing the default linker script with --verbose." \
"If you don't want the user mode binaries, pass the" \
"--disable-user option to configure."
fi
$ld --verbose | sed \
-e '1,/==================================================/d' \
-e '/==================================================/,$d' \
@@ -4718,21 +4753,27 @@ EOF
fi
fi
echo_version() {
if test "$1" = "yes" ; then
echo "($2)"
fi
}
# prepend pixman and ftd flags after all config tests are done
QEMU_CFLAGS="$pixman_cflags $fdt_cflags $QEMU_CFLAGS"
libs_softmmu="$pixman_libs $libs_softmmu"
echo "Install prefix $prefix"
echo "BIOS directory `eval echo $qemu_datadir`"
echo "binary directory `eval echo $bindir`"
echo "library directory `eval echo $libdir`"
echo "module directory `eval echo $qemu_moddir`"
echo "libexec directory `eval echo $libexecdir`"
echo "include directory `eval echo $includedir`"
echo "config directory `eval echo $sysconfdir`"
echo "BIOS directory $(eval echo $qemu_datadir)"
echo "binary directory $(eval echo $bindir)"
echo "library directory $(eval echo $libdir)"
echo "module directory $(eval echo $qemu_moddir)"
echo "libexec directory $(eval echo $libexecdir)"
echo "include directory $(eval echo $includedir)"
echo "config directory $(eval echo $sysconfdir)"
if test "$mingw32" = "no" ; then
echo "local state directory `eval echo $local_statedir`"
echo "Manual directory `eval echo $mandir`"
echo "local state directory $(eval echo $local_statedir)"
echo "Manual directory $(eval echo $mandir)"
echo "ELF interp prefix $interp_prefix"
else
echo "local state directory queried at runtime"
@@ -4767,22 +4808,18 @@ if test "$darwin" = "yes" ; then
echo "Cocoa support $cocoa"
fi
echo "pixman $pixman"
echo "SDL support $sdl"
echo "GTK support $gtk"
echo "SDL support $sdl $(echo_version $sdl $sdlversion)"
echo "GTK support $gtk $(echo_version $gtk $gtk_version)"
echo "GTK GL support $gtk_gl"
echo "VTE support $vte $(echo_version $vte $vteversion)"
echo "GNUTLS support $gnutls"
echo "GNUTLS hash $gnutls_hash"
echo "GNUTLS rnd $gnutls_rnd"
echo "libgcrypt $gcrypt"
echo "libgcrypt kdf $gcrypt_kdf"
if test "$nettle" = "yes"; then
echo "nettle $nettle ($nettle_version)"
else
echo "nettle $nettle"
fi
echo "nettle $nettle $(echo_version $nettle $nettle_version)"
echo "nettle kdf $nettle_kdf"
echo "libtasn1 $tasn1"
echo "VTE support $vte"
echo "curses support $curses"
echo "virgl support $virglrenderer"
echo "curl support $curl"
@@ -4822,7 +4859,6 @@ echo "preadv support $preadv"
echo "fdatasync $fdatasync"
echo "madvise $madvise"
echo "posix_madvise $posix_madvise"
echo "sigev_thread_id $sigev_thread_id"
echo "uuid support $uuid"
echo "libcap-ng support $cap_ng"
echo "vhost-net support $vhost_net"
@@ -4831,11 +4867,7 @@ echo "Trace backends $trace_backends"
if have_backend "simple"; then
echo "Trace output file $trace_file-<pid>"
fi
if test "$spice" = "yes"; then
echo "spice support $spice ($spice_protocol_version/$spice_server_version)"
else
echo "spice support $spice"
fi
echo "spice support $spice $(echo_version $spice $spice_protocol_version/$spice_server_version)"
echo "rbd support $rbd"
echo "xfsctl support $xfs"
echo "smartcard support $smartcard"
@@ -4914,7 +4946,7 @@ if test "$bigendian" = "yes" ; then
fi
if test "$mingw32" = "yes" ; then
echo "CONFIG_WIN32=y" >> $config_host_mak
rc_version=`cat $source_path/VERSION`
rc_version=$(cat $source_path/VERSION)
version_major=${rc_version%%.*}
rc_version=${rc_version#*.}
version_minor=${rc_version%%.*}
@@ -4990,7 +5022,7 @@ if test "$cap_ng" = "yes" ; then
fi
echo "CONFIG_AUDIO_DRIVERS=$audio_drv_list" >> $config_host_mak
for drv in $audio_drv_list; do
def=CONFIG_`echo $drv | LC_ALL=C tr '[a-z]' '[A-Z]'`
def=CONFIG_$(echo $drv | LC_ALL=C tr '[a-z]' '[A-Z]')
echo "$def=y" >> $config_host_mak
done
if test "$audio_pt_int" = "yes" ; then
@@ -5022,7 +5054,7 @@ fi
if test "$xfs" = "yes" ; then
echo "CONFIG_XFS=y" >> $config_host_mak
fi
qemu_version=`head $source_path/VERSION`
qemu_version=$(head $source_path/VERSION)
echo "VERSION=$qemu_version" >>$config_host_mak
echo "PKGVERSION=$pkgversion" >>$config_host_mak
echo "SRC_PATH=$source_path" >> $config_host_mak
@@ -5033,7 +5065,7 @@ fi
if test "$modules" = "yes"; then
# $shacmd can generate a hash started with digit, which the compiler doesn't
# like as an symbol. So prefix it with an underscore
echo "CONFIG_STAMP=_`(echo $qemu_version; echo $pkgversion; cat $0) | $shacmd - | cut -f1 -d\ `" >> $config_host_mak
echo "CONFIG_STAMP=_$( (echo $qemu_version; echo $pkgversion; cat $0) | $shacmd - | cut -f1 -d\ )" >> $config_host_mak
echo "CONFIG_MODULES=y" >> $config_host_mak
fi
if test "$sdl" = "yes" ; then
@@ -5098,9 +5130,6 @@ fi
if test "$epoll_create1" = "yes" ; then
echo "CONFIG_EPOLL_CREATE1=y" >> $config_host_mak
fi
if test "$epoll_pwait" = "yes" ; then
echo "CONFIG_EPOLL_PWAIT=y" >> $config_host_mak
fi
if test "$sendfile" = "yes" ; then
echo "CONFIG_SENDFILE=y" >> $config_host_mak
fi
@@ -5244,9 +5273,6 @@ fi
if test "$posix_madvise" = "yes" ; then
echo "CONFIG_POSIX_MADVISE=y" >> $config_host_mak
fi
if test "$sigev_thread_id" = "yes" ; then
echo "CONFIG_SIGEV_THREAD_ID=y" >> $config_host_mak
fi
if test "$spice" = "yes" ; then
echo "CONFIG_SPICE=y" >> $config_host_mak
@@ -5309,9 +5335,6 @@ if [ "$bsd" = "yes" ] ; then
echo "CONFIG_BSD=y" >> $config_host_mak
fi
if test "$zero_malloc" = "yes" ; then
echo "CONFIG_ZERO_MALLOC=y" >> $config_host_mak
fi
if test "$localtime_r" = "yes" ; then
echo "CONFIG_LOCALTIME_R=y" >> $config_host_mak
fi
@@ -5445,6 +5468,10 @@ if test "$rdma" = "yes" ; then
echo "CONFIG_RDMA=y" >> $config_host_mak
fi
if test "$have_rtnetlink" = "yes" ; then
echo "CONFIG_RTNETLINK=y" >> $config_host_mak
fi
# Hold two types of flag:
# CONFIG_THREAD_SETNAME_BYTHREAD - we've got a way of setting the name on
# a thread we have a handle to
@@ -5513,6 +5540,7 @@ else
fi
echo "LDFLAGS=$LDFLAGS" >> $config_host_mak
echo "LDFLAGS_NOPIE=$LDFLAGS_NOPIE" >> $config_host_mak
echo "LD_REL_FLAGS=$LD_REL_FLAGS" >> $config_host_mak
echo "LIBS+=$LIBS" >> $config_host_mak
echo "LIBS_TOOLS+=$libs_tools" >> $config_host_mak
echo "EXESUF=$EXESUF" >> $config_host_mak
@@ -5561,7 +5589,7 @@ fi
for target in $target_list; do
target_dir="$target"
config_target_mak=$target_dir/config-target.mak
target_name=`echo $target | cut -d '-' -f 1`
target_name=$(echo $target | cut -d '-' -f 1)
target_bigendian="no"
case "$target_name" in
@@ -5601,7 +5629,7 @@ mkdir -p $target_dir
echo "# Automatically generated by configure - do not modify" > $config_target_mak
bflt="no"
interp_prefix1=`echo "$interp_prefix" | sed "s/%M/$target_name/g"`
interp_prefix1=$(echo "$interp_prefix" | sed "s/%M/$target_name/g")
gdb_xml_files=""
TARGET_ARCH="$target_name"
@@ -5727,7 +5755,7 @@ upper() {
echo "$@"| LC_ALL=C tr '[a-z]' '[A-Z]'
}
target_arch_name="`upper $TARGET_ARCH`"
target_arch_name="$(upper $TARGET_ARCH)"
echo "TARGET_$target_arch_name=y" >> $config_target_mak
echo "TARGET_NAME=$target_name" >> $config_target_mak
echo "TARGET_BASE_ARCH=$TARGET_BASE_ARCH" >> $config_target_mak
@@ -5943,11 +5971,11 @@ for bios_file in \
$source_path/pc-bios/u-boot.* \
$source_path/pc-bios/palcode-*
do
FILES="$FILES pc-bios/`basename $bios_file`"
FILES="$FILES pc-bios/$(basename $bios_file)"
done
for test_file in `find $source_path/tests/acpi-test-data -type f`
for test_file in $(find $source_path/tests/acpi-test-data -type f)
do
FILES="$FILES tests/acpi-test-data`echo $test_file | sed -e 's/.*acpi-test-data//'`"
FILES="$FILES tests/acpi-test-data$(echo $test_file | sed -e 's/.*acpi-test-data//')"
done
mkdir -p $DIRS
for f in $FILES ; do

View File

@@ -7,9 +7,9 @@
*/
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "qemu/host-utils.h"
#include "qemu/sockets.h"
#include <sys/mman.h>
#include <sys/socket.h>
#include <sys/un.h>

View File

@@ -15,7 +15,7 @@
* unix socket. For each client, the server will create some eventfd
* (see EVENTFD(2)), one per vector. These fd are transmitted to all
* clients using the SCM_RIGHTS cmsg message. Therefore, each client is
* able to send a notification to another client without beeing
* able to send a notification to another client without being
* "profixied" by the server.
*
* We use this mechanism to send interruptions between guests.

View File

@@ -20,16 +20,14 @@
#include "qemu/osdep.h"
#include "cpu.h"
#include "sysemu/cpus.h"
#include "exec/exec-all.h"
#include "exec/memory-internal.h"
bool exit_request;
CPUState *tcg_current_cpu;
/* exit the current TB from a signal handler. The host registers are
restored in a state compatible with the CPU emulator
*/
#if defined(CONFIG_SOFTMMU)
void cpu_resume_from_signal(CPUState *cpu, void *puc)
/* exit the current TB, but without causing any exception to be raised */
void cpu_loop_exit_noexc(CPUState *cpu)
{
/* XXX: restore cpu registers saved in host registers */
@@ -37,6 +35,7 @@ void cpu_resume_from_signal(CPUState *cpu, void *puc)
siglongjmp(cpu->jmp_env, 1);
}
#if defined(CONFIG_SOFTMMU)
void cpu_reloading_memory_map(void)
{
if (qemu_in_vcpu_thread()) {
@@ -68,7 +67,6 @@ void cpu_reloading_memory_map(void)
void cpu_loop_exit(CPUState *cpu)
{
cpu->current_tb = NULL;
siglongjmp(cpu->jmp_env, 1);
}
@@ -77,6 +75,5 @@ void cpu_loop_exit_restore(CPUState *cpu, uintptr_t pc)
if (pc) {
cpu_restore_state(cpu, pc);
}
cpu->current_tb = NULL;
siglongjmp(cpu->jmp_env, 1);
}

View File

@@ -20,6 +20,7 @@
#include "cpu.h"
#include "trace.h"
#include "disas/disas.h"
#include "exec/exec-all.h"
#include "tcg.h"
#include "qemu/atomic.h"
#include "sysemu/qtest.h"
@@ -136,7 +137,9 @@ static void init_delay_params(SyncClocks *sc, const CPUState *cpu)
static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, TranslationBlock *itb)
{
CPUArchState *env = cpu->env_ptr;
uintptr_t next_tb;
uintptr_t ret;
TranslationBlock *last_tb;
int tb_exit;
uint8_t *tb_ptr = itb->tc_ptr;
qemu_log_mask_and_addr(CPU_LOG_EXEC, itb->pc,
@@ -160,118 +163,125 @@ static inline tcg_target_ulong cpu_tb_exec(CPUState *cpu, TranslationBlock *itb)
#endif /* DEBUG_DISAS */
cpu->can_do_io = !use_icount;
next_tb = tcg_qemu_tb_exec(env, tb_ptr);
ret = tcg_qemu_tb_exec(env, tb_ptr);
cpu->can_do_io = 1;
trace_exec_tb_exit((void *) (next_tb & ~TB_EXIT_MASK),
next_tb & TB_EXIT_MASK);
last_tb = (TranslationBlock *)(ret & ~TB_EXIT_MASK);
tb_exit = ret & TB_EXIT_MASK;
trace_exec_tb_exit(last_tb, tb_exit);
if ((next_tb & TB_EXIT_MASK) > TB_EXIT_IDX1) {
if (tb_exit > TB_EXIT_IDX1) {
/* We didn't start executing this TB (eg because the instruction
* counter hit zero); we must restore the guest PC to the address
* of the start of the TB.
*/
CPUClass *cc = CPU_GET_CLASS(cpu);
TranslationBlock *tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK);
qemu_log_mask_and_addr(CPU_LOG_EXEC, itb->pc,
qemu_log_mask_and_addr(CPU_LOG_EXEC, last_tb->pc,
"Stopped execution of TB chain before %p ["
TARGET_FMT_lx "] %s\n",
itb->tc_ptr, itb->pc, lookup_symbol(itb->pc));
last_tb->tc_ptr, last_tb->pc,
lookup_symbol(last_tb->pc));
if (cc->synchronize_from_tb) {
cc->synchronize_from_tb(cpu, tb);
cc->synchronize_from_tb(cpu, last_tb);
} else {
assert(cc->set_pc);
cc->set_pc(cpu, tb->pc);
cc->set_pc(cpu, last_tb->pc);
}
}
if ((next_tb & TB_EXIT_MASK) == TB_EXIT_REQUESTED) {
if (tb_exit == TB_EXIT_REQUESTED) {
/* We were asked to stop executing TBs (probably a pending
* interrupt. We've now stopped, so clear the flag.
*/
cpu->tcg_exit_req = 0;
}
return next_tb;
return ret;
}
#ifndef CONFIG_USER_ONLY
/* Execute the code without caching the generated code. An interpreter
could be used if available. */
static void cpu_exec_nocache(CPUState *cpu, int max_cycles,
TranslationBlock *orig_tb, bool ignore_icount)
{
TranslationBlock *tb;
bool old_tb_flushed;
/* Should never happen.
We only end up here when an existing TB is too long. */
if (max_cycles > CF_COUNT_MASK)
max_cycles = CF_COUNT_MASK;
old_tb_flushed = cpu->tb_flushed;
cpu->tb_flushed = false;
tb = tb_gen_code(cpu, orig_tb->pc, orig_tb->cs_base, orig_tb->flags,
max_cycles | CF_NOCACHE
| (ignore_icount ? CF_IGNORE_ICOUNT : 0));
tb->orig_tb = tcg_ctx.tb_ctx.tb_invalidated_flag ? NULL : orig_tb;
cpu->current_tb = tb;
tb->orig_tb = cpu->tb_flushed ? NULL : orig_tb;
cpu->tb_flushed |= old_tb_flushed;
/* execute the generated code */
trace_exec_tb_nocache(tb, tb->pc);
cpu_tb_exec(cpu, tb);
cpu->current_tb = NULL;
tb_phys_invalidate(tb, -1);
tb_free(tb);
}
#endif
struct tb_desc {
target_ulong pc;
target_ulong cs_base;
CPUArchState *env;
tb_page_addr_t phys_page1;
uint32_t flags;
};
static bool tb_cmp(const void *p, const void *d)
{
const TranslationBlock *tb = p;
const struct tb_desc *desc = d;
if (tb->pc == desc->pc &&
tb->page_addr[0] == desc->phys_page1 &&
tb->cs_base == desc->cs_base &&
tb->flags == desc->flags) {
/* check next page if needed */
if (tb->page_addr[1] == -1) {
return true;
} else {
tb_page_addr_t phys_page2;
target_ulong virt_page2;
virt_page2 = (desc->pc & TARGET_PAGE_MASK) + TARGET_PAGE_SIZE;
phys_page2 = get_page_addr_code(desc->env, virt_page2);
if (tb->page_addr[1] == phys_page2) {
return true;
}
}
}
return false;
}
static TranslationBlock *tb_find_physical(CPUState *cpu,
target_ulong pc,
target_ulong cs_base,
uint64_t flags)
uint32_t flags)
{
CPUArchState *env = (CPUArchState *)cpu->env_ptr;
TranslationBlock *tb, **ptb1;
unsigned int h;
tb_page_addr_t phys_pc, phys_page1;
target_ulong virt_page2;
tb_page_addr_t phys_pc;
struct tb_desc desc;
uint32_t h;
tcg_ctx.tb_ctx.tb_invalidated_flag = 0;
/* find translated block using physical mappings */
phys_pc = get_page_addr_code(env, pc);
phys_page1 = phys_pc & TARGET_PAGE_MASK;
h = tb_phys_hash_func(phys_pc);
ptb1 = &tcg_ctx.tb_ctx.tb_phys_hash[h];
for(;;) {
tb = *ptb1;
if (!tb) {
return NULL;
}
if (tb->pc == pc &&
tb->page_addr[0] == phys_page1 &&
tb->cs_base == cs_base &&
tb->flags == flags) {
/* check next page if needed */
if (tb->page_addr[1] != -1) {
tb_page_addr_t phys_page2;
virt_page2 = (pc & TARGET_PAGE_MASK) +
TARGET_PAGE_SIZE;
phys_page2 = get_page_addr_code(env, virt_page2);
if (tb->page_addr[1] == phys_page2) {
break;
}
} else {
break;
}
}
ptb1 = &tb->phys_hash_next;
}
/* Move the TB to the head of the list */
*ptb1 = tb->phys_hash_next;
tb->phys_hash_next = tcg_ctx.tb_ctx.tb_phys_hash[h];
tcg_ctx.tb_ctx.tb_phys_hash[h] = tb;
return tb;
desc.env = (CPUArchState *)cpu->env_ptr;
desc.cs_base = cs_base;
desc.flags = flags;
desc.pc = pc;
phys_pc = get_page_addr_code(desc.env, pc);
desc.phys_page1 = phys_pc & TARGET_PAGE_MASK;
h = tb_hash_func(phys_pc, pc, flags);
return qht_lookup(&tcg_ctx.tb_ctx.htable, tb_cmp, &desc, h);
}
static TranslationBlock *tb_find_slow(CPUState *cpu,
target_ulong pc,
target_ulong cs_base,
uint64_t flags)
uint32_t flags)
{
TranslationBlock *tb;
@@ -309,26 +319,72 @@ found:
return tb;
}
static inline TranslationBlock *tb_find_fast(CPUState *cpu)
static inline TranslationBlock *tb_find_fast(CPUState *cpu,
TranslationBlock **last_tb,
int tb_exit)
{
CPUArchState *env = (CPUArchState *)cpu->env_ptr;
TranslationBlock *tb;
target_ulong cs_base, pc;
int flags;
uint32_t flags;
/* we record a subset of the CPU state. It will
always be the same before a given translated block
is executed. */
cpu_get_tb_cpu_state(env, &pc, &cs_base, &flags);
tb_lock();
tb = cpu->tb_jmp_cache[tb_jmp_cache_hash_func(pc)];
if (unlikely(!tb || tb->pc != pc || tb->cs_base != cs_base ||
tb->flags != flags)) {
tb = tb_find_slow(cpu, pc, cs_base, flags);
}
if (cpu->tb_flushed) {
/* Ensure that no TB jump will be modified as the
* translation buffer has been flushed.
*/
*last_tb = NULL;
cpu->tb_flushed = false;
}
#ifndef CONFIG_USER_ONLY
/* We don't take care of direct jumps when address mapping changes in
* system emulation. So it's not safe to make a direct jump to a TB
* spanning two pages because the mapping for the second page can change.
*/
if (tb->page_addr[1] != -1) {
*last_tb = NULL;
}
#endif
/* See if we can patch the calling TB. */
if (*last_tb && !qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
tb_add_jump(*last_tb, tb_exit, tb);
}
tb_unlock();
return tb;
}
static void cpu_handle_debug_exception(CPUState *cpu)
static inline bool cpu_handle_halt(CPUState *cpu)
{
if (cpu->halted) {
#if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
if ((cpu->interrupt_request & CPU_INTERRUPT_POLL)
&& replay_interrupt()) {
X86CPU *x86_cpu = X86_CPU(cpu);
apic_poll_irq(x86_cpu->apic_state);
cpu_reset_interrupt(cpu, CPU_INTERRUPT_POLL);
}
#endif
if (!cpu_has_work(cpu)) {
current_cpu = NULL;
return true;
}
cpu->halted = 0;
}
return false;
}
static inline void cpu_handle_debug_exception(CPUState *cpu)
{
CPUClass *cc = CPU_GET_CLASS(cpu);
CPUWatchpoint *wp;
@@ -342,37 +398,197 @@ static void cpu_handle_debug_exception(CPUState *cpu)
cc->debug_excp_handler(cpu);
}
static inline bool cpu_handle_exception(CPUState *cpu, int *ret)
{
if (cpu->exception_index >= 0) {
if (cpu->exception_index >= EXCP_INTERRUPT) {
/* exit request from the cpu execution loop */
*ret = cpu->exception_index;
if (*ret == EXCP_DEBUG) {
cpu_handle_debug_exception(cpu);
}
cpu->exception_index = -1;
return true;
} else {
#if defined(CONFIG_USER_ONLY)
/* if user mode only, we simulate a fake exception
which will be handled outside the cpu execution
loop */
#if defined(TARGET_I386)
CPUClass *cc = CPU_GET_CLASS(cpu);
cc->do_interrupt(cpu);
#endif
*ret = cpu->exception_index;
cpu->exception_index = -1;
return true;
#else
if (replay_exception()) {
CPUClass *cc = CPU_GET_CLASS(cpu);
cc->do_interrupt(cpu);
cpu->exception_index = -1;
} else if (!replay_has_interrupt()) {
/* give a chance to iothread in replay mode */
*ret = EXCP_INTERRUPT;
return true;
}
#endif
}
#ifndef CONFIG_USER_ONLY
} else if (replay_has_exception()
&& cpu->icount_decr.u16.low + cpu->icount_extra == 0) {
/* try to cause an exception pending in the log */
TranslationBlock *last_tb = NULL; /* Avoid chaining TBs */
cpu_exec_nocache(cpu, 1, tb_find_fast(cpu, &last_tb, 0), true);
*ret = -1;
return true;
#endif
}
return false;
}
static inline void cpu_handle_interrupt(CPUState *cpu,
TranslationBlock **last_tb)
{
CPUClass *cc = CPU_GET_CLASS(cpu);
int interrupt_request = cpu->interrupt_request;
if (unlikely(interrupt_request)) {
if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
/* Mask out external interrupts for this step. */
interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
}
if (interrupt_request & CPU_INTERRUPT_DEBUG) {
cpu->interrupt_request &= ~CPU_INTERRUPT_DEBUG;
cpu->exception_index = EXCP_DEBUG;
cpu_loop_exit(cpu);
}
if (replay_mode == REPLAY_MODE_PLAY && !replay_has_interrupt()) {
/* Do nothing */
} else if (interrupt_request & CPU_INTERRUPT_HALT) {
replay_interrupt();
cpu->interrupt_request &= ~CPU_INTERRUPT_HALT;
cpu->halted = 1;
cpu->exception_index = EXCP_HLT;
cpu_loop_exit(cpu);
}
#if defined(TARGET_I386)
else if (interrupt_request & CPU_INTERRUPT_INIT) {
X86CPU *x86_cpu = X86_CPU(cpu);
CPUArchState *env = &x86_cpu->env;
replay_interrupt();
cpu_svm_check_intercept_param(env, SVM_EXIT_INIT, 0);
do_cpu_init(x86_cpu);
cpu->exception_index = EXCP_HALTED;
cpu_loop_exit(cpu);
}
#else
else if (interrupt_request & CPU_INTERRUPT_RESET) {
replay_interrupt();
cpu_reset(cpu);
cpu_loop_exit(cpu);
}
#endif
/* The target hook has 3 exit conditions:
False when the interrupt isn't processed,
True when it is, and we should restart on a new TB,
and via longjmp via cpu_loop_exit. */
else {
replay_interrupt();
if (cc->cpu_exec_interrupt(cpu, interrupt_request)) {
*last_tb = NULL;
}
/* The target hook may have updated the 'cpu->interrupt_request';
* reload the 'interrupt_request' value */
interrupt_request = cpu->interrupt_request;
}
if (interrupt_request & CPU_INTERRUPT_EXITTB) {
cpu->interrupt_request &= ~CPU_INTERRUPT_EXITTB;
/* ensure that no TB jump will be modified as
the program flow was changed */
*last_tb = NULL;
}
}
if (unlikely(cpu->exit_request || replay_has_interrupt())) {
cpu->exit_request = 0;
cpu->exception_index = EXCP_INTERRUPT;
cpu_loop_exit(cpu);
}
}
static inline void cpu_loop_exec_tb(CPUState *cpu, TranslationBlock *tb,
TranslationBlock **last_tb, int *tb_exit,
SyncClocks *sc)
{
uintptr_t ret;
if (unlikely(cpu->exit_request)) {
return;
}
trace_exec_tb(tb, tb->pc);
ret = cpu_tb_exec(cpu, tb);
*last_tb = (TranslationBlock *)(ret & ~TB_EXIT_MASK);
*tb_exit = ret & TB_EXIT_MASK;
switch (*tb_exit) {
case TB_EXIT_REQUESTED:
/* Something asked us to stop executing
* chained TBs; just continue round the main
* loop. Whatever requested the exit will also
* have set something else (eg exit_request or
* interrupt_request) which we will handle
* next time around the loop. But we need to
* ensure the tcg_exit_req read in generated code
* comes before the next read of cpu->exit_request
* or cpu->interrupt_request.
*/
smp_rmb();
*last_tb = NULL;
break;
case TB_EXIT_ICOUNT_EXPIRED:
{
/* Instruction counter expired. */
#ifdef CONFIG_USER_ONLY
abort();
#else
int insns_left = cpu->icount_decr.u32;
if (cpu->icount_extra && insns_left >= 0) {
/* Refill decrementer and continue execution. */
cpu->icount_extra += insns_left;
insns_left = MIN(0xffff, cpu->icount_extra);
cpu->icount_extra -= insns_left;
cpu->icount_decr.u16.low = insns_left;
} else {
if (insns_left > 0) {
/* Execute remaining instructions. */
cpu_exec_nocache(cpu, insns_left, *last_tb, false);
align_clocks(sc, cpu);
}
cpu->exception_index = EXCP_INTERRUPT;
*last_tb = NULL;
cpu_loop_exit(cpu);
}
break;
#endif
}
default:
break;
}
}
/* main execution loop */
int cpu_exec(CPUState *cpu)
{
CPUClass *cc = CPU_GET_CLASS(cpu);
#ifdef TARGET_I386
X86CPU *x86_cpu = X86_CPU(cpu);
CPUArchState *env = &x86_cpu->env;
#endif
int ret, interrupt_request;
TranslationBlock *tb;
uintptr_t next_tb;
int ret;
SyncClocks sc;
/* replay_interrupt may need current_cpu */
current_cpu = cpu;
if (cpu->halted) {
#if defined(TARGET_I386) && !defined(CONFIG_USER_ONLY)
if ((cpu->interrupt_request & CPU_INTERRUPT_POLL)
&& replay_interrupt()) {
apic_poll_irq(x86_cpu->apic_state);
cpu_reset_interrupt(cpu, CPU_INTERRUPT_POLL);
}
#endif
if (!cpu_has_work(cpu)) {
current_cpu = NULL;
return EXCP_HALTED;
}
cpu->halted = 0;
if (cpu_handle_halt(cpu)) {
return EXCP_HALTED;
}
atomic_mb_set(&tcg_current_cpu, cpu);
@@ -391,185 +607,26 @@ int cpu_exec(CPUState *cpu)
*/
init_delay_params(&sc, cpu);
/* prepare setjmp context for exception handling */
for(;;) {
TranslationBlock *tb, *last_tb;
int tb_exit = 0;
/* prepare setjmp context for exception handling */
if (sigsetjmp(cpu->jmp_env, 0) == 0) {
/* if an exception is pending, we execute it here */
if (cpu->exception_index >= 0) {
if (cpu->exception_index >= EXCP_INTERRUPT) {
/* exit request from the cpu execution loop */
ret = cpu->exception_index;
if (ret == EXCP_DEBUG) {
cpu_handle_debug_exception(cpu);
}
cpu->exception_index = -1;
break;
} else {
#if defined(CONFIG_USER_ONLY)
/* if user mode only, we simulate a fake exception
which will be handled outside the cpu execution
loop */
#if defined(TARGET_I386)
cc->do_interrupt(cpu);
#endif
ret = cpu->exception_index;
cpu->exception_index = -1;
break;
#else
if (replay_exception()) {
cc->do_interrupt(cpu);
cpu->exception_index = -1;
} else if (!replay_has_interrupt()) {
/* give a chance to iothread in replay mode */
ret = EXCP_INTERRUPT;
break;
}
#endif
}
} else if (replay_has_exception()
&& cpu->icount_decr.u16.low + cpu->icount_extra == 0) {
/* try to cause an exception pending in the log */
cpu_exec_nocache(cpu, 1, tb_find_fast(cpu), true);
ret = -1;
if (cpu_handle_exception(cpu, &ret)) {
break;
}
next_tb = 0; /* force lookup of first TB */
last_tb = NULL; /* forget the last executed TB after exception */
cpu->tb_flushed = false; /* reset before first TB lookup */
for(;;) {
interrupt_request = cpu->interrupt_request;
if (unlikely(interrupt_request)) {
if (unlikely(cpu->singlestep_enabled & SSTEP_NOIRQ)) {
/* Mask out external interrupts for this step. */
interrupt_request &= ~CPU_INTERRUPT_SSTEP_MASK;
}
if (interrupt_request & CPU_INTERRUPT_DEBUG) {
cpu->interrupt_request &= ~CPU_INTERRUPT_DEBUG;
cpu->exception_index = EXCP_DEBUG;
cpu_loop_exit(cpu);
}
if (replay_mode == REPLAY_MODE_PLAY
&& !replay_has_interrupt()) {
/* Do nothing */
} else if (interrupt_request & CPU_INTERRUPT_HALT) {
replay_interrupt();
cpu->interrupt_request &= ~CPU_INTERRUPT_HALT;
cpu->halted = 1;
cpu->exception_index = EXCP_HLT;
cpu_loop_exit(cpu);
}
#if defined(TARGET_I386)
else if (interrupt_request & CPU_INTERRUPT_INIT) {
replay_interrupt();
cpu_svm_check_intercept_param(env, SVM_EXIT_INIT, 0);
do_cpu_init(x86_cpu);
cpu->exception_index = EXCP_HALTED;
cpu_loop_exit(cpu);
}
#else
else if (interrupt_request & CPU_INTERRUPT_RESET) {
replay_interrupt();
cpu_reset(cpu);
cpu_loop_exit(cpu);
}
#endif
/* The target hook has 3 exit conditions:
False when the interrupt isn't processed,
True when it is, and we should restart on a new TB,
and via longjmp via cpu_loop_exit. */
else {
replay_interrupt();
if (cc->cpu_exec_interrupt(cpu, interrupt_request)) {
next_tb = 0;
}
}
/* Don't use the cached interrupt_request value,
do_interrupt may have updated the EXITTB flag. */
if (cpu->interrupt_request & CPU_INTERRUPT_EXITTB) {
cpu->interrupt_request &= ~CPU_INTERRUPT_EXITTB;
/* ensure that no TB jump will be modified as
the program flow was changed */
next_tb = 0;
}
}
if (unlikely(cpu->exit_request
|| replay_has_interrupt())) {
cpu->exit_request = 0;
cpu->exception_index = EXCP_INTERRUPT;
cpu_loop_exit(cpu);
}
tb_lock();
tb = tb_find_fast(cpu);
/* Note: we do it here to avoid a gcc bug on Mac OS X when
doing it in tb_find_slow */
if (tcg_ctx.tb_ctx.tb_invalidated_flag) {
/* as some TB could have been invalidated because
of memory exceptions while generating the code, we
must recompute the hash index here */
next_tb = 0;
tcg_ctx.tb_ctx.tb_invalidated_flag = 0;
}
/* see if we can patch the calling TB. When the TB
spans two pages, we cannot safely do a direct
jump. */
if (next_tb != 0 && tb->page_addr[1] == -1
&& !qemu_loglevel_mask(CPU_LOG_TB_NOCHAIN)) {
tb_add_jump((TranslationBlock *)(next_tb & ~TB_EXIT_MASK),
next_tb & TB_EXIT_MASK, tb);
}
tb_unlock();
if (likely(!cpu->exit_request)) {
trace_exec_tb(tb, tb->pc);
/* execute the generated code */
cpu->current_tb = tb;
next_tb = cpu_tb_exec(cpu, tb);
cpu->current_tb = NULL;
switch (next_tb & TB_EXIT_MASK) {
case TB_EXIT_REQUESTED:
/* Something asked us to stop executing
* chained TBs; just continue round the main
* loop. Whatever requested the exit will also
* have set something else (eg exit_request or
* interrupt_request) which we will handle
* next time around the loop. But we need to
* ensure the tcg_exit_req read in generated code
* comes before the next read of cpu->exit_request
* or cpu->interrupt_request.
*/
smp_rmb();
next_tb = 0;
break;
case TB_EXIT_ICOUNT_EXPIRED:
{
/* Instruction counter expired. */
int insns_left = cpu->icount_decr.u32;
if (cpu->icount_extra && insns_left >= 0) {
/* Refill decrementer and continue execution. */
cpu->icount_extra += insns_left;
insns_left = MIN(0xffff, cpu->icount_extra);
cpu->icount_extra -= insns_left;
cpu->icount_decr.u16.low = insns_left;
} else {
if (insns_left > 0) {
/* Execute remaining instructions. */
tb = (TranslationBlock *)(next_tb & ~TB_EXIT_MASK);
cpu_exec_nocache(cpu, insns_left, tb, false);
align_clocks(&sc, cpu);
}
cpu->exception_index = EXCP_INTERRUPT;
next_tb = 0;
cpu_loop_exit(cpu);
}
break;
}
default:
break;
}
}
cpu_handle_interrupt(cpu, &last_tb);
tb = tb_find_fast(cpu, &last_tb, tb_exit);
cpu_loop_exec_tb(cpu, tb, &last_tb, &tb_exit, &sc);
/* Try to align the host and virtual clocks
if the guest is in advance */
align_clocks(&sc, cpu);
/* reset soft MMU for next block (it can currently
only be set by a memory fault) */
} /* for(;;) */
} else {
#if defined(__clang__) || !QEMU_GNUC_PREREQ(4, 6)
@@ -579,18 +636,10 @@ int cpu_exec(CPUState *cpu)
* Newer versions of gcc would complain about this code (-Wclobbered). */
cpu = current_cpu;
cc = CPU_GET_CLASS(cpu);
#ifdef TARGET_I386
x86_cpu = X86_CPU(cpu);
env = &x86_cpu->env;
#endif
#else /* buggy compiler */
/* Assert that the compiler does not smash local variables. */
g_assert(cpu == current_cpu);
g_assert(cc == CPU_GET_CLASS(cpu));
#ifdef TARGET_I386
g_assert(x86_cpu == X86_CPU(cpu));
g_assert(env == &x86_cpu->env);
#endif
#endif /* buggy compiler */
cpu->can_do_io = 1;
tb_lock_reset();

101
cpus.c
View File

@@ -24,7 +24,8 @@
/* Needed early for CONFIG_BSD etc. */
#include "qemu/osdep.h"
#include "qemu-common.h"
#include "cpu.h"
#include "monitor/monitor.h"
#include "qapi/qmp/qerror.h"
#include "qemu/error-report.h"
@@ -34,6 +35,7 @@
#include "sysemu/dma.h"
#include "sysemu/kvm.h"
#include "qmp-commands.h"
#include "exec/exec-all.h"
#include "qemu/thread.h"
#include "sysemu/cpus.h"
@@ -247,13 +249,13 @@ int64_t cpu_get_clock(void)
void cpu_enable_ticks(void)
{
/* Here, the really thing protected by seqlock is cpu_clock_offset. */
seqlock_write_lock(&timers_state.vm_clock_seqlock);
seqlock_write_begin(&timers_state.vm_clock_seqlock);
if (!timers_state.cpu_ticks_enabled) {
timers_state.cpu_ticks_offset -= cpu_get_host_ticks();
timers_state.cpu_clock_offset -= get_clock();
timers_state.cpu_ticks_enabled = 1;
}
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
seqlock_write_end(&timers_state.vm_clock_seqlock);
}
/* disable cpu_get_ticks() : the clock is stopped. You must not call
@@ -263,13 +265,13 @@ void cpu_enable_ticks(void)
void cpu_disable_ticks(void)
{
/* Here, the really thing protected by seqlock is cpu_clock_offset. */
seqlock_write_lock(&timers_state.vm_clock_seqlock);
seqlock_write_begin(&timers_state.vm_clock_seqlock);
if (timers_state.cpu_ticks_enabled) {
timers_state.cpu_ticks_offset += cpu_get_host_ticks();
timers_state.cpu_clock_offset = cpu_get_clock_locked();
timers_state.cpu_ticks_enabled = 0;
}
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
seqlock_write_end(&timers_state.vm_clock_seqlock);
}
/* Correlation between real and virtual time is always going to be
@@ -292,7 +294,7 @@ static void icount_adjust(void)
return;
}
seqlock_write_lock(&timers_state.vm_clock_seqlock);
seqlock_write_begin(&timers_state.vm_clock_seqlock);
cur_time = cpu_get_clock_locked();
cur_icount = cpu_get_icount_locked();
@@ -313,7 +315,7 @@ static void icount_adjust(void)
last_delta = delta;
timers_state.qemu_icount_bias = cur_icount
- (timers_state.qemu_icount << icount_time_shift);
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
seqlock_write_end(&timers_state.vm_clock_seqlock);
}
static void icount_adjust_rt(void *opaque)
@@ -353,7 +355,7 @@ static void icount_warp_rt(void)
return;
}
seqlock_write_lock(&timers_state.vm_clock_seqlock);
seqlock_write_begin(&timers_state.vm_clock_seqlock);
if (runstate_is_running()) {
int64_t clock = REPLAY_CLOCK(REPLAY_CLOCK_VIRTUAL_RT,
cpu_get_clock_locked());
@@ -372,7 +374,7 @@ static void icount_warp_rt(void)
timers_state.qemu_icount_bias += warp_delta;
}
vm_clock_warp_start = -1;
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
seqlock_write_end(&timers_state.vm_clock_seqlock);
if (qemu_clock_expired(QEMU_CLOCK_VIRTUAL)) {
qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
@@ -397,9 +399,9 @@ void qtest_clock_warp(int64_t dest)
int64_t deadline = qemu_clock_deadline_ns_all(QEMU_CLOCK_VIRTUAL);
int64_t warp = qemu_soonest_timeout(dest - clock, deadline);
seqlock_write_lock(&timers_state.vm_clock_seqlock);
seqlock_write_begin(&timers_state.vm_clock_seqlock);
timers_state.qemu_icount_bias += warp;
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
seqlock_write_end(&timers_state.vm_clock_seqlock);
qemu_clock_run_timers(QEMU_CLOCK_VIRTUAL);
timerlist_run_timers(aio_context->tlg.tl[QEMU_CLOCK_VIRTUAL]);
@@ -466,9 +468,9 @@ void qemu_start_warp_timer(void)
* It is useful when we want a deterministic execution time,
* isolated from host latencies.
*/
seqlock_write_lock(&timers_state.vm_clock_seqlock);
seqlock_write_begin(&timers_state.vm_clock_seqlock);
timers_state.qemu_icount_bias += deadline;
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
seqlock_write_end(&timers_state.vm_clock_seqlock);
qemu_clock_notify(QEMU_CLOCK_VIRTUAL);
} else {
/*
@@ -479,11 +481,11 @@ void qemu_start_warp_timer(void)
* you will not be sending network packets continuously instead of
* every 100ms.
*/
seqlock_write_lock(&timers_state.vm_clock_seqlock);
seqlock_write_begin(&timers_state.vm_clock_seqlock);
if (vm_clock_warp_start == -1 || vm_clock_warp_start > clock) {
vm_clock_warp_start = clock;
}
seqlock_write_unlock(&timers_state.vm_clock_seqlock);
seqlock_write_end(&timers_state.vm_clock_seqlock);
timer_mod_anticipate(icount_warp_timer, clock + deadline);
}
} else if (deadline == 0) {
@@ -619,7 +621,7 @@ int cpu_throttle_get_percentage(void)
void cpu_ticks_init(void)
{
seqlock_init(&timers_state.vm_clock_seqlock, NULL);
seqlock_init(&timers_state.vm_clock_seqlock);
vmstate_register(NULL, 0, &vmstate_timers, &timers_state);
throttle_timer = timer_new_ns(QEMU_CLOCK_VIRTUAL_RT,
cpu_throttle_timer_tick, NULL);
@@ -778,7 +780,7 @@ static void sigbus_reraise(void)
raise(SIGBUS);
sigemptyset(&set);
sigaddset(&set, SIGBUS);
sigprocmask(SIG_UNBLOCK, &set, NULL);
pthread_sigmask(SIG_UNBLOCK, &set, NULL);
}
perror("Failed to re-raise SIGBUS!\n");
abort();
@@ -970,6 +972,18 @@ void async_run_on_cpu(CPUState *cpu, void (*func)(void *data), void *data)
qemu_cpu_kick(cpu);
}
static void qemu_kvm_destroy_vcpu(CPUState *cpu)
{
if (kvm_destroy_vcpu(cpu) < 0) {
error_report("kvm_destroy_vcpu failed");
exit(EXIT_FAILURE);
}
}
static void qemu_tcg_destroy_vcpu(CPUState *cpu)
{
}
static void flush_queued_work(CPUState *cpu)
{
struct qemu_work_item *wi;
@@ -1059,7 +1073,7 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
cpu->created = true;
qemu_cond_signal(&qemu_cpu_cond);
while (1) {
do {
if (cpu_can_run(cpu)) {
r = kvm_cpu_exec(cpu);
if (r == EXCP_DEBUG) {
@@ -1067,8 +1081,12 @@ static void *qemu_kvm_cpu_thread_fn(void *arg)
}
}
qemu_kvm_wait_io_event(cpu);
}
} while (!cpu->unplug || cpu_can_run(cpu));
qemu_kvm_destroy_vcpu(cpu);
cpu->created = false;
qemu_cond_signal(&qemu_cpu_cond);
qemu_mutex_unlock_iothread();
return NULL;
}
@@ -1122,6 +1140,7 @@ static void tcg_exec_all(void);
static void *qemu_tcg_cpu_thread_fn(void *arg)
{
CPUState *cpu = arg;
CPUState *remove_cpu = NULL;
rcu_register_thread();
@@ -1159,6 +1178,18 @@ static void *qemu_tcg_cpu_thread_fn(void *arg)
}
}
qemu_tcg_wait_io_event(QTAILQ_FIRST(&cpus));
CPU_FOREACH(cpu) {
if (cpu->unplug && !cpu_can_run(cpu)) {
remove_cpu = cpu;
break;
}
}
if (remove_cpu) {
qemu_tcg_destroy_vcpu(remove_cpu);
cpu->created = false;
qemu_cond_signal(&qemu_cpu_cond);
remove_cpu = NULL;
}
}
return NULL;
@@ -1315,6 +1346,21 @@ void resume_all_vcpus(void)
}
}
void cpu_remove(CPUState *cpu)
{
cpu->stop = true;
cpu->unplug = true;
qemu_cpu_kick(cpu);
}
void cpu_remove_sync(CPUState *cpu)
{
cpu_remove(cpu);
while (cpu->created) {
qemu_cond_wait(&qemu_cpu_cond, &qemu_global_mutex);
}
}
/* For temporary buffers for forming a name */
#define VCPU_THREAD_NAME_SIZE 16
@@ -1531,6 +1577,9 @@ static void tcg_exec_all(void)
break;
}
} else if (cpu->stop || cpu->stopped) {
if (cpu->unplug) {
next_cpu = CPU_NEXT(cpu);
}
break;
}
}
@@ -1691,21 +1740,7 @@ exit:
void qmp_inject_nmi(Error **errp)
{
#if defined(TARGET_I386)
CPUState *cs;
CPU_FOREACH(cs) {
X86CPU *cpu = X86_CPU(cs);
if (!cpu->apic_state) {
cpu_interrupt(cs, CPU_INTERRUPT_NMI);
} else {
apic_deliver_nmi(cpu->apic_state);
}
}
#else
nmi_monitor_handle(monitor_get_cpu_index(), errp);
#endif
}
void dump_drift_info(FILE *f, fprintf_function cpu_fprintf)

View File

@@ -28,7 +28,10 @@
#include "exec/memory-internal.h"
#include "exec/ram_addr.h"
#include "exec/exec-all.h"
#include "tcg/tcg.h"
#include "qemu/error-report.h"
#include "exec/log.h"
/* DEBUG defines, enable DEBUG_TLB_LOG to log to the CPU_LOG_MMU target */
/* #define DEBUG_TLB */
@@ -76,10 +79,6 @@ void tlb_flush(CPUState *cpu, int flush_global)
tlb_debug("(%d)\n", flush_global);
/* must reset current TB so that interrupts cannot modify the
links while we are modifying them */
cpu->current_tb = NULL;
memset(env->tlb_table, -1, sizeof(env->tlb_table));
memset(env->tlb_v_table, -1, sizeof(env->tlb_v_table));
memset(cpu->tb_jmp_cache, 0, sizeof(cpu->tb_jmp_cache));
@@ -95,9 +94,6 @@ static inline void v_tlb_flush_by_mmuidx(CPUState *cpu, va_list argp)
CPUArchState *env = cpu->env_ptr;
tlb_debug("start\n");
/* must reset current TB so that interrupts cannot modify the
links while we are modifying them */
cpu->current_tb = NULL;
for (;;) {
int mmu_idx = va_arg(argp, int);
@@ -152,9 +148,6 @@ void tlb_flush_page(CPUState *cpu, target_ulong addr)
tlb_flush(cpu, 1);
return;
}
/* must reset current TB so that interrupts cannot modify the
links while we are modifying them */
cpu->current_tb = NULL;
addr &= TARGET_PAGE_MASK;
i = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
@@ -193,9 +186,6 @@ void tlb_flush_page_by_mmuidx(CPUState *cpu, target_ulong addr, ...)
va_end(argp);
return;
}
/* must reset current TB so that interrupts cannot modify the
links while we are modifying them */
cpu->current_tb = NULL;
addr &= TARGET_PAGE_MASK;
i = (addr >> TARGET_PAGE_BITS) & (CPU_TLB_SIZE - 1);
@@ -258,7 +248,8 @@ static inline ram_addr_t qemu_ram_addr_from_host_nofail(void *ptr)
{
ram_addr_t ram_addr;
if (qemu_ram_addr_from_host(ptr, &ram_addr) == NULL) {
ram_addr = qemu_ram_addr_from_host(ptr);
if (ram_addr == RAM_ADDR_INVALID) {
fprintf(stderr, "Bad ram pointer %p\n", ptr);
abort();
}
@@ -438,6 +429,39 @@ void tlb_set_page(CPUState *cpu, target_ulong vaddr,
prot, mmu_idx, size);
}
static void report_bad_exec(CPUState *cpu, target_ulong addr)
{
/* Accidentally executing outside RAM or ROM is quite common for
* several user-error situations, so report it in a way that
* makes it clear that this isn't a QEMU bug and provide suggestions
* about what a user could do to fix things.
*/
error_report("Trying to execute code outside RAM or ROM at 0x"
TARGET_FMT_lx, addr);
error_printf("This usually means one of the following happened:\n\n"
"(1) You told QEMU to execute a kernel for the wrong machine "
"type, and it crashed on startup (eg trying to run a "
"raspberry pi kernel on a versatilepb QEMU machine)\n"
"(2) You didn't give QEMU a kernel or BIOS filename at all, "
"and QEMU executed a ROM full of no-op instructions until "
"it fell off the end\n"
"(3) Your guest kernel has a bug and crashed by jumping "
"off into nowhere\n\n"
"This is almost always one of the first two, so check your "
"command line and that you are using the right type of kernel "
"for this machine.\n"
"If you think option (3) is likely then you can try debugging "
"your guest with the -d debug options; in particular "
"-d guest_errors will cause the log to include a dump of the "
"guest register state at this point.\n\n"
"Execution cannot continue; stopping here.\n\n");
/* Report also to the logs, with more detail including register dump */
qemu_log_mask(LOG_GUEST_ERROR, "qemu: fatal: Trying to execute code "
"outside RAM or ROM at 0x" TARGET_FMT_lx "\n", addr);
log_cpu_state_mask(LOG_GUEST_ERROR, cpu, CPU_DUMP_FPU | CPU_DUMP_CCOP);
}
/* NOTE: this function can trigger an exception */
/* NOTE2: the returned address is not exactly the physical address: it
* is actually a ram_addr_t (in system mode; the user mode emulation
@@ -466,8 +490,8 @@ tb_page_addr_t get_page_addr_code(CPUArchState *env1, target_ulong addr)
if (cc->do_unassigned_access) {
cc->do_unassigned_access(cpu, addr, false, true, 0, 4);
} else {
cpu_abort(cpu, "Trying to execute code outside RAM or ROM at 0x"
TARGET_FMT_lx "\n", addr);
report_bad_exec(cpu, addr);
exit(1);
}
}
p = (void *)((uintptr_t)addr + env1->tlb_table[mmu_idx][page_index].addend);

View File

@@ -24,6 +24,7 @@
*/
#include "qemu/osdep.h"
#include "qemu/bswap.h"
#include "crypto/afsplit.h"
#include "crypto/random.h"

View File

@@ -20,6 +20,7 @@
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qemu/bswap.h"
#include "crypto/block-luks.h"
@@ -1080,8 +1081,7 @@ qcrypto_block_luks_create(QCryptoBlock *block,
luks->header.key_slots[i].key_offset =
(QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
QCRYPTO_BLOCK_LUKS_SECTOR_SIZE) +
(ROUND_UP(((splitkeylen + (QCRYPTO_BLOCK_LUKS_SECTOR_SIZE - 1)) /
QCRYPTO_BLOCK_LUKS_SECTOR_SIZE),
(ROUND_UP(DIV_ROUND_UP(splitkeylen, QCRYPTO_BLOCK_LUKS_SECTOR_SIZE),
(QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
QCRYPTO_BLOCK_LUKS_SECTOR_SIZE)) * i);
}
@@ -1181,8 +1181,7 @@ qcrypto_block_luks_create(QCryptoBlock *block,
luks->header.payload_offset =
(QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
QCRYPTO_BLOCK_LUKS_SECTOR_SIZE) +
(ROUND_UP(((splitkeylen + (QCRYPTO_BLOCK_LUKS_SECTOR_SIZE - 1)) /
QCRYPTO_BLOCK_LUKS_SECTOR_SIZE),
(ROUND_UP(DIV_ROUND_UP(splitkeylen, QCRYPTO_BLOCK_LUKS_SECTOR_SIZE),
(QCRYPTO_BLOCK_LUKS_KEY_SLOT_OFFSET /
QCRYPTO_BLOCK_LUKS_SECTOR_SIZE)) *
QCRYPTO_BLOCK_LUKS_NUM_KEY_SLOTS);

View File

@@ -36,9 +36,7 @@ static size_t qcrypto_hash_alg_size[QCRYPTO_HASH_ALG__MAX] = {
size_t qcrypto_hash_digest_len(QCryptoHashAlgorithm alg)
{
if (alg >= G_N_ELEMENTS(qcrypto_hash_alg_size)) {
return 0;
}
assert(alg < G_N_ELEMENTS(qcrypto_hash_alg_size));
return qcrypto_hash_alg_size[alg];
}

View File

@@ -392,11 +392,14 @@ qcrypto_tls_creds_load_cert(QCryptoTLSCredsX509 *creds,
gsize buflen;
GError *gerr;
int ret = -1;
int err;
trace_qcrypto_tls_creds_x509_load_cert(creds, isServer, certFile);
if (gnutls_x509_crt_init(&cert) < 0) {
error_setg(errp, "Unable to initialize certificate");
err = gnutls_x509_crt_init(&cert);
if (err < 0) {
error_setg(errp, "Unable to initialize certificate: %s",
gnutls_strerror(err));
goto cleanup;
}
@@ -410,11 +413,13 @@ qcrypto_tls_creds_load_cert(QCryptoTLSCredsX509 *creds,
data.data = (unsigned char *)buf;
data.size = strlen(buf);
if (gnutls_x509_crt_import(cert, &data, GNUTLS_X509_FMT_PEM) < 0) {
err = gnutls_x509_crt_import(cert, &data, GNUTLS_X509_FMT_PEM);
if (err < 0) {
error_setg(errp, isServer ?
"Unable to import server certificate %s" :
"Unable to import client certificate %s",
certFile);
"Unable to import server certificate %s: %s" :
"Unable to import client certificate %s: %s",
certFile,
gnutls_strerror(err));
goto cleanup;
}

19
crypto/trace-events Normal file
View File

@@ -0,0 +1,19 @@
# See docs/trace-events.txt for syntax documentation.
# crypto/tlscreds.c
qcrypto_tls_creds_load_dh(void *creds, const char *filename) "TLS creds load DH creds=%p filename=%s"
qcrypto_tls_creds_get_path(void *creds, const char *filename, const char *path) "TLS creds path creds=%p filename=%s path=%s"
# crypto/tlscredsanon.c
qcrypto_tls_creds_anon_load(void *creds, const char *dir) "TLS creds anon load creds=%p dir=%s"
# crypto/tlscredsx509.c
qcrypto_tls_creds_x509_load(void *creds, const char *dir) "TLS creds x509 load creds=%p dir=%s"
qcrypto_tls_creds_x509_check_basic_constraints(void *creds, const char *file, int status) "TLS creds x509 check basic constraints creds=%p file=%s status=%d"
qcrypto_tls_creds_x509_check_key_usage(void *creds, const char *file, int status, int usage, int critical) "TLS creds x509 check key usage creds=%p file=%s status=%d usage=%d critical=%d"
qcrypto_tls_creds_x509_check_key_purpose(void *creds, const char *file, int status, const char *usage, int critical) "TLS creds x509 check key usage creds=%p file=%s status=%d usage=%s critical=%d"
qcrypto_tls_creds_x509_load_cert(void *creds, int isServer, const char *file) "TLS creds x509 load cert creds=%p isServer=%d file=%s"
qcrypto_tls_creds_x509_load_cert_list(void *creds, const char *file) "TLS creds x509 load cert list creds=%p file=%s"
# crypto/tlssession.c
qcrypto_tls_session_new(void *session, void *creds, const char *hostname, const char *aclname, int endpoint) "TLS session new session=%p creds=%p hostname=%s aclname=%s endpoint=%d"

View File

@@ -3,4 +3,7 @@
# We support all the 32 bit boards so need all their config
include arm-softmmu.mak
CONFIG_AUX=y
CONFIG_DDC=y
CONFIG_DPCD=y
CONFIG_XLNX_ZYNQMP=y

View File

@@ -66,6 +66,7 @@ CONFIG_PXA2XX=y
CONFIG_BITBANG_I2C=y
CONFIG_FRAMEBUFFER=y
CONFIG_XILINX_SPIPS=y
CONFIG_ZYNQ_DEVCFG=y
CONFIG_ARM11SCU=y
CONFIG_A9SCU=y
@@ -100,6 +101,7 @@ CONFIG_ALLWINNER_A10_PIT=y
CONFIG_ALLWINNER_A10_PIC=y
CONFIG_ALLWINNER_A10=y
CONFIG_FSL_IMX6=y
CONFIG_FSL_IMX31=y
CONFIG_FSL_IMX25=y

View File

@@ -18,6 +18,7 @@ CONFIG_MEGASAS_SCSI_PCI=y
CONFIG_MPTSAS_SCSI_PCI=y
CONFIG_RTL8139_PCI=y
CONFIG_E1000_PCI=y
CONFIG_E1000E_PCI=y
CONFIG_VMXNET3_PCI=y
CONFIG_IDE_CORE=y
CONFIG_IDE_QDEV=y

View File

@@ -49,6 +49,7 @@ CONFIG_ETSEC=y
CONFIG_LIBDECNUMBER=y
# For pSeries
CONFIG_XICS=$(CONFIG_PSERIES)
CONFIG_XICS_SPAPR=$(CONFIG_PSERIES)
CONFIG_XICS_KVM=$(and $(CONFIG_PSERIES),$(CONFIG_KVM))
# For PReP
CONFIG_MC146818RTC=y

View File

@@ -20,6 +20,7 @@
#include "qapi/error.h"
#include "qemu-common.h"
#include "qemu/error-report.h"
#include "qemu/bswap.h"
#include "sysemu/device_tree.h"
#include "sysemu/sysemu.h"
#include "hw/loader.h"

View File

@@ -70,16 +70,17 @@ void qemu_sglist_destroy(QEMUSGList *qsg)
typedef struct {
BlockAIOCB common;
BlockBackend *blk;
AioContext *ctx;
BlockAIOCB *acb;
QEMUSGList *sg;
uint64_t sector_num;
uint64_t offset;
DMADirection dir;
int sg_cur_index;
dma_addr_t sg_cur_byte;
QEMUIOVector iov;
QEMUBH *bh;
DMAIOFunc *io_func;
void *io_func_opaque;
} DMAAIOCB;
static void dma_blk_cb(void *opaque, int ret);
@@ -130,7 +131,7 @@ static void dma_blk_cb(void *opaque, int ret)
trace_dma_blk_cb(dbs, ret);
dbs->acb = NULL;
dbs->sector_num += dbs->iov.size / 512;
dbs->offset += dbs->iov.size;
if (dbs->sg_cur_index == dbs->sg->nsg || ret < 0) {
dma_complete(dbs, ret);
@@ -154,8 +155,7 @@ static void dma_blk_cb(void *opaque, int ret)
if (dbs->iov.size == 0) {
trace_dma_map_wait(dbs);
dbs->bh = aio_bh_new(blk_get_aio_context(dbs->blk),
reschedule_dma, dbs);
dbs->bh = aio_bh_new(dbs->ctx, reschedule_dma, dbs);
cpu_register_map_client(dbs->bh);
return;
}
@@ -164,8 +164,8 @@ static void dma_blk_cb(void *opaque, int ret)
qemu_iovec_discard_back(&dbs->iov, dbs->iov.size & ~BDRV_SECTOR_MASK);
}
dbs->acb = dbs->io_func(dbs->blk, dbs->sector_num, &dbs->iov,
dbs->iov.size / 512, dma_blk_cb, dbs);
dbs->acb = dbs->io_func(dbs->offset, &dbs->iov,
dma_blk_cb, dbs, dbs->io_func_opaque);
assert(dbs->acb);
}
@@ -185,29 +185,38 @@ static void dma_aio_cancel(BlockAIOCB *acb)
}
}
static AioContext *dma_get_aio_context(BlockAIOCB *acb)
{
DMAAIOCB *dbs = container_of(acb, DMAAIOCB, common);
return dbs->ctx;
}
static const AIOCBInfo dma_aiocb_info = {
.aiocb_size = sizeof(DMAAIOCB),
.cancel_async = dma_aio_cancel,
.get_aio_context = dma_get_aio_context,
};
BlockAIOCB *dma_blk_io(
BlockBackend *blk, QEMUSGList *sg, uint64_t sector_num,
DMAIOFunc *io_func, BlockCompletionFunc *cb,
BlockAIOCB *dma_blk_io(AioContext *ctx,
QEMUSGList *sg, uint64_t offset,
DMAIOFunc *io_func, void *io_func_opaque,
BlockCompletionFunc *cb,
void *opaque, DMADirection dir)
{
DMAAIOCB *dbs = blk_aio_get(&dma_aiocb_info, blk, cb, opaque);
DMAAIOCB *dbs = qemu_aio_get(&dma_aiocb_info, NULL, cb, opaque);
trace_dma_blk_io(dbs, blk, sector_num, (dir == DMA_DIRECTION_TO_DEVICE));
trace_dma_blk_io(dbs, io_func_opaque, offset, (dir == DMA_DIRECTION_TO_DEVICE));
dbs->acb = NULL;
dbs->blk = blk;
dbs->sg = sg;
dbs->sector_num = sector_num;
dbs->ctx = ctx;
dbs->offset = offset;
dbs->sg_cur_index = 0;
dbs->sg_cur_byte = 0;
dbs->dir = dir;
dbs->io_func = io_func;
dbs->io_func_opaque = io_func_opaque;
dbs->bh = NULL;
qemu_iovec_init(&dbs->iov, sg->nsg);
dma_blk_cb(dbs, 0);
@@ -215,19 +224,39 @@ BlockAIOCB *dma_blk_io(
}
static
BlockAIOCB *dma_blk_read_io_func(int64_t offset, QEMUIOVector *iov,
BlockCompletionFunc *cb, void *cb_opaque,
void *opaque)
{
BlockBackend *blk = opaque;
return blk_aio_preadv(blk, offset, iov, 0, cb, cb_opaque);
}
BlockAIOCB *dma_blk_read(BlockBackend *blk,
QEMUSGList *sg, uint64_t sector,
QEMUSGList *sg, uint64_t offset,
void (*cb)(void *opaque, int ret), void *opaque)
{
return dma_blk_io(blk, sg, sector, blk_aio_readv, cb, opaque,
return dma_blk_io(blk_get_aio_context(blk),
sg, offset, dma_blk_read_io_func, blk, cb, opaque,
DMA_DIRECTION_FROM_DEVICE);
}
static
BlockAIOCB *dma_blk_write_io_func(int64_t offset, QEMUIOVector *iov,
BlockCompletionFunc *cb, void *cb_opaque,
void *opaque)
{
BlockBackend *blk = opaque;
return blk_aio_pwritev(blk, offset, iov, 0, cb, cb_opaque);
}
BlockAIOCB *dma_blk_write(BlockBackend *blk,
QEMUSGList *sg, uint64_t sector,
QEMUSGList *sg, uint64_t offset,
void (*cb)(void *opaque, int ret), void *opaque)
{
return dma_blk_io(blk, sg, sector, blk_aio_writev, cb, opaque,
return dma_blk_io(blk_get_aio_context(blk),
sg, offset, dma_blk_write_io_func, blk, cb, opaque,
DMA_DIRECTION_TO_DEVICE);
}

View File

@@ -62,7 +62,7 @@ operations:
typeof(*ptr) atomic_fetch_sub(ptr, val)
typeof(*ptr) atomic_fetch_and(ptr, val)
typeof(*ptr) atomic_fetch_or(ptr, val)
typeof(*ptr) atomic_xchg(ptr, val
typeof(*ptr) atomic_xchg(ptr, val)
typeof(*ptr) atomic_cmpxchg(ptr, old, new)
all of which return the old value of *ptr. These operations are
@@ -326,21 +326,41 @@ and memory barriers, and the equivalents in QEMU:
use a boxed atomic_t type; atomic operations in QEMU are polymorphic
and use normal C types.
- atomic_read and atomic_set in Linux give no guarantee at all;
atomic_read and atomic_set in QEMU include a compiler barrier
(similar to the ACCESS_ONCE macro in Linux).
- Originally, atomic_read and atomic_set in Linux gave no guarantee
at all. Linux 4.1 updated them to implement volatile
semantics via ACCESS_ONCE (or the more recent READ/WRITE_ONCE).
- most atomic read-modify-write operations in Linux return void;
in QEMU, all of them return the old value of the variable.
QEMU's atomic_read/set implement, if the compiler supports it, C11
atomic relaxed semantics, and volatile semantics otherwise.
Both semantics prevent the compiler from doing certain transformations;
the difference is that atomic accesses are guaranteed to be atomic,
while volatile accesses aren't. Thus, in the volatile case we just cross
our fingers hoping that the compiler will generate atomic accesses,
since we assume the variables passed are machine-word sized and
properly aligned.
No barriers are implied by atomic_read/set in either Linux or QEMU.
- atomic read-modify-write operations in Linux are of three kinds:
atomic_OP returns void
atomic_OP_return returns new value of the variable
atomic_fetch_OP returns the old value of the variable
atomic_cmpxchg returns the old value of the variable
In QEMU, the second kind does not exist. Currently Linux has
atomic_fetch_or only. QEMU provides and, or, inc, dec, add, sub.
- different atomic read-modify-write operations in Linux imply
a different set of memory barriers; in QEMU, all of them enforce
sequential consistency, which means they imply full memory barriers
before and after the operation.
- Linux does not have an equivalent of atomic_mb_read() and
atomic_mb_set(). In particular, note that set_mb() is a little
weaker than atomic_mb_set().
- Linux does not have an equivalent of atomic_mb_set(). In particular,
note that smp_store_mb() is a little weaker than atomic_mb_set().
atomic_mb_read() compiles to the same instructions as Linux's
smp_load_acquire(), but this should be treated as an implementation
detail. If required, QEMU might later add atomic_load_acquire() and
atomic_store_release() macros.
SOURCES

View File

@@ -438,6 +438,11 @@ top level Makefile, so anything defined in this file will influence the
entire build system. Care needs to be taken when writing rules for tests
to ensure they only apply to the unit test execution / build.
- tests/docker/Makefile.include
Rules for Docker tests. Like tests/Makefile, this file is included
directly by the top level Makefile, anything defined in this file will
influence the entire build system.
- po/Makefile

133
docs/igd-assign.txt Normal file
View File

@@ -0,0 +1,133 @@
Intel Graphics Device (IGD) assignment with vfio-pci
====================================================
IGD has two different modes for assignment using vfio-pci:
1) Universal Pass-Through (UPT) mode:
In this mode the IGD device is added as a *secondary* (ie. non-primary)
graphics device in combination with an emulated primary graphics device.
This mode *requires* guest driver support to remove the external
dependencies generally associated with IGD (see below). Those guest
drivers only support this mode for Broadwell and newer IGD, according to
Intel. Additionally, this mode by default, and as officially supported
by Intel, does not support direct video output. The intention is to use
this mode either to provide hardware acceleration to the emulated graphics
or to use this mode in combination with guest-based remote access software,
for example VNC (see below for optional output support). This mode
theoretically has no device specific handling dependencies on vfio-pci or
the VM firmware.
2) "Legacy" mode:
In this mode the IGD device is intended to be the primary and exclusive
graphics device in the VM[1], as such QEMU does not facilitate any sort
of remote graphics to the VM in this mode. A connected physical monitor
is the intended output device for IGD. This mode includes several
requirements and restrictions:
* IGD must be given address 02.0 on the PCI root bus in the VM
* The host kernel must support vfio extensions for IGD (v4.6)
* vfio VGA support very likely needs to be enabled in the host kernel
* The VM firmware must support specific fw_cfg enablers for IGD
* The VM machine type must support a PCI host bridge at 00.0 (standard)
* The VM machine type must provide or allow to be created a special
ISA/LPC bridge device (vfio-pci-igd-lpc-bridge) on the root bus at
PCI address 1f.0.
* The IGD device must have a VGA ROM, either provided via the romfile
option or loaded automatically through vfio (standard). rombar=0
will disable legacy mode support.
* Hotplug of the IGD device is not supported.
* The IGD device must be a SandyBridge or newer model device.
For either mode, depending on the host kernel, the i915 driver in the host
may generate faults and errors upon re-binding to an IGD device after it
has been assigned to a VM. It's therefore generally recommended to prevent
such driver binding unless the host driver is known to work well for this.
There are numerous ways to do this, i915 can be blacklisted on the host,
the driver_override option can be used to ensure that only vfio-pci can bind
to the device on the host[2], virsh nodedev-detach can be used to bind the
device to vfio drivers and then managed='no' set in the VM xml to prevent
re-binding to i915, etc. Also note that IGD is also typically the primary
graphics in the host and special options may be required beyond simply
blacklisting i915 or using pci-stub/vfio-pci to take ownership of IGD as a
PCI class device. Lower level drivers exist that may still claim the device.
It may therefore be necessary to use kernel boot options video=vesafb:off or
video=efifb:off (depending on host BIOS/UEFI) or these can be combined to
a catch-all, video=vesafb:off,efifb:off. Error messages such as:
Failed to mmap 0000:00:02.0 BAR <>. Performance may be slow
are a good indicator that such a problem exists. The host files /proc/iomem
and /proc/ioports are often useful for identifying drivers consuming ranges
of the device to cause such conflicts.
Additionally, IGD device are known to generate small numbers of DMAR faults
when initially assigned. It is believed that this is simply the IGD attempting
to access the reserved GTT space after reset, which it no longer has access to
when accessed from userspace. So long as the DMAR faults are small in number
and most importantly, not ongoing, these are not an indication of an error.
Additionally++, analog VGA output (as opposed to digital outputs like HDMI,
DVI, or DisplayPort) may be unsupported in some use cases. In the author's
experience, even DP to VGA adapters can be troublesome while adapters between
digital formats work well.
Usage
=====
The intention is for IGD assignment to be transparent for users and thus for
management tools like libvirt. To make use of legacy mode, simply remove all
other graphics options and use "-nographic" and either "-vga none" or
"-nodefaults", along with adding the device using vfio-pci:
-device vfio-pci,host=00:02.0,id=hostdev0,bus=pci.0,addr=0x2
For UPT mode, retain the default emulated graphics and simply add the vfio-pci
device making use of any other bus address other than 02.0. libvirt will
default to assigning the device a UPT compatible address while legacy mode
users will need to manually edit the XML if using a tool like virt-manager
where the VM device address is not expressly specified.
An experimental vfio-pci option also exists to enable OpRegion, and thus
external monitor support, for UPT mode. This can be enabled by adding
"x-igd-opregion=on" to the vfio-pci device options for the IGD device. As
with legacy mode, this requires the host to support features introduced in
the v4.6 kernel. If Intel chooses to embrace this support, the option may
be made non-experimental in the future, opening it to libvirt support.
Developer ABI
=============
Legacy mode IGD support imposes two fw_cfg requirements on the VM firmware:
1) "etc/igd-opregion"
This fw_cfg file exposes the OpRegion for the IGD device. A reserved
region should be created below 4GB (recommended 4KB alignment), sized
sufficient for the fw_cfg file size, and the content of this file copied
to it. The dword based address of this reserved memory region must also
be written to the ASLS register at offset 0xFC on the IGD device. It is
recommended that firmware should make use of this fw_cfg entry for any
PCI class VGA device with Intel vendor ID. Multiple of such devices
within a VM is undefined.
2) "etc/igd-bdsm-size"
This fw_cfg file contains an 8-byte, little endian integer indicating
the size of the reserved memory region required for IGD stolen memory.
Firmware must allocate a reserved memory below 4GB with required 1MB
alignment equal to this size. Additionally the base address of this
reserved region must be written to the dword BDSM register in PCI config
space of the IGD device at offset 0x5C. As this support is related to
running the IGD ROM, which has other dependencies on the device appearing
at guest address 00:02.0, it's expected that this fw_cfg file is only
relevant to a single PCI class VGA device with Intel vendor ID, appearing
at PCI bus address 00:02.0.
Footnotes
=========
[1] Nothing precludes adding additional emulated or assigned graphics devices
as non-primary, other than the combination typically not working. I only
intend to set user expectations, others are welcome to find working
combinations or fix whatever issues prevent this from working in the common
case.
[2] # echo "vfio-pci" > /sys/bus/pci/devices/0000:00:02.0/driver_override

View File

@@ -41,8 +41,13 @@ MemoryRegion):
MemoryRegionOps structure describing the callbacks.
- ROM: a ROM memory region works like RAM for reads (directly accessing
a region of host memory), but like MMIO for writes (invoking a callback).
You initialize these with memory_region_init_rom_device().
a region of host memory), and forbids writes. You initialize these with
memory_region_init_rom().
- ROM device: a ROM device memory region works like RAM for reads
(directly accessing a region of host memory), but like MMIO for
writes (invoking a callback). You initialize these with
memory_region_init_rom_device().
- IOMMU region: an IOMMU region translates addresses of accesses made to it
and forwards them to some other target memory region. As the name suggests,

View File

@@ -403,8 +403,8 @@ listen thread: --- page -- page -- page -- page -- page --
On receipt of CMD_PACKAGED (1)
All the data associated with the package - the ( ... ) section in the
diagram - is read into memory (into a QEMUSizedBuffer), and the main thread
recurses into qemu_loadvm_state_main to process the contents of the package (2)
diagram - is read into memory, and the main thread recurses into
qemu_loadvm_state_main to process the contents of the package (2)
which contains commands (3,6) and devices (4...)
On receipt of 'postcopy listen' - 3 -(i.e. the 1st command in the package)

Some files were not shown because too many files have changed in this diff Show More