With "ps2: use QEMU qcodes instead of scancodes", key handling was
changed to qcode base. But all scancodes are not converted to new one.
This adds some missing qcodes/scancodes what I found in using.
[set1 and set3 are from <hpoussin@reactos.org>]
Signed-off-by: OGAWA Hirofumi <hirofumi@mail.parknet.co.jp>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
For builds with Mingw-w64 as it is included in Cygwin, there are two
header files which define KEY_EVENT with different values.
This results in lots of compiler warnings like this one:
CC vl.o
In file included from /qemu/include/ui/console.h:340:0,
from /qemu/vl.c:76:
/usr/i686-w64-mingw32/sys-root/mingw/include/curses.h:1522:0: warning: "KEY_EVENT" redefined
#define KEY_EVENT 0633 /* We were interrupted by an event */
In file included from /usr/share/mingw-w64/include/windows.h:74:0,
from /usr/share/mingw-w64/include/winsock2.h:23,
from /qemu/include/sysemu/os-win32.h:29,
from /qemu/include/qemu/osdep.h:100,
from /qemu/vl.c:24:
/usr/share/mingw-w64/include/wincon.h:101:0: note: this is the location of the previous definition
#define KEY_EVENT 0x1
QEMU only uses the KEY_EVENT macro from wincon.h.
Therefore we can undefine the macro coming from curses.h.
The explicit include statement for curses.h in ui/curses.c is not needed
and was removed.
Those two modifications fix the redefinition warnings.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Message-id: 20161119185318.10564-1-sw@weilnetz.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
If the buffer is not big enough, snprintf() does not return the number
of bytes that have been written to the buffer, but the number of bytes
that would be needed for writing the whole string. By using this value
for the following vnc_write() calls, we send some junk at the end of
the name in case the qemu_name is longer than 1017 bytes, which could
confuse the VNC clients. Fix this by adding an additional size check
here.
Buglink: https://bugs.launchpad.net/qemu/+bug/1637447
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1479749115-21932-1-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This patch fixes a segfault at QEMU startup, introduced in a08156321a.
gd_vc_find_current() return NULL, which is dereferenced without checking it.
While at it, disable the whole 'View' menu if no console exists.
Reproducer: qemu-system-i386 -M none -nodefaults
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1483263585-8101-1-git-send-email-hpoussin@reactos.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
- transport specific callbacks (for Xen)
- fix crash (2.8 regression)
- 9p functional tests
# gpg: Signature made Tue 03 Jan 2017 17:30:58 GMT
# gpg: using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg: aka "Greg Kurz <groug@free.fr>"
# gpg: aka "Greg Kurz <gkurz@fr.ibm.com>"
# gpg: aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg: aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg: aka "Gregory Kurz (Cimai Technology) <gkurz@cimai.com>"
# gpg: aka "Gregory Kurz (Meiosys Technology) <gkurz@meiosys.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894 DBA2 02FC 3AEB 0101 DBC2
* remotes/gkurz/tags/for-upstream:
tests: virtio-9p: ".." cannot be used to walk out of the shared directory
tests: virtio-9p: no slash in path elements during walk
tests: virtio-9p: add walk operation test
tests: virtio-9p: add attach operation test
tests: virtio-9p: add version operation test
9pfs: fix P9_NOTAG and P9_NOFID macros
tests: virtio-9p: code refactoring
tests: virtio-9p: rename PCI configuration test
9pfs: fix crash when fsdev is missing
9pfs: introduce init_out/in_iov_from_pdu
9pfs: call v9fs_init_qiov_from_pdu before v9fs_pack
9pfs: introduce transport specific callbacks
9pfs: move pdus to V9fsState
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch is based on the algorithm for the kvm.ko halt_poll_ns
parameter in Linux. The initial polling time is zero.
If the event loop is woken up within the maximum polling time it means
polling could be effective, so grow polling time.
If the event loop is woken up beyond the maximum polling time it means
polling is not effective, so shrink polling time.
If the event loop makes progress within the current polling time then
the sweet spot has been reached.
This algorithm adjusts the polling time so it can adapt to variations in
workloads. The goal is to reach the sweet spot while also recognizing
when polling would hurt more than help.
Two new trace events, poll_grow and poll_shrink, are added for observing
polling time adjustment.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161201192652.9509-13-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The begin and end callbacks can be used to prepare for the polling loop
and clean up when polling stops. Note that they may only be called once
for multiple aio_poll() calls if polling continues to succeed. Once
polling fails the end callback is invoked before aio_poll() resumes file
descriptor monitoring.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161201192652.9509-11-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The Linux AIO userspace ABI includes a ring that is shared with the
kernel. This allows userspace programs to process completions without
system calls.
Add an AioContext poll handler to check for completions in the ring.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161201192652.9509-6-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The AioContext event loop uses ppoll(2) or epoll_wait(2) to monitor file
descriptors or until a timer expires. In cases like virtqueues, Linux
AIO, and ThreadPool it is technically possible to wait for events via
polling (i.e. continuously checking for events without blocking).
Polling can be faster than blocking syscalls because file descriptors,
the process scheduler, and system calls are bypassed.
The main disadvantage to polling is that it increases CPU utilization.
In classic polling configuration a full host CPU thread might run at
100% to respond to events as quickly as possible. This patch implements
a timeout so we fall back to blocking syscalls if polling detects no
activity. After the timeout no CPU cycles are wasted on polling until
the next event loop iteration.
The run_poll_handlers_begin() and run_poll_handlers_end() trace events
are added to aid performance analysis and troubleshooting. If you need
to know whether polling mode is being used, trace these events to find
out.
Note that the AioContext is now re-acquired before disabling notify_me
in the non-polling case. This makes the code cleaner since notify_me
was enabled outside the non-polling AioContext release region. This
change is correct since it's safe to keep notify_me enabled longer
(disabling is an optimization) but potentially causes unnecessary
event_notifer_set() calls. I think the chance of performance regression
is small here.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161201192652.9509-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The new AioPollFn io_poll() argument to aio_set_fd_handler() and
aio_set_event_handler() is used in the next patch.
Keep this code change separate due to the number of files it touches.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161201192652.9509-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
According to the 9P spec at http://man.cat-v.org/plan_9/5/intro, the
parent directory of the root directory of a server's tree is itself.
This test hence checks that the qid of the root directory as returned by
attach is the same as the qid of ".." when walking from the root directory.
Signed-off-by: Greg Kurz <groug@kaod.org>
The walk operation is used to traverse the directory tree and to associate
paths to fids. A single walk can be used to traverse up to P9_MAXWELEM path
elements at the same time.
The test creates a path with P9_MAXWELEM elements on the backend (à la
'mkdir -p') and issues a walk operation. The walk is expected to succeed
without error.
Reference:
http://man.cat-v.org/plan_9/5/walk
Signed-off-by: Greg Kurz <groug@kaod.org>
The attach operation is used to establish a connection between the
client and the server. After this, the client is able to access the
underlying filesystem and do I/O.
This test simply ensures the operation succeeds without error.
Reference:
http://man.cat-v.org/plan_9/5/attach
Signed-off-by: Greg Kurz <groug@kaod.org>
This patch lays the foundations to be able to test 9P operations and
provides a test for the version operation as a first example.
A 9P request is composed of a T-message sent by the client (guest) to the
server (QEMU), and a R-message sent by the server back to the client.
The following general calls are available to implement requests for any
9P operations:
v9fs_req_init(): allocates the request structure and the guest memory for
the T-message
v9fs_req_send(): allocates the guest memory for the R-message and sends the
T-message to QEMU
v9fs_req_recv(): waits for QEMU to answer and does some sanity checks on the
returned R-message header
v9fs_req_free(): releases the guest memory and the request structure
Helpers are provided, to be used by each specific 9P operation to copy data
to/from the guest memory.
The version operation is used to negotiate the 9P protocol version to be
used and the maximum buffer size for exchanged data. It is necessarily
the first message of a 9P session. For simplicity, the maximum buffer size
is hardcoded to 4k, which should be enough for functional tests.
The test simply advertises the "9P2000.L" version to QEMU and expects QEMU
to answer it is supported.
References:
http://man.cat-v.org/plan_9/5/introhttp://man.cat-v.org/plan_9/5/version
Signed-off-by: Greg Kurz <groug@kaod.org>
The u16 and u32 types don't exist in QEMU common headers. It never broke
build because these two macros aren't use by the current code, but this
is about to change with the future addition of functional tests for 9P.
Also, these should have enclosing parenthesis to be usable in any
syntactical situation.
As suggested by Eric Blake, let's use UINT16_MAX and UINT32_MAX to address
both issues.
Signed-off-by: Greg Kurz <groug@kaod.org>
This moves the test_share static and the QOSState into the QVirtIO9P
structure, and put PCI related code in functions with a _pci_ name.
This will avoid code duplication in future tests, and allow to add
support for non-PCI platforms.
Signed-off-by: Greg Kurz <groug@kaod.org>
If the user passes -device virtio-9p without the corresponding -fsdev, QEMU
dereferences a NULL pointer and crashes.
This is a 2.8 regression introduced by commit 702dbcc274.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Li Qiang <liq3ea@gmail.com>
Not all 9pfs transports share memory between request and response. For
those who don't, it is necessary to know how much memory is required in
the response.
Split the existing init_iov_from_pdu function in two:
init_out_iov_from_pdu (for writes) and init_in_iov_from_pdu (for reads).
init_in_iov_from_pdu takes an additional size parameter to specify the
memory required for the response message.
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
v9fs_xattr_read should not access VirtQueueElement elems directly.
Move v9fs_init_qiov_from_pdu up in the file and call
v9fs_init_qiov_from_pdu before v9fs_pack. Use v9fs_pack on the new
iovec.
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Don't call virtio functions from 9pfs generic code, use generic function
callbacks instead.
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
pdus are initialized and used in 9pfs common code. Move the array from
V9fsVirtioState to V9fsState.
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
This is a cleanup patch. It adds call to tcg_temp_free()
when it is missing.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Implement CAS using cmpxchg.
Implement CAS2 using helper and either cmpxchg when
the 32bit addresses are consecutive, or with
parallel_cpus+cpu_loop_exit_atomic() otherwise.
Suggested-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Update helper to set the throwing location in case of div-by-0.
Cleanup divX.w and add quad word variants of divX.l.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twidle.net>
[laurent: modified to clear Z on overflow, as found with risu]
target-arm queue:
* add VBAR support to ARM1176 CPUs
* hw/i2c: add NULL check to i2c slave init callbacks
* pxa2xx.c: fix trailing whitespace
* aspeed: various cleanups
* aspeed: add romulus-bmc board
* virt: add 2.9 machine type
* gicv3: don't signal Pending+Active interrupts to CPU
* gicv3: fix incorrect usage of fieldoffset
* arm: log AArch64 exception returns
* gicv3: fix aff3 field in typer register
* aarch64: fix ldst_single_struct on BE hosts
* aarch64: fix vec_reg_offset on BE hosts
* arm: fix Cortex-A8 MVFR1 register value
* cadence_uart: check if receiver timeout counter disabled
* cadence_uart: check register values on migration
# gpg: Signature made Tue 27 Dec 2016 15:19:26 GMT
# gpg: using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20161227: (25 commits)
target-arm: Add VBAR support to ARM1176 CPUs
hw/i2c: Add a NULL check for i2c slave init callbacks
hw/arm: remove trailing whitespace
aspeed/smc: set the number of flash modules for the FMC controller
aspeed/smc: improve segment register support
aspeed/scu: fix SCU region size
aspeed: change SoC revision of the palmetto-bmc machine
aspeed: add the definitions for the AST2400 A1 SoC
aspeed: add a memory region for SRAM
aspeed: add support for the romulus-bmc board
aspeed: extend the board configuration with flash models
aspeed: attach the second SPI controller object to the SoC
aspeed: remove cannot_destroy_with_object_finalize_yet
aspeed: QOMify the CPU object and attach it to the SoC
m25p80: add support for the mx66l1g45g
hw/arm/virt: add 2.9 machine type
hw/intc/arm_gicv3: Don't signal Pending+Active interrupts to CPU
hw/intc/arm_gicv3: Remove incorrect usage of fieldoffset
target-arm: Log AArch64 exception returns
hw/intc/arm_gicv3_common: fix aff3 in typer
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ARM1176 CPUs have TrustZone support and can use the Vector Base
Address Register, but currently, qemu only adds VBAR support to ARMv7
CPUs. Fix this by adding a new feature ARM_FEATURE_VBAR which can used
for ARMv7 and ARM1176 CPUs.
The VBAR feature is always set for ARMv7 because some legacy boards
require it even if this is not architecturally correct.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Message-id: 1481810970-9692-1-git-send-email-clg@kaod.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The HW does not enforce all the rules in the specs and allows a few
"curious" setups like zero size segments and overlaps. So change the
model to be in sync but keep the warnings which are always interesting
for debug.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 1480434248-27138-13-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Romulus machine is an OpenPOWER system with an AST2500 SoC for
the BMC and a POWER9 chip for the host. It does not make much
difference for qemu a part from the fact that the FMC controller has
two SPI flash module.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Joel Stanley <joel@jms.id.au>
Message-id: 1480434248-27138-8-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The GICv3 requires that we only signal Pending interrupts to
the CPU. This category does not include Pending+Active interrupts,
which means we need to check whether the interrupt is Active in
the gicr_int_pending() and gicd_int_pending() functions.
Interrupts are rarely in the Active+Pending state, but KVM
uses this as part of its handling of the virtual timer, so
this bug was causing KVM to go into an infinite loop of
taking the vtimer interrupt when the guest first triggered it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
In the ARMCPRegInfo definitions for the GICv3 CPU interface
registers, we were trying to use .fieldoffset to specify
the locations of data fields within the GICv3CPUState struct.
This is completely broken, because .fieldoffset is for offsets
into the CPUARMState struct. We didn't notice because we
were only using this for reads to BPR0, AP0R<n>, IGRPEN0
and CTLR_EL3, and Linux doesn't use these registers.
Replace the .fieldoffset uses with explicit read functions.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
We already log exception entry; add logging of the AArch64 exception
return path as well.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
The value of the MVFR1 (Media and VFP Feature Register 1) register for
the Cortex-A8 appears to be incorrect (according to the TRM, DDI0344K),
with the "full denormal arithmetic" and "propagation of NaN" fields
holding both 0 instead of both 1.
I had a go tracing the history of the use of this value, and it seems
it's always just been wrong in QEMU: maybe it was derived from early
documentation, or guessed based on the use of a "VFP Lite" implementation
in the Cortex-A8.
Depending on the startup/early-boot code in use, this can manifest as
failure to perform denormal arithmetic properly: in our case, selecting
a Cortex-A8 CPU when using QEMU as an instruction-set simulator for
bare-metal GCC testing caused tests using denormal arithmetic to
fail. Problems might be masked (or not occur) when using a full OS kernel
with suitable trap handlers (I'm not sure).
Signed-off-by: Julian Brown <julian@codesourcery.com>
Message-id: 1481130858-31767-1-git-send-email-julian@codesourcery.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When register Rcvr_timeout_reg0 (R_RTOR in cadence_uart.c) is set to
0, the receiver timeout counter should be disabled. See page 1801 of
"Zynq-7000 AP SoC Technical Reference Manual". This commit adds a
such a check before setting the receive timeout interrupt.
Signed-off-by: Andrew Gacek <andrew.gacek@gmail.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Cadence UART device emulator calculates speed by dividing the
baud rate by a 'baud rate generator' & 'baud rate divider' value.
The device specification defines these register values to be
non-zero and within certain limits. Checks were recently added when
writing to these registers but not when restoring from migration.
This patch adds checks when restoring from migration to avoid divide by
zero errors.
Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 04ae30ed8ee1758cd2d2af880da4d28f74c67738.1481132150.git.alistair.francis@xilinx.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We can't use LOAD AND TEST for unsigned data and then expect to
extract the result with ADD LOGICAL WITH CARRY. Fall through to
using COMPARE LOGICAL IMMEDIATE instead.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Merge qcrypto 2016/12/21 v2
# gpg: Signature made Thu 22 Dec 2016 10:46:17 GMT
# gpg: using RSA key 0xBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* remotes/berrange/tags/pull-qcrypto-2016-12-21-2:
crypto: add HMAC algorithms testcases
crypto: support HMAC algorithms based on nettle
crypto: support HMAC algorithms based on glib
crypto: support HMAC algorithms based on libgcrypt
crypto: add HMAC algorithms framework
configure: add CONFIG_GCRYPT_HMAC item
crypto: add 3des-ede support when using libgcrypt/nettle
cipher: fix leak on initialization error
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The new paging more is extension of IA32e mode with more additional page
table level.
It brings support of 57-bit vitrual address space (128PB) and 52-bit
physical address space (4PB).
The structure of new page table level is identical to pml4.
The feature is enumerated with CPUID.(EAX=07H, ECX=0):ECX[bit 16].
CR4.LA57[bit 12] need to be set when pageing enables to activate 5-level
paging mode.
Signed-off-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Message-Id: <20161215001305.146807-1-kirill.shutemov@linux.intel.com>
[Drop changes to target-i386/translate.c. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The syscall and sysret instructions behave a bit differently:
TF is checked after the instruction completes.
This allows the o/s to disable #DB at a syscall by adding TF to FMASK.
And then when the sysret is executed the #DB is taken "as if" the
syscall insn just completed.
Signed-off-by: Doug Evans <dje@google.com>
Message-Id: <94eb2c0bfa1c6a9fec0543057483@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Check for KVM_CAP_ADJUST_CLOCK capability KVM_CLOCK_TSC_STABLE, which
indicates that KVM_GET_CLOCK returns a value as seen by the guest at
that moment.
For new machine types, use this value rather than reading
from guest memory.
This reduces kvmclock difference on migration from 5s to 0.1s
(when max_downtime == 5s).
Signed-off-by: Marcelo Tosatti <mtosatti@redhat.com>
Message-Id: <20161121105052.598267440@redhat.com>
[Add comment explaining what is going on. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When a scsi-disk object receives VERIFY command with BYTCHK bit being zero,
scsi_block_is_passthrough returns false and finally makes req being proceeded
by scsi_block_dma_command. Because scsi_block_dma_command has removed process
of VERIFY, QEMU will abort in this function.
Reported-by: Junlian Bell <zhongjun@sangfor.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The patch is to fix the confusing assert fail message caused by
un-initialized device structure (from bite sized tasks).
The bug can be reproduced by
./qemu-system-x86_64 -nographic -device cfi.pflash01
The CFI hardware is dynamically loaded by QOM realizing mechanism,
however the realizing function in pflash_cfi01_realize function
requires the device being initialized manually before calling, like
./qemu-system-x86_64 -nographic
-device cfi.pflash01,num-blocks=1024,sector-length=4096,name=testcard
Once the initializing parameters are left off in the command, it will
leave the device structure not initialized, which makes
pflash_cfi01_realize try to realize a zero-volume card, causing
/mnt/EXT_volume/projects/qemu/qemu-dev/exec.c:1378:
find_ram_offset: Assertion `size != 0\' failed.
Through my test, at least the flash device's block-number, sector-length
and its name is needed for pflash_cfi01_realize to behave correctly. So
I think the new asserts are needed to hint the QEMU user to specify
the device's parameters correctly.
Signed-off-by: Ziyue Yang <skiver.cloud.yzy@gmail.com>
Message-Id: <1481810693-13733-1-git-send-email-skiver.cloud.yzy@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Ziyue Yang <yzylivezh@hotmail.com>
get_opt_value() truncates the value at the first comma
Use memcpy() instead so that -append works correctly in the
presence of commas. For -initrd to work right, instead,
unescape the module filename and parameters with get_opt_value()
before calling mb_add_cmdline().
Signed-off-by: Vlad Lungu <vlad.lungu@windriver.com>
Message-Id: <1481805124-16242-1-git-send-email-vlad.lungu@windriver.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The remote protocol can't handle flipping back and forth
between 32-bit and 64-bit regs. To compensate, pretend "as if"
on 64-bit cpu when in 32-bit mode.
Signed-off-by: Doug Evans <dje@google.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <001a113dca8274572005406e03c3@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This avoids taking the active_timers_lock or resetting/setting the
timers_done_ev if there are no active timers. This removes a small
(2-3%) source of overhead for dataplane. The list is then checked
again inside the lock, or a NULL pointer could be dereferenced.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These will be used more as soon as the acquire/release is pushed down to
the ioeventfd handlers.
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Really rule chaining is not a particularly expensive task, since
GNU Make caches the directory listing. However it is easy to
avoid it for most files and for phony targets (one was missing).
After this patch, only "Makefile", "scripts/hxtool" and
"scripts/create_config" attempt to use chained rules.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Unnesting variables spends a lot of time parsing and executing foreach
and if functions. Because actually very few variables have to be
saved and restored, a good strategy is to remember what has to be done
in load-vars, and only iterate the right variables in load-vars.
For save-vars, unroll the foreach loop to provide another small
improvement.
This speeds up a "noop" build from around 15.5 seconds on my laptop
to 11.7 (25% roughly).
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When the Intel 6300ESB watchdog is hot unplug. The timer allocated
in realize isn't freed thus leaking memory leak. This patch avoid
this through adding the exit function.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-Id: <583cde9c.3223ed0a.7f0c2.886e@mx.google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Device models often have to perform multiple access to a single
memory region that is known in advance, but would to use "DMA-style"
functions instead of address_space_map/unmap. This can happen
for example when the data has to undergo endianness conversion.
Introduce a new data structure to cache the result of
address_space_translate without forcing usage of a host address
like address_space_map does.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This extracts the common part of address_space_map and
address_space_cache_init into a new function.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Templatize the address_space_* and *_phys functions, so that we can add
similar functions in the next patch that work with a lightweight,
cache-like version of address_space_map/unmap.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do them right before the next patch generalizes them into a multi-included
file.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch add nettle-backed HMAC algorithms support
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch add glib-backed HMAC algorithms support
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch add HMAC algorithms based on libgcrypt support
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch introduce HMAC algorithms framework.
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This item will be used for support libcrypt-backed HMAC algorithms.
Support for hmac has been added in Libgcrypt 1.6.0, but we cannot
use pkg-config to get libcrypt's version. However we can make a
in configure to know whether current libcrypt support hmac.
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Libgcrypt and nettle support 3des-ede, so this patch add 3des-ede
support when using libgcrypt or nettle.
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
On error path, ctx may be leaked. Assign ctx earlier, and call
qcrypto_cipher_free() on error.
Spotted thanks to ASAN.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The blocksize option is defined in RFC 1783 and RFC 2348.
We now support block sizes between 1 and 1428 bytes, instead of 512 only.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
We've currently got 18 architectures in QEMU, and thus 18 target-xxx
folders in the root folder of the QEMU source tree. More architectures
(e.g. RISC-V, AVR) are likely to be included soon, too, so the main
folder of the QEMU sources slowly gets quite overcrowded with the
target-xxx folders.
To disburden the main folder a little bit, let's move the target-xxx
folders into a dedicated target/ folder, so that target-xxx/ simply
becomes target/xxx/ instead.
Acked-by: Laurent Vivier <laurent@vivier.eu> [m68k part]
Acked-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de> [tricore part]
Acked-by: Michael Walle <michael@walle.cc> [lm32 part]
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com> [s390x part]
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com> [s390x part]
Acked-by: Eduardo Habkost <ehabkost@redhat.com> [i386 part]
Acked-by: Artyom Tarasenko <atar4qemu@gmail.com> [sparc part]
Acked-by: Richard Henderson <rth@twiddle.net> [alpha part]
Acked-by: Max Filippov <jcmvbkbc@gmail.com> [xtensa part]
Reviewed-by: David Gibson <david@gibson.dropbear.id.au> [ppc part]
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com> [crisµblaze part]
Acked-by: Guan Xuetao <gxt@mprc.pku.edu.cn> [unicore32 part]
Signed-off-by: Thomas Huth <thuth@redhat.com>
This patch makes virtio-gpu track host memory allocations for ressources
and applies a limit (configurable 256M by default). When exceeding the
limit virtio-gpu throws VIRTIO_GPU_RESP_ERR_OUT_OF_MEMORY errors (like
it already does today when pixman image allocations fail).
This patch covers 2d mode only. For 3d mode we have to figure how we
are going to handle this best. qemu doesn't track resources in case
virglrenderer is used, so I guess we should extend virglrenderer to
allow setting a limit, then let qemu set the limit and catch
virgl_renderer_resource_create failures.
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Cc: Dave Airlie <airlied@redhat.com>
Cc: 李强 <liqiang6-s@360.cn>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 1480423356-22255-1-git-send-email-kraxel@redhat.com
Virtio GPU device while processing 'VIRTIO_GPU_CMD_GET_CAPSET'
command, retrieves the maximum capabilities size to fill in the
response object. It continues to fill in capabilities even if
retrieved 'max_size' is zero(0), thus resulting in OOB access.
Add check to avoid it.
Reported-by: Zhenhao Hong <zhenhaohong@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20161214070156.23368-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This patch fixes a cross-version migration regression introduced
by commit d1b4259f ("virtio-bus: Plug devices after features are
negotiated").
The problem is encountered when host's vhost backend does not support
VIRTIO_F_VERSION_1, and migration is initiated from a v2.7 or prior
machine with virtio-pci modern capabilities enabled to a v2.8 machine.
In this case, modern capabilities get exposed to the guest by the source,
whereas the target will detect version 1 is not supported so will only
expose legacy capabilities.
The problem is fixed by introducing a new "x-ignore-backend-features"
property, which is set in v2.7 and prior compatibility modes. Doing this,
v2.7 machine keeps its broken behaviour (enabling modern while version
is not supported), and newer machines will behave correctly.
Reported-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Message-id: 20161214163035.3297-1-maxime.coquelin@redhat.com
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Cornelia Huck <cornelia.huck@de.ibm.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The "Copy" menu item copies VTE terminal text to the clipboard. This
only works with VTE terminals, not with graphics consoles.
Disable the menu item when the current notebook page isn't a VTE
terminal.
This patch fixes a segfault. Reproducer: Start QEMU and click the Copy
menu item when the guest display is visible.
Reported-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Tested-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20161214142518.10504-1-stefanha@redhat.com
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
We intentionally renamed 'debug-level' to 'debug' in the QMP
schema for 'blockdev-add' related to gluster, in order to
match the command line (commit 1a417e46). However, since
'debug-level' was visible in 2.7, that means that we should
document that 'debug' was not available until 2.8.
The change was intentional because 'blockdev-add' itself
underwent incompatible changes (such as commit 0153d2f) for
the same release; our intent is that after 2.8, these
interfaces will now be stable. [In hindsight, we should have
used the name x-blockdev-add when we first introduced it]
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20161206182020.25736-1-eblake@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
A bug (1647683) was reported showing a crash when removing
breakpoints. The reproducer was bisected to 3359baad when tb_flush
was finally made thread safe. While in MTTCG the locking in
breakpoint_invalidate would have prevented any problems, but
currently tb_lock() is a NOP for system emulation.
The race is between a tb_flush from the gdbstub and the
tb_invalidate_phys_addr() in breakpoint_invalidate().
Ideally we'd have actual locking here; for the moment the
simple fix is to do a full tb_flush() for a bp invalidate,
since that is thread-safe even if no lock is taken.
Reported-by: Julian Brown <julian@codesourcery.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1481047629-7763-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The qcow2_make_empty() function is reached during 'qemu-img commit',
in order to clear out ALL clusters of an image. However, if the
image cannot use the fast code path (true if the image is format
0.10, or if the image contains a snapshot), the cluster size is
larger than 512, and the image is larger than 2G in size, then our
choice of sector_step causes problems. Since it is not cluster
aligned, but qcow2_discard_clusters() silently ignores an unaligned
head or tail, we are leaving clusters allocated.
Enhance the testsuite to expose the flaw, and patch the problem by
ensuring our step size is aligned.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
# gpg: Signature made Tue 06 Dec 2016 02:24:23 AM GMT
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* jasowang/tags/net-pull-request:
fsl_etsec: Fix various small problems in hexdump code
fsl_etsec: Pad short payloads with zeros
net: mcf: check receive buffer size register value
Message-id: 1480991552-14360-1-git-send-email-jasowang@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Fix various small problems in hexdump code, such as:
- Reference to non-existing field etsec->nic->nc.name is replaced
with nc->name
- Type mismatch warnings
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Document:
1. The new debug and logfile options with their usages
2. New json format and its usage and
3. update "GlusterFS, Device URL Syntax" section in "Invocation"
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
The QMP definition of BlockdevOptionsNfs:
{ 'struct': 'BlockdevOptionsNfs',
'data': { 'server': 'NFSServer',
'path': 'str',
'*user': 'int',
'*group': 'int',
'*tcp-syn-count': 'int',
'*readahead-size': 'int',
'*page-cache-size': 'int',
'*debug-level': 'int' } }
To make this consistent with other block protocols like gluster, lets
change s/debug-level/debug/
Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
The QMP definition of BlockdevOptionsGluster:
{ 'struct': 'BlockdevOptionsGluster',
'data': { 'volume': 'str',
'path': 'str',
'server': ['GlusterServer'],
'*debug-level': 'int',
'*logfile': 'str' } }
But instead of 'debug-level we have exported 'debug' as the option for choosing
debug level of gluster protocol driver.
This patch fix QMP definition BlockdevOptionsGluster
s/debug-level/debug/
Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
While testing rth's latest TCG patches with risu I found ldaxp was
broken. Investigating further I found it was broken by 1dd089d0 when
the cmpxchg atomic work was merged. As part of that change the code
attempted to be clever by doing a single 64 bit load and then shuffle
the data around to set the two 32 bit registers.
As I couldn't quite follow the endian magic I've simply partially
reverted the change to the original code gen_load_exclusive code. This
doesn't affect the cmpxchg functionality as that is all done on in
gen_store_exclusive part which is untouched.
I've also restored the comment that was removed (with a slight tweak
to mention cmpxchg).
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Acked-by: Richard Henderson <rth@twiddle.net>
Message-id: 20161202173454.19179-1-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The qobject_from_jsonf() function implements a pseudo-printf
language for creating a QObject; however, it is hard-coded to
only parse a subset of formats understood by -Wformat, and is
not a straight synonym to bare printf(). In particular, any
use of an int64_t integer works only if the system's
definition of PRId64 matches what the parser expects; which
works on glibc (%lld or %ld depending on 32- vs. 64-bit) and
mingw (%I64d), but not on Mac OS (%qd). Rather than enhance
the parser, it is just as easy to force the use of int (where
the value is small enough) or long long instead of int64_t,
which we know always works.
This should cover all remaining testsuite uses of
qobject_from_json[fv]() that were trying to rely on PRId64,
although my proof for that was done by adding in asserts and
checking that 'make check' still passed, where such asserts
are inappropriate during hard freeze. A later series in 2.9
may remove all dynamic JSON parsing, but that's a bigger task.
Reported by: G 3 <programmingkidx@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1479922617-4400-4-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Rename value64 to value_ll]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The qobject_from_jsonv() function implements a pseudo-printf
language for creating a QObject; however, it is hard-coded to
only parse a subset of formats understood by -Wformat, and is
not a straight synonym to bare printf(). In particular, any
use of an int64_t integer works only if the system's
definition of PRId64 matches what the parser expects; which
works on glibc (%lld or %ld depending on 32- vs. 64-bit) and
mingw (%I64d), but not on Mac OS (%qd). Rather than enhance
the parser, it is just as easy to use normal printf() for
this particular conversion, matching what is done elsewhere
in this file [1], which is safe in this instance because the
format does not contain any of the problematic differences
(bare '%' or the '%s' format).
The use of PRId64 for a variable named 'pid' is gross, but it
is a sad reality of the 64-bit mingw environment, which
mistakenly defines pid_t as a 64-bit type even though getpid()
returns 'int' on that platform [2]. Our definition of the
QGA GuestExec type defines 'pid' as a 64-bit entity, and we
can't tighten it to 'int32' unless the mingw header is fixed.
Using 'long long' instead of 'int64_t' just so that we can
stick with qobject_from_jsonv("%lld") instead of printf() is
not any prettier, since we may have later type churn anyways.
[1] see 'git grep -A2 strdup_printf tests/test-qga.c'
[2] https://bugzilla.redhat.com/show_bug.cgi?id=1397787
Reported by: G 3 <programmingkidx@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1479922617-4400-3-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The qobject_from_jsonf() function implements a pseudo-printf
language for creating a QObject; however, it is hard-coded to
only parse a subset of formats understood by -Wformat, and is
not a straight synonym to bare printf(). In particular, any
use of an int64_t integer works only if the system's
definition of PRId64 matches what the parser expects; which
works on glibc (%lld or %ld depending on 32- vs. 64-bit) and
mingw (%I64d), but not on Mac OS (%qd). Rather than enhance
the parser, it is just as easy to use 'long long', which we
know always works. There are few enough callers of
qobject_from_json[fv]() that it is easy to audit that this is
the only non-testsuite caller that was actually relying on
this particular conversion.
Reported by: G 3 <programmingkidx@gmail.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1479922617-4400-2-git-send-email-eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Cast tv.tv_sec, tv.tv_usec to long long for type correctness]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
In Cirrus CLGD 54xx VGA Emulator, if cirrus graphics mode is VGA,
'cirrus_get_bpp' returns zero(0), which could lead to a divide
by zero error in while copying pixel data. The same could occur
via blit pitch values. Add check to avoid it.
Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1476776717-24807-1-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Depending on QEMU network setup it is possible for us to receive a
complete Ethernet packet that is less 64 bytes long. One such example is
when QEMU is configured to use a standalone TAP device (not set to be a
part of any bridge) receives and ARP packet. In cases like that we need
to add more than just 4-bytes of CRC padding and ensure that our payload
is at least 60 bytes long, such that, when combined with CRC padding
bytes the resulting size is at least 802.3 minimum MTU bytes
long (64). Failing to do that results in code in etsec_walk_rx_ring()
setting BD_RX_SH which, in turn, makes corresponding Linux driver of
emulated host to reject buffer as a runt packet
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
ColdFire Fast Ethernet Controller uses a receive buffer size
register(EMRBR) to hold maximum size of all receive buffers.
It is set by a user before any operation. If it was set to be
zero, ColdFire emulator would go into an infinite loop while
receiving data in mcf_fec_receive. Add check to avoid it.
Reported-by: Wjjzhang <wjjzhang@tencent.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
In update_cursor_data_virgl function, if the 'width'/ 'height'
is not equal to current cursor's width/height it will return
without free the 'data' allocated previously. This will lead
a memory leak issue. This patch fix this issue.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 58187760.41d71c0a.cca75.4cb9@mx.google.com
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
In virgl_cmd_get_capset_info dispatch function, the 'resp' hasn't
been full initialized before writing to the guest. This will leak
the 'resp.padding' and 'resp.hdr.padding' fieds to the guest. This
patch fix this issue.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 5818661e.0860240a.77264.7a56@mx.google.com
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Currently if the client keeps sending the same monitor config to
QEMU/spice-server, QEMU will always raise
a QXL_INTERRUPT_CLIENT_MONITORS_CONFIG regardless of whether there was a
change or not.
Guest-side (with fedora 25), the kernel QXL KMS driver will also forward the
event to user-space without checking if there were actual changes.
Next in line are gnome-shell/mutter (on a default f25 install), which
will try to reconfigure everything without checking if there is anything
to do.
Where this gets ugly is that when applying the resolution changes,
gnome-shell/mutter will call drmModeRmFB, drmModeAddFB, and
drmModeSetCrtc, which will cause the primary surface to be destroyed and
recreated by the QXL KMS driver. This in turn will cause the client to
resend a client monitors config message, which will cause QEMU to reemit
an interrupt with an unchanged monitors configuration, ...
This causes https://bugzilla.redhat.com/show_bug.cgi?id=1266484
This commit makes sure that we only emit
QXL_INTERRUPT_CLIENT_MONITORS_CONFIG when there are actual configuration
changes the guest should act on.
Signed-off-by: Christophe Fergeau <cfergeau@redhat.com>
Message-id: 20161028144840.18326-1-cfergeau@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Needed to emit FPU exception on Loongson multimedia instructions
executing if Status:CU1 is clear. or FPR changes may be missed
on Linux.
Signed-off-by: Heiher <wangr@lemote.com>
Signed-off-by: Fuxin Zhang <zhangfx@lemote.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
ppc patch queue 2016-12-01
Just a single migration / hotplug fix in this set. I believe it's
important enough to go in this late in the 2.8 release process.
# gpg: Signature made Thu 01 Dec 2016 04:43:49 AM GMT
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* dgibson/tags/ppc-for-2.8-20161201:
spapr: fix default DRC state for coldplugged LMBs
Message-id: 20161201044441.14365-1-david@gibson.dropbear.id.au
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Currently we set the initial isolation/allocation state for DRCs
associated with coldplugged LMBs to ISOLATED/UNUSABLE,
respectively, under the assumption that the guest will move this
state to UNISOLATED/USABLE.
In fact, this is only the case for LMBs added via hotplug. For
coldplugged LMBs, the guest actually assumes the initial state to
be UNISOLATED/USABLE.
In practice, this only becomes an issue when we attempt to unplug
one of these LMBs, where the guest kernel will issue an
rtas-get-sensor-state call to check that the corresponding DRC is
in an USABLE state before it will release the LMB back to
QEMU. If the returned state is otherwise, the guest will assume no
further action is needed, which bypasses the QEMU-side cleanup that
occurs during the USABLE->UNUSABLE transition. This results in
LMBs and their corresponding pc-dimm devices to stick around
indefinitely.
This patch fixes the issue by manually setting DRCs associated with
cold-plugged LMBs to UNISOLATED/ALLOCATED, but leaving the hotplug
state untouched. As it turns out, this is analogous to the handling
for cold-plugged CPUs in spapr_core_plug().
Cc: qemu-ppc@nongnu.org
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Cc: Greg Kurz <gkurz@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Though crypto_cfg.reserve is an unused field, let me
initialize the structure in order to make coverity happy.
*** CID 1365923: Uninitialized variables (UNINIT)
/hw/virtio/virtio-crypto.c: 851 in virtio_crypto_get_config()
845 stl_le_p(&crypto_cfg.mac_algo_h, c->conf.mac_algo_h);
846 stl_le_p(&crypto_cfg.aead_algo, c->conf.aead_algo);
847 stl_le_p(&crypto_cfg.max_cipher_key_len, c->conf.max_cipher_key_len);
848 stl_le_p(&crypto_cfg.max_auth_key_len, c->conf.max_auth_key_len);
849 stq_le_p(&crypto_cfg.max_size, c->conf.max_size);
850
>>> CID 1365923: Uninitialized variables (UNINIT)
>>> Using uninitialized value "crypto_cfg". Field "crypto_cfg.reserve"
is uninitialized when calling "memcpy".
[Note: The source code implementation of the function
has been overridden by a builtin model.]
851 memcpy(config, &crypto_cfg, c->config_size);
852 }
853
Rported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to ISO C99 / N1256 (referenced in HACKING):
> 6.5.8 Relational operators
>
> 4 For the purposes of these operators, a pointer to an object that is
> not an element of an array behaves the same as a pointer to the first
> element of an array of length one with the type of the object as its
> element type.
>
> 5 When two pointers are compared, the result depends on the relative
> locations in the address space of the objects pointed to. If two
> pointers to object or incomplete types both point to the same object,
> or both point one past the last element of the same array object, they
> compare equal. If the objects pointed to are members of the same
> aggregate object, pointers to structure members declared later compare
> greater than pointers to members declared earlier in the structure,
> and pointers to array elements with larger subscript values compare
> greater than pointers to elements of the same array with lower
> subscript values. All pointers to members of the same union object
> compare equal. If the expression /P/ points to an element of an array
> object and the expression /Q/ points to the last element of the same
> array object, the pointer expression /Q+1/ compares greater than /P/.
> In all other cases, the behavior is undefined.
Our AddressSpace objects are allocated generally individually, and kept in
the "address_spaces" linked list, so we mustn't compare their addresses
with relops.
Convert the pointers subjected to the relop in rom_order_compare() to
"uintptr_t":
> 7.18.1.4 Integer types capable of holding object pointers
>
> 1 [...]
>
> The following type designates an unsigned integer type with the
> property that any valid pointer to void can be converted to this type,
> then converted back to pointer to void, and the result will compare
> equal to the original pointer:
>
> /uintptr_t/
>
> These types are optional.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-devel@nongnu.org
Fixes: 3e76099aac
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* Commit 3e76099aac ("loader: Allow a custom AddressSpace when loading
ROMs") introduced the "Rom.as" field:
(1) It modified the utility callers of rom_insert() to take "as" as a
new parameter from *their* callers, and set "rom->as" from that
parameter. The functions covered were rom_add_file() and
rom_add_elf_program().
(2) It also modified rom_insert() itself, to auto-assign
"&address_space_memory", in case the external caller passed -- and
the utility caller forwarded -- as=NULL.
Except, commit 3e76099aac forgot to update the third utility caller of
rom_insert(), under point (1), namely rom_add_blob().
* Later, commit 5e774eb3bd ("loader: Add AddressSpace loading support
to uImages") added the load_uimage_as() function, and the
rom_add_blob_fixed_as() function-like macro, with the necessary changes
elsewhere to propagate the new "as" parameter to rom_add_blob():
load_uimage_as()
load_uboot_image()
rom_add_blob_fixed_as()
rom_add_blob()
At this point, the signature (and workings) of rom_add_blob() had been
broken already, and the rom_add_blob_fixed_as() macro passed its "_as"
parameter to rom_add_blob() as "callback_opaque". Given that the
"fw_callback" parameter itself was set to NULL (correctly), this did no
additional damage (the opaque arg would never be used), but ultimately
it broke the new functionality of load_uimage_as().
* The load_uimage_as() function would be put to use in one of the later
patches, commit e481a1f63c ("generic-loader: Add a generic loader").
* We can fix this only in a unified patch now. Append "AddressSpace *as"
to the signature of rom_add_blob(), and handle the new parameter. Pass
NULL from all current callers, except from rom_add_blob_fixed_as(),
where "_as" has to be bumped to the proper position.
* Note that rom_add_file() rejects the case when both "mr" and "as" are
passed in as non-NULL. The action that this is apparently supposed to
prevent is the
rom->mr = mr;
assignment (that's the only place where the "mr" parameter is used in
rom_add_file()). In rom_add_blob() though, we have no "mr" parameter,
and the actions done on the fw_cfg branch:
if (fw_file_name && fw_cfg) {
if (mc->rom_file_has_mr) {
data = rom_set_mr(rom, OBJECT(fw_cfg), devpath);
mr = rom->mr;
} else {
data = rom->data;
}
reflect those that are performed by rom_add_file() too (with mr==NULL):
if (rom->fw_file && fw_cfg) {
if ((!option_rom || mc->option_rom_has_mr) &&
mc->rom_file_has_mr) {
data = rom_set_mr(rom, OBJECT(fw_cfg), devpath);
} else {
data = rom->data;
}
Hence we need no additional restrictions in rom_add_blob().
* Stable is not affected as both problematic commits appeared first in
v2.8.0-rc0.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Michael Walle <michael@walle.cc>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Shannon Zhao <zhaoshenglong@huawei.com>
Cc: qemu-arm@nongnu.org
Cc: qemu-devel@nongnu.org
Fixes: 3e76099aac
Fixes: 5e774eb3bd
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
"mask" needs to be inverted before use.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Block layer patches for 2.8.0-rc2
# gpg: Signature made Tue 29 Nov 2016 03:16:10 PM GMT
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* kwolf/tags/for-upstream:
docs: Specify that cache-clean-interval is only supported in Linux
qcow2: Remove stale comment
qcow2: Allow 'cache-clean-interval' in Linux only
qcow2: Make qcow2_cache_table_release() work only in Linux
Message-id: 1480436227-2211-1-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Building qemu fails in distributions where gcc enables PIE by default
(e.g. Debian unstable) with:
/usr/bin/ld: -r and -pie may not be used together
You have to use -r instead of -Wl,-r to avoid gcc passing -pie to the linker
when PIE is enabled and a relocatable object is passed. However, clang
does not know about -r, so try -Wl,-r first.
[This is a fix for commit c96f0ee6a6
("rules.mak: Use -r instead of -Wl, -r to fix building when PIE is
default") which mostly worked but broke the ./configure --enable-modules
build with clang.
--Stefan]
Reported-by: Adrian Bunk <bunk@stusta.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161129153720.29747-1-pbonzini@redhat.com
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Small fixes for rc2.
# gpg: Signature made Mon 28 Nov 2016 03:45:20 PM GMT
# gpg: using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* bonzini/tags/for-upstream:
rules.mak: Use -r instead of -Wl, -r to fix building when PIE is default
migration/pcspk: Turn migration of pcspk off for 2.7 and older
migration/pcspk: Add a property to state if pcspk is migrated
pci-assign: sync MSI/MSI-X cap and table with PCIDevice
megasas: clean up and fix request completion/cancellation
megasas: do not call pci_dma_unmap after having freed the frame once
Message-id: 1480372837-109736-1-git-send-email-pbonzini@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
An hbitmap's granularity may be anything from 0 to 63, so when shifting
constants by its value, they should not be plain ints.
Even having changed the types, hbitmap_serialization_granularity() still
tries to shift 64 to the right by the granularity. This operation is
undefined if the granularity is greater than 57. Adding an assertion is
fine for now, because serializing is done only in tests so far, but this
means that only bitmaps with a granularity below 58 can be serialized
and we should thus add a hbitmap_is_serializable() function later.
One of the two places touched in this patch uses
QEMU_ALIGN_UP(x, 1 << y). We can use ROUND_UP() there, since the second
parameter is obviously a power of two.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20161115224732.1334-1-mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
target-arm queue:
* hw/arm/boot: fix crash handling device trees with no /chosen
or /memory nodes
* generic-loader: only set PC if a CPU is specified
# gpg: Signature made Mon 28 Nov 2016 01:47:21 PM GMT
# gpg: using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* pm215/tags/pull-target-arm-20161128:
arm: Create /chosen and /memory devicetree nodes if necessary
generic-loader: file: Only set a PC if a CPU is specified
Message-id: 1480341071-5367-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
There's no way to communicate back read data, so only writes can ever
be usefully specified. Ignore the field, paving the road for eventually
re-using the bit for something else in a few (many?) years time.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
There's no point setting fields always receiving the same value on each
iteration, as handle_ioreq() doesn't alter them anyway. Set state and
count once ahead of the loop, drop the redundant clearing of
data_is_ptr, and avoid the meaningless (because count is 1) setting of
df altogether.
Also avoid doing an unsigned long calculation of size when the field to
be initialized is only 32 bits wide (and the shift value in the range
0...3).
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
We should not consume the second slot if it didn't get written yet.
Normal writers - i.e. Xen - would not update write_pointer between the
two writes, but the page may get fiddled with by the guest itself, and
we're better off avoiding to enter an infinite loop in that case.
Reported-by: yanghongke <yanghongke@huawei.com>
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Reviewed-by: Paul Durrant <paul.durrant@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Building qemu fails in distributions where gcc enables PIE by default
(e.g. Debian unstable) with:
/usr/bin/ld: -r and -pie may not be used together
Use -r instead of -Wl,-r to avoid gcc passing -pie to the linker
when PIE is enabled and a relocatable object is passed.
Signed-off-by: Adrian Bunk <bunk@stusta.de>
Message-Id: <20161127162817.15144-1-bunk@stusta.de>
Cc: qemu-stable@nongnu.org
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since commit e1d4fb2d ("kvm-irqchip: x86: add msi route notify fn"),
kvm_irqchip_add_msi_route() starts to use pci_get_msi_message() to fetch
MSI info. This requires that we setup MSI related fields in PCIDevice.
For most devices, that won't be a problem, as long as we are using
general interfaces like msi_init()/msix_init().
However, for pci-assign devices, MSI/MSI-X is treated differently - PCI
assign devices are maintaining its own MSI table and cap information in
AssignedDevice struct. however that's not synced up with PCIDevice's
fields. That will leads to pci_get_msi_message() failed to find correct
MSI capability, even with an NULL msix_table.
A quick fix is to sync up the two places: both the capability bits and
table address for MSI/MSI-X.
Reported-by: Changlimin <changlimin@h3c.com>
Tested-by: Changlimin <changlimin@h3c.com>
Cc: qemu-stable@nongnu.org
Fixes: e1d4fb2d ("kvm-irqchip: x86: add msi route notify fn")
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1480042522-16551-1-git-send-email-peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
megasas_command_cancel is a callback; it should report the abort in
the frame, not try another abort! Compare for instance with
mptsas_request_cancelled.
So extract the common bits for request completion in a new function
megasas_complete_command, call it from both the .complete and .cancel
callbacks, and remove duplicate pieces from the DCMD path.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20161110152751.4267-2-pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 8cc4678 ("megasas: remove useless check for cmd->frame", 2016-07-17) was
wrong because I trusted Coverity too much. It turns out that there _is_ a
path through which cmd->frame can become NULL. After megasas_handle_frame's
switch (md->frame->header.frame_cmd), megasas_init_firmware can be called.
From there, megasas_reset_frames will call megasas_unmap_frame which resets
cmd->frame = NULL.
However, there is another bug to fix in there, because megasas_unmap_frame
is called again after setting the command status. In this case QEMU should
not do anything, instead it calls pci_dma_unmap again. Harmless, but
better fix it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make it clear that having Linux is a hard requirement for this
feature.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The cache-clean-interval option of qcow2 only works on Linux. However
we allow setting it in other systems regardless of whether it works or
not.
In those systems this option is not simply a no-op: it actually
invalidates perfectly valid cache tables for no good reason without
freeing their memory.
This patch forbids using that option in non-Linux systems.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are using QEMU_MADV_DONTNEED to discard the memory of individual L2
cache tables. The problem with this is that those semantics are
specific to the Linux madvise() system call. Other implementations of
madvise() (including the very Linux implementation of posix_madvise())
don't do that, so we cannot use them for the same purpose.
This patch makes the code Linux-specific and uses madvise() directly
since there's no point in going through qemu_madvise() for this.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
"The multiplier and multiplicand are both word operands, and the result
is a long-word operand."
So compute flags on a long-word result, not on a word result.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
This pull request fixes some leaks (memory, fd) in the handle and proxy
backends.
# gpg: Signature made Wed 23 Nov 2016 12:53:41 PM GMT
# gpg: using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg: aka "Greg Kurz <groug@free.fr>"
# gpg: aka "Greg Kurz <gkurz@fr.ibm.com>"
# gpg: aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg: aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg: aka "Gregory Kurz (Cimai Technology) <gkurz@cimai.com>"
# gpg: aka "Gregory Kurz (Meiosys Technology) <gkurz@meiosys.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894 DBA2 02FC 3AEB 0101 DBC2
* gkurz/tags/for-upstream:
9pfs: add cleanup operation for proxy backend driver
9pfs: add cleanup operation for handle backend driver
9pfs: add cleanup operation in FileOperations
9pfs: adjust the order of resource cleanup in device unrealize
Message-id: 1479920298-24983-1-git-send-email-groug@kaod.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
"The size of the operation can be specified as word or long.
Word length source operands are sign-extended to 32 bits for
comparison."
So comparison is always done using OS_LONG.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
opcodes of "EXG Ax,Ay" and "EXG Dx,Dy" have been swapped
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
The guest sends discard requests as u64 sector/count pairs, but the
block layer operates internally with s64/s32 pairs. The conversion
leads to IO errors in the guest, the discard request is not processed.
domU.cfg:
'vdev=xvda, format=qcow2, backendtype=qdisk, target=/x.qcow2'
domU:
mkfs.ext4 -F /dev/xvda
Discarding device blocks: failed - Input/output error
Fix this by splitting the request into chunks of BDRV_REQUEST_MAX_SECTORS.
Add input range checking to avoid overflow.
Fixes f313520 ("xen_disk: add discard support")
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
In the init operation of proxy backend dirver, it allocates a
V9fsProxy struct and some other resources. We should free these
resources when the 9pfs device is unrealized. This is what this
patch does.
Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
In the init operation of handle backend dirver, it allocates a
handle_data struct and opens a mount file. We should free these
resources when the 9pfs device is unrealized. This is what this
patch does.
Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Currently, the backend of VirtFS doesn't have a cleanup
function. This will lead resource leak issues if the backed
driver allocates resources. This patch addresses this issue.
Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Unrealize should undo things that were set during realize in
reverse order. So should do in the error path in realize.
Signed-off-by: Li Qiang <liq3ea@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
ppc patch queue 2016-11-23
Here's the first set of 2.8 hard freeze bugfixes for ppc.
The biggest thing here is a batch of fixes for migration breakages in
both 2.7 and current 2.8. Alas, there is at least one more migration
problem, which prevents memory unplug after a migration. I hoped to
include a fix for that here, but it turned out to have some problems
bigger than those it was solving. So, I expect at least one more hard
freeze pull request.
There are also a few other assorted bug fixes.
# gpg: Signature made Wed 23 Nov 2016 02:25:42 AM GMT
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* dgibson/tags/ppc-for-2.8-20161123:
spapr: Fix 2.7<->2.8 migration of PCI host bridge
Revert "spapr: Fix migration of PCI host bridges from qemu-2.7"
target-ppc: Allow eventual removal of old migration mistakes
migration: Add VMSTATE_UINTTL_TEST()
target-ppc: Fix CPU migration from qemu-2.6 <-> later versions
ppc: Make uninorth interrupt swizzling identical to Grackle
target-ppc: fix index array of national digits
hw/char/spapr_vty: Return amount of free buffer entries in vty_can_receive()
ppc: BOOK3E: nothing should be done when MSR:PR is set
spapr: migration support for CAS-negotiated option vectors
tests/postcopy: Use KVM on ppc64 only if it is KVM-HV
Message-id: 1479869383-16162-1-git-send-email-david@gibson.dropbear.id.au
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
daa2369 "spapr_pci: Add a 64-bit MMIO window" subtly broke migration
from qemu-2.7 to the current version. It split the device's MMIO
window into two pieces for 32-bit and 64-bit MMIO.
The patch included backwards compatibility code to convert the old
property into the new format. However, the property value was also
transferred in the migration stream and compared with a (probably
unwise) VMSTATE_EQUAL. So, the "raw" value from 2.7 is compared to
the new style converted value from (pre-)2.8 giving a mismatch and
migration failure.
Along with the actual field that caused the breakage, there are
several other ill-advised VMSTATE_EQUAL()s. To fix forwards
migration, we read the values in the stream into scratch variables and
ignore them, instead of comparing for equality. To fix backwards
migration, we populate those scratch variables in pre_save() with
adjusted values to match the old behaviour.
To permit the eventual possibility of removing this cruft from the
stream, we only include these compatibility fields if a new
'pre-2.8-migration' property is set. We clear it on the pseries-2.8
machine type, which obviously can't be migrated backwards, but set it
on earlier machine type versions.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
This reverts commit 9b54ca0ba7.
The commit above corrected a migration breakage between qemu-2.7 and
qemu-2.8. However it did so by advancing the migration version for
the PCI host bridge, which obviously breaks migration backwards to
earlier qemu versions.
Although it's not totally essential, we'd like to maintain the
possibility for backwards migration, so revert the change in
preparation for a better fix.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Until very recently, the vmstate for ppc cpus included some poorly
thought out VMSTATE_EQUAL() components, that can easily break
migration compatibility, and did so between qemu-2.6 and later
versions. A hack was recently added which fixes this migration
breakage, but it leaves the unhelpful cruft of these fields in the
migration stream.
This patch adds a new cpu property allowing these fields to be removed
from the stream entirely. For the pseries-2.8 machine type - which
comes after the fix - and for all non-pseries machine types - which
aren't mature enough to care about cross-version migration - we remove
the fields from the stream.
For pseries-2.7 and earlier, The migration hack remains in place,
allowing backwards and forwards migration with the older machine
types.
This restricts the migration compatibility cruft to older machine
types, and at least opens the possibility of eventually deprecating
and removing it entirely.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
include/migration/cpu.h defines VMSTATE_UINTTL() and several variants
for migrating target_ulong fields. It's defined in terms of
VMSTATE_UINT32() or VMSTATE_UINT64() as appropriate.
It doesn't, however, include a VMSTATE_UINTTL_TEST() variant, which
I'm going to need shortly. So, add it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
When migration for target-ppc was converted to vmstate, several
VMSTATE_EQUAL() checks were foolishly included of things that really
should be internal state. Specifically we verified equality of the
insns_flags and insns_flags2 fields, which are used within TCG to
determine which groups of instructions are available on this cpu
model. Between qemu-2.6 and qemu-2.7 we made some changes to these
classes which broke migration.
This path fixes migration both forwards and backwards. On migration
from 2.6 to later versions we import the fields into teporary
variables, which we then ignore. In migration backwards, we populate
the temporary fields from the runtime fields, but mask out the bits
which were added after qemu-2.6, allowing the VMSTATE_EQUAL in
qemu-2.6 to accept the stream.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
It's currently broken as it uses an incorrect shift, it tries
to use the slot number but uses the top bits of the bus number
instead.
Note: Neither implementation matches what OpenBIOS ends up putting
in the device-tree either, which will have to be fixed separately.
This is not quite correct for modelling a real Mac since Apple
tend to tie all 4 interrupt lines of a slot together and have
separate interrupts for every slot and every motherboard devices
going straight to the PIC but we'll sort that out later.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The can_receive() callbacks of the character devices should return
the amount of characters that can be accepted at once, not just a
boolean value (which rather means only one character at a time).
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The server architecture (BOOK3S) specifies that any instruction that
sets MSR:PR will also set MSR:EE, IR and DR.
However there is no such behavior specification for the embedded
architecture (BOOK3E).
Signed-off-by: Vladimir Svoboda <ze.vlad@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
With the additional of the OV5_HP_EVT option vector, we now have
certain functionality (namely, memory unplug) that checks at run-time
for whether or not the guest negotiated the option via CAS. Because
we don't currently migrate these negotiated values, we are unable
to unplug memory from a guest after it's been migrated until after
the guest is rebooted and CAS-negotiation is repeated.
This patch fixes this by adding CAS-negotiated options to the
migration stream. We do this using a subsection, since the
negotiated value of OV5_HP_EVT is the only option currently needed
to maintain proper functionality for a running guest.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The ppc64 postcopy test does not work with KVM-PR, and it is also
causing annoying warning messages when run on a x86 host. So let's
use KVM here only if we know that we're running with KVM-HV (which
automatically also means that we're running on a ppc64 host), and
fall back to TCG otherwise.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit fa778fff wired up support to send the NBD_CMD_WRITE_ZEROES,
but forgot to inform the block layer that FUA unmapping of zeroes is
supported. Without BDRV_REQ_MAY_UNMAP listed as a supported flag,
the block layer will always insist on the NBD layer passing
NBD_CMD_FLAG_NO_HOLE, resulting in the server always allocating
things even when it was desired to let the server punch holes.
Similarly, failing to set BDRV_REQ_FUA means that the client may
send unnecessary NBD_CMD_FLUSH when it could have instead used the
NBD_CMD_FLAG_FUA bit.
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1479413642-22463-2-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In the user emulation code path, tlb_vaddr_to_host erronesously passed
vaddr as the guest address to be translated, instead of addr, the parameter
which actually contained the guest address.
This resulted in incorrect addresses being used when emulating block copy
(mvc/mvpg) and block clear (xc) instructions for the s390x target.
Signed-off-by: Bobby Bingham <koorogi@koorogi.info>
Message-Id: <20161113050523.23909-1-koorogi@koorogi.info>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Block layer patches for 2.8.0-rc1
# gpg: Signature made Tue 22 Nov 2016 03:55:38 PM GMT
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* kwolf/tags/for-upstream:
block: Pass unaligned discard requests to drivers
block: Return -ENOTSUP rather than assert on unaligned discards
block: Let write zeroes fallback work even with small max_transfer
qcow2: Inform block layer about discard boundaries
Message-id: 1479830693-26676-1-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Attach the usb bus of a new pvusb controller to the qdev associated
with the Xen backend. Any device connected to that controller can now
specify the bus and port directly via its properties.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Create a qdev plugged to the xen-sysbus for each new backend device.
This device can be used as a parent for all needed devices of that
backend. The id of the new device will be "xen-<type>-<dev>" with
<type> being the xen backend type (e.g. "qdisk") and <dev> the xen
backend number of the type under which it is to be found in xenstore.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
In order to have an easy way to add a new qdev with a specific id
carve out the needed functionality from qdev_device_add() into a new
function qdev_set_id().
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Add a bus for Xen backend devices in order to be able to establish a
dedicated device path for pluggable devices.
Signed-off-by: Juergen Gross <jgross@suse.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
A typo prevents ISA interrupts from being recognized on cpu0,
which is where the smp kernel normally wants to see them.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Discard is advisory, so rounding the requests to alignment
boundaries is never semantically wrong from the data that
the guest sees. But at least the Dell Equallogic iSCSI SANs
has an interesting property that its advertised discard
alignment is 15M, yet documents that discarding a sequence
of 1M slices will eventually result in the 15M page being
marked as discarded, and it is possible to observe which
pages have been discarded.
Between commits 9f1963b and b8d0a980, we converted the block
layer to a byte-based interface that ultimately ignores any
unaligned head or tail based on the driver's advertised
discard granularity, which means that qemu 2.7 refuses to
pass any discard request smaller than 15M down to the Dell
Equallogic hardware. This is a slight regression in behavior
compared to earlier qemu, where a guest executing discards
in power-of-2 chunks used to be able to get every page
discarded, but is now left with various pages still allocated
because the guest requests did not align with the hardware's
15M pages.
Since the SCSI specification says nothing about a minimum
discard granularity, and only documents the preferred
alignment, it is best if the block layer gives the driver
every bit of information about discard requests, rather than
rounding it to alignment boundaries early.
Rework the block layer discard algorithm to mirror the write
zero algorithm: always peel off any unaligned head or tail
and manage that in isolation, then do the bulk of the request
on an aligned boundary. The fallback when the driver returns
-ENOTSUP for an unaligned request is to silently ignore that
portion of the discard request; but for devices that can pass
the partial request all the way down to hardware, this can
result in the hardware coalescing requests and discarding
aligned pages after all.
Reported by: Peter Lieven <pl@kamp.de>
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Right now, the block layer rounds discard requests, so that
individual drivers are able to assert that discard requests
will never be unaligned. But there are some ISCSI devices
that track and coalesce multiple unaligned requests, turning it
into an actual discard if the requests eventually cover an
entire page, which implies that it is better to always pass
discard requests as low down the stack as possible.
In isolation, this patch has no semantic effect, since the
block layer currently never passes an unaligned request through.
But the block layer already has code that silently ignores
drivers that return -ENOTSUP for a discard request that cannot
be honored (as well as drivers that return 0 even when nothing
was done). But the next patch will update the block layer to
fragment discard requests, so that clients are guaranteed that
they are either dealing with an unaligned head or tail, or an
aligned core, making it similar to the block layer semantics of
write zero fragmentation.
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit 443668ca rewrote the write_zeroes logic to guarantee that
an unaligned request never crosses a cluster boundary. But
in the rewrite, the new code assumed that at most one iteration
would be needed to get to an alignment boundary.
However, it is easy to trigger an assertion failure: the Linux
kernel limits loopback devices to advertise a max_transfer of
only 64k. Any operation that requires falling back to writes
rather than more efficient zeroing must obey max_transfer during
that fallback, which means an unaligned head may require multiple
iterations of the write fallbacks before reaching the aligned
boundaries, when layering a format with clusters larger than 64k
atop the protocol of file access to a loopback device.
Test case:
$ qemu-img create -f qcow2 -o cluster_size=1M file 10M
$ losetup /dev/loop2 /path/to/file
$ qemu-io -f qcow2 /dev/loop2
qemu-io> w 7m 1k
qemu-io> w -z 8003584 2093056
In fairness to Denis (as the original listed author of the culprit
commit), the faulty logic for at most one iteration is probably all
my fault in reworking his idea. But the solution is to restore what
was in place prior to that commit: when dealing with an unaligned
head or tail, iterate as many times as necessary while fragmenting
the operation at max_transfer boundaries.
Reported-by: Ed Swierk <eswierk@skyportsystems.com>
CC: qemu-stable@nongnu.org
CC: Denis V. Lunev <den@openvz.org>
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
At the qcow2 layer, discard is only possible on a per-cluster
basis; at the moment, qcow2 silently rounds any unaligned
requests to this granularity. However, an upcoming patch will
fix a regression in the block layer ignoring too much of an
unaligned discard request, by changing the block layer to
break up a discard request at alignment boundaries; for that
to work, the block layer must know about our limits.
However, we can't go one step further by changing
qcow2_discard_clusters() to assert that requests are always
aligned, since that helper function is reached on paths
outside of the block layer.
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
virtio, vhost, pc: fixes
Most notably this fixes a regression with vhost introduced by the pull before
last.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 18 Nov 2016 03:51:55 PM GMT
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* mst/tags/for_upstream:
acpi: Use apic_id_limit when calculating legacy ACPI table size
ipmi: fix qemu crash while migrating with ipmi
ivshmem: Fix 64 bit memory bar configuration
virtio: set ISR on dataplane notifications
virtio: access ISR atomically
virtio: introduce grab/release_ioeventfd to fix vhost
virtio-crypto: fix virtio_queue_set_notification() race
Message-id: 1479484366-7977-1-git-send-email-mst@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The code that calculates the legacy ACPI table size for migration
compatibility uses max_cpus when calculating legacy_aml_len (the size of
the DSDT and SSDT tables). However, the SSDT grows according to APIC ID
limit, not max_cpus.
The bug is not triggered very often because of the 4k alignment on the
table size. But it can be triggered if you are unlucky enough to cross a
4k boundary.
Change the legacy_aml_len calculation to use apic_id_limit, to calculate
the right size.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Qemu crash in the source side while migrating, after starting ipmi service inside vm.
./x86_64-softmmu/qemu-system-x86_64 --enable-kvm -smp 4 -m 4096 \
-drive file=/work/suse/suse11_sp3_64_vt,format=raw,if=none,id=drive-virtio-disk0,cache=none \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0 \
-vnc :99 -monitor vc -device ipmi-bmc-sim,id=bmc0 -device isa-ipmi-kcs,bmc=bmc0,ioport=0xca2
Program received signal SIGSEGV, Segmentation fault.
[Switching to Thread 0x7ffec4268700 (LWP 7657)]
__memcpy_ssse3_back () at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:2757
(gdb) bt
#0 __memcpy_ssse3_back () at ../sysdeps/x86_64/multiarch/memcpy-ssse3-back.S:2757
#1 0x00005555559ef775 in memcpy (__len=3, __src=0xc1421c, __dest=<optimized out>)
at /usr/include/bits/string3.h:51
#2 qemu_put_buffer (f=0x555557a97690, buf=0xc1421c <Address 0xc1421c out of bounds>, size=3)
at migration/qemu-file.c:346
#3 0x00005555559eef66 in vmstate_save_state (f=f@entry=0x555557a97690,
vmsd=0x555555f8a5a0 <vmstate_ISAIPMIKCSDevice>, opaque=0x555557231160,
vmdesc=vmdesc@entry=0x55555798cc40) at migration/vmstate.c:333
#4 0x00005555557cfe45 in vmstate_save (f=f@entry=0x555557a97690, se=se@entry=0x555557231de0,
vmdesc=vmdesc@entry=0x55555798cc40) at /mnt/sdb/zyy/qemu/migration/savevm.c:720
#5 0x00005555557d2be7 in qemu_savevm_state_complete_precopy (f=0x555557a97690,
iterable_only=iterable_only@entry=false) at /mnt/sdb/zyy/qemu/migration/savevm.c:1128
#6 0x00005555559ea102 in migration_completion (start_time=<synthetic pointer>,
old_vm_running=<synthetic pointer>, current_active_state=<optimized out>,
s=0x5555560eaa80 <current_migration.44078>) at migration/migration.c:1707
#7 migration_thread (opaque=0x5555560eaa80 <current_migration.44078>) at migration/migration.c:1855
#8 0x00007ffff3900dc5 in start_thread (arg=0x7ffec4268700) at pthread_create.c:308
#9 0x00007fffefc6c71d in clone () at ../sysdeps/unix/sysv/linux/x86_64/clone.S:113
Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Device ivshmem property use64=0 is designed to make the device
expose a 32 bit shared memory BAR instead of 64 bit one. The
default is a 64 bit BAR, except pc-1.2 and older retain a 32 bit
BAR. A 32 bit BAR can support only up to 1 GiB of shared memory.
This worked as designed until commit 5400c02 accidentally flipped
its sense: since then, we misinterpret use64=0 as use64=1 and vice
versa. Worse, the default got flipped as well. Devices
ivshmem-plain and ivshmem-doorbell are not affected.
Fix by restoring the test of IVShmemState member not_legacy_32bit
that got messed up in commit 5400c02. Also update its
initialization for devices ivhsmem-plain and ivshmem-doorbell.
Without that, they'd regress to 32 bit BARs.
Cc: qemu-stable@nongnu.org
Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Dataplane has been omitting forever the step of setting ISR when
an interrupt is raised. This caused little breakage, because the
specification actually says that ISR may not be updated in MSI mode.
Some versions of the Windows drivers however didn't clear MSI mode
correctly, and proceeded using polling mode (using ISR, not the used
ring index!) for crashdump and hibernation. If it were just crashdump
and hibernation it would not be a big deal, but recent releases of
Windows do not really shut down, but rather log out and hibernate to
make the next startup faster. Hence, this manifested as a more serious
hang during shutdown with e.g. Windows 8.1 and virtio-win 1.8.0 RPMs.
Newer versions fixed this, while older versions do not use MSI at all.
The failure has always been there for virtio dataplane, but it became
visible after commits 9ffe337 ("virtio-blk: always use dataplane path
if ioeventfd is active", 2016-10-30) and ad07cd6 ("virtio-scsi: always
use dataplane path if ioeventfd is active", 2016-10-30) made virtio-blk
and virtio-scsi always use the dataplane code under KVM. The good news
therefore is that it was not a bug in the patches---they were doing
exactly what they were meant for, i.e. shake out remaining dataplane bugs.
The fix is not hard, so it's worth arranging for the broken drivers.
The virtio_should_notify+event_notifier_set pair that is common to
virtio-blk and virtio-scsi dataplane is replaced with a new public
function virtio_notify_irqfd that also sets ISR. The irqfd emulation
code now need not set ISR anymore, so virtio_irq is removed.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Following the recent refactoring of virtio notifiers [1], more specifically
the patch ed08a2a0b ("virtio: use virtio_bus_set_host_notifier to
start/stop ioeventfd") that uses virtio_bus_set_host_notifier [2]
by default, core virtio code requires 'ioeventfd_started' to be set
to true/false when the host notifiers are configured.
When vhost is stopped and started, however, there is a stop followed by
another start. Since ioeventfd_started was never set to true, the 'stop'
operation triggered by virtio_bus_set_host_notifier() will not result
in a call to virtio_pci_ioeventfd_assign(assign=false). This leaves
the memory regions with stale notifiers and results on the next start
triggering the following assertion:
kvm_mem_ioeventfd_add: error adding ioeventfd: File exists
Aborted
This patch reintroduces (hopefully in a cleaner way) the concept
that was present with ioeventfd_disabled before the refactoring.
When ioeventfd_grabbed>0, ioeventfd_started tracks whether ioeventfd
should be enabled or not, but ioeventfd is actually not started at
all until vhost releases the host notifiers.
[1] http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg07748.html
[2] http://lists.nongnu.org/archive/html/qemu-devel/2016-10/msg07760.html
Reported-by: Felipe Franciosi <felipe@nutanix.com>
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reported-by: Alex Williamson <alex.williamson@redhat.com>
Fixes: ed08a2a0b ("virtio: use virtio_bus_set_host_notifier to start/stop ioeventfd")
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Tested-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Tested-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We must check for new virtqueue buffers after re-enabling notifications.
This prevents the race condition where the guest added buffers just
after we stopped popping the virtqueue but before we re-enabled
notifications.
I think the virtio-crypto code was based on virtio-net but this crucial
detail was missed. virtio-net does not have the race condition because
it processes the virtqueue one more time after re-enabling
notifications.
Cc: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
If the QEMU source dir is
/var/tmp/aaa-qemu-clone
and the build dir is
/var/tmp/qemu-aio-poll-v2
Then I get an error as:
trace/generated-tracers.c:15950:13: error: invalid suffix "_trace_events"
on integer constant
TraceEvent *2_trace_events[] = {
^
trace/generated-tracers.c:15950:13: error: expected identifier or ‘(’ before
numeric constant
trace/generated-tracers.c: In function ‘trace_2_register_events’:
trace/generated-tracers.c:17949:32: error: invalid suffix "_trace_events" on
integer constant
trace_event_register_group(2_trace_events);
^
make: *** [trace/generated-tracers.o] Error 1
This patch fixes the issue.
Reported-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Tested-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Device ivshmem property use64=0 is designed to make the device
expose a 32 bit shared memory BAR instead of 64 bit one. The
default is a 64 bit BAR, except pc-1.2 and older retain a 32 bit
BAR. A 32 bit BAR can support only up to 1 GiB of shared memory.
This worked as designed until commit 5400c02 accidentally flipped
its sense: since then, we misinterpret use64=0 as use64=1 and vice
versa. Worse, the default got flipped as well. Devices
ivshmem-plain and ivshmem-doorbell are not affected.
Fix by restoring the test of IVShmemState member not_legacy_32bit
that got messed up in commit 5400c02. Also update its
initialization for devices ivhsmem-plain and ivshmem-doorbell.
Without that, they'd regress to 32 bit BARs.
Signed-off-by: Zhuang Yanying <ann.zhuangyanying@huawei.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1479385863-7648-1-git-send-email-ann.zhuangyanying@huawei.com>
PC will use this field in other way, so move it outside the common
code so PC could set a different value, i.e. all CPUs
regardless of where they are coming from (-smp X | -device cpu...).
It's quick and dirty hack as it could be implemented in more generic
way in MashineClass. But do it in simple way since only PC is affected
so far.
Later we can generalize it when another affected target gets support
for -device cpu.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1479212236-183810-3-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
virtio, vhost, pc, pci: documentation, fixes and cleanups
Lots of fixes all over the place.
Unfortunately, this does not yet fix a regression with vhost
introduced by the last pull, the issue is typically this error:
kvm_mem_ioeventfd_add: error adding ioeventfd: File exists
followed by QEMU aborting.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
* remotes/mst/tags/for_upstream: (28 commits)
docs: add PCIe devices placement guidelines
virtio: drop virtio_queue_get_ring_{size,addr}()
vhost: drop legacy vring layout bits
vhost: adapt vhost_verify_ring_mappings() to virtio 1 ring layout
nvdimm acpi: introduce NVDIMM_DSM_MEMORY_SIZE
nvdimm acpi: use aml_name_decl to define named object
nvdimm acpi: rename nvdimm_dsm_reserved_root
nvdimm acpi: fix two comments
nvdimm acpi: define DSM return codes
nvdimm acpi: rename nvdimm_acpi_hotplug
nvdimm acpi: cleanup nvdimm_build_fit
nvdimm acpi: rename nvdimm_plugged_device_list
docs: improve the doc of Read FIT method
nvdimm acpi: clean up nvdimm_build_acpi
pc: memhp: stop handling nvdimm hotplug in pc_dimm_unplug
pc: memhp: move nvdimm hotplug out of memory hotplug
nvdimm acpi: drop the lock of fit buffer
qdev: hotplug: drop HotplugHandler.post_plug callback
vhost: migration blocker only if shared log is used
virtio-net: mark VIRTIO_NET_F_GSO as legacy
...
Message-id: 1479237527-11846-1-git-send-email-mst@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
qdev: Fix assert in PCI address property when used by vfio-pci
# gpg: Signature made Tue 15 Nov 2016 06:27:18 PM GMT
# gpg: using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* ehabkost/tags/machine-pull-request:
qdev: Fix assert in PCI address property when used by vfio-pci
Message-id: 1479234540-3192-1-git-send-email-ehabkost@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Allow the PCIHostDeviceAddress structure to work as the host property
in vfio-pci when it has it's default value of all fields set to ~0. In
this form the property indicates a non-existant device but given the
field bit sizes gets asserted as excess (and invalid) precision
overflows the string buffer. The BDF of an invalid device
"FFFF:FF:FF.F" is returned instead.
Signed-off-by: Daniel Oram <daniel.oram@gmail.com>
Reviewed-by: Alex Williamson <alex.williamson@redhat.com>
Message-Id: <71f06765c4ba16dcd71cbf78e877619948f04ed9.1478777270.git.daniel.oram@gmail.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Proposes best practices on how to use PCI Express/PCI device
in PCI Express based machines and explain the reasoning behind them.
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The legacy vring layout is not used anymore as we use the separate
mappings even for legacy devices.
This patch simply removes it.
This also fixes a bug with virtio 1 devices when the vring descriptor table
is mapped at a higher address than the used vring because the following
function may return an insanely great value:
hwaddr virtio_queue_get_ring_size(VirtIODevice *vdev, int n)
{
return vdev->vq[n].vring.used - vdev->vq[n].vring.desc +
virtio_queue_get_used_size(vdev, n);
}
and the mapping fails.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
With virtio 1, the vring layout is split in 3 separate regions of
contiguous memory for the descriptor table, the available ring and the
used ring, as opposed with legacy virtio which uses a single region.
In case of memory re-mapping, the code ensures it doesn't affect the
vring mapping. This is done in vhost_verify_ring_mappings() which assumes
the device is legacy.
This patch changes vhost_verify_ring_mappings() to check the mappings of
each part of the vring separately.
This works for legacy mappings as well.
Cc: qemu-stable@nongnu.org
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
To make the code more clearer, we
1) check ram_slots first, and build ssdt & nfit only when it is available
2) use nvdimm_get_plugged_device_list() to check if there is nvdimm device
plugged
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Commit 31190ed7 added a migration blocker in vhost_dev_init() to
check if memfd would succeed. It is better if this blocker first
checks if vhost backend requires shared log. This will avoid a
situation where a blocker is added inappropriately (e.g. shared
log allocation fails when vhost backend doesn't support it).
Signed-off-by: Rafael David Tinoco <rafael.tinoco@canonical.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
virtio 1.0 spec says this is a legacy feature bit,
hide it from guests in modern mode.
Note: for cross-version migration compatibility,
we keep the bit set in host_features.
The result will be that a guest migrating cross-version
will see host features change under it.
As guests only seem to read it once, this should
not be an issue. Meanwhile, will work to fix guests to
ignore this bit in virtio1 mode, too.
Cc: qemu-stable@nongnu.org
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Legacy features are those that transitional devices only
expose on the legacy interface.
Allow different ones per device class.
Cc: qemu-stable@nongnu.org # dependency for the next patch
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
We should not use cpu_to_le16() here, instead each of device/function
value is stored in a 8 byte field.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently the virtio-crypto device hasn't supported
hotpluggable and live migration well. Let's tag it
as not hotpluggable and migration actively and reopen
them once we support them well.
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The function does not fully initialize the returned VirtQueueElement and should
be used only internally from the virtio module.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The function undoes the effect of virtqueue_pop and doesn't do anything
destructive or irreversible so virtqueue_unpop is a more fitting name.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Tue 15 Nov 2016 07:37:27 AM GMT
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* jasowang/tags/net-pull-request:
docs: fix COLO architecture diagram
net: fix sending of data with -net socket, listen backend
net: skip virtio-net config of deleted nic's peers
Message-id: 1479195830-4725-1-git-send-email-jasowang@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
# gpg: Signature made Tue 15 Nov 2016 04:10:29 AM GMT
# gpg: using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg: aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg: aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98 D624 BDBE 7B27 C0DE 3057
* jtc/tags/block-pull-request:
mirror: do not flush every time the disks are synced
block/curl: Do not wait for data beyond EOF
block/curl: Remember all sockets
block/curl: Fix return value from curl_read_cb
block/curl: Use BDRV_SECTOR_SIZE
block/curl: Drop TFTP "support"
qemu-iotests: avoid spurious failure on test 109
iotests: add transactional failure race test
blockjob: refactor backup_start as backup_job_create
blockjob: add block_job_start
blockjob: add .start field
blockjob: add .clean property
blockjob: fix dead pointer in txn list
Message-id: 1479183291-14086-1-git-send-email-jcody@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
ppc patch queue 2016-11-15
Latest set of ppc and spapr related patches. Highlights are:
* More POWER9 instructions
* Fix some subtle outstanding bugs
* Add some extra tests
One patch affects bitops.h, so isn't strictly ppc related.
# gpg: Signature made Tue 15 Nov 2016 02:46:48 AM GMT
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* dgibson/tags/ppc-for-2.8-20161115:
boot-serial-test: Add a test for the powernv machine
tests: add XSCOM tests for the PowerNV machine
ppc/pnv: Fix fatal bug on 32-bit hosts
ppc/pnv: fix xscom address translation for POWER9
ppc/pnv: add a 'xscom_core_base' field to PnvChipClass
spapr-vty: Fix bad assert() statement
FU exceptions should carry a cause (IC)
spapr: Fix migration of PCI host bridges from qemu-2.7
target-ppc: Implement bcdctz. instruction
target-ppc: Implement bcdcfz. instruction
target-ppc: Implement bcdctn. instruction
target-ppc: Implement bcdcfn. instruction
ppc: Remove some stub POWER6 models
ppc/pnv: fix compile breakage on old gcc
powernv: CPU compatibility modes don't make sense for powernv
target-ppc: add vprtyb[w/d/q] instructions
target-ppc: add vrldnm and vrlwnm instructions
target-ppc: add vrldnmi and vrlwmi instructions
bitops: fix rol/ror when shift is zero
Message-id: 1479178144-28153-1-git-send-email-david@gibson.dropbear.id.au
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The use of -net socket,listen was broken in the following
commit
commit 16a3df403b
Author: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Date: Fri May 13 15:35:19 2016 +0800
net/net: Add SocketReadState for reuse codes
This function is from net/socket.c, move it to net.c and net.h.
Add SocketReadState to make others reuse net_fill_rstate().
suggestion from jason.
This refactored the state out of NetSocketState into a
separate SocketReadState. This refactoring requires
that a callback is provided to be triggered upon
completion of a packet receive from the guest.
The patch only registered this callback in the codepaths
hit by -net socket,connect, not -net socket,listen. So
as a result packets sent by the guest in the latter case
get dropped on the floor.
This bug is hidden because net_fill_rstate() silently
does nothing if the callback is not set.
This patch adds in the middle callback registration
and also adds an assert so that QEMU aborts if there
are any other codepaths hit which are missing the
callback.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
https://bugzilla.redhat.com/show_bug.cgi?id=1373816
qemu core dump happens during repetitive unpug-plug
with multiple queues and Windows RSS-capable guest.
If back-end delete requested during virtio-net device
initialization, driver still can try configure the device
for multiple queues. The virtio-net device is expected
to be removed as soon as the initialization is done.
Signed-off-by: Yuri Benditovich <yuri.benditovich@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This puts a huge strain on the disks when there are many concurrent
migrations. With this patch we only flush twice: just before issuing
the event, and just before pivoting to the destination. If management
will complete the job close to the BLOCK_JOB_READY event, the cost of
the second flush should be small anyway.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161109162008.27287-2-pbonzini@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
libcurl will only give us as much data as there is, not more. The block
layer will deny requests beyond the end of file for us; but since this
block driver is still using a sector-based interface, we can still get
in trouble if the file size is not a multiple of 512.
While we have already made sure not to attempt transfers beyond the end
of the file, we are currently still trying to receive data from there if
the original request exceeds the file size. This patch fixes this issue
and invokes qemu_iovec_memset() on the iovec's tail.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20161025025431.24714-5-mreitz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
For some connection types (like FTP, generally), more than one socket
may be used (in FTP's case: control vs. data stream). As of commit
838ef60249 ("curl: Eliminate unnecessary
use of curl_multi_socket_all"), we have to remember all of the sockets
used by libcurl, but in fact we only did that for a single one. Since
one libcurl connection may use multiple sockets, however, we have to
remember them all.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20161025025431.24714-4-mreitz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
While commit 38bbc0a580 is correct in that
the callback is supposed to return the number of bytes handled; what it
does not mention is that libcurl will throw an error if the callback did
not "handle" all of the data passed to it.
Therefore, if the callback receives some data that it cannot handle
(either because the receive buffer has not been set up yet or because it
would not fit into the receive buffer) and we have to ignore it, we
still have to report that the data has been handled.
Obviously, this should not happen normally. But it does happen at least
for FTP connections where some data (that we do not expect) may be
generated when the connection is established.
Cc: qemu-stable@nongnu.org
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20161025025431.24714-3-mreitz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
Because TFTP does not support byte ranges, it was never usable with our
curl block driver. Since apparently nobody has ever complained loudly
enough for someone to take care of the issue until now, it seems
reasonable to assume that nobody has ever actually used it.
Therefore, it should be safe to just drop it from curl's protocol list.
[Jeff Cody: Below is additional summary pulled, with some rewording,
from followup emails between Max and Markus, to explain what
worked and what didn't]
TFTP would sometimes work, to a limited extent, for images <= the curl
"readahead" size, so long as reads started at offset zero. By default,
that readahead size is 256KB.
Reads starting at a non-zero offset would also have returned data from a
zero offset. It can become more complicated still, with mixed reads at
zero offset and non-zero offsets, due to data buffering.
In short, TFTP could only have worked before in very specific scenarios
with unrealistic expectations and constraints.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20161102175539.4375-4-mreitz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
In some cases it is possible that query-io-status is called just
before the job is completed, causing
-{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, "event": "BLOCK_JOB_COMPLETED", "data": {"device": "src", "len": 31457280, "offset": OFFSET, "speed": 0, "type": "mirror", "error": "Operation not permitted"}}
-{"return": []}
+{"return": [{"io-status": "ok", "device": "src", "busy": true, "len": 31457280, "offset": OFFSET, "paused": false, "speed": 0, "ready": false, "type": "mirror"}]}
Assert that the completeion event eventually happens.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20161109162008.27287-1-pbonzini@redhat.com
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Refactor backup_start as backup_job_create, which only creates the job,
but does not automatically start it. The old interface, 'backup_start',
is not kept in favor of limiting the number of nearly-identical interfaces
that would have to be edited to keep up with QAPI changes in the future.
Callers that wish to synchronously start the backup_block_job can
instead just call block_job_start immediately after calling
backup_job_create.
Transactions are updated to use the new interface, calling block_job_start
only during the .commit phase, which helps prevent race conditions where
jobs may finish before we even finish building the transaction. This may
happen, for instance, during empty block backup jobs.
Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1478587839-9834-6-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
Instead of automatically starting jobs at creation time via backup_start
et al, we'd like to return a job object pointer that can be started
manually at later point in time.
For now, add the block_job_start mechanism and start the jobs
automatically as we have been doing, with conversions job-by-job coming
in later patches.
Of note: cancellation of unstarted jobs will perform all the normal
cleanup as if the job had started, particularly abort and clean. The
only difference is that we will not emit any events, because the job
never actually started.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1478587839-9834-5-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
Add an explicit start field to specify the entrypoint. We already have
ownership of the coroutine itself AND managing the lifetime of the
coroutine, let's take control of creation of the coroutine, too.
This will allow us to delay creation of the actual coroutine until we
know we'll actually start a BlockJob in block_job_start. This avoids
the sticky question of how to "un-create" a Coroutine that hasn't been
started yet.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1478587839-9834-4-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
Cleaning up after we have deferred to the main thread but before the
transaction has converged can be dangerous and result in deadlocks
if the job cleanup invokes any BH polling loops.
A job may attempt to begin cleaning up, but may induce another job to
enter its cleanup routine. The second job, part of our same transaction,
will block waiting for the first job to finish, so neither job may now
make progress.
To rectify this, allow jobs to register a cleanup operation that will
always run regardless of if the job was in a transaction or not, and
if the transaction job group completed successfully or not.
Move sensitive cleanup to this callback instead which is guaranteed to
be run only after the transaction has converged, which removes sensitive
timing constraints from said cleanup.
Furthermore, in future patches these cleanup operations will be performed
regardless of whether or not we actually started the job. Therefore,
cleanup callbacks should essentially confine themselves to undoing create
operations, e.g. setup actions taken in what is now backup_start.
Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1478587839-9834-3-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
The new powernv machine ships with a firmware that outputs
some text to the serial console, so we can automatically
test this machine type in the boot-serial tester, too.
And to get some (very limited) test coverage for the new
POWER9 CPU emulation, too, this test is also started with
"-cpu POWER9".
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add a couple of tests on the XSCOM bus of the PowerNV machine for the
the POWER8 and POWER9 CPUs. The first tests reads the CFAM identifier
of the chip. The second test goes further in the XSCOM address space
and reaches the cores to read their DTS registers.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: Fixed an incorrect indentation, and a Makefile problem]]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
If the pnv machine type is compiled on a 32-bit host, the unsigned long
(host) type is 32-bit. This means that the hweight_long() used to
calculate the number of allowed cores only considers the low 32 bits of
the cores_mask variable, and can thus return 0 in some circumstances.
This corrects the bug.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Suggested-by: Richard Henderson <rth@twiddle.net>
[clg: replaced hweight_long() by ctpop64() ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
High addresses can overflow the uint32_t pcba variable after the 8byte
shift.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The XSCOM addresses for the core registers are encoded in a slightly
different way on POWER8 and POWER9.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When using the serial console in the GTK interface of QEMU (and
QEMU has been compiled with CONFIG_VTE), it is possible to trigger
the assert() statement in vty_receive() in spapr_vty.c by pasting
a chunk of text with length > 16 into the QEMU window.
Most of the other serial backends seem to simply drop characters
that they can not handle, so I think we should also do the same in
spapr-vty to fix this issue.
Buglink: https://bugs.launchpad.net/qemu/+bug/1639322
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
As per the ISA we need a cause and executing a tabort r9 in libc
for example causes a EXCP_FU exception, we don't wire up the
IC (cause) when we post the exception. The cause is required
for the kernel to do the right thing. The fix applies only to 64
bit ppc targets.
Signed-off-by: Balbir singh <bsingharora@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
daa2369 "spapr_pci: Add a 64-bit MMIO window" subtly broke migration from
qemu-2.7 to the current version. It split the device's MMIO window into
two pieces for 32-bit and 64-bit MMIO.
The patch included backwards compatibility code to convert the old property
into the new format. However, the property value was also transferred in
the migration stream and compared with a (probably unwise) VMSTATE_EQUAL.
So, the "raw" value from 2.7 is compared to the new style converted value
from (pre-)2.8 giving a mismatch and migration failure.
Although it would be technically possible to fix this in a way allowing
backwards migration, that would leave an ugly legacy around indefinitely.
This patch takes the simpler approach of bumping the migration version,
dropping the unwise VMSTATE_EQUAL (and some equally unwise ones around it)
and ignoring them on an incoming migration.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
bcdctz. converts from BCD to Zoned numeric format. Zoned format uses
a byte to represent a digit where the most significant nibble is 0x3
or 0xf, depending on the preferred signal.
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
bcdcfz. converts from Zoned numeric format to BCD. Zoned format uses
a byte to represent a digit where the most significant nibble is 0x3
or 0xf, depending on the preferred signal.
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
bcdctn. converts from BCD to National numeric format. National format
uses a byte to represent a digit where the most significant nibble is
always 0x3 and the least sign. nibbles is the digit itself.
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
bcdcfn. converts from National numeric format to BCD. National format
uses a byte to represent a digit where the most significant nibble is
always 0x3 and the least sign. nibbles is the digit itself.
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The CPU model table includes stub (commented out) definitions for
CPU_POWERPC_POWER6_5 and CPU_POWERPC_POWER6A. These are not real cpu
models, but represent the POWER6 in some compatiblity modes. If we ever
do implement POWER6 (unlikely), we'll implement its compatibility modes in
a different way (similar to what we do for POWER7 and POWER8). So these
stub definitions can be removed.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
PnvChip is defined twice and this can confuse old compilers :
CC ppc64-softmmu/hw/ppc/pnv_xscom.o
In file included from qemu.git/hw/ppc/pnv.c:29:
qemu.git/include/hw/ppc/pnv.h:60: error: redefinition of typedef ‘PnvChip’
qemu.git/include/hw/ppc/pnv_xscom.h:24: note: previous declaration of ‘PnvChip’ was here
make[1]: *** [hw/ppc/pnv.o] Error 1
make[1]: *** Waiting for unfinished jobs....
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
powernv has some code (derived from the spapr equivalent) used in device
tree generation which depends on the CPU's compatibility mode / logical
PVR. However, compatibility modes don't make sense on powernv - at least
not as a property controlled by the host - because the guest in powernv
has full hypervisor level access to the virtual system, and so owns the
PCR (Processor Compatibility Register) which implements compatiblity modes.
Note: the new logic doesn't take into account kvmppc_smt_threads() like the
old version did. However, if core->nr_threads exceeds kvmppc_smt_threads()
then things will already be broken and clamping the value in the device
tree isn't going to save us.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Add following POWER ISA 3.0 instructions.
vprtybw: Vector Parity Byte Word
vprtybd: Vector Parity Byte Double Word
vprtybq: Vector Parity Byte Quad Word
Signed-off-by: Ankit Kumar <ankit@linux.vnet.ibm.com>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
vrldmi: Vector Rotate Left Dword then Mask Insert
vrlwmi: Vector Rotate Left Word then Mask Insert
Signed-off-by: Gautham R. Shenoy <ego@linux.vnet.ibm.com>
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
( use extract[32,64] and rol[32,64], introduce mask helpers in
internal.h )
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
All the variants for rol/ror have a bug in case where the shift == 0.
For example rol32, would generate:
return (word << 0) | (word >> 32);
Which though works, would be flagged as a runtime error on clang's
sanitizer.
Suggested-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
qemu_savevm_state_iterate() expects the iterators to return 1
when they are done, and 0 if there is still something left to do.
However, ram_save_iterate() does not obey this rule and returns
the number of saved pages instead. This causes a fatal hang with
ppc64 guests when you run QEMU like this (also works with TCG):
qemu-img create -f qcow2 /tmp/test.qcow2 1M
qemu-system-ppc64 -nographic -nodefaults -m 256 \
-hda /tmp/test.qcow2 -serial mon:stdio
... then switch to the monitor by pressing CTRL-a c and try to
save a snapshot with "savevm test1" for example.
After the first iteration, ram_save_iterate() always returns 0 here,
so that qemu_savevm_state_iterate() hangs in an endless loop and you
can only "kill -9" the QEMU process.
Fix it by using proper return values in ram_save_iterate().
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
if_start() goes through the slirp->if_fastq and slirp->if_batchq
list of pending messages, and accesses ifm->ifq_so->so_nqueued of its
elements if ifm->ifq_so != NULL. When freeing a socket, we thus need
to make sure that any pending message for this socket does not refer
to the socket any more.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Tested-by: Brian Candler <b.candler@pobox.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
blk_eject is only used by scsi-disk and atapi, and in both cases we
only attempt to invoke blk_eject if we have a bona-fide change in
tray state.
The "issue" here is that the tray state does not generate a QMP event
unless there is a medium/BDS attached to the device, so if libvirt et al
are waiting for a tray event to occur from an empty-but-closed drive,
software opening that drive will not emit an event and libvirt will
wait forever.
Change this by modifying blk_eject to always emit an event, instead of
conditionally on a "real" backend eject.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1373264
Reported-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1478553214-497-2-git-send-email-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Commit 9ef2e93f introduced the concept of tagging ATAPI commands as
NONDATA, but this introduced a regression for certain commands better
described as CONDDATA. read_cd is such a command that both requires
a non-zero BCL if a transfer size is set, but is perfectly content to
accept a zero BCL if the transfer size is 0.
This test adds a regression test for the case where BCL and nb_sectors
are both 0.
Flesh out the CDROM tests by:
(1) Allowing the test to specify a BCL
(2) Allowing the buffer comparison test to compare a 0-size buffer
(3) Fix the BCL specification in libqos (It is LE, not BE)
(4) Add a nice human-readable message for future SCSI command additions
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1477970211-25754-4-git-send-email-jsnow@redhat.com
[Line length edit --js]
Signed-off-by: John Snow <jsnow@redhat.com>
Block layer patches for 2.8.0-rc0
# gpg: Signature made Fri 11 Nov 2016 03:46:12 PM GMT
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* kwolf/tags/for-upstream:
raw-posix: Rename 'raw_s' to 'rs'
iotests: Always use -machine accel=qtest
iotests: Skip test 162 if there is no SSH support
block: Emit modules in bdrv_iterate_format()
block: Fix bdrv_iterate_format() sorting
nfs: Fix memory leak in nfs_file_create()
qcow2: Remove stale FIXME comment
raw_bsd: don't check size alignment when only offset is set
raw_bsd: move check to prevent overflow
hmp: Make block_stream set an explicit job ID
block/ssh: Code cleanup for unused parameter
block/nbd: Fix the leaked visitor
Message-id: 1478883311-24052-1-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
We forgot to assign true to params->has_x_checkpoint_delay parameter
in qmp_query_migrate_parameters.
Without this, qmp command 'query-migrate-parameters' doesn't show the
default value for x-checkpoint-delay option.
This also fixes the fact that HMP was relying on unspecified behavior by
reading x_checkpoint_delay without checking has_x_checkpoint_delay.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Block patches for qemu 2.8
# gpg: Signature made Fri Nov 11 15:56:59 2016 CET
# gpg: using RSA key 0xF407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* mreitz/tags/pull-block-2016-11-11:
raw-posix: Rename 'raw_s' to 'rs'
iotests: Always use -machine accel=qtest
iotests: Skip test 162 if there is no SSH support
block: Emit modules in bdrv_iterate_format()
block: Fix bdrv_iterate_format() sorting
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Currently, we only use -machine accel=qtest when qemu is invoked through
the common.qemu functions. However, we always want to use it, so move it
from common.qemu directly into QEMU_OPTIONS.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20161017183917.8837-1-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
bdrv_iterate_format() did not actually sort the formats by name but by
"pointer interpreted as string". That is probably not what we intended
to do, so fix it (by changing qsort_strcmp() so it matches the example
from qsort()'s manual page).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20161012204907.25941-2-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
It was from the time when none of the global functions had a qcow2_
prefix.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We make sure that the size is aligned to sector length to prevent any
round ups. Otherwise we could end up reading/writing data outside the
area specified by user. This is only needed when user supplies the size
option to avoid any surprises. It is not necessary when only offset is
set.
More over, the check made it difficult to use the offset option without
size option. The check puts unneeded restriction on the offset which had
to be aligned too. Because bdrv_getlength() returns aligned value having
unaligned offset would make the check fail.
Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When only offset is specified but no size and the offset is greater than
the real size of the containing device an overflow occurs when parsing
the options. This overflow is harmless because we do check for this
exact situation little bit later, but it leads to an error message with
weird values. It is better to do the check is sooner and prevent the
overflow.
Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
A job ID is always required in order to create a block job on a
non-root node. The default ID (obtained with bdrv_get_device_name())
is otherwise empty in this scenario and the job cannot be created.
The HMP block_stream command doesn't set a job ID and therefore it
doesn't allow streaming to intermediate nodes. One solution is to add
an extra parameter to set a job ID. The other solution is to simply
use the node name passed to block_stream as job ID. This won't work
if it's automatically generated (because it contains a '#') but is
otherwise simple enough for all other cases.
This way 'block_stream node3' will create a job with the ID 'node3'
and the good old 'block_stream virtio0' will keep the previous
behaviour and use 'virtio0' for the job ID.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch drops the unused parameter "BDRVSSHState" being passed into
the ssh_config() function and does code cleanup. The unused parameter
was introduced by the commit c322712.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch frees the leaked visitor in nbd_refresh_filename() and uses
visit_free() to fix it. The leak was introduced by the commit 491d6c7.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There are only very old and orphaned stable branches listed
in the MAINTAINERS file - so this section is pretty useless
nowadays. Let's remove it.
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
These files are currently unmaintained.
I'm proposing that Fam and I co-maintain them; under the model that
whomever between us isn't authoring a given series will be responsible
for reviewing it.
Signed-off-by: John Snow <jsnow@redhat.com>
Acked-by: Fam Zheng <famz@redhat.com>
Acked-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
I recently added new files to the source tree that are not
covered by any maintainer yet -- and since every new source
file should have a maintainer nowadays, I volunteer to look
after these files now, too.
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
disas/m68k.c obviously belong to the m68k CPU section in
the MAINTAINERS file, but remove the hw/m68k/ directory
here since it only contains machine (not CPU) related
files, as requested by Laurent. Add the machine related
files to the right machine sections instead.
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The files w/cpu/a*mpcore.c are already assigned to the ARM CPU
section, but the corresponding headers include/hw/cpu/a*mpcore.h
are still missing.
The file hw/*/imx* are already assigned to the i.MX31 machine, but
the corresponding header files include/hw/*/imx* are still missing.
The file hw/misc/arm_integrator_debug.c seems to belong to Integrator
CP, hw/cpu/realview_mpcore.c seems to belong to Real View, and
hw/misc/mst_fpga.c seems to belong to PXA2XX.
And the files hw/misc/zynq* and include/hw/misc/zynq* seem to belong
to the Xilinx Zynq machine.
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Acked-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
On systems which do not provide ncursesw.pc and whose /usr/include/curses.h
does not include wide support, we should not only try with no -I, i.e.
/usr/include, but also with -I/usr/include/ncursesw.
To properly detect for wide support with and without -Werror, we need to
check for the presence of e.g. the WACS_DEGREE macro.
We also want to stop at the first curses_inc_list configuration which works,
and make sure to set IFS to : at each new loop.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Tested-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Message-id: 20161109102752.13255-1-samuel.thibault@ens-lyon.org
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The printscreen/sysrq and pause/break keys currently don't work for guests
using -usbdevice keyboard when accessed through vnc with a gtk-vnc based
client.
The reason for this is a mismatch between gtk-vnc and qemu in how these keys
should be mapped to XT keycodes.
On the original IBM XT these keys behaved differently than other keys.
Quoting from https://www.win.tue.nl/~aeb/linux/kbd/scancodes-1.html:
The keys PrtSc/SysRq and Pause/Break are special. The former produces
scancode e0 2a e0 37 when no modifier key is pressed simultaneously, e0 37
together with Shift or Ctrl, but 54 together with (left or right) Alt. (And
one gets the expected sequences upon release. But see below.) The latter
produces scancode sequence e1 1d 45 e1 9d c5 when pressed (without modifier)
and nothing at all upon release. However, together with (left or right)
Ctrl, one gets e0 46 e0 c6, and again nothing at release. It does not
repeat.
Gtk-vnc supports the 'QEMU Extended Key Event Message' RFB extension to send
raw XT keycodes directly to qemu, but the specification doesn't explicitly
specify how to map such long/complicated keycode sequences. From the spec
(https://github.com/rfbproto/rfbproto/blob/master/rfbproto.rst#qemu-extended-key-event-message)
The keycode is the XT keycode that produced the keysym. An XT keycode is an
XT make scancode sequence encoded to fit in a single U32 quantity. Single
byte XT scancodes with a byte value less than 0x7f are encoded as is.
2-byte XT scancodes whose first byte is 0xe0 and second byte is less than
0x7f are encoded with the high bit of the first byte set
hid.c currently expects the keycode sequence with shift/ctl for sysrq (e0 37
-> 0xb7 in RFB), whereas gtk-vnc uses the sequence with alt (0x54).
Likewise, hid.c expects the code without modifiers (e1 1d 45 -> 0xc5 in
RFB), whereas gtk-vnc sends the keycode sequence with ctrl for pause (e0 46
-> 0xc6 in RFB).
See keymaps.cvs in gtk-vnc for the mapping used:
https://git.gnome.org/browse/gtk-vnc/tree/src/keymaps.csv#n150
Now, it isn't obvious to me which sequence is really "right", but as the
0x54/0xc6 keycodes are currently unused in hid.c, supporting both seems like
the pragmatic solution to me. The USB HID keyboard boot protocol used by
hid.c doesn't have any other mapping applicable to these keys.
The other guest keyboard interfaces (ps/2, virtio, ..) are not affected,
because they handle these keys differently.
Signed-off-by: Peter Korsgaard <peter@korsgaard.com>
Message-id: 20161028145132.1702-1-peter@korsgaard.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
GDK_KEY_Delete is only defined with gtk version 2.22 and newer,
on older versions this key was called GDK_Delete instead.
Since this is the case for all GDK_KEY_* defines, change the
already existing preprocessor check there to test for version 2.22,
so we know that we can remove this code block in case we require
that version as a minimum one day.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1478081328-25515-1-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
In ehci_init_transfer function, if the 'cpage' is bigger than 4,
it doesn't free the 'p->sgl' once allocated previously thus leading
a memory leak issue. This patch avoid this.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Message-id: 5821c0f4.091c6b0a.e0c92.e811@mx.google.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
git shortlog 04186319..b991c67c
===============================
Laszlo Ersek (3):
[efi] Install the HII config access protocol on a child of the SNP handle
[librm] Conditionalize the workaround for the Tivoli VMM's SSE garbling
[build] Disable TIVOLI_VMM_WORKAROUND in the qemu configuration
Lukas Grossar (1):
[intel] Add PCI device ID for I219-V/LM
Michael Brown (57):
[efi] Fix uninitialised data in HII IFR structures
[bios] Do not enable interrupts when printing to the console
[pxe] Disable interrupts on the PIC before starting NBP
[dhcp] Allow for variable encapsulation of architecture-specific options
[dhcpv6] Include RFC5970 client architecture options in DHCPv6 requests
[dhcpv6] Include vendor class identifier option in DHCPv6 requests
[dhcp] Automatically generate vendor class identifier string
[xfer] Send intf_close() if redirection fails
[downloader] Treat redirection failures as fatal
[iscsi] Treat redirection failures as fatal
[debug] Allow per-object runtime enabling/disabling of debug messages
[debug] Allow debug messages to be initially disabled at runtime
[libc] Allow assertions to be globally enabled or disabled
[profile] Allow profiling to be globally enabled or disabled
[rng] Check for functioning RTC interrupt
[acpi] Add support for ACPI power off
[acpi] Allow time for ACPI power off to take effect
[ipv4] Send gratuitous ARPs whenever a new IPv4 address is applied
[intel] Strip spurious VLAN tags received by virtual function NICs
[intel] Remove duplicate intelvf_mbox_queues() function
[ipv6] Perform SLAAC only during autoconfiguration
[settings] Create space for IPv6 in settings display order
[ipv6] Rename ipv6_scope to dhcpv6_scope
[settings] Correctly mortalise autovivified child settings blocks
[ipv6] Allow settings to comprise arbitrary subsets of NDP options
[ipv6] Expose IPv6 settings acquired through NDP
[dhcpv6] Expose IPv6 address setting acquired through DHCPv6
[ipv6] Expose IPv6 link-local address settings
[settings] Allow settings blocks to specify a sibling ordering
[ipv6] Match user expectations for IPv6 settings priorities
[ipv6] Create routing table based on IPv6 settings
[ipv6] Rename ipv6_scope to ipv6_settings_scope
[test] Update IPv6 tests to use okx()
[ipv6] Allow for multiple routers
[hyperv] Use instance UUID in device name
[crypto] Remove obsolete extern declaration for asn1_invalidate_cursor()
[crypto] Allow for parsing of partial ASN.1 cursors
[image] Add image_asn1() to extract ASN.1 objects from image
[crypto] Add DER image format
[crypto] Add PEM image format
[image] Use image_asn1() to extract data from CMS signature images
[build] Remove obsolete explicit object requirements
[crypto] Enable both DER and PEM formats by default
[build] Remove more obsolete explicit object requirements
[pixbuf] Enable PNG format by default
[crypto] Add image_x509() to extract X.509 certificates from image
[crypto] Generalise X.509 "valid" field to a "flags" field
[list] Add list_next_entry() and list_prev_entry()
[crypto] Expose certstore_del() to explicitly remove stored certificates
[crypto] Allow certificates to be marked as having been added explicitly
[crypto] Add certstat() to display basic certificate information
[cmdline] Add certificate management commands
[crypto] Mark permanent certificates as permanent
[efi] Mark AppleNetBoot.h as a native iPXE header
[efi] Update to current EDK2 headers
[efi] Add EFI_BLOCK_IO2_PROTOCOL header and GUID definition
[bzimage] Fix page alignment of initrd images
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Commit 7d3123e converted a single read_sync() into a while loop
that assumed that read_sync() would either make progress or give
an error. But when the server hangs up early, the client sees
EOF (a read_sync() of 0) and never makes progress, which in turn
caused qemu-iotest './check -nbd 83' to go into an infinite loop.
Rework the loop to accomodate reads cut short by EOF.
Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1478551093-32757-1-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Hyper-V HV_X64_MSR_VP_RUNTIME was introduced in linux-4.4 + qemu-2.5.
As long as the KVM module supports, qemu will save / load the
vmstate_msr_hyperv_runtime register during the migration.
Regardless of whether the hyperv_runtime configuration of x86_cpu_properties is
enabled.
The qemu-2.3 does not support this feature, of course, failed to migrate.
linux-BGSfqC:/home/qemu # ./x86_64-softmmu/qemu-system-x86_64 --enable-kvm \
-nodefaults -machine pc-i440fx-2.3,accel=kvm,usb=off -smp 4 -m 4096 -drive \
file=/work/suse/sles11sp3.img.bak,format=raw,if=none,id=drive-virtio-disk0,cache=none \
-device virtio-blk-pci,scsi=off,bus=pci.0,addr=0x5,drive=drive-virtio-disk0,id=virtio-disk0 \
-vnc :99 -device cirrus-vga,id=video0,vgamem_mb=8,bus=pci.0,addr=0x2 -monitor vc
save_section_header:se->section_id=3,se->idstr:ram,se->instance_id=0,se->version_id=4
save_section_header:se->section_id=0,se->idstr:timer,se->instance_id=0,se->version_id=2
save_section_header:se->section_id=4,se->idstr:cpu_common,se->instance_id=0,se->version_id=1
save_section_header:se->section_id=5,se->idstr:cpu,se->instance_id=0,se->version_id=12
vmstate_subsection_save:vmsd->name:cpu/async_pf_msr
hyperv_runtime_enable_needed:env->msr_hv_runtime=128902811
vmstate_subsection_save:vmsd->name:cpu/msr_hyperv_runtime
Since hyperv_runtime is false, vm will not use hv->runtime_offset, then
vmstate_msr_hyperv_runtime is no need to transfer while migrating.
Signed-off-by: ann.zhuangyanying@huawei.com
Message-Id: <1478247398-5016-1-git-send-email-ann.zhuangyanying@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
With current code, pid file is open after various
sockets, chardevs, fsdevs and the like. This causes
interesting effects, for example when monitor is a
unix-socket, and another qemu instance is already
running, new qemu first "damages" the socket and
next complain that it can't acquire the pid file and
exits, making running qemu unreachable.
Move pid file creation earlier, right after the call
to os_daemonize(), where we know our process id (pid).
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <1478096330-18081-1-git-send-email-mjt@msgid.tls.msk.ru>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The impact is small because kvm_get_vcpu_events fixes env->hflags, but
it is wrong and could cause INITs to be delayed arbitrarily with
-machine kernel_irqchip=off.
Reported-by: Achille Fouilleul <achille.fouilleul@gadz.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When using QEMU for Xen PV guest, QEMU abort with:
xen-common.c:118:xen_init: Object 0x7f2b8325dcb0 is not an instance of type generic-pc-machine
This is because the machine 'xenpv' also use accel=xen. Moving the code
to xen_hvm_init() fix the issue.
This fix 021746c131.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Commit 3ff2f67a changed bdrv_co_flush() so that no flush is issues if
the image hasn't been dirtied since the last flush. This is not quite
correct: The condition should be that the image hasn't been dirtied
since the last _successful_ flush. This patch changes the logic
accordingly.
Without this fix, subsequent bdrv_co_flush() calls would return success
without actually doing anything even though the image is still dirty.
The difference is visible in some blkdebug test cases where error
messages incorrectly disappeared after commit 3ff2f67a.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1478300595-10090-1-git-send-email-kwolf@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
target-arm queue:
* bitbang_i2c: Handle NACKs from devices
* Fix corruption of CPSR when SCTLR.EE is set
* nvic: set pending status for not active interrupts
* char: cadence: check baud rate generator and divider values
# gpg: Signature made Mon 07 Nov 2016 10:43:07 AM GMT
# gpg: using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* pm215/tags/pull-target-arm-20161107:
hw/i2c/bitbang_i2c: Handle NACKs from devices
Fix corruption of CPSR when SCTLR.EE is set
nvic: set pending status for not active interrupts
char: cadence: check baud rate generator and divider values
Message-id: 1478515653-6361-1-git-send-email-peter.maydell@linaro.org
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If the guest attempts to talk to a nonexistent device over i2c,
the i2c_start_transfer() function will return non-zero, indicating
that the bus is signalling a NACK. Similarly, if the i2c_send()
function returns nonzero then the target device returned a NACK.
Handle this possibility in the bitbang_i2c code, by returning
the state machine to the STOPPED state and returning the NACK
bit to the guest.
This bit of missing functionality was spotted by Coverity
(it noticed that we weren't checking the return value from
i2c_start_transfer()).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1477332749-27098-1-git-send-email-peter.maydell@linaro.org
Fix a typo in arm_cpu_do_interrupt_aarch32 (OR'ing with ~CPSR_E
instead of CPSR_E) which meant that when we took an interrupt with
SCTLR.EE set we would corrupt the CPSR.
Signed-off-by: Julian Brown <julian@codesourcery.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
According to ARM DUI 0552A 4.2.10. NVIC set pending status
also for disabled interrupts. Correct the logic for
when interrupts are marked pending both on input level
transition and when interrupts are dismissed, to match
the NVIC behaviour rather than the 11MPCore GIC.
Signed-off-by: Marcin Krzeminski <marcin.krzeminski@nokia.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Cadence UART device emulator calculates speed by dividing the
baud rate by a 'baud rate generator' & 'baud rate divider' value.
The device specification defines these register values to be
non-zero and within certain limits. Add checks for these limits
to avoid errors like divide by zero.
Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1477596278-1470-1-git-send-email-ppandit@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
* NBD bugfix (Changlong)
* NBD write zeroes support (Eric)
* Memory backend fixes (Haozhong)
* Atomics fix (Alex)
* New AVX512 features (Luwei)
* "make check" logging fix (Paolo)
* Chardev refactoring fallout (Paolo)
* Small checkpatch improvements (Paolo, Jeff)
# gpg: Signature made Wed 02 Nov 2016 08:31:11 AM GMT
# gpg: using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (30 commits)
main-loop: Suppress I/O thread warning under qtest
docs/rcu.txt: Fix minor typo
vl: exit qemu on guest panic if -no-shutdown is not set
checkpatch: allow spaces before parenthesis for 'coroutine_fn'
x86: add AVX512_4VNNIW and AVX512_4FMAPS features
slirp: fix CharDriver breakage
qemu-char: do not forward events through the mux until QEMU has started
nbd: Implement NBD_CMD_WRITE_ZEROES on client
nbd: Implement NBD_CMD_WRITE_ZEROES on server
nbd: Improve server handling of shutdown requests
nbd: Refactor conversion to errno to silence checkpatch
nbd: Support shorter handshake
nbd: Less allocation during NBD_OPT_LIST
nbd: Let client skip portions of server reply
nbd: Let server know when client gives up negotiation
nbd: Share common option-sending code in client
nbd: Send message along with server NBD_REP_ERR errors
nbd: Share common reply-sending code in server
nbd: Rename struct nbd_request and nbd_reply
nbd: Rename NbdClientSession to NBDClientSession
...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Introduce this field to control whether ACPI build is enabled by a
particular machine or accelerator.
It defaults to true if the machine itself supports ACPI build. Xen
accelerator will disable it because Xen is in charge of building ACPI
tables for the guest.
Signed-off-by: Wei Liu <wei.liu2@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Sander Eikelenboom <linux@eikelenboom.it>
Olaf Hering reported a build failure due to an undefined reference
to 'qemu_log_vprintf'. Explicitely including qemu/log.h seems to
fix the issue.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Tested-by: Olaf Hering <olaf@aepfle.de>
We do not want to display the "I/O thread spun" warning for test cases
that run under qtest. The first attempt for this (commit
01c22f2cdd) tested whether qtest_enabled()
was true.
Commit 21a24302e8 correctly recognized
that just testing qtest_enabled() is not sufficient since there are some
tests that do not use the qtest accelerator but just the qtest character
device, and thus replaced qtest_enabled() by qtest_driver().
However, there are also some tests that only use the qtest accelerator
and not the qtest chardev; perhaps most notably the bash iotests.
Therefore, we have to check both qtest_enabled() and qtest_driver().
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-Id: <20161017180939.27912-1-mreitz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For automated testing purposes it can be helpful to exit qemu
(poweroff) when the guest panics. Make this the default unless
-no-shutdown is specified.
For internal-errors like errors from KVM_RUN the behaviour is
not changed, in other words QEMU does not exit to allow debugging
in the QEMU monitor.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <1476775794-108012-1-git-send-email-borntraeger@de.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SLIRP expects a CharBackend as the third argument to slirp_add_exec,
but net/slirp.c was passing a CharDriverState. Fix this to restore
guestfwd functionality.
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Otherwise, the CHR_EVENT_OPENED event is sent twice: first when the
backend (for example "stdio") is opened, and second after processing
the command line.
The incorrect sending of the event prints the monitor banner when
QEMU is started with "-serial mon:stdio". This includes the "(qemu)"
prompt; thus the monitor seems to be dead, whereas actually the
active front-end is the serial port.
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Upstream NBD protocol recently added the ability to efficiently
write zeroes without having to send the zeroes over the wire,
along with a flag to control whether the client wants a hole.
The generic block code takes care of falling back to the obvious
write of lots of zeroes if we return -ENOTSUP because the server
does not have WRITE_ZEROES.
Ideally, since NBD_CMD_WRITE_ZEROES does not involve any data
over the wire, we want to support transactions that are much
larger than the normal 32M limit imposed on NBD_CMD_WRITE. But
the server may still have a limit smaller than UINT_MAX, so
until experimental NBD protocol additions for advertising various
command sizes is finalized (see [1], [2]), for now we just stick to
the same limits as normal writes.
[1] https://github.com/yoe/nbd/blob/extension-info/doc/proto.md
[2] https://sourceforge.net/p/nbd/mailman/message/35081223/
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-17-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Upstream NBD protocol recently added the ability to efficiently
write zeroes without having to send the zeroes over the wire,
along with a flag to control whether the client wants to allow
a hole.
Note that when it comes to requiring full allocation, vs.
permitting optimizations, the NBD spec intentionally picked a
different sense for the flag; the rules in qemu are:
MAY_UNMAP == 0: must write zeroes
MAY_UNMAP == 1: may use holes if reads will see zeroes
while in NBD, the rules are:
FLAG_NO_HOLE == 1: must write zeroes
FLAG_NO_HOLE == 0: may use holes if reads will see zeroes
In all cases, the 'may use holes' scenario is optional (the
server need not use a hole, and must not use a hole if
subsequent reads would not see zeroes).
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-16-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
NBD commit 6d34500b clarified how clients and servers are supposed
to behave before closing a connection. It added NBD_REP_ERR_SHUTDOWN
(for the server to announce it is about to go away during option
haggling, so the client should quit sending NBD_OPT_* other than
NBD_OPT_ABORT) and ESHUTDOWN (for the server to announce it is about
to go away during transmission, so the client should quit sending
NBD_CMD_* other than NBD_CMD_DISC). It also clarified that
NBD_OPT_ABORT gets a reply, while NBD_CMD_DISC does not.
This patch merely adds the missing reply to NBD_OPT_ABORT and teaches
the client to recognize server errors. Actually teaching the server
to send NBD_REP_ERR_SHUTDOWN or ESHUTDOWN would require knowing that
the server has been requested to shut down soon (maybe we could do
that by installing a SIGINT handler in qemu-nbd, which transitions
from RUNNING to a new state that waits for the client to react,
rather than just out-right quitting - but that's a bigger task for
another day).
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-15-git-send-email-eblake@redhat.com>
[Move dummy ESHUTDOWN to include/qemu/osdep.h. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Checkpatch complains that 'return EINVAL' is usually wrong
(since we tend to favor 'return -EINVAL'). But it is a
false positive for nbd_errno_to_system_errno(). Since NBD
may add future defined wire values, refactor the code to
keep checkpatch happy.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-14-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The NBD Protocol allows the server and client to mutually agree
on a shorter handshake (omit the 124 bytes of reserved 0), via
the server advertising NBD_FLAG_NO_ZEROES and the client
acknowledging with NBD_FLAG_C_NO_ZEROES (only possible in
newstyle, whether or not it is fixed newstyle). It doesn't
shave much off the wire, but we might as well implement it.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alex Bligh <alex@alex.org.uk>
Message-Id: <1476469998-28592-13-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since we know that the maximum name we are willing to accept
is small enough to stack-allocate, rework the iteration over
NBD_OPT_LIST responses to reuse a stack buffer rather than
allocating every time. Furthermore, we don't even have to
allocate if we know the server's length doesn't match what
we are searching for.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-12-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The server has a nice helper function nbd_negotiate_drop_sync()
which lets it easily ignore fluff from the client (such as the
payload to an unknown option request). We can't quite make it
common, since it depends on nbd_negotiate_read() which handles
coroutine magic, but we can copy the idea into the client where
we have places where we want to ignore data (such as the
description tacked on the end of NBD_REP_SERVER).
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-11-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The NBD spec says that a client should send NBD_OPT_ABORT
rather than just dropping the connection, if the client doesn't
like something the server sent during option negotiation. This
is a best-effort attempt only, and can only be done in places
where we know the server is still in sync with what we've sent,
whether or not we've read everything the server has sent.
Technically, the server then has to reply with NBD_REP_ACK, but
it's not worth complicating the client to wait around for that
reply.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-10-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rather than open-coding each option request, it's easier to
have common helper functions do the work. That in turn requires
having convenient packed types for handling option requests
and replies.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-9-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The NBD Protocol allows us to send human-readable messages
along with any NBD_REP_ERR error during option negotiation;
make use of this fact for clients that know what to do with
our message.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-8-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We have both 'struct NBDRequest' and 'struct nbd_request'; making
it confusing to see which does what. Furthermore, we want to
rename nbd_request to align with our normal CamelCase naming
conventions. So, rename the struct which is used to associate
the data received during request callbacks, while leaving the
shorter name for the description of the request sent over the
wire in the NBD protocol.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-4-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Current upstream NBD documents that requests have a 16-bit flags,
followed by a 16-bit type integer; although older versions mentioned
only a 32-bit field with masking to find flags. Since the protocol
is in network order (big-endian over the wire), the ABI is unchanged;
but dealing with the flags as a separate field rather than masking
will make it easier to add support for upcoming NBD extensions that
increase the number of both flags and commands.
Improve some comments in nbd.h based on the current upstream
NBD protocol (https://github.com/yoe/nbd/blob/master/doc/proto.md),
and touch some nearby code to keep checkpatch.pl happy.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-3-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The NBD protocol allows servers to advertise a human-readable
description alongside an export name during NBD_OPT_LIST. Add
an option to pass through the user's string to the NBD client.
Doing this also makes it easier to test commit 200650d4, which
is the client counterpart of receiving the description.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <1476469998-28592-2-git-send-email-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 35c5a52d "acpi: do not use TARGET_PAGE_SIZE" changed struct
NvdimmDsmIn from a variable-size structure to a fixed-size structure of
4096 bytes. It forgot to adjust an assert in
nvdimm_dsm_set_label_data(..., NvdimmDsmIn *in, ...):
assert(sizeof(*in) + sizeof(*set_label_data) + set_label_data->length <=
4096);
which could crash QEMU when guest writes NVDIMM labels.
Fix it by replacing sizeof(*in) by offsetof(NvdimmDsmIn, arg3).
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reported-by: Dan Williams <dan.j.williams@intel.com>
Tested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
I misunderstood the workings of the power settings, the power off
is a force off operation and there needs to be a separate graceful
shutdown operation. So replace the force off operation with a
graceful shutdown.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The original commit:
commit 67aa56fc03
Author: Corey Minyard <cminyard@mvista.com>
Date: Thu Dec 17 12:50:06 2015 -0600
ipmi: Add an external connection simulation interface
defined a new variable CONFIG_IPMI_EXTERN, but then went
on to mistakely use the pre-existing CONFIG_IPMI_LOCAL
variable.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This is allowed by the IPMI specification for graceful shutdown,
so implement it.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When issuing a chassis 'powerdown' control command, the routine
qemu_system_shutdown_request() should be used to exit the guest.
qemu_system_powerdown_request() will initiate a soft shutdown which is
not what is required by the IPMI (28.3 Chassis Control Command):
0h = power down. Force system into soft off (S4/S45) state. This
is for 'emergency' management power down actions. The command does
not initiate a clean shut-down of the operating system prior to
powering down the system
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Get rid of the unnecessary mutex, it was a vestige
of something else that was not done. That way we don't
have to free it.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
_FIT is required for hotplug support, guest will inquire the updated
device info from it if a hotplug event is received
As FIT buffer is not completely mapped into guest address space, so a
new function, Read FIT whose UUID is UUID
648B9CF2-CDA1-4312-8AD9-49C4AF32BD62, handle 0x10000, function index
is 0x1, is reserved by QEMU to read the piece of FIT buffer. The buffer
is concatenated before _FIT return
Refer to docs/specs/acpi-nvdimm.txt for detailed design
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The buffer is used to save the FIT info for all the presented nvdimm
devices which is updated after the nvdimm device is plugged or
unplugged. In the later patch, it will be used to construct NVDIMM
ACPI _FIT method which reflects the presented nvdimm devices after
nvdimm hotplug
As FIT buffer can not completely mapped into guest address space,
OSPM will exit to QEMU multiple times, however, there is the race
condition - FIT may be changed during these multiple exits, so that
some rules are introduced:
1) the user should hold the @lock to access the buffer and
2) mark @dirty whenever the buffer is updated.
@dirty is cleared for the first time OSPM gets fit buffer, if
dirty is detected in the later access, OSPM will restart the
access
As fit should be updated after nvdimm device is successfully realized
so that a new hotplug callback, post_hotplug, is introduced
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
For each NVDIMM present or intended to be supported by platform,
platform firmware also exposes an ACPI Namespace Device under
the root device
So it builds nvdimm devices for all slots to support vNVDIMM hotplug
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As the arch dependent info, TARGET_PAGE_SIZE, has been dropped
from nvdimm acpi code, it can be compiled arch-independently
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to ACPI 6.0 spec, "Memory Device Physical Address
Region Base" in memdev is defined as "This field provides the
Device Physical Address base of the region". This field should
be zero in our case
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Based on ACPI spec:
RegionOffset := TermArg => Integer
However, Named object is not a TermArg.
This patch moves OperationRegion to NCAL() and uses localX as
its RegionOffset
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, 'RLEN' is the totally buffer size written by QEMU and it is
ACPI internally used only. The buffer size returned to guest should
not include 'RLEN' itself
Signed-off-by: Xiao Guangrong <guangrong.xiao@linux.intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch includes two parts: Cryptodev Backends
and virtio-crypto stuff. I can maintain cryptodev backends
which introduced by myself. For virtio-crypto stuff, I can
share the work with Michael (The whole virtio supporter).
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Make crypto operations are executed asynchronously,
so that other QEMU threads and monitor couldn't
be blocked at the virtqueue handling context.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We use an opaque point to the VirtIOCryptoReq which
can support different packets based on different
algorithms.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduces VirtIOCryptoReq structure to store
crypto request so that we can easily support
asynchronous crypto operation in the future.
At present, we only support cipher and algorithm
chaining.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Realize the symmetric algorithm control queue handler,
including plain cipher and chainning algorithms.
Currently the control queue is used to create and
close session for symmetric algorithm.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Expose the capacity of algorithms supported by
virtio crypto device to the frontend driver using
pci configuration space.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch adds virtio-crypto-pci, which is the pci proxy for the virtio
crypto device.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
tcg queued patches
# gpg: Signature made Tue 01 Nov 2016 16:45:42 GMT
# gpg: using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg: aka "Richard Henderson <rth@redhat.com>"
# gpg: aka "Richard Henderson <rth@twiddle.net>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC 16A4 AD12 70CC 4DD0 279B
* remotes/rth/tags/pull-tcg-20161101-2:
tcg: correct 32-bit tcg_gen_ld8s_i64 sign-extension
tcg/tcg.h: Improve documentation of TCGv_i32 etc types
MAINTAINERS: Update PPC status and maintainer
target-microblaze: Cleanup dec_mul
tcg: Add tcg_gen_mulsu2_{i32,i64,tl}
log: Add locking to large logging blocks
target-openrisc: Do not dump cpu state with -d in_asm
target-microblaze: Do not dump cpu state with -d in_asm
target-cris: Do not dump cpu state with -d in_asm
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The version of tcg_gen_ld8s_i64 for 32-bit systems does a load into
the low part of the return value - then attempts a sign extension into
the high part, but wrongly sets the high part to a sign extension of
itself rather than of the low part. This results in TCG internal
errors from the use of the uninitialized high part (in some GCC tests
of AArch64 NEON shift intrinsics, in particular). This patch corrects
the sign-extension logic, making it match other functions such as
tcg_gen_ld16s_i64.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Message-Id: <alpine.DEB.2.20.1610272333560.22353@digraph.polyomino.org.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The typedefs we use for the TCGv_i32, TCGv_i64 and TCGv_ptr
types are somewhat confusing, because we define them as
pointers to structs, but the structs themselves are never
defined. Explain in the comments a bit more clearly why
this is OK and what is going on under the hood.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1477067922-26202-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use tcg_gen_mul_tl for muli and mul instructions.
Use tcg_gen_muls2_tl for mulh instruction.
Use tcg_gen_mulu2_tl for mulhu instruction.
Use tcg_gen_mulsu2_tl for mulhsu instruction.
Note that this last fixes a bug, in that mulhsu was
previously treating both operands as signed, instead
of treating rb as unsigned.
Tested-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1475011433-24456-3-git-send-email-rth@twiddle.net>
Reuse the existing locking provided by stdio to keep in_asm, cpu,
op, op_opt, op_ind, and out_asm as contiguous blocks.
While it isn't possible to interleave e.g. in_asm or op_opt logs
because of the TB lock protecting all code generation, it is
possible to interleave cpu logs, or to interleave a cpu dump with
an out_asm dump.
For mingw32, we appear to have no viable solution for this. The locking
functions are not properly exported from the system runtime library.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
For '-object memory-backend-file,mem-path=foo,size=xyz', if the size of
file 'foo' does not match the given size 'xyz', the current QEMU will
truncate the file to the given size, which may corrupt the existing data
in that file. To avoid such data corruption, this patch disables
truncating non-empty backend files.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20161027042300.5929-2-haozhong.zhang@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Leave the implementation of error_vprintf and error_vprintf_unless_qmp
(the latter now trivially wrapped by error_printf_unless_qmp) to
libqemustub.a and monitor.c. This has two advantages: it lets us
remove the monitor_printf and monitor_vprintf stubs, and it lets
tests provide a different implementation of the functions that uses
g_test_message.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1477326663-67817-2-git-send-email-pbonzini@redhat.com>
NBD is using the CoMutex in a way that wasn't anticipated. For example, if there are
N(N=26, MAX_NBD_REQUESTS=16) nbd write requests, so we will invoke nbd_client_co_pwritev
N times.
----------------------------------------------------------------------------------------
time request Actions
1 1 in_flight=1, Coroutine=C1
2 2 in_flight=2, Coroutine=C2
...
15 15 in_flight=15, Coroutine=C15
16 16 in_flight=16, Coroutine=C16, free_sema->holder=C16, mutex->locked=true
17 17 in_flight=16, Coroutine=C17, queue C17 into free_sema->queue
18 18 in_flight=16, Coroutine=C18, queue C18 into free_sema->queue
...
26 N in_flight=16, Coroutine=C26, queue C26 into free_sema->queue
----------------------------------------------------------------------------------------
Once nbd client recieves request No.16' reply, we will re-enter C16. It's ok, because
it's equal to 'free_sema->holder'.
----------------------------------------------------------------------------------------
time request Actions
27 16 in_flight=15, Coroutine=C16, free_sema->holder=C16, mutex->locked=false
----------------------------------------------------------------------------------------
Then nbd_coroutine_end invokes qemu_co_mutex_unlock what will pop coroutines from
free_sema->queue's head and enter C17. More free_sema->holder is C17 now.
----------------------------------------------------------------------------------------
time request Actions
28 17 in_flight=16, Coroutine=C17, free_sema->holder=C17, mutex->locked=true
----------------------------------------------------------------------------------------
In above scenario, we only recieves request No.16' reply. As time goes by, nbd client will
almostly recieves replies from requests 1 to 15 rather than request 17 who owns C17. In this
case, we will encounter assert "mutex->holder == self" failed since Kevin's commit 0e438cdc
"coroutine: Let CoMutex remember who holds it". For example, if nbd client recieves request
No.15' reply, qemu will stop unexpectedly:
----------------------------------------------------------------------------------------
time request Actions
29 15(most case) in_flight=15, Coroutine=C15, free_sema->holder=C17, mutex->locked=false
----------------------------------------------------------------------------------------
Per Paolo's suggestion "The simplest fix is to change it to CoQueue, which is like a condition
variable", this patch replaces CoMutex with CoQueue.
Cc: Wen Congyang <wency@cn.fujitsu.com>
Reported-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Message-Id: <1476267508-19499-1-git-send-email-xiecl.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Avoid triggering on
typedef struct BlockJobDriver BlockJobDriver;
or
struct BlockJobDriver {
Cc: John Snow <jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
# gpg: Signature made Tue 01 Nov 2016 12:47:36 GMT
# gpg: using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg: aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg: aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98 D624 BDBE 7B27 C0DE 3057
* remotes/cody/tags/block-pull-request:
blockjobs: fix documentation
blockjobs: split interface into public/private, Part 1
Blockjobs: Internalize user_pause logic
blockjob: centralize QMP event emissions
Replication/Blockjobs: Create replication jobs as internal
blockjobs: Allow creating internal jobs
blockjobs: hide internal jobs from management API
block/gluster: fix port type in the QAPI options list
block/gluster: improve defense over string to int conversion
block: Turn on "unmap" in active commit
block/gluster: memory usage: use one glfs instance per volume
block: add gluster ifdef guard checks for SEEK_DATA/SEEK_HOLE support
rbd: make the code more readable
qapi: add release designator to gluster logfile option
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This pull request mostly contains some more fixes to prevent buggy guests from
breaking QEMU.
# gpg: Signature made Tue 01 Nov 2016 11:26:42 GMT
# gpg: using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg: aka "Greg Kurz <groug@free.fr>"
# gpg: aka "Greg Kurz <gkurz@fr.ibm.com>"
# gpg: aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg: aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg: aka "Gregory Kurz (Cimai Technology) <gkurz@cimai.com>"
# gpg: aka "Gregory Kurz (Meiosys Technology) <gkurz@meiosys.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894 DBA2 02FC 3AEB 0101 DBC2
* remotes/gkurz/tags/for-upstream:
9pfs: drop excessive error message from virtfs_reset()
9pfs: don't BUG_ON() if fid is already opened
9pfs: xattrcreate requires non-opened fids
9pfs: limit xattr size in xattrcreate
9pfs: fix integer overflow issue in xattr read/write
9pfs: convert 'len/copied_len' field in V9fsXattr to the type of uint64_t
9pfs: add xattrwalk_fid field in V9fsXattr struct
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
To make it a little more obvious which functions are intended to be
public interface and which are intended to be for use only by jobs
themselves, split the interface into "public" and "private" files.
Convert blockjobs (e.g. block/backup) to using the private interface.
Leave blockdev and others on the public interface.
There are remaining uses of private state by qemu-img, and several
cases in blockdev.c and block/io.c where we grab job->blk for the
purposes of acquiring an AIOContext.
These will be corrected in future patches.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1477584421-1399-7-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
There's no reason to leave this to blockdev; we can do it in blockjobs
directly and get rid of an extra callback for most users.
All non-internal events, even those created outside of QMP, will
consistently emit events.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1477584421-1399-5-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
If jobs are not created directly by the user, do not allow them to be
seen by the user/management utility. At the moment, 'internal' jobs are
those that do not have an ID. As of this patch it is impossible to
create such jobs.
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1477584421-1399-2-git-send-email-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
After introduction of qapi schema in gluster block driver code, the port
type is now string as per InetSocketAddress
{ 'struct': 'InetSocketAddress',
'data': {
'host': 'str',
'port': 'str',
'*to': 'uint16',
'*ipv4': 'bool',
'*ipv6': 'bool' } }
but the current code still treats it as QEMU_OPT_NUMBER, hence fixing port
to accept QEMU_OPT_STRING.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
using atoi() for converting string to int may be error prone in case if
string supplied in the argument is not a fold of numerical number,
This is not a bug because in the existing code,
static QemuOptsList runtime_tcp_opts = {
.name = "gluster_tcp",
.head = QTAILQ_HEAD_INITIALIZER(runtime_tcp_opts.head),
.desc = {
...
{
.name = GLUSTER_OPT_PORT,
.type = QEMU_OPT_NUMBER,
.help = "port number ...",
},
...
};
port type is QEMU_OPT_NUMBER, before we actually reaches atoi() port is already
defended by parse_option_number()
However It is a good practice to use function like parse_uint_full()
over atoi() to keep port self defended
Note: As now the port string to int conversion has its defence code set,
and also we understand that port argument is actually a string type,
in the follow up patch let's move port type from QEMU_OPT_NUMBER to
QEMU_OPT_STRING
[Jeff Cody: removed spurious parenthesis]
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
We already specified BDRV_O_UNMAP when opening images in 'qemu-img
commit', but didn't turn on the "unmap" in the active commit job. This
patch fixes that so that zeroed clusters in top image can be discarded
which is desired in the virt-sparsify use case, where a temporary
overlay is created and fstrim'ed before commiting back, to free space in
the original image.
This also enables it for block-commit.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1474974892-5031-1-git-send-email-famz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
Currently, for every drive accessed via gfapi we create a new glfs
instance (call glfs_new() followed by glfs_init()) which could consume
memory in few 100 MB's, from the table below it looks like for each
instance ~300 MB VSZ was consumed
Before:
-------
Disks VSZ RSS
1 1098728 187756
2 1430808 198656
3 1764932 199704
4 2084728 202684
This patch maintains a list of pre-opened glfs objects. On adding
a new drive belonging to the same gluster volume, we just reuse the
existing glfs object by updating its refcount.
With this approch we shrink up the unwanted memory consumption and
glfs_new/glfs_init calls for accessing a disk (file) if belongs to
same volume.
From below table notice that the memory usage after adding a disk
(which will reuse the existing glfs object hence) is in negligible
compared to before.
After:
------
Disks VSZ RSS
1 1101964 185768
2 1109604 194920
3 1114012 196036
4 1114496 199868
Disks: number of -drive
VSZ: virtual memory size of the process in KiB
RSS: resident set size, the non-swapped physical memory (in kiloBytes)
VSZ and RSS are analyzed using 'ps aux' utility.
Signed-off-by: Prasanna Kumar Kalever <prasanna.kalever@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 1477581890-4811-1-git-send-email-prasanna.kalever@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
Add checks to see if the system compiling QEMU has support for
SEEK_HOLE/SEEK_DATA. If the system does not, we will flag that seek
data is unsupported in gluster.
Note: this is not a check on whether the gluster server itself supports
SEEK_DATA (that is already done during runtime), but rather if the
compilation environment supports SEEK_DATA.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Message-id: 00370bce5c98140d6c56ad5145635ec6551265cc.1475876377.git.jcody@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
The "logfile" option to BlockdevOptionsGluster will not be in
QEMU until 2.8. Update comment to indicate this.
Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
qemu-ga patch queue for 2.8
* add guest-fstrim support for w32
* add support for using virtio-vsock as the communication channel
# gpg: Signature made Tue 01 Nov 2016 00:55:40 GMT
# gpg: using RSA key 0x3353C9CEF108B584
# gpg: Good signature from "Michael Roth <flukshun@gmail.com>"
# gpg: aka "Michael Roth <mdroth@utexas.edu>"
# gpg: aka "Michael Roth <mdroth@linux.vnet.ibm.com>"
# Primary key fingerprint: CEAC C9E1 5534 EBAB B82D 3FA0 3353 C9CE F108 B584
* remotes/mdroth/tags/qga-pull-2016-10-31-tag:
qga: add vsock-listen method
sockets: add AF_VSOCK support
qga: drop unnecessary GA_CHANNEL_UNIX_LISTEN checks
qga: drop unused sockaddr in accept(2) call
qga: minimal support for fstrim for Windows guests
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The virtfs_reset() function is called either when the virtio-9p device
gets reset, or when the client starts a new 9P session. In both cases,
if it finds fids from a previous session, the following is printed in
the monitor:
9pfs:virtfs_reset: One or more uncluncked fids found during reset
For example, if a linux guest with a mounted 9P share is reset from the
monitor with system_reset, the message will be printed. This is excessive
since these fids are now clunked and the state is clean.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
A buggy or malicious guest could pass the id of an already opened fid and
cause QEMU to abort. Let's return EINVAL to the guest instead.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
The xattrcreate operation only makes sense on a freshly cloned fid
actually, since any open state would be leaked because of the fid_type
change. This is indeed what the linux kernel client does:
fid = clone_fid(fid);
[...]
retval = p9_client_xattrcreate(fid, name, value_len, flags);
This patch also reverts commit ff55e94d23 since we are sure that a fid
with type P9_FID_NONE doesn't have a previously allocated xattr.
Signed-off-by: Greg Kurz <groug@kaod.org>
We shouldn't allow guests to create extended attribute with arbitrary sizes.
On linux hosts, the limit is XATTR_SIZE_MAX. Let's use it.
Signed-off-by: Greg Kurz <groug@kaod.org>
The v9fs_xattr_read() and v9fs_xattr_write() are passed a guest
originated offset: they must ensure this offset does not go beyond
the size of the extended attribute that was set in v9fs_xattrcreate().
Unfortunately, the current code implement these checks with unsafe
calculations on 32 and 64 bit values, which may allow a malicious
guest to cause OOB access anyway.
Fix this by comparing the offset and the xattr size, which are
both uint64_t, before trying to compute the effective number of bytes
to read or write.
Suggested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-By: Guido Günther <agx@sigxcpu.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
The 'len' in V9fsXattr comes from the 'size' argument in setxattr()
function in guest. The setxattr() function's declaration is this:
int setxattr(const char *path, const char *name,
const void *value, size_t size, int flags);
and 'size' is treated as u64 in linux kernel client code:
int p9_client_xattrcreate(struct p9_fid *fid, const char *name,
u64 attr_size, int flags)
So the 'len' should have an type of 'uint64_t'.
The 'copied_len' in V9fsXattr is used to account for copied bytes, it
should also have an type of 'uint64_t'.
Suggested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Currently, 9pfs sets the 'copied_len' field in V9fsXattr
to -1 to tag xattr walk fid. As the 'copied_len' is also
used to account for copied bytes, this may make confusion. This patch
add a bool 'xattrwalk_fid' to tag the xattr walk fid.
Suggested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Add AF_VSOCK (virtio-vsock) support as an alternative to virtio-serial.
$ qemu-system-x86_64 -device vhost-vsock-pci,guest-cid=3 ...
(guest)# qemu-ga -m vsock-listen -p 3:1234
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Add the AF_VSOCK address family so that qemu-ga will be able to use
virtio-vsock.
The AF_VSOCK address family uses <cid, port> address tuples. The cid is
the unique identifier comparable to an IP address. AF_VSOCK does not
use name resolution so it's easy to convert between struct sockaddr_vm
and strings.
This patch defines a VsockSocketAddress instead of trying to piggy-back
on InetSocketAddress. This is cleaner in the long run since it avoids
lots of IPv4 vs IPv6 vs vsock special casing.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
* treat trailing commas as garbage when parsing (Eric Blake)
* add configure check instead of checking AF_VSOCK directly
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Throughout the code there are c->listen_channel checks which manage the
listen socket file descriptor (waiting for accept(2), closing the file
descriptor, etc). These checks are currently preceded by explicit
c->method == GA_CHANNEL_UNIX_LISTEN checks.
Explicit GA_CHANNEL_UNIX_LISTEN checks are not necessary since serial
channel types do not create the listen channel (c->listen_channel).
As more listen channel types are added, explicitly checking all of them
becomes messy. Rely on c->listen_channel to determine whether or not a
listen socket file descriptor is used.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
ga_channel_listen_accept() is currently hard-coded to support only
AF_UNIX because the struct sockaddr_un type is used. This function
should work with any address family.
Drop the sockaddr since the client address is unused and is an optional
argument to accept(2).
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Unfortunately, there is no public Windows API to start trimming the
filesystem. The only viable way here is to call 'defrag.exe /L' for
each volume.
This is working since Win8 and Win2k12.
Signed-off-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Denis Plotnikov <dplotnikov@virtuozzo.com>
CC: Michael Roth <mdroth@linux.vnet.ibm.com>
CC: Stefan Weil <sw@weilnetz.de>
CC: Marc-André Lureau <marcandre.lureau@gmail.com>
* check g_utf16_to_utf8() return value for GError handling instead
of GError directly (Marc-André)
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The cpu is allowed to require stricter alignment on these 8- and 16-byte
operations, and the OS is required to fix up the accesses as necessary,
so the previous code was not wrong.
However, we can easily handle this misalignment for all direct 8-byte
operations and for direct 16-byte loads.
We must retain 16-byte alignment for 16-byte stores, so that we don't have
to probe for writability of a second page before performing the first of
two 8-byte stores. We also retain 8-byte alignment for no-fault loads,
since they are rare and it's not worth extending the helpers for this.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
At the same time, fix a problem with stqf_asi, when
a write might access two pages.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Now that we never call out to helpers when direct accesses can
handle an asi, remove the corresponding code in those helpers.
For ldda, this removes the entire helper.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Print a warning when mixing [+-]foo and foo=(on|off) in the -cpu
argument in a way that will break in the future.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Block layer patches
# gpg: Signature made Mon 31 Oct 2016 16:10:07 GMT
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (29 commits)
qapi: allow blockdev-add for NFS
block/nfs: Introduce runtime_opts in NFS
block: Mention replication in BlockdevDriver enum docs
qemu-iotests: test 'offset' and 'size' options in raw driver
raw_bsd: add offset and size options
qemu-iotests: Test the 'base-node' parameter of 'block-stream'
block: Add 'base-node' parameter to the 'block-stream' command
qemu-iotests: Test streaming to a Quorum child
qemu-iotests: Add iotests.supports_quorum()
qemu-iotests: Test block-stream and block-commit in parallel
qemu-iotests: Test overlapping stream and commit operations
qemu-iotests: Test block-stream operations in parallel
qemu-iotests: Test streaming to an intermediate layer
docs: Document how to stream to an intermediate layer
block: Add QMP support for streaming to an intermediate layer
block: Support streaming to an intermediate layer
block: Block all intermediate nodes in commit_active_start()
block: Block all nodes involved in the block-commit operation
block: Check blockers in all nodes involved in a block-commit job
block: Use block_job_add_bdrv() in backup_start()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Some tests use the "-vnc none" option without any clear reason,
making those tests break when --disable-vnc is specified on
./configure. Remove the unnecessary option.
Reviewed-by: John Snow <jsnow@redhat.com>
Tested-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Now the kernel commit 05f0c03fbac1 ("vfio-pci: Allow to mmap
sub-page MMIO BARs if the mmio page is exclusive") allows VFIO
to mmap sub-page BARs. This is the corresponding QEMU patch.
With those patches applied, we could passthrough sub-page BARs
to guest, which can help to improve IO performance for some devices.
In this patch, we expand MemoryRegions of these sub-page
MMIO BARs to PAGE_SIZE in vfio_pci_write_config(), so that
the BARs could be passed to KVM ioctl KVM_SET_USER_MEMORY_REGION
with a valid size. The expanding size will be recovered when
the base address of sub-page BAR is changed and not page aligned
any more in guest. And we also set the priority of these BARs'
memory regions to zero in case of overlap with BARs which share
the same page with sub-page BARs in guest.
Signed-off-by: Yongji Xie <xyjxie@linux.vnet.ibm.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
When a PCI device is reset, pci_do_device_reset resets all BAR addresses
in the relevant PCIDevice's config buffer.
The VFIO configuration space stays untouched, so the guest OS may choose
to skip restoring the BAR addresses as they would seem intact. The PCI
device may be left non-operational.
One example of such a scenario is when the guest exits S3.
Fix this by resetting the BAR addresses in the VFIO configuration space
as well.
Signed-off-by: Ido Yariv <ido@wizery.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
As reported in the link below, user has a PCI device with a 4KB BAR
which contains the MSI-X table. This seems to hit a corner case in
the kernel where the region reports being mmap capable, but the sparse
mmap information reports a zero sized range. It's not entirely clear
that the kernel is incorrect in doing this, but regardless, we need
to handle it. To do this, fill our mmap array only with non-zero
sized sparse mmap entries and add an error return from the function
so we can tell the difference between nr_mmaps being zero based on
sparse mmap info vs lack of sparse mmap info.
NB, this doesn't actually change the behavior of the device, it only
removes the scary "Failed to mmap ... Performance may be slow" error
message. We cannot currently create an mmap over the MSI-X table.
Link: http://lists.nongnu.org/archive/html/qemu-discuss/2016-10/msg00009.html
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
With a vfio assigned device we lay down a base MemoryRegion registered
as an IO region, giving us read & write accessors. If the region
supports mmap, we lay down a higher priority sub-region MemoryRegion
on top of the base layer initialized as a RAM device pointer to the
mmap. Finally, if we have any quirks for the device (ie. address
ranges that need additional virtualization support), we put another IO
sub-region on top of the mmap MemoryRegion. When this is flattened,
we now potentially have sub-page mmap MemoryRegions exposed which
cannot be directly mapped through KVM.
This is as expected, but a subtle detail of this is that we end up
with two different access mechanisms through QEMU. If we disable the
mmap MemoryRegion, we make use of the IO MemoryRegion and service
accesses using pread and pwrite to the vfio device file descriptor.
If the mmap MemoryRegion is enabled and results in one of these
sub-page gaps, QEMU handles the access as RAM, using memcpy to the
mmap. Using either pread/pwrite or the mmap directly should be
correct, but using memcpy causes us problems. I expect that not only
does memcpy not necessarily honor the original width and alignment in
performing a copy, but it potentially also uses processor instructions
not intended for MMIO spaces. It turns out that this has been a
problem for Realtek NIC assignment, which has such a quirk that
creates a sub-page mmap MemoryRegion access.
To resolve this, we disable memory_access_is_direct() for ram_device
regions since QEMU assumes that it can use memcpy for those regions.
Instead we access through MemoryRegionOps, which replaces the memcpy
with simple de-references of standard sizes to the host memory.
With this patch we attempt to provide unrestricted access to the RAM
device, allowing byte through qword access as well as unaligned
access. The assumption here is that accesses initiated by the VM are
driven by a device specific driver, which knows the device
capabilities. If unaligned accesses are not supported by the device,
we don't want them to work in a VM by performing multiple aligned
accesses to compose the unaligned access. A down-side of this
philosophy is that the xp command from the monitor attempts to use
the largest available access weidth, unaware of the underlying
device. Using memcpy had this same restriction, but at least now an
operator can dump individual registers, even if blocks of device
memory may result in access widths beyond the capabilities of a
given device (RTL NICs only support up to dword).
Reported-by: Thorsten Kohfeldt <thorsten.kohfeldt@gmx.de>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Setting skip_dump on a MemoryRegion allows us to modify one specific
code path, but the restriction we're trying to address encompasses
more than that. If we have a RAM MemoryRegion backed by a physical
device, it not only restricts our ability to dump that region, but
also affects how we should manipulate it. Here we recognize that
MemoryRegions do not change to sometimes allow dumps and other times
not, so we replace setting the skip_dump flag with a new initializer
so that we know exactly the type of region to which we're applying
this behavior.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce new object 'BlockdevOptionsNFS' in qapi/block-core.json to
support blockdev-add for NFS network protocol driver. Also make a new
struct NFSServer to support tcp connection.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Make NFS block driver use various fine grained runtime_opts.
Set .bdrv_parse_filename() to nfs_parse_filename() and introduce two
new functions nfs_parse_filename() and nfs_parse_uri() to help parsing
the URI.
Add a new option "server" which then accepts a new struct NFSServer.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
[ kwolf: Fixed client->path ]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Added two new options 'offset' and 'size'. This makes it possible to use
only part of the file as a device. This can be used e.g. to limit the
access only to single partition in a disk image or use a disk inside a
tar archive (like OVA).
When 'size' is specified we do our best to honour it.
Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The block-stream command has traditionally used the 'base' parameter
to indicate the image to copy the data from. This test checks that the
'base-node' parameter can also be used for the same purpose.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The way to specify the node from which to copy data in the
block-stream operation is by using the 'base' parameter. This
parameter however takes a file name, not a node name.
Since we want to be able to perform this operation using only node
names, this patch adds a new 'base-node' parameter.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Quorum children are special in the sense that they're not directly
attached to a block backend but they're not used as backing images
either. However the intermediate block streaming code supports
streaming to them. This is a test case for that scenario.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There's many tests that need Quorum support in order to run. At the
moment each test implements its own check to see if Quorum is
enabled. This patch centralizes all those checks in a new function
called iotests.supports_quorum().
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
As with test_stream_parallel(), we allow mixing block-stream and
block-commit operations in the same backing chain as long as there's
no overlap among the involved nodes.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
These test cases check that it's not possible to perform two
block-stream or block-commit operations if there are nodes involved in
both.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This test case checks that it's possible to launch several stream
operations in parallel in the same snapshot chain, each one involving
a different set of nodes.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds test_stream_intermediate(), similar to test_stream() but
streams to the intermediate image instead.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch makes the 'device' parameter of the 'block-stream' command
accept a node name that is not a root node. The presence of this
feature can't be directly tested with introspection; soon we'll
introduce a 'base-node' parameter whose presence can be checked for
this purpose.
In addition to that, operation blockers will be checked in all
intermediate nodes between the top and the base node.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This makes sure that the image we are streaming into is open in
read-write mode during the operation.
Operation blockers are also set in all intermediate nodes, since they
will be removed from the chain afterwards.
Finally, this also unblocks the stream operation in backing files.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When block-commit is launched without the top parameter, it uses
internally a mirror block job. In that case all intermediate nodes
between the active and base nodes must be blocked as well.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
After a successful block-commit operation all nodes between top and
base are removed from the backing chain, and top's overlay needs to
be updated to point to base. Because of that we should prevent other
block jobs from messing with them.
This patch blocks all operations in these nodes in commit_start().
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
qmp_block_commit() checks for op blockers in the active and
destination (base) images. However all nodes between top_bs and base
are also involved, and they are removed from the chain afterwards.
In addition to that, if top_bs is not the active layer then top_bs's
overlay also needs to be checked because it's involved in the job (its
backing image string needs to be updated to point to 'base').
This patch checks that none of those nodes are blocked.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Use block_job_add_bdrv() instead of blocking all operations in
backup_start() and unblocking them in backup_run().
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Use block_job_add_bdrv() instead of blocking all operations in
mirror_start_job() and unblocking them in mirror_exit().
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When a block job is created on a certain BlockDriverState, operations
are blocked there while the job exists. However, some block jobs may
involve additional BDSs, which must be blocked separately when the job
is created and unblocked manually afterwards.
This patch adds block_job_add_bdrv(), that simplifies this process by
keeping a list of BDSs that are involved in the specified block job.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When a BlockDriverState is about to be reopened it can trigger certain
operations that need to write to disk. During this process a different
block job can be woken up. If that block job completes and also needs
to call bdrv_reopen() it can happen that it needs to do it on the same
BlockDriverState that is still in the process of being reopened.
This can have fatal consequences, like in this example:
1) Block job A starts and sleeps after a while.
2) Block job B starts and tries to reopen node1 (a qcow2 file).
3) Reopening node1 means flushing and replacing its qcow2 cache.
4) While the qcow2 cache is being flushed, job A wakes up.
5) Job A completes and reopens node1, replacing its cache.
6) Job B resumes, but the cache that was being flushed no longer
exists.
This patch splits the bdrv_drain_all() call to keep all block jobs
paused during bdrv_reopen_multiple(), so that step 4 can never happen
and the operation is safe.
Note that this scenario can only happen if both bdrv_reopen() calls
are made by block jobs on the same backing chain. Otherwise there's no
chance that the same BlockDriverState appears in both reopen queues.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_drain_all() doesn't allow the caller to do anything after all
pending requests have been completed but before block jobs are
resumed.
This patch splits bdrv_drain_all() into _begin() and _end() for that
purpose. It also adds aio_{disable,enable}_external() calls to disable
external clients in the meantime.
An important restriction of this split is that no new block jobs or
BlockDriverStates can be created between the bdrv_drain_all_begin()
and bdrv_drain_all_end() calls. This is not a concern now because
we'll only be using this in bdrv_reopen_multiple(), but it must be
dealt with if we ever have other uses cases in the future.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Introduce new object 'BlockdevOptionsSsh' in qapi/block-core.json to
support blockdev-add for SSH network protocol driver. Use only 'struct
InetSocketAddress' since SSH only supports connection over TCP.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
[ kwolf: Removed host_key_check option, we want to expose this later in
a structured way rather than as a string that must be parsed ]
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add InetSocketAddress compatibility to SSH driver.
Add a new option "server" to the SSH block driver which then accepts
a InetSocketAddress.
"host" and "port" are supported as legacy options and are mapped to
their InetSocketAddress representation.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Make inet_connect_saddr() in util/qemu-sockets.c public in order to be
able to use it with InetSocketAddress sockets outside of
util/qemu-sockets.c independently.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We have 5 options plus ("server") option which is added in the next
patch that conflict with specifying a SSH filename. We need to iterate
over all the options to check whether its key has an "server." prefix.
This iteration will help us adding the new option "server" easily.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
As used by HelenOS, presumably for ultra 2 and 3,
prior to the sun4v platform and the current twinx names.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
It's handy to have a mmu idx for physical addresses, so
that mmu disabled and physical access asis can use the
same path as normal accesses.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Several helpers call helper_raise_exception directly, which requires
in turn that their callers have performed save_state. The new function
allows a TCG return address to be passed in so that we can restore
PC + NPC + flags data from that.
This fixes a bug in the usage of helper_check_align, whose callers had
not been calling save_state. It fixes another bug in which the divide
helpers used GETPC at a level other than the direct callee from TCG.
This allows the translator to avoid save_state prior to SAVE, RESTORE,
and FLUSHW instructions.
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Base patches for MTTCG enablement.
# gpg: Signature made Mon 31 Oct 2016 14:01:41 GMT
# gpg: using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream-mttcg:
tcg: move locking for tb_invalidate_phys_page_range up
*_run_on_cpu: introduce run_on_cpu_data type
cpus: re-factor out handle_icount_deadline
tcg: cpus rm tcg_exec_all()
tcg: move tcg_exec_all and helpers above thread fn
target-arm/arm-powerctl: wake up sleeping CPUs
tcg: protect translation related stuff with tb_lock.
translate-all: Add assert_(memory|tb)_lock annotations
linux-user/elfload: ensure mmap_lock() held while setting up
tcg: comment on which functions have to be called with tb_lock held
cpu-exec: include cpu_index in CPU_LOG_EXEC messages
translate-all: add DEBUG_LOCKING asserts
translate_all: DEBUG_FLUSH -> DEBUG_TB_FLUSH
cpus: make all_vcpus_paused() return bool
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In the linux-user case all things that involve ''l1_map' and PageDesc
tweaks are protected by the memory lock (mmpa_lock). For SoftMMU mode
we previously relied on single threaded behaviour, with MTTCG we now use
the tb_lock().
As a result we need to do a little re-factoring and push the taking of
this lock up the call tree. This requires a slightly different entry for
the SoftMMU and user-mode cases from tb_invalidate_phys_range.
This also means user-mode breakpoint insertion needs to take two locks
but it hadn't taken any previously so this is an improvement.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20161027151030.20863-20-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This changes the *_run_on_cpu APIs (and helpers) to pass data in a
run_on_cpu_data type instead of a plain void *. This is because we
sometimes want to pass a target address (target_ulong) and this fails on
32 bit hosts emulating 64 bit guests.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20161027151030.20863-24-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Migration bits from the COLO project
# gpg: Signature made Sun 30 Oct 2016 10:39:55 GMT
# gpg: using RSA key 0xEB0B4DFC657EF670
# gpg: Good signature from "Amit Shah <amit@amitshah.net>"
# gpg: aka "Amit Shah <amit@kernel.org>"
# gpg: aka "Amit Shah <amitshah@gmx.net>"
# Primary key fingerprint: 48CA 3722 5FE7 F4A8 B337 2735 1E9A 3B5F 8540 83B6
# Subkey fingerprint: CC63 D332 AB8F 4617 4529 6534 EB0B 4DFC 657E F670
* remotes/amit-migration/tags/migration-for-2.8:
MAINTAINERS: Add maintainer for COLO framework related files
configure: Support enable/disable COLO feature
docs: Add documentation for COLO feature
COLO: Implement failover work for secondary VM
COLO: Implement the process of failover for primary VM
COLO: Introduce state to record failover process
COLO: Add 'x-colo-lost-heartbeat' command to trigger failover
COLO: Synchronize PVM's state to SVM periodically
COLO: Add checkpoint-delay parameter for migrate-set-parameters
COLO: Load VMState into QIOChannelBuffer before restore it
COLO: Send PVM state to secondary side when do checkpoint
COLO: Add a new RunState RUN_STATE_COLO
COLO: Introduce checkpointing protocol
COLO: Establish a new communicating path for COLO
migration: Switch to COLO process after finishing loadvm
migration: Enter into COLO mode after migration if COLO is enabled
COLO: migrate COLO related info to secondary node
migration: Introduce capability 'x-colo' to migration
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
target-arm queue:
* Fix reset GPIO handling for spitz, tosa boards
* virt: add 'pmu' property for configuring whether to expose the
vPMU to the guest
* char: cadence: correct reset value for baud rate registers
* versatilepb: do not run if user asks for more than 256MB RAM
* pxa2xx: Set value default values for CCCR and CKEN on PXA255
* arm: cubieboard: Add support for initrd
* i.MX: Fix GPIO ISR register write
# gpg: Signature made Fri 28 Oct 2016 15:56:56 BST
# gpg: using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20161028:
hw/arm/tosa: Fix reset handling
hw/arm/spitz: Fix reset handling
arm: virt: add PMU property to mach-virt machine type
arm: Add an option to turn on/off vPMU support
char: cadence: correct reset value for baud rate registers
versatilepb: do not run if user asks for more than 256MB RAM
hw/arm/pxa2xx: Set value default values for CCCR and CKEN on PXA255
arm: cubieboard: Add support for initrd
i.MX: Fix GPIO ISR register write
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# gpg: Signature made Fri 28 Oct 2016 15:47:39 BST
# gpg: using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6
* remotes/famz/tags/for-upstream:
aio: convert from RFifoLock to QemuRecMutex
qemu-thread: introduce QemuRecMutex
iothread: release AioContext around aio_poll
block: only call aio_poll on the current thread's AioContext
qemu-img: call aio_context_acquire/release around block job
qemu-io: acquire AioContext
block: prepare bdrv_reopen_multiple to release AioContext
replication: pass BlockDriverState to reopen_backing_file
iothread: detach all block devices before stopping them
aio: introduce qemu_get_current_aio_context
sheepdog: use BDRV_POLL_WHILE
nfs: use BDRV_POLL_WHILE
nfs: move nfs_set_events out of the while loops
block: introduce BDRV_POLL_WHILE
qed: Implement .bdrv_drain
block: change drain to look only at one child at a time
block: add BDS field to count in-flight requests
mirror: use bdrv_drained_begin/bdrv_drained_end
blockjob: introduce .drain callback for jobs
replication: interrupt failover if the main device is closed
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is a pure mechanical change in preparation for up-coming
re-factoring. Instead of a forward declaration for tcg_exec_all it and
the associated helper functions are moved in front of the call from
qemu_tcg_cpu_thread_fn.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20161027151030.20863-12-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This adds calls to the assert_(memory|tb)_lock for all public APIs which
are documented as needing them held for linux-user mode. The asserts are
NOPs for system-mode although these will be converted when MTTCG is
enabled.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20161027151030.20863-9-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Future patches will enforce the holding of mmap_lock() when we are
manipulating internal memory structures. Technically it doesn't matter
in the case of elfload as we haven't started executing yet. However it
is easier to grab the lock when required than special case the
translate-all API.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20161027151030.20863-8-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
softmmu requires more functions to be thread-safe, because translation
blocks can be invalidated from e.g. notdirty callbacks. Probably the
same holds for user-mode emulation, it's just that no one has ever
tried to produce a coherent locking there.
This patch will guide the introduction of more tb_lock and tb_unlock
calls for system emulation.
Note that after this patch some (most) of the mentioned functions are
still called outside tb_lock/tb_unlock. The next one will rectify this.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20161027151030.20863-7-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This adds asserts to check the locking on the various translation
engines structures. There are two sets of structures that are protected
by locks.
The first the l1map and PageDesc structures used to track which
translation blocks are associated with which physical addresses. In
user-mode this is covered by the mmap_lock.
The second case are TB context related structures which are protected by
tb_lock which is also user-mode only.
Currently the asserts do nothing in SoftMMU mode but this will change
for MTTCG.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20161027151030.20863-4-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make the debug define consistent with the others. The flush operation is
all about invalidating TranslationBlocks on flush events.
Also fix up the commenting on the other DEBUG for the benefit of
checkpatch.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-Id: <20161027151030.20863-3-alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The instructions PCI STORE, PCI LOAD and PCI STORE BLOCK
use calls to memory_region_dispatch_write() and
memory_region_dispatch_read() but do not test the return value.
Furthermore, the instruction PCI STORE BLOCK sets up a PGM_ADDRESSING
exception when the operand 3 is not within the designated PCI address
space instead of a PGM_OPERAND exception.
Let's setup a PGM_OPERAND exception in all of these failure cases.
Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cornelia.huck@de.ibm.com>
The new cryptodev backend named cryptodev-builtin,
which realized by QEMU cipher APIs. These APIs can
be backed by either nettle or gcrypt.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce the virtio_crypto.h which follows
virtio-crypto specification.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch adds session operation and crypto operation
stuff in the cryptodev backend, including function
pointers and corresponding structures.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
cryptodev backend interface is used to realize the active work for
virtual crypto device.
This patch only add the framework, doesn't include specific operations.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Of the three possible parameter combinations for
virtio_queue_set_host_notifier_fd_handler:
- assign=true/set_handler=true is only called from
virtio_device_start_ioeventfd
- assign=false/set_handler=false is called from
set_host_notifier_internal but it only does something when
reached from virtio_device_stop_ioeventfd_impl; otherwise
there is no EventNotifier set on qemu_get_aio_context().
- assign=true/set_handler=false is called from
set_host_notifier_internal, but it is not doing anything:
with the new start_ioeventfd and stop_ioeventfd methods,
there is never an EventNotifier set on qemu_get_aio_context()
at this point. This is enforced by the assertion in
virtio_bus_set_host_notifier.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
ioeventfd_disabled was the only reason for the default
implementation of virtio_device_start_ioeventfd not to use
virtio_bus_set_host_notifier. This is now fixed, and the sole entry
point to set up ioeventfd can be virtio_bus_set_host_notifier.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now that there is not anymore a switch from the generic ioeventfd handler
to the dataplane handler, virtio_bus_set_host_notifier(assign=true) is
always called with !bus->ioeventfd_started, hence virtio_bus_stop_ioeventfd
does nothing in this case. Move the invocation to vhost.c, which is the
only place that needs it.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Make virtio_device_start_ioeventfd_impl use the same logic as
dataplane to set up the host notifier. This removes the need
for the set_handler argument in set_host_notifier_internal.
This is a first step towards using virtio_bus_set_host_notifier
as the sole entry point to set up ioeventfds. At least now
the functions have the same interface, but they still differ
in that virtio_bus_set_host_notifier sets ioeventfd_disabled.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Override start_ioeventfd and stop_ioeventfd to start/stop the
whole dataplane logic. This has some positive side effects:
- no need anymore for virtio_add_queue_aio (i.e. a revert of
commit 1c627137c1)
- no need anymore to switch from generic ioeventfd handlers to
dataplane
It detects some errors better:
$ qemu-system-x86_64 -object iothread,id=io \
-device virtio-scsi-pci,ioeventfd=off,iothread=io
qemu-system-x86_64: -device virtio-scsi-pci,ioeventfd=off,iothread=io:
ioeventfd is required for iothread
while previously it would have started just fine.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Override start_ioeventfd and stop_ioeventfd to start/stop the
whole dataplane logic. This has some positive side effects:
- no need anymore for virtio_add_queue_aio (i.e. a revert of
commit 0ff841f6d1)
- no need anymore to switch from generic ioeventfd handlers to
dataplane
It detects some errors better:
$ qemu-system-x86_64 -object iothread,id=io \
-drive id=null,file=null-aio://,if=none,format=raw \
-device virtio-blk-pci,ioeventfd=off,iothread=io,drive=null
qemu-system-x86_64: -device virtio-blk-pci,ioeventfd=off,iothread=io,drive=null:
ioeventfd is required for iothread
while previously it would have started just fine.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This will be used to forbid iothread configuration when the
proxy does not allow using ioeventfd. To simplify the implementation,
change the direction of the ioeventfd_disabled callback too.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Allow customization of the start and stop of ioeventfd. This will
allow direct start of dataplane without passing through the default
ioeventfd handlers, which in turn allows using the dataplane logic
instead of virtio_add_queue_aio. It will also enable some code
simplification, because the sole entry point to ioeventfd setup
will be virtio_bus_set_host_notifier.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This simplifies the code and removes the ioeventfd_started
and ioeventfd_set_started callback. The only difference is
in how virtio-ccw handles an error---it doesn't disable
ioeventfd forever anymore. It was the only backend to do
so, and if desired this behavior should be implemented in
virtio-bus.c.
Instead of ioeventfd_started, the ioeventfd_assign callback now
determines whether the virtio bus supports host notifiers.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Avoid "tricking" virtio-blk-dataplane into thinking that ioeventfd will be
available when it is not. This bug has always been there, but it will break
TCG+ioeventfd=on once the dataplane code will be always used when ioeventfd=on.
Reviewed-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Replace the load/save with a vmsd.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Provide a vmsd pointer for VirtIO devices to use instead of the
load/save methods.
We'll eventually kill off the load/save methods.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add myself as co-maintainer of COLO framework, so that
I can get CC'ed on future patches and bugs for this feature.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
configure --enable-colo/--disable-colo to switch COLO
support on/off.
COLO feature doesn't depend on any other external libraries,
So here it is reasonable to enable COLO by default, to
avoid re-compile QEMU if users want to use this capability.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
For primary side, if COLO gets failover request from users.
To be exact, gets 'x_colo_lost_heartbeat' command.
COLO thread will exit the loop while the failover BH does the
cleanup work and resumes VM.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
When handling failover, COLO processes differently according to
the different stage of failover process, here we introduce a global
atomic variable to record the status of failover.
We add four failover status to indicate the different stage of failover process.
You should use the helpers to get and set the value.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
We leave users to choose whatever heartbeat solution they want,
if the heartbeat is lost, or other errors they detect, they can use
experimental command 'x_colo_lost_heartbeat' to tell COLO to do failover,
COLO will do operations accordingly.
For example, if the command is sent to the Primary side,
the Primary side will exit COLO mode, does cleanup work,
and then, PVM will take over the service work. If sent to the Secondary side,
the Secondary side will run failover work, then takes over PVM's service work.
Cc: Luiz Capitulino <lcapitulino@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
We should not destroy the state of SVM (Secondary VM) until we receive
the complete data of PVM's state, in case the primary fails in the process
of sending the state, so we cache the VM's state in secondary side before
load it into SVM.
Besides, we should call qemu_system_reset() before load VM state,
which can ensure the data is intact.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
VM checkpointing is to synchronize the state of PVM to SVM, just
like migration does, we re-use save helpers to achieve migrating
PVM's state to Secondary side.
COLO need to cache the data of VM's state in the secondary side before
synchronize it to SVM. COLO need the size of the data to determine
how much data should be read in the secondary side.
So here, we can get the size of the data by saving it into I/O channel
before send it to the secondary side.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
We need communications protocol of user-defined to control
the checkpointing process.
The new checkpointing request is started by Primary VM,
and the interactive process like below:
Checkpoint synchronizing points:
Primary Secondary
initial work
'checkpoint-ready' <-------------------- @
'checkpoint-request' @ -------------------->
Suspend (Only in hybrid mode)
'checkpoint-reply' <-------------------- @
Suspend&Save state
'vmstate-send' @ -------------------->
Send state Receive state
'vmstate-received' <-------------------- @
Release packets Load state
'vmstate-load' <-------------------- @
Resume Resume (Only in hybrid mode)
Start Comparing (Only in hybrid mode)
NOTE:
1) '@' who sends the message
2) Every sync-point is synchronized by two sides with only
one handshake(single direction) for low-latency.
If more strict synchronization is required, a opposite direction
sync-point should be added.
3) Since sync-points are single direction, the remote side may
go forward a lot when this side just receives the sync-point.
4) For now, we only support 'periodic' checkpoint, for which
the Secondary VM is not running, later we will support 'hybrid' mode.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Markus Armbruster <armbru@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
Switch from normal migration loadvm process into COLO checkpoint process if
COLO mode is enabled.
We add three new members to struct MigrationIncomingState,
'have_colo_incoming_thread' and 'colo_incoming_thread' record the COLO
related thread for secondary VM, 'migration_incoming_co' records the
original migration incoming coroutine.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
Add a new migration state: MIGRATION_STATUS_COLO. Migration source side
enters this state after the first live migration successfully finished
if COLO is enabled by command 'migrate_set_capability x-colo on'.
We reuse migration thread, so the process of checkpointing will be handled
in migration thread.
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
We can determine whether or not VM in destination should go into COLO mode
by referring to the info that was migrated.
We skip this section if COLO is not enabled (i.e.
migrate_set_capability colo off), so that, It doesn't break compatibility
with migration no matter whether users configure the --enable-colo/disable-colo
on the source/destination side or not;
Signed-off-by: zhanghailiang <zhang.zhanghailiang@huawei.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Signed-off-by: Amit Shah <amit@amitshah.net>
Fixes the following errors:
* ERROR: line over 90 characters
* ERROR: code indent should never use tabs
* ERROR: space prohibited after that open square bracket '['
* ERROR: do not initialise statics to 0 or NULL
* ERROR: "(foo*)" should be "(foo *)"
Signed-off-by: Emil Condrea <emilcondrea@gmail.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Quan Xu <xuquan8@huawei.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
# gpg: Signature made Fri 28 Oct 2016 09:44:23 BST
# gpg: using RSA key 0xF30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg: aka "Laurent Vivier <laurent@vivier.eu>"
# gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C
* remotes/vivier/tags/m68k-part2-pull-request:
MAINTAINERS: update M68K entry
target-m68k: immediate ops manage word and byte operands
target-m68k: cmp manages word and bytes operands
target-m68k: add/sub manage word and byte operands
target-m68k: add addressing modes to neg
target-m68k: introduce byte and word cc_ops
target-m68k: some bit ops cleanup
target-m68k: suba/adda can manage word operand
target-m68k: and can manage word and byte operands
target-m68k: or can manage word and byte operands
target-m68k: eor can manage word and byte operands
target-m68k: add addressing modes to not
target-m68k: Inline addx, subx, negx
target-m68k: add dbcc
target-m68k: add addressing modes to scc
target-m68k: add exg ops
target-m68k: add linkl
target-m68k: add bkpt instruction
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ppc patch queue 2016-10-28
This pull request supersedes and extends the one from 2016-10-26
(which had a build bug).
Highlights:
* SLOF (pseries guest firmware) update
* Enable a number of extra testcases on ppc / pseries
* Added the 'powernv' machine type
- Almost enough to be minimally usable
- But still missing necessary interrupt controller updates
* Cleanup and consolidation of NVRAM handling on several platforms
with related firmware
* Substantial cleanup to device tree construction
* Some more POWER9 instruction emulation
* Cleanup to handling of pseries option vectors and CAS reboot
handling (host/guest feature negotiation mechanism)
* Significant cleanups to handling of PCI devices in test cases
* New hotplug event infrastructure
* Memory hot unplug support for pseries
* Several bug fixes
The NVRAM cleanup affects some Sun sparc platforms as well as ppc
ones, but have been tested by the sparc maintainer (Mark Cave-Ayland).
The test additions also include substantial general changes to the
test framework that aren't strictly ppc related. They don't seem to
break tests on other platforms, they're for the benefit of enabling
tests on ppc and there isn't a specific maintainer for them, so
they're included in this tree.
# gpg: Signature made Fri 28 Oct 2016 02:37:19 BST
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.8-20161028: (73 commits)
ppc: allow certain HV interrupts to be delivered to guests
spapr: Memory hot-unplug support
spapr: use count+index for memory hotplug
spapr: Add DRC count indexed hotplug identifier type
spapr: add hotplug interrupt machine options
spapr_events: add support for dedicated hotplug event source
spapr: update spapr hotplug documentation
target-ppc: Add xvcmpnesp, xvcmpnedp instructions
target-ppc: add xscmp[eq,gt,ge,ne]dp instructions
tests: Add pseries machine to the prom-env-test, too
spapr_nvram: Pre-initialize the NVRAM to support the -prom-env parameter
libqos: Change PCI accessors to take opaque BAR handle
tests: Don't assume structure of PCI IO base in ahci-test
tests: Use qpci_mem{read,write} in ivshmem-test
libqos: Add 64-bit PCI IO accessors
tests: Clean up IO handling in ide-test
libqos: Implement mmio accessors in terms of mem{read,write}
libqos: Add streaming accessors for PCI MMIO
tests: Adjust tco-test to use qpci_legacy_iomap()
libqos: Better handling of PCI legacy IO
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
scripts/tracetool generates a C preprocessor macro from the name of the
build directory. Any characters which are possible in a directory name
but not allowed in a macro name must be substituted, otherwise builds
will fail.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Some files contain multiple #includes of the same header file.
Removed most of those unnecessary duplicate entries using
scripts/clean-includes.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Anand J <anand.indukala@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Enhance the clean-includes script to optionally check for duplicate #include
entries.
Script might output false positive entries as well. Such entries should
not be removed. So if it finds any duplicate entries script will
terminate with an exit status 1. Then each and every file should be
checked manually and corrected if necessary.
In order to enable the check use --check-dup-head option with
scripts/clean-includes.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Anand J <anand.indukala@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The NSIS based installer currently does not install qemu-ga.
It installs the executables and other files for the QEMU system emulation.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Coverity points out that the comparison "fid <= ZPCI_MAX_FID"
in s390_pci_generate_fid() is always true (because fid
is 32 bits and ZPCI_MAX_FID is 0xffffffff). This isn't a
bug because the real loop termination condition is
expressed later via an "if (...) break;" inside the loop,
but it is a bit odd. Rephrase the loop to avoid the
unnecessary duplicate-but-never-true conditional.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Cornelia Huck <cornelia.huck@de.ibm.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
All the callers of migrate_fd_error() pass a non-NULL
error parameter, and if any did pass NULL then we would
segfault in error_copy(), so remove the unnecessary
NULL check earlier in the function.
(Spotted by Coverity.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Avoid undefined behaviour of echo(1) with backslashes in arguments
The behaviour is implementation-defined, different /bin/sh's behave
differently.
Signed-off-by: Daniel Shahaf <danielsh@apache.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The *_exitfn functions cannot fail and should not be
returning int.
This also removes the passthru_exitfn since this callback
does nothing as of now.
This was suggested as a Bite-sized task for code cleanup.
Signed-off-by: Akanksha Srivastava <akanksha.dlf@gmail.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Information about "qemu-trivial" ML can be found in the wiki:
http://wiki.qemu.org/Contribute/TrivialPatches
But the first place where a developer looks is the file MAINTAINERS.
This also allows the get_maintainer.pl script to display
the qemu-trivial ML address when the mail subject contains "trivial".
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Since the lm32 is a 32 bit architecture, just return a 32 bit value which
is then converted to a 64 bit value.
Spotted by coverity, CID 1005506.
Signed-off-by: Michael Walle <michael@walle.cc>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Drop the rX, rY and rZ stuff and use dc->r{0,1,2} directly. This should
also fix the false positive in coverity CID 1005720.
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Don't truncate the multiplication and do a 64 bit one instead because
because the result is stored in a 64 bit variable.
Spotted by coverity, CID 1167561.
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The lm32 target already has a disassembler which logs the assembly
instructions with "-d in_asm". Therefore, turn of the LOG_DIS() macro to
prevent logging the assembly instructions twice. Also turn the macro in a
one which is always compiled to catch any errors while the macro is turned
off.
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The order of most opcodes with immediates was wrong (according to the
reference manual) in the (debug) logging. Additionally, one operand for the
andhi instruction was completly wrong. Fix these.
Signed-off-by: Michael Walle <michael@walle.cc>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Using the CPU reset handler for resets triggered by writing into
gpio pins other than GPIO01 is not appropriate and does not work,
since the reset triggered by writing into GPIO01 is configurable.
Use a separate reset handler for tosa to reset the entire system
and not just the CPU.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 1477597646-24111-2-git-send-email-linux@roeck-us.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Using the CPU reset handler for resets triggered by writing into
gpio pins other than GPIO01 is not appropriate and does not work,
since the reset triggered by writing into GPIO01 is configurable.
Use a separate reset handler for spitz to reset the entire system
and not just the CPU.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 1477597646-24111-1-git-send-email-linux@roeck-us.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
CPU vPMU is now turned ON by default, but this feature wasn't introduced
until virt-2.7 machine type. To solve this problem, this patch adds a
PMU option in machine state, which is used to control CPU's vPMU status.
This PMU option is not exposed to command line and is turned off in
virt-2.6 machine type.
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Wei Huang <wei@redhat.com>
Message-id: 1477463301-17175-3-git-send-email-wei@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch adds a pmu=[on/off] option to enable/disable vPMU support
in guest vCPU. It allows virt tools, such as libvirt, to determine the
exsitence of vPMU and configure it. Note this option is only available
for cortex-a57/cortex-53/ host CPUs, but unavailable on ARMv7 and other
processors. Also even though "pmu=" option is available for TCG mode,
setting it doesn't turn PMU on.
Signed-off-by: Wei Huang <wei@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Message-id: 1477463301-17175-2-git-send-email-wei@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Cadence UART device emulator stores 'baud rate generator'
and 'baud rate divider' values, used in computing speed, in two
registers. The device specification defines their range and
their reset value. Use their correct value when resetting the
device in cadence_uart_reset.
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1477378140-2670-1-git-send-email-ppandit@redhat.com
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The versatilepb physical address space layout only has
a 256MB region for RAM before the devices. Without a guard
on the amount of RAM requested by the user we would happily
create a RAM area that overlapped with the devices, resulting
in very confusing behaviour (typically a guest crash).
Report the problem to the user if they try to request more
RAM than the board can handle (as we do already for some
other board models).
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Message-id: 20161025093711.17407-1-jcd@tribudubois.net
[PMM: tidied up commit message, comments. Use error_report()
rather than fprintf(stderr, ...).]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The code used default values for PXA270 to configure CCCR. For PXA255,
the resulting register value is invalid (unsupported) and resulted
in a division by zero in the Linux kernel. Use default values from
datasheet instead.
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 1477361273-18888-1-git-send-email-linux@roeck-us.net
[PMM: fixed tabs-vs-spaces nit]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Writing the ISR register is supposed to clear interrupt status bits,
not to set them.
This patch makes '-M sabrelite' work without devicetree changes (Linux
kernel versions 3.18 to 4.7 with imx_v6_v7_defconfig and up to v4.8 with
multi_v7_defconfig; mainline has different problems).
Signed-off-by: Guenter Roeck <linux@roeck-us.net>
Message-id: 1477361005-18646-1-git-send-email-linux@roeck-us.net
Acked-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Merge qio 2016/10/27 v1
# gpg: Signature made Thu 27 Oct 2016 13:54:03 BST
# gpg: using RSA key 0xBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* remotes/berrange/tags/pull-qio-2016-10-27-1:
main: set names for main loop sources created
vnc: set name for all I/O channels created
migration: set name for all I/O channels created
char: set name for all I/O channels created
nbd: set name for all I/O channels created
io: add ability to set a name for IO channels
io: Add a QIOChannelSocket cleanup test
io: set LISTEN flag explicitly for listen sockets
io: Introduce a qio_channel_set_feature() helper
io: Use qio_channel_has_feature() where applicable
io: Fix double shift usages on QIOChannel features
Conflicts:
qemu-char.c
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is the first step towards having fine-grained critical sections in
dataplane threads, which will resolve lock ordering problems between
address_space_* functions (which need the BQL when doing MMIO, even
after we complete RCU-based dispatch) and the AioContext.
Because AioContext does not use contention callbacks anymore, the
unit test has to be changed.
Previously applied as a0710f7995 and
then reverted.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1477565348-5458-19-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
aio_poll is not thread safe; for example bdrv_drain can hang if
the last in-flight I/O operation is completed in the I/O thread after
the main thread has checked bs->in_flight.
The bug remains latent as long as all of it is called within
aio_context_acquire/aio_context_release, but this will change soon.
To fix this, if bdrv_drain is called from outside the I/O thread,
signal the main AioContext through a dummy bottom half. The event
loop then only runs in the I/O thread.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1477565348-5458-18-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
We want the BDS event loop to run exclusively in the iothread that
owns the BDS's AioContext. This macro will provide the synchronization
between the two event loops; for now it just wraps the common idiom
of a while loop around aio_poll.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1477565348-5458-8-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
The "need_check_timer" is used to clear the "NEED_CHECK" flag in the
image header after a grace period once metadata update has finished. To
comply with the bdrv_drain semantics, we should make sure it remains
deleted once .bdrv_drain is called.
The change to qed_need_check_timer_cb is needed because bdrv_qed_drain
is called after s->bs has been drained, and should not operate on it;
instead it should operate on the BdrvChild-ren exclusively. Doing so
is easy because QED does not have a bdrv_co_flush_to_os callback, hence
all that is needed to flush it is to ensure writes have reached the disk.
Based on commit df9a681dc9 (which however included some unrelated
hunks, possibly due to a merge failure or an overlooked squash).
The patch was reverted because at the time bdrv_qed_drain could call
qed_plug_allocating_write_reqs while an allocating write was queued.
This however is not possible anymore after the previous patch;
.bdrv_drain is only called after all writes have completed at the
QED level, and its purpose is to trigger metadata writes in bs->file.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1477565348-5458-7-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
bdrv_requests_pending is checking children to also wait until internal
requests (such as metadata writes) have completed. However, checking
children is in general overkill. Children requests can be of two kinds:
- requests caused by an operation on bs, e.g. a bdrv_aio_write to bs
causing a write to bs->file->bs. In this case, the parent's in_flight
count will always be incremented by at least one for every request in
the child.
- asynchronous metadata writes or flushes. Such writes can be started
even if bs's in_flight count is zero, but not after the .bdrv_drain
callback has been invoked.
This patch therefore changes bdrv_drain to finish I/O in the parent
(after which the parent's in_flight will be locked to zero), call
bdrv_drain (after which the parent will not generate I/O on the child
anymore), and then wait for internal I/O in the children to complete.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1477565348-5458-6-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Unlike tracked_requests, this field also counts throttled requests,
and remains non-zero if an AIO operation needs a BH to be "really"
completed.
With this change, it is no longer necessary to have a dummy
BdrvTrackedRequest for requests that are never serialising, and
it is no longer necessary to poll the AioContext once after
bdrv_requests_pending(bs) returns false.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1477565348-5458-5-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Ensure that there are no changes between the last check to
bdrv_get_dirty_count and the switch to the target.
There is already a bdrv_drained_end call, we only need to ensure
that bdrv_drained_begin is not called twice.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1477565348-5458-4-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
This is required to decouple block jobs from running in an
AioContext. With multiqueue block devices, a BlockDriverState
does not really belong to a single AioContext.
The solution is to first wait until all I/O operations are
complete; then loop in the main thread for the block job to
complete entirely.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1477565348-5458-3-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Without this change, there is a race condition in tests/test-replication.
Depending on how fast the failover job (active commit) runs, there is a
chance of two bad things happening:
1) replication_done can be called after the secondary has been closed
and hence when the BDRVReplicationState is not valid anymore.
2) two copies of the active disk are present during the
/replication/secondary/stop test (that test runs immediately after
/replication/secondary/start, which tests failover). This causes the
corruption detector to fire.
Reviewed-by: Wen Congyang <wency@cn.fujitsu.com>
Reviewed-by: Changlong Xie <xiecl.fnst@cn.fujitsu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <1477565348-5458-2-git-send-email-pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
GTK generates key events for the delete key with key->string[0] = 0x7f
... but this does not work right with the readline_handle_byte()
function in util/readline.c, since this treats the keycode 127 as
backspace. So let's add a special case for the GTK delete key to make
this key behave right in the monitor interface of the GTK ui.
Buglink: https://bugs.launchpad.net/qemu/+bug/1619438
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1477570647-7100-1-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
We do not want to catch the BrlAPI input/ouput immediately, but only
when the guest has started discussing withour virtual device.
This notably fixes input before the guest driver has started.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
ppc hypervisors have delivered system reset and machine check exception
interrupts to guests in some situations (e.g., see FWNMI feature of LoPAPR,
or NMI injection in QEMU).
These exceptions are architected to set the HV bit in hardware, however
when injected into a guest, the HV bit should be cleared. Current code
masks off the HV bit before setting the new MSR, however this happens after
the interrupt delivery model has calculated delivery mode for the exception.
This can result in the guest's MSR LE bit being lost.
Account for this in the exception handler and don't set HV bit for guest
delivery.
Also add another sanity check to ensure similar bugs get caught.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add support to hot remove pc-dimm memory devices.
Since we're introducing a machine-level unplug_request hook, we also
had handling for CPU unplug there as well to ensure CPU unplug
continues to work as it did before.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
* add hooks to CAS/cmdline enablement of hotplug ACR support
* add hook for CPU unplug
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit 0a417869:
spapr: Move memory hotplug to RTAS_LOG_V6_HP_ID_DRC_COUNT type
dropped per-DRC/per-LMB hotplugs event in favor of a bulk add via a
single LMB count value. This was to avoid overrunning the guest EPOW
event queue with hotplug events. This works fine, but relies on the
guest exhaustively scanning for pluggable LMBs to satisfy the
requested count by issuing rtas-get-sensor(DR_ENTITY_SENSE, ...) calls
until all the LMBs associated with the DIMM are identified.
With newer support for dedicated hotplug event source, this queue
exhaustion is no longer as much of an issue due to implementation
details on the guest side, but we still try to avoid excessive hotplug
events by now supporting both a count and a starting index to avoid
unecessary work. This patch makes use of that approach when the
capability is available.
Cc: bharata@linux.vnet.ibm.com
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add support for DRC count indexed hotplug ID type which is primarily
needed for memory hot unplug. This type allows for specifying the
number of DRs that should be plugged/unplugged starting from a given
DRC index.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
* updated rtas_event_log_v6_hp to reflect count/index field ordering
used in PAPR hotplug ACR
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This adds machine options of the form:
-machine pseries,modern-hotplug-events=true
-machine pseries,modern-hotplug-events=false
If false, QEMU will force the use of "legacy" style hotplug events,
which are surfaced through EPOW events instead of a dedicated
hot plug event source, and lack certain features necessary, mainly,
for memory unplug support.
If true, QEMU will enable support for "modern" dedicated hot plug
event source. Note that we will still default to "legacy" style unless
the guest advertises support for the "modern" hotplug events via
ibm,client-architecture-support hcall during early boot.
For pseries-2.7 and earlier we default to false, for newer machine
types we default to true.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Hotplug events were previously delivered using an EPOW interrupt
and were queued by linux guests into a circular buffer. For traditional
EPOW events like shutdown/resets, this isn't an issue, but for hotplug
events there are cases where this buffer can be exhausted, resulting
in the loss of hotplug events, resets, etc.
Newer-style hotplug event are delivered using a dedicated event source.
We enable this in supported guests by adding standard an additional
event source in the guest device-tree via /event-sources, and, if
the guest advertises support for the newer-style hotplug events,
using the corresponding interrupt to signal the available of
hotplug/unplug events.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This updates the existing documentation to reflect recent updates to
the hotplug event structure, which are in draft form but slated
for inclusion in PAPR/LoPAPR.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that we also support the "-prom-env" parameter for the pseries
machine, we can enable this test for this machine, too. Since booting
with TCG is rather slow with the pseries machine, we also enable
the "-nodefaults" parameter for this test now, so that SLOF does not
have to check that much devices during boot and thus runs a little
bit faster.
Signed-off-by: Thomas Huth <thuth@redhat.com>
[dwg: Don't add -nodefaults to the command line, it causes extra warnings
for the sparc testcases]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In case we do not load the NVRAM contents from a file and the user
specified the "-prom-env" parameter, use the new CHRP NVRAM helper
functions to pre-initialize the NVRAM partitions, so that the SLOF
firmware now can pick up the environment variables from the -prom-env
parameter, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The usual use model for the libqos PCI functions is to map a specific PCI
BAR using qpci_iomap() then pass the returned token into IO accessor
functions. This, and the fact that iomap() returns a (void *) which
actually contains a PCI space address, kind of suggests that the return
value from iomap is supposed to be an opaque token.
..except that the callers expect to be able to add offsets to it. Which
also assumes the compiler will support pointer arithmetic on a (void *),
and treat it as working with byte offsets.
To clarify this situation change iomap() and the IO accessors to take
a definitely opaque BAR handle (enforced with a wrapper struct) along with
an offset within the BAR. This changes both the functions and all the
callers.
There were a number of places that checked if iomap() returned non-NULL,
and or initialized it to NULL before hand. Since iomap() already assert()s
if it fails to map the BAR, these tests were mostly pointless and are
removed.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
In a couple of places ahci-test makes assumptions about how the tokens
returned from qpci_iomap() are formatted in ways it probably shouldn't.
First in verify_state() it uses a non-NULL token to indicate that the AHCI
device has been enabled (part of enabling is to iomap()). This changes it
to use an explicit 'enabled' flag instead.
Second, it uses the fact that the token contains a PCI address, stored when
the BAR is mapped during initialization to check that the BAR has the same
value after a migration. This changes it to explicitly read the BAR
register before and after the migration and compare.
Together, these changes will make the test more robust against changes to
the internals of the libqos PCI layer.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
ivshmem implements a block of shared memory in a PCI BAR. Currently our
test case accesses this using qtest_mem{read,write}. However, deducing
the correct addresses for these requires making assumptions about the
internel format returned by qpci_iomap(), along with some ugly casts.
This patch changes the test to use the new qpci_mem{read,write} interfaces
which is neater.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Currently the libqos PCI layer includes accessor helpers for 8, 16 and 32
bit reads and writes. It's likely that we'll want 64-bit accesses in the
future (plenty of modern peripherals will have 64-bit reigsters). This
adds them.
For PIO (not MMIO) accesses on the PC backend, this is implemented as two
32-bit ins or outs. That's not ideal but AFAICT x86 doesn't have 64-bit
versions of in and out.
This patch also converts the single current user of 64-bit accesses -
virtio-pci.c to use the new mechanism, rather than a sequence of 8 byte
reads.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
ide-test uses many explicit inb() / outb() operations for its IO, which
means it's not portable to non-x86 platforms. This cleans it up to use
the libqos PCI accessors instead.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
In the libqos PCI code we now have accessors both for registers (byte
significance preserving) and for streaming data (byte address order
preserving). These exist in both the interface for qtest drivers and in
the machine specific backends.
However, the register-style accessors aren't actually necessary in the
backend. They can be implemented in terms of the byte address order
preserving accessors by the libqos wrappers. This works because PCI is
always little endian.
This does assume that the back end byte address order preserving accessors
will perform the equivalent of a single bus transaction for short lengths.
This is the case, and in fact they currently end up using the same
cpu_physical_memory_rw() implementation within the qtest accelerator.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Currently PCI memory (aka MMIO) space is accessed via a set of readb/writeb
style accessors. This is what we want for accessing discrete registers of
a certain size. However, there are a few cases where we instead need a
"bag of bytes" style streaming interface to PCI MMIO space. This can be
either for streaming data style registers or when there's actual memory
rather than registers in PCI space, for example frame buffers or ivshmem.
This patch adds backend callbacks, and libqos wrappers for this type of
byte address order preserving accesses.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Avoid tco-test making assumptions about the internal format of the address
tokens passed to PCI IO accessors, by using the new qpci_legacy_iomap()
function.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
The usual model for PCI IO with libqos is to use qpci_iomap() to map a
specific BAR for a PCI device, then perform IOs within that BAR using
qpci_io_{read,write}*().
However, certain devices also have legacy PCI IO. In this case, instead of
(or as well as) being accessed via PCI BARs, the device can be accessed
via certain well-known, fixed addresses in PCI IO space.
Two existing tests use legacy PCI IO, and take different flawed approaches
to it:
* tco-test manually constructs a tco_io_base value instead of calling
qpci_iomap(), which assumes internal knowledge of the structure of
the value it shouldn't have
* ide-test uses direct in*() and out*() calls instead of using
qpci_io_*() accessors, meaning it's not portable to non-x86 machine
types.
This patch implements a new qpci_iomap_legacy() interface which gets a
handle in the same format as qpci_iomap() but refers to a region in
the legacy PIO space. For a device which has the same registers
available both in a BAR and in legacy space (quite common), this
allows the same test code to test both options with just a different
iomap() at the beginning.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
The PCI backends in libqos each supply an iomap() and iounmap() function
which is used to set up a specified PCI BAR. But PCI BAR allocation takes
place entirely within PCI space, so doesn't really need per-backend
versions. For example, Linux includes generic BAR allocation code used on
platforms where that isn't done by firmware.
This patch merges the BAR allocation from the two existing backends into a
single simplified copy. The back ends just need to set up some parameters
describing the window of PCI IO and PCI memory addresses which are
available for allocation. Like both the existing versions the new one uses
a simple bump allocator.
Note that (again like the existing versions) this doesn't really handle
64-bit memory BARs properly. It is actually used for such a BAR by the
ivshmem test, and apparently the 32-bit MMIO BAR logic is close enough to
work, as long as the BAR isn't too big. Fixing that to properly handle
64-bit BAR allocation is a problem for another time.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
The PCI IO space (aka PIO, aka legacy IO) and PCI memory space (aka MMIO)
are distinct address spaces by the PCI spec (although parts of one might be
aliased to parts of the other in some cases).
However, qpci_io_read*() and qpci_io_write*() can perform accesses to
either space depending on parameter. That's convenient for test case
drivers, since there are a fair few devices which can be controlled via
either a PIO or MMIO BAR but with an otherwise identical driver.
This is implemented by having addresses below 64kiB treated as PIO, and
those above treated as MMIO. This works because low addresses in memory
space are generally reserved for DMA rather than MMIO.
At the moment, this demultiplexing must be handled by each PCI backend
(pc and spapr, so far). There's no real reason for this - the current
encoding is likely to work for all platforms, and even if it doesn't we
can still use a more complex common encoding since the value returned from
iomap are semi-opaque.
This patch moves the demultiplexing into the common part of the libqos PCI
code, with the backends having simpler, separate accessors for PIO and
MMIO space. This also means we have a way of explicitly accessing either
space if it's necessary for some special case.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
The 'addr' parameter to qvirtio_config_read*() doesn't have a consistent
meaning: when using the virtio-pci versions, it's a full PCI space address,
but for virtio-mmio, it's an offset from the device's base mmio address.
This means that the callers need to do different things to calculate the
addresses in the two cases, which rather defeats the purpose of function
pointer backends.
All the current users of these functions are using them to retrieve
variables from the device specific portion of the virtio config space.
So, this patch alters the semantics to always be an offset into that
device specific config area.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
ADB devices must take new handler into account only when they recognize it.
This lets operating systems probe for valid/invalid handles, to know device capabilities.
Add a FIXME in keyboard handler, which should use a different translation
table depending of the selected handler.
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
ibm,architecture-vec-5 is supposed to encode all option vector 5 bits
negotiated between platform/guest. Currently we hardcode this property
in the boot-time device tree to advertise a single negotiated
capability, "Form 1" NUMA Affinity, regardless of whether or not CAS
has been invoked or that capability has actually been negotiated.
Improve this by generating ibm,architecture-vec-5 based on the full
set of option vector 5 capabilities negotiated via CAS.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In some cases, ibm,client-architecture-support calls can fail. This
could happen in the current code for situations where the modified
device tree segment exceeds the buffer size provided by the guest
via the call parameters. In these cases, QEMU will reset, allowing
an opportunity to regenerate the device tree from scratch via
boot-time handling. There are potentially other scenarios as well,
not currently reachable in the current code, but possible in theory,
such as cases where device-tree properties or nodes need to be removed.
We currently don't handle either of these properly for option vector
capabilities however. Instead of carrying the negotiated capability
beyond the reset and creating the boot-time device tree accordingly,
we start from scratch, generating the same boot-time device tree as we
did prior to the CAS-generated and the same device tree updates as we
did before. This could (in theory) cause us to get stuck in a reset
loop. This hasn't been observed, but depending on the extensiveness
of CAS-induced device tree updates in the future, could eventually
become an issue.
Address this by pulling capability-related device tree
updates resulting from CAS calls into a common routine,
spapr_dt_cas_updates(), and adding an sPAPROptionVector*
parameter that allows us to test for newly-negotiated capabilities.
We invoke it as follows:
1) When ibm,client-architecture-support gets called, we
call spapr_dt_cas_updates() with the set of capabilities
added since the previous call to ibm,client-architecture-support.
For the initial boot, or a system reset generated by something
other than the CAS call itself, this set will consist of *all*
options supported both the platform and the guest. For calls
to ibm,client-architecture-support immediately after a CAS-induced
reset, we call spapr_dt_cas_updates() with only the set
of capabilities added since the previous call, since the other
capabilities will have already been addressed by the boot-time
device-tree this time around. In the unlikely event that
capabilities are *removed* since the previous CAS, we will
generate a CAS-induced reset. In the unlikely event that we
cannot fit the device-tree updates into the buffer provided
by the guest, well generate a CAS-induced reset.
2) When a CAS update results in the need to reset the machine and
include the updates in the boot-time device tree, we call the
spapr_dt_cas_updates() using the full set of negotiated
capabilities as part of the reset path. At initial boot, or after
a reset generated by something other than the CAS call itself,
this set will be empty, resulting in what should be the same
boot-time device-tree as we generated prior to this patch. For
CAS-induced reset, this routine will be called with the full set of
capabilities negotiated by the platform/guest in the previous
CAS call, which should result in CAS updates from previous call
being accounted for in the initial boot-time device tree.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[dwg: Changed an int -> bool conversion to be more explicit]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Currently we access individual bytes of an option vector via
ldub_phys() to test for the presence of a particular capability
within that byte. Currently this is only done for the "dynamic
reconfiguration memory" capability bit. If that bit is present,
we pass a boolean value to spapr_h_cas_compose_response()
to generate a modified device tree segment with the additional
properties required to enable this functionality.
As more capability bits are added, will would need to modify the
code to add additional option vector accesses and extend the
param list for spapr_h_cas_compose_response() to include similar
boolean values for these parameters.
Avoid this by switching to spapr_ovec_* helpers so we can do all
the parsing in one shot and then test for these additional bits
within spapr_h_cas_compose_response() directly.
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
PAPR guests advertise their capabilities to the platform by passing
an ibm,architecture-vec structure via an
ibm,client-architecture-support hcall as described by LoPAPR v11,
B.6.2.3. during early boot.
Using this information, the platform enables the capabilities it
supports, then encodes a subset of those enabled capabilities (the
5th option vector of the ibm,architecture-vec structure passed to
ibm,client-architecture-support) into the guest device tree via
"/chosen/ibm,architecture-vec-5".
The logical format of these these option vectors is a bit-vector,
where individual bits are addressed/documented based on the byte-wise
offset from the beginning of the bit-vector, followed by the bit-wise
index starting from the byte-wise offset. Thus the bits of each of
these bytes are stored in reverse order. Additionally, the first
byte of each option vector is encodes the length of the option vector,
so byte offsets begin at 1, and bit offset at 0.
This is not very intuitive for the purposes of mapping these bits to
a particular documented capability, so this patch introduces a set
of abstractions that encapsulate the work of parsing/encoding these
options vectors and testing for individual capabilities.
Cc: Bharata B Rao <bharata@linux.vnet.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[dwg: Tweaked double-include protection to not trigger a checkpatch
false positive]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
For historical reasons construction of the guest device tree in spapr is
divided between spapr_create_fdt_skel() which is called at init time, and
spapr_build_fdt() which runs at reset time. Over time, more and more
things have needed to be moved to reset time.
Previous cleanups mean the only things left in spapr_create_fdt_skel() are
the properties of the root node itself. Finish consolidating these two
parts of device tree construction, by moving this to the start of
spapr_build_fdt(), and removing spapr_create_fdt_skel() entirely.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Construction of the /vdevice node (and its children) is divided between
spapr_create_fdt_skel() (at init time), which creates the base node, and
spapr_populate_vdevice() (at reset time) which creates the nodes for each
individual virtual device.
This consolidates both into a single function called from
spapr_build_fdt().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Currently the /hypervisor device tree node is constructed in
spapr_create_fdt_skel(). As part of consolidating device tree construction
to reset time, move it to a function called from spapr_build_fdt().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The /event-sources device tree node is built from spapr_create_fdt_skel().
As part of consolidating device tree construction to reset time, this moves
it to spapr_build_fdt().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
For historical reasons construction of the /rtas node in the device
tree (amongst others) is split into several places. In particular
it's split between spapr_create_fdt_skel(), spapr_build_fdt() and
spapr_rtas_device_tree_setup().
In fact, as well as adding the actual RTAS tokens to the device tree,
spapr_rtas_device_tree_setup() just adds the ibm,lrdr-capacity
property, which despite going in the /rtas node, doesn't have a lot to
do with RTAS.
This patch consolidates the code constructing /rtas together into a new
spapr_dt_rtas() function. spapr_rtas_device_tree_setup() is renamed to
spapr_dt_rtas_tokens() and now only adds the token properties.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
For historical reasons, building the /chosen node in the guest device tree
is split across several places and includes both parts which write the DT
sequentially and others which use random access functions.
This patch consolidates construction of the node into one place, using
random access functions throughout.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Currently the device tree node for the XICS interrupt controller is in
spapr_create_fdt_skel(). As part of consolidating device tree construction
to reset time, this moves it to a function called from spapr_build_fdt().
In addition we move the actual code into hw/intc/xics_spapr.c with the
rest of the PAPR specific interrupt controller code.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
At each system reset, the pseries machine needs to load RTAS (the runtime
portion of the guest firmware) into the VM. This means copying
the actual RTAS code into guest memory, and also updating the device
tree so that the guest OS and boot firmware can locate it.
For historical reasons the copy and update to the device tree were in
different parts of the code. This cleanup brings them both together in
an spapr_load_rtas() function.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
The flattened device tree passed to pseries guests contains a list of
reserved memory areas. Currently we construct this list early in
spapr_create_fdt_skel() as we sequentially write the fdt.
This will be inconvenient for upcoming cleanups, so this patch moves
the reserve map changes to the end of fdt construction. This changes
fdt_add_reservemap_entry() calls - which work when writing the fdt
sequentially to fdt_add_mem_rsv() calls used when altering the fdt in
random access mode.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Currently spapr_create_fdt_skel() takes a bunch of individual parameters
for various things it will put in the device tree. Some of these can
already be taken directly from sPAPRMachineState. This patch alters it so
that all of them can be taken from there, which will allow this code to
be moved away from its current caller in future.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
spapr_finalize_fdt() both finishes building the device tree for the guest
and loads it into guest memory. For future cleanups, it's going to be
more convenient to do these two things separately. The loading portion is
pretty trivial, so we move it inline into the caller, ppc_spapr_reset().
We also rename spapr_finalize_fdt(), because the current name is going to
become inaccurate.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
As Qemu only supports a single instance of the ISA bus, we use the LPC
controller of chip 0 to create one and plug in a couple of useful
devices, like an UART and RTC. An IPMI BT device, which is also an ISA
device, can be defined on the command line to connect an external BMC.
That is for later.
The PowerNV machine now has a console. Skiboot should load a kernel
and jump into it but execution will stop quite early because we lack a
model for the native XICS controller for the moment :
[ 0.000000] NR_IRQS:512 nr_irqs:512 16
[ 0.000000] XICS: Cannot find a Presentation Controller !
[ 0.000000] ------------[ cut here ]------------
[ 0.000000] WARNING: at arch/powerpc/platforms/powernv/setup.c:81
...
[ 0.000000] NIP [c00000000079d65c] pnv_init_IRQ+0x30/0x44
You can still do a few things under xmon.
Based on previous work from :
Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[dwg: Trivial fix for a change in the serial_hds_isa_init() interface]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The LPC (Low Pin Count) interface on a POWER8 is made accessible to
the system through the ADU (XSCOM interface). This interface is part
of set of units connected together via a local OPB (On-Chip Peripheral
Bus) which act as a bridge between the ADU and the off chip LPC
endpoints, like external flash modules.
The most important units of this OPB are :
- OPB Master: contains the ADU slave logic, a set of internal
registers and the logic to control the OPB.
- LPCHC (LPC HOST Controller): which implements a OPB Slave, a set of
internal registers and the LPC HOST Controller to control the LPC
interface.
Four address spaces are provided to the ADU :
- LPC Bus Firmware Memory
- LPC Bus Memory
- LPC Bus I/O (ISA bus)
- and the registers for the OPB Master and the LPC Host Controller
On POWER8, an intermediate hop is necessary to reach the OPB, through
a unit called the ECCB. OPB commands are simply mangled in ECCB write
commands.
On POWER9, the OPB master address space can be accessed via MMIO. The
logic is same but the code will be simpler as the XSCOM and ECCB hops
are not necessary anymore.
This version of the LPC controller model doesn't yet implement support
for the SerIRQ deserializer present in the Naples version of the chip
though some preliminary work is there.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: - updated for qemu-2.7
- ported on latest PowerNV patchset
- changed the XSCOM interface to fit new model
- QOMified the model
- moved the ISA hunks in another patch
- removed printf logging
- added a couple of UNIMP logging
- rewrote commit log ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that we are using real HW ids for the cores in PowerNV chips, we
can route the XSCOM accesses to them. We just need to attach a
specific XSCOM memory region to each core in the appropriate window
for the core number.
To start with, let's install the DTS (Digital Thermal Sensor) handlers
which should return 38°C for each core.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
On a real POWER8 system, the Pervasive Interconnect Bus (PIB) serves
as a backbone to connect different units of the system. The host
firmware connects to the PIB through a bridge unit, the
Alter-Display-Unit (ADU), which gives him access to all the chiplets
on the PCB network (Pervasive Connect Bus), the PIB acting as the root
of this network.
XSCOM (serial communication) is the interface to the sideband bus
provided by the POWER8 pervasive unit to read and write to chiplets
resources. This is needed by the host firmware, OPAL and to a lesser
extent, Linux. This is among others how the PCI Host bridges get
configured at boot or how the LPC bus is accessed.
To represent the ADU of a real system, we introduce a specific
AddressSpace to dispatch XSCOM accesses to the targeted chiplets. The
translation of an XSCOM address into a PCB register address is
slightly different between the P9 and the P8. This is handled before
the dispatch using a 8byte alignment for all.
To customize the device tree, a QOM InterfaceClass, PnvXScomInterface,
is provided with a populate() handler. The chip populates the device
tree by simply looping on its children. Therefore, each model needing
custom nodes should not forget to declare itself as a child at
instantiation time.
Based on previous work done by :
Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
[dwg: Added cpu parameter to xscom_complete()]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This is largy inspired by sPAPRCPUCore with some simplification, no
hotplug for instance. A set of PnvCore objects is added to the PnvChip
and the device tree is populated looping on these cores.
Real HW cpu ids are now generated depending on the chip cpu model, the
chip id and a core mask. The id is propagated to the CPU object, using
properties, to set the SPR_PIR (Processor Identification Register)
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The Processor Identification Register (PIR) is a register that holds a
processor identifier which is used for bus transactions (XSCOM) and
for processor differentiation in multiprocessor systems. It also used
in the interrupt vector entries (IVE) to identify the thread serving
the interrupts.
P9 and P8 have some differences in the CPU PIR encoding.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This will be used to build real HW ids for the cores and enforce some
limits on the available cores per chip.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This is is an abstraction of a POWER8 chip which is a set of cores
plus other 'units', like the pervasive unit, the interrupt controller,
the memory controller, the on-chip microcontroller, etc. The whole can
be seen as a socket. It depends on a cpu model and its characteristics:
max cores and specific inits are defined in a PnvChipClass.
We start with an near empty PnvChip with only a few cpu constants
which we will grow in the subsequent patches with the controllers
required to run the system.
The Chip CFAM (Common FRU Access Module) ID gives the model of the
chip and its version number. It is generally the first thing firmwares
fetch, available at XSCOM PCB address 0xf000f, to start initialization.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The goal is to emulate a PowerNV system at the level of the skiboot
firmware, which loads the OS and provides some runtime services. Power
Systems have a lower firmware (HostBoot) that does low level system
initialization, like DRAM training. This is beyond the scope of what
qemu will address in a PowerNV guest.
No devices yet, not even an interrupt controller. Just to get started,
some RAM to load the skiboot firmware, the kernel and initrd. The
device tree is fully created in the machine reset op.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: - updated for qemu-2.7
- replaced fprintf by error_report
- used a common definition of _FDT macro
- removed VMStateDescription as migration is not yet supported
- added IBM Copyright statements
- reworked kernel_filename handling
- merged PnvSystem and sPowerNVMachineState
- removed PHANDLE_XICP
- added ppc_create_page_sizes_prop helper
- removed nmi support
- removed kvm support
- updated powernv machine to version 2.8
- removed chips and cpus, They will be provided in another patches
- added a machine reset routine to initialize the device tree (also)
- french has a squelette and english a skeleton.
- improved commit log.
- reworked prototypes parameters
- added a check on the ram size (thanks to Michael Ellerman)
- fixed chip-id cell
- changed MAX_CPUS to 2048
- simplified memory node creation to one node only
- removed machine version
- rewrote the device tree creation with the fdt "rw" routines
- s/sPowerNVMachineState/PnvMachineState/
- etc.]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When configured to compile out of tree, the configure script
copies BIOS blobs to the build directory. However since the PPC64 powernv
machine ROM has .lid extension, it is ignored and "make check" fails
when trying the powernv machine.
This adds *.lid to the list of copied blobs.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This is the initial image of skiboot 5.3.7 (commit 762d0082) for
the PowerPC PowerNV (Non-Virtualized) platform. Built from
submodule.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The original QOMification of the spapr VIO devices in 3954d33 "spapr:
convert to QEMU Object Model (v2)" moved some callbacks from the
VIOsPAPRBus structure to the VIOsPAPRDeviceClass. Except, that it
forgot to actually remove them from the VIOsPAPRBus structure (which
still exists, though it doesn't fulfill quite the same function as it
did pre-QOM).
This patch removes those now unused callback fields.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Power ISA specifies ME bit handling for system reset interrupt:
if the interrupt occurred while the thread was in power-saving
mode, set to 1; otherwise not altered
Power ISA 3.0, section 6.5 "Interrupt Definitions", Figure 64.
Signed-off-by: Nicholas Piggin <npiggin@gmail.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The routines :
void icp_set_cppr(ICPState *icp, uint8_t cppr);
void icp_set_mfrr(ICPState *icp, uint8_t mfrr);
void icp_eoi(ICPState *icp, uint32_t xirr);
now use one 'ICPState *icp' argument instead of a 'XICSState *' and a
server arguments. The backlink on XICSState* is used whenever needed.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The link will be used to change the API of the icp_* routines which
are still using an XICSState as an argument.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
xics_spapr and xics_kvm nearly define the same 'set_nr_servers'
handler. Only the type of the ICP differs. So let's make a common one
to remove some duplicated code.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The header now only contains inline functions related to the
Sun NVRAM, so the a name like sun_nvram.h seems to be more
appropriate now.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Everything that is related to CHRP NVRAM should rather reside in
chrp_nvram.c / chrp_nvram.h instead of openbios_firmware_abi.h.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The system and free space NVRAM partitions (for OpenBIOS) are created
in exactly the same way as the Mac-style CHRP NVRAM partitions, so we
can use the new common helper functions to do this job here, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The "system partition" and "free space" partition layouts are
defined by the CHRP and LoPAPR specification, and used by
OpenBIOS and SLOF. We can re-use this code for other machines
that use OpenBIOS and SLOF, too. So let's make this code independent
from the MAC NVRAM environment and put it into two proper helper
functions.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Tested-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
With the addition of "numa_node" properties for PHBs we began
advertising NUMA affinity in cases where nb_numa_nodes > 1.
Since the default on the guest side is to make no assumptions about
PHB NUMA affinity (defaulting to -1), there is still a valid use-case
for explicitly defining a PHB's NUMA affinity even when there's just
one node. In particular, some workloads make faulty assumptions about
/sys/bus/pci/<devid>/numa_node being >= 0, warranting the use of
this property as a workaround even if there's just 1 PHB or NUMA
node.
Enable this use-case by always advertising the PHB's NUMA affinity
if "numa_node" has been explicitly set.
We could achieve this by relaxing the check to simply be
nb_numa_nodes > 0, but even safer would be to check
numa_info[nodeid].present explicitly, and to fail at start time
for cases where it does not exist.
This has an additional affect of no longer advertising PHB NUMA
affinity unconditionally if nb_numa_nodes > 1 and "numa_node"
property is unset/-1, but since the default value on the guest
side for each PHB is also -1, the behavior should be the same for
that situation. We could still retain the old behavior if desired,
but the decision seems arbitrary, so we take the simpler route.
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Cc: Shivaprasad G. Bhat <shivapbh@in.ibm.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
but disable MSI-X tests on SPAPR as we can't check the result
(the memory region used on PC is not readable on SPAPR).
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This patch replaces calls to qtest_start() and qtest_end() by
calls to qtest_pc_boot() and qtest_shutdown().
This allows to initialize memory allocator and PCI interface
functions. This will ease to enable virtio tests on other
architectures by only adding a specific qtest_XXX_boot() (like
qtest_spapr_boot()).
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Move the definition to libqos/virtio.h as it must be used
only with virtio functions.
Add a QVirtioDevice parameter as it will be needed to
know if the virtio device is using virtio 1.0 specification
and thus is always little-endian (to do)
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
qtest_spapr_boot()/qtest_pc_boot()/qtest_boot() call qtest_vboot()
and qtest_vboot() calls g_malloc(),
and g_malloc() never fails:
if memory allocation fails, the application is terminated.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Useful to debug interrupt problems.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[clg: - updated for qemu-2.7
- added a test on ->irqs as it is not necessarily allocated
(PHB3_MSI)
- removed static variable g_xics and replace with a loop on all
children to find the xics objects.
- rebased on InterruptStatsProvider interface ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The main changes are:
* virtio-serial
* booting speed imrovement
* better PCI bridge support
The complete changelog is:
> virtio-serial: Fix compile error
> scsi: Remove debug functions from scsi-loader.fs
> scsi: Remove unused read-6 command
> obp-tftp: Remove the ciregs-buffer
> libnet: Simplify the net-load arguments passing
> libnet: Simplify the Forth-to-C wrapper of ping()
> Do not link libnet to net-snk anymore, and remove net-snk from board-qemu
> Add a Forth-to-C wrapper for the ping command, too
> Link libnet code to Paflof and add a wrapper for netboot()
> Remember execution tokens of "write" and "read" for socket operations
> Add virtio-serial device support
> Generalize output banner write routine
> Improve indentation in OF.fs
> scsi: implement READ (16) command
> rtas: Improve rtas-do-config-@ and rtas-do-config-! a little bit
> libnet: Make netapps.h includable from .code files
> libnet: Remove unused prototypes from netapps.h
> libnet: Fix the printout of the ping command
> libnet: Make sure to close sockets when we're done
> scsi: implement read-capacity-16
> pci: Fix secondary and subordinate PCI bus enumeration with board-qemu
> pci-phb: Fix stack underflow in phb-pci-walk-bridge
> paflof: Add a read() function to read keyboard input
> paflof: Add socket(), send() and recv() functions to paflof
> paflof: Provide get_timer() and set_timer() helper functions
> paflof: Add a write_mm_log helper function
> paflof: Copy sbrk code from net-snk
> paflof: Use CFLAGS from make.rules instead of completely redefining them
> Do not include the FCode evaluator by default anymore
> Source code beautification of board-qemu/slof/pci-interrupts.fs
> Allow PCI devices in PCI bridge slots greater than 4
> Fix bad interrupt pin numbering in interrupt-map property of PCI bridges
> Improve SLOF_alloc_mem_aligned()
> instance: Fix set-my-args for empty arguments
> Fix remaining compiler warnings in sloffs.c
> Remove misleading padding fields from ROM header definition
> Improve indentation in calculatecrc.h
> Do not include calculatecrc.h from assembler files
> Remove unused defines in calculatecrc.h
> libnet: Re-initialize global variables at the beginning of tftp()
> Remove dependency on cpu/@0 for booting
> usb: Set XHCI slot speed according to port status
> usb: Build correct route string for USB3 devices behind a hub
> usb: Initialize USB3 devices on a hub and keep track of hub topology
> usb: Increase amount of maximum slot IDs and add a sanity check
> usb: Move XHCI port state arrays from header to .c file
> tools: add copy functionality
> tools: added support to sloffs to read from /dev/slof_flash
> tools: added file append functionality
> tools: use crc checking code from romfs/tools
> tools: added initial version of sloffs
> romfs: factored out crc code, to make it usable from other locations
> tools: remove unused parts from the Makefile
> usb-hid: Fix non-working comma key
> fat-files: Fix access to FAT32 dir/files when cluster > 16-bits
> virtio-net: fix ring handling in receive
> net: Remove remainders of the MTFTP code
> net: Move also files from clients/net-snk/app/netapps/ to lib/libnet/
> net: Move files from clients/net-snk/app/netlib/ to lib/libnet/
> net-snk: Get rid of netlib and netapps prefixes in include statements
> usb-xhci: assign field4 before conditional
> Improve F12 key handling in boot menu
> Fix stack underflow that occurs with duplicated ESC in input
> rtas-nvram: optimize erase
> ipv6: Replace magic number 1500 with ETH_MTU_SIZE (i.e. 1518)
> ipv6: Fix NULL pointer dereference in ip6addr_add()
> ipv6: Fix memory leak in set_ipv6_address() / ip6_create_ll_address()
> ipv6: Clear memory after malloc if necessary
> ipv6: Fix possible NULL-pointer dereference in send_ipv6()
> ping: use gateway address for routing
> ping: add netmask in the ping argument
> xhci: fix missing keys from keyboard
> xhci: add memory barrier after filling the trb
> loaders: Remove netflash command
> boot: Remove legacy Forth words for network loading
> base: Move cnt-bits and bcd-to-bin to board-js2x folder
> base: Move huge-tftp-load variable to obp-tftp package
> base: Remove unused IP address conversion functions
> virtio: White space cleanup in virtio-9p.c
> virtio: Add modern version 1.0 support to 9p driver
> virtio: Set a proper name for virtio-9p device tree nodes
> pci: Fix mistype in "unkown-bridge"
> ipv6: Indent code with tabs, not with spaces
> ipv6: send_ipv6() has to return after doing NDP
> ipv6: Do not use unitialized MAC address array
> ipv6: Add support for sending packets through a router
> Remove unused sms code.
> virtio-net: initialize to populate mac address
> libbootmsg: Do not use '\b' characters when printing checkpoints
> dev-null: The "read" function has to return 0 if nothing has been read
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This makes the FloppyDrive qdev object actually useful: Now that it has
all properties that don't belong to the controller, you can actually
use '-device floppy' and get a working result.
Command line semantics is consistent with CD-ROM drives: By default you
get a single empty floppy drive. You can override it with -drive and
using the same index, but if you use -drive to add a floppy to a
different index, you get both of them. However, as soon as you use any
'-device floppy', even to a different slot, the default drive is
disabled.
Using '-device floppy' without specifying the unit will choose the first
free slot on the controller.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Message-id: 1477386868-21826-4-git-send-email-kwolf@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Since the order of keys in JSON filenames is not necessarily fixed, they
should not be compared to fixed strings. This method takes a Python dict
as a reference, parses a given JSON filename and compares both.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This gives us more freedom about the fd that is passed to qemu, allowing
us to e.g. pass sockets.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
By adding an optional suffix to the files used for communication with a
VM, we can launch multiple VM instances concurrently.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Drop the use of legacy options in favor of the SocketAddress
representation, even for internal use (i.e. for storing the result of
the filename parsing).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add a new option "server" to the NBD block driver which accepts a
SocketAddress.
"path", "host" and "port" are still supported as legacy options and are
mapped to their corresponding SocketAddress representation.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Right now, we have four possible options that conflict with specifying
an NBD filename, and a future patch will add another one ("address").
This future option is a nested QDict that is flattened at this point,
requiring us to test each option whether its key has an "address."
prefix. Therefore, we will then need to iterate through all options
(including the "export" option which was not covered so far).
Adding this iteration logic now will simplify adding the new option
later. A nice side effect is that the user will not receive a long list
of five options which are not supposed to be specified with a filename,
but we can actually print the problematic option.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of inlining this nice macro (i.e. resorting to
qdict_put_obj(..., QOBJECT(...))), use it.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of not emitting the port in nbd_refresh_filename(), just set it
to the default if the user did not specify it. This makes the logic a
bit simpler.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Currently, a port that is passed along with a UNIX socket path is
silently ignored. That is not exactly ideal, it should be an error
instead.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit 076003f5 added configuration for NFS with IMGOPTSSYNTAX enabled,
but it didn't use the right variable name: $TEST_DIR_OPTS doesn't exist.
This fixes the mistake.
However, this doesn't make anything work that was broken before: The
only way to get IMGOPTSSYNTAX is with -luks, but the combination of
-luks and -nfs doesn't get qemu-img create commands right (because
qemu-img create doesn't support --image-opts yet), so even after this
fix some more work would be required to make the tests pass.
Reported-by: Tomáš Golembiovský <tgolembi@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It's the simpler interface to use for the raw format driver.
Apart from that, this removes the last user of the AIO emulation
implemented by bdrv_aio_ioctl().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This allows drivers to implement ioctls in a coroutine-based way.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Instead of letting raw-posix use the bdrv_ioctl() abstraction to issue
an ioctl to itself, just call ioctl() directly.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
All read/write functions already have a single coroutine-based function
on the BlockBackend level through which all requests go (no matter what
API style the external caller used) and which passes the requests down
to the block node level.
This patch exports a bdrv_co_ioctl() function and uses it to extend this
mode of operation to ioctls.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
All read/write functions already have a single coroutine-based function
on the BlockBackend level through which all requests go (no matter what
API style the external caller used) and which passes the requests down
to the block node level.
This patch extends this mode of operation to discards.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
All read/write functions already have a single coroutine-based function
on the BlockBackend level through which all requests go (no matter what
API style the external caller used) and which passes the requests down
to the block node level.
This patch extends this mode of operation to flushes.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
New in this release:
===================
* Initial support for Trusted Platform Module (TPM) version 2.0
* Several USB XHCI timing fixes on real hardware
* Support for "LSI MPT Fusion" scsi controllers on QEMU
* Support for virtio devices mapped above 4GB
* Several bug fixes and code cleanups
git shortlog rel-1.9.3..rel-1.10.0
==================================
Alex Williamson (1):
fw/pci: Add support for mapping Intel IGD via QEMU
Cao jin (1):
Fix comment typo
Cole Robinson (1):
biostables: Support SMBIOS 2.6+ UUID format
Dana Rubin (2):
pvscsi: Fix incorrect arguments order in call to memalign_low
pvscsi: Use high memory for rings
Don Slutz (1):
Support for booting from LSI Logic LSI53C1030, SAS1068, SAS1068e
Gerd Hoffmann (4):
ahci: set transfer mode according to the capabilities of connected drive
virtio: uninline _vp_{read,write}
virtio: pci cfg access
virtio: fix virtio-pci
Haozhong Zhang (1):
fw/msr_feature_control: add support to set MSR_IA32_FEATURE_CONTROL
Igor Mammedov (3):
paravirt: disable legacy bios tables in case of more than 255 CPUs
add helpers to read etc/boot-cpus at resume time
support booting with more than 255 CPUs
Kevin O'Connor (124):
usb: Allow configuration of sigatt time (in etc/usb-time-sigatt)
xhci: Check for device disconnects during USB2 reset polling
sdcard: Only enable error_irq_enable for bits defined in SDHCI v1 spec
sdcard: fix typo causing 32bit write to 16bit block_size field
sdcard: Enable extra debugging on sdcard_waitw() timeout
acpi_extract: Move main code to new function main()
acpi_extract: Make the generated .hex files more human readable
acpi_extract: Don't generate unused (and empty) q35-acpi-dsdt.hex file
acpi: Don't build SSDT files on every build; store them in git
acpi: Remove build check for iasl
tpm: Move standard definitions from tcgbios.h to new file std/tcg.h
util.h: Minor - HaveRunPost is in misc.c not resume.c
tpm: Add "static" declaration to functions not used outside tcgbios.c
tpm: Move code around in tcgbios.c
tpm: Move error recovery from tpm_extend_acpi_log() to only caller
tpm: Open code tpm_ipl() into callers
tpm: Change tpm_add_measurement() to tpm_add_action()
tpm: Move tpm_add_bootdevice() into callers
tpm: Move tpm_start_option_rom_scan() and tpm_calling_int19h() into callers
tpm: pcpes->event is a variable length array
tpm: Don't pass entry_count around in parameters to/from tpm_extend_acpi_log()
tpm: There is no need to pass pcrindex to hash_log_extend_event()
tpm: Perform hashing separately from logging
tpm: There is no need to pass event_length to hash/extend functions
tpm: Avoid scatter-gather copying in build_and_send_cmd()
tpm: Don't implement scatter-gather in transmit()
tpm: Merge tpm_log_event() and tpm_extend_acpi_log()
tpm: Merge tpm_log_extend_event() and tpm_extend(); extend before logging
xhci: Wait for port enable even for USB3 devices
xhci: Improve port status change debugging
xhci: Disable slot on failed set_address command
nmi: Don't try to switch onto extra stack in NMI handler
scsi: Do not call printf() from scsi_is_ready()
block: Report drive->sectors using "%u" instead of "%d"
tpm: Add banner separating the TCG bios interface code from TCG menu code
tpm: Avoid macro expansion of tpm request / response structs
tpm: Simplify hardware probe and detection checks
tpm: Add wrapper function tpmhw_set_timeouts()
tpm: Move TPM hardware functions from tcgbios.c to hw/tpm_drivers.c
tpm: Rework TPM interface shutdown support
tpm: Simplify tcpa probe
tpm: Introduce tpm_get_capability() helper function
tpm: Eliminate response buffer parameter from build_and_send_cmd()
tpm: Don't return a status from external bios measurement functions
tpm: No need to check the return status of measurements
tpm: Don't call tpm_set_failure() from tpm_log_extend_event()
tpm: Don't use 16bit BIOS return codes in build_and_send_cmd()
tpm: Don't use 16bit BIOS return codes in tpm_log_event()
tpm: Don't use 16bit BIOS return codes in tpmhw_* functions
tpm: Don't use 16bit BIOS return codes in TPM menu functions
usb: Remove usbdev->slotid field
coreboot: Check for unaligned cbfs header
resume: Make KVM soft reboot loop detection more flexible
post: Always set HaveRunPost prior to setting any other global variable
kbd: Don't treat scancode and asciicode as separate values
kbd: Refactor capslock and numlock handling
ehci: Only delay UHCI/OHCI port scan until after EHCI setup completes
usb: Eliminate USB controller setup thread
pci: Add helper functions for internal driver BAR handling
ahci: Convert to new PCI BAR helper functions
ata: Convert to new PCI BAR helper functions
esp-scsi: Convert to new PCI BAR helper functions
lsi-scsi: Convert to new PCI BAR helper functions
megasas: Convert to new PCI BAR helper functions
pvscsi: Convert to new PCI BAR helper functions
sdcard: Convert to new PCI BAR helper functions
ehci: Convert to new PCI BAR helper functions
ohci: Convert to new PCI BAR helper functions
uhci: Convert to new PCI BAR helper functions
xhci: Convert to new PCI BAR helper functions
virtio: Convert to new PCI BAR helper functions
pci: Consistently set pci->have_drivers for devices with internal drivers
pci: Implement '%pP' printf handler for 'struct pci_device' pointers
pci: Move code in pci.c that is specific to pciinit.c to pciinit.c
pci: Split low-level pci code from higher-level 'struct pci_device' code
scsi: Always use MAXDESCSIZE when building drive description
block: Move drive setup to new function block_setup()
tpm: Unify tpm_fill_hash()/tpm_log_extend_event() and use in BIOS interface
docs: Note release date of 1.9.1
build: fix .text section address alignment
tpm: Write logs in TPM 2 format
mpt-scsi: Declare 'int i' outside of for loop for older compilers
block: Move send_disk_op() from block.c to disk.c
disk: Avoid stack_hop() path if already on the extra stack
optionroms: Drop support for CONFIG_OPTIONROMS_DEPLOYED
shadow: Batch PCI config writes
virtio: Use threads when scanning for virtio devices
scsi: Launch a thread when scanning for drives in the scsi drivers
docs: Note release date of 1.9.2
usb-xhci: Remove unused const variables
tcgbios: Remove unused const variable
vgabios: Remove special case of dh==0xff in handle_1013()
vgabios: Don't check for special case of page==0xff on external calls
vgabios: Simplify set_cursor_pos()
docs: Note release date of 1.9.3
vgabios: Simplify scroll logic
blockcmd: CMD_SCSI op is only used in 32bit mode
swcursor: Move swcursor code from vgafb.c to new file swcursor.c
swcursor: Concentrate swcursor logic in swcursor.c
vgafb: Move header definitions from vgabios.h to new file vgafb.h
vgainit: Move video param setup to stdvga_build_video_param()
vgautil: Add new header file with misc function and variable definitions
vgautil: Move generic definitions from stdvga.h to vgautil.h
vgautil: Move definitions from cbvga.h and clext.h to vgautil.h
version: Update header files now that version.c is not auto generated
checkstack: Handle conditional checks at start of functions
tpm: Append to TPM2 log the hashes used for PCR extension
ps2: Remove stale check for timeout warning on reset
pic: The default hardware interrupt handlers should not take a parameter
kbd: Implement 101-key keyboard keycode mapping
kbd: Implement extended keycode mappings for keypad-enter and keypad-/
kbd: Suppress keys without mappings
kbd: Merge bda->kbd_flag0 and bda->kbd_flag1
kbd: Extract out shift flag setting into new function
kbd: Move checking for special keys in __process_keys() into switch
kbd: Ignore fake shift keys
usb-hid: Generate Ctrl+Break and Alt+SysReq keys
kbd: Generate interrupt events for SysReq, PrtScr, and Break
post: Map int 0x05 to entry point
kbd: Move extended and release events out of special key detection switch
build: Be sure to also include out/*.d in Makefile
smp: consolidate CPU APIC ID detection and accounting
build: Add -fno-pie to the gcc flags when available
docs: Note v1.10.0 release
Marcel Apfelbaum (2):
fw/pci: do not automatically allocate IO region for PCIe bridges
fw/pci: add Q35 S3 support
Matt DeVillier (1):
sdcard: skip detection of PCI sdhci controllers if etc/sdcard used
Paolo Bonzini (1):
smp: restore MSRs on S3 resume
Piotr Król (1):
docs: fix various typos and inconsistency
Roger Pau Monne (1):
build: fix typo in buildversion.py
Stefan Berger (34):
tpm: Temporarily deactivate the TPM in case of failure
tpm: Refactor function building TPM commands
tpm: Refactor the parameters being passed to tpm_extend_acpi_log
tpm: Refactor hash_log_event BIOS interface function
tpm: Refactor hash_log_extend_event
tpm: fix compiler warning with older gcc versions
tpm: Drop code using the TPM for sha1
tpm: Set timeouts and durations to microsecond values
tpm: Cache all log related pointers in tpm_state
tpm: Refactor pass_through_to_tpm
tpm: Rename remaining interrupt functions
tpm: Remove check for working TPM from TPM interrupt handler
tpm: Check length parameter of the array
tpm: Add a menu for TPM configuration
tpm: Copy digest into HashLogExentEvent response
tpm: Move assert_physical_presence and dependencies
tpm: Add support for harware physical presence
tpm: Rework the assertion of physical presence
tpm: Remove usage of PP_CMD_ENABLE from all but one place
tpm: Do not set TPM in failure mode if menu command fails
tpm: Extend TPM TIS with TPM 2 support.
tpm: Factor out tpm_extend
tpm: Prepare code for TPM 2 functions
tpm: Implement tpm20_startup and tpm20_s3_resume
tpm: Implement tpm20_set_timeouts
tpm: Implement tpm20_prepboot
tpm: Implement tpm20_extend
tpm: Implement tpm20_menu
tpm: Implement TPM 2's tpm_set_failure part
tpm: Filter TPM commands in passthrough API
tpm: Retrieve the PCR Bank configuration
tpm: Restructure tpm20_extend to use buffer and take hash as parameter
tpm: Refactor tpml_digest_values_sha1 structure
tpm: Extend tpm20_extend to support extending to multiple PCR banks
Zheng Bao (1):
splash: Skip the RGB555 mode
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
# gpg: Signature made Wed 26 Oct 2016 03:19:06 BST
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
colo-proxy: fix memory leak
net: rtl8139: limit processing of ring descriptors
net: vmxnet: initialise local tx descriptor
e1000e: Don't zero out buffer address in rx descriptor
net: rocker: set limit to DMA buffer size
net: eepro100: fix memory leak in device uninit
tap-bsd: OpenBSD uses tap(4) now
net: pcnet: fix source formatting and indentation
net: pcnet: check rx/tx descriptor ring length
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The main loop creates two generic sources for the AIO
and IO handler systems.
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ensure that all I/O channels created for VNC are given names
to distinguish their respective roles.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ensure that all I/O channels created for migration are given names
to distinguish their respective roles.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ensure that all I/O channels created for character devices
are given names to distinguish their respective roles.
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Ensure that all I/O channels created for NBD are given names
to distinguish their respective roles.
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The GSource object has ability to have a name, which is useful
when debugging performance problems with the mainloop event
callbacks that take too long. By associating a name with a
QIOChannel object, we can then set the name on any GSource
associated with the channel.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
This patch adds a test to verify that the QIOChannel framework will not
unlink a filesystem unix socket unless the _FEATURE_LISTEN bit is set.
Due to a bug introduced in 74b6ce43, the framework would unlink the
entry if the _FEATURE_SHUTDOWN bit was set, regardless of the presence
of _FEATURE_LISTEN.
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The SO_ACCEPTCONN ioctl is not portable across OS, with
some BSD versions and OS-X not supporting it. There is
no viable alternative to this, so instead just set the
feature explicitly when creating a listener socket.
The current users of qio_channel_socket_new_fd() won't
ever be given a listening socket, so there's no problem
with no auto-detecting it in this scenario
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Testing QIOChannel feature support can be done with a helper called
qio_channel_has_feature(). Setting feature support, however, was
done manually with a logical OR. This patch introduces a new helper
called qio_channel_set_feature() and makes use of it where applicable.
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Parts of the code have been testing QIOChannel features directly with a
logical AND. This patch makes it all consistent by using the
qio_channel_has_feature() function to test if a feature is present.
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When QIOChannels were introduced in 666a3af9, the feature bits were
already defined shifted. However, when using them, the code was shifting
them again. The incorrect use was consistent until 74b6ce43, where
QIO_CHANNEL_FEATURE_LISTEN was defined shifted but tested unshifted.
This patch changes the definition to be unshifted and fixes the
incorrect usage introduced on 74b6ce43.
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Emulating LL/SC with cmpxchg is not correct, since it can
suffer from the ABA problem. However, portable parallel
code is written assuming only cmpxchg which means that in
practice this is a viable alternative.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Rather than using helpers for physical accesses, use a mmu index.
The primary cleanup is with store-conditional on physical addresses.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Stop specializing on TARGET_LONG_BITS == 32; unconditionally allocate
a temp and expand with tcg_gen_extu_i32_tl. Split out gen_aa32_addr,
gen_aa32_frob64, gen_aa32_ld_i32 and gen_aa32_st_i32 as separate interfaces.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
With this microbenchmark we can measure the overhead of emulating atomic
instructions with a configurable degree of contention.
The benchmark spawns $n threads, each performing $o atomic ops (additions)
in a loop. Each atomic operation is performed on a different cache line
(assuming lines are 64b long) that is randomly selected from a range [0, $r).
[ Note: each $foo corresponds to a -foo flag ]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1467054136-10430-20-git-send-email-cota@braap.org>
The diff here is uglier than necessary. All this does is to turn
FOO
into:
if (s->prefix & PREFIX_LOCK) {
BAR
} else {
FOO
}
where FOO is the original implementation of an unlocked cmpxchg.
[rth: Adjust unlocked cmpxchg to use movcond instead of branches.
Adjust helpers to use atomic helpers.]
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1467054136-10430-6-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Allow qemu to build on 32-bit hosts without 64-bit atomic ops.
Even if we only allow 32-bit hosts to multi-thread emulate 32-bit
guests, we still need some way to handle the 32-bit guest using a
64-bit atomic operation. Do so by dropping back to single-step.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Force the use of cmpxchg16b on x86_64.
Wikipedia suggests that only very old AMD64 (circa 2004) did not have
this instruction. Further, it's required by Windows 8 so no new cpus
will ever omit it.
If we truely care about these, then we could check this at startup time
and then avoid executing paths that use it.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Add all of cmpxchg, op_fetch, fetch_op, and xchg.
Handle both endian-ness, and sizes up to 8.
Handle expanding non-atomically, when emulating in serial.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
TGT_LE and TGT_BE are not size dependent and do not need to be
redefined. The others are no longer used at all.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
We already include exec/address-spaces.h and exec/memory.h in
cputlb.c; the include of qemu/timer.h appears to be a fossil.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The variable parallel_cpus controls the generation of thread aware
atomic code. We only need to set it once we clone our first thread.
At this point any existing translations need to be thrown away.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
When we cannot emulate an atomic operation within a parallel
context, this exception allows us to stop the world and try
again in a serial context.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Allows Int128 to be used more generally, rather than having to
begin with 64-bit inputs and accumulate.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
While the check against sizeof(void *) is appropriate for
normal usage within qemu, there are places in which we want
wider operaions and have checked for their existance.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Making these functional rather than object macros will
prevent later problems with complex macro expansion.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Intel HDA emulator uses stream of buffers during DMA data
transfers. Each entry has buffer length and buffer pointer
position, which are used to derive bytes to 'copy'. If this
length and buffer pointer were to be same, 'copy' could be
set to zero(0), leading to an infinite loop. Add check to
avoid it.
Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1476949224-6865-1-git-send-email-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
RTL8139 ethernet controller in C+ mode supports multiple
descriptor rings, each with maximum of 64 descriptors. While
processing transmit descriptor ring in 'rtl8139_cplus_transmit',
it does not limit the descriptor count and runs forever. Add
check to avoid it.
Reported-by: Andrew Henderson <hendersa@icculus.org>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
In Vmxnet3 device emulator while processing transmit(tx) queue,
when it reaches end of packet, it calls vmxnet3_complete_packet.
In that local 'txcq_descr' object is not initialised, which could
leak host memory bytes a guest.
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The e1000e emulation zeroes out any used rx descriptor and then writes a
completely newly constructed value there. By doing this, it doesn't only
update the write-back area of the descriptors (as it's supposed to do),
but it also clears the buffer address, which real hardware doesn't do.
The spec explicitly mentions in chapter 7.1.8 that it is valid for a
driver to reuse a descriptor and only update the status field while
doing so, i.e. reusing the old buffer address:
If software statically allocates buffers, and uses memory read to
check for completed descriptors, it simply has to zero the status
byte in the descriptor to make it ready for reuse by hardware.
This patch fixes the behaviour to leave the buffer address in
descriptors unchanged even after the descriptor has been used.
Signed-off-by: Kevin Wolf <mail@kevin-wolf.de>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Rocker network switch emulator has test registers to help debug
DMA operations. While testing host DMA access, a buffer address
is written to register 'TEST_DMA_ADDR' and its size is written to
register 'TEST_DMA_SIZE'. When performing TEST_DMA_CTRL_INVERT
test, if DMA buffer size was greater than 'INT_MAX', it leads to
an invalid buffer access. Limit the DMA buffer size to avoid it.
Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Jiri Pirko <jiri@mellanox.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The exit dispatch of eepro100 network card device doesn't free
the 's->vmstate' field which was allocated in device realize thus
leading a host memory leak. This patch avoid this.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Fix indentations and source format at few places. Add braces
around 'if' and 'while' statements.
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The AMD PC-Net II emulator has set of control and status(CSR)
registers. Of these, CSR76 and CSR78 hold receive and transmit
descriptor ring length respectively. This ring length could range
from 1 to 65535. Setting ring length to zero leads to an infinite
loop in pcnet_rdra_addr() or pcnet_transmit(). Add check to avoid it.
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Separate all ccr bits. Continue to batch updates via cc_op.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Fix gen_logic_cc() to really extend the size of the result.
Fix gen_get_ccr(): update cc_op as it is used by the helper.
Factorize flags computing and src/ccr cleanup
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
target-m68k: sr/ccr cleanup
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The CF docs certainly doesnt suggest this is true.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Read a 8, 16 or 32bit immediat constant.
An immediate constant is stored in the instruction opcode and
can be in one or two extension words.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Scaled index is not supported by 68000, 68008, and 68010.
EA = (bd + PC) + Xn.SIZE*SCALE + od
Ignore it:
M68000 FAMILY PROGRAMMER’S REFERENCE MANUAL
2.4 BRIEF EXTENSION WORD FORMAT COMPATIBILITY
"If the MC68000 were to execute an instruction that
encoded a scaling factor, the scaling factor would be
ignored and would not access the desired memory address.
The earlier microprocessors do not recognize the brief
extension word formats implemented by newer processors.
Although they can detect illegal instructions, they do not
decode invalid encodings of the brief extension word formats
as exceptions."
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <rth@twiddle.net>
The qdict_flatten() method will take a dict whose elements are
further nested dicts/lists and flatten them by concatenating
keys.
The qdict_crumple() method aims to do the reverse, taking a flat
qdict, and turning it into a set of nested dicts/lists. It will
apply nesting based on the key name, with a '.' indicating a
new level in the hierarchy. If the keys in the nested structure
are all numeric, it will create a list, otherwise it will create
a dict.
If the keys are a mixture of numeric and non-numeric, or the
numeric keys are not in strictly ascending order, an error will
be reported.
As an example, a flat dict containing
{
'foo.0.bar': 'one',
'foo.0.wizz': '1',
'foo.1.bar': 'two',
'foo.1.wizz': '2'
}
will get turned into a dict with one element 'foo' whose
value is a list. The list elements will each in turn be
dicts.
{
'foo': [
{ 'bar': 'one', 'wizz': '1' },
{ 'bar': 'two', 'wizz': '2' }
],
}
If the key is intended to contain a literal '.', then it must
be escaped as '..'. ie a flat dict
{
'foo..bar': 'wizz',
'bar.foo..bar': 'eek',
'bar.hello': 'world'
}
Will end up as
{
'foo.bar': 'wizz',
'bar': {
'foo.bar': 'eek',
'hello': 'world'
}
}
The intent of this function is that it allows a set of QemuOpts
to be turned into a nested data structure that mirrors the nesting
used when the same object is defined over QMP.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1475246744-29302-3-git-send-email-berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Parameter recursive dropped along with its tests; whitespace style
touched up]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The input_visitor_test_add() method was accepting an instance
of 'TestInputVisitorData' and passing it as the 'user_data'
parameter to test functions. The main 'TestInputVisitorData'
instance that was actually used, was meanwhile being allocated
automatically by the test framework fixture setup.
The 'user_data' parameter is going to be needed for tests
added in later patches, so getting rid of the current mistaken
usage now allows this.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1475246744-29302-7-git-send-email-berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The QmpOutputVisitor has no direct dependency on QMP. It is
valid to use it anywhere that one wants a QObject. Rename it
to better reflect its functionality as a generic QAPI
to QObject converter.
The commit before previous renamed the files, this one renames C
identifiers.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1475246744-29302-6-git-send-email-berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Split into file rename and identifier rename]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The QmpInputVisitor has no direct dependency on QMP. It is
valid to use it anywhere that one has a QObject. Rename it
to better reflect its functionality as a generic QObject
to QAPI converter.
The previous commit renamed the files, this one renames C identifiers.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <1475246744-29302-5-git-send-email-berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Straightforwardly rebased, split into file and identifier rename]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
The QMP visitors have no direct dependency on QMP. It is
valid to use them anywhere that one has a QObject. Rename them
to better reflect their functionality as a generic QObject
to QAPI converter.
This is the first of three parts: rename the files. The next two
parts will rename C identifiers. The split is necessary to make git
rename detection work.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Split into file and identifier rename, two comments touched up]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
x86 and CPU queue, 2016-10-24
x2APIC support to APIC code, cpu_exec_init() refactor on all
architectures, and other x86 changes.
# gpg: Signature made Mon 24 Oct 2016 20:51:14 BST
# gpg: using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/x86-pull-request:
exec: call cpu_exec_exit() from a CPU unrealize common function
exec: move cpu_exec_init() calls to realize functions
exec: split cpu_exec_init()
pc: q35: Bump max_cpus to 288
pc: Require IRQ remapping and EIM if there could be x2APIC CPUs
pc: Add 'etc/boot-cpus' fw_cfg file for machine with more than 255 CPUs
Increase MAX_CPUMASK_BITS from 255 to 288
pc: Clarify FW_CFG_MAX_CPUS usage comment
pc: kvm_apic: Pass APIC ID depending on xAPIC/x2APIC mode
pc: apic_common: Reset APIC ID to initial ID when switching into x2APIC mode
pc: apic_common: Restore APIC ID to initial ID on reset
pc: apic_common: Extend APIC ID property to 32bit
pc: Leave max apic_id_limit only in legacy cpu hotplug code
acpi: cphp: Force switch to modern cpu hotplug if APIC ID > 254
pc: acpi: x2APIC support for SRAT table
pc: acpi: x2APIC support for MADT table and _MAT method
Conflicts:
target-arm/cpu.c
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
As cpu_exec_exit() mirrors the cpu_exec_realizefn(),
rename it as cpu_exec_unrealizefn().
Create and register a cpu_common_unrealizefn() function for
the CPU device class and call cpu_exec_unrealizefn() from
this function.
Remove cpu_exec_exit() from cpu_common_finalize()
(which mirrors init, not realize), and as x86_cpu_unrealizefn()
and ppc_cpu_unrealizefn() overwrite the device class unrealize function,
add a call to a parent_unrealize pointer.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Modify all CPUs to call it from XXX_cpu_realizefn() function.
Remove all the cannot_destroy_with_object_finalize_yet as
unsafe references have been moved to cpu_exec_realizefn().
(tested with QOM command provided by commit 4c315c27)
for arm:
Setting of cpu->mp_affinity is moved from arm_cpu_initfn()
to arm_cpu_realizefn() as setting of cpu_index is now done
in cpu_exec_realizefn(). To avoid to overwrite an user defined
value, we set it to an invalid value by default, and update
it in realize function only if the value is still invalid.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Put in cpu_exec_initfn() what initializes the CPU,
and leave in cpu_exec_init() what adds it to the environment.
As cpu_exec_initfn() is called by all XX_cpu_initfn(), call it
directly in cpu_common_initfn().
cpu_exec_init() is now a realize function, it will be renamed
to cpu_exec_realizefn() and moved to the XX_cpu_realizefn()
function in a following patch.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
It would prevent starting guest with incorrect configs
where interrupts couldn't be delivered to CPUs with
APIC IDs > 255.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Currently firmware uses 1 byte at 0x5F offset in RTC CMOS
to get number of CPUs present at boot. However 1 byte is
not enough to handle more than 255 CPUs. So add a new
fw_cfg file that would allow QEMU to tell it.
For compat reasons add file only for machine types that
support more than 255 CPUs.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
so that it would be possible to increase maxcpus limit
for x86 target. Keep spapr/virt_arm at limit they used
to have 255.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
SDM: x2APIC State Transitions:
State Changes From xAPIC Mode to x2APIC Mode
"
Any APIC ID value written to the memory-mapped
local APIC ID register is not preserved
"
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
APIC ID should be restored to initial APIC ID
state after Reset and Power-On.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
ACPI ID is 32 bit wide on CPUs with x2APIC support.
Extend 'id' property to support it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
That's enough to make old code that depends on it
to prevent QEMU starting with more than 255 CPUs.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Switch to modern cpu hotplug at machine startup time if
a cpu present at boot has apic-id in range unsupported
by legacy cpu hotplug interface (i.e. > 254), to avoid
killing QEMU from legacy cpu hotplug code with error:
"acpi: invalid cpu id: #apic-id#"
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Block layer patches
# gpg: Signature made Mon 24 Oct 2016 17:02:47 BST
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (23 commits)
block/replication: Clarify 'top-id' parameter usage
block: More operations for meta dirty bitmap
tests: Add test code for hbitmap serialization
block: BdrvDirtyBitmap serialization interface
hbitmap: serialization
block: Assert that bdrv_release_dirty_bitmap succeeded
block: Add two dirty bitmap getters
block: Support meta dirty bitmap
tests: Add test code for meta bitmap
HBitmap: Introduce "meta" bitmap to track bit changes
block: Hide HBitmap in block dirty bitmap interface
quorum: do not allocate multiple iovecs for FIFO strategy
quorum: change child_iter to children_read
iotests: Do not rely on unavailable domains in 162
iotests: Remove raciness from 162
qemu-nbd: Add --fork option
qemu-iotests: Test I/O in a single drive from a throttling group
throttle: Correct access to wrong BlockBackendPublic structures
qapi: fix memory leak in bdrv_image_info_specific_dump
block: improve error handling in raw_open
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Block patches for master
# gpg: Signature made Mon Oct 24 17:56:44 2016 CEST
# gpg: using RSA key 0xF407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* mreitz/tags/pull-block-2016-10-24:
block/replication: Clarify 'top-id' parameter usage
block: More operations for meta dirty bitmap
tests: Add test code for hbitmap serialization
block: BdrvDirtyBitmap serialization interface
hbitmap: serialization
block: Assert that bdrv_release_dirty_bitmap succeeded
block: Add two dirty bitmap getters
block: Support meta dirty bitmap
tests: Add test code for meta bitmap
HBitmap: Introduce "meta" bitmap to track bit changes
block: Hide HBitmap in block dirty bitmap interface
quorum: do not allocate multiple iovecs for FIFO strategy
quorum: change child_iter to children_read
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Callers can create an iterator of meta bitmap with
bdrv_dirty_meta_iter_new(), then use the bdrv_dirty_iter_* operations on
it. Meta iterators are also counted by bitmap->active_iterators.
Also add a couple of functions to retrieve granularity and count.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1476395910-8697-11-git-send-email-jsnow@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Functions to serialize / deserialize(restore) HBitmap. HBitmap should be
saved to linear sequence of bits independently of endianness and bitmap
array element (unsigned long) size. Therefore Little Endian is chosen.
These functions are appropriate for dirty bitmap migration, restoring
the bitmap in several steps is available. To save performance, every
step writes only the last level of the bitmap. All other levels are
restored by hbitmap_deserialize_finish() as a last step of restoring.
So, HBitmap is inconsistent while restoring.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
[Fix left shift operand to 1UL; add "finish" parameter. - Fam]
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1476395910-8697-8-git-send-email-jsnow@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
HBitmap is an implementation detail of block dirty bitmap that should be hidden
from users. Introduce a BdrvDirtyBitmapIter to encapsulate the underlying
HBitmapIter.
A small difference in the interface is, before, an HBitmapIter is initialized
in place, now the new BdrvDirtyBitmapIter must be dynamically allocated because
the structure definition is in block/dirty-bitmap.c.
Two current users are converted too.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: John Snow <jsnow@redhat.com>
Message-id: 1476395910-8697-2-git-send-email-jsnow@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
In FIFO mode there are no parallel reads, hence there is no need to
allocate separate buffers and clone the iovecs.
The two cases of quorum_aio_cb are now even more different, and
most of quorum_aio_finalize is only needed in one of them, so split
them in separate functions.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1475685327-22767-3-git-send-email-pbonzini@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
There are some (mostly ISP-specific) name servers who will redirect
non-existing domains to special hosts. In this case, we will get a
different error message when trying to connect to such a host, which
breaks test 162.
162 needed this specific error message so it can confirm that qemu was
indeed trying to connect to the user-specified port. However, we can
also confirm this by setting up a local NBD server on exactly that port;
so we can fix the issue by doing just that.
Reported-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Using the --fork option, one can make qemu-nbd fork the worker process.
The original process will exit on error of the worker or once the worker
enters the main loop.
Suggested-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
iotest 093 contains a test that creates a throttling group with
several drives and performs I/O in all of them. This patch adds a new
test that creates a similar setup but only performs I/O in one of the
drives at the same time.
This is useful to test that the round robin algorithm is behaving
properly in these scenarios, and is specifically written using the
regression introduced in 27ccdd5259 as an example.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In 27ccdd5259 the throttling fields were
moved from BlockDriverState to BlockBackend. However in a few cases
the code started using throttling fields from the active BlockBackend
instead of the round-robin token, making the algorithm behave
incorrectly.
This can cause starvation if there's a throttling group with several
drives but only one of them has I/O.
Cc: qemu-stable@nongnu.org
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The 'obj' result of the visitor was not properly freed, like done in
other places doing a similar job.
Signed-off-by: Pino Toscano <ptoscano@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Make raw_open for POSIX more consistent in handling errors by setting
the error object also when qemu_open fails. The error object was set
generally set in case of errors, but I guess this case was overlooked.
Do the same for win32.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Sascha Silbe <silbe@linux.vnet.ibm.com>
Tested-by: Marc Hartmayer <mhartmay@linux.vnet.ibm.com> (POSIX only)
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Now that QAPI supports boxed types, we can have unions at the top level
of a command, so let's put our real options directly there for
blockdev-add instead of having a single "options" dict that contains the
real arguments.
blockdev-add is still experimental and we already made substantial
changes to the API recently, so we're free to make changes like this
one, too.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Handling this is similar to what is done to the L2 entry in the case of
compressed clusters.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If the backing file cannot be opened when doing qemu-img rebase, the
variable 'ret' was not assigned a non-zero value, and the qemu-img
process terminated with exit code zero. Fix this.
Signed-off-by: Xu Tian <xutian@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Some SMBus operations restart the transfer to convert from
write to read mode without an intervening i2c_end_transfer().
The second call cannot fail, so the return code is unchecked,
but this causes Coverity to complain. So add some asserts
and documentation about this.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Version 2.0 of the semihosting specification introduces new trap
instructions for AArch32: HLT 0xF000 for A32 and HLT 0x3C for T32.
Implement these (in the same way we implement the existing HLT
semihosting trap for A64).
The old traps via SVC and BKPT are unaffected.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1476792973-18508-1-git-send-email-peter.maydell@linaro.org
Change 2293c27fad (i2c: implement broadcast write) added broadcast
capability to the I2C bus, but it broke SMBus read transactions.
An SMBus read transaction does two i2c_start_transaction() calls
without an intervening i2c_end_transfer() call. This will
result in i2c_start_transfer() adding the same device to the
current_devs list twice, and then the ->event() for the same
device gets called twice in the second call to i2c_start_transfer(),
resulting in the smbus code getting confused.
Note that this happens even with pure I2C devices when simulating
SMBus over I2C.
This fix only scans the bus if the current set of devices is empty.
This means that the current set of devices stays fixed until
i2c_end_transfer() is called, which is really what you want.
This also deletes the empty check from the top of i2c_end_transfer().
It's unnecessary, and it prevents the broadcast variable from being
set to false at the end of the transaction if no devices were on
the bus.
Cc: KONRAD Frederic <fred.konrad@greensocs.com>
Cc: Alistair Francis <alistair.francis@xilinx.com>
Cc: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Cc: Kwon <hyun.kwon@xilinx.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: KONRAD Frederic <fred.konrad@greensocs.com>
Tested-by: KONRAD Frederic <fred.konrad@greensocs.com>
Message-id: 1470153614-6657-1-git-send-email-minyard@acm.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ARM A9MP processor has a peripheral timer with an auto-increment
register, which holds an increment step value. A user could set
this value to zero. When auto-increment control bit is enabled,
it leads to an infinite loop in 'a9_gtimer_update' while
updating comparator value. Remove this loop incrementing the
comparator value.
Reported-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 1476733226-11635-1-git-send-email-ppandit@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Current ARM MPTimer implementation uses QEMUTimer for the actual timer,
this implementation isn't complete and mostly tries to duplicate of what
generic ptimer is already doing fine.
Conversion to ptimer brings the following benefits and fixes:
- Simple timer pausing implementation
- Fixes counter value preservation after stopping the timer
- Properly handles prescaler != 0 / counter = 0 / load = 0 cases
- Code simplification and reduction
Bump VMSD to version 3, since VMState is changed and is not compatible
with the previous implementation.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Reviewed-by: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Message-id: 37f378c33bb5a28d5cd71167a6bd5bff5e59cbc3.1475421224.git.digetx@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For most of the timers counter starts to decrement after first period
expires. Due to rounding down performed by the ptimer_get_count, it returns
counter - 1 for the running timer, so that for the ptimer user it looks
like counter gets decremented immediately after running the timer. Add "no
counter round down" policy that provides correct behaviour for those timers.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Message-id: ef39622d0ebfdc32a0877e59ffdf6910dc3db688.1475421224.git.digetx@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently, periodic counter wraps around immediately once counter reaches
"0", this is wrong behaviour for some of the timers, resulting in one period
being lost. Add new ptimer policy that provides correct behaviour for such
timers, so that counter stays with "0" for a one period before wrapping
around.
Signed-off-by: Dmitry Osipenko <digetx@gmail.com>
Message-id: f22a670cf1f4be298b31640cb5f4be1df0f20ab6.1475421224.git.digetx@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Since the virt board model will never create a CPU which is
pre-ARMv7, we know that our minimum page size is 4K and can
set minimum_page_bits accordingly, for improved performance.
Note that this is a migration compatibility break, so
we introduce it only for the virt-2.8 machine and onward;
virt-2.7 continues using the old 1K pages.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Rather than defining TARGET_PAGE_BITS to always be 10,
switch to using a value picked at runtime. This allows us
to use 4K pages for modern ARM CPUs (and in particular all
64-bit CPUs) without having to drop support for the old
ARMv5 CPUs which had 1K pages.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Add a subsection to vmstate_configuration which is present
only if the guest is using a target page size which is
different from the default. This allows us to helpfully
diagnose attempts to migrate between machines which
are using different target page sizes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Support target CPUs having a page size which isn't knownn
at compile time. To use this, the CPU implementation should:
* define TARGET_PAGE_BITS_VARY
* not define TARGET_PAGE_BITS
* define TARGET_PAGE_BITS_MIN to the smallest value it
might possibly want for TARGET_PAGE_BITS
* call set_preferred_target_page_bits() in its realize
function to indicate the actual preferred target page
size for the CPU (and report any error from it)
In CONFIG_USER_ONLY, the CPU implementation should continue
to define TARGET_PAGE_BITS appropriately for the guest
OS page size.
Machines which want to take advantage of having the page
size something larger than TARGET_PAGE_BITS_MIN must
set the MachineClass minimum_page_bits field to a value
which they guarantee will be no greater than the preferred
page size for any CPU they create.
Note that changing the target page size by setting
minimum_page_bits is a migration compatibility break
for that machine.
For debugging purposes, attempts to use TARGET_PAGE_SIZE
before it has been finally confirmed will assert.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Remove L1 page mapping table properties computing
statically using macros which is dependent on
TARGET_PAGE_BITS. Drop macros V_L1_SIZE, V_L1_SHIFT,
V_L1_BITS macros and replace with variables which are
computed at early stage of VM boot.
Removing dependency can help to make TARGET_PAGE_BITS
dynamic.
Signed-off-by: Vijaya Kumar K <vijayak@cavium.com>
Message-id: 1465808915-4887-4-git-send-email-vijayak@caviumnetworks.com
[PMM:
assert(v_l1_shift % V_L2_BITS == 0)
cache v_l2_levels
initialize from page_init() rather than vl.c
minor code style fixes
put v_l1_size into a local where used as a loop limit]
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Allocate sub_section dynamically. Remove dependency
on TARGET_PAGE_SIZE to make run-time page size detection
for arm platforms.
Signed-off-by: Vijaya Kumar K <vijayak@cavium.com>
Message-id: 1465808915-4887-3-git-send-email-vijayak@caviumnetworks.com
[PMM: use flexible array member rather than separate malloc
so we don't need an extra pointer deref when using it]
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit d2f39ad "exec.c: Ensure right alignment also for file backed ram"
added an additional alignment requirement on the size of backend file
besides the previous page size. On x86, the alignment is changed from
4KB in QEMU 2.6 to 2MB in QEMU 2.7.
This change breaks certain usages in QEMU 2.7 on x86, e.g.
-object memory-backend-file,id=mem1,mem-path=/tmp/,size=$SZ
-device pc-dimm,id=dimm1,memdev=mem1
where $SZ is multiple of 4KB but not 2MB (e.g. 1023M). QEMU 2.7
reports the following error message and aborts:
qemu-system-x86_64: -device pc-dimm,memdev=mem1,id=nv1: backend memory size must be multiple of 0x200000
The same regression may also happen in other platforms as indicated by
Igor Mammedov. This change is however necessary for s390 according to
the commit message of d2f39ad, so we workaround the regression by taking
the change only on s390.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Reported-by: "Xu, Anthony" <anthony.xu@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
No need to count the users of a CharDriverState, it can rely on the fact
of whether there is a CharBackend associated or if there is enough space
in the muxer.
Simplify and fold chr_mux_new_fe() in qemu_chr_fe_init() since there is
a single user now. Also switch from fprintf to raising error instead.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022100951.19562-5-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since the hanlders are associated with a CharBackend, rather than the
CharDriverState, it is more appropriate to store in CharBackend. This
avoids the handler copy dance in qemu_chr_fe_set_handlers() then
mux_chr_update_read_handler(), by storing the CharBackend pointer
directly.
Also a mux CharDriver should go through mux->backends[focused], since
chr->be will stay NULL. Before that, it was possible to call
chr->handler by mistake with surprising results, for ex through
qemu_chr_be_can_write(), which would result in calling the last set
handler front end, not the one with focus.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022095318.17775-22-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Now that all front end use qemu_chr_fe_init(), we can move chardev
claiming in init(), and add a function deinit() to release the chardev
and cleanup handlers.
The qemu_chr_fe_claim_no_fail() for property are gone, since the
property will raise an error instead. In other cases, where there is
already an error path, an error is raised instead. Finally, other cases
are handled by &error_abort in qemu_chr_fe_init().
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022095318.17775-19-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Store the property in a CharBackend instead of CharDriverState*. This
also replace systematically chr by chr.chr to access the
CharDriverState*. The following patches will replace it with calls to
qemu_chr_fe CharBackend functions.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022095318.17775-12-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This new structure is meant to keep the details associated with a char
driver usage. On initialization, it gets a tag from the mux backend.
It can change its handlers thanks to qemu_chr_fe_set_handlers().
This structure is introduced so that all frontend will be moved to hold
and use a CharBackend. This will allow to better track char usage and
allocation, and help prevent some memory leaks or corruption.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022095318.17775-10-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Make qemu_chr_add_handlers_full() aware of mux handling. This allows
introduction of a tag associated with the fe handlers and a
qemu_chr_set_handlers() function to set the handler for a particular
tag. That will allow to get rid of qemu_chr_add_handlers*() in later
changes, in favor of qemu_chr_fe_set_handler().
To this end, chr_update_read_handler callback is enhanced with a tag
argument, and mux_chr_update_read_handler() is splitted in new
functions: mux_chr_new_handler_tag(), mux_chr_set_handlers(),
mux_set_focus().
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022095318.17775-9-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
ASAN complains about buffer overflow when running:
aarch64-softmmu/qemu-system-aarch64 -machine xilinx-zynq-a9
==476==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x602000035e38 at pc 0x000000f75253 bp 0x7ffc597e0ec0 sp 0x7ffc597e0eb0
READ of size 8 at 0x602000035e38 thread T0
#0 0xf75252 in xilinx_spips_realize hw/ssi/xilinx_spips.c:623
#1 0xb9ef6c in device_set_realized hw/core/qdev.c:918
#2 0x129ae01 in property_set_bool qom/object.c:1854
#3 0x1296e70 in object_property_set qom/object.c:1088
#4 0x129dd1b in object_property_set_qobject qom/qom-qobject.c:27
#5 0x1297168 in object_property_set_bool qom/object.c:1157
#6 0xb9aeac in qdev_init_nofail hw/core/qdev.c:358
#7 0x78a5bf in zynq_init_spi_flashes /home/elmarco/src/qemu/hw/arm/xilinx_zynq.c:125
#8 0x78af60 in zynq_init /home/elmarco/src/qemu/hw/arm/xilinx_zynq.c:238
#9 0x998eac in main /home/elmarco/src/qemu/vl.c:4534
#10 0x7f96ed692730 in __libc_start_main (/lib64/libc.so.6+0x20730)
#11 0x41d0a8 in _start (/home/elmarco/src/qemu/aarch64-softmmu/qemu-system-aarch64+0x41d0a8)
0x602000035e38 is located 0 bytes to the right of 8-byte region [0x602000035e30,0x602000035e38)
allocated by thread T0 here:
#0 0x7f970b014e60 in malloc (/lib64/libasan.so.3+0xc6e60)
#1 0x7f96f15b0e18 in g_malloc (/lib64/libglib-2.0.so.0+0x4ee18)
#2 0xb9ef6c in device_set_realized hw/core/qdev.c:918
#3 0x129ae01 in property_set_bool qom/object.c:1854
#4 0x1296e70 in object_property_set qom/object.c:1088
#5 0x129dd1b in object_property_set_qobject qom/qom-qobject.c:27
#6 0x1297168 in object_property_set_bool qom/object.c:1157
#7 0xb9aeac in qdev_init_nofail hw/core/qdev.c:358
#8 0x78a5bf in zynq_init_spi_flashes /home/elmarco/src/qemu/hw/arm/xilinx_zynq.c:125
#9 0x78af60 in zynq_init /home/elmarco/src/qemu/hw/arm/xilinx_zynq.c:238
#10 0x998eac in main /home/elmarco/src/qemu/vl.c:4534
#11 0x7f96ed692730 in __libc_start_main (/lib64/libc.so.6+0x20730)
s->spi is allocated with the size of num_busses which may be 1 (by
default). Change to use a loop up to s->num_busses also for the
call to ssi_auto_connect_slaves().
Reported-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The CharDriverState.init() callback was introduced in commit
ceecf1d158. It is only called from text_console_do_init(), but it is no
longer set since commit a61ae7f88 (init assignment has been removed by
accident).
It seems correct to use an event callback instead and print the console
text on CHR_EVENT_OPENED. That way we can remove the single user of
CharDriverState init().
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20161022095318.17775-6-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
16550A UART device uses an oscillator to generate frequencies
(baud base), which decide communication speed. This speed could
be changed by dividing it by a divider. If the divider is
greater than the baud base, speed is set to zero, leading to a
divide by zero error. Add check to avoid it.
Reported-by: Huawei PSIRT <psirt@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <1476251888-20238-1-git-send-email-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Avoid walking the FlatView of all address spaces. Most of the
address spaces will have no log_sync callback on their listeners.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This speeds up MEMORY_LISTENER_CALL noticeably. Right now,
with many PCI devices you have N regions added to M AddressSpaces
(M = # PCI devices with bus-master enabled) and each call looks
up the whole listener list, with at least M listeners in it.
Because most of the regions in N are BARs, which are also roughly
proportional to M, the whole thing is O(M^3). This changes it
to O(M^2), which is the best we can do without rewriting the
whole thing.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This comes from free from unifying tcg_reg_alloc_mov and
tcg_reg_alloc_movi's handling of TEMP_VAL_CONST. It triggers
often on moves to cc_dst, such as the following translation
of "sub $0x3c,%esp":
before: after:
subl $0x3c,%ebp subl $0x3c,%ebp
movl %ebp,0x10(%r14) movl %ebp,0x10(%r14)
movl $0x3c,%ebx movl $0x3c,0x2c(%r14)
movl %ebx,0x2c(%r14)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1473945360-13663-1-git-send-email-pbonzini@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This was found with test-i386. The issue is that instructions
such as
addr32 lea (%eax), %rax
did not perform a 32-bit extension, because the LEA translation
skipped the gen_lea_v_seg step. That step does not just add
segments, it also takes care of extending from address size to
pointer size.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
test_start/stop are used only as flags to loop on. Barriers are unnecessary,
since no dependent data is transferred among threads apart from the flags
themselves.
This commit relaxes the three accesses to test_start/stop that were
not yet relaxed.
Signed-off-by: Emilio G. Cota <cota@braap.org>
This introduces load-acquire and store-release operations in QEMU.
For now, just use them as an implementation detail of atomic_mb_read
and atomic_mb_set.
Since docs/atomics.txt documents that atomic_mb_read only synchronizes
with an atomic_mb_set of the same variable, we can use the new implementation
everywhere instead of seq-cst loads and stores.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Thanks to the acquire semantics of qemu_event_reset and qemu_event_wait,
some memory barriers can be removed.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Do not use the somewhat mysterious atomic_mb_read/atomic_mb_set,
instead make sure that the operations on QemuEvent are annotated
with the desired acquire and release semantics.
In particular, qemu_event_set wakes up the waiting thread, so it must
be a release from the POV of the waker (compare with qemu_mutex_unlock).
And it actually needs a full barrier, because that's the only thing that
provides something like a "load-release".
Use smp_mb_acquire until we have atomic_load_acquire and
atomic_store_release in atomic.h.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The output string QEMU with "--version" is very long, it does
not fit into a normal line of a terminal window anymore. By
putting the copyright information on a separate line instead,
the output looks much nicer.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1475661284-30153-1-git-send-email-thuth@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
iSER is a new transport layer supported in Libiscsi,
iSER provides a zero-copy RDMA capable interface that can
improve performance.
In order to use the new iSER transport one need to have RDMA supported HW
and to choose iser as the protocol name in Libiscsi URI.
For now iSER memory buffers are pre-allocated and pre-registered,
hence in order to work with iSER from QEMU, one need to enable
MEMLOCK attribute in the VM to be large enough for all iSER buffers and RDMA
resources.
Signed-off-by: Roy Shterman <roysh@mellanox.com>
Message-Id: <1476000896-18632-3-git-send-email-roysh@mellanox.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A new API to deploy zero-copy command submission. The new API takes I/O
vectors list and number of I/O vectors to submit as input parameters
when initiating the command. New API must be used if working with
iSER transport option.
Signed-off-by: Roy Shterman <roysh@mellanox.com>
Message-Id: <1476000896-18632-2-git-send-email-roysh@mellanox.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Implement SUSE specific unplug protocol for emulated PCI devices
in PVonHVM guests. Its a simple 'outl(1, (ioaddr + 4));'.
This protocol was implemented and used since Xen 3.0.4.
It is used in all SUSE/SLES/openSUSE releases up to SLES11SP3 and
openSUSE 12.3.
In addition old (pre-2011) VMDP versions are handled as well.
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Using 'vdev=sd[a-o]' will create an emulated LSI controller, which can
be used by the emulated BIOS to boot from disk. If the HVM domU has also
PV driver the disk may appear twice in the guest. To avoid this an
unplug of the emulated hardware is needed, similar to what is done for
IDE and NIC drivers already.
Since the SCSI controller provides only disks the entire controller can
be unplugged at once.
Impact of the change for classic and pvops based guest kernels:
vdev=sda:disk0
before: pvops: disk0=pv xvda + emulated sda
classic: disk0=pv sda + emulated sdq
after: pvops: disk0=pv xvda
classic: disk0=pv sda
vdev=hda:disk0, vdev=sda:disk1
before: pvops: disk0=pv xvda
disk1=emulated sda
classic: disk0=pv hda
disk1=pv sda + emulated sdq
after: pvops: disk0=pv xvda
disk1=not accessible by blkfront, index hda==index sda
classic: disk0=pv hda
disk1=pv sda
vdev=hda:disk0, vdev=sda:disk1, vdev=sdb:disk2
before: pvops: disk0=pv xvda
disk1=emulated sda
disk2=pv xvdb + emulated sdb
classic: disk0=pv hda
disk1=pv sda + emulated sdq
disk2=pv sdb + emulated sdr
after: pvops: disk0=pv xvda
disk1=not accessible by blkfront, index hda==index sda
disk2=pv xvdb
classic: disk0=pv hda
disk1=pv sda
disk2=pv sda
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
PAGE_SIZE is undefined on ARM64. Use XC_PAGE_SIZE instead, which is
always 4096 even when page granularity is 64K.
For this to actually work with 64K pages, more changes are required.
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Reviewed-by: Juergen Gross <jgross@suse.com>
Release-acked-by: Wei Liu <wei.liu2@citrix.com>
Linux-user changes, mostly bugfixes and adding support for some
new syscalls and some obscure syscalls as well. Includes some
missed patches from earlier rounds, and dropping unicore32 target.
v2: fix the syslog patch and test build with clang-3.8
v3: drop ustat patch
# gpg: Signature made Fri 21 Oct 2016 13:38:06 BST
# gpg: using RSA key 0xB44890DEDE3C9BC0
# gpg: Good signature from "Riku Voipio <riku.voipio@iki.fi>"
# gpg: aka "Riku Voipio <riku.voipio@linaro.org>"
# Primary key fingerprint: FF82 03C8 C391 98AE 0581 41EF B448 90DE DE3C 9BC0
* remotes/riku/tags/pull-linux-user-20160921: (21 commits)
linux-user: disable unicore32 linux-user build
linux-user: added support for pwritev() system call.
linux-user: added support for preadv() system call.
linux-user: Fix fadvise64() syscall support for Mips32
linux-user: Redirect termbits.h for Mips64 to termbits.h for Mips32
linux-user: Update ioctls definitions for Mips32
linux-user: Update mips_syscall_args[] array in main.c
linux-user: Add support for syncfs() syscall
linux-user: Add support for clock_adjtime() syscall
linux-user: Fix definition of target_sigevent for 32-bit guests
linux-user: use libc wrapper instead of direct mremap syscall
linux-user: Don't use alloca() for epoll_wait's epoll event array
linux-user: add RTA_PRIORITY in netlink
linux-user: add kcmp() syscall
linux-user: sparc64: Use correct target SHMLBA in shmat()
linux-user: Remove a duplicate item from strace.list
linux-user: Fix syslog() syscall support
linux-user: Fix socketcall() syscall support
linux-user: Fix msgrcv() and msgsnd() syscalls support
linux-user: Fix mq_open() syscall support
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In order to cleanup linux-user, we need support for most relatively
modern syscalls. unicore32 lacks support for syscalls like
epoll_pwait, preventing cleaning up the CONFIG_EPOLL mess.
This patch can be reverted when unicore32 starts either supporting
the syscalls as defined in mainline kernel, or the oldabi interface
gains support for syscalls supported since at kernel 2.6.19 / glibc 2.6
Cc: MPRC <zhangheng@mprc.pku.edu.cn>
Cc: Xuetao Guan <gxt@mprc.pku.edu.cn>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
This system call performs the same task as the writev() system call,
with the exception of having the fourth argument, offset, which
specifes the file offset at which the input operation is to be performed.
Because of this, the pwritev() implementation is based on the writev()
implementation in linux-user mode.
But, since pwritev() is implemented in the kernel as a 5-argument syscall,
5 arguments are needed to be handled as input and passed to the host
syscall.
The pos_l and pos_h argument of the safe_pwritev() are of type unsigned
long, which can be of different sizes on different platforms. The input
arguments are converted to the appropriate host size when passed to
safe_pwritev().
Signed-off-by: Dejan Jovicevic <dejan.jovicevic@rt-rk.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
This system call performs the same task as the readv() system call,
with the exception of having the fourth argument, offset, which
specifes the file offset at which the input operation is to be performed.
Because of this, the preadv() implementation is based on the readv()
implementation in linux-user mode.
But, since preadv() is implemented in the kernel as a 5-argument syscall,
5 arguments are needed to be handled as input and passed to the host
syscall.
The pos_l and pos_h argument of the safe_preadv() are of type unsigned
long, which can be of different sizes on different platforms. The input
arguments are converted to the appropriate host size when passed to
safe_preadv().
Signed-off-by: Dejan Jovicevic <dejan.jovicevic@rt-rk.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
By looking at the file arch/mips/kernel/scall32-o32.S in Linux
kernel, it can be deduced that, for Mips32 platform, syscall
corresponding to number _NR_fadvise64 as defined in kernel file
arch/mips/include/uapi/asm/unistd.h translates to kernel function
sys_fadvise64_64, and that argument layout for this system call is
as follows:
0 32 0 32
+----------------+----------------+
(arg1) | fd | __pad | (arg2)
+----------------+----------------+
(arg3) | buffer | (arg4)
+----------------+----------------+
(arg5) | len | (arg6)
+----------------+----------------+
(arg7) | advise | not used | (arg8)
+----------------+----------------+
The same argument layout can be deduced from glibc code, and
relevant commit messages in linux kernel and glibc.
The fix is to change TARGET_NR_fadvise64 to TARGET_NR_fadvise64_64
in Mips32 syscall numbers table. Array mips_syscall_args[] in
linux-user/main.c also already have "fadvise64_64" (and not
"fadvise64") in corresponding place for the syscall number in
question, so no change for linux-user/main.c.
This patch also fixes the failure LTP test posix_fadvise03, if
executed on Qemu-emulated Mips32 platform (user mode).
Signed-off-by: Aleksandar Rikalo <aleksandar.rikalo@imgtec.com>
Signed-off-by: Miroslav Tisma <miroslav.tisma@imgtec.com>
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
linux-user/mips64/termbits.h and linux-user/mips/termbits.h
originate from the same files in Linux kernel. There is no plan
to split original headers in Linux kernel into Mips32 and Mips64
versions any time soon. Therefore, it is better not to have
separate Mips32 and Mips64 variants in Qemu.
This patch makes these two files effectively the same, allowing the
mainenance by changing only a single file. (This is already done in
the same fashion for some other headers in same directories.)
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Array mips_syscall_args[] determines number of arguments for each
syscall on Mips32. It wasn't updated with newer syscalls. Also,
preadv and pwritev have 5 arguments, not 6.
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
This patch implements Qemu user mode syncfs() syscall support. Syscall
syncfs() syncs the filesystem containing file determined by the open
file descriptor passed as the argument to syncfs().
The implementation consists of a straightforward invocation of host's
syncfs(). Configure and strace support is included as well.
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
This patch implements Qemu user mode clock_adjtime() syscall support.
The implementation is based on invocation of host's clock_adjtime().
Signed-off-by: Aleksandar Rikalo <aleksandar.rikalo@imgtec.com>
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The sigevent structure includes a union with some fields which
are pointers. For the QEMU target_sigevent structure we must
represent these as abi_ulongs, not host function pointers.
This error was causing the compiler to believe it should 8-align
the _sigev_un union on a 64-bit host, which meant that the
code in target_to_host_sigevent() was looking at the wrong
offset to find the _tid field, and timer_create() would
spuriously fail with EINVAL.
This fixes the final loose end noted in LP:1042388.
While we're editing the structure, switch the 'int32_t' fields
to 'abi_int'; this will only matter for guests with non-standard
integer alignment like m68k.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
This commit essentially reverts commit
3af72a4d98, which has replaced
five-argument calls to mremap() by direct mremap syscalls for
compatibility with glibc older than version 2.4.
The direct syscall was buggy for 64bit targets on 32bit hosts
because of the default integer type promotions. Since glibc-2.4
is now a decade old, we can remove this workaround.
Signed-off-by: Felix Janda <felix.janda@posteo.de>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The epoll event array which epoll_wait() allocates has a size
determined by the guest which could potentially be quite large.
Use g_try_new() rather than alloca() so that we can fail more
cleanly if the guest hands us an oversize value. (ENOMEM is
not a documented return value for epoll_wait() but in practice
some kernel configurations can return it -- see for instance
sys_oabi_epoll_wait() on ARM.)
This rearrangement includes fixing a bug where we were
incorrectly passing a negative length to unlock_user() in
the error-exit codepath.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Used by fedora21 on ppc64 in the network initialization
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
There is a duplicate item in strace.list. It is benign, but it
shouldn't be there, since it may lead to confusion and even bugs
in the future. It is the only duplicate in strace.list. This
patch removes it.
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
There are currently several problems related to syslog() support.
For example, if the second argument "bufp" of target syslog() syscall
is NULL, the current implementation always returns error code EFAULT.
However, NULL is a perfectly valid value for the second argument for
many use cases of this syscall. This is, for example, visible from
this excerpt of man page for syslog(2):
> EINVAL Bad arguments (e.g., bad type; or for type 2, 3, or 4, buf is
> NULL, or len is less than zero; or for type 8, the level is
> outside the range 1 to 8).
Moreover, the argument "bufp" is ignored for all cases of values of the
first argument, except 2, 3 and 4. This means that for such cases
(the first argument is not 2, 3 or 4), there is no need to pass "buf"
between host and target, and it can be set to NULL while calling host's
syslog(), without loss of emulation accuracy.
Note also that if "bufp" is NULL and the first argument is 2, 3 or 4, the
correct returned error code is EINVAL, not EFAULT.
All these details are reflected in this patch.
"#ifdef TARGET_NR_syslog" is also proprerly inserted when needed.
Support for Qemu's "-strace" switch for syslog() syscall is included too.
LTP tests syslog11 and syslog12 pass with this patch (while fail without
it), on any platform.
Changes to original patch by Riku Voipio:
fixed error paths in TARGET_SYSLOG_ACTION_READ_ALL to match
http://lxr.free-electrons.com/source/kernel/printk/printk.c?v=4.7#L1335
Should fix also the build error in:
https://lists.gnu.org/archive/html/qemu-devel/2016-10/msg03721.html
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Since not all Linux host platforms support socketcall() (most notably
Intel), do_socketcall() function in Qemu's syscalls.c is implemented to
mirror the corespondant implementation of socketcall() in Linux kernel,
and to utilise individual socket operations that are supported on all
Linux platforms. (see kernel source file net/socket.c, definition of
socketcall).
However, error codes produced by Qemu implementation are wrong for the
cases of invalid values of the first argument. Also, naming of constants
is not consistent with kernel one, and not consistant with Qemu convention
of prefixing such constants with "TARGET_". This patch in that light
brings do_socketcall() closer to its kernel counterpart, and in that way
fixes the errors and yields more consisrtent Qemu code.
There were also three missing cases (among 20) for strace support for
socketcall(). The array that contains pointers for appropriate printing
functions is updated with 3 elements, however pointers to functions are
left NULL, and its implementation is left for future.
Also, this patch fixes failure of LTP test socketcall02, if executed on some
Qemu emulated sywstems (uer mode).
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
If syscalls msgrcv() and msgsnd() fail, they return E2BIG, EACCES,
EAGAIN, EFAULT, EIDRM, EINTR, EINVAL, ENOMEM, or ENOMSG.
By examining negative scenarios of these syscalls for Mips, it was
established that ENOMSG does not have the same value accross all
platforms, but it is nevertheless not included for conversion in
the correspondant conversion table defined in linux-user/syscall.c.
This is certainly a bug, since it leads to the incorrect emulation
of msgrcv() and msgsnd() for scenarios involving ENOMSG.
This patch fixes this by extending the conversion table to include
ENOMSG.
Also, LTP test msgrcv04 will be fixed for some platforms.
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Conversion of file creation flags (O_CREAT, ...) from target to host
was missing.
Also, this patch implements better error handling.
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
This patch implements Qemu user mode adjtimex() syscall support.
Syscall adjtimex() reads and optionally sets parameters for a clock
adjustment algorithm used in network synchonization or similar scenarios.
Its declaration is:
int adjtimex(struct timex *buf);
The correspondent source code in the Linux kernel is at kernel/time.c,
line 206.
The Qemu implementation is based on invocation of host's adjtimex(), and
its key part is in the "TARGET_NR_adjtimex" case segment of the the main
switch statement of the function do_syscall(), in linux-user/syscalls.c. All
necessary conversions of the data structures from target to host and from
host to target are covered. Two new functions, target_to_host_timex() and
host_to_target_timex(), are provided for the purpose of such conversions.
For that purpose, the support for related structure "timex" had tp be added
to the file linux-user/syscall_defs.h, based on its definition in Linux
kernel. Also, the relevant support for "-strace" Qemu option is included
in files linux-user/strace.c and linux-user/strace.list.
This patch also fixes failures of LTP tests adjtimex01 and adjtimex02, if
executed in Qemu user mode.
Signed-off-by: Aleksandar Rikalo <aleksandar.rikalo@imgtec.com>
Signed-off-by: Aleksandar Markovic <aleksandar.markovic@imgtec.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The gcrypt threads implementation must be set before calling
any other gcrypt APIs, especially gcry_check_version(),
since that triggers initialization of the random pool. After
that is initialized, changes to the threads impl won't be
honoured by the random pool code. This means that gcrypt
will think thread locking is needed and so try to acquire
the random pool mutex, but this is NULL as no threads impl
was set originally. This results in a crash in the random
pool code.
For the same reasons, we must set the gcrypt threads impl
before calling gnutls_init, since that will also trigger
gcry_check_version
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The test-io-channel-tls test was missing a call to qcrypto_init
and test-crypto-hash was initializing it multiple times,
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
CC tests/test-crypto-tlscredsx509.o
CC tests/crypto-tls-x509-helpers.o
CC tests/pkix_asn1_tab.o
tests/pkix_asn1_tab.c:7:22: warning: libtasn1.h: No such file or directory
tests/pkix_asn1_tab.c:9: error: expected ‘=’, ‘,’, ‘;’, ‘asm’ or ‘__attribute__’ before ‘pkix_asn1_tab’
make: *** [tests/pkix_asn1_tab.o] Error 1
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Introduce CTR mode support for the cipher APIs.
CTR mode uses a counter rather than a traditional IV.
The counter has additional properties, including a nonce
and initial counter block. We reuse the ctx->iv as
the counter for conveniences.
Both libgcrypt and nettle are support CTR mode, the
cipher-builtin doesn't support yet.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
It can't guarantee all cipher modes are supported
if one cipher algorithm is supported by a backend.
Let's extend qcrypto_cipher_supports() to take both
the algorithm and mode as parameters.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
VFIO updates 2016-10-17
- Convert to realize & improve error reporting (Eric Auger)
- RTL quirk bug fix (Thorsten Kohfeldt)
- Skip duplicate pre/post reset (Cao jin)
# gpg: Signature made Mon 17 Oct 2016 20:42:44 BST
# gpg: using RSA key 0x239B9B6E3BB08B22
# gpg: Good signature from "Alex Williamson <alex.williamson@redhat.com>"
# gpg: aka "Alex Williamson <alex@shazbot.org>"
# gpg: aka "Alex Williamson <alwillia@redhat.com>"
# gpg: aka "Alex Williamson <alex.l.williamson@gmail.com>"
# Primary key fingerprint: 42F6 C04E 540B D1A9 9E7B 8A90 239B 9B6E 3BB0 8B22
* remotes/awilliam/tags/vfio-updates-20161017.0:
vfio: fix duplicate function call
vfio/pci: Fix vfio_rtl8168_quirk_data_read address offset
vfio/pci: Handle host oversight
vfio/pci: Remove vfio_populate_device returned value
vfio/pci: Remove vfio_msix_early_setup returned value
vfio/pci: Conversion to realize
vfio/platform: Pass an error object to vfio_base_device_init
vfio/platform: fix a wrong returned value in vfio_populate_device
vfio/platform: Pass an error object to vfio_populate_device
vfio: Pass an error object to vfio_get_device
vfio: Pass an error object to vfio_get_group
vfio: Pass an Error object to vfio_connect_container
vfio/pci: Pass an error object to vfio_pci_igd_opregion_init
vfio/pci: Pass an error object to vfio_add_capabilities
vfio/pci: Pass an error object to vfio_intx_enable
vfio/pci: Pass an error object to vfio_msix_early_setup
vfio/pci: Pass an error object to vfio_populate_device
vfio/pci: Pass an error object to vfio_populate_vga
vfio/pci: Use local error object in vfio_initfn
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
x86 queue, 2016-10-17
# gpg: Signature made Mon 17 Oct 2016 18:51:07 BST
# gpg: using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/x86-pull-request: (21 commits)
target-i386: Don't use cpu->migratable when filtering features
target-i386: Return runnability information on query-cpu-definitions
target-i386: x86_cpu_load_features() function
target-i386: Unset cannot_destroy_with_object_finalize_yet
target-i386/kvm: cache the return value of kvm_enable_x2apic()
intel_iommu: reject broken EIM
intel_iommu: add OnOffAuto intr_eim as "eim" property
intel_iommu: redo configuraton check in realize
intel_iommu: pass whole remapped addresses to apic
apic: add send_msi() to APICCommonClass
apic: add global apic_get_class()
target-i386: Move warning code outside x86_cpu_filter_features()
qmp: Add runnability information to query-cpu-definitions
target-i386: xsave: Add FP and SSE bits to x86_ext_save_areas
target-i386: Register properties for feature aliases manually
target-i386: Remove underscores from feat_names arrays
target-i386: Make plus_features/minus_features QOM-based
target-i386: Register aliases for feature names with underscores
target-i386: Disable VME by default with TCG
target-i386: List CPU models using subclass list
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
target-arm:
* target-arm: kvm: use AddressSpace-specific listener
* aspeed: add SMC controllers
* hw/arm/boot: allow using a command line specified dtb without a kernel
* hw/dma/pl080: Fix bad bit mask
* hw/intc/arm_gic_kvm: Fix build on aarch64 with some compilers
* hw/arm/virt: fix ACPI tables for ITS
* tests: add a m25p80 test
* tests: cleanup ptimer-test
* pxa2xx: Auto-assign name for i2c bus in i2c_init_bus
* target-arm: handle tagged addresses in A64 code
* target-arm: Fix masking of PC lower bits when doing exception returns
* target-arm: Implement dummy MDCCINT_EL1
* target-arm: Add trace events for the generic timers
* hw/intc/arm_gicv3: Fix ICC register tracepoints
* hw/char/pl011: Add trace events
# gpg: Signature made Mon 17 Oct 2016 19:39:42 BST
# gpg: using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20161017: (25 commits)
hw/char/pl011: Add trace events
hw/intc/arm_gicv3: Fix ICC register tracepoints
target-arm: Add trace events for the generic timers
target-arm: Implement dummy MDCCINT_EL1
Fix masking of PC lower bits when doing exception returns
target-arm: Comments added to identify cases in a switch
target-arm: Code changes to implement overwrite of tag field on PC load
target-arm: Infrastucture changes to enable handling of tagged address loading into PC
pxa2xx: Auto-assign name for i2c bus in i2c_init_bus.
tests: cleanup ptimer-test
tests: add a m25p80 test
hw/arm/virt: no ITS on older machine types
hw/arm/virt-acpi-build: fix MADT generation
hw/intc/arm_gic_kvm: Fix build on aarch64
hw/dma/pl080: Fix bad bit mask (PL080_CONF_M1 | PL080_CONF_M1)
hw/arm/boot: allow using a command line specified dtb without a kernel
aspeed: add support for the SMC segment registers
aspeed: create mapping regions for the maximum number of slaves
aspeed: add support for the AST2500 SoC SMC controllers
aspeed: extend the number of host SPI controllers
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Fix some problems with the tracepoints for ICC register reads
and writes:
* tracepoints for ICC_BPR<n>, ICC_AP<n>R<x>, ICC_IGRPEN<n>,
ICC_EIOR<n> were not printing the <n> that indicated whether
the access was to the group 0 or 1 register
* the ICC_IGREPEN1_EL3 read function was not actually calling
the associated tracepoint
* the ICC_BPR<n> write function was incorrectly calling the
tracepoint for ICC_PMR writes
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Message-id: 1476294876-12340-4-git-send-email-peter.maydell@linaro.org
In commit 9b6a3ea7a6 store_reg() was changed to mask
both bits 0 and 1 of the new PC value when in ARM mode.
Unfortunately this broke the exception return code paths
when doing a return from ARM mode to Thumb mode: in some
of these we write a new CPSR including new Thumb mode
bit via gen_helper_cpsr_write_eret(), and then use store_reg()
to write the new PC. In this case if the new CPSR specified
Thumb mode then masking bit 1 of the PC is incorrect
(these code paths correspond to the v8 ARM ARM pseudocode
function AArch32.ExceptionReturn(), which always aligns the
new PC appropriately for the new instruction set state).
Instead of using store_reg() in exception-return code paths,
call a new store_pc_exc_ret() which stores the raw new PC
value to env->regs[15], and then mask it appropriately in
the subsequent helper_cpsr_write_eret() where the new
env->thumb state is available.
This fixes a bug introduced by 9b6a3ea7a6 which caused
crashes/hangs or otherwise bad behaviour for Linux when
userspace was using Thumb.
Reported-by: Jerome Forissier <jerome.forissier@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1476113163-24578-1-git-send-email-peter.maydell@linaro.org
For BR, BLR and RET instructions, if tagged addresses are enabled, the
tag field in the address must be cleared out prior to loading the
address into the PC. Depending on the current EL, it will be set to
either all 0's or all 1's.
Signed-off-by: Thomas Hanson <thomas.hanson@linaro.org>
Message-id: 1476301853-15774-3-git-send-email-thomas.hanson@linaro.org
[PMM: remove unnecessary gen_a64_set_pc_reg() wrapper,
rename gen_a64_set_pc_var() to gen_a64_set_pc(), fix stray
misindentation]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When capturing the current CPU state for the TB, extract the TBI0 and TBI1
values from the correct TCR for the current EL and then add them to the TB
flags field.
Then, at the start of code generation for the block, copy the TBI fields
into the DisasContext structure.
Signed-off-by: Thomas Hanson <thomas.hanson@linaro.org>
Message-id: 1476301853-15774-2-git-send-email-thomas.hanson@linaro.org
[PMM: drop useless 'extern' keyword on function prototypes;
provide CONFIG_USER_ONLY trivial versions of arm_regime_tbi[01]()]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If a name is provided, the same name is assigned to both the I2C
controllers. Leaving it NULL, causes names to be automatically
assigned with an ID suffix, giving unique names to each
controller. This helps us to uniquely identify each controller in the
device tree, for example when adding an I2C device.
Signed-off-by: Vijay Kumar B. <vijaykumar@zilogic.com>
Reviewed-by: Deepak S. <deepak@zilogic.com>
Message-id: 1476351885-8905-1-git-send-email-vijaykumar@zilogic.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
1) ptimer-test is not a qtest---it runs the ptimer.c code directly in the
ptimer-test process
2) ptimer-test has its own stubs file, so there is no need to add more
stubs to stubs/vmstate.c
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Dmitry Osipenko <digetx@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This test uses the palmetto platform and the Aspeed SPI controller to
test the m25p80 flash module device model. The flash model is defined
by the platform (n25q256a) and it would be nice to find way to control
it, using a property probably.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1475787271-28794-1-git-send-email-clg@kaod.org
Brainstormed-with: Greg Kurz <groug@kaod.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Remove unused debugging code to fix native building on aarch64. Without
this change, the following -Werr output inhibits make from completing.
qemu/hw/intc/arm_gic_kvm.c:38:18: error: debug_gic_kvm defined but not used [-Werror=unused-const-variable=]
static const int debug_gic_kvm = 0;
^~~~~~~~~~~~~
cc1: all warnings being treated as errors
qemu/rules.mak:60: recipe for target 'hw/intc/arm_gic_kvm.o' failed
make[1]: *** [hw/intc/arm_gic_kvm.o] Error 1
Makefile:205: recipe for target 'subdir-aarch64-softmmu' failed
Signed-off-by: Christopher Covington <cov@codeaurora.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20161011163202.19720-1-cov@codeaurora.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When kernel and device tree are specified in the QEMU commandline, then
this device tree may be modified e.g. to add virtio_mmio devices.
With a bootloader e.g. on a flash device these extra devices are not
available.
With this change, the device tree can be specified at the QEMU commandline.
The modified device tree made available to the bootloader with the same
mechanism already supported by device trees fully generated by QEMU.
Signed-off-by: Michael Olbrich <m.olbrich@pengutronix.de>
Message-id: 1473520054-402-1-git-send-email-m.olbrich@pengutronix.de
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The SMC controller on the Aspeed SoC has a set of registers to
configure the mapping of each flash module in the SoC address
space. Writing to these registers triggers a remap of the memory
region and the spec requires a certain number of checks before doing
so.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1474977462-28032-7-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The SMC controller on the Aspeed SoC has a set of registers to
configure the mapping of each flash module in the SoC address
space. These mapping windows are configurable even though no SPI slave
is attached to the controller.
Also rewrite a bit the comments in the code on this topic.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1474977462-28032-6-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The SMC controllers on the Aspeed AST2500 SoC are very similar to the
ones found on the AST2400. The differences are on the number of
supported flash modules and their default mappings in the SoC address
space.
The Aspeed AST2500 has one SPI controller for the BMC firmware and two
for the host firmware. All controllers have now the same set of
registers compatible with the AST2400 FMC controller and the legacy
'SMC' controller is fully gone.
We keep the FMC object to act as the BMC SPI controller and add a new
SPI controller for the host. We also have to introduce new type names
to handle the differences in the flash modules memory mappping.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1474977462-28032-5-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This will ease the definition of the new controllers for the AST2500
SoC and also ease the support of the segment registers, which provide
a way to reconfigure the mapping window of each slave.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1474977462-28032-3-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Aspeed SoC has three different types of SMC (Static Memory
Controller) controllers: the SMC (legacy), the FMC (the new one) and
the SPI for the host PNOR. The FMC and the SPI models are now
converging on the AST2500 SoC and the SMC, which was still available
on the AST2400 SoC, was removed.
The Aspeed SoC does not provide support for the legacy SMC
controller. So, let's rename the 'smc' object to 'fmc' to clarify its
nature.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 1474977462-28032-2-git-send-email-clg@kaod.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When explicitly enabling unmigratable flags using "-cpu host"
(e.g. "-cpu host,+invtsc"), the requested feature won't be
enabled because cpu->migratable is true by default.
This is inconsistent with all other CPU models, which don't have
the "migratable" option, making "+invtsc" work without the need
for extra options.
This happens because x86_cpu_filter_features() uses
cpu->migratable as an argument for
x86_cpu_get_supported_feature_word(). This is not useful
because:
2) on "-cpu host" it only makes QEMU disable features that were
explicitly enabled in the command-line;
1) on all the other CPU models, cpu->migratable is already false.
The fix is to just use 'false' as an argument to
x86_cpu_get_supported_feature_word() in
x86_cpu_filter_features().
Note that:
* This won't change anything for people using using
"-cpu host" or "-cpu host,migratable=<on|off>" (with no extra
features) because the x86_cpu_get_supported_feature_word() call
on the cpu->host_features check uses cpu->migratable as
argument.
* This won't change anything for any CPU model except "host"
because they all have cpu->migratable == false (and only "host"
has the "migratable" property that allows it to be changed).
* This will only change things for people using "-cpu host,+<feature>",
where <feature> is a non-migratable feature. The only existing
named non-migratable feature is "invtsc".
In other words, this change will only affect people using
"-cpu host,+invtsc" (that will now get what they asked for: the
invtsc flag will be enabled). All other use cases are unaffected.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
To do the conversion, the file_backend_class_init() was moved
after the getter/setter functions. The old
file_backend_instance_init() function was removed because it is
not needed anymore.
The NULL errp arguments on the property registration calls were
changed to &error_abort.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The NULL errp arguments on the property registration calls were
changed to &error_abort.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
When doing the conversion, the NULL errp arguments on the
property registration calls were changed to &error_abort.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
machine_set_property() replaces '_' by '-' in the property name.
Except it fails to replace an initial '_'. Screwed up in commit
b0ddb8b. Reproducer: "-M pc,__foo_bar=true" produces "Property
'._-foo-bar' not found".
Error messages using a mangled name rather than the name the user
actually wrote is user-hostile, but that's a different topic.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
When probing for CPU model information, we need to reuse the code
that initializes CPUID fields, but not the remaining side-effects
of x86_cpu_realizefn(). Move that code to a separate function
that can be reused later.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
TYPE_X86_CPU now call cpu_exec_init() on realize, so we don't
need to set cannot_destroy_with_object_finalize_yet anymore.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Assume that KVM would have returned the same on subsequent runs.
Abstract the memoizaiton pattern into macros and call it memorize as
adding the r makes it less obscure.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Cluster x2APIC cannot work without KVM's x2apic API when the maximal
APIC ID is greater than 8 and only KVM's LAPIC can support x2APIC, so we
forbid other APICs and also the old KVM case with less than 9, to
simplify the code.
There is no point in enabling EIM in forbidden APICs, so we keep it
enabled only for the KVM APIC; unconditionally, because making the
option depend on KVM version would be a maintanance burden.
Old QEMUs would enable eim whenever intremap was on, which would trick
guests into thinking that they can enable cluster x2APIC even if any
interrupt destination would get clamped to 8 bits.
Depending on your configuration, QEMU could notice that the destination
LAPIC is not present and report it with a very non-obvious:
KVM: injection failed, MSI lost (Operation not permitted)
Or the guest could say something about unexpected interrupts, because
clamping leads to aliasing so interrupts were being delivered to
incorrect VCPUs.
KVM_X2APIC_API is the feature that allows us to enable EIM for KVM.
QEMU 2.7 allowed EIM whenever interrupt remapping was enabled. In order
to keep backward compatibility, we again allow guests to misbehave in
non-obvious ways, and make it the default for old machine types.
A user can enable the buggy mode it with "x-buggy-eim=on".
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The default (auto) emulates the current behavior.
A user can now control EIM like
-device intel-iommu,intremap=on,eim=off
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
* there no point in configuring the device if realization is going to
fail, so move the check to the beginning,
* create a separate function for the check,
* use error_setg() instead error_report().
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The MMIO interface to APIC only allowed 8 bit addresses, which is not
enough for 32 bit addresses from EIM remapping.
Intel stored upper 24 bits in the high MSI address, so use the same
technique. The technique is also used in KVM MSI interface.
Other APICs are unlikely to handle those upper bits.
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The MMIO based interface to APIC doesn't work well with MSIs that have
upper address bits set (remapped x2APIC MSIs). A specialized interface
is a quick and dirty way to avoid the shortcoming.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Every configuration has only up to one APIC class and we'll be extending
the class with a function that can be called without an instanced
object, so a direct access to the class is convenient.
This patch will break compilation if some code uses apic_get_class()
with CONFIG_USER_ONLY.
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Radim Krčmář <rkrcmar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
x86_cpu_filter_features() will be reused by code that shouldn't
print any warning. Move the warning code to a new
x86_cpu_report_filtered_features() function, and call it from
x86_cpu_realizefn().
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Instead of treating the FP and SSE bits as special cases, add
them to the x86_ext_save_areas array. This will simplify the code
that calculates the supported xsave components and the size of
the xsave area.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Instead of keeping the aliases inside the feature name arrays and
require parsing the strings, just register alias properties
manually. This simplifies the code for property registration and
lookup.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Instead of translating the feature name entries when adding
property names, store the actual property names in the feature
name array.
For reference, here is the full list of functions that use
FeatureWordInfo::feat_names:
* x86_cpu_get_migratable_flags(): not affected, as it just
check for non-NULL values.
* report_unavailable_features(): informative only. It will
start printing feature names with hyphens.
* x86_cpu_list(): informative only. It will start printing
feature names with hyphens
* x86_cpu_register_feature_bit_props(): not affected, as it
was already calling feat2prop(). Now we can remove the
feat2prop() calls safely.
So, the only user-visible effect of this patch are the new names
being used in help and error messages for users.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Instead of using custom feature name lookup code for
plus_features/minus_features, save the property names used in
"[+-]feature" and use object_property_set_bool() to set them.
We don't need a feat2prop() call because we now have alias
properties for the old names containing underscores.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Registering the actual names containing underscores as aliases
will allow management software to be aware that the old
compatibility names are suported, and will make feat2prop() calls
unnecessary when using feature names.
Also, this will help us avoid making the code support underscores
on feature names that never had them in the first place. e.g.
"+tsc_deadline" was never supported and doesn't need to be
translated to "+tsc-deadline".
In other word: this will require less magic translation of
strings, and simple 1:1 match between the config options and
actual QOM properties.
Note that the underscores are still present in the
FeatureWordInfo::feat_names arrays, because
add_flagname_to_bitmaps() needs them to be kept. The next patches
will remove add_flagname_to_bitmaps() and will allow us to
finally remove the aliases from feat_names.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
VME is already disabled automatically when using TCG. So, instead
of pretending it is there when reporting CPU model data on
query-cpu-* QMP commands (making every CPU model to be reported
as not runnable), we can disable it by default on all CPU models
when using TCG.
Do that by adding a tcg_default_props array that will work like
kvm_default_props.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Instead of using the builtin_x86_defs array, use the QOM subclass
list to list CPU models on "-cpu ?" and "query-cpu-definitions".
Signed-off-by: Andreas Färber <afaerber@suse.de>
[ehabkost: copied code from a patch by Andreas:
"target-i386: QOM'ify CPU", from March 2012]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
When vfio device is reset(encounter FLR, or bus reset), if need to do
bus reset(vfio_pci_hot_reset_one is called), vfio_pci_pre_reset &
vfio_pci_post_reset will be called twice.
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Introductory comment for rtl8168 VFIO MSI-X quirk states:
At BAR2 offset 0x70 there is a dword data register,
offset 0x74 is a dword address register.
vfio: vfio_bar_read(0000:05:00.0:BAR2+0x70, 4) = 0xfee00398 // read data
Thus, correct offset for data read is 0x70,
but function vfio_rtl8168_quirk_data_read() wrongfully uses offset 0x74.
Signed-off-by: Thorsten Kohfeldt <thorsten.kohfeldt@gmx.de>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
In case the end-user calls qemu with -vfio-pci option without passing
either sysfsdev or host property value, the device is interpreted as
0000:00:00.0. Let's create a specific error message to guide the end-user.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The returned value (either -errno or -1) is not used anymore by the caller,
vfio_realize, since the error now is stored in the error object. So let's
remove it.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The returned value is not used anymore by the caller, vfio_realize,
since the error now is stored in the error object. So let's remove it.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This patch converts VFIO PCI to realize function.
Also original initfn errors now are propagated using QEMU
error objects. All errors are formatted with the same pattern:
"vfio: %s: the error description"
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This patch propagates errors encountered during vfio_base_device_init
up to the realize function.
In case the host value is not set or badly formed we now report an
error.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
In case the vfio_init_intp fails we currently do not return an
error value. This patch fixes the bug. The returned value is not
explicit but in practice the error object is the one used to
report the error to the end-user and the actual returned error
value is not used.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Propagate the vfio_populate_device errors up to vfio_base_device_init.
The error object also is passed to vfio_init_intp. At the moment we
only report the error. Subsequent patches will propagate the error
up to the realize function.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Pass an error object to prepare for migration to VFIO-PCI realize.
In vfio platform vfio_base_device_init we currently just report the
error. Subsequent patches will propagate the error up to the realize
function.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Pass an error object to prepare for migration to VFIO-PCI realize.
For the time being let's just simply report the error in
vfio platform's vfio_base_device_init(). A subsequent patch will
duly propagate the error up to vfio_platform_realize.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The error is currently simply reported in vfio_get_group. Don't
bother too much with the prefix which will be handled at upper level,
later on.
Also return an error value in case container->error is not 0 and
the container is teared down.
On vfio_spapr_remove_window failure, we also report an error whereas
it was silent before.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Pass an error object to prepare for migration to VFIO-PCI realize.
In vfio_probe_igd_bar4_quirk, simply report the error.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Pass an error object to prepare for migration to VFIO-PCI realize.
The error is cascaded downto vfio_add_std_cap and then vfio_msi(x)_setup,
vfio_setup_pcie_cap.
vfio_add_ext_cap does not return anything else than 0 so let's transform
it into a void function.
Also use pci_add_capability2 which takes an error object.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Pass an error object to prepare for migration to VFIO-PCI realize.
The error object is propagated down to vfio_intx_enable_kvm().
The three other callers, vfio_intx_enable_kvm(), vfio_msi_disable_common()
and vfio_pci_post_reset() do not propagate the error and simply call
error_reportf_err() with the ERR_PREFIX formatting.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Pass an error object to prepare for migration to VFIO-PCI realize.
The returned value will be removed later on.
We now format an error in case of reading failure for
- the MSIX flags
- the MSIX table,
- the MSIX PBA.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Pass an error object to prepare for migration to VFIO-PCI realize.
The returned value will be removed later on.
The case where error recovery cannot be enabled is not converted into
an error object but directly reported through error_report, as before.
Populating an error instead would cause the future realize function to
fail, which is not wanted.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Pass an error object to prepare for the same operation in
vfio_populate_device. Eventually this contributes to the migration
to VFIO-PCI realize.
We now report an error on vfio_get_region_info failure.
vfio_probe_igd_bar4_quirk is not involved in the migration to realize
and simply calls error_reportf_err.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
To prepare for migration to realize, let's use a local error
object in vfio_initfn. Also let's use the same error prefix for all
error messages.
On top of the 1-1 conversion, we start using a common error prefix for
all error messages. We also introduce a similar warning prefix which will
be used later on.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This pull request contains:
- a patch to add a vdc->reset() handler to virtio-9p
- a bunch of patches to fix various memory leaks (thanks to Li Qiang)
- some code cleanups for 9pfs
# gpg: Signature made Mon 17 Oct 2016 16:01:46 BST
# gpg: using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg: aka "Greg Kurz <groug@free.fr>"
# gpg: aka "Greg Kurz <gkurz@fr.ibm.com>"
# gpg: aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg: aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg: aka "Gregory Kurz (Cimai Technology) <gkurz@cimai.com>"
# gpg: aka "Gregory Kurz (Meiosys Technology) <gkurz@meiosys.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894 DBA2 02FC 3AEB 0101 DBC2
* remotes/gkurz/tags/for-upstream:
9pfs: fix memory leak in v9fs_write
9pfs: fix memory leak in v9fs_link
9pfs: fix memory leak in v9fs_xattrcreate
9pfs: fix information leak in xattr read
virtio-9p: add reset handler
9pfs: only free completed request if not flushed
9pfs: drop useless check in pdu_free()
9pfs: use coroutine_fn annotation in hw/9pfs/9p.[ch]
9pfs: use coroutine_fn annotation in hw/9pfs/co*.[ch]
9pfs: fsdev: drop useless extern annotation for functions
9pfs: fix potential host memory leak in v9fs_read
9pfs: allocate space for guest originated empty strings
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If an error occurs when marshalling the transfer length to the guest, the
v9fs_write() function doesn't free an IO vector, thus leading to a memory
leak. This patch fixes the issue.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
[groug, rephrased the changelog]
Signed-off-by: Greg Kurz <groug@kaod.org>
The v9fs_link() function keeps a reference on the source fid object. This
causes a memory leak since the reference never goes down to 0. This patch
fixes the issue.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
[groug, rephrased the changelog]
Signed-off-by: Greg Kurz <groug@kaod.org>
The 'fs.xattr.value' field in V9fsFidState object doesn't consider the
situation that this field has been allocated previously. Every time, it
will be allocated directly. This leads to a host memory leak issue if
the client sends another Txattrcreate message with the same fid number
before the fid from the previous time got clunked.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
[groug, updated the changelog to indicate how the leak can occur]
Signed-off-by: Greg Kurz <groug@kaod.org>
9pfs uses g_malloc() to allocate the xattr memory space, if the guest
reads this memory before writing to it, this will leak host heap memory
to the guest. This patch avoid this.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Virtio devices should implement the VirtIODevice->reset() function to
perform necessary cleanup actions and to bring the device to a quiescent
state.
In the case of the virtio-9p device, this means:
- emptying the list of active PDUs (i.e. draining all in-flight I/O)
- freeing all fids (i.e. close open file descriptors and free memory)
That's what this patch does.
The reset handler first waits for all active PDUs to complete. Since
completion happens in the QEMU global aio context, we just have to
loop around aio_poll() until the active list is empty.
The freeing part involves some actions to be performed on the backend,
like closing file descriptors or flushing extended attributes to the
underlying filesystem. The virtfs_reset() function already does the
job: it calls free_fid() for all open fids not involved in an ongoing
I/O operation. We are sure this is the case since we have drained
the PDU active list.
The current code implements all backend accesses with coroutines, but we
want to stay synchronous on the reset path. We can either change the
current code to be able to run when not in coroutine context, or create
a coroutine context and wait for virtfs_reset() to complete. This patch
goes for the latter because it results in simpler code.
Note that we also need to create a dummy PDU because it is also an API
to pass the FsContext pointer to all backend callbacks.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
If a PDU has a flush request pending, the current code calls pdu_free()
twice:
1) pdu_complete()->pdu_free() with pdu->cancelled set, which does nothing
2) v9fs_flush()->pdu_free() with pdu->cancelled cleared, which moves the
PDU back to the free list.
This works but it complexifies the logic of pdu_free().
With this patch, pdu_complete() only calls pdu_free() if no flush request
is pending, i.e. qemu_co_queue_next() returns false.
Since pdu_free() is now supposed to be called with pdu->cancelled cleared,
the check in pdu_free() is dropped and replaced by an assertion.
Signed-off-by: Greg Kurz <groug@kaod.org>
All these functions either call the v9fs_co_* functions which have the
coroutine_fn annotation, or pdu_complete() which calls qemu_co_queue_next().
Let's mark them to make it obvious they execute in coroutine context.
Signed-off-by: Greg Kurz <groug@kaod.org>
All these functions use the v9fs_co_run_in_worker() macro, and thus always
call qemu_coroutine_self() and qemu_coroutine_yield().
Let's mark them to make it obvious they execute in coroutine context.
Signed-off-by: Greg Kurz <groug@kaod.org>
In 9pfs read dispatch function, it doesn't free two QEMUIOVector
object thus causing potential memory leak. This patch avoid this.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
Signed-off-by: Greg Kurz <groug@kaod.org>
If a guest sends an empty string paramater to any 9P operation, the current
code unmarshals it into a V9fsString equal to { .size = 0, .data = NULL }.
This is unfortunate because it can cause NULL pointer dereference to happen
at various locations in the 9pfs code. And we don't want to check str->data
everywhere we pass it to strcmp() or any other function which expects a
dereferenceable pointer.
This patch enforces the allocation of genuine C empty strings instead, so
callers don't have to bother.
Out of all v9fs_iov_vunmarshal() users, only v9fs_xattrwalk() checks if
the returned string is empty. It now uses v9fs_string_size() since
name.data cannot be NULL anymore.
Signed-off-by: Li Qiang <liqiang6-s@360.cn>
[groug, rewritten title and changelog,
fix empty string check in v9fs_xattrwalk()]
Signed-off-by: Greg Kurz <groug@kaod.org>
ppc patch queue 2016-10-17
Highlights:
* Significant rework of how PCI IO windows are placed for the
pseries machine type
* A number of extra tests added for ppc
* Other tests clean up / fixed
* Some cleanups to the XICS interrupt controller in preparation
for the 'powernv' machine type
A number of the test changes aren't strictly in ppc related code, but
are included via my tree because they're primarily focused on
improving test coverage for ppc.
# gpg: Signature made Mon 17 Oct 2016 03:42:41 BST
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.8-20161017:
spapr: Improved placement of PCI host bridges in guest memory map
spapr_pci: Add a 64-bit MMIO window
spapr: Adjust placement of PCI host bridge to allow > 1TiB RAM
spapr_pci: Delegate placement of PCI host bridges to machine type
libqos: Limit spapr-pci to 32-bit MMIO for now
libqos: Correct error in PCI hole sizing for spapr
libqos: Isolate knowledge of spapr memory map to qpci_init_spapr()
ppc/xics: Split ICS into ics-base and ics class
ppc/xics: Make the ICSState a list
spapr: fix inheritance chain for default machine options
target-ppc: implement vexts[bh]2w and vexts[bhw]2d
tests/boot-sector: Increase time-out to 90 seconds
tests/boot-sector: Use mkstemp() to create a unique file name
tests/boot-sector: Use minimum length for the Forth boot script
qtest: ask endianness of the target in qtest_init()
tests: minor cleanups in usb-hcd-uhci-test
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# gpg: Signature made Mon 17 Oct 2016 03:08:28 BST
# gpg: using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6
* remotes/famz/tags/for-upstream:
tests/docker/Makefile.include: add a generic docker-run target
tests/docker: make test-mingw honour TARGET_LIST
tests/docker: test-build script
tests/docker: add travis dockerfile
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
migration/next for 20161014
# gpg: Signature made Fri 14 Oct 2016 16:24:13 BST
# gpg: using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg: aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* remotes/juanquintela/tags/migration/20161014:
docs/xbzrle: correction
migrate: move max-bandwidth and downtime-limit to migrate_set_parameter
migration: Fix seg with missing port
migration/postcopy: Explicitly disallow huge pages
RAMBlocks: Store page size
Postcopy vs xbzrle: Don't send xbzrle pages once in postcopy [for 2.8]
migrate: Fix bounds check for migration parameters in migration.c
migrate: Use boxed qapi for migrate-set-parameters
migrate: Share common MigrationParameters struct
migrate: Fix cpu-throttle-increment regression in HMP
migration/rdma: Don't flag an error when we've been told about one
migration: Make failed migration load set file error
migration/rdma: Pass qemu_file errors across link
migration: Report values for comparisons
migration: report an error giving the failed field
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This re-factors the docker makefile to include a docker-run target which
can be controlled entirely from environment variables specified on the
make command line. This allows us to run against any given docker image
we may have in our repository, for example:
make docker-run TEST="test-quick" IMAGE="debian:arm64" \
EXECUTABLE=./aarch64-linux-user/qemu-aarch64
The existing docker-foo@bar targets still work but the inline
verification has been dropped because we already don't hit that due to
other pattern rules in rules.mak.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20161011161625.9070-5-alex.bennee@linaro.org>
Message-Id: <20161011161625.9070-6-alex.bennee@linaro.org>
[Squash in the verification removal patch. - Fam]
Signed-off-by: Fam Zheng <famz@redhat.com>
This target grabs the latest Travis containers from their repository at
quay.io and then installs QEMU's build dependencies. With this it is
possible to run on broadly the same setup as they have on travis-ci.org.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20161011161625.9070-2-alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Currently, the MMIO space for accessing PCI on pseries guests begins at
1 TiB in guest address space. Each PCI host bridge (PHB) has a 64 GiB
chunk of address space in which it places its outbound PIO and 32-bit and
64-bit MMIO windows.
This scheme as several problems:
- It limits guest RAM to 1 TiB (though we have a limited fix for this
now)
- It limits the total MMIO window to 64 GiB. This is not always enough
for some of the large nVidia GPGPU cards
- Putting all the windows into a single 64 GiB area means that naturally
aligning things within there will waste more address space.
In addition there was a miscalculation in some of the defaults, which meant
that the MMIO windows for each PHB actually slightly overran the 64 GiB
region for that PHB. We got away without nasty consequences because
the overrun fit within an unused area at the beginning of the next PHB's
region, but it's not pretty.
This patch implements a new scheme which addresses those problems, and is
also closer to what bare metal hardware and pHyp guests generally use.
Because some guest versions (including most current distro kernels) can't
access PCI MMIO above 64 TiB, we put all the PCI windows between 32 TiB and
64 TiB. This is broken into 1 TiB chunks. The first 1 TiB contains the
PIO (64 kiB) and 32-bit MMIO (2 GiB) windows for all of the PHBs. Each
subsequent TiB chunk contains a naturally aligned 64-bit MMIO window for
one PHB each.
This reduces the number of allowed PHBs (without full manual configuration
of all the windows) from 256 to 31, but this should still be plenty in
practice.
We also change some of the default window sizes for manually configured
PHBs to saner values.
Finally we adjust some tests and libqos so that it correctly uses the new
default locations. Ideally it would parse the device tree given to the
guest, but that's a more complex problem for another time.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
On real hardware, and under pHyp, the PCI host bridges on Power machines
typically advertise two outbound MMIO windows from the guest's physical
memory space to PCI memory space:
- A 32-bit window which maps onto 2GiB..4GiB in the PCI address space
- A 64-bit window which maps onto a large region somewhere high in PCI
address space (traditionally this used an identity mapping from guest
physical address to PCI address, but that's not always the case)
The qemu implementation in spapr-pci-host-bridge, however, only supports a
single outbound MMIO window, however. At least some Linux versions expect
the two windows however, so we arranged this window to map onto the PCI
memory space from 2 GiB..~64 GiB, then advertised it as two contiguous
windows, the "32-bit" window from 2G..4G and the "64-bit" window from
4G..~64G.
This approach means, however, that the 64G window is not naturally aligned.
In turn this limits the size of the largest BAR we can map (which does have
to be naturally aligned) to roughly half of the total window. With some
large nVidia GPGPU cards which have huge memory BARs, this is starting to
be a problem.
This patch adds true support for separate 32-bit and 64-bit outbound MMIO
windows to the spapr-pci-host-bridge implementation, each of which can
be independently configured. The 32-bit window always maps to 2G.. in PCI
space, but the PCI address of the 64-bit window can be configured (it
defaults to the same as the guest physical address).
So as not to break possible existing configurations, as long as a 64-bit
window is not specified, a large single window can be specified. This
will appear the same way to the guest as the old approach, although it's
now implemented by two contiguous memory regions rather than a single one.
For now, this only adds the possibility of 64-bit windows. The default
configuration still uses the legacy mode.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Currently the default PCI host bridge for the 'pseries' machine type is
constructed with its IO windows in the 1TiB..(1TiB + 64GiB) range in
guest memory space. This means that if > 1TiB of guest RAM is specified,
the RAM will collide with the PCI IO windows, causing serious problems.
Problems won't be obvious until guest RAM goes a bit beyond 1TiB, because
there's a little unused space at the bottom of the area reserved for PCI,
but essentially this means that > 1TiB of RAM has never worked with the
pseries machine type.
This patch fixes this by altering the placement of PHBs on large-RAM VMs.
Instead of always placing the first PHB at 1TiB, it is placed at the next
1 TiB boundary after the maximum RAM address.
Technically, this changes behaviour in a migration-breaking way for
existing machines with > 1TiB maximum memory, but since having > 1 TiB
memory was broken anyway, this seems like a reasonable trade-off.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
The 'spapr-pci-host-bridge' represents the virtual PCI host bridge (PHB)
for a PAPR guest. Unlike on x86, it's routine on Power (both bare metal
and PAPR guests) to have numerous independent PHBs, each controlling a
separate PCI domain.
There are two ways of configuring the spapr-pci-host-bridge device: first
it can be done fully manually, specifying the locations and sizes of all
the IO windows. This gives the most control, but is very awkward with 6
mandatory parameters. Alternatively just an "index" can be specified
which essentially selects from an array of predefined PHB locations.
The PHB at index 0 is automatically created as the default PHB.
The current set of default locations causes some problems for guests with
large RAM (> 1 TiB) or PCI devices with very large BARs (e.g. big nVidia
GPGPU cards via VFIO). Obviously, for migration we can only change the
locations on a new machine type, however.
This is awkward, because the placement is currently decided within the
spapr-pci-host-bridge code, so it breaks abstraction to look inside the
machine type version.
So, this patch delegates the "default mode" PHB placement from the
spapr-pci-host-bridge device back to the machine type via a public method
in sPAPRMachineClass. It's still a bit ugly, but it's about the best we
can do.
For now, this just changes where the calculation is done. It doesn't
change the actual location of the host bridges, or any other behaviour.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Currently the functions in pci-spapr.c (like pci-pc.c on which it's based)
don't distinguish between 32-bit and 64-bit PCI MMIO. At the moment, the
qemu side implementation is a bit weird and has a single MMIO window
straddling 32-bit and 64-bit regions, but we're likely to change that in
future.
In any case, pci-pc.c - and therefore the testcases using PCI - only handle
32-bit MMIOs for now. For spapr despite whatever changes might happen with
the MMIO windows, the 32-bit window is likely to remain at 2..4 GiB in PCI
space.
So, explicitly limit pci-spapr.c to 32-bit MMIOs for now, we can add 64-bit
MMIO support back in when and if we need it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
In pci-spapr.c (as in pci-pc.c from which it was derived), the
pci_hole_start/pci_hole_size and pci_iohole_start/pci_iohole_size pairs[1]
essentially define the region of PCI (not CPU) addresses in which MMIO
or PIO BARs respectively will be allocated.
The size value is relative to the start value. But in pci-spapr.c it is
set to the entire size of the window supported by the (emulated) hardware,
but the start values are *not* at the beginning of the emulated windows.
That means if you tried to map enough PCI BARs, we'd messily overrun the
IO windows, instead of failing in iomap as we should.
This patch corrects this by calculating the hole sizes from the location
of the window in PCI space and the hole start.
[1] Those are bad names, but that's a problem for another time.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
The libqos code for accessing PCI on the spapr machine type uses IOBASE()
and MMIOBASE() macros to determine the address in the CPU memory map of
the windows to PCI address space.
This is a detail of the implementation of PCI in the machine type, it's not
specified by the PAPR standard. Real guests would get the addresses of the
PCI windows from the device tree.
Finding the device tree in libqos would be awkward, but we can at least
localize this knowledge of the implementation to the init function, saving
it in the QPCIBusSPAPR structure for use by the accessors.
That leaves only one place to fix if we alter the location of the PCI
windows, as we're planning to do.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
The existing implementation remains same and ics-base is introduced. The
type name "ics" is retained, and all the related functions renamed as
ics_simple_*
This will allow different implementations for the source controllers
such as the MSI support of PHB3 on Power8 which uses in-memory state
tables for example.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[ clg: added ICS_BASE_GET_CLASS and related fixes, based on :
http://patchwork.ozlabs.org/patch/646010/ ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Instead of an array of fixed sized blocks, use a list, as we will need
to have sources with variable number of interrupts. SPAPR only uses
a single entry. Native will create more. If performance becomes an
issue we can add some hashed lookup but for now this will do fine.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
[ move the initialization of list to xics_common_initfn,
restore xirr_owner after migration and move restoring to
icp_post_load]
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
[ clg: removed the icp_post_load() changes from nikunj patchset v3:
http://patchwork.ozlabs.org/patch/646008/ ]
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Rather than machine instances having backward-compatible option
defaults that need to be repeatedly re-enabled for every new machine
type we introduce, we set the defaults appropriate for newer machine
types, then add code to explicitly disable instance options as needed
to maintain compatibility with older machine types.
Currently pseries-2.5 does not inherit from pseries-2.6 in this
fashion, which is okay at the moment since we do not have any
instance compatibility options for pseries-2.6+ currently.
We will make use of this in future patches though, so fix it here.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[dwg: Extended to make 2.7 inherit from 2.8 as well]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Vector Extend Sign Instructions:
vextsb2w: Vector Extend Sign Byte To Word
vextsh2w: Vector Extend Sign Halfword To Word
vextsb2d: Vector Extend Sign Byte To Doubleword
vextsh2d: Vector Extend Sign Halfword To Doubleword
vextsw2d: Vector Extend Sign Word To Doubleword
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Since the PXE tester runs rather slow on ppc64 with tcg, there
is a chance that we hit the 60 seconds timeout on machines that
have a heavy CPU load. So let's increase the timeout to ease
the situation.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The pxe-test is run for three different targets now (x86_64, i386
and ppc64), and the bios-tables-test is run for two targets (x86_64
and i386). But each of the tests is using an invariant name for the
disk image with the boot sector code - so if the tests are running in
parallel, there is a race condition that they destroy the disk image
of a parallel test program. Let's use mkstemp() to create unique
temporary files here instead - and since mkstemp() is returning an
integer file descriptor instead of a FILE pointer, we also switch
the fwrite() and fclose() to write() and close() instead.
Reported-by: Sascha Silbe <x-qemu@se-silbe.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The pxe-test is quite slow on ppc64 with tcg. We can speed it up
a little bit by decreasing the size of the file that has to be
loaded via TFTP.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The target endianness is not deduced anymore from
the architecture name but asked directly to the guest,
using a new qtest command: "endianness". As it can't
change (this is the value of TARGET_WORDS_BIGENDIAN),
we store it to not have to ask every time we want to
know if we have to byte-swap a value.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
CC: Greg Kurz <groug@kaod.org>
CC: David Gibson <david@gibson.dropbear.id.au>
CC: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Two minor cleanups:
- exit gracefully in case on unsupported target,
- put machine command line in a constant to avoid
to duplicate it.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Mark the old commands 'migrate_set_speed' and 'migrate_set_downtime' as
deprecated.
Move max-bandwidth and downtime-limit into migrate-set-parameters for
setting maximum migration speed and expected downtime limit parameters
respectively.
Change downtime units to milliseconds (only for new-command) and set
its upper bound limit to 2000 seconds.
Update the query part in both hmp and qmp qemu control interfaces.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The command :
migrate tcp:localhost:
currently segs; fix it so it now says:
error parsing address 'localhost:'
and the same for -incoming.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
At the moment postcopy will fail as soon as qemu tries to register
userfault on the RAMBlock pages that are backed by hugepages.
However, the kernel is going to get userfault support for hugepage
at some point, and we've not got the rest of the QEMU code to support
it yet, so fail neatly with an error like:
Postcopy doesn't support hugetlbfs yet (/objects/mem1)
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
xbzrle relies on reading pages that have already been sent
to the destination and then applying the modifications; we can't
do that in postcopy because the destination may well have
modified the page already or the page has been discarded.
I already didn't allow reception of xbzrle pages, but I
forgot to add the test to stop them being sent.
Enabling both xbzrle and postcopy can make some sense;
if you think that your migration might finish if you
have xbzrle, then when it doesn't complete you flick
over to postcopy and stop xbzrle'ing.
This corresponds to RH bug:
https://bugzilla.redhat.com/show_bug.cgi?id=1368422
Symptom is:
Unknown combination of migration flags: 0x60 (postcopy mode)
(either 0x60 or 0x40)
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This patch fixes the out-of-bounds check of migration parameters in
qmp_migrate_set_parameters() for cpu-throttle-initial and
cpu-throttle-increment by adding a return statement for both as they
were broken since their introduction in 2.5 via commit 1626fee.
Due to the missing return statements, parameters were getting set to
out-of-bounds values despite the error.
Signed-off-by: Ashijeet Acharya <ashijeetacharya@gmail.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Amit Shah <amit.shah@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
It is rather verbose, and slightly error-prone, to repeat
the same set of parameters for input (migrate-set-parameters)
as for output (query-migrate-parameters), where the only
difference is whether the members are optional. We can just
document that the optional members will always be present
on output, and then share a common struct between both
commands. The next patch can then reduce the amount of
code needed on input.
Also, we made a mistake in qemu 2.7 of returning an empty
string during 'query-migrate-parameters' when there is no
TLS, rather than omitting TLS details entirely. Technically,
this change risks breaking any 2.7 client that is hard-coded
to expect the parameter's existence; on the other hand, clients
that are portable to 2.6 already must be prepared for those
members to not be present.
And this gets rid of yet one more place where the QMP output
visitor is silently converting a NULL string into "" (which
is a hack I ultimately want to kill off).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
If the other side tells us there's been an error and we fail
the migration, we don't need to signal that failure to the other
side because it already knew.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael R. Hines <michael@hinespot.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
If an error occurs in a section load, set the file error flag
so that the transport can get notified to do a cleanup.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael R. Hines <michael@hinespot.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
If we fail for some reason (e.g. a mismatched RAMBlock)
and it's set the qemu_file error flag, pass that error back to the
peer so it can clean up rather than waiting for some higher level
progress.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael R. Hines <michael@hinespot.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Report the values when a comparison fails; together with
the previous patch that prints the device and field names
this should give a good idea of why loading the migration failed.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
When a field fails to load (typically due to a limit
check, or a call to a get/put) report the device and field
to give an indication of the cause.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
That commit mis-used mux char: the frontend are multiplexed, not the
backend. Fix the regression preventing "c-a c" to switch the focus. The
following patches will fix the crash (when leaving or removing frontend)
by tracking frontends with handler tags.
This reverts commit 949055a254.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
KVM-PR currently does not support transactional memory, and the
implementation in TCG is just a fake. We should not announce TM
support in the ibm,pa-features property when running on such a
system, so disable it by default and only enable it if the KVM
implementation supports it (i.e. recent versions of KVM-HV).
These changes are based on some earlier work from Anton Blanchard
(thanks!).
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit bac3bf287a)
The current code uses pa_features_206 for POWERPC_MMU_2_06, and
for everything else, it uses pa_features_207. This is bad in some
cases because there is also a "degraded" MMU version of ISA 2.06,
called POWERPC_MMU_2_06a, which should of course use the flags for
2.06 instead. And there is also the possibility that the user runs
the pseries machine with a POWER5+ or even 970 processor. In that
case we certainly do not want to set the flags for 2.07, and rather
simply skip the setting of the pa-features property instead.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 4cbec30d76)
The function spapr_populate_cpu_dt() has become quite big
already, and since we likely have to extend the pa-features
property for every new processor generation, it is nicer
if we put the related code into a separate function.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
(cherry picked from commit 230bf719d3)
@@ -150,13 +150,16 @@ The trace backends are chosen at configure time:
For a list of supported trace backends, try ./configure --help or see below.
If multiple backends are enabled, the trace is sent to them all.
If no backends are explicitly selected, configure will default to the
"log" backend.
The following subsections describe the supported trace backends.
=== Nop ===
The "nop" backend generates empty trace event functions so that the compiler
can optimize out trace events completely. This is the default and imposes no
performance penalty.
can optimize out trace events completely. This imposes no performance
penalty.
Note that regardless of the selected trace backend, events with the "disable"
property will be generated with the "nop" backend.
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.