Clarify that:
- for protocols the bdrv_file_open function is used instead
of bdrv_open;
- when protocol_name is set, a driver should expect
to be given only a filename and no other options.
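As an illustration (a rough sketch only, not part of this patch; the
driver names are made up), the two cases differ in the BlockDriver
definition roughly like this:

    /* format/filter driver: selected explicitly, opened via bdrv_open() */
    static BlockDriver bdrv_mydrv = {
        .format_name = "mydrv",
        .bdrv_open   = mydrv_open,            /* gets a QDict of options */
    };

    /* protocol driver: selected via "myproto:<filename>" */
    static BlockDriver bdrv_myproto = {
        .format_name    = "myproto",
        .protocol_name  = "myproto",
        .bdrv_file_open = myproto_file_open,  /* expects just a filename */
    };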
Signed-off-by: Fabiano Rosas <farosas@linux.vnet.ibm.com>
The blkreplay driver is not a protocol so it should implement bdrv_open
instead of bdrv_file_open and not provide a protocol_name.
Attempts to invoke this driver using protocol syntax
(i.e. blkreplay:<filename:options:...>) will now fail gracefully:
$ qemu-img info blkreplay:foo
qemu-img: Could not open 'blkreplay:foo': Unknown protocol 'blkreplay'
Signed-off-by: Fabiano Rosas <farosas@linux.vnet.ibm.com>
Reviewed-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Reviewed-by: Max Reitz <mreitz@redhat.com>
The throttle driver is not a protocol so it should implement bdrv_open
instead of bdrv_file_open and not provide a protocol_name.
Attempts to invoke this driver using protocol syntax
(i.e. throttle:<filename:options:...>) will now fail gracefully:
$ qemu-img info throttle:foo
qemu-img: Could not open 'throttle:foo': Unknown protocol 'throttle'
Signed-off-by: Fabiano Rosas <farosas@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
The quorum driver is not a protocol so it should implement bdrv_open
instead of bdrv_file_open and not provide a protocol_name.
Attempts to invoke this driver using protocol syntax
(i.e. quorum:<filename:options:...>) will now fail gracefully:
$ qemu-img info quorum:foo
qemu-img: Could not open 'quorum:foo': Unknown protocol 'quorum'
Signed-off-by: Fabiano Rosas <farosas@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
The protocol_name field is used when selecting a driver via protocol
syntax (i.e. <protocol_name>:<filename:options:...>). Drivers that are
only selected explicitly (e.g. driver=replication,mode=primary,...)
should not have a protocol_name.
This patch removes the protocol_name field from the bdrv_replication
structure so that attempts to invoke this driver using protocol syntax
will fail gracefully:
$ qemu-img info replication:foo
qemu-img: Could not open 'replication:foo': Unknown protocol 'replication'
Buglink: https://bugs.launchpad.net/qemu/+bug/1726733
Signed-off-by: Fabiano Rosas <farosas@linux.vnet.ibm.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
gtk,spice: add dmabuf support.
sdl,vnc,gtk: bugfixes.
ui/qapi: add device ID and head parameters to screendump.
build: try improve handling of clang warnings.
# gpg: Signature made Mon 12 Mar 2018 09:13:28 GMT
# gpg: using RSA key 4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138
* remotes/kraxel/tags/ui-20180312-pull-request:
qapi: Add device ID and head parameters to screendump
spice: add cursor_dmabuf support
spice: add scanout_dmabuf support
spice: drop dprint() debug logging
vnc: deal with surface NULL pointers
ui/gtk-egl: add cursor_dmabuf support
ui/gtk-egl: add scanout_dmabuf support
ui/gtk: use GtkGlArea on wayland only
ui/opengl: Makefile cleanup
ui/gtk: group gtk.mo declarations in Makefile
ui/gtk: make GtkGlArea usage a runtime option
sdl: workaround bug in sdl 2.0.8 headers
make: switch language file build to be gtk module aware
build: try improve handling of clang warnings
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
target-arm queue:
* i.MX: Add i.MX7 SOC implementation and i.MX7 Sabre board
* Report the correct core count in A53 L2CTLR on the ZynqMP board
* linux-user: preliminary SVE support work (signal handling)
* hw/arm/boot: fix memory leak in case of error loading ELF file
* hw/arm/boot: avoid reading off end of buffer if passed very
small image file
* hw/arm: Use more CONFIG switches for the object files
* target/arm: Add "-cpu max" support
* hw/arm/virt: Support -machine gic-version=max
* hw/sd: improve debug tracing
* hw/sd: sdcard: Add the Tuning Command (CMD 19)
* MAINTAINERS: add Philippe as odd-fixes maintainer for SD
# gpg: Signature made Fri 09 Mar 2018 17:24:23 GMT
# gpg: using RSA key 3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20180309: (25 commits)
MAINTAINERS: Add entries for SD (SDHCI, SDBus, SDCard)
sdhci: Fix a typo in comment
sdcard: Add the Tuning Command (CMD19)
sdcard: Display which protocol is used when tracing (SD or SPI)
sdcard: Display command name when tracing CMD/ACMD
sdcard: Do not trace CMD55, except when we already expect an ACMD
hw/arm/virt: Support -machine gic-version=max
hw/arm/virt: Add "max" to the list of CPU types "virt" supports
target/arm: Make 'any' CPU just an alias for 'max'
target/arm: Add "-cpu max" support
target/arm: Move definition of 'host' cpu type into cpu.c
target/arm: Query host CPU features on-demand at instance init
arm: avoid heap-buffer-overflow in load_aarch64_image
arm: fix load ELF error leak
hw/arm: Use more CONFIG switches for the object files
aarch64-linux-user: Add support for SVE signal frame records
aarch64-linux-user: Add support for EXTRA signal frame records
aarch64-linux-user: Remove struct target_aux_context
aarch64-linux-user: Split out helpers for guest signal handling
linux-user: Implement aarch64 PR_SVE_SET/GET_VL
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Typically the scanline length and the line offset are identical. But
in case they are not, our calculation for region_end is incorrect. Using
line_offset is fine for all scanlines except the last one, where we have
to use the actual scanline length.
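A rough sketch of the corrected calculation (variable names here are
assumptions based on the description above, not the literal hunk):

    /* all lines but the last one advance by line_offset ... */
    region_end = region_start + (ram_addr_t)line_offset * (height - 1);
    /* ... while the last line only covers the actual scanline length */
    region_end += scanline_length;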
Fixes: CVE-2018-7550
Reported-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Prasad J Pandit <pjp@fedoraproject.org>
Tested-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Message-id: 20180309143704.13420-1-kraxel@redhat.com
Add audio/ to the common-obj-m variable.
Also run both the audio and ui variables through unnest-vars.
This avoids name clashes for sdl.mo (which exists in both audio/ and ui/).
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180306074053.22856-4-kraxel@redhat.com
Make audio_driver_lookup() try to load the module in case it doesn't find
the driver in the registry. Also load all modules for -audio-help, so
the help output includes the help text for modular audio drivers.
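A minimal sketch of the lookup-with-fallback logic (the helper names
audio_driver_find() and audio_module_load_one() are placeholders here,
not necessarily the exact interfaces):

    static struct audio_driver *audio_driver_lookup(const char *name)
    {
        struct audio_driver *d;

        d = audio_driver_find(name);     /* look in the registry first */
        if (d) {
            return d;
        }

        audio_module_load_one(name);     /* try the module ... */
        return audio_driver_find(name);  /* ... and look again */
    }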
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180306074053.22856-3-kraxel@redhat.com
Add a registry for audio drivers, using the existing audio_driver struct.
Make all drivers register themselves. The old list of audio_driver struct
pointers is now a list of audio driver names, specifying the priority
(aka probe order) in case no driver is explicitly asked for.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180306074053.22856-2-kraxel@redhat.com
Block layer patches
# gpg: Signature made Fri 09 Mar 2018 15:09:20 GMT
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (56 commits)
qemu-iotests: fix 203 migration completion race
iotests: Tweak 030 in order to trigger a race condition with parallel jobs
iotests: Skip test for ENOMEM error
iotests: Mark all tests executable
iotests: Test creating overlay when guest running
qemu-iotests: Test ssh image creation over QMP
qemu-iotests: Test qcow2 over file image creation with QMP
block: Fail bdrv_truncate() with negative size
file-posix: Fix no-op bdrv_truncate() with falloc preallocation
ssh: Support .bdrv_co_create
ssh: Pass BlockdevOptionsSsh to connect_to_ssh()
ssh: QAPIfy host-key-check option
ssh: Use QAPI BlockdevOptionsSsh object
sheepdog: Support .bdrv_co_create
sheepdog: QAPIfy "redundancy" create option
nfs: Support .bdrv_co_create
nfs: Use QAPI options in nfs_client_open()
rbd: Use qemu_rbd_connect() in qemu_rbd_do_create()
rbd: Assign s->snap/image_name in qemu_rbd_open()
rbd: Support .bdrv_co_create
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
QEMU's screendump command can only take dumps from the primary display.
When using multiple VGA cards, there is no way to get a dump from a
secondary card or other display heads yet. So let's add a 'device' and
a 'head' parameter to the HMP and QMP commands to be able to specify
alternative devices and heads with the screendump command, too.
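With the two new optional arguments the QMP handler ends up with
roughly this shape (a sketch of the QAPI-generated signature, not copied
from the patch):

    void qmp_screendump(const char *filename,
                        bool has_device, const char *device,
                        bool has_head, int64_t head,
                        Error **errp);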
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1520267868-31778-1-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Add support for cursor dmabufs. qemu has to render the cursor for
that, so in case a cursor is present qemu allocates a new dmabuf, blits
the scanout, blends in the pointer and passes on the new dmabuf to
spice-server. Without a cursor, qemu continues to simply pass on the
scanout dmabuf as-is.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180308090618.30147-4-kraxel@redhat.com
Secondary displays in multihead setups are allowed to have a NULL
DisplaySurface. Typically user interfaces handle this by hiding the
window which shows the display in question.
This isn't an option for vnc though, because it simply has no concept
of windows or outputs. So handle the situation by showing a placeholder
DisplaySurface instead. Also check in console_select whether a surface
is present in the first place before requesting an update.
This fixes a segfault which can be triggered by switching to an unused
display (via ctrl-alt-<nr>) in a multihead setup, for example using
-device virtio-vga,max_outputs=2.
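The core of the fix is a check along these lines (a sketch; the
placeholder helper name is an assumption):

    if (!surface) {
        /* secondary heads may legitimately have no surface yet */
        surface = create_placeholder_surface(w, h, "Display is not active");
    }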
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-id: 20180308161803.6152-1-kraxel@redhat.com
For dma-buf support we need an EGL context. The gtk x11 backend uses GLX
contexts though. We can't use the GtkGlArea widget on x11 because of
that, so use our own gtk-egl code instead. Wayland continues to use
the GtkGlArea widget.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180306090951.22932-5-kraxel@redhat.com
Compile in both gtk-egl and gtk-gl-area, then allow choosing at runtime
instead of compile time which OpenGL variant we want to use.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180306090951.22932-2-kraxel@redhat.com
This patch disables the pragma diagnostic -Wunused-but-set-variable for
clang in util/coroutine-ucontext.c.
This in turn allows us to remove it from the configure check, so the
CONFIG_PRAGMA_DIAGNOSTIC_AVAILABLE check will succeed for clang.
With that in place clang builds (linux) will use -Werror by default,
which breaks the build due to warnings about unaligned struct members.
Just turning off this warning isn't a good idea as it indicates
portability problems. So make it a warning again, using
-Wno-error=address-of-packed-member. That way it doesn't break the
build but still shows up in the logs.
Now clang builds qemu without errors. Well, almost. There are some
left in the rdma code. Leaving that to the rdma people. All others can
use --disable-rdma to work around this.
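In practice this boils down to guarding the pragma (a sketch, not the
literal hunk) and adding -Wno-error=address-of-packed-member to the
compiler flags so that warning stays visible but non-fatal:

    #ifndef __clang__
    /* clang doesn't know this warning; the pragma itself would warn */
    #pragma GCC diagnostic push
    #pragma GCC diagnostic ignored "-Wunused-but-set-variable"
    #endif
    ...
    #ifndef __clang__
    #pragma GCC diagnostic pop
    #endif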
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-Id: <20180309135945.20436-1-kraxel@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This actually limits (as the original commit message suggests) the
number of I/O buffers that can be allocated and not the number
of parallel (inflight) I/O requests.
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <1520507908-16743-4-git-send-email-pl@kamp.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This patch makes the bulk phase of a block migration take
place before we start transferring RAM. As the bulk block migration
can take a long time, it is pointless to transfer RAM during that phase.
Signed-off-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <1520507908-16743-2-git-send-email-pl@kamp.de>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Spotted thanks to ASAN:
QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 tests/migration-test -p /x86_64/migration/bad_dest
==30302==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 48 byte(s) in 1 object(s) allocated from:
#0 0x7f60efba1a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
#1 0x7f60eef3cf75 in g_malloc0 ../glib/gmem.c:124
#2 0x55ca9094702c in error_copy /home/elmarco/src/qemu/util/error.c:203
#3 0x55ca9037a30f in migrate_set_error /home/elmarco/src/qemu/migration/migration.c:1139
#4 0x55ca9037a462 in migrate_fd_error /home/elmarco/src/qemu/migration/migration.c:1150
#5 0x55ca9038162b in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:2411
#6 0x55ca90386e41 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:81
#7 0x55ca9038335e in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:85
#8 0x55ca9083dd3a in qio_task_complete /home/elmarco/src/qemu/io/task.c:142
#9 0x55ca9083d6cc in gio_task_thread_result /home/elmarco/src/qemu/io/task.c:88
#10 0x7f60eef37317 in g_idle_dispatch ../glib/gmain.c:5552
#11 0x7f60eef3490b in g_main_dispatch ../glib/gmain.c:3182
#12 0x7f60eef357ac in g_main_context_dispatch ../glib/gmain.c:3847
#13 0x55ca90927231 in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214
#14 0x55ca90927420 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261
#15 0x55ca909275fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515
#16 0x55ca8fc1c2a4 in main_loop /home/elmarco/src/qemu/vl.c:1942
#17 0x55ca8fc2eb3a in main /home/elmarco/src/qemu/vl.c:4724
#18 0x7f60e4082009 in __libc_start_main (/lib64/libc.so.6+0x21009)
Indirect leak of 45 byte(s) in 1 object(s) allocated from:
#0 0x7f60efba1850 in malloc (/lib64/libasan.so.4+0xde850)
#1 0x7f60eef3cf0c in g_malloc ../glib/gmem.c:94
#2 0x7f60eef3d1cf in g_malloc_n ../glib/gmem.c:331
#3 0x7f60eef596eb in g_strdup ../glib/gstrfuncs.c:363
#4 0x55ca90947085 in error_copy /home/elmarco/src/qemu/util/error.c:204
#5 0x55ca9037a30f in migrate_set_error /home/elmarco/src/qemu/migration/migration.c:1139
#6 0x55ca9037a462 in migrate_fd_error /home/elmarco/src/qemu/migration/migration.c:1150
#7 0x55ca9038162b in migrate_fd_connect /home/elmarco/src/qemu/migration/migration.c:2411
#8 0x55ca90386e41 in migration_channel_connect /home/elmarco/src/qemu/migration/channel.c:81
#9 0x55ca9038335e in socket_outgoing_migration /home/elmarco/src/qemu/migration/socket.c:85
#10 0x55ca9083dd3a in qio_task_complete /home/elmarco/src/qemu/io/task.c:142
#11 0x55ca9083d6cc in gio_task_thread_result /home/elmarco/src/qemu/io/task.c:88
#12 0x7f60eef37317 in g_idle_dispatch ../glib/gmain.c:5552
#13 0x7f60eef3490b in g_main_dispatch ../glib/gmain.c:3182
#14 0x7f60eef357ac in g_main_context_dispatch ../glib/gmain.c:3847
#15 0x55ca90927231 in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214
#16 0x55ca90927420 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261
#17 0x55ca909275fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515
#18 0x55ca8fc1c2a4 in main_loop /home/elmarco/src/qemu/vl.c:1942
#19 0x55ca8fc2eb3a in main /home/elmarco/src/qemu/vl.c:4724
#20 0x7f60e4082009 in __libc_start_main (/lib64/libc.so.6+0x21009)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180306170959.3921-1-marcandre.lureau@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
From the "Physical Layer Simplified Specification Version 3.01":
A known data block ("Tuning block") can be used to tune sampling
point for tuning required hosts. [...]
This procedure gives the system optimal timing for each specific
host and card combination and compensates for static delays in
the timing budget including process, voltage and different PCB
loads and skews. [...]
Data block, carried by DAT[3:0], contains a pattern for tuning
sampling position to receive data on the CMD and DAT[3:0] line.
[based on a patch from Alistair Francis <alistair.francis@xilinx.com>
from qemu/xilinx tag xilinx-v2015.2]
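A rough sketch of what the new command case could look like in
sd_normal_command() (the state and response names follow existing sd.c
conventions, but treat them as assumptions rather than the actual hunk):

    case 19:    /* CMD19: SEND_TUNING_BLOCK (SD) */
        if (sd->spec_version < SD_PHY_SPECv3_01_VERS) {
            break;              /* tuning only exists since spec v3.01 */
        }
        sd->state = sd_sendingdata_state;
        sd->data_offset = 0;    /* card then returns the fixed tuning block */
        return sd_r1;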
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 20180309153654.13518-5-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add support for passing 'max' to -machine gic-version. By analogy
with the -cpu max option, this picks the "best available" GIC version
whether you're using KVM or TCG, so it behaves like 'host' when
using KVM, and gives you GICv3 when using TCG.
Also like '-cpu host', using '-machine gic-version=max' means there
is no guarantee of migration compatibility between QEMU versions;
in future 'max' might mean '4'.
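Conceptually the selection is (a sketch only; the KVM helper shown is
hypothetical):

    if (!strcmp(value, "max")) {
        if (kvm_enabled()) {
            gic_version = kvm_probe_best_gic();  /* hypothetical: as "host" */
        } else {
            gic_version = 3;                     /* TCG: GICv3 */
        }
    }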
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180308130626.12393-7-peter.maydell@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Now that we have a working '-cpu max', the linux-user-only
'any' CPU is pretty much the same thing, so implement it
that way.
For the moment we don't add any of the extra feature bits
to the system-emulation "max", because we don't set the
ID register bits we would need to advertise those
features as present.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180308130626.12393-5-peter.maydell@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Add support for "-cpu max" for ARM guests. This CPU type behaves
like "-cpu host" when KVM is enabled, and like a system CPU with
the maximum possible feature set otherwise. (Note that this means
it won't be migratable across versions, as we will likely add
features to it in future.)
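A sketch of the resulting instance init (based on this description and
on kvm_arm_set_cpu_features_from_host() introduced elsewhere in this
series; the TCG branch is illustrative):

    static void aarch64_max_initfn(Object *obj)
    {
        ARMCPU *cpu = ARM_CPU(obj);

        if (kvm_enabled()) {
            kvm_arm_set_cpu_features_from_host(cpu);  /* like -cpu host */
        } else {
            /* start from an existing CPU model, add features over time */
            aarch64_a57_initfn(obj);
        }
    }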
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180308130626.12393-4-peter.maydell@linaro.org
Move the definition of the 'host' cpu type into cpu.c, where all the
other CPU types are defined. We can do this now we've decoupled it
from the KVM-specific host feature probing. This means we now create
the type unconditionally (assuming we were built with KVM support at
all), but if you try to use it without -enable-kvm this will end
up in the "host cpu probe failed and KVM not enabled" path in
arm_cpu_realizefn(), which produces an appropriate error message.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180308130626.12393-3-peter.maydell@linaro.org
Currently we query the host CPU features in the class init function
for the TYPE_ARM_HOST_CPU class, so that we can later copy them
from the class object into the instance object in the object
instance init function. This is awkward for implementing "-cpu max",
which should work like "-cpu host" for KVM but like "cpu with all
implemented features" for TCG.
Move the place where we store the information about the host CPU from
a class object to static variables in kvm.c, and then in the instance
init function call a new kvm_arm_set_cpu_features_from_host()
function which will query the host kernel if necessary and then
fill in the CPU instance fields.
This allows us to drop the special class struct and class init
function for TYPE_ARM_HOST_CPU entirely.
We can't delay the probe until realize, because the ARM
instance_post_init hook needs to look at the feature bits we
set, so we need to do it in the initfn. This is safe because
the probing doesn't affect the actual VM state (it creates a
separate scratch VM to do its testing), but the probe might fail.
Because we can't report errors in retrieving the host features
in the initfn, we check this belatedly in the realize function
(the intervening code will be able to cope with the relevant
fields in the CPU structure being zero).
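In outline (a sketch of the shape, not the actual kvm.c code):

    static bool probed;
    static ARMHostCPUFeatures host_features;    /* filled once, on demand */

    void kvm_arm_set_cpu_features_from_host(ARMCPU *cpu)
    {
        if (!probed) {
            /* may fail; realize() checks for unset fields later */
            kvm_arm_get_host_cpu_features(&host_features);
            probed = true;
        }
        cpu->features |= host_features.features;
        /* ... copy the remaining probed fields into the instance ... */
    }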
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20180308130626.12393-2-peter.maydell@linaro.org
Spotted by ASAN:
QTEST_QEMU_BINARY=aarch64-softmmu/qemu-system-aarch64 tests/boot-serial-test
Direct leak of 48 byte(s) in 1 object(s) allocated from:
#0 0x7ff8a9b0ca38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
#1 0x7ff8a8ea7f75 in g_malloc0 ../glib/gmem.c:124
#2 0x55fef3d99129 in error_setv /home/elmarco/src/qemu/util/error.c:59
#3 0x55fef3d99738 in error_setg_internal /home/elmarco/src/qemu/util/error.c:95
#4 0x55fef323acb2 in load_elf_hdr /home/elmarco/src/qemu/hw/core/loader.c:393
#5 0x55fef2d15776 in arm_load_elf /home/elmarco/src/qemu/hw/arm/boot.c:830
#6 0x55fef2d16d39 in arm_load_kernel_notify /home/elmarco/src/qemu/hw/arm/boot.c:1022
#7 0x55fef3dc634d in notifier_list_notify /home/elmarco/src/qemu/util/notify.c:40
#8 0x55fef2fc3182 in qemu_run_machine_init_done_notifiers /home/elmarco/src/qemu/vl.c:2716
#9 0x55fef2fcbbd1 in main /home/elmarco/src/qemu/vl.c:4679
#10 0x7ff89dfed009 in __libc_start_main (/lib64/libc.so.6+0x21009)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A lot of ARM object files are linked into the executable unconditionally,
even though we have corresponding CONFIG switches like CONFIG_PXA2XX or
CONFIG_OMAP. We should make sure to use these switches in the Makefile so
that the users can disable certain unwanted boards and devices more easily.
While we're at it, also add some new switches for the boards that do not
have a CONFIG option yet.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1520266949-29817-1-git-send-email-thuth@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This changes the qemu signal frame layout to be more like the kernel's,
in that the various records are dynamically allocated rather than fixed
in place by a structure.
For now, all of the allocation is out of uc.tuc_mcontext.__reserved,
so the allocation is actually trivial. That will change with SVE support.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180303143823.27055-4-richard.henderson@linaro.org
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add the code needed to get a functional PCI subsystem when used in
conjunction with an upstream Linux guest (4.13+). Tested to work against
"e1000e" (network adapter, using MSI interrupts) as well as
"usb-ehci" (USB controller, using legacy PCI interrupts).
Based on the "i.MX6 Applications Processor Reference Manual" (Document
Number: IMX6DQRM Rev. 4) as well as the corresponding driver in the Linux
kernel (circa 4.13 - 4.16, found in drivers/pci/dwc/*).
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Block patches
# gpg: Signature made Fri Mar 9 15:40:32 2018 CET
# gpg: using RSA key F407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* mreitz/tags/pull-block-2018-03-09:
qemu-iotests: fix 203 migration completion race
iotests: Tweak 030 in order to trigger a race condition with parallel jobs
iotests: Skip test for ENOMEM error
iotests: Mark all tests executable
iotests: Test creating overlay when guest running
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Using a local m68k floatx80_tentox()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180305203910.10391-9-laurent@vivier.eu>
Using a local m68k floatx80_twotox()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180305203910.10391-8-laurent@vivier.eu>
Using a local m68k floatx80_etox()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180305203910.10391-7-laurent@vivier.eu>
Using a local m68k floatx80_log2()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180305203910.10391-6-laurent@vivier.eu>
Using a local m68k floatx80_log10()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180305203910.10391-5-laurent@vivier.eu>
Using a local m68k floatx80_logn()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180305203910.10391-4-laurent@vivier.eu>
Using a local m68k floatx80_lognp1()
[copied from previous:
Written by Andreas Grabher for Previous, NeXT Computer Emulator.]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180305203910.10391-3-laurent@vivier.eu>
There is a race between the test's 'query-migrate' QMP command after the
QMP 'STOP' event and completing the migration:
The test case invokes 'query-migrate' upon receiving 'STOP'. At this
point the migration thread may still be in the process of completing.
Therefore 'query-migrate' can return 'status': 'active' for a brief
window of time instead of 'status': 'completed'. This results in
qemu-iotests 203 hanging.
Solve the race by enabling the 'events' migration capability, which
causes QEMU to emit migration-specific QMP events that do not suffer
from this race condition. Wait for the QMP 'MIGRATION' event with
'status': 'completed'.
Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20180305155926.25858-1-stefanha@redhat.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
This patch tweaks TestParallelOps in iotest 030 so it allocates data
in smaller regions (256KB/512KB instead of 512KB/1MB) and the
block-stream job in test_stream_commit() only needs to copy data that
is at the very end of the image.
This way when the block-stream job is awakened it will finish right
away without any chance of being stopped by block_job_sleep_ns(). This
triggers the bug that was fixed by 3d5d319e12 and
1a63a90750 and is therefore a more useful test
case for parallel block jobs.
After this patch the aforementioned bug can also be reproduced with the
test_stream_parallel() test case.
Since with this change the stream job in test_stream_commit() finishes
early, this patch introduces a similar test case where both jobs are
slowed down so they can actually run in parallel.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Cc: John Snow <jsnow@redhat.com>
Message-id: 20180306130121.30243-1-berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
The AFL image is meant to exercise the code validating image size, which
doesn't work on 32 bit hosts or when out of memory (there is a large
allocation before the interesting point). So check for that and skip the
test, instead of faking the result.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 20180301011413.11531-1-famz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Most callers have their own checks, but something like this should also
be checked centrally. As it happens, x-blockdev-create can pass negative
image sizes to format drivers (because there is no QAPI type that would
reject negative numbers) and triggers the check added by this patch.
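The central check amounts to something like this at the top of
bdrv_truncate() (sketch):

    if (offset < 0) {
        error_setg(errp, "Image size cannot be negative");
        return -EINVAL;
    }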
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
If bdrv_truncate() is called, but the requested size is the same as
before, don't call posix_fallocate(), which returns -EINVAL for length
zero and would therefore make bdrv_truncate() fail.
The problem can be triggered by creating a zero-sized raw image with
'falloc' preallocation mode.
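Sketch of the guard in the falloc preallocation path (variable names
assumed):

    if (offset == current_length) {
        /* nothing to allocate; posix_fallocate() with len == 0
         * would fail with EINVAL */
        return 0;
    }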
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This adds the .bdrv_co_create driver callback to ssh, which enables
image creation over QMP.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Move the parsing of the QDict options up to the callers, in preparation
for the .bdrv_co_create implementation that directly gets a QAPI type.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
This makes the host-key-check option available in blockdev-add.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Create a BlockdevOptionsSsh object in connect_to_ssh() and take the
options from there. 'host_key_check' is still processed separately
because it's not in the schema yet.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
This adds the .bdrv_co_create driver callback to sheepdog, which enables
image creation over QMP.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
The "redundancy" option for Sheepdog image creation is currently a
string that can encode one or two integers depending on its format,
which at the same time implicitly selects a mode.
This patch turns it into a QAPI union and converts the string into such
a QAPI object before interpreting the values.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
This adds the .bdrv_co_create driver callback to nfs, which enables
image creation over QMP.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Using the QAPI visitor to turn all options into QAPI BlockdevOptionsNfs
simplifies the code a lot. It will also be useful for implementing the
QAPI based .bdrv_co_create callback.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
This is almost exactly the same code. The differences are that
qemu_rbd_connect() supports BlockdevOptionsRbd.server and that the cache
mode is set explicitly.
Supporting 'server' is a welcome new feature for image creation.
Caching is disabled by default, so leave it that way.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Now that the options are already available in qemu_rbd_open() and not
only parsed in qemu_rbd_connect(), we can assign s->snap and
s->image_name there instead of passing the fields by reference to
qemu_rbd_connect().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
This adds the .bdrv_co_create driver callback to rbd, which enables
image creation over QMP.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
With the conversion to a QAPI options object, the function is now
prepared to be used in a .bdrv_co_create implementation.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Instead of the QemuOpts in qemu_rbd_connect(), we want to use QAPI
objects. As a preparation, fetch those options directly from the QDict
that .bdrv_open() supports in the rbd driver and that are not in the
schema.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
The code to establish an RBD connection is duplicated between open and
create. In order to be able to share the code, factor out the code from
qemu_rbd_open() as a first step.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
This adds the .bdrv_co_create driver callback to gluster, which enables
image creation over QMP.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This adds the .bdrv_co_create driver callback to file-win32, which
enables image creation over QMP.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This adds the .bdrv_co_create driver callback to file, which enables
image creation over QMP.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This adds a synchronous x-blockdev-create QMP command that can create
qcow2 images on a given node name.
We don't want to block while creating an image, so this is not the final
interface in all aspects, but BlockdevCreateOptionsQcow2 and
.bdrv_co_create() are what they actually might look like in the end. In
any case, this should be good enough to test whether we interpret
BlockdevCreateOptions as we should.
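For reference, the driver callback added throughout this series has
roughly this shape (a sketch based on the description, not copied from
block_int.h):

    int coroutine_fn (*bdrv_co_create)(BlockdevCreateOptions *opts,
                                       Error **errp);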
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
We'll use a separate source file for image creation, and we need to
check there whether the requested driver is whitelisted.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Instead of manually creating the BlockdevCreateOptions object, use a
visitor to parse the given options into the QAPI object.
This involves translation from the old command line syntax to the syntax
mandated by the QAPI schema. Option names are still checked against
qcow2_create_opts, so only the old option names are allowed on the
command line, even if they are translated in qcow2_create().
In contrast, new option values are optionally recognised besides the old
values: 'compat' accepts 'v2'/'v3' as an alias for '0.10'/'1.1', and
'encrypt.format' accepts 'qcow' as an alias for 'aes' now.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
A few block drivers will need to rename .bdrv_create options for their
QAPIfication, so let's have a helper function for that.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Given a QemuOpts for a QemuOptsList that was merged from multiple
QemuOptsLists, this allows considering only those options that exist in
one specific list. Block drivers need this to separate format-layer create
options from protocol-level options.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Once qcow2_co_create() can be called directly on an already existing
node, we must provide the 'full' and 'falloc' preallocation modes
outside of creating the image on the protocol layer. Fortunately, we
now have preallocated truncate, which can provide this functionality.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Instead of passing the encryption format name and the QemuOpts down, use
the QCryptoBlockCreateOptions contained in BlockdevCreateOptions.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Instead of passing a separate BlockDriverState* into qcow2_co_create(),
make use of the BlockdevRef that is included in BlockdevCreateOptions.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
All of the simple options are now passed to qcow2_co_create() in a
BlockdevCreateOptions object. Still missing: node-name and the
encryption options.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Currently, qcow2_create() only parses the QemuOpts and then calls
qcow2_co_create() for the actual image creation, which includes both the
creation of the actual file on the file system and writing a valid empty
qcow2 image into that file.
The plan is that qcow2_co_create() becomes the function that implements
the functionality for a future 'blockdev-create' QMP command, which only
creates the qcow2 layer on an already opened file node.
This is a first step towards that goal: Let's move out anything that
deals with the protocol layer from qcow2_co_create() into
qcow2_create(). This means that qcow2_co_create() doesn't need a file
name any more.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
The functions originally known as qcow2_create() and qcow2_create2()
are now called qcow2_co_create_opts() and qcow2_co_create(), which
matches the names of the BlockDriver callbacks that they will implement
at the end of this patch series.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
This creates a BlockdevCreateOptions union type that will contain all of
the options for image creation. We'll start out with an empty struct
type BlockdevCreateNotSupported for all drivers.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
'qemu-img check' cannot detect if a snapshot's L1 table is corrupted.
This patch checks the table's offset and size and reports corruption
if the values are not valid.
This patch doesn't add code to fix that corruption yet, only to detect
and report it.
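Roughly, the check walks the snapshot table and validates each L1
table's offset and size, e.g. (a sketch; the validation helper is the
one generalized elsewhere in this series, and its exact name and
signature here are assumptions):

    for (i = 0; i < s->nb_snapshots; i++) {
        QCowSnapshot *sn = &s->snapshots[i];

        if (qcow2_validate_table(bs, sn->l1_table_offset, sn->l1_size,
                                 sizeof(uint64_t), QCOW_MAX_L1_SIZE,
                                 "Snapshot L1 table", NULL) < 0) {
            res->corruptions++;    /* report only, no repair yet */
        }
    }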
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This function deletes a snapshot from disk, removing its entry from
the snapshot table, freeing its L1 table and decreasing the refcounts
of all clusters.
The L1 table offset and size are however not validated. If we use
invalid values in this function we'll probably corrupt the image even
more, so we should return an error instead.
We now have a function to take care of this, so let's use it.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This function copies a snapshot's L1 table into the active one without
validating it first.
We now have a function to take care of this, so let's use it.
Cc: Eric Blake <eblake@redhat.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The inactive-l2 overlap check iterates over the L1 tables of all
snapshots, but it does not validate them first.
We now have a function to take care of this, so let's use it.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This function iterates over all snapshots of a qcow2 file in order to
expand all zero clusters, but it does not validate the snapshots' L1
tables first.
We now have a function to take care of this, so let's use it.
We can also take the opportunity to replace the sector-based
bdrv_read() with bdrv_pread().
Cc: Eric Blake <eblake@redhat.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This function checks that the size of a snapshot's L1 table is not too
large, but it doesn't validate the offset.
We now have a function to take care of this, so let's use it.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This function checks that the offset and size of a table are valid.
While the offset checks are fine, the size check is too generic, since
it only verifies that the total size in bytes fits in a 64-bit
integer. In practice all tables used in qcow2 have much smaller size
limits, so the size needs to be checked again for each table using its
actual limit.
This patch generalizes this function by allowing the caller to specify
the maximum size for that table. In addition to that it allows passing
an Error variable.
The function is also renamed and made public since we're going to use
it in other parts of the code.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
update_header_sync itself does not need to flush the caches to disk.
The only paths that allocate clusters are:
- bitmap_list_store with in_place=false, called by update_ext_header_and_dir
- store_bitmap_data, called by store_bitmap
- store_bitmap, called by qcow2_store_persistent_dirty_bitmaps and
followed by update_ext_header_and_dir
So in the end the central place where we need to flush the caches
is update_ext_header_and_dir.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If the bdrv_reopen_prepare helper isn't provided, the qemu-img commit
command fails to re-open the base layer after committing changes into
it. Provide a no-op implementation for the LUKS driver, since there
is no custom work that needs doing to re-open it.
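The no-op hook is essentially (a sketch; the function name assumes the
usual block/crypto.c prefix):

    static int block_crypto_reopen_prepare(BDRVReopenState *state,
                                           BlockReopenQueue *queue,
                                           Error **errp)
    {
        /* nothing to do here */
        return 0;
    }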
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This function is needed by upcoming m68k softfloat functions.
Source code copied from WinUAE (tag 3500).
(The WinUAE file has been copied from QEMU and has
the QEMU licensing notice.)
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180305203910.10391-2-laurent@vivier.eu>
QEMU RISC-V Emulation Support (RV64GC, RV32GC)
This release renames the SiFive machines to sifive_e and sifive_u
to represent the SiFive Everywhere and SiFive Unleashed platforms.
SiFive has configurable soft-core IP, so it is intended that these
machines will be extended to enable a variety of SiFive IP blocks.
The CPU definition infrastructure has been improved and there are
now vendor CPU modules including the SiFive E31, E51, U34 and U54
cores. The emulation accuracy for the E series has been improved
by disabling the MMU for the E series. S mode has been disabled on
cores that only support M mode and U mode. The two Spike machines
that support two privileged ISA versions have been coalesced into
one file. This series has Signed-off-by from the core contributors.
*** Known Issues ***
* Disassembler has some checkpatch warnings for the sake of code brevity
* scripts/qemu-binfmt-conf.sh has checkpatch warnings due to line length
* PMP (Physical Memory Protection) is as-of-yet unused and needs testing
*** Changelog ***
v8.2
* Rebase
v8.1
* Fix missed case of renaming spike_v1.9 to spike_v1.9.1
v8
* Added linux-user/riscv/target_elf.h during rebase
* Make resetvec configurable and clear mpp and mie on reset
* Use SiFive E31, E51, U34 and U54 cores in SiFive machines
* Define SiFive E31, E51, U34 and U54 cores
* Refactor CPU core definition in preparation for vendor cores
* Prevent S or U mode unless S or U extensions are present
* SiFive E Series cores have no MMU
* SiFive E Series cores have U mode
* Make privileged ISA v1.10 implicit in CPU types
* Remove DRAM_BASE and EXT_IO_BASE as they vary by machine
* Correctly handle mtvec and stvec alignment with respect to RVC
* Print more machine mode state in riscv_cpu_dump_state
* Make riscv_isa_string use compact extension order method
* Fix bug introduced in v6 RISCV_CPU_TYPE_NAME macro change
* Parameterize spike v1.9.1 config string
* Coalesce spike_v1.9.1 and spike_v1.10 machines
* Rename sifive_e300 to sifive_e, and sifive_u500 to sifive_u
v7
* Make spike_v1.10 the default machine
* Rename spike_v1.9 to spike_v1.9.1 to match privileged spec version
* Remove empty target/riscv/trace-events file
* Monitor ROM 32-bit reset code needs to be target endian
* Add TARGET_TIOCGPTPEER to linux-user/riscv/termbits.h
* Add -initrd support to the virt board
* Fix naming in spike machine interface header
* Update copyright notice on RISC-V Spike machines
* Update copyright notice on RISC-V HTIF Console device
* Change CPU Core and translator to GPLv2+
* Change RISC-V Disassembler to GPLv2+
* Change SiFive Test Finisher to GPLv2+
* Change SiFive CLINT to GPLv2+
* Change SiFive PRCI to GPLv2+
* Change SiFive PLIC to GPLv2+
* Change RISC-V spike machines to GPLv2+
* Change RISC-V virt machine to GPLv2+
* Change SiFive E300 machine to GPLv2+
* Change SiFive U500 machine to GPLv2+
* Change RISC-V Hart Array to GPLv2+
* Change RISC-V HTIF device to GPLv2+
* Change SiFiveUART device to GPLv2+
v6
* Drop IEEE 754-201x minimumNumber/maximumNumber for fmin/fmax
* Remove some unnecessary commented debug statements
* Change RISCV_CPU_TYPE_NAME to use riscv-cpu suffix
* Define all CPU variants for linux-user
* qemu_log calls require trailing \n
* Replace PLIC printfs with qemu_log
* Tear out unused HTIF code and eliminate shouting debug messages
* Fix illegal instruction when sfence.vma is passed (rs2) arguments
* Make updates to PTE accessed and dirty bits atomic
* Only require atomic PTE updates on MTTCG enabled guests
* Page fault if accessed or dirty bits can't be updated
* Fix get_physical_address PTE reads and writes on riscv32
* Remove erroneous comments from the PLIC
* Default enable MTTCG
* Make WFI less conservative
* Unify local interrupt handling
* Expunge HTIF interrupts
* Always access mstatus.mip under a lock
* Don't implement rdtime/rdtimeh in system mode (bbl emulates them)
* Implement instreth/cycleh for rv32 and always enable user-mode counters
* Add GDB stub support for reading and writing CSRs
* Rename ENABLE_CHARDEV #ifdef from HTIF code
* Replace bad HTIF ELF code with load_elf symbol callback
* Convert chained if else fault handlers to switch statements
* Use RISCV exception codes for linux-user page faults
v5
* Implement NaN-boxing for flw, set high order bits to 1
* Use float_muladd_negate_* flags to floatXX_muladd
* Use IEEE 754-201x minimumNumber/maximumNumber for fmin/fmax
* Fix TARGET_NR_syscalls
* Update linux-user/riscv/syscall_nr.h
* Fix FENCE.I, needs to terminate translation block
* Adjust unusual convention for interruptno >= 0
v4
* Add @riscv: since 2.12 to CpuInfoArch
* Remove misleading little-endian comment from load_kernel
* Rename cpu-model property to cpu-type
* Drop some unnecessary inline function attributes
* Don't allow GDB to set value of x0 register
* Remove unnecessary empty property lists
* Add Test Finisher device to implement poweroff in virt machine
* Implement priv ISA v1.10 trap and sret/mret xPIE/xIE behavior
* Store fflags data in fp_status
* Purge runtime users of helper_raise_exception
* Fix validate_csr
* Tidy gen_jalr
* Tidy immediate shifts
* Add gen_exception_inst_addr_mis
* Add gen_exception_debug
* Add gen_exception_illegal
* Tidy helper_fclass_*
* Split rounding mode setting to a new function
* Enforce MSTATUS_FS via TB flags
* Implement acquire/release barrier semantics
* Use atomic operations as required
* Fix FENCE and FENCE_I
* Remove commented code from spike machines
* PAGE_WRITE permissions can be set on loads if page is already dirty
* The result of format conversion on an NaN must be a quiet NaN
* Add missing process_queued_cpu_work to riscv linux-user
* Remove float(32|64)_classify from cpu.h
* Removed nonsensical unions aliasing the same type
* Use uintN_t instead of uintN_fast_t in fpu_helper.c
* Use macros for FPU exception values in softfloat_flags_to_riscv
* Move code to set round mode into set_fp_round_mode function
* Convert set_fp_exceptions from a macro to an inline function
* Convert round mode helper into an inline function
* Make fpu_helper ieee_rm array static const
* Include cpu_mmu_index in cpu_get_tb_cpu_state flags
* Eliminate MPRV influence on mmu_index
* Remove unrecoverable do_unassigned_access function
* Only update PTE accessed and dirty bits if necessary
* Remove unnecessary tlb_flush in set_mode as mode is in mmu_idx
* Remove buggy support for misa writes. misa writes are optional
and are not implemented in any known hardware
* Always set PTE read or execute permissions during page walk
* Reorder helper function declarations to match order in helper.c
* Remove redundant variable declaration in get_physical_address
* Remove duplicated code from get_physical_address
* Use mmu_idx instead of mem_idx in riscv_cpu_get_phys_page_debug
v3
* Fix indentation in PMP and HTIF debug macros
* Fix disassembler checkpatch open brace '{' on next line errors
* Fix trailing statements on next line in decode_inst_decompress
* NOTE: the other checkpatch issues have been reviewed previously
v2
* Remove redundant NULL terminators from disassembler register arrays
* Change disassembler register name arrays to const
* Refine disassembler internal function names
* Update dates in disassembler copyright message
* Remove #ifdef CONFIG_USER_ONLY version of cpu_has_work
* Use ULL suffix on 64-bit constants
* Move riscv_cpu_mmu_index from cpu.h to helper.c
* Move riscv_cpu_hw_interrupts_pending from cpu.h to helper.c
* Remove redundant TARGET_HAS_ICE from cpu.h
* Use qemu_irq instead of void* for irq definition in cpu.h
* Remove duplicate typedef from struct CPURISCVState
* Remove redundant g_strdup from cpu_register
* Remove redundant tlb_flush from riscv_cpu_reset
* Remove redundant mode calculation from get_physical_address
* Remove redundant debug mode printf and dcsr comment
* Remove redundant clearing of MSB for bare physical addresses
* Use g_assert_not_reached for invalid mode in get_physical_address
* Use g_assert_not_reached for unreachable checks in get_physical_address
* Use g_assert_not_reached for unreachable type in raise_mmu_exception
* Return exception instead of aborting for misaligned fetches
* Move exception defines from cpu.h to cpu_bits.h
* Remove redundant breakpoint control definitions from cpu_bits.h
* Implement riscv_cpu_unassigned_access exception handling
* Log and raise exceptions for unimplemented CSRs
* Match Spike HTIF exit behavior - don’t print TEST-PASSED
* Make frm,fflags,fcsr writes trap when mstatus.FS is clear
* Use g_assert_not_reached for unreachable invalid mode
* Make hret,uret,dret generate illegal instructions
* Move riscv_cpu_dump_state and int/fpr regnames to cpu.c
* Lift interrupt flag and mask into constants in cpu_bits.h
* Change trap debugging to use qemu_log_mask LOG_TRACE
* Change CSR debugging to use qemu_log_mask LOG_TRACE
* Change PMP debugging to use qemu_log_mask LOG_TRACE
* Remove commented code from pmp.c
* Change CpuInfoRISCV qapi schema docs to Since 2.12
* Change RV feature macro to use target_ulong cast
* Remove riscv_feature and instead use misa extension flags
* Make riscv_flush_icache_syscall a no-op
* Undo checkpatch whitespace fixes in unrelated linux-user code
* Remove redundant constants and tidy up cpu_bits.h
* Make helper_fence_i a no-op
* Move include "exec/cpu-all" to end of cpu.h
* Rename set_privilege to riscv_set_mode
* Move redundant forward declaration for cpu_riscv_translate_address
* Remove TCGV_UNUSED from riscv_translate_init
* Add comment to pmp.c stating the code is untested and currently unused
* Use ctz to simplify decoding of PMP NAPOT address ranges
* Change pmp_is_in_range to use than equal for end addresses
* Fix off by one error in pmp_update_rule
* Rearrange PMP_DEBUG so that formatting is compile-time checked
* Rearrange trap debugging so that formatting is compile-time checked
* Rearrange PLIC debugging so that formatting is compile-time checked
* Use qemu_log/qemu_log_mask for HTIF logging and debugging
* Move exception and interrupt names into cpu.c
* Add Palmer Dabbelt as a RISC-V Maintainer
* Rebase against current qemu master branch
v1
* initial version based on forward port from riscv-qemu repository
*** Background ***
"RISC-V is an open, free ISA enabling a new era of processor innovation
through open standard collaboration. Born in academia and research,
RISC-V ISA delivers a new level of free, extensible software and
hardware freedom on architecture, paving the way for the next 50 years
of computing design and innovation."
The QEMU RISC-V port has been developed and maintained out-of-tree for
several years by Sagar Karandikar and Bastian Koppelmann. The RISC-V
Privileged specification has evolved substantially over this period but
has recently been solidifying. The RISC-V Base ISA has been frozen for
some time and the Privileged ISA, GCC toolchain and Linux ABI are now
quite stable. I have recently joined Sagar and Bastian as a RISC-V QEMU
Maintainer and hope to support upstreaming the port.
There are multiple vendors taping out, preparing to ship, or shipping
silicon that implements the RISC-V Privileged ISA Version 1.10. There
are also several RISC-V Soft-IP cores implementing Privileged ISA
Version 1.10 that run on FPGA such as SiFive's Freedom U500 Platform
and the U54‑MC RISC-V Core IP, among many more implementations from a
variety of vendors. See https://riscv.org/ for more details.
RISC-V support was upstreamed in binutils 2.28 and GCC 7.1 in the first
half of 2017. RISC-V support is now available in LLVM top-of-tree and
the RISC-V Linux port was accepted into Linux 4.15-rc1 late last year
and is available in the Linux 4.15 release. GLIBC 2.27 added support
for the RISC-V ISA running on Linux (requires at least binutils-2.30,
gcc-7.3.0, and linux-4.15). We believe it is timely to submit the
RISC-V QEMU port for upstream review with the goal of incorporating
RISC-V support into the upcoming QEMU 2.12 release.
The RISC-V QEMU port is still under active development, mostly with
respect to device emulation, the addition of Hypervisor support as
specified in the RISC-V Draft Privileged ISA Version 1.11, and Vector
support once the first draft is finalized later this year. We believe
now is the appropriate time for RISC-V QEMU development to be carried
out in the main QEMU repository as the code will benefit from more
rigorous review. The RISC-V QEMU port currently supports all the ISA
extensions that have been finalized and frozen in the Base ISA.
Blog post about recent additions to RISC-V QEMU: https://goo.gl/fJ4zgk
The RISC-V QEMU wiki: https://github.com/riscv/riscv-qemu/wiki
Instructions for building a busybox+dropbear root image, BBL (Berkeley
Boot Loader) and linux kernel image for use with the RISC-V QEMU
'virt' machine: https://github.com/michaeljclark/busybear-linux
*** Overview ***
The RISC-V QEMU port implements the following specifications:
* RISC-V Instruction Set Manual Volume I: User-Level ISA Version 2.2
* RISC-V Instruction Set Manual Volume II: Privileged ISA Version 1.9.1
* RISC-V Instruction Set Manual Volume II: Privileged ISA Version 1.10
The RISC-V QEMU port supports the following instruction set extensions:
* RV32GC with Supervisor-mode and User-mode (RV32IMAFDCSU)
* RV64GC with Supervisor-mode and User-mode (RV64IMAFDCSU)
The RISC-V QEMU port adds the following targets to QEMU:
* riscv32-softmmu
* riscv64-softmmu
* riscv32-linux-user
* riscv64-linux-user
The RISC-V QEMU port supports the following hardware:
* HTIF Console (Host Target Interface)
* SiFive CLINT (Core Local Interruptor) for Timer interrupts and IPIs
* SiFive PLIC (Platform Level Interrupt Controller)
* SiFive Test (Test Finisher) for exiting simulation
* SiFive UART, PRCI, AON, PWM, QSPI support is partially implemented
* VirtIO MMIO (GPEX PCI support will be added in a future patch)
* Generic 16550A UART emulation using 'hw/char/serial.c'
* MTTCG and SMP support (PLIC and CLINT) on the 'virt' machine
The RISC-V QEMU full system emulator supports 5 machines:
* 'spike_v1.9.1', CLINT, PLIC, HTIF console, config-string, Priv v1.9.1
* 'spike_v1.10', CLINT, PLIC, HTIF console, device-tree, Priv v1.10
* 'sifive_e', CLINT, PLIC, SiFive UART, HiFive1 compat, Priv v1.10
* 'sifive_u', CLINT, PLIC, SiFive UART, device-tree, Priv v1.10
* 'virt', CLINT, PLIC, 16550A UART, VirtIO, device-tree, Priv v1.10
This is a list of RISC-V QEMU Port Contributors:
* Alex Suykov
* Andreas Schwab
* Antony Pavlov
* Bastian Koppelmann
* Bruce Hoult
* Chih-Min Chao
* Daire McNamara
* Darius Rad
* David Abdurachmanov
* Hesham Almatary
* Ivan Griffin
* Jim Wilson
* Kito Cheng
* Michael Clark
* Palmer Dabbelt
* Richard Henderson
* Sagar Karandikar
* Shea Levy
* Stefan O'Rear
Notes:
* contributor email addresses available off-list on request.
* checkpatch has been run on all 23 patches.
* checkpatch exceptions are noted in patches that have errors.
* passes "make check" on full build for all targets
* tested riscv-linux-4.6.2 on 'spike_v1.9.1' machine
* tested riscv-linux-4.15 on 'spike_v1.10' and 'virt' machines
* tested SiFive HiFive1 binaries in 'sifive_e' machine
* tested RV64 on 32-bit i386
This patch series includes the following patches:
# gpg: Signature made Thu 08 Mar 2018 19:40:20 GMT
# gpg: using DSA key 6BF1D7B357EF3E4F
# gpg: Good signature from "Michael Clark <michaeljclark@mac.com>"
# gpg: aka "Michael Clark <mjc@sifive.com>"
# gpg: aka "Michael Clark <michael@metaparadigm.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 7C99 930E B17C D8BA 073D 5EFA 6BF1 D7B3 57EF 3E4F
* remotes/riscv/tags/riscv-qemu-upstream-v8.2: (23 commits)
RISC-V Build Infrastructure
SiFive Freedom U Series RISC-V Machine
SiFive Freedom E Series RISC-V Machine
SiFive RISC-V PRCI Block
SiFive RISC-V UART Device
RISC-V VirtIO Machine
SiFive RISC-V Test Finisher
RISC-V Spike Machines
SiFive RISC-V PLIC Block
SiFive RISC-V CLINT Block
RISC-V HART Array
RISC-V HTIF Console
Add symbol table callback interface to load_elf
RISC-V Linux User Emulation
RISC-V Physical Memory Protection
RISC-V TCG Code Generation
RISC-V GDB Stub
RISC-V FPU Support
RISC-V CPU Helpers
RISC-V Disassembler
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Fixes and cleanups for the 2.12 softfreeze.
# gpg: Signature made Thu 08 Mar 2018 17:53:14 GMT
# gpg: using RSA key DECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>"
# gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# gpg: aka "Cornelia Huck <cohuck@kernel.org>"
# gpg: aka "Cornelia Huck <cohuck@redhat.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF
* remotes/cohuck/tags/s390x-20180308:
s390x/virtio: Convert virtio-ccw from *_exit to *_unrealize
pc-bios/s390-ccw: Move string arrays from bootmap header to .c file
s390x/sclp: clean up sclp masks
s390x/sclp: proper support of larger send and receive masks
vfio-ccw: license text should indicate GPL v2 or later
s390x/sclpconsole: Remove dead code - remove exit handlers
numa: we don't implement NUMA for s390x
hw/s390x: Add the possibility to specify the netboot image on the command line
target/s390x: Remove leading underscores from #defines
s390/ipl: only print boot menu error if -boot menu=on was specified
hw/s390x/ipl: Bail out if the network bootloader can not be found
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit 00d09fdbba ("vl: pause vcpus before
stopping iothreads") and commit dce8921b2b
("iothread: Stop threads before main() quits") tried to work around the
fact that emulation was still active during termination by stopping
iothreads. They suffer from race conditions:
1. virtio_scsi_handle_cmd_vq() racing with iothread_stop_all() hits the
virtio_scsi_ctx_check() assertion failure because the BDS AioContext
has been modified by iothread_stop_all().
2. Guest vq kick racing with main loop termination leaves a readable
ioeventfd that is handled by the next aio_poll() when external
clients are enabled again, resulting in unwanted emulation activity.
This patch obsoletes those commits by fully disabling emulation activity
when vcpus are stopped.
Use the new vm_shutdown() function instead of pause_all_vcpus() so that
vm change state handlers are invoked too. Virtio devices will now stop
their ioeventfds, preventing further emulation activity after vm_stop().
Note that vm_stop(RUN_STATE_SHUTDOWN) cannot be used because it emits a
QMP STOP event that may affect existing clients.
It is no longer necessary to call replay_disable_events() directly since
vm_shutdown() does so already.
Drop iothread_stop_all() since it is no longer used.
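A minimal sketch of the intended teardown ordering, assuming the vm_shutdown()
semantics described above (an illustration, not the actual hunk):
'''
static void main_loop_cleanup_sketch(void)
{
    /* Stop vcpus and run vm change state handlers; virtio devices stop
     * their ioeventfds and replay events are disabled as a side effect. */
    vm_shutdown();

    /* Safe now: no further emulation activity can reach the block layer. */
    bdrv_close_all();
}
'''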
Cc: Fam Zheng <famz@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180307144205.20619-5-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If the main loop thread invokes .ioeventfd_stop() just as the vq handler
function begins in the IOThread then the handler may lose the race for
the AioContext lock. By the time the vq handler is able to acquire the
AioContext lock the ioeventfd has already been removed and the handler
isn't supposed to run anymore!
Use the new aio_wait_bh_oneshot() function to perform ioeventfd removal
from within the IOThread. This way no races with the vq handler are
possible.
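A hedged sketch of the pattern; the handler body and names are illustrative,
and aio_wait_bh_oneshot() is assumed to take the target AioContext, a BH
callback and an opaque pointer as described above:
'''
/* Runs inside the IOThread's AioContext, so it cannot race with a vq
 * handler that is about to take the AioContext lock. */
static void dataplane_stop_bh(void *opaque)
{
    VirtIODevice *vdev = opaque;

    (void)vdev; /* detach the host notifiers / ioeventfd handlers here */
}

static void dataplane_stop_sketch(VirtIODevice *vdev, AioContext *ctx)
{
    /* Schedule the BH in the IOThread and wait for it to complete. */
    aio_wait_bh_oneshot(ctx, dataplane_stop_bh, vdev);
}
'''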
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180307144205.20619-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If the main loop thread invokes .ioeventfd_stop() just as the vq handler
function begins in the IOThread then the handler may lose the race for
the AioContext lock. By the time the vq handler is able to acquire the
AioContext lock the ioeventfd has already been removed and the handler
isn't supposed to run anymore!
Use the new aio_wait_bh_oneshot() function to perform ioeventfd removal
from within the IOThread. This way no races with the vq handler are
possible.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20180307144205.20619-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Commit 5b2ffbe4d9 ("virtio-blk: dataplane:
notify guest as a batch") deferred guest notification to a BH in order to
batch notifications, with the purpose of avoiding flooding the guest with
interrupts.
This optimization came with a cost. The average latency perceived in the
guest is increased by a few microseconds, but also, when multiple IO
operations finish at the same time, the guest won't be notified until
all completions from each operation have been run. In contrast,
virtio-scsi issues the notification at the end of each completion.
On the other hand, nowadays we have the EVENT_IDX feature that allows a
better coordination between QEMU and the Guest OS to avoid sending
unnecessary interruptions.
With this change, virtio-blk/dataplane only batches notifications if the
EVENT_IDX feature is not present.
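A hedged sketch of the resulting decision; the s->bh and req->vq fields are
illustrative placeholders for the dataplane state:
'''
if (virtio_vdev_has_feature(vdev, VIRTIO_RING_F_EVENT_IDX)) {
    /* EVENT_IDX already suppresses unneeded interrupts: notify at the
     * end of each completion, as virtio-scsi does. */
    virtio_notify_irqfd(vdev, req->vq);
} else {
    /* No EVENT_IDX negotiated: keep deferring notification to the BH. */
    qemu_bh_schedule(s->bh);
}
'''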
Some numbers obtained with fio (ioengine=sync, iodepth=1, direct=1):
- Test specs:
* fio-3.4 (ioengine=sync, iodepth=1, direct=1)
* qemu master
* virtio-blk with a dedicated iothread (default poll-max-ns)
* backend: null_blk nr_devices=1 irqmode=2 completion_nsec=280000
* 8 vCPUs pinned to isolated physical cores
* Emulator and iothread also pinned to separate isolated cores
* variance between runs < 1%
- Not patched
* numjobs=1: lat_avg=327.32 irqs=29998
* numjobs=4: lat_avg=337.89 irqs=29073
* numjobs=8: lat_avg=342.98 irqs=28643
- Patched:
* numjobs=1: lat_avg=323.92 irqs=30262
* numjobs=4: lat_avg=332.65 irqs=29520
* numjobs=8: lat_avg=335.54 irqs=29323
Signed-off-by: Sergio Lopez <slp@redhat.com>
Message-id: 20180307114459.26636-1-slp@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Starting qemu with the following arguments causes qemu to segfault:
... -device lsi,id=lsi0 -drive file=iscsi:<...>,format=raw,if=none,node-name=
iscsi1 -device scsi-block,bus=lsi0.0,id=<...>,drive=iscsi1
This patch fixes blk_aio_ioctl() so it does not pass stack addresses to
blk_aio_ioctl_entry() which may be invoked after blk_aio_ioctl() returns. More
details about the bug follow.
blk_aio_ioctl() invokes blk_aio_prwv() with blk_aio_ioctl_entry as the
coroutine parameter. blk_aio_prwv() ultimately calls aio_co_enter().
When blk_aio_ioctl() is executed from within a coroutine context (e.g.
iscsi_bh_cb()), aio_co_enter() adds the coroutine (blk_aio_ioctl_entry) to
the current coroutine's wakeup queue. blk_aio_ioctl() then returns.
When blk_aio_ioctl_entry() executes later, it accesses an invalid pointer:
    ...
    BlkRwCo *rwco = &acb->rwco;
    rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset,
                             rwco->qiov->iov[0].iov_base);  /* <--- qiov is invalid here */
    ...
In the case when blk_aio_ioctl() is called from a non-coroutine context,
blk_aio_ioctl_entry() executes immediately. But if bdrv_co_ioctl() calls
qemu_coroutine_yield(), blk_aio_ioctl() will return. When the coroutine
execution is complete, control returns to blk_aio_ioctl_entry() after the call
to blk_co_ioctl(). There is no invalid reference after this point, but the
function is still holding on to invalid pointers.
The fix is to change blk_aio_prwv() to accept a void pointer for the IO buffer
rather than a QEMUIOVector. blk_aio_prwv() passes this through in BlkRwCo and the
coroutine function casts it to QEMUIOVector or uses the void pointer directly.
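A hedged sketch of the shape of the fix (struct layout abridged and not
necessarily the exact patch):
'''
typedef struct BlkRwCo {
    BlockBackend *blk;
    int64_t offset;
    void *iobuf;               /* was: QEMUIOVector *qiov */
    int ret;
    BdrvRequestFlags flags;
} BlkRwCo;

static void blk_aio_ioctl_entry(void *opaque)
{
    BlkAioEmAIOCB *acb = opaque;
    BlkRwCo *rwco = &acb->rwco;

    /* iobuf remains valid even when this runs after blk_aio_ioctl()
     * has already returned. */
    rwco->ret = blk_co_ioctl(rwco->blk, rwco->offset, rwco->iobuf);
}
'''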
Signed-off-by: Deepa Srinivasan <deepa.srinivasan@oracle.com>
Signed-off-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Mark Kanda <mark.kanda@oracle.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
bootmap.h can currently only be included once - otherwise the linker
complains about multiple definitions of the "magic" strings. It's a
bad style to define string arrays in header files, so let's better
move these to the bootmap.c file instead where they are used.
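A hedged illustration of the pattern (the identifier and value are made up,
not the actual symbols):
'''
/* bootmap.h: only declare the array */
extern const char vol_magic[4];

/* bootmap.c: define it exactly once */
const char vol_magic[4] = "MAGC";
'''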
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1520317081-5341-1-git-send-email-thuth@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Until 67915de9f0 ("s390x/event-facility: variable-length event masks")
we only supported sclp event masks with a size of exactly 4 bytes, even
though the architecture allows the guests to set up sclp event masks
from 1 to 1021 bytes in length.
After that patch, the behaviour was almost compliant, but some issues
were still remaining, in particular regarding the handling of selective
reads and migration.
When setting the sclp event mask, a mask size is also specified. Until
now we only considered the size in order to decide which bits to save
in the internal state. On the other hand, when a guest performs a
selective read, it sends a mask, but it does not specify a size; the
implied size is the size of the last mask that has been set.
Specifying bits in the mask of selective read that are not available in
the internal mask should return an error, and bits past the end of the
mask should obviously be ignored. This can only be achieved by keeping
track of the length of the mask.
The mask length is thus now part of the internal state that needs to be
migrated.
This patch fixes the handling of selective reads, whose size will now
match the length of the event mask, as per architecture.
While the default behaviour is to be compliant with the architecture,
when using older machine models the old broken behaviour is selected
(allowing only masks of size exactly 4), in order to be able to migrate
toward older versions.
Fixes: 67915de9f0 ("s390x/event-facility: variable-length event masks")
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Message-Id: <1519407778-23095-2-git-send-email-imbrenda@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Right now it is possible to crash QEMU for s390x by providing e.g.
-numa node,nodeid=0,cpus=0-1
The problem is that numa.c uses mc->cpu_index_to_instance_props as an
indicator of whether NUMA is supported by a machine type. We don't
implement NUMA for s390x ("topology") yet. However we need
mc->cpu_index_to_instance_props for query-cpus.
So let's fix this case by also checking for mc->get_default_cpu_node_id,
which will be needed by machine_set_cpu_numa_node(). With the fix we get
a graceful error instead:
qemu-system-s390x: -numa node,nodeid=0,cpus=0-1: NUMA is not supported by
this machine-type
While at it, make s390_cpu_index_to_props() look like on other
architectures.
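A hedged sketch of the extra check (not the exact hunk):
'''
/* Treat NUMA as supported only when the machine provides both hooks that
 * the NUMA code relies on, not just the one needed by query-cpus. */
if (!mc->cpu_index_to_instance_props || !mc->get_default_cpu_node_id) {
    error_report("NUMA is not supported by this machine-type");
    exit(1);
}
'''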
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180227110255.20999-1-david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The file name of the netboot binary is currently hard-coded to
"s390-netboot.img", without a possibility for the user to select
an alternative firmware image here. That's unfortunate, especially
since the basics are already there: The filename is a property of
the s390-ipl device. So we just have to add a check whether the user
already provided the property and only set the default if the string
is still empty. Now it is possible to select a different firmware
image with "-global s390-ipl.netboot_fw=/path/to/s390-netboot.img".
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1519731154-3127-1-git-send-email-thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We should not use leading underscores followed by a capital letter
in #defines since such identifiers are reserved by the C standard.
For ASCE_ORIGIN, REGION_ENTRY_ORIGIN and SEGMENT_ENTRY_ORIGIN I also
added parentheses around the value to silence an error message from
checkpatch.pl.
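A hedged before/after illustration (the value shown is made up):
'''
#define _ASCE_ORIGIN   0xfffffffffffff000ULL    /* before: reserved identifier */
#define ASCE_ORIGIN   (0xfffffffffffff000ULL)   /* after: no leading underscore,
                                                   value parenthesized */
'''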
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1520227018-4061-1-git-send-email-thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
It is possible that certain QEMU configurations may not
create an IPLB (such as when -kernel is provided). In
this case, a misleading error message will be printed
stating that the "boot menu is not supported for this
device type".
To amend this, only print this message iff boot menu=on
was provided on the commandline. Otherwise, return silently.
While we're at it, remove trailing periods from error
messages.
Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Message-Id: <1519760121-24594-1-git-send-email-walling@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
If QEMU fails to load 's390-netboot.img', the guest firmware currently
loops forever and just floods the console with "Network boot device
detected" messages. The code in ipl.c apparently already tried to stop
the VM with vm_stop() in this case, but this is in vain since the run
state is later reset due to a call to vm_start() from vl.c again.
To avoid the ugly firmware loop, let's simply exit QEMU directly instead
since it just does not make sense to continue if the required firmware
image can not be loaded. While we're at it, also add the file name of
the netboot binary to the error message, so that the user has a better
hint about what is missing.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1519725913-24852-1-git-send-email-thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Something has happened to the old emdebian setup which means it no
longer builds. Let's disable the shippable builds which are always
failing.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Merge tpm 2018/03/07
# gpg: Signature made Wed 07 Mar 2018 12:42:13 GMT
# gpg: using RSA key 75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <stefanb@linux.vnet.ibm.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2018-03-07-1:
tpm: convert tpm_tis.c to use trace-events
tpm: convert tpm_emulator.c to use trace-events
tpm: convert tpm_util.c to use trace-events
tpm: convert tpm_passthrough.c to use trace-events
tpm: convert tpm_crb.c to use trace-events
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Since commit af7a06bac7:
`casa [..](10), .., ..` (and probably other alternate space instructions)
triggers a data access exception when the MMU is disabled.
When we enter get_asi(...), dc->mem_idx is set to MMU_PHYS_IDX when the MMU
is disabled. Just keep mem_idx unchanged in this case so we pass through the
MMU when it is disabled.
Signed-off-by: KONRAD Frederic <frederic.konrad@adacore.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
The global hack for creating SCSI devices has recently been removed,
but this apparently broke SCSI devices on some boards that were not
ready for this change yet. For the sun4m machines you now get:
$ sparc-softmmu/qemu-system-sparc -boot d -cdrom x.iso
qemu-system-sparc: -cdrom x.iso: machine type does not support if=scsi,bus=0,unit=2
Fix it by calling scsi_bus_legacy_handle_cmdline() after creating the
corresponding SCSI controller.
Reported-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Fixes: 1454509726
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Change all fprintf(stderr...) calls in hw/i386/multiboot.c to call
error_report() instead, including the mb_debug macro. Remove the "\n"
from strings passed to all modified calls, since error_report() appends
one.
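A hedged before/after example of the conversion (the message text is
illustrative):
'''
fprintf(stderr, "multiboot: kernel file missing\n");   /* before */
error_report("multiboot: kernel file missing");        /* after: no trailing \n */
'''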
Signed-off-by: Jack Schwartz <jack.schwartz@oracle.com>
Reviewed-by: Daniel Kiper <daniel.kiper@oracle.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This adds RISC-V into the build system enabling the following targets:
- riscv32-softmmu
- riscv64-softmmu
- riscv32-linux-user
- riscv64-linux-user
This adds defaults configs for RISC-V, enables the build for the RISC-V
CPU core, hardware, and Linux User Emulation. The 'qemu-binfmt-conf.sh'
script is updated to add the RISC-V ELF magic.
Expected checkpatch errors for consistency reasons:
ERROR: line over 90 characters
FILE: scripts/qemu-binfmt-conf.sh
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Michael Clark <mjc@sifive.com>
This provides a RISC-V Board compatible with the SiFive Freedom U SDK.
The following machine is implemented:
- 'sifive_u'; CLINT, PLIC, UART, device-tree
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
This provides a RISC-V Board compatible with the SiFive Freedom E SDK.
The following machine is implemented:
- 'sifive_e'; CLINT, PLIC, UART, AON, GPIO, QSPI, PWM
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
Simple model of the PRCI (Power, Reset, Clock, Interrupt) to emulate
register reads made by the SDK BSP.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
QEMU model of the UART on the SiFive E300 and U500 series SOCs.
BBL supports the SiFive UART for early console access via the SBI
(Supervisor Binary Interface) and the linux kernel SBI console.
The SiFive UART implements the pre-QOM legacy interface, consistent
with the 16550a UART in 'hw/char/serial.c'.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Stefan O'Rear <sorear2@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
RISC-V machine with device-tree, 16550a UART and VirtIO MMIO.
The following machine is implemented:
- 'virt'; CLINT, PLIC, 16550A UART, VirtIO MMIO, device-tree
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
RISC-V machines compatible with Spike, aka riscv-isa-sim, the RISC-V
Instruction Set Simulator. The following machines are implemented:
- 'spike_v1.9.1'; HTIF console, config-string, Privileged ISA Version 1.9.1
- 'spike_v1.10'; HTIF console, device-tree, Privileged ISA Version 1.10
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Michael Clark <mjc@sifive.com>
The PLIC (Platform Level Interrupt Controller) device provides a
parameterizable interrupt controller based on SiFive's PLIC specification.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Stefan O'Rear <sorear2@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
The CLINT (Core Local Interruptor) device provides real-time clock, timer
and interprocessor interrupts based on SiFive's CLINT specification.
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Stefan O'Rear <sorear2@gmail.com>
Signed-off-by: Palmer Dabbelt <palmer@sifive.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
HTIF (Host Target Interface) provides console emulation for QEMU. HTIF
allows identical copies of BBL (Berkeley Boot Loader) and linux to run
on both Spike and QEMU. BBL provides HTIF console access via the
SBI (Supervisor Binary Interface) and the linux kernel SBI console.
The HTIF chardev implements the pre-QOM legacy interface, consistent
with the 16550a UART in 'hw/char/serial.c'.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Stefan O'Rear <sorear2@gmail.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
The RISC-V HTIF (Host Target Interface) console device requires access
to the symbol table to locate the 'tohost' and 'fromhost' symbols.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Clark <mjc@sifive.com>
Implements the physical memory protection extension as specified in
Privileged ISA Version 1.10.
PMP (Physical Memory Protection) is as yet unused and needs testing.
The SiFive verification team have PMP test cases that will be run.
Nothing currently depends on PMP support. It would be preferable to keep
the code in-tree for folks that are interested in RISC-V PMP support.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Daire McNamara <daire.mcnamara@emdalo.com>
Signed-off-by: Ivan Griffin <ivan.griffin@emdalo.com>
Signed-off-by: Michael Clark <mjc@sifive.com>
TCG code generation for the RV32IMAFDC and RV64IMAFDC. The QEMU
RISC-V code generator has complete coverage for the Base ISA v2.2,
Privileged ISA v1.9.1 and Privileged ISA v1.10:
- RISC-V Instruction Set Manual Volume I: User-Level ISA Version 2.2
- RISC-V Instruction Set Manual Volume II: Privileged ISA Version 1.9.1
- RISC-V Instruction Set Manual Volume II: Privileged ISA Version 1.10
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Bastian Koppelmann <kbastian@mail.uni-paderborn.de>
Signed-off-by: Sagar Karandikar <sagark@eecs.berkeley.edu>
Signed-off-by: Michael Clark <mjc@sifive.com>
The RISC-V disassembler has no dependencies outside of the 'disas'
directory so it can be applied independently. The majority of the
disassembler is machine-generated from instruction set metadata:
- https://github.com/michaeljclark/riscv-meta
Expected checkpatch errors for consistency and brevity reasons:
ERROR: line over 90 characters
ERROR: trailing statements should be on next line
ERROR: space prohibited between function name and open parenthesis '('
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Clark <mjc@sifive.com>
Using types that are defined by QEMU in trace events caused build failures
for the UST trace backend:
In file included from trace-ust-all.c:13:0:
trace-ust-all.h:11844:206: error: unknown type name ‘hwaddr’
It only knows about C built-in types, and any types that are pulled in
from includes of qemu-common.h and lttng/tracepoint.h. This does not
include the 'hwaddr' type, so replace it with a uint64_t which is what
exec/hwaddr.h defines 'hwaddr' as. This fixes the build failure
introduced by
commit 9eb8040c2d
Author: Peter Maydell <peter.maydell@linaro.org>
Date: Fri Mar 2 10:45:39 2018 +0000
hw/misc/tz-ppc: Model TrustZone peripheral protection controller
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180306134317.836-1-berrange@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
basename(3) and dirname(3) modify their argument and may return
pointers to statically allocated memory which may be overwritten by
subsequent calls.
g_path_get_basename and g_path_get_dirname have no such issues, and are
therefore preferable.
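A hedged usage sketch of the glib helpers (error handling omitted):
'''
char *dir  = g_path_get_dirname(path);    /* newly allocated, path untouched */
char *base = g_path_get_basename(path);   /* newly allocated, path untouched */

/* ... use dir and base ... */

g_free(dir);
g_free(base);
'''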
Signed-off-by: Julia Suvorova <jusual@mail.ru>
Message-Id: <1519888086-4207-1-git-send-email-jusual@mail.ru>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are two issues with the documentation of the --balloon parameter:
First, "--balloon none" is simply doing nothing. Even if a machine had a
balloon device by default, this option is not disabling anything, it is
simply ignored. Thus let's simply drop this option from the documentation
to avoid confusing the users (but keep the code in vl.c for backward
compatibility).
Second, the documentation claims that "--balloon virtio" is the default
mode, but this is not true anymore since commit 382f074371.
Since that commit, the option also has no real use case anymore, since
you can simply use "--device virtio-balloon" nowadays instead. Thus to
simplify our complex parameter zoo a little bit, let's deprecate the
parameter now and tell the user to use "--device virtio-balloon"
instead.
Fixes: 382f074371
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1519796303-13257-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The use of WHvGetExitContextSize will break ABI compatibility if the platform
changes the context size while an already compiled qemu executable is not
recompiled. To avoid this we now use sizeof and let the platform determine
which version of the structure was passed for ABI compatibility.
Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
Message-Id: <1519665216-1078-8-git-send-email-juterry@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Justin Terry (VM) via Qemu-devel <qemu-devel@nongnu.org>
Fixes an issue where, if the tpr is assigned to the array but is not a
different value from what is already expected on the vp, the code will skip
incrementing reg_count. In this case it is possible that we set an invalid
memory section on the next call for DeliverabilityNotifications, which was
not expected.
The fix is to use a local variable to store the temporary tpr and only update
the array if the local tpr value is different from the vp context.
Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
Message-Id: <1519665216-1078-7-git-send-email-juterry@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Justin Terry (VM) via Qemu-devel <qemu-devel@nongnu.org>
This reverts commit 906548689e.
Even with -Og, the debug experience is noticeably worse
because gdb shows a lot more "<optimised out>" variables and
function arguments.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a check for `while` and `for` statements whose condition spans more than
one line.
The previous checkpatch.pl could check whether an `if` statement whose
condition spans more than one line is missing braces around its block, like
this:
'''
if (cond1 ||
cond2)
statement;
'''
But it doesn't do the same check for `for` and `while` statements.
Use `(?:...)` instead of `(...)` in the regex pattern, because `(?:...)` is
faster and avoids unwanted side effects.
Suggested-by: Stefan Hajnoczi <stefanha@redhat.com>
Suggested-by: Eric Blake <eblake@redhat.com>
Suggested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Su Hang <suhang16@mails.ucas.ac.cn>
Message-Id: <1520319890-19761-1-git-send-email-suhang16@mails.ucas.ac.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
address_space_rw is calling address_space_to_flatview but it can
be called outside the RCU lock. To fix it, transform flatview_rw
into address_space_rw, since flatview_rw is otherwise unused.
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
address_space_map is calling address_space_to_flatview but it can
be called outside the RCU lock. The function itself is calling
rcu_read_lock/rcu_read_unlock, just in the wrong place, so the
fix is easy.
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
address_space_access_valid is calling address_space_to_flatview but it can
be called outside the RCU lock. To fix it, push the rcu_read_lock/unlock
pair up from flatview_access_valid to address_space_access_valid.
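A hedged sketch of the pattern these commits apply, assuming the
pre-MemTxAttrs signature:
'''
bool address_space_access_valid(AddressSpace *as, hwaddr addr,
                                int len, bool is_write)
{
    FlatView *fv;
    bool result;

    rcu_read_lock();
    fv = address_space_to_flatview(as);   /* now resolved under the RCU lock */
    result = flatview_access_valid(fv, addr, len, is_write);
    rcu_read_unlock();

    return result;
}
'''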
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
address_space_read is calling address_space_to_flatview but it can
be called outside the RCU lock. To fix it, push the rcu_read_lock/unlock
pair up from flatview_read_full to address_space_read's constant size
fast path and address_space_read_full.
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
address_space_write is calling address_space_to_flatview but it can
be called outside the RCU lock. To fix it, push the rcu_read_lock/unlock
pair up from flatview_write to address_space_write.
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These accessors are called from inlined functions, and the call sequence
is much more expensive than just inlining the access. Move the
struct declaration to memory-internal.h so that exec.c and memory.c
can both use an inline function.
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The MemoryListener is registered on address_space_memory, there is
not much to assert. This currently works because the callback
is invoked only once when the listener is registered, but section->fv
is the _new_ FlatView, not the old one on later calls and that
would break.
This confines address_space_to_flatview to exec.c and memory.c.
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fix the following ASAN reports:
==20125==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 24 byte(s) in 1 object(s) allocated from:
#0 0x7f0faea03a38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
#1 0x7f0fae450f75 in g_malloc0 ../glib/gmem.c:124
#2 0x562fffd526fc in machine_start /home/elmarco/src/qemu/tests/sdhci-test.c:180
Indirect leak of 152 byte(s) in 1 object(s) allocated from:
#0 0x7f0faea03850 in malloc (/lib64/libasan.so.4+0xde850)
#1 0x7f0fae450f0c in g_malloc ../glib/gmem.c:94
#2 0x562fffd5d21d in qpci_init_pc /home/elmarco/src/qemu/tests/libqos/pci-pc.c:122
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180215212552.26997-7-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fixes the following ASAN report:
Direct leak of 128 byte(s) in 8 object(s) allocated from:
#0 0x7fefce311850 in malloc (/lib64/libasan.so.4+0xde850)
#1 0x7fefcdd5ef0c in g_malloc ../glib/gmem.c:94
#2 0x559b976faff0 in create_ahci_io_test /home/elmarco/src/qemu/tests/ahci-test.c:1810
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180215212552.26997-6-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since 218bb57dd7, the -fsanitize=address
check fails with:
config-temp/qemu-conf.c:3:20: error: integer overflow in expression [-Werror=overflow]
return INT32_MIN / -1;
Interestingly, UBSAN check doesn't produce a compile time warning.
Use a test that doesn't have compile time warnings, and make it
specific to UBSAN check.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180215212552.26997-2-marcandre.lureau@redhat.com>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There is already 'device-list-properties', which does most of the job.
However, it does not handle everything returned by qom-list-types, such
as machines, as they inherit directly from TYPE_OBJECT and not TYPE_DEVICE.
It does not handle abstract classes either.
This adds a new qom-list-properties command which prints properties
of a specific class and its instance. It is pretty much a simplified copy
of the device-list-properties handler.
Since it creates an object instance, device properties should appear
in the output as they are copied to QOM properties at the instance_init
hook.
This adds an object_class_property_iter_init() helper to allow class
property enumeration and uses it in the new QMP command to allow property
listing for abstract classes.
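A hedged sketch of walking class properties with the new helper (the loop
body is illustrative):
'''
ObjectPropertyIterator iter;
ObjectProperty *prop;

object_class_property_iter_init(&iter, klass);
while ((prop = object_property_iter_next(&iter)) != NULL) {
    /* collect prop->name and prop->type for the QMP reply */
}
'''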
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20180301130939.15875-3-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
ObjectPropertyInfo is more generic and only missing @description.
This adds a description to ObjectPropertyInfo and removes
DevicePropertyInfo so the resulting ObjectPropertyInfo can be used
elsewhere.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20180301130939.15875-2-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These options have been marked in a comment in qemu-options.hx as
deprecated in 2009 already (see commit 1ed2fc1fa3), but we
never informed the users about these deprecations. Let's catch up
on that omission now.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1519138892-12836-1-git-send-email-thuth@redhat.com>
[Fix messages. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Automatic creation of SCSI controllers for "-drive if=scsi" for x86
machines was quite a bad idea (see description of commit f778a82f0c
for details). This is marked as deprecated since QEMU v2.9.0, and as
far as I know, nobody complained that this is still urgently required
anymore. Time to remove this now.
Suggested-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1519123357-13225-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It's been marked as deprecated for a very long time already, and
the parameter is not doing anything useful anymore except for printing
a warning, so it's now time to finally get rid of this option.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1519071820-4062-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Block layer patches
# gpg: Signature made Mon 05 Mar 2018 17:45:51 GMT
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (38 commits)
block: Fix NULL dereference on empty drive error
qcow2: Replace align_offset() with ROUND_UP()
block/ssh: Add basic .bdrv_truncate()
block/ssh: Make ssh_grow_file() blocking
block/ssh: Pull ssh_grow_file() from ssh_create()
qemu-img: Make resize error message more general
qcow2: make qcow2_co_create2() a coroutine_fn
block: rename .bdrv_create() to .bdrv_co_create_opts()
Revert "IDE: Do not flush empty CDROM drives"
block: test blk_aio_flush() with blk->root == NULL
block: add BlockBackend->in_flight counter
block: extract AIO_WAIT_WHILE() from BlockDriverState
aio: rename aio_context_in_iothread() to in_aio_context_home_thread()
docs: document how to use the l2-cache-entry-size parameter
specs/qcow2: Fix documentation of the compressed cluster descriptor
iotest 033: add misaligned write-zeroes test via truncate
block: fix write with zero flag set and iovector provided
block: Drop unused .bdrv_co_get_block_status()
vvfat: Switch to .bdrv_co_block_status()
vpc: Switch to .bdrv_co_block_status()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# Conflicts:
# include/block/block.h
ppc patch queue 2018-03-06
This pull request supersedes ppc-for-2.12-20180302 which had compile
problems with some gcc versions. It also contains a few additional
patches.
Highlights are:
* New Sam460ex machine type
* Yet more fixes related to vcpu id allocation for spapr
* Numerous macio cleanups
* Some enhancements to the Spectre/Meltdown fixes for pseries,
allowing use of a better mitigation for indirect branch based
exploits
* New pseries machine types with Spectre/Meltdown mitigations
enabled (stop gap until libvirt and management understand the
machine options)
* A handful of other fixes
# gpg: Signature made Tue 06 Mar 2018 04:01:00 GMT
# gpg: using RSA key 6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.12-20180306: (30 commits)
PowerPC: Add TS bits into msr_mask
adb: add trace-events for monitoring keyboard/mouse during bus enumeration
PPC: e500: Fix duplicate kernel load and device tree overlap
hw/ppc/spapr,e500: Use new property "stdout-path" for boot console
ppc/spapr-caps: Define the pseries-2.12-sxxm machine type
ppc/spapr-caps: Convert cap-ibs to custom spapr-cap
ppc/spapr-caps: Convert cap-sbbc to custom spapr-cap
ppc/spapr-caps: Convert cap-cfpc to custom spapr-cap
ppc/spapr-caps: Add support for custom spapr_capabilities
target/ppc: Check mask when setting cap_ppc_safe_indirect_branch
macio: remove macio_init() function
macio: move setting of CUDA timebase frequency to macio_common_realize()
mac_newworld: use object link to pass OpenPIC object to macio
openpic: move OpenPIC state and related definitions to openpic.h
openpic: move KVM-specific declarations into separate openpic_kvm.h file
mac_oldworld: use object link to pass heathrow PIC object to macio
macio: move macio related structures and defines into separate macio.h file
heathrow: change heathrow_pic_init() to return the heathrow device
heathrow: convert to trace-events
heathrow: QOMify heathrow PIC
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A new parameter "context" is added to qio_channel_tls_handshake() is to
allow the TLS to be run on a non-default context. Still, no functional
change.
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
We have worked on qio_task_run_in_thread() already. Further, let
all the qio channel APIs use that context.
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
qio_task_run_in_thread() allows the main thread to run blocking operations
in the background. However, it assumes that it is always working with the
default context. This patch allows the threaded QIO task framework to run
with a non-default gcontext.
Currently no functional change so far, so the QIOTasks are still always
running on main context.
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Originally we were storing the GSource tag IDs. That won't be enough if we
are going to support a non-default gcontext for QIO code. Switch to GSources
without changing anything real. For now we still always pass in NULL, which
means the default gcontext.
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Firstly, introduce an internal qio_channel_add_watch_full(), which
enhances qio_channel_add_watch() so that a context can be specified.
Then add a new API wrapper qio_channel_add_watch_source() to return a
GSource pointer rather than a tag ID.
Note that the _source() call will keep a reference of GSource so that
callers need to unref them explicitly when finished using the GSource.
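A hedged usage sketch; the exact parameter list is assumed from the
description above, and my_watch_cb/opaque are placeholders:
'''
GSource *src = qio_channel_add_watch_source(ioc, G_IO_IN,
                                            my_watch_cb, opaque, NULL,
                                            context);  /* NULL == default */

/* ... later, once the watch is no longer needed ... */
g_source_unref(src);
'''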
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
It is strange that it was called gio_task_thread_result. Rename it to
follow the naming rule of the file.
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
During migration, after the MSR bits are synced, cpu_post_load() will use
msr_mask to determine which PPC MSR bits will be applied on the target
side. Hardware Transactional Memory (HTM) has been supported since POWER8,
but the TS0/TS1 bits were not in msr_mask yet. That will prevent the target
KVM from loading TM checkpointed values.
This patch adds the TS bits into msr_mask for POWER8, so that transactional
applications can be migrated across qemu.
Signed-off-by: Simon Guo <wei.guo.simon@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This patch fixes an incorrect behavior when the -kernel argument has been
specified without -bios. In this case the kernel was loaded twice: at address
32M as a raw image and afterwards by load_elf/load_uimage at the
corresponding load address. In that case the region for the device tree and
the raw kernel image may overlap.
The patch fixes the behavior by loading the kernel image once with
load_elf/load_uimage and skipping the raw image load.
While here, do not use bios_name/size for the kernel and use a more generic
name called payload_name/size.
New in v3: dtb must be stored between kernel and initrd because Linux can
handle the dtb only within the first 64MB. Add a comment to
clarify the behavior.
Signed-off-by: David Engraf <david.engraf@sysgo.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Linux kernel commit 2a9d832cc9aae21ea827520fef635b6c49a06c6d
(of: Add bindings for chosen node, stdout-path) deprecated chosen property
"linux,stdout-path" and "stdout".
Introduce the new property "stdout-path" and continue supporting the older
property to remain compatible with existing/older firmware. This older property
can be deprecated after 5 years.
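A hedged sketch of emitting both bindings in the device tree (the fdt and
stdout_path variables are illustrative):
'''
qemu_fdt_setprop_string(fdt, "/chosen", "stdout-path", stdout_path);
/* keep the deprecated binding for existing/older firmware */
qemu_fdt_setprop_string(fdt, "/chosen", "linux,stdout-path", stdout_path);
'''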
Signed-off-by: Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The sxxm (speculative execution exploit mitigation) machine type is a
variant of the 2.12 machine type with workarounds for speculative
execution vulnerabilities enabled by default.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Convert cap-ibs (indirect branch speculation) to a custom spapr-cap
type.
All tristate caps have now been converted to custom spapr-caps, so
remove the remaining support for them.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[dwg: Don't explicitly list "?"/help option, trust convention]
[dwg: Fold tristate removal into here, to not break bisect]
[dwg: Fix minor style problems]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
There are currently 2 implemented types of spapr-caps, boolean and
tristate. However there may be a need for caps which don't fit either of
these options. Add a custom capability type for which a list of custom
valid strings can be specified and implement the get/set functions for
these. Also add a field for help text to describe the available options.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
[dwg: Change "help" option to "?" matching qemu conventions]
[dwg: Add ATTRIBUTE_UNUSED to avoid breaking bisect]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Check the character and character_mask field when setting
cap_ppc_safe_indirect_branch based on the hypervisor response
to KVM_PPC_GET_CPU_CHAR. Previously the mask field wasn't checked
which was incorrect.
Fixes: 8acc2ae5 (target/ppc/kvm: Add cap_ppc_safe_[cache/bounds_check/indirect_branch])
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Move the remaining comment into macio.c for reference, then remove the
macio_init() function and instantiate the macio devices for both Old World
and New World machines via qdev_init_nofail() directly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Also switch macio_newworld_realize() over to use it rather than using the pic_mem
memory region directly.
Now that both Old World and New World macio devices no longer make use of the
pic_mem memory region directly, we can remove it.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This is needed before the next patch because the target-dependent kvm stub
uses the existing kvm_openpic_connect_vcpu() declaration, making it impossible
to move the device-specific declarations into the same file without breaking
ppc-linux-user compilation.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This enables the device to be made available during the setup of the Old World
machine. In order to pass back the previous set of IRQs we temporarily introduce
a new pic_irqs parameter until it can be removed.
An additional benefit of this change is that it is also possible to remove the
pic_mem pointer used for macio by accessing the memory region via sysbus.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that the ESCC device is instantiated directly via qdev, move it to within
the macio device and wire up the IRQs and memory regions using the sysbus API.
This makes it possible to remove the now-obsolete escc_mem parameter from the
macio_init() function.
(Note this patch also contains small touch-ups to the formatting in
macio_escc_legacy_setup() and ppc_heathrow_init() in order to keep checkpatch
happy)
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The current recommendation is to embed subdevices directly within their container
device, so do this for the DBDMA device.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
VSMT must be set in order to compute VCPU ids. This means that the
following functions must not be called before spapr_set_vsmt_mode()
was called:
- spapr_vcpu_id()
- spapr_is_thread0_in_vcore()
- xics_max_server_number()
We had a recent regression where the latter would be called before VSMT
was set, and broke migration of some old machine types. This patch
adds assert() in the above functions to avoid problems in the future.
Also, since VSMT is really a CPU related thing, spapr_set_vsmt_mode() is
now called from spapr_init_cpus(), just before the first VSMT user.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Some older machine types create more ICPs than needed. We hence
need to register up to xics_max_server_number() dummy ICPs to
accommodate the migration of these machine types.
Recent VSMT rework changed xics_max_server_number() to return
DIV_ROUND_UP(max_cpus * spapr->vsmt, smp_threads)
instead of
DIV_ROUND_UP(max_cpus * kvmppc_smt_threads(), smp_threads);
The change is okay but it requires spapr->vsmt to be set, which
isn't the case with the current code. This causes the formula to
return zero and we don't create dummy ICPs. This breaks migration
of older guests as reported here:
https://bugzilla.redhat.com/show_bug.cgi?id=1549087
The dummy ICP workaround doesn't really have a dependency on XICS
itself. But it does depend on proper VCPU id numbering and it must
be applied before creating vCPUs (ie, creating real ICPs). So this
patch moves the workaround to spapr_init_cpus(), which already
assumes VSMT to be set.
Fixes: 72194664c8 ("spapr: use spapr->vsmt to compute VCPU ids")
Reported-by: Lukas Doktor <ldoktor@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add emulation of aCube Sam460ex board based on AMCC 460EX embedded SoC.
This is not a complete implementation yet with a lot of components
still missing but enough for the U-Boot firmware to start and to boot
a Linux kernel or AROS.
Signed-off-by: François Revol <revol@free.fr>
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This is the PCIX controller found in newer 440 core SoCs, e.g. the
AMCC 460EX. The device tree refers to this as plb-pcix compared to
the plb-pci controller in older 440 SoCs.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
[dwg: Remove hwaddr from trace-events, that doesn't work with some
trace backends]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit 5d0fb1508e "spapr: consolidate the VCPU id numbering logic
in a single place" introduced a helper to detect thread0 of a virtual
core based on its VCPU id. This is used to create CPU core nodes in
the DT, but it is broken in TCG.
$ qemu-system-ppc64 -nographic -accel tcg -machine dumpdtb=dtb.bin \
-smp cores=16,maxcpus=16,threads=1
$ dtc -f -O dts dtb.bin | grep POWER8
PowerPC,POWER8@0 {
PowerPC,POWER8@8 {
instead of the expected 16 cores that we get with KVM:
$ dtc -f -O dts dtb.bin | grep POWER8
PowerPC,POWER8@0 {
PowerPC,POWER8@8 {
PowerPC,POWER8@10 {
PowerPC,POWER8@18 {
PowerPC,POWER8@20 {
PowerPC,POWER8@28 {
PowerPC,POWER8@30 {
PowerPC,POWER8@38 {
PowerPC,POWER8@40 {
PowerPC,POWER8@48 {
PowerPC,POWER8@50 {
PowerPC,POWER8@58 {
PowerPC,POWER8@60 {
PowerPC,POWER8@68 {
PowerPC,POWER8@70 {
PowerPC,POWER8@78 {
This happens because spapr_get_vcpu_id() maps VCPU ids to
cs->cpu_index in TCG mode. This confuses the code in
spapr_is_thread0_in_vcore(), since it assumes thread0 VCPU
ids to have a spapr->vsmt spacing.
spapr_get_vcpu_id(cpu) % spapr->vsmt == 0
Actually, there's no real reason to expose cs->cpu_index instead
of the VCPU id, since we also generate it with TCG. Also we already
set it explicitly in spapr_set_vcpu_id(), so there's no real reason
either to call kvm_arch_vcpu_id() with KVM.
This patch unifies spapr_get_vcpu_id() to always return the computed
VCPU id both in TCG and KVM. This is one step forward towards KVM<->TCG
migration.
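A hedged sketch of the unified helper and the thread0 test it has to satisfy
(not necessarily the exact hunk):
'''
int spapr_get_vcpu_id(PowerPCCPU *cpu)
{
    /* Always return the VCPU id computed at plug time, for both TCG and
     * KVM, instead of exposing cs->cpu_index. */
    return cpu->vcpu_id;
}

bool spapr_is_thread0_in_vcore(sPAPRMachineState *spapr, PowerPCCPU *cpu)
{
    return spapr_get_vcpu_id(cpu) % spapr->vsmt == 0;
}
'''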
Fixes: 5d0fb1508e
Reported-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
blk_error_action() sends a BLOCK_IO_ERROR QMP event which includes the
node name of its root node. If the BlockBackend represents an empty
drive, there is no root node, so we should not try to access its node
name. Make the field optional in the event and include it only when
the BlockBackend isn't empty.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Pull request
Mostly patches that are only indirectly related to the block layer, but I've
reviewed them and there is no maintainer.
# gpg: Signature made Mon 05 Mar 2018 09:39:50 GMT
# gpg: using RSA key 9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/block-pull-request:
README: Document 'git-publish' workflow
Add a git-publish configuration file
tests/libqos: Check for valid dev pointer when looking for PCI devices
util/uri.c: wrap single statement blocks with braces {}
util/uri.c: remove brackets that wrap `return` statement's content.
util/uri.c: Coding style check, Only whitespace involved
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ui: build curses, gtk and sdl as modules.
# gpg: Signature made Mon 05 Mar 2018 08:48:24 GMT
# gpg: using RSA key 4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138
* remotes/kraxel/tags/ui-20180305-pull-request:
ui/sdl: build as module
audio: rename CONFIG_* to CONFIG_AUDIO_*
ui/curses: build as module
ui/gtk: build as module
configure: opengl doesn't depend on x11
configure: add X11 vars to config-host.mak
console: add ui module loading support
console: add and use qemu_display_find_default
egl-headless: switch over to new display registry
curses: switch over to new display registry
cocoa: switch over to new display registry
sdl: switch over to new display registry
console: add qemu display registry, add gtk
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# gpg: Signature made Mon 05 Mar 2018 03:06:59 GMT
# gpg: using RSA key EF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
tap: setting error appropriately when calling net_init_tap_one()
hw/net: Remove unnecessary header includes
net: Add a new convenience option "--nic" to configure default/on-board NICs
net: Remove the deprecated 'host_net_add' and 'host_net_remove' HMP commands
net: Remove the deprecated way of dumping network packets
net: Make net_client_init() static
net: Only show vhost-user in the help text if CONFIG_POSIX is defined
net: List available netdevs with "-netdev help"
net: Move error reporting from net_init_client/netdev to the calling site
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
qapi patches for 2018-03-01
- Markus Armbruster: Modularize generated QAPI code
# gpg: Signature made Fri 02 Mar 2018 19:50:16 GMT
# gpg: using RSA key A7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>"
# gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>"
# gpg: aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A
* remotes/ericb/tags/pull-qapi-2018-03-01-v4: (30 commits)
qapi: Don't create useless directory qapi-generated
Fix up dangling references to qmp-commands.* in comment and doc
qapi: Move qapi-schema.json to qapi/, rename generated files
docs: Correct outdated information on QAPI
docs/devel/writing-qmp-commands: Update for modular QAPI
qapi: Empty out qapi-schema.json
Include less of the generated modular QAPI headers
qapi: Generate separate .h, .c for each module
watchdog: Consolidate QAPI into single file
qapi/common: Fix guardname() for funny filenames
qapi/types qapi/visit: Generate built-in stuff into separate files
qapi: Make code-generating visitors use QAPIGen more
qapi: Rename generated qmp-marshal.c to qmp-commands.c
qapi: Record 'include' directives in intermediate representation
qapi: Generate in source order
qapi: Record 'include' directives in parse tree
qapi: Concentrate QAPISchemaParser.exprs updates in .__init__()
qapi: Lift error reporting from QAPISchema.__init__() to callers
qapi/common: Eliminate QAPISchema.exprs
qapi: Improve include file name reporting in error messages
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
git-publish [1] is a convenient tool for sending patches and has been
popular among QEMU developers. Recently it has been made available in
the official Fedora/Debian repositories.
One nice feature of the tool is per-project configuration with
profiles; in particular, the cccmd option is a handy way to
create the Cc list.
[1]: https://github.com/stefanha/git-publish
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 20180226030326.20219-2-famz@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This patch wraps single-statement `if`, `while` and `else` bodies in
curly braces. For example:
'''
if (cond)
    statement;
'''
becomes
'''
if (cond) {
    statement;
}
'''
To make sure the code logic is unchanged, the disassemblies before and
after the change can be compared:
'''
git checkout master
make util/uri.o
strip util/uri.o
objdump -Drx util/uri.o > /tmp/uri-master.txt
git checkout cleanupbranch
make util/uri.o
strip util/uri.o
objdump -Drx util/uri.o > /tmp/uri-cleanup.txt
diff /tmp/uri-master.txt /tmp/uri-cleanup.txt
'''
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This avoids a name clash for CONFIG_SDL, which is used by both sdl video
support and sdl audio support. It is also clearer that this is an audio
driver configuration.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180301100547.18962-13-kraxel@redhat.com
Also drop curses libs from libs_softmmu. Add CURSES_{CFLAGS,LIBS}
variables so we can use them for linking the curses module.
Also make target/unicore32/helper.o depend on curses, since it uses
curses directly for some reason ...
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180301100547.18962-12-kraxel@redhat.com
If a requested user interface is not available, try loading it as a
module, similar to block layer modules. This is needed to keep things
working when follow-up patches start building user interfaces as modules.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180301100547.18962-8-kraxel@redhat.com
Add a registry for user interfaces. Add qemu_display_init and
qemu_display_early_init helper functions for display initialization.
Hook up gtk ui as first user.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180301100547.18962-2-kraxel@redhat.com
If netdev_add tap,id=net0,...,vhost=on fails in net_init_tap_one(),
the follow-up device_add virtio-net-pci,netdev=net0 will fail
too, printing:
TUNSETOFFLOAD ioctl() failed: Bad file descriptor TUNSETOFFLOAD
ioctl() failed: Bad file descriptor
The reason is that the tap fd is closed when an error occurs after
calling net_init_tap_one().
The fd should be closed when the call to net_init_tap_one() fails:
- if tap_set_sndbuf() failed
- if tap_set_sndbuf() succeeded but vhost failed to open or
initialize with the vhostforce flag on
- if a wrong vhost command line parameter was given
The fd should not be closed just because vhost failed to open or
initialize when the vhostforce flag is not set, so that the follow-up
device_add can fall back to userspace virtio successfully.
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Suggested-by: Jason Wang <jasowang@redhat.com>
Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Headers like "hw/loader.h" and "qemu/sockets.h" are not needed in
the hw/net/*.c files. Some other headers are already included via other
headers, so we can drop them, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The legacy "-net" option can be quite confusing for the users since most
people do not expect to get a "vlan" hub between their emulated guest
hardware and the host backend. But so far, we are also not able to get
rid of "-net" completely, since it is the only way to configure on-board
NICs that can not be instantiated via "-device" yet. It's also a little
bit shorter to type "-net nic -net tap" instead of "-device xyz,netdev=n1
-netdev tap,id=n1".
So what we need is a new convenience option that is shorter to type than
the full -device + -netdev stuff, and which can be used to configure the
on-board NICs that can not be handled via -device yet. Thus this patch now
provides such a new option "--nic": It adds an entry in the nd_table to
configure a on-board / default NIC, creates a host backend and connects
the two directly, without a confusing "vlan" hub inbetween.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
They have been deprecated since QEMU v2.10, and so far nobody has
complained that these commands are still necessary for any reason - and
since you can use 'netdev_add' and 'netdev_remove' instead, there also
should not be any real reason. Since they are also standing in the way
of the upcoming 'vlan' clean-up, it's now time to remove them.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
"-net dump" has been marked as deprecated since QEMU v2.10, since it
only works with the deprecated 'vlan' parameter (or hubs). Network
dumping should be done with "-object filter-dump" nowadays instead.
Since nobody complained so far about the deprecation message, let's
finally get rid of "-net dump" now.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The function is only used within net.c, so there's no need for it
to be a global function.
While we're at it, also remove the unused prototype compute_mcast_idx()
(the function has been removed in commit d9caeb09b1).
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
According to net/Makefile.objs we only link in the vhost-user code
if CONFIG_POSIX has been set. So the help screen should also only
show this information if CONFIG_POSIX has been defined.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Other options like "-chardev" or "-device" feature a nice help text
with the available devices when being called with "help" or "?".
Since it is quite useful, especially if you want to see which network
backends have been compiled into the QEMU binary, let's provide such
a help text for "-netdev", too.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
It looks strange that net_init_client() and net_init_netdev() both
take an "Error **errp" parameter, but then do the error reporting
with "error_report_err(local_err)" on their own. Let's move the
error reporting to the calling site instead to simplify this code
a little bit.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Since f3218a8 ("softfloat: add floatx80 constants")
floatx80_infinity is defined but never used.
This patch updates floatx80 functions to use
this definition.
This allows defining a different default Infinity
value on m68k: the m68k FPU defines infinity with
all bits set to zero in the mantissa.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20180224201802.911-4-laurent@vivier.eu>
Use a local m68k floatx80_mod()
[copied from Previous:
written by Andreas Grabher for Previous, NeXT Computer Emulator.]
The quotient byte of the FPSR is updated with
the result of the operation.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20180224201802.911-3-laurent@vivier.eu>
Move fpu/softfloat-macros.h to include/fpu/
Export floatx80 functions to be used by target floatx80
specific implementations.
Exports:
propagateFloatx80NaN(), extractFloatx80Frac(),
extractFloatx80Exp(), extractFloatx80Sign(),
normalizeFloatx80Subnormal(), packFloatx80(),
roundAndPackFloatx80(), normalizeRoundAndPackFloatx80()
Also exports packFloat32() that will be used to implement
m68k fsinh, fcos, fsin, ftan operations.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20180224201802.911-2-laurent@vivier.eu>
Move qapi-schema.json to qapi/, so it's next to its modules, and all
files get generated to qapi/, not just the ones generated for modules.
Consistently name the generated files qapi-MODULE.EXT:
qmp-commands.[ch] become qapi-commands.[ch], qapi-event.[ch] become
qapi-events.[ch], and qmp-introspect.[ch] become qapi-introspect.[ch].
This gets rid of the temporary hacks in scripts/qapi/commands.py,
scripts/qapi/events.py, and scripts/qapi/common.py.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-28-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[eblake: Fix trailing dot in tpm.c, undo temporary hack for OSX toolchain]
Signed-off-by: Eric Blake <eblake@redhat.com>
The previous commit improved compile time by including less of the
generated QAPI headers. This is impossible for stuff defined directly
in qapi-schema.json, because that ends up in headers that pull in
everything.
Move everything but include directives from qapi-schema.json to new
sub-module qapi/misc.json, then include just the "misc" shard where
possible.
It's possible everywhere, except:
* monitor.c needs qmp-command.h to get qmp_init_marshal()
* monitor.c, ui/vnc.c and the generated qapi-event-FOO.c need
qapi-event.h to get enum QAPIEvent
Perhaps we'll get rid of those some other day.
Adding a type to qapi/migration.json now recompiles some 120 instead
of 2300 out of 5100 objects.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-25-armbru@redhat.com>
[eblake: rebase to master]
Signed-off-by: Eric Blake <eblake@redhat.com>
In my "build everything" tree, a change to the types in
qapi-schema.json triggers a recompile of about 4800 out of 5100
objects.
The previous commit split up qmp-commands.h, qmp-event.h, qmp-visit.h,
qapi-types.h. Each of these headers still includes all its shards.
Reduce compile time by including just the shards we actually need.
To illustrate the benefits: adding a type to qapi/migration.json now
recompiles some 2300 instead of 4800 objects. The next commit will
improve it further.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-24-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[eblake: rebase to master]
Signed-off-by: Eric Blake <eblake@redhat.com>
Our qapi-schema.json is composed of modules connected by include
directives, but the generated code is monolithic all the same: one
qapi-types.h with all the types, one qapi-visit.h with all the
visitors, and so forth. These monolithic headers get included all
over the place. In my "build everything" tree, adding a QAPI type
recompiles about 4800 out of 5100 objects.
We wouldn't write such monolithic headers by hand. It stands to
reason that we shouldn't generate them, either.
Split up generated qapi-types.h to mirror the schema's modular
structure: one header per module. Name the main module's header
qapi-types.h, and sub-module D/B.json's header D/qapi-types-B.h.
Mirror the schema's includes in the headers, so that qapi-types.h gets
you everything exactly as before. If you need less, you can include
one or more of the sub-module headers. To be exploited shortly.
Split up qapi-types.c, qapi-visit.h, qapi-visit.c, qmp-commands.h,
qmp-commands.c, qapi-event.h, qapi-event.c the same way.
qmp-introspect.h, qmp-introspect.c and qapi.texi remain monolithic.
The split of qmp-commands.c duplicates static helper function
qmp_marshal_output_str() in qapi-commands-char.c and
qapi-commands-misc.c. This happens when commands returning the same
type occur in multiple modules. Not worth avoiding.
Since I'm going to rename qapi-event.[ch] to qapi-events.[ch], and
qmp-commands.[ch] to qapi-commands.[ch], name the shards that way
already, to reduce churn. This requires temporary hacks in
commands.py and events.py. Similarly, c_name() must temporarily
be taught to munge '/' in common.py. They'll go away with the rename.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-23-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: declare a dummy variable in each .c file, to shut up OSX
toolchain warnings about empty .o files, including hacking c_name()]
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit f0df84c6 added watchdog-set-action in the main qapi-schema.json,
but it belongs better in qapi/run-state.json alongside the definition
of WatchdogAction. The command was written prior to commit 0e201d34
creating the latter file, even though it was merged after.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180226225744.26356-1-eblake@redhat.com>
Linking code from multiple separate QAPI schemata into the same
program is possible, but involves some weirdness around built-in
types:
* We generate code for built-in types into .c only with option
--builtins. The user is responsible for generating code for exactly
one QAPI schema per program with --builtins.
* We generate code for built-in types into .h regardless of
--builtins, but guarded by #ifndef QAPI_VISIT_BUILTIN. Because all
copies of this code are exactly the same, including any combination
of these headers works.
Replace this contraption by something more conventional: generate code
for built-in types into their very own files: qapi-builtin-types.c,
qapi-builtin-visit.c, qapi-builtin-types.h, qapi-builtin-visit.h, but
only with --builtins. Obey --output-dir, but ignore --prefix for
them.
Make qapi-types.h include qapi-builtin-types.h. With multiple
schemata you now have multiple qapi-types.[ch], but only one
qapi-builtin-types.[ch]. Same for qapi-visit.[ch] and
qapi-builtin-visit.[ch].
Bonus: if all you need is built-in stuff, you can include a much
smaller header. To be exploited shortly.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-21-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[eblake: fix octal constant for python 3]
Signed-off-by: Eric Blake <eblake@redhat.com>
The use of QAPIGen is rather shallow so far: most of the output
accumulation is not converted. Take the next step: convert output
accumulation in the code-generating visitor classes. Helper functions
outside these classes are not converted.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180211093607.27351-20-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[eblake: rebase to earlier guardstart cleanup]
Signed-off-by: Eric Blake <eblake@redhat.com>
The include directive permits modular QAPI schemata, but the generated
code is monolithic all the same. To permit generating modular code,
the front end needs to pass more information on inclusions to the back
ends. The commit before last added the necessary information to the
parse tree. This commit adds it to the intermediate representation
and its QAPISchemaVisitor. A later commit will use this to
generate modular code.
New entity QAPISchemaInclude represents inclusions. Call new visitor
method visit_include() for it, so visitors can see the sub-modules a
module includes.
Note that unlike other entities, QAPISchemaInclude has no name, and is
therefore not added to entity_dict.
New QAPISchemaEntity attribute @module names the entity's source file.
Call new visitor method visit_module() when it changes during a visit,
so visitors can keep track of the module being visited.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180211093607.27351-18-armbru@redhat.com>
[eblake: avoid accidental deletion of self._predefining]
Signed-off-by: Eric Blake <eblake@redhat.com>
The parse tree is a list of expressions, except that include expressions
currently get replaced by the included file's parse tree.
Instead of throwing away the include expression, keep it with the file
name expanded so you don't have to track the including file's
directory to make sense of it.
A future commit will put this include expression to use.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180211093607.27351-16-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[eblake: fix check of expr after assignment]
Signed-off-by: Eric Blake <eblake@redhat.com>
Error messages print absolute file names of included files even if the
user gave a relative one on the command line:
$ PYTHONPATH=scripts python -B tests/qapi-schema/test-qapi.py tests/qapi-schema/include-cycle.json
In file included from tests/qapi-schema/include-cycle.json:1:
In file included from /work/armbru/qemu/tests/qapi-schema/include-cycle-b.json:1:
/work/armbru/qemu/tests/qapi-schema/include-cycle-c.json:1: Inclusion loop for include-cycle.json
Improve this to
In file included from tests/qapi-schema/include-cycle.json:1:
In file included from tests/qapi-schema/include-cycle-b.json:1:
tests/qapi-schema/include-cycle-c.json:1: Inclusion loop for include-cycle.json
The error message when an include file can't be opened prints the
include directive's file name, which is relative to the including
file. Change this to print the file name relative to the working
directory. Visible in tests/qapi-schema/include-no-file.err.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180211093607.27351-12-armbru@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
A massive number of objects depends on QAPI-generated headers. In my
"build everything" tree, it's roughly 4800 out of 5100. This is
particularly annoying when only some of the generated files change,
say for a doc fix.
Improve qapi-gen.py to touch its output files only if they actually
change. Rebuild time for a QAPI doc fix drops from many minutes to a
few seconds. Rebuilds get faster for certain code changes, too. For
instance, adding a simple QMP event now recompiles less than 200
instead of 4800 objects. But adding a QAPI type is as bad as ever;
we've clearly got more work to do.
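As a rough, self-contained illustration of the idea (written in C here for
illustration; the actual change lives in the Python qapi generator, and the
helper and file name below are purely hypothetical):
'''
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Write 'content' to 'path' only if the file does not already contain
 * exactly that text, so its timestamp stays untouched and make does not
 * rebuild everything that depends on it. */
static void write_if_changed(const char *path, const char *content)
{
    char *old = NULL;
    FILE *f = fopen(path, "rb");

    if (f) {
        long len;
        if (fseek(f, 0, SEEK_END) == 0 && (len = ftell(f)) >= 0) {
            rewind(f);
            old = calloc(1, len + 1);
            if (old && fread(old, 1, len, f) != (size_t)len) {
                free(old);
                old = NULL;
            }
        }
        fclose(f);
    }
    if (!old || strcmp(old, content) != 0) {
        f = fopen(path, "wb");
        if (f) {
            fputs(content, f);
            fclose(f);
        }
    }
    free(old);
}

int main(void)
{
    write_if_changed("example-output.h", "/* generated */\n");
    return 0;
}
'''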
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180211093607.27351-11-armbru@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[eblake: fix octal constant for python3]
Signed-off-by: Eric Blake <eblake@redhat.com>
Whenever qapi-schema.json changes, we run six programs eleven times to
update eleven files. Similar for qga/qapi-schema.json. This is
silly. Replace the six programs by a single program that spits out
all eleven files.
The programs become modules in new Python package qapi, along with the
helper library. This requires moving them to scripts/qapi/. While
moving them, consistently drop executable mode bits.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180211093607.27351-9-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[eblake: move change to one-line 'blurb' earlier in series, mention mode
bit change as intentional, update qapi-code-gen.txt to match actual
generated events.c file]
Signed-off-by: Eric Blake <eblake@redhat.com>
The next commit will introduce a common driver program for all
generators. The generators need to be modules for that. qapi2texi.py
already is. Make the other generators follow suit.
The changes are actually trivial. Obvious in the diffs once you view
them with whitespace changes ignored.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180211093607.27351-8-armbru@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[eblake: minor tweak to keep 'blurb' one line]
Signed-off-by: Eric Blake <eblake@redhat.com>
These classes encapsulate accumulating and writing output.
Convert C code generation to QAPIGenC and QAPIGenH. The conversion is
rather shallow: most of the output accumulation is not converted.
Left for later.
The indentation machinery uses a single global variable indent_level,
even though we generally interleave creation of a .c and its .h. It
should become instance variable of QAPIGenC. Also left for later.
Documentation generation isn't converted, and QAPIGenDoc isn't used.
This will change shortly.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180211093607.27351-6-armbru@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[eblake: fix nits spotted by Michael]
Signed-off-by: Eric Blake <eblake@redhat.com>
Each generator carries a copyright notice for the generator itself,
and another one for the files it generates. Only the former have been
updated along the way, the latter have not, and are all out of date.
Fix by copying the generator's copyright notice to the generated files
instead. Note that the fix doesn't copy the "Authors:" part; the
generated files' outdated Authors list goes away without replacement.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180211093607.27351-4-armbru@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
[eblake: Flatten each 'blurb' to one line]
Signed-off-by: Eric Blake <eblake@redhat.com>
Every generator has separate boilerplate for .h and .c, and their
differences are boring. All of them repeat the license note.
Reduce the repetition as follows. Move common text like the license
note to common open_output(), next to the existing common text there.
For each generator, replace the two separate descriptions by a single
one.
While there, emit an "automatically generated" note into generated
documentation, too.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180211093607.27351-3-armbru@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
The align_offset() function is equivalent to the ROUND_UP() macro so
there's no need to use the former. The ROUND_UP() name is also a bit
more explicit.
This patch uses ROUND_UP() instead of the slower QEMU_ALIGN_UP()
because align_offset() already requires that the second parameter is a
power of two.
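As a self-contained sketch of why this works (illustrative code, not the
QEMU macro itself), rounding up to a power-of-two boundary only needs a
mask, with no division or multiplication:
'''
#include <assert.h>
#include <stdint.h>

/* Round 'n' up to a multiple of 'd', assuming 'd' is a power of two. */
static uint64_t round_up_pow2(uint64_t n, uint64_t d)
{
    assert(d && (d & (d - 1)) == 0);    /* 'd' must be a power of two */
    return (n + d - 1) & ~(d - 1);
}

int main(void)
{
    assert(round_up_pow2(13, 8) == 16);
    assert(round_up_pow2(16, 8) == 16);
    return 0;
}
'''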
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180215131008.5153-1-berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
At runtime (that is, during a future ssh_truncate()), the SSH session is
non-blocking. However, ssh_truncate() (or rather, bdrv_truncate() in
general) is not a coroutine, so this resize operation needs to block.
For ssh_create(), that is fine, too; the session is never set to
non-blocking anyway.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180214204915.7980-3-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
The issue:
$ qemu-img resize -f qcow2 foo.qcow2
qemu-img: Expecting one image file name
Try 'qemu-img --help' for more information
So we gave an image file name, but we omitted the length. qemu-img
thinks the last argument is always the size and removes it immediately
from argv (by decrementing argc), and tries to verify that it is a valid
size only at a later point.
So we do not actually know whether that last argument we called "size"
is indeed a size or whether the user instead forgot to specify that size
but did give a file name.
Therefore, the error message should be more general.
Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1523458
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20180205162745.23650-1-mreitz@redhat.com
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
BlockDriver->bdrv_create() has been called from coroutine context since
commit 5b7e1542cf ("block: make
bdrv_create adopt coroutine").
Make this explicit by renaming to .bdrv_co_create_opts() and adding the
coroutine_fn annotation. This makes it obvious to block driver authors
that they may yield, use CoMutex, or other coroutine_fn APIs.
bdrv_co_create is reserved for the QAPI-based version that Kevin is
working on.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20170705102231.20711-2-stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch adds test cases for the scenario where blk_aio_flush() is
called on a BlockBackend with no root. Calling drain afterwards should
complete the requests with -ENOMEDIUM.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
BlockBackend currently relies on BlockDriverState->in_flight to track
requests for blk_drain(). There is a corner case where
BlockDriverState->in_flight cannot be used though: blk->root can be NULL
when there is no medium. This results in a segfault when the NULL
pointer is dereferenced.
Introduce a BlockBackend->in_flight counter for aio requests so it works
even when blk->root == NULL.
Based on a patch by Kevin Wolf <kwolf@redhat.com>.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
BlockDriverState has the BDRV_POLL_WHILE() macro to wait on event loop
activity while a condition evaluates to true. This is used to implement
synchronous operations where it acts as a condvar between the IOThread
running the operation and the main loop waiting for the operation. It
can also be called from the thread that owns the AioContext and in that
case it's just a nested event loop.
BlockBackend needs this behavior but doesn't always have a
BlockDriverState it can use. This patch extracts BDRV_POLL_WHILE() into
the AioWait abstraction, which can be used with AioContext and isn't
tied to BlockDriverState anymore.
This feature could be built directly into AioContext but then all users
would kick the event loop even if they signal different conditions.
Imagine an AioContext with many BlockDriverStates, each time a request
completes any waiter would wake up and re-check their condition. It's
nicer to keep a separate AioWait object for each condition instead.
Please see "block/aio-wait.h" for details on the API.
The name AIO_WAIT_WHILE() avoids the confusion between AIO_POLL_WHILE()
and AioContext polling.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The name aio_context_in_iothread() is misleading because it also returns
true when called on the main AioContext from the main loop thread, which
is not an IOThread.
This patch renames it to in_aio_context_home_thread() and expands the
doc comment to make the semantics clearer.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This patch fixes several mistakes in the documentation of the
compressed cluster descriptor:
1) the documentation claims that the cluster descriptor contains the
number of sectors used to store the compressed data, but what it
actually contains is the number of sectors *minus one* or, in other
words, the number of additional sectors after the first one.
2) the width of the fields is incorrectly specified. The number of bits
used by each field is
x = 62 - (cluster_bits - 8) for the offset field
y = (cluster_bits - 8) for the size field
So the offset field's location is [0, x-1], not [0, x] as stated.
3) the size field does not contain the size of the compressed data,
but rather the number of sectors where that data is stored. The
compressed data starts at the exact point specified in the offset
field and ends when there's enough data to produce a cluster of
decompressed data. Both points can be in the middle of a sector,
allowing several compressed clusters to be stored next to one
another, sharing sectors if necessary.
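A self-contained sketch of decoding a descriptor with the layout described
above (illustrative code, not the qcow2 driver; 'desc' and 'cluster_bits'
are assumed inputs):
'''
#include <assert.h>
#include <stdint.h>

/* Decode a compressed cluster descriptor as documented above:
 * bits [0, x-1] hold the host offset and the next (cluster_bits - 8)
 * bits hold the number of sectors minus one, where
 * x = 62 - (cluster_bits - 8). */
static void decode_compressed_desc(uint64_t desc, unsigned cluster_bits,
                                   uint64_t *host_offset,
                                   unsigned *nb_sectors)
{
    unsigned x = 62 - (cluster_bits - 8);
    uint64_t offset_mask = ((uint64_t)1 << x) - 1;
    uint64_t size_mask = ((uint64_t)1 << (cluster_bits - 8)) - 1;

    *host_offset = desc & offset_mask;
    *nb_sectors = (unsigned)((desc >> x) & size_mask) + 1;
}

int main(void)
{
    uint64_t host_offset;
    unsigned nb_sectors;

    /* 64k clusters: cluster_bits = 16, so the size field is 8 bits wide. */
    decode_compressed_desc(((uint64_t)2 << 54) | 0x12345, 16,
                           &host_offset, &nb_sectors);
    assert(host_offset == 0x12345 && nb_sectors == 3);
    return 0;
}
'''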
Cc: qemu-stable@nongnu.org
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This new test case only makes sense for qcow2 while iotest 033 is generic;
however it matches the test purpose perfectly and also 033 contains those
do_test() tricks to pass the alignment, which won't look nice being
duplicated in other tests or moved to the common code.
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The normal bdrv_co_pwritev() use is either
- BDRV_REQ_ZERO_WRITE clear and iovector provided
- BDRV_REQ_ZERO_WRITE set and iovector == NULL
while
- the flag clear and iovector == NULL is an assertion failure
in bdrv_co_do_zero_pwritev()
- the flag set and iovector provided is in fact allowed
(the flag prevails and zeroes are written)
However the alignment logic does not support the latter case so the padding
areas get overwritten with zeroes.
Currently, general functions like bdrv_rw_co() do provide iovector
regardless of flags. So, keep it supported and use bdrv_co_do_zero_pwritev()
alignment for it which also makes the code a bit more obvious anyway.
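The four combinations can be summarized in code form roughly like this (an
illustrative classification only, not the block-layer implementation):
'''
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef enum { WRITE_DATA, WRITE_ZEROES, WRITE_INVALID } WriteKind;

/* Classify a write by the zero-write flag and the presence of an
 * iovector, following the rules listed above. */
static WriteKind classify_write(bool zero_write, const void *iov)
{
    if (zero_write) {
        return WRITE_ZEROES;  /* flag prevails, iovector (if any) ignored */
    }
    return iov ? WRITE_DATA : WRITE_INVALID;  /* no flag, no iov: assertion */
}

int main(void)
{
    int dummy;

    assert(classify_write(false, &dummy) == WRITE_DATA);
    assert(classify_write(true, NULL) == WRITE_ZEROES);
    assert(classify_write(true, &dummy) == WRITE_ZEROES);
    assert(classify_write(false, NULL) == WRITE_INVALID);
    return 0;
}
'''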
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. Now that all drivers have been updated to provide the
byte-based .bdrv_co_block_status(), we can delete the sector-based
interface.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. Update the vvfat driver accordingly. Note that we
can rely on the block driver having already clamped limits to our
block size, and simplify accordingly.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. Update the vpc driver accordingly.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. Update the vmdk driver accordingly. Drop the
now-unused vmdk_find_index_in_cluster().
Also, fix a pre-existing bug: if find_extent() fails (unlikely,
since the block layer did a bounds check), then we must return a
failure, rather than 0.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. Update the vdi driver accordingly. Note that the
TODO is already covered (the block layer guarantees bounds of its
requests), and that we can remove the now-unused s->block_sectors.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. Update the sheepdog driver accordingly.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. Update the raw driver accordingly.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. Update the qed driver accordingly, taking the opportunity
to inline qed_is_allocated_cb() into its lone caller (the callback
used to be important, until we switched qed to coroutines). There is
no intent to optimize based on the want_zero flag for this format.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. Update the qcow2 driver accordingly.
For now, we are ignoring the 'want_zero' hint. However, it should
be relatively straightforward to honor the hint as a way to return
larger *pnum values when we have consecutive clusters with the same
data/zero status but which differ only in having non-consecutive
mappings.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. Update the qcow driver accordingly. There is no
intent to optimize based on the want_zero flag for this format.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. Update the parallels driver accordingly. Note that
the internal function block_status() is still sector-based, because
it is still in use by other sector-based functions; but that's okay
because request_alignment is 512 as a result of those functions.
For now, no optimizations are added based on the mapping hint.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. Update the null driver accordingly.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. Update the iscsi driver accordingly. In this case,
it is handy to teach iscsi_co_block_status() to handle a NULL map
and file parameter, even though the block layer passes non-NULL
values, because we also call the function directly. For now, there
are no optimizations done based on the want_zero flag.
We can also make the simplification of asserting that the block
layer passed in aligned values.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based. Convert all uses of
the allocmap (no semantic change). Callers that already had bytes
available are simpler, and callers that now scale to bytes will be
easier to switch to byte-based in the future.
Signed-off-by: Eric Blake <eblake@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based. Convert all uses of
the cluster size in sectors, along with adding assertions that we
are not dividing by zero.
Improve some comment grammar while in the area.
Signed-off-by: Eric Blake <eblake@redhat.com>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. Update the gluster driver accordingly.
In want_zero mode, we continue to report fine-grained hole
information (the caller wants as much mapping detail as possible);
but when not in that mode, the caller prefers larger *pnum and
merely cares about what offsets are allocated at this layer, rather
than where the holes live. Since holes still read as zeroes at
this layer (rather than deferring to a backing layer), we can take
the shortcut of skipping find_allocation(), and merely state that
all bytes are allocated.
We can also drop redundant bounds checks that are already
guaranteed by the block layer.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. Update the file protocol driver accordingly.
In want_zero mode, we continue to report fine-grained hole
information (the caller wants as much mapping detail as possible);
but when not in that mode, the caller prefers larger *pnum and
merely cares about what offsets are allocated at this layer, rather
than where the holes live. Since holes still read as zeroes at
this layer (rather than deferring to a backing layer), we can take
the shortcut of skipping lseek(), and merely state that all bytes
are allocated.
We can also drop redundant bounds checks that are already
guaranteed by the block layer.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. Update the generic helpers, and all passthrough clients
(blkdebug, commit, mirror, throttle) accordingly.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit bdd6a90 has a bug: drivers should never directly set
BDRV_BLOCK_ALLOCATED, but only io.c should do that (as needed).
Instead, drivers should report BDRV_BLOCK_DATA if they know that
data comes from this BDS.
But let's look at the bigger picture: semantically, the nvme
driver is similar to the nbd, null, and raw drivers (no backing
file, all data comes from this BDS). But while two of those
other drivers have to supply the callback (null because it can
special-case BDRV_BLOCK_ZERO, raw because it can special-case
a different offset), in this case the block layer defaults are
good enough without the callback at all (similar to nbd).
So, fix the bug by deletion ;)
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. Now that the block layer exposes byte-based allocation,
it's time to tackle the drivers. Add a new callback that operates
at boundaries as small as a byte. Subsequent patches will then update
individual drivers, then finally remove .bdrv_co_get_block_status().
The new code also passes through the 'want_zero' hint, which will
allow subsequent patches to further optimize callers that only care
about how much of the image is allocated (want_zero is false),
rather than full details about runs of zeroes and which offsets the
allocation actually maps to (want_zero is true). As part of this
effort, fix another part of the documentation: the claim in commit
4c41cb4 that BDRV_BLOCK_ALLOCATED is short for 'DATA || ZERO' is a
lie at the block layer (see commit e88ae2264), even though it is
how the bit is computed from the driver layer. After all, there
are intentionally cases where we return ZERO but not ALLOCATED at
the block layer, when we know that a read sees zero because the
backing file is too short. Note that the driver interface is thus
slightly different than the public interface with regards to which
bits will be set, and what guarantees are provided on input.
We also add an assertion that any driver using the new callback will
make progress (the only time pnum will be 0 is if the block layer
already handled an out-of-bounds request, or if there is an error);
the old driver interface did not provide this guarantee, which
could lead to some inf-loops in drastic corner-case failures.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Define a new board model for the MPS2 with an AN505 FPGA image
containing a Cortex-M33. Since the FPGA images for TrustZone
cores (AN505, and the similar AN519 for Cortex-M23) have a
significantly different layout of devices to the non-TrustZone
images, we use a new source file rather than shoehorning them
into the existing mps2.c.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-20-peter.maydell@linaro.org
Add remaining easy registers to iotkit-secctl:
* NSCCFG just routes its two bits out to external GPIO lines
* BRGINSTAT/BRGINTCLR/BRGINTEN can be dummies, because QEMU's
bus fabric can never report errors
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20180220180325.29818-18-peter.maydell@linaro.org
The Arm IoT Kit includes a "security controller" which is largely a
collection of registers for controlling the PPCs and other bits of
glue in the system. This commit provides the initial skeleton of the
device, implementing just the ID registers, and a couple of read-only
read-as-zero registers.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-16-peter.maydell@linaro.org
In some board or SoC models it is necessary to split a qemu_irq line
so that one input can feed multiple outputs. We currently have
qemu_irq_split() for this, but that has several deficiencies:
* it can only handle splitting a line into two
* it unavoidably leaks memory, so it can't be used
in a device that can be deleted
Implement a qdev device that encapsulates splitting of IRQs, with a
configurable number of outputs. (This is in some ways the inverse of
the TYPE_OR_IRQ device.)
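A minimal, self-contained sketch of the fan-out idea (plain C with
illustrative names, not the actual qdev device):
'''
#include <stdio.h>

typedef void (*irq_handler)(void *opaque, int level);

typedef struct SplitIrq {
    irq_handler *out;   /* array of output handlers */
    void **opaque;      /* matching opaque pointers */
    int num_out;        /* configurable number of outputs */
} SplitIrq;

/* One input level change is forwarded to every configured output. */
static void split_irq_set(SplitIrq *s, int level)
{
    for (int i = 0; i < s->num_out; i++) {
        s->out[i](s->opaque[i], level);
    }
}

static void print_handler(void *opaque, int level)
{
    printf("%s <- %d\n", (const char *)opaque, level);
}

int main(void)
{
    irq_handler outs[] = { print_handler, print_handler, print_handler };
    void *opq[] = { "gpio-a", "gpio-b", "gpio-c" };
    SplitIrq s = { outs, opq, 3 };

    split_irq_set(&s, 1);   /* one input change reaches all three outputs */
    return 0;
}
'''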
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-13-peter.maydell@linaro.org
The function qdev_init_gpio_in_named() passes the DeviceState pointer
as the opaque data pointer for the irq handler function. Usually
this is what you want, but in some cases it would be helpful to use
some other data pointer.
Add a new function qdev_init_gpio_in_named_with_opaque() which allows
the caller to specify the data pointer they want.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-12-peter.maydell@linaro.org
The Cortex-M33 allows the system to specify the reset value of the
secure Vector Table Offset Register (VTOR) by asserting config
signals. In particular, guest images for the MPS2 AN505 board rely
on the MPS2's initial VTOR being correct for that board.
Implement a QEMU property so board and SoC code can set the reset
value to the correct value.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-7-peter.maydell@linaro.org
Create an "idau" property on the armv7m container object which
we can forward to the CPU object. Annoyingly, we can't use
object_property_add_alias() because the CPU object we want to
forward to doesn't exist until the armv7m container is realized.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-6-peter.maydell@linaro.org
In v8M, the Implementation Defined Attribution Unit (IDAU) is
a small piece of hardware typically implemented in the SoC
which provides board or SoC specific security attribution
information for each address that the CPU performs MPU/SAU
checks on. For QEMU, we model this with a QOM interface which
is implemented by the board or SoC object and connected to
the CPU using a link property.
This commit defines the new interface class, adds the link
property to the CPU object, and makes the SAU checking
code call the IDAU interface if one is present.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-5-peter.maydell@linaro.org
Instead of loading guest images to the system address space, use the
CPU's address space. This is important if we're trying to load the
file to memory or via an alias memory region that is provided by an
SoC object and thus not mapped into the system address space.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-4-peter.maydell@linaro.org
Instead of loading kernels, device trees, and the like to
the system address space, use the CPU's address space. This
is important if we're trying to load the file to memory or
via an alias memory region that is provided by an SoC
object and thus not mapped into the system address space.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180220180325.29818-3-peter.maydell@linaro.org
Allow the translate subroutines to return false for invalid insns.
At present we can of course invoke an invalid insn exception from within
the translate subroutine, but in the short term this consolidates code.
In the long term it would allow the decodetree language to support
overlapping patterns for ISA extensions.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180227232618.2908-1-richard.henderson@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
1. NBD_REP_ERR_INVALID is not only about length, so make the message
more general
2. hex format is not very good: it's hard to read something like
"option a (set meta context)", so switch to decimal.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <1518702707-7077-6-git-send-email-vsementsov@virtuozzo.com>
[eblake: expand scope of patch: ALL uses of nbd_opt_lookup and
nbd_rep_lookup are now decimal]
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit 79ba8c98 (v2.7) changed the setting of request_alignment
to occur only during bdrv_refresh_limits(), rather than at
bdrv_open() time; but at the time, NBD was unaffected, because
it still used sector-based callbacks, so the block layer
defaulted NBD to use 512 request_alignment.
Later, commit 70c4fb26 (also v2.7) changed NBD to use byte-based
callbacks, without setting request_alignment. This resulted in
NBD using request_alignment of 1, which works great when the
server supports it (as is the case for qemu-nbd), but falls apart
miserably if the server requires alignment (but only if qemu
actually sends a sub-sector request; qemu-io can do it, but
most qemu operations still perform on sectors or larger).
Even later, the NBD protocol was updated to document that clients
should learn the server's minimum alignment during NBD_OPT_GO;
and recommended that clients should assume a minimum size of 512
unless the server understands NBD_OPT_GO and replied with a smaller
size. Commit 081dd1fe (v2.10) attempted to do that, by assigning
request_alignment to whatever was learned from the server; but
it has two flaws: the assignment is done during bdrv_open() so
it gets unconditionally wiped out back to 1 during any later
bdrv_refresh_limits(); and the code is not using a default of 512
when the server did not report a minimum size.
Fix these issues by moving the assignment to request_alignment
to the right function, and by using a sane default when the
server does not advertise a minimum size.
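A minimal sketch of the intended policy (an illustrative helper, not the
actual NBD driver code):
'''
#include <assert.h>
#include <stdint.h>

/* Pick the request alignment as described above: honor the minimum block
 * size advertised by the server, and fall back to 512 bytes when the
 * server did not report one (e.g. it does not understand NBD_OPT_GO). */
static uint32_t pick_request_alignment(uint32_t server_min_block)
{
    return server_min_block ? server_min_block : 512;
}

int main(void)
{
    assert(pick_request_alignment(0) == 512);     /* no info from server */
    assert(pick_request_alignment(1) == 1);       /* byte-aligned server */
    assert(pick_request_alignment(4096) == 4096); /* 4k-minimum server */
    return 0;
}
'''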
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180215032905.27146-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
- add query-cpus-fast and deprecate query-cpus, while adding s390 cpu
information
- remove s390x memory hotplug implementation, which is not usable in
this form
- add boot menu support in the s390-ccw bios
- expose s390x guest crash information
- fixes and cleanups
# gpg: Signature made Thu 01 Mar 2018 12:54:47 GMT
# gpg: using RSA key DECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>"
# gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# gpg: aka "Cornelia Huck <cohuck@kernel.org>"
# gpg: aka "Cornelia Huck <cohuck@redhat.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF
* remotes/cohuck/tags/s390x-20180301-v2: (27 commits)
s390x/tcg: fix loading 31bit PSWs with the highest bit set
s390x: remove s390_get_memslot_count
s390x/sclp: remove memory hotplug support
s390x/cpumodel: document S390FeatDef.bit not applicable
hmp: change hmp_info_cpus to use query-cpus-fast
qemu-doc: deprecate query-cpus
qmp: add architecture specific cpu data for query-cpus-fast
qmp: add query-cpus-fast
qmp: expose s390-specific CPU info
s390x/tcg: add various alignment checks
s390x/tcg: fix disabling/enabling DAT
s390/stattrib: Make SaveVMHandlers data static
s390x/cpu: expose the guest crash information
pc-bios/s390: Rebuild the s390x firmware images with the boot menu changes
s390-ccw: interactive boot menu for scsi
s390-ccw: use zipl values when no boot menu options are present
s390-ccw: set cp_receive mask only when needed and consume pending service irqs
s390-ccw: read user input for boot index via the SCLP console
s390-ccw: print zipl boot menu
s390-ccw: read stage2 boot loader data to find menu
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Introduce two vhost-user messages: VHOST_USER_CREATE_CRYPTO_SESSION
and VHOST_USER_CLOSE_CRYPTO_SESSION. At this point, the QEMU side
supports crypto operations in the cryptodev vhost-user backend.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Implement the vhost-crypto functions, such as startup,
stop and notification etc. Introduce an enum
QCryptoCryptoDevBackendOptionsType in order to
identify whether the cryptodev vhost backend is vhost-user
or vhost-kernel-module (if it exists).
At this point, cryptodev-vhost-user works.
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Signed-off-by: Longpeng(Mike) <longpeng2@huawei.com>
Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since commit 0ca1fd2d68 ("vhost: Simplify ring verification checks"),
the virtqueue desc mapping is checked 3 times.
Fixes: commit 0ca1fd2d68 ("vhost: Simplify ring verification checks")
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
On our Armv8a server, we try to configure vhost scsi but fail
to boot up the guest (-machine virt-2.10). The guest's boot failure
happens very early, even earlier than grub.
There are 3 virtqueues (ctrl, event and cmd) for the virtio scsi device,
but ovmf and seabios will only set the physical address for the 3rd
one (cmd). Then in vhost_virtqueue_start(), virtio_queue_get_desc_addr
will be 0 for the ctrl and event vq when qemu negotiates with ovmf. So
vhost_memory_map fails with ENOMEM.
This patch just fixes it by quitting the virtqueue start/stop early
when virtio_queue_get_desc_addr is 0.
Btw, after the guest kernel starts, all 3 queues will be initialized
and their addresses set correctly.
Already tested on Arm64 and X86_64 qemu.
Signed-off-by: Jia He <jia.he@hxt-semitech.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since used_memslots will be updated to the actual value after
registering the memory listener for the first time, move the
memslots limit checking to the right place.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
PCIe features are available only via the 'q35' machine type for x86 and
the 'virt' machine type for AArch64 architecture.
Mention that explicitly.
Thanks: Daniel Berrangé
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Document statistics added in commits
commit a0d06486b4
Author: Denis V. Lunev <den@openvz.org>
Date: Wed Feb 24 10:50:48 2016 +0300
virtio-balloon: add 'available' counter
and
commit bf1e7140ef
Author: Tomáš Golembiovský <tgolembi@redhat.com>
Date: Tue Dec 5 13:14:46 2017 +0100
virtio-balloon: include statistics of disk/file caches
Signed-off-by: Tomáš Golembiovský <tgolembi@redhat.com>
Reviewed-by: Jonathan Helman <jonathan.helman@oracle.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Let's also put the 31-bit hack in front of the REAL MMU; otherwise, right
now we get errors when loading a PSW where the highest bit is set (e.g.
via s390-netboot.img). The highest bit is not masked away, therefore we
inject addressing exceptions into the guest.
The proper fix will later be to do all address wrapping before accessing
the MMU - so we won't get any "wrong" entries in there (which makes
flushing also easier). But that will require more work (wrapping in
load_psw, wrapping when incrementing the PC, wrapping every memory
access).
This fixes the tests/pxe-test test.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180301120826.6847-1-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Set the appropriate Linux hwcap bits to tell the guest binary if we
have implemented half-precision floating point support.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Much like recpe, the ARM ARM has simplified the pseudo code for the
calculation, which is done using fixed-point 9-bit integer maths. So
while adding f16 we can also clean this up to be a little less heavy
on the floating point and just return the fractional part, leaving
the callers to do the final packing of the result.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180227143852.11175-27-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
It looks like the ARM ARM has simplified the pseudo code for the
calculation, which is done using fixed-point 9-bit integer maths. So
while adding f16 we can also clean this up to be a little less heavy
on the floating point, just returning the fractional part and leaving
the callers to do the final packing of the result.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180227143852.11175-23-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This actually covers two different sections of the encoding table:
Advanced SIMD scalar two-register miscellaneous FP16
Advanced SIMD two-register miscellaneous (FP16)
The difference between the two is covered by a combination of Q (bit
30) and S (bit 28). Notably the FRINTx instructions are only
available in the vector form.
This is just the decode skeleton which will be filled out by later
patches.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180227143852.11175-17-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
A bunch of the vectorised bitwise operations just operate on larger
chunks at a time. We can do the same for the new half-precision
operations by introducing some TWOHALFOP helpers which work on each
half of a pair of half-precision operations at once.
Hopefully all this hoop jumping will get simpler once we have
generically vectorised helpers here.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180227143852.11175-16-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
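The "two half ops at once" idea can be illustrated with a small standalone
sketch: apply a 16-bit operation to both halves of a 32-bit lane and repack
the results. do_op16 is a placeholder here, not an actual QEMU helper:

    #include <stdint.h>

    /* Stand-in for a half-precision helper. */
    static uint16_t do_op16(uint16_t a, uint16_t b)
    {
        return a + b;
    }

    /* Apply the 16-bit operation to the low and high halves of each
     * 32-bit chunk and repack the two results. */
    static uint32_t two_half_op(uint32_t a, uint32_t b)
    {
        uint16_t lo = do_op16(a & 0xffff, b & 0xffff);
        uint16_t hi = do_op16(a >> 16, b >> 16);

        return ((uint32_t)hi << 16) | lo;
    }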
Half-precision flush to zero behaviour is controlled by a separate
FZ16 bit in the FPCR. To handle this we pass a pointer to
fp_status_fp16 when working on half-precision operations. The value of
the presented FPCR is calculated from an amalgam of the two when read.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180227143852.11175-5-alex.bennee@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This adds the SiI9022 (and implicitly EDID I2C) device to the ARM
Versatile Express machine, and selects the two I2C devices necessary
in the arm-softmmu.mak configuration so everything will build
smoothly.
I am implementing proper handling of the graphics in the Linux
kernel and adding proper emulation of SiI9022 and EDID makes the
driver probe as nicely as before, retrieving the resolutions
supported by the "QEMU monitor" and overall just working nice.
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Message-id: 20180227104903.21353-6-linus.walleij@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This adds support for emulating the Silicon Image SII9022 DVI/HDMI
bridge. It's not very clever right now, it just acknowledges
the switch into DDC I2C mode and back. Combining this with the
existing DDC I2C emulation gives the right behavior on the Versatile
Express emulation passing through the QEMU EDID to the emulated
platform.
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Message-id: 20180227104903.21353-5-linus.walleij@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: explicitly reset ddc_req/ddc_skip_finish/ddc]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The tx function of the DDC I2C slave emulation was returning 1
on all writes, resulting in a NACK on the I2C bus. Changing it to
0 makes the DDC I2C work fine with bit-banged I2C such as the
versatile I2C.
I guess it was not affecting whatever I2C controller this was
used with until now, but with the Versatile I2C it surely
does not work.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Message-id: 20180227104903.21353-4-linus.walleij@linaro.org
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
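A hedged sketch of the convention involved (the callback below is
illustrative, not the actual DDC code): an I2C slave's write callback signals
ACK with a return value of 0, and anything non-zero is treated as a NACK by
bit-banged controllers such as the Versatile I2C.

    /* Illustrative slave write handler: latch the byte and ACK it. */
    static int ddc_tx_sketch(unsigned char data)
    {
        (void)data;   /* the real code stores the byte in the DDC state */
        return 0;     /* 0 == ACK, non-zero == NACK */
    }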
Merge tpm 2018/02/21 v2
# gpg: Signature made Tue 27 Feb 2018 13:50:28 GMT
# gpg: using RSA key 75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <stefanb@linux.vnet.ibm.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2018-02-21-2:
tests: add test for TPM TIS device
tests: Move common TPM test code into tpm-emu.c
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
All memory region ROM images have a base address of 0 which causes the overlapping
address check to fail if more than one memory region ROM image is present, or an
existing ROM image is loaded at address 0.
Make sure that we ignore the overlapping address check in
rom_check_and_register_reset() if this is a memory region ROM image. In particular
this fixes the "rom: requested regions overlap" error on startup when trying to
run qemu-system-sparc with a -kernel image since commit 7497638642: "tcx: switch to
load_image_mr() and remove prom_addr hack".
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
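A minimal sketch of the exemption described above, with assumed field names
(mr marks a memory-region-backed ROM image, which always reports address 0
and therefore must be skipped by the overlap check):

    #include <stdint.h>

    struct rom_sketch {
        void *mr;        /* non-NULL for memory region ROM images */
        uint64_t addr;
        uint64_t size;
    };

    /* Two ROMs overlap only if neither is a memory region ROM image and
     * their [addr, addr + size) ranges intersect. */
    static int roms_overlap(const struct rom_sketch *a,
                            const struct rom_sketch *b)
    {
        if (a->mr || b->mr) {
            return 0;   /* memory region ROM images are exempt */
        }
        return a->addr < b->addr + b->size && b->addr < a->addr + a->size;
    }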
Boot menu patches by Collin L. Walling
# gpg: Signature made Mon 26 Feb 2018 11:24:21 AM CET
# gpg: using RSA key 2ED9D774FE702DB5
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>" [full]
# gpg: aka "Thomas Huth <thuth@redhat.com>" [undefined]
# gpg: aka "Thomas Huth <huth@tuxfamily.org>" [undefined]
# gpg: aka "Thomas Huth <th.huth@posteo.de>" [unknown]
* tag 'tags/s390-ccw-bios-2018-02-26':
pc-bios/s390: Rebuild the s390x firmware images with the boot menu changes
s390-ccw: interactive boot menu for scsi
s390-ccw: use zipl values when no boot menu options are present
s390-ccw: set cp_receive mask only when needed and consume pending service irqs
s390-ccw: read user input for boot index via the SCLP console
s390-ccw: print zipl boot menu
s390-ccw: read stage2 boot loader data to find menu
s390-ccw: set up interactive boot menu parameters
s390-ccw: parse and set boot menu options
s390-ccw: move auxiliary IPL data to separate location
s390-ccw: update libc
s390-ccw: refactor IPL structs
s390-ccw: refactor eckd_block_num to use CHS
s390-ccw: refactor boot map table code
# gpg: Signature made Sun 25 Feb 2018 17:54:21 GMT
# gpg: using RSA key F30C38BD3F2FBE3C
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>"
# gpg: aka "Laurent Vivier <laurent@vivier.eu>"
# gpg: aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>"
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F 5173 F30C 38BD 3F2F BE3C
* remotes/vivier2/tags/linux-user-for-2.12-pull-request:
linux-user: MIPS set cpu to r6 CPU if binary is R6
linux-user, m68k: select CPU according to ELF header values
linux-user: introduce functions to detect CPU type
linux-user: Move CPU type name selection to a function
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Due to a kernel bug we can never increase the size of capability
set 1, so introduce a new capability set in parallel: old userspace
will continue to use the old set, and new userspace will start using
the new one when it detects a fixed kernel.
v2: don't use a define from virglrenderer, just probe it.
v3: fix compilation when virglrenderer disabled
v4: fix style warning, just use ?: op instead.
Signed-off-by: Dave Airlie <airlied@redhat.com>
Message-id: 20180223023814.24459-1-airlied@gmail.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
From an architecture point of view, nothing can be mapped into the address
space on s390x. All there is is memory. Therefore there is also not really
an interface to communicate such information to the guest. All we can do is
specify the maximum ram address and guests can probe in that range if
memory is available and usable (TPROT).
Also memory hotplug is strange. The guest can decide at some point in
time to add / remove memory in some range. While the hypervisor can refuse
to online an increment, all increments have to be predefined and there is
no way of telling the guest about a newly "hotplugged" increment. So if we
specify right now e.g.
-m 2G,slots=2,maxmem=20G
An ordinary Fedora guest will happily online (hotplug) all memory,
resulting in a guest consuming 20G. So it really behaves rather like
-m 22G
There is no way to hotplug memory from the outside like on other
architectures. This is of course bad for upper management layers.
As the guest can create/delete memory regions while it is running, of
course migration support is not available and tricky to implement.
With virtualization, it is different. We might want to map something
into guest address space (e.g. fake DAX devices) and not detect it
automatically as memory. So we really want to use the maxmem and slots
parameter just like on all other architectures. Such devices will have
to expose the applicable memory range themselves. To finally be able to
provide memory hotplug to guests, we will need a new paravirtualized
interface to do that (e.g. something into the direction of virtio-mem).
This implies that maxmem cannot be used for s390x memory hotplug
anymore and has to go. This simplifies the code quite a bit.
As migration support is not working, this change cannot really break
migration as guests without slots and maxmem don't see the SCLP
features. Also, the ram size calculation does not change.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180219174231.10874-1-david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
[CH: tweaked patch description, as discussed on list]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The 'bit' field of the 'S390FeatDef' structure is not applicable to all
its instances. Currently this field is not applicable, and remains
unused, iff the feature is of type S390_FEAT_TYPE_MISC. Having the value 0
specified for multiple such feature definitions was a little confusing,
as it's a perfectly legit bit value, and as the value of the bit
field usually ought to be unique for each feature of a given
feature type.
Let us introduce a specialized macro for defining features of type
S390_FEAT_TYPE_MISC so that one does not have to specify either the bit or
the type (as the latter is implied).
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Message-Id: <20180221165628.78946-1-pasic@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
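A hedged sketch of the macro idea; the structure layout and the FEAT_INIT
shape are simplified stand-ins for the real s390 feature tables:

    /* Stand-ins for the real definitions. */
    enum { S390_FEAT_TYPE_MISC = 0 };
    struct feat_def { const char *name; int type; int bit; const char *desc; };

    #define FEAT_INIT(name, type, bit, desc)  { (name), (type), (bit), (desc) }

    /* Misc features have no meaningful bit, so the specialized macro
     * supplies both the type and a dummy bit value. */
    #define FEAT_INIT_MISC(name, desc) \
        FEAT_INIT((name), S390_FEAT_TYPE_MISC, 0, (desc))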
Change the implementation of hmp_info_cpus() to call
qmp_query_cpus_fast() instead of qmp_query_cpus(). This has the
following consequences:
o No further code change required for qmp_query_cpus deprecation
o HMP profits from the less disruptive cpu information retrieval
o HMP 'info cpus' won't display architecture specific data anymore,
which should be tolerable in the light of the deprecation of
query-cpus.
In order to allow 'info cpus' to be executed completely on the
fast path, monitor_get_cpu_index() has been adapted to not synchronize
the cpu state.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Message-Id: <1518797321-28356-6-git-send-email-mihajlov@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The s390 CPU state can be retrieved without interrupting the
VM execution. Extend the CpuInfoFast union with architecture-specific
data and an implementation for s390.
Return data looks like this:
[
{"thread-id":64301,"props":{"core-id":0},
"arch":"s390","cpu-state":"operating",
"qom-path":"/machine/unattached/device[0]","cpu-index":0},
{"thread-id":64302,"props":{"core-id":1},
"arch":"s390","cpu-state":"operating",
"qom-path":"/machine/unattached/device[1]","cpu-index":1}
]
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1518797321-28356-4-git-send-email-mihajlov@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The query-cpus command has an extremely serious side effect:
it always interrupts all running vCPUs so that they can run
ioctl calls. This can cause a huge performance degradation for
some workloads. And most of the information retrieved by the
ioctl calls is not even used by query-cpus.
This commit introduces a replacement for query-cpus called
query-cpus-fast, which has the following features:
o Never interrupt vCPU threads. query-cpus-fast only returns
vCPU information maintained by QEMU itself, which should be
sufficient for most management software needs
o Drop the "halted" field as it cannot be retrieved in a fast
way on most architectures
o Drop irrelevant fields such as "current", "pc" and "arch"
o Rename some fields for clarity and a consistent naming
standard
Signed-off-by: Luiz Capitulino <lcapitulino@redhat.com>
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Message-Id: <1518797321-28356-3-git-send-email-mihajlov@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Presently s390x is the only architecture not exposing specific
CPU information via QMP query-cpus. Upstream discussion has shown
that it could make sense to report the architecture specific CPU
state, e.g. to detect that a CPU has been stopped.
With this change the output of query-cpus will look like this on
s390:
[
{"arch": "s390", "current": true,
"props": {"core-id": 0}, "cpu-state": "operating", "CPU": 0,
"qom_path": "/machine/unattached/device[0]",
"halted": false, "thread_id": 63115},
{"arch": "s390", "current": false,
"props": {"core-id": 1}, "cpu-state": "stopped", "CPU": 1,
"qom_path": "/machine/unattached/device[1]",
"halted": true, "thread_id": 63116}
]
This change doesn't add the s390-specific data to HMP 'info cpus'.
A follow-on patch will remove all architecture specific information
from there.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <1518797321-28356-2-git-send-email-mihajlov@linux.vnet.ibm.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's add proper alignment checks for a handful of instructions that
require a SPECIFICATION exception in case alignment is violated.
Introduce new wout/in functions. As we are right now only using them for
privileged instructions, we have to add ugly ifdefs to silence
compilers.
Convert STORE CPU ID right away to make use of the wout function.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180215103822.15179-1-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
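The alignment check itself is simple; a hedged sketch (the helper name is
invented here): an operand address must be a multiple of the access size,
otherwise a specification exception is raised.

    #include <stdbool.h>
    #include <stdint.h>

    /* True if the access would violate the alignment requirement,
     * i.e. a specification exception should be injected.  size is
     * assumed to be a power of two. */
    static bool needs_spec_exception(uint64_t addr, unsigned size)
    {
        return (addr & (size - 1)) != 0;
    }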
Currently, all memory accesses go via the MMU of the address space
(primary, secondary, ...). This is bad, because we don't flush the TLB
when disabling/enabling DAT. So we could add a tlb flush. However it
is easier to simply select the MMU we already have in place for real
memory access.
All we have to do is point at the right MMU and allow to execute these
pages.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180213161240.19891-1-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[CH: get rid of tabs]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
This patch is the s390 implementation of guest crash information,
similar to commit d187e08dc4 ("i386/cpu: add crash-information QOM
property") and the related commits. We will detect several crash
reasons, with the "disabled wait" being the most important one, since
this is used by all s390 guests as a "panic like" notification.
The following examples demonstrate these ways.
1. crash-information QOM property;
Run qemu with -qmp unix:qmp-sock,server, then use the "qmp-shell" utility
to execute the "qom-get" command; the result might look like:
(QEMU) (QEMU) qom-get path=/machine/unattached/device[0] \
property=crash-information
{"return": {"core": 0, "reason": "disabled-wait", "psw-mask": 562956395872256, \
"type": "s390", "psw-addr": 1102832}}
2. GUEST_PANICKED event reporting;
Run qemu with a socket chardev, and connect to it with telnet or nc:
-chardev socket,id=qmp,port=4444,host=localhost,server \
-mon chardev=qmp,mode=control,pretty=on \
Negotiate the mode with { "execute": "qmp_capabilities" }; the crash
information will then be reported on a guest crash event like:
{
"timestamp": {
"seconds": 1518004739,
"microseconds": 552563
},
"event": "GUEST_PANICKED",
"data": {
"action": "pause",
"info": {
"core": 0,
"psw-addr": 1102832,
"reason": "disabled-wait",
"psw-mask": 562956395872256,
"type": "s390"
}
}
}
3. log;
Run qemu with the parameters: -D <logfile> -d guest_errors, to
specify the logfile and log item. The results might be,
Guest crashed on cpu 0: disabled-wait
PSW: 0x0002000180000000 0x000000000010d3f0
Co-authored-by: Jing Liu <liujbjl@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20180209122543.25755-1-borntraeger@de.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[CH: tweaked qapi comment]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
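A hedged sketch of how a disabled wait can be recognized from the PSW mask:
the wait bit is set while the I/O and external interruption mask bits are
off, so the CPU can never leave the wait state. The mask constants follow the
z/Architecture PSW layout, but treat them as illustrative here.

    #include <stdbool.h>
    #include <stdint.h>

    #define PSW_MASK_WAIT 0x0002000000000000ULL
    #define PSW_MASK_IO   0x0200000000000000ULL
    #define PSW_MASK_EXT  0x0100000000000000ULL

    /* Waiting with all wakeup sources masked == disabled wait. */
    static bool is_disabled_wait(uint64_t psw_mask)
    {
        return (psw_mask & PSW_MASK_WAIT) &&
               !(psw_mask & (PSW_MASK_IO | PSW_MASK_EXT));
    }

The PSW mask from the example above, 0x0002000180000000, satisfies this
condition.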
This patch implements a dummy ObjectInfo structure so that
it's easy to typecast the incoming data. If the metadata is
valid, write_pending is set. Also, the incoming filename
is UTF-16, so instead of depending on external libraries, just
implement a simple function to get the filename.
Signed-off-by: Bandan Das <bsd@redhat.com>
Message-id: 20180223164829.29683-6-bsd@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
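A hedged sketch of such a conversion, assuming a little-endian UTF-16 input
and keeping only ASCII characters (this is an illustration, not the MTP code
itself):

    #include <stddef.h>
    #include <stdint.h>

    /* Keep the low byte of each 16-bit unit; replace anything outside
     * printable ASCII with '_'.  Output is always NUL-terminated. */
    static void utf16le_to_ascii(const uint8_t *in, size_t nchars,
                                 char *out, size_t outlen)
    {
        size_t i;

        if (outlen == 0) {
            return;
        }
        for (i = 0; i < nchars && i + 1 < outlen; i++) {
            uint16_t c = in[2 * i] | (uint16_t)(in[2 * i + 1] << 8);
            out[i] = (c >= 0x20 && c < 0x7f) ? (char)c : '_';
        }
        out[i] = '\0';
    }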
Allow write operations on behalf of the initiator. The
precursor to write is the sending of the write metadata
that consists of the ObjectInfo dataset. This patch introduces
a flag that is set when the responder is ready to receive
write data based on a previous SendObjectInfo operation by
the initiator (The SendObjectInfo implementation is in a
later patch)
Signed-off-by: Bandan Das <bsd@redhat.com>
Message-id: 20180223164829.29683-5-bsd@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Writes of existing objects by the initiator are achieved by
making a temporary buffer with the new changes, deleting the
old file and then writing a new file with the same name.
Also, add a "readonly" property which needs to be set to false
for deletion to work.
Signed-off-by: Bandan Das <bsd@redhat.com>
Message-id: 20180223164829.29683-4-bsd@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Fix a possible null dereference when deleting a folder and
its contents. An ignored event might be received for its contents
after the parent folder is deleted which will return a null object.
Signed-off-by: Bandan Das <bsd@redhat.com>
Message-id: 20180223164829.29683-3-bsd@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The response to a SendObjectInfo consists of the storageid, the
parent object handle and the handle reserved for the new
incoming object.
Signed-off-by: Bandan Das <bsd@redhat.com>
Message-id: 20180223164829.29683-2-bsd@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Provide a new s390-ccw.img binary with the boot menu patches by Collin.
Though there should not be any visible changes for the network booting,
the s390-netboot.img binary has been rebuilt, too, since some of the
changes affected the shared source files.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Interactive boot menu for scsi. This follows a similar procedure
to the interactive menu for eckd dasd. An example follows:
s390x Enumerated Boot Menu.
3 entries detected. Select from index 0 to 2.
Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[thuth: Added additional "break;" statement to avoid analyzer warnings]
Signed-off-by: Thomas Huth <thuth@redhat.com>
If no boot menu options are present, then flag the boot menu to
use the zipl options that were set in the zipl configuration file
(and stored on disk by zipl). These options are found at some
offset prior to the start of the zipl boot menu banner. The zipl
timeout value is limited to a 16-bit unsigned integer and stored
as seconds, so we take care to convert it to milliseconds in order
to conform to the rest of the boot menu functionality. This is
limited to CCW devices.
For reference, the zipl configuration file uses the following
fields in the menu section:
prompt=1 enable the boot menu
timeout=X set the timeout to X seconds
To explicitly disregard any boot menu options, menu=off or
<bootmenu enable='no' ... /> must be specified.
Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
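The unit conversion mentioned above is trivial but easy to get wrong; a
minimal sketch (the function name is invented):

    #include <stdint.h>

    /* zipl stores the boot menu timeout as a 16-bit number of seconds;
     * the rest of the menu code works in milliseconds. */
    static uint32_t zipl_timeout_to_ms(uint16_t seconds)
    {
        return (uint32_t)seconds * 1000;
    }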
It is possible while waiting for multiple types of external
interrupts that we might have pending irqs remaining between
irq consumption and irq-type disabling. Those interrupts
could potentially propagate to the guest after IPL completes
and cause unwanted behavior.
As it is today, the SCLP will only recognize write events that
are enabled by the control program's send and receive masks. To
limit the window for, and prevent further irqs from, ASCII
console events (specifically keystrokes), we should only enable
the control program's receive mask when we need it.
While we're at it, remove assignment of the (non control program)
send and receive masks, as those are actually set by the SCLP.
Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Implements an sclp_read function to capture input from the
console and a wrapper function that handles parsing certain
characters and adding input to a buffer. The input is checked
for any erroneous values and is handled appropriately.
A prompt will persist until input is entered or the timeout
expires (if one was set). Example:
Please choose (default will boot in 10 seconds):
Correct input will boot the respective boot index. If the
user's input is empty, 0, or if the timeout expires, then
the default zipl entry will be chosen. If the input is
within the range of available boot entries, then the
selection will be booted. Any erroneous input will cancel
the timeout and re-prompt the user.
Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
When the boot menu options are present and the guest's
disk has been configured by the zipl tool, then the user
will be presented with an interactive boot menu with
labeled entries. An example of what the menu might look
like:
zIPL v1.37.1-build-20170714 interactive boot menu.
0. default (linux-4.13.0)
1. linux-4.13.0
2. performance
3. kvm
Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Read the stage2 boot loader data block-by-block. We scan the
current block for the string "zIPL" to detect the start of the
boot menu banner. We then load the adjacent blocks (previous
block and next block) to account for the possibility of menu
data spanning multiple blocks.
Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
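A hedged sketch of the magic scan described above, assuming 512-byte blocks
and an invented helper name:

    #include <stdbool.h>
    #include <string.h>

    #define SKETCH_BLOCK_SIZE 512

    /* Scan one disk block for the "zIPL" banner that marks the start of
     * the boot menu data. */
    static bool block_has_zipl_magic(const char *block)
    {
        for (int i = 0; i + 4 <= SKETCH_BLOCK_SIZE; i++) {
            if (memcmp(block + i, "zIPL", 4) == 0) {
                return true;
            }
        }
        return false;
    }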
Reads boot menu flag and timeout values from the iplb and
sets the respective fields for the menu.
Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Set boot menu options for an s390 guest and store them in
the iplb. These options are set via the QEMU command line
option:
-boot menu=on|off[,splash-time=X]
or via the libvirt domain xml:
<os>
<bootmenu enable='yes|no' timeout='X'/>
</os>
Where X is a positive integer representing milliseconds.
Any value set for loadparm will override all boot menu options.
If loadparm=PROMPT, then the menu will be enabled without a
timeout.
Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The s390-ccw firmware needs some information in support of the
boot process which is not available on the native machine.
Examples are the netboot firmware load address and now the
boot menu parameters.
While storing that data in unused fields of the IPL parameter block
works, that approach could create problems if the parameter block
definition should change in the future, because then a guest could
overwrite these fields using the set IPLB diagnose.
In fact the data in question is of a more global nature and not really
tied to an IPL device, so separating it is rather logical.
This commit introduces a new structure to hold firmware relevant
IPL parameters set by QEMU. The data is stored at location 204 (dec)
and can contain up to 7 32-bit words. This area is available to
programming in the z/Architecture Principles of Operation and
can thus safely be used by the firmware until the IPL has completed.
Signed-off-by: Viktor Mihajlovski <mihajlov@linux.vnet.ibm.com>
Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
[thuth: fixed "4 + 8 * n" comment]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Moved:
memcmp from bootmap.h to libc.h (renamed from _memcmp)
strlen from sclp.c to libc.h (renamed from _strlen)
Added C standard functions:
isdigit
Added non C-standard functions:
uitoa
atoui
Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
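Hedged sketches of the two non-standard helpers named above; the real
firmware versions may differ in signature and error handling:

    #include <stddef.h>

    /* Render an unsigned value as decimal into buf (assumes len >= 2)
     * and return a pointer to the first digit. */
    static char *uitoa_sketch(unsigned long value, char *buf, size_t len)
    {
        size_t i = len - 1;

        buf[i] = '\0';
        do {
            buf[--i] = '0' + value % 10;
            value /= 10;
        } while (value != 0 && i > 0);
        return &buf[i];
    }

    /* Parse a decimal string into an unsigned value, stopping at the
     * first non-digit. */
    static unsigned long atoui_sketch(const char *s)
    {
        unsigned long v = 0;

        while (*s >= '0' && *s <= '9') {
            v = v * 10 + (unsigned long)(*s++ - '0');
        }
        return v;
    }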
ECKD DASDs have different IPL structures for CDL and LDL
formats. The current Ipl1 and Ipl2 structs follow the CDL
format, so we prepend "EckdCdl" to them. Boot info for LDL
has been moved to a new struct: EckdLdlIpl1.
Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Add new cylinder/head/sector struct. Use it to calculate
eckd block numbers instead of a BootMapPointer (which used
eckd chs anyway).
Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
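For reference, the usual cylinder/head/sector to linear block translation
looks like the sketch below; the struct and parameter names are illustrative
and the real ECKD code may differ in detail:

    #include <stdint.h>

    typedef struct {
        uint16_t cylinder;
        uint16_t head;
        uint8_t  sector;   /* 1-based */
    } chs_sketch_t;

    static uint64_t chs_to_block(chs_sketch_t chs, uint32_t heads,
                                 uint32_t sectors_per_track)
    {
        return ((uint64_t)chs.cylinder * heads + chs.head) * sectors_per_track
               + (chs.sector - 1);
    }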
Some ECKD bootmap code was using structs designed for SCSI.
Even though this works, it hurts readability. Add a new
BootMapTable struct to assist with readability in bootmap
entry code. Also:
- replace ScsiMbr in ECKD code with appropriate structs
- fix read_block messages to reflect BootMapTable
- fixup ipl_scsi to use BootMapTable (referred to as Program Table)
- defined value for maximum table entries
Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
M680x0 doesn't support the same set of instructions
as ColdFire, so we can't use "any" CPU type to execute
m68020 instructions.
We select CPU type ("m68040" or "any" for ColdFire)
according to the ELF header. If we can't, we
default to the value used until now: "any".
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20180220173307.25125-4-laurent@vivier.eu>
Instead of a sequence of "#if ... #endif", move the
selection to a function in linux-user/*/target_elf.h.
We can't add it to linux-user/*/target_cpu.h
because we would need to include "elf.h" to
use ELF flags with eflags, and including
"elf.h" in "target_cpu.h" introduces some
conflicts in elfload.c.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180220173307.25125-2-laurent@vivier.eu>
* New "raspi3" machine emulating RaspberryPi 3
* Fix bad register definitions for VMIDR and VMPIDR (which caused
assertions for 64-bit guest CPUs with EL2 on big-endian hosts)
* hw/char/stm32f2xx_usart: fix TXE/TC bit handling
* Fix ast2500 protection register emulation
* Lots of SD card emulation cleanups and bugfixes
# gpg: Signature made Thu 22 Feb 2018 15:18:53 GMT
# gpg: using RSA key 3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20180222: (32 commits)
sdcard: simplify SD_SEND_OP_COND (ACMD41)
sdcard: simplify SEND_IF_COND (CMD8)
sdcard: warn if host uses an incorrect address for APP CMD (CMD55)
sdcard: check the card is in correct state for APP CMD (CMD55)
sdcard: handles more commands in SPI mode
sdcard: use a more descriptive label 'unimplemented_spi_cmd'
sdcard: handle the Security Specification commands
sdcard: handle CMD54 (SDIO)
sdcard: use the registerfields API for the CARD_STATUS register masks
sdcard: use the correct masked OCR in the R3 reply
sdcard: simplify using the ldst API
sdcard: remove commands from unsupported old MMC specification
sdcard: clean the SCR register and add few comments
sdcard: fix the 'maximum data transfer rate' to 25MHz
sdcard: update the CSD CRC register regardless the CSD structure version
sdcard: Don't always set the high capacity bit
sdcard: use the registerfields API to access the OCR register
sdcard: use G_BYTE from cutils
sdcard: define SDMMC_CMD_MAX instead of using the magic '64'
sdcard: add more trace events
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
To comply with Spec v1.10 (and 2.00, 3.01):
. TRAN_SPEED
for current SD Memory Cards that field must be always 0_0110_010b (032h) which is
equal to 25MHz - the mandatory maximum operating frequency of SD Memory Card.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 20180215221325.7611-4-f4bug@amsat.org
[PMM: fixed comment indent]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Some register blocks of the ast2500 are protected by protection key
registers which require the right magic value to be written to those
registers to allow those registers to be mutated.
Register manuals indicate that writing the correct magic value to these
registers should cause subsequent reads from those values to return 1,
and writing any other value should cause subsequent reads to return 0.
Previously, qemu implemented these registers incorrectly: the registers
were handled as simple memory, meaning that writing some value x to a
protection key register would result in subsequent reads from that
register returning the same value x. The protection was implemented by
ensuring that the current value of that register equaled the magic
value.
This modifies qemu to have the correct behaviour: attempts to write to an
ast2500 protection register result in a transition to 1 or 0 depending
on whether the written value is the correct magic. The protection logic
is updated to check that the value of the register is nonzero.
This bug caused deadlocks with u-boot HEAD: when u-boot is done with a
protectable register block, it attempts to lock it by writing the
bitwise inverse of the correct magic value, and then spinning forever
until the register reads as zero. Since qemu implemented writes to these
registers as ordinary memory writes, writing the inverse of the magic
value resulted in subsequent reads returning that value, leading to
u-boot spinning forever.
Signed-off-by: Hugo Landau <hlandau@devever.net>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Acked-by: Andrew Jeffery <andrew@aj.id.au>
Message-id: 20180220132627.4163-1-hlandau@devever.net
[PMM: fixed incorrect code indentation]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
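A minimal sketch of the corrected register behaviour; the magic constant
below is assumed for illustration, not taken from the datasheet:

    #include <stdbool.h>
    #include <stdint.h>

    #define SKETCH_PROT_KEY_MAGIC 0x1688a8a8u   /* assumed value */

    static uint32_t prot_key_state;

    /* Writing the magic value unlocks the block (reads back as 1);
     * writing anything else locks it (reads back as 0). */
    static void prot_key_write(uint32_t value)
    {
        prot_key_state = (value == SKETCH_PROT_KEY_MAGIC) ? 1 : 0;
    }

    /* The protection check now only cares whether the state is nonzero. */
    static bool registers_writable(void)
    {
        return prot_key_state != 0;
    }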
I/O currently being synchronous, there is no reason to ever clear the
SR_TXE bit. However the SR_TC bit may be cleared by software writing
to the SR register, so set it on each write.
In addition, fix the reset value of the USART status register.
Signed-off-by: Richard Braun <rbraun@sceen.net>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
[PMM: removed XXX tag from comment, since it isn't something
we need to come back and fix in QEMU]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch adds a "raspi3" machine type, which can now be selected as
the machine to run on by users via the "-M" command line option to QEMU.
The machine type does *not* ignore memory transaction failures so we
likely need to add some dummy devices later when people run something
more complicated than what I'm using for testing.
Signed-off-by: Pekka Enberg <penberg@iki.fi>
[PMM: added #ifdef TARGET_AARCH64 so we don't provide the 64-bit
board in the 32-bit only arm-softmmu build.]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The register definitions for VMIDR and VMPIDR have separate
reginfo structs for the AArch32 and AArch64 registers. However
the 32-bit versions are wrong:
* they use offsetof instead of offsetoflow32 to mark where
the 32-bit value lives in the uint64_t CPU state field
* they don't mark themselves as ARM_CP_ALIAS
In particular this means that if you try to use an Arm guest CPU
which enables EL2 on a big-endian host it will assert at reset:
target/arm/cpu.c:114: cp_reg_check_reset: Assertion `oldvalue == newvalue' failed.
because the reset of the 32-bit register writes to the top
half of the uint64_t.
Correct the errors in the structures.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
---
This is necessary for 'make check' to pass on big endian
systems with the 'raspi3' board enabled, which is the
first board which has an EL2-enabled-by-default CPU.
This is the re-factor of softfloat:
- shared common code path float16/32/64
- well commented and easy to follow code
- added a bunch of float16 support
While some operations are slower, the key ones exercised by the
floating point dbt-bench are the same: https://i.imgur.com/oXNJNql.png
# gpg: Signature made Wed 21 Feb 2018 10:44:14 GMT
# gpg: using RSA key FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>"
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-softfloat-refactor-210218-1: (22 commits)
fpu/softfloat: re-factor sqrt
fpu/softfloat: re-factor compare
fpu/softfloat: re-factor minmax
fpu/softfloat: re-factor scalbn
fpu/softfloat: re-factor int/uint to float
fpu/softfloat: re-factor float to int/uint
fpu/softfloat: re-factor round_to_int
fpu/softfloat: re-factor muladd
fpu/softfloat: re-factor div
fpu/softfloat: re-factor mul
fpu/softfloat: re-factor add/sub
fpu/softfloat: define decompose structures
fpu/softfloat: move the extract functions to the top of the file
fpu/softfloat: improve comments on ARM NaN propagation
include/fpu/softfloat: add some float16 constants
include/fpu/softfloat: implement float16_set_sign helper
include/fpu/softfloat: implement float16_chs helper
include/fpu/softfloat: implement float16_abs helper
target/*/cpu.h: remove softfloat.h
fpu/softfloat-types: new header to prevent excessive re-builds
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Pass the modifier state to the keymap lookup function. In case multiple
keysym -> keycode mappings exist look at the modifier state and prefer
the mapping where the modifier state matches.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20180222070513.8740-6-kraxel@redhat.com
The cursor dmabuf can be NULL in case no cursor is defined by the guest.
This happens for example when Linux guests show the framebuffer console.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180220110433.20353-3-kraxel@redhat.com
After some hotkey was pressed, sdl2 doesn't forward the first modifier
keyup event to the guest, resulting in stuck modifier keys.
Fix the logic in handle_keyup(). Also gui_key_modifier_pressed doesn't
need to be a global variable.
Reported-by: Howard Spoelstra <hsp.cat7@gmail.com>
Tested-by: Howard Spoelstra <hsp.cat7@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20180220150444.784-1-kraxel@redhat.com
Move the TPM TIS related register and flag #defines into
include/hw/acpi/tpm.h for access by the test case.
Write a test case that covers the TIS functionality.
Add the tests cases to the MAINTAINERS file.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
This is a little bit of a departure from softfloat's original approach
as we skip the estimate step in favour of a straight iteration. There
is a minor optimisation to avoid calculating more bits of precision
than we need; however, this still brings a performance drop, especially
for float64 operations.
Suggested-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
The compare function was already expanded from a macro. I keep the
macro expansion but move most of the logic into a compare_decomposed.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Let's do the same re-factor treatment for minmax functions. I still
use the MACRO trick to expand but now all the checking code is common.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
These are considerably simpler as the lower order integers can just
use the higher order conversion function. As the decomposed fractional
part is a full 64 bits, rounding and inexact handling come from the
pack functions.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
We share the common int64/uint64_pack_decomposed function across all
the helpers and simply limit the final result depending on the final
size.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
We can now add float16_round_to_int and use the common round_decomposed and
canonicalize functions to have a single implementation for
float16/32/64 round_to_int functions.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
We can now add float16_muladd and use the common decompose and
canonicalize functions to have a single implementation for
float16/32/64 muladd functions.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
We can now add float16_div and use the common decompose and
canonicalize functions to have a single implementation for
float16/32/64 versions.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
We can now add float16_mul and use the common decompose and
canonicalize functions to have a single implementation for
float16/32/64 versions.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
We can now add float16_add/sub and use the common decompose and
canonicalize functions to have a single implementation for
float16/32/64 add and sub functions.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
These structures pave the way for generic softfloat helper routines
that will operate on fully decomposed numbers.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This defines the same set of common constants for float 16 as defined
for 32 and 64 bit floats. These are often used by target helper
functions. I've also removed constants that are not used by anybody.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
As cpu.h is another typically widely included file which doesn't need
full access to the softfloat API we can remove the includes from here
as well. Where they do need types it's typically for float_status and
the rounding modes so we move that to softfloat-types.h as well.
As a result of not having softfloat in every cpu.h call we now need to
add it to various helpers that do need the full softfloat.h
definitions.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[For PPC parts]
Acked-by: David Gibson <david@gibson.dropbear.id.au>
The main culprit here is bswap.h which pulled in softfloat.h so it
could use the types in its CPU_Float* and ldfl/stfql functions. As
bswap.h is very widely included this added a compile dependency every
time we touch softfloat.h. Move the typedefs for each float type into
their own file so we don't re-build the world every time we tweak the
main softfloat.h header.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
It's not actively built and when enabled things fail to compile. I'm
not sure the type-checking is really helping here. Seeing as we "own"
our softfloat now, let's remove the cruft.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Fill the terminal box from right to left to avoid
Gtk-WARNING **: Allocating size to GtkScrollbar 0x55f6d54b0200 without
calling gtk_widget_get_preferred_width/height(). How does the code
know the size to allocate?
Signed-off-by: Jan Kiszka <jan.kiszka@siemens.com>
Message-id: 902aaef8-d20e-0530-dea2-cdfe3db33ff3@web.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Switch over all leftover users to qapi DisplayType.
Then delete the unused display_type variable.
Add 'default' DisplayType, which isn't an actual display type but
a placeholder for "user didn't specify a display". It will be replaced
by the DisplayType actually used, which in turn depends on the
DisplayTypes available in the particular build.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20180202111022.19269-13-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Add QAPI DisplayType enum, DisplayOptions union and DisplayGTK struct.
Switch gtk configuration to use the qapi type.
Some bookkeeping (fullscreen for example) is done twice now; this is
temporary until more/all UIs are switched over to qapi configuration.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20180202111022.19269-5-kraxel@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Leak found thanks to ASAN:
Direct leak of 8 byte(s) in 1 object(s) allocated from:
#0 0x55995789ac90 in __interceptor_malloc (/home/elmarco/src/qemu/build/x86_64-softmmu/qemu-system-x86_64+0x1510c90)
#1 0x7f0a91190f0c in g_malloc /home/elmarco/src/gnome/glib/builddir/../glib/gmem.c:94
#2 0x5599580a281c in v9fs_path_copy /home/elmarco/src/qemu/hw/9pfs/9p.c:196:17
#3 0x559958f9ec5d in coroutine_trampoline /home/elmarco/src/qemu/util/coroutine-ucontext.c:116:9
#4 0x7f0a8766ebbf (/lib64/libc.so.6+0x50bbf)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
lhs/rhs doesn't tell much about how the arguments are handled; dst/src
and const arguments are clearer in my mind. Use g_memdup() while at it.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
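A hedged sketch of the copy pattern touched by these two patches, with
invented struct and field names: free whatever the destination already holds
before duplicating the source into it, using g_memdup().

    #include <glib.h>

    struct path_sketch {
        char *data;
        guint size;
    };

    static void path_copy(struct path_sketch *dst,
                          const struct path_sketch *src)
    {
        g_free(dst->data);          /* avoid leaking the old buffer */
        dst->data = g_memdup(src->data, src->size);
        dst->size = src->size;
    }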
SystemTap's dtrace(1) produces the following warning when it encounters
"char const" instead of "const char":
Warning: /usr/bin/dtrace:trace-dtrace-root.dtrace:66: syntax error near:
probe flatview_destroy_rcu
Warning: Proceeding as if --no-pyparsing was given.
This is a limitation in current SystemTap releases. I have sent a patch
upstream to accept "char const" since it is valid C:
https://sourceware.org/ml/systemtap/2018-q1/msg00017.html
In QEMU we still wish to avoid warnings in the current SystemTap
release. It's simple enough to replace "char const" with "const char".
I'm not changing the documentation or implementing checks to prevent
this from occurring again in the future. The next release of SystemTap
will hopefully resolve this issue.
Cc: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20180201162625.4276-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Using the greedy star matching, arguments like "...%"PRIx64 caused issues
for functions with multiple PRI formats.
The issue was only seen with the ust backend, as it is the only one
using the format regex.
The result for many functions was that the arguments coming after the
greedy star's end were left out of the tracepoint, and in some cases some
of the arguments that were traced had the wrong format.
Signed-off-by: Jon Emil Jahren <jonemilj@gmail.com>
Message-id: 20180129041648.30884-2-jonemilj@gmail.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
PVRDMA implementation
# gpg: Signature made Mon 19 Feb 2018 11:08:49 GMT
# gpg: using RSA key 36D4C0F0CF2FE46D
# gpg: Good signature from "Marcel Apfelbaum <marcel@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: B1C6 3A57 F92E 08F2 640F 31F5 36D4 C0F0 CF2F E46D
* remotes/marcel/tags/rdma-pull-request:
MAINTAINERS: add entry for hw/rdma
hw/rdma: Implementation of PVRDMA device
hw/rdma: PVRDMA commands and data-path ops
hw/rdma: Implementation of generic rdma device layers
hw/rdma: Definitions for rdma device and rdma resource manager
hw/rdma: Add wrappers and macros
include/standard-headers: add pvrdma related headers
scripts/update-linux-headers: import pvrdma headers
docs: add pvrdma device documentation.
mem: add share parameter to memory-backend-ram
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
PVRDMA is the QEMU implementation of VMware's paravirtualized RDMA device.
It works with its Linux Kernel driver AS IS, no need for any special
guest modifications.
While it complies with the VMware device, it can also communicate with
bare metal RDMA-enabled machines and does not require an RDMA HCA in the
host; it can work with Soft-RoCE (rxe).
It does not require the whole guest RAM to be pinned, allowing memory
over-commit, and, even though not implemented yet, migration support
will be possible with some HW assistance.
The implementation is divided into two components: generic rdma
functions and structures, and PVRDMA-specific ones.
The second, PVRDMA-specific sub-module handles the interaction with the
PCI layer:
- Device configuration and setup (MSIX, BARs etc).
- Setup of DSR (Device Shared Resources)
- Setup of device ring.
- Device management.
Reviewed-by: Dotan Barak <dotanb@mellanox.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
This layer is composed of two sub-modules, backend and resource manager.
Backend sub-module is responsible for all the interaction with IB layers
such as ibverbs and umad (external libraries).
Resource manager is a collection of functions and structures to manage
RDMA resources such as QPs, CQs and MRs.
Reviewed-by: Dotan Barak <dotanb@mellanox.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
As all mappings for this device are from driver to device,
declare wrappers on top of the pci_dma_*map functions.
In addition, declare macros to be used for debug messages.
Reviewed-by: Dotan Barak <dotanb@mellanox.com>
Reviewed-by: Zhu Yanjun <yanjun.zhu@oracle.com>
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Import the headers used by the pvrdma device.
Some of them are interfaces between the guest driver and the device,
imported under include/standard-headers/drivers/infiniband/... .
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Modify the script to import the headers used by the pvrdma device.
Some of them are interfaces between the guest driver and the device;
import them under include/standard-headers/drivers/infiniband/... .
Remove the unused functions from pvrdma_verbs.h, avoiding the
unnecessary import of several infiniband/networking/other headers.
Reviewed-by: Gal Hammer <ghammer@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Currently only a file-backed memory backend can
be created with a "share" flag in order to allow
sharing guest RAM with other processes in the host.
Add the "share" flag also to RAM Memory Backend
in order to allow remapping parts of the guest RAM
to different host virtual addresses. This is needed
by the RDMA devices in order to remap non-contiguous
QEMU virtual addresses to a contiguous virtual address range.
Moved the "share" flag to the Host Memory base class,
modified phys_mem_alloc to include the new parameter
and a new interface memory_region_init_ram_shared_nomigrate.
There are no functional changes if the new flag is not used.
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Some pointers on how to get a patch into stable.
[contains some suggestions by mdroth and eblake]
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Back when we used to support compiling either with or without
NPTL threading library support, we used a macro THREAD which would
expand either to nothing (no thread support) or to __thread (threads
supported). For a long time now we have required thread support,
so remove the macro and just use __thread directly as other parts
of QEMU do.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20180213132246.26844-1-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The "default" parameter of the "-mon" option is useless since
QEMU v2.4.0, and marked as deprecated since QEMU v2.8.0. That
should have been long enough to let people update their scripts,
so time to remove it now.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1513700253-10045-1-git-send-email-thuth@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
ppc patch queue 2018-02-16
Highlights of this batch:
* Conversion to TranslatorOps (Emilio Cota)
* Further bugfixes and cleanups to vcpu id allocation for pseries
(Greg Kurz)
* Another bugfix for HPT resizing (Daniel Henrique-Barboza)
* Macintosh CUDA cleanups (Mark Cave-Ayland)
* Further tweaks to Spectre/Meltdown mitigations (Suraj Singh)
# gpg: Signature made Fri 16 Feb 2018 10:00:02 GMT
# gpg: using RSA key 6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.12-20180216:
ppc4xx: Add device models found in PPC440 core SoCs
ppc/spapr-caps: Disallow setting workaround for spapr-cap-ibs
target/ppc: convert to TranslatorOps
target/ppc: convert to DisasContextBase
spapr: consolidate the VCPU id numbering logic in a single place
spapr: rename spapr_vcpu_id() to spapr_get_vcpu_id()
spapr: move VCPU calculation to core machine code
spapr: use spapr->vsmt to compute VCPU ids
ppc/spapr-caps: Change migration macro to take full spapr-cap name
hw/char: remove legacy interface escc_init()
hw/ppc/spapr_hcall: set htab_shift after kvmppc_resize_hpt_commit
cuda: convert to trace-events
ppc: move CUDAState and other CUDA-related definitions into separate cuda.h file
cuda: convert to use the shared mos6522 device
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Migration pull 20180214
Note that the 'Add test for migration to bad destination' displays
a 'Connection refused' during running, but still gives the correct exit
code and OK (It's checking that the source doesn't fail when
it can't connect, so that's the right error).
If it's particularly disliked that patch can be skipped individually.
# gpg: Signature made Wed 14 Feb 2018 15:33:04 GMT
# gpg: using RSA key 0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert/tags/pull-migration-20180214a:
migration: pass MigrationState to migrate_init()
migration: allow send_rq to fail
migration: provide postcopy_fault_thread_notify()
migration: reuse mis->userfault_quit_fd
migration: better error handling with QEMUFile
tests/migration: Add test for migration to bad destination
migration: Fix early failure cleanup
tests/migration: Add source to PC boot block
migration: improve documentation of postcopy-ram
migration/xen: Check return value of qemu_fclose
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The 'vs->as.freq' value is a signed integer, which is read from an
unsigned 32-bit int field on the wire. There is thus a risk of overflow
on 32-bit platforms. Move the frequency limit checking to be done at
time of read before casting to a signed integer.
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180205114938.15784-4-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
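A hedged sketch of validating the unsigned wire value before it is ever
stored in a signed int; the bound below is an illustrative example, not the
limit used by the audio code:

    #include <stdbool.h>
    #include <stdint.h>

    #define SKETCH_MAX_FREQ 48000u   /* illustrative upper bound */

    /* Checking the unsigned value against the limit before it is cast
     * to a signed integer removes any chance of overflow. */
    static bool freq_valid(uint32_t freq_from_wire)
    {
        return freq_from_wire > 0 && freq_from_wire <= SKETCH_MAX_FREQ;
    }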
The start_auth_sasl() method declares a 'Error *local_err' variable in
an inner if () {...} scope, which shadows a variable of the same name
declared at the start of the method. This is confusing for reviewers and
may trigger compiler warnings.
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180205114938.15784-3-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
For very large framebuffers, it is theoretically possible for the result
of 'vs->throttle_output_offset * VNC_THROTTLE_OUTPUT_LIMIT_SCALE' to
exceed the size of a 32-bit int. For this to happen in practice, the
video RAM would have to be set to a large enough value, which is not
likely today. None the less we can be paranoid against future growth by
using division instead of multiplication when checking the limits.
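A hedged sketch of that change (the surrounding condition and the 'offset'
variable are illustrative; only the identifiers named above are real):

/* before: offset > vs->throttle_output_offset * VNC_THROTTLE_OUTPUT_LIMIT_SCALE
 * could overflow a 32-bit int; dividing avoids any large intermediate value */
if (offset / VNC_THROTTLE_OUTPUT_LIMIT_SCALE > vs->throttle_output_offset) {
    /* client is too far behind: throttle further framebuffer updates */
}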
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180205114938.15784-2-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The documentation on SDL_RenderPresent function states that
"the backbuffer should be considered invalidated after each present",
so copy the entire texture on each redraw.
On the other hand, SDL_UpdateTexture function is described as
"fairly slow function", so restrict it to just the changed pixels.
Also added SDL_RenderClear call, as suggested in the documentation
page on SDL_RenderPresent.
Signed-off-by: Anatoly Trosinenko <anatoly.trosinenko@gmail.com>
Message-id: 20180205133228.25082-1-anatoly.trosinenko@gmail.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
On one of our clients' nodes, a segmentation fault occurred due to an
attempt to read from a closed ioc. Corresponding backtrace:
0 object_get_class (obj=obj@entry=0x0)
1 qio_channel_readv_full (ioc=0x0, iov=0x7ffe55277180 ...
2 qio_channel_read (ioc=<optimized out> ...
3 vnc_client_read_buf (vs=vs@entry=0x55625f3c6000, ...
4 vnc_client_read_plain (vs=0x55625f3c6000)
5 vnc_client_read (vs=0x55625f3c6000)
6 vnc_client_io (ioc=<optimized out>, condition=G_IO_IN, ...
7 g_main_dispatch (context=0x556251568a50)
8 g_main_context_dispatch (context=context@entry=0x556251568a50)
9 glib_pollfds_poll ()
10 os_host_main_loop_wait (timeout=<optimized out>)
11 main_loop_wait (nonblocking=nonblocking@entry=0)
12 main_loop () at vl.c:1909
13 main (argc=<optimized out>, argv=<optimized out>, ...
Having analyzed the coredump, I understood that the reason is that
ioc_tag is reset in vnc_disconnect_start while ioc is cleaned up
in vnc_disconnect_finish. Between these two events the ioc_tag was,
for some reason, set again, so after vnc_disconnect_finish the handler
runs with a freed ioc, which leads to the segmentation fault.
The patch checks vs->disconnecting in the places where we call
qio_channel_add_watch and resets the handler if disconnecting == TRUE
to prevent such an occurrence.
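A minimal sketch of that check at one of the places where the watch is
(re-)armed (the surrounding code is illustrative):

if (vs->disconnecting) {
    /* teardown already started: do not re-arm the handler on a dying ioc */
    vs->ioc_tag = 0;
} else {
    vs->ioc_tag = qio_channel_add_watch(vs->ioc, G_IO_IN,
                                        vnc_client_io, vs, NULL);
}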
Signed-off-by: Klim Kireev <klim.kireev@virtuozzo.com>
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Message-id: 20180207094844.21402-1-klim.kireev@virtuozzo.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
# gpg: Signature made Thu 15 Feb 2018 17:50:22 GMT
# gpg: using RSA key BE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* remotes/berrange/tags/qio-next-pull-request:
allow to build with older sed
io/channel-command: Do not kill the child process after closing the pipe
io: Add /dev/fdset/ support to QIOChannelFile
io: Don't call close multiple times in QIOChannelFile
io: Fix QIOChannelFile when creating and opening read-write
io/channel-websock: handle continuous reads without any data
io: fix QIONetListener memory leak
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit 92b540dac9 introduced a counter to handle the timeouts in a
better way. But in case ccnt reaches 512, the current read character is
ignored - and if that character is part of the string that we are looking
for, the test fails to match the string.
Almost all of the tests look for a string within the first 512 bytes of
firmware output, so the problem never triggered there. But the hppa test
that has been added recently looks for a longer string at the very end of
a long output, thus there's a chance that we miss a character there so
that the test fails unexpectedly. Fix it by *not* reading and dropping a
character if the counter reaches 512.
Fixes: 92b540dac9
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1518761564-9899-1-git-send-email-thuth@redhat.com
[PMM: added initializer for nbd to silence false-positive warning
from OpenBSD 6 compiler]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The spapr-cap cap-ibs can only have the values broken or fixed, as there is
no explicit workaround required. Currently, setting the value workaround
for this cap will hit an assert if the guest makes the
h_get_cpu_characteristics hcall.
Report an error when attempting to apply the setting with a more helpful
error message.
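A hedged sketch of the shape of such a check in the cap's apply hook
(the constant name and the exact wording of the message are illustrative):

if (val == SPAPR_CAP_WORKAROUND) {
    error_setg(errp, "Requested safe indirect branch capability level "
               "\"workaround\" not valid, try \"fixed\"");
    return;
}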
Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
A few changes worth noting:
- Didn't migrate ctx->exception to DISAS_* since the exception field is
in many cases architecturally relevant.
- Moved the cross-page check from the end of translate_insn to tb_start.
- Removed the exit(1) after a TCG temp leak; changed the fprintf there to
qemu_log.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
A couple of notes:
- removed ctx->nip in favour of base->pc_next. Yes, it is annoying,
but didn't want to waste its 4 bytes.
- ctx->singlestep_enabled does a lot more than
base.singlestep_enabled; this confused me at first.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Several places in the code need to calculate a VCPU id:
(cpu_index / smp_threads) * spapr->vsmt + cpu_index % smp_threads
(core_id / smp_threads) * spapr->vsmt (1 user)
index * spapr->vsmt (2 users)
or guess that the VCPU id of a given VCPU is the first thread of a virtual
core:
index % spapr->vsmt != 0
Even if the numbering logic isn't that complex, it is rather fragile to
have these assumptions open-coded in several places. FWIW this was
proved with recent issues related to VSMT.
This patch moves the VCPU id formula to a single function to be called
everywhere the code needs to compute one. It also adds a helper to
guess whether a VCPU is the first thread of a VCORE.
Signed-off-by: Greg Kurz <groug@kaod.org>
[dwg: Rename spapr_is_vcore() to spapr_is_thread0_in_vcore() for clarity]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The spapr_vcpu_id() function is an accessor actually. Let's rename it
for symmetry with the recently added spapr_set_vcpu_id() helper.
The motivation behind this is that a later patch will consolidate
the VCPU id formula in a function and spapr_vcpu_id looks like an
appropriate name.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The VCPU ids are currently computed and assigned to each individual
CPU thread in spapr_cpu_core_realize(). But the numbering logic
of VCPU ids is actually a machine-level concept, and many places
in hw/ppc/spapr.c also have to compute VCPU ids out of CPU indexes.
The current formula used in spapr_cpu_core_realize() is:
vcpu_id = (cc->core_id * spapr->vsmt / smp_threads) + i
where:
cc->core_id is a multiple of smp_threads
cpu_index = cc->core_id + i
0 <= i < smp_threads
So we have:
cpu_index % smp_threads == i
cc->core_id / smp_threads == cpu_index / smp_threads
hence:
vcpu_id =
(cpu_index / smp_threads) * spapr->vsmt + cpu_index % smp_threads;
This formula was used before VSMT, at the time VCPU ids were computed
at the target emulation level. It has the advantage of being usable
to derive a VCPU id out of a CPU index only. It is suited for all the
places where the machine code has to compute a VCPU id.
This patch introduces an accessor to set the VCPU id in a PowerPCCPU object
using the above formula. It is a first step to consolidate all the VCPU id
logic in a single place.
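A minimal sketch of such an accessor based on the formula above (the
signature is illustrative; the real helper may take different arguments):

static void spapr_set_vcpu_id(PowerPCCPU *cpu, int cpu_index,
                              sPAPRMachineState *spapr)
{
    cpu->vcpu_id = (cpu_index / smp_threads) * spapr->vsmt
                   + cpu_index % smp_threads;
}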
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Since the introduction of VSMT in 2.11, the spacing of VCPU ids
between cores is controllable through a machine property instead
of being only dictated by the SMT mode of the host:
cpu->vcpu_id = (cc->core_id * spapr->vsmt / smp_threads) + i
Until recently, the machine code would try to change the SMT mode
of the host to be equal to VSMT or exit. This allowed the rest of
the code to assume that kvmppc_smt_threads() == spapr->vsmt is
always true.
Recent commit "8904e5a75005 spapr: Adjust default VSMT value for
better migration compatibility" relaxed the rule. If the VSMT
mode cannot be set in KVM for some reason, but the requested
CPU topology is compatible with the current SMT mode, then we
let the guest run with kvmppc_smt_threads() != spapr->vsmt.
This breaks quite a few places in the code, in particular when
calculating DRC indexes.
This is what happens on a POWER host with subcores-per-core=2 (ie,
supports up to SMT4) when passing the following topology:
-smp threads=4,maxcpus=16 \
-device host-spapr-cpu-core,core-id=4,id=core1 \
-device host-spapr-cpu-core,core-id=8,id=core2
qemu-system-ppc64: warning: Failed to set KVM's VSMT mode to 8 (errno -22)
This is expected since KVM is limited to SMT4, but the guest is started
anyway because this topology can run on SMT4 even with a VSMT8 spacing.
But when we look at the DT, things get nastier:
cpus {
...
ibm,drc-indexes = <0x4 0x10000000 0x10000004 0x10000008 0x1000000c>;
This means that we have the following association:
CPU core device | DRC | VCPU id
-----------------+------------+---------
boot core | 0x10000000 | 0
core1 | 0x10000004 | 4
core2 | 0x10000008 | 8
core3 | 0x1000000c | 12
But since the spacing of VCPU ids is 8, the DRC for core1 points to a
VCPU that doesn't exist, the DRC for core2 points to the first VCPU of
core1, and so on...
...
PowerPC,POWER8@0 {
...
ibm,my-drc-index = <0x10000000>;
...
};
PowerPC,POWER8@8 {
...
ibm,my-drc-index = <0x10000008>;
...
};
PowerPC,POWER8@10 {
...
No ibm,my-drc-index property for this core since 0x10000010 doesn't
exist in ibm,drc-indexes above.
...
};
};
...
interrupt-controller {
...
ibm,interrupt-server-ranges = <0x0 0x10>;
With a spacing of 8, the highest VCPU id for the given topology should be:
16 * 8 / 4 = 32 and not 16
...
linux,phandle = <0x7e7323b8>;
interrupt-controller;
};
And CPU hot-plug/unplug is broken:
(qemu) device_del core1
pseries-hotplug-cpu: Cannot find CPU (drc index 10000004) to remove
(qemu) device_del core2
cpu 4 (hwid 8) Ready to die...
cpu 5 (hwid 9) Ready to die...
cpu 6 (hwid 10) Ready to die...
cpu 7 (hwid 11) Ready to die...
These are the VCPU ids of core1 actually
(qemu) device_add host-spapr-cpu-core,core-id=12,id=core3
(qemu) device_del core3
pseries-hotplug-cpu: Cannot find CPU (drc index 1000000c) to remove
This patch changes all the code in hw/ppc/spapr.c to assume the VSMT
spacing when manipulating VCPU ids.
Fixes: 8904e5a750
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Change the macro that generates the vmstate migration field and the needed
function for the spapr-caps to take the full spapr-cap name. This has
the benefit that this instance will be picked up when grepping for the
spapr-caps, and it makes it more obvious what this macro is doing.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Move the necessary definitions into escc.h and update the type names.
Remove slavio_serial_ms_kbd_init().
Fix code style problems reported by checkpatch.pl.
Update mac_newworld, mac_oldworld and sun4m to use the QDEV interface
directly.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Newer kernels have an htab resize capability when adding or removing
memory. In these situations, the guest kernel might reallocate its
htab to a more suitable size based on the resulting memory.
However, we're not setting the new value back into the machine state
when a KVM guest resizes its htab. At first this doesn't seem harmful,
but when migrating or saving the guest state (via virsh managedsave,
for instance) this mismatch between the htab size of QEMU and the
kernel makes the guest hang when trying to load its state.
Inside h_resize_hpt_commit, the hypercall that commits the hash page
table resize changes, let's set spapr->htab_shift to the new value if
we're sure that kvmppc_resize_hpt_commit was successful.
While we're here, add a "not RADIX" sanity check as it is already done
in the related hypercall h_resize_hpt_prepare.
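A hedged sketch of the resulting flow inside h_resize_hpt_commit (argument
names are illustrative; error handling around it is omitted):

rc = kvmppc_resize_hpt_commit(cpu, flags, shift);
if (rc == 0) {
    /* the kernel really switched to the new HPT: record the new size */
    spapr->htab_shift = shift;
}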
Fixes: https://github.com/open-power-host-os/qemu/issues/28
Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
target-arm queue:
* aspeed: code cleanup to use unimplemented_device
* preparatory work for 'raspi3' RaspberryPi 3 machine model
* more SVE prep work
* v8M: add minor missing registers
* v7M: fix bug where we weren't migrating v7m.other_sp
* v7M: fix bugs in handling of interrupt registers for
external interrupts beyond 32
# gpg: Signature made Thu 15 Feb 2018 18:34:40 GMT
# gpg: using RSA key 3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20180215-1:
raspi: Raspberry Pi 3 support
bcm2836: Make CPU type configurable
target/arm: Implement v8M MSPLIM and PSPLIM registers
target/arm: Migrate v7m.other_sp
target/arm: Add AIRCR to vmstate struct
hw/intc/armv7m_nvic: Fix byte-to-interrupt number conversions
target/arm: Implement writing to CONTROL_NS for v8M
hw/intc/armv7m_nvic: Implement SCR
hw/intc/armv7m_nvic: Implement cache ID registers
hw/intc/armv7m_nvic: Implement v8M CPPWR register
hw/intc/armv7m_nvic: Implement M profile cache maintenance ops
hw/intc/armv7m_nvic: Fix ICSR PENDNMISET/CLR handling
hw/intc/armv7m_nvic: Don't hardcode M profile ID registers in NVIC
target/arm: Handle SVE registers when using clear_vec_high
target/arm: Enforce access to ZCR_EL at translation
target/arm: Suppress TB end for FPCR/FPSR
target/arm: Enforce FP access to FPCR/FPSR
target/arm: Remove ARM_CP_64BIT from ZCR_EL registers
hw/arm/aspeed: simplify using the 'unimplemented device' for aspeed_soc.io
hw/arm/aspeed: directly map the serial device to the system address space
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch adds Raspberry Pi 3 support to hw/arm/raspi.c. The
differences to Pi 2 are:
- Firmware address
- Board ID
- Board revision
The CPU is different too, but that's going to be configured as part of
the machine default CPU when we introduce a new machine type.
The patch was written from scratch by me but the logic is similar to
Zoltán Baldaszti's previous work, which I used as a reference (with
permission from the author):
https://github.com/bztsrc/qemu-raspi3
Signed-off-by: Pekka Enberg <penberg@iki.fi>
[PMM: fixed trailing whitespace on one line]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch adds a "cpu-type" property to BCM2836 SoC in preparation for
reusing the code for the Raspberry Pi 3, which has a different processor
model.
Signed-off-by: Pekka Enberg <penberg@iki.fi>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The v8M architecture includes hardware support for enforcing
stack pointer limits. We don't implement this behaviour yet,
but provide the MSPLIM and PSPLIM stack pointer limit registers
as reads-as-written, so that when we do implement the checks
in future this won't break guest migration.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180209165810.6668-12-peter.maydell@linaro.org
In commit abc24d86cc we accidentally broke migration of
the stack pointer value for the mode (process, handler) the CPU
is not currently running as. (The commit correctly removed the
no-longer-used v7m.current_sp flag from the VMState but also
deleted the still very much in use v7m.other_sp SP value field.)
Add a subsection to migrate it again. (We don't need to care
about trying to retain compatibility with pre-abc24d86cc0364f
versions of QEMU, because that commit bumped the version_id
and we've since bumped it again a couple of times.)
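A minimal sketch of such a subsection (subsection name and field path are
illustrative, not necessarily the ones used in the actual patch):

static const VMStateDescription vmstate_m_other_sp = {
    .name = "cpu/m/other-sp",
    .version_id = 1,
    .minimum_version_id = 1,
    .fields = (VMStateField[]) {
        VMSTATE_UINT32(env.v7m.other_sp, ARMCPU),
        VMSTATE_END_OF_LIST()
    },
};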
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180209165810.6668-11-peter.maydell@linaro.org
In many of the NVIC registers relating to interrupts, we
have to convert from a byte offset within a register set
into the number of the first interrupt which is affected.
We were getting this wrong for:
* reads of NVIC_ISPR<n>, NVIC_ISER<n>, NVIC_ICPR<n>, NVIC_ICER<n>,
NVIC_IABR<n> -- in all these cases we were missing the "* 8"
needed to convert from the byte offset to the interrupt number
(since all these registers use one bit per interrupt)
* writes of NVIC_IPR<n> had the opposite problem of a spurious
"* 8" (since these registers use one byte per interrupt)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180209165810.6668-9-peter.maydell@linaro.org
We were previously making the system control register (SCR)
just RAZ/WI. Although we don't implement the functionality
this register controls, we should at least provide the state,
including the banked state for v8M.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180209165810.6668-7-peter.maydell@linaro.org
M profile cores have a similar setup for cache ID registers
to A profile:
* Cache Level ID Register (CLIDR) is a fixed value
* Cache Type Register (CTR) is a fixed value
* Cache Size ID Registers (CCSIDR) are a bank of registers;
which one you see is selected by the Cache Size Selection
Register (CSSELR)
The only difference is that they're in the NVIC memory mapped
register space rather than being coprocessor registers.
Implement the M profile view of them.
Since neither Cortex-M3 nor Cortex-M4 implement caches,
we don't need to update their init functions and can leave
the ctr/clidr/ccsidr[] fields in their ARMCPU structs at zero.
Newer cores (like the Cortex-M33) will want to be able to
set these ID registers to non-zero values, though.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180209165810.6668-6-peter.maydell@linaro.org
The Coprocessor Power Control Register (CPPWR) is new in v8M.
It allows software to control whether coprocessors are allowed
to power down and lose their state. QEMU doesn't have any
notion of power control, so we choose the IMPDEF option of
making the whole register RAZ/WI (indicating that no coprocessors
can ever power down and lose state).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180209165810.6668-5-peter.maydell@linaro.org
For M profile cores, cache maintenance operations are done by
writing to special registers in the system register space.
For QEMU, cache operations are always NOPs, since we don't
implement the cache. Implementing these explicitly avoids
a spurious LOG_GUEST_ERROR when the guest uses them.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180209165810.6668-4-peter.maydell@linaro.org
The PENDNMISET/CLR bits in the ICSR should be RAZ/WI from
NonSecure state if the AIRCR.BFHFNMINS bit is zero. We had
misimplemented this as making the bits RAZ/WI from both
Secure and NonSecure states. Fix this bug by checking
attrs.secure so that Secure code can pend and unpend NMIs.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180209165810.6668-3-peter.maydell@linaro.org
Instead of hardcoding the values of M profile ID registers in the
NVIC, use the fields in the CPU struct. This will allow us to
give different M profile CPU types different ID register values.
This commit includes the addition of the missing ID_ISAR5,
which exists as RES0 in both v7M and v8M.
(The values of the ID registers might be wrong for the M4 --
this commit leaves the behaviour there unchanged.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20180209165810.6668-2-peter.maydell@linaro.org
sed's -E option may not be supported by older distros. As there's no
point using sed here at all, use just shell mechanisms to establish the
variable values, starting from the stem instead of the full target.
Signed-off-by: Jan Beulich <jbeulich@suse.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We are currently facing some migration failures on s390x when running
certain avocado-vt tests, e.g. when running the test
type_specific.io-github-autotest-qemu.migrate.with_reboot.exec.gzip_exec.
This test is using 'migrate -d "exec:nc localhost 5200"' for the migration.
The problem is detected at the receiving side, where the migration stream
apparently ends too early. However, the cause for the problem is at the
sending side: After writing the migration stream into the pipe to netcat,
the source QEMU calls qio_channel_command_close() which closes the pipe
and immediately (!) kills the child process afterwards (via the function
qio_channel_command_abort()). So if the sending netcat did not read the
final bytes from the pipe yet, or if it did not manage to send out all
its buffers yet, it is killed before the whole migration stream is passed
to the destination side.
QEMU cannot know how much time the child process requires to send
over all the migration data, so we should not kill it, either directly or
after a delay. Let's simply wait for the child process to exit gracefully
instead (this was also the behaviour of pclose() that was used in "exec:"
migration before the QIOChannel rework).
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Add /dev/fdset/ support to QIOChannelFile by calling qemu_open() instead
of open() and qemu_close() instead of close(). There is a subtle
semantic change since qemu_open() automatically sets O_CLOEXEC, but this
doesn't affect any of the users of the function.
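The change is essentially of this shape (the error handling shown is only
an illustrative sketch):

fd = qemu_open(path, flags, mode);   /* resolves /dev/fdset/NN, sets O_CLOEXEC */
if (fd < 0) {
    error_setg_errno(errp, errno, "Unable to open %s", path);
    return -1;
}
/* ... use the descriptor ... */
qemu_close(fd);                      /* also releases any fdset reference */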
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If the file descriptor underlying QIOChannelFile is closed in the
io_close() method, don't close it again in the finalize() method since
the file descriptor number may have been reused in the meantime.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The code wrongly passes the mode to open() only if O_WRONLY is set.
Instead, the mode should be passed when O_CREAT is set (or O_TMPFILE on
Linux). Fix this by always passing the mode since open() will correctly
ignore the mode if it is not needed. Add a testcase which exercises this
bug and also change the existing testcase to check that the mode of the
created file is correct.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
According to the current implementation of the websocket protocol in QEMU,
qio_channel_websock_handshake_io tries to read the handshake from the
channel to start communication over the socket. But this approach
doesn't cover the scenario where the socket is closed while handshaking:
if G_IO_IN is caught and qio_channel_read returns zero, an error has to
be set and the connection has to be terminated. Without that, the main
QEMU loop spins at 100% CPU, because its poll keeps receiving and
handling G_IO_IN events from the websocket.
Steps to reproduce the 100% CPU load:
1) start qemu with the simplest configuration
$ qemu -vnc [::1]:1,websocket=7500
2) open any vnc listener (which doesn't follow websocket
protocol)
$ vncviewer :7500
3) kill listener
4) qemu main thread eats 100% CPU
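A hedged sketch of the handshake read with the zero-return case handled
(buffer handling around it is illustrative):

ret = qio_channel_read(ioc, buffer, avail, errp);
if (ret < 0) {
    return -1;
}
if (ret == 0) {
    /* peer closed the socket mid-handshake: fail now instead of waiting
     * for further G_IO_IN events that will never carry data */
    error_setg(errp, "connection closed during websocket handshake");
    return -1;
}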
Signed-off-by: Edgar Kaziakhmedov <edgar.kaziakhmedov@virtuozzo.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The sources array does not escape out of qio_net_listener_wait_client, so
we have to free it.
Reported by Coverity.
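The fix boils down to freeing the array itself once the watches have been
dropped; a hedged sketch (the cleanup loop is shown only for context):

for (i = 0; i < listener->nsioc; i++) {
    g_source_unref(sources[i]);
}
g_free(sources);   /* the array holding the GSource pointers was leaked before */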
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Various improvements to the qtest checks:
- Clean-ups by Eric Blake with regards to the global_qtest variable
- Some more test cases for the boot-serial tester
- Re-activation of the m48t59-test
# gpg: Signature made Wed 14 Feb 2018 11:07:44 GMT
# gpg: using RSA key 2ED9D774FE702DB5
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>"
# gpg: aka "Thomas Huth <thuth@redhat.com>"
# gpg: aka "Thomas Huth <huth@tuxfamily.org>"
# gpg: aka "Thomas Huth <th.huth@posteo.de>"
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* remotes/huth/tags/pull-request-2018-02-14:
tests/m48t59: Use the m48t59 test on ppc, too
tests/Makefile: Derive check-qtest-ppc64-y from check-qtest-ppc-y
tests/m48t59: Make the test independent of global_qtest
tests/m48t59: Fix and re-enable the test for sparc
tests/boot-serial-test: Add support for the aarch64 virt machine
tests/boot-serial: Add tests for PowerPC Mac machines
tests/boot-serial: Enable the boot-serial test on SPARC machines, too
wdt_ib700-test: Drop dependence on global_qtest
tests/boot-sector: Drop dependence on global_qtest
qmp-test: Drop dependence on global_qtest
libqos: Use explicit QTestState for remaining libqos operations
libqos: Use explicit QTestState for ahci operations
libqos: Use explicit QTestState for i2c operations
libqos: Use explicit QTestState for rtas operations
libqos: Use explicit QTestState for fw_cfg operations
libqos: Track QTestState with QPCIBus
libqtest: Use qemu_strtoul()
tests: Clean up wait for event
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit dce8921b2b ("iothread: Stop threads
before main() quits") introduced iothread_stop_all() to avoid the
following virtio-scsi assertion failure:
assert(blk_get_aio_context(d->conf.blk) == s->ctx);
Back then the assertion failed because when bdrv_close_all() made
d->conf.blk NULL, blk_get_aio_context() returned the global AioContext
instead of s->ctx.
The same assertion can still fail today when vcpus submit new I/O
requests after iothread_stop_all() has moved the BDS to the global
AioContext.
This patch hardens the iothread_stop_all() approach by pausing vcpus
before calling iothread_stop_all().
Note that the assertion failure is a race condition. It is not possible
to reproduce it reliably.
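The resulting shutdown ordering is roughly the following (a hedged sketch;
pause_all_vcpus() is the assumed name of the vcpu pause helper):

pause_all_vcpus();    /* no vcpu can submit new I/O requests from here on */
iothread_stop_all();  /* BDSes move back to the global AioContext safely */
bdrv_close_all();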
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20180201110708.8080-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The ref405ep machine has a memory-mapped m48t59 device, so
we can run the m48t59 test on this machine, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Stop using the functions that require global_qtest here and pass
around the QTestState instead (global_qtest should finally get
removed since this causes problems with tests running in parallel).
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The m48t59 test has been disabled in commit baeddded5f
("sparc: disable qtest in make check"), likely due to some timing issues
in the bcd_check_time tests which might fail if it gets interrupted for
too long. It should be OK to re-enable this test if we make sure that we
do not run it on timing-sensitive machines, thus it should be OK if we only
run it in the g_test_slow() mode.
Additionally, there are two other issues:
First, the test can not run so easily on sparc64 anymore, since commit
f3b18f35a2 ("sun4u: switch m48t59 NVRAM to MMIO access")
moved the m48t59 device to the ebus instead, and for this you first
have to set up the corresponding PCI device (which is currently not
possible from within the m48t59 test). So we can only re-enable this
test on sparc, but not the sparc64 target.
Second, the fuzzing test is executed before the bcd-check-time test
(due to the naming of the tests), without having the base address set
up properly, so the fuzzing test does not really check anything at all.
Fix it by setting up the base address from the main function already
and by moving the qtest_start() to the tests themselves, so that each
test starts with a clean environment (since after the fuzzing, the clock
is unusable for the bcd-check-time test).
Signed-off-by: Thomas Huth <thuth@redhat.com>
This patch adds a small binary kernel to test aarch64 virt machine's
UART.
Signed-off-by: Wei Huang <wei@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[thuth: Fixed contextual conflicts with the hppa and sdhci patches]
Signed-off-by: Thomas Huth <thuth@redhat.com>
OpenBIOS prints out the CPU type on these machine types, so we can use
this string to test whether the CPU detection is working correctly.
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
OpenBIOS prints out the name of the detected CPU here, so looking for
this string is a nice test to verify that the CPU detection is still
working correctly.
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As a general rule, we prefer avoiding implicit global state
because it makes code harder to safely copy and paste without
thinking about the global state. Improve this test to be
explicit about the state.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As a general rule, we prefer avoiding implicit global state
because it makes code harder to safely copy and paste without
thinking about the global state. Adjust the helper code to
use explicit state instead, and update all callers.
Fix some trailing whitespace while touching the file.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
As a general rule, we prefer avoiding implicit global state
because it makes code harder to safely copy and paste without
thinking about the global state. Although qmp-test does not
maintain parallel qtest connections, it was the last test
assigning to global_qtest. It's just as easy to be explicit
about the state; once all tests have been cleaned up, a later
patch can then get rid of global_qtest and a layer of wrappers
in libqtest.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Drop one more client of global_qtest by teaching all remaining
libqos stragglers to pass in an explicit QTestState. Change the
setting of global_qtest from being implicit in libqos' call to
qtest_start() to instead be explicit in all clients that are
still relying on global_qtest.
Note that qmp_execute() can be greatly simplified in the process,
and that we also get rid of interpolation of a JSON string into a
temporary variable when qtest_qmp() can do it more reliably.
Signed-off-by: Eric Blake <eblake@redhat.com>
Acked-by: Greg Kurz <groug@kaod.org>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Drop one more client of global_qtest by teaching all ahci test
functionality to pass in an explicit QTestState. The state was
already available, so no callers had to be adjusted.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Drop one more client of global_qtest by teaching all i2c test
functionality to pass in an explicit QTestState, adjusting all
callers.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Drop one more client of global_qtest by teaching all rtas test
functionality to pass in an explicit QTestState, adjusting all
callers.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[thuth: Use nicer indentation in rtas.h]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Drop one more client of global_qtest by teaching all fw_cfg test
functionality (invoked through alloc-pc) to pass in an explicit
QTestState, adjusting all callers. In particular, fw_cfg-test
had to reorder things to create the test state prior to creating
the fw_cfg (and drop a pointless strdup in the meantime), but that
test now no longer depends on global_qtest.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[thuth: Fixed conflict wrt pc_alloc_init() in vhost-user-test.c]
Signed-off-by: Thomas Huth <thuth@redhat.com>
When initializing a QPCIBus, track which QTestState the bus is
associated with (so that a later patch can then explicitly use
that test state for all communication on the bus, rather than
blindly relying on global_qtest). Update the initialization
functions to take another parameter, and update all callers to
pass in state (for now, most callers get away with passing the
current global_qtest as the current state, although this required
fixing the order of initialization to ensure qtest_start() is
called before qpci_init*() in rtl8139-test, and provided an
opportunity to pass in the allocator in e1000e-test).
Touch up some allocations to use g_new0() rather than g_malloc()
while in the area, and simplify some code (all implementations
of QOSOps provide a .init_allocator() that never fails).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[thuth: Removed hunk from vhost-user-test.c that is not required anymore,
fixed conflict in qtest_vboot() and adjusted qpci_init_pc() in sdhci-test]
Signed-off-by: Thomas Huth <thuth@redhat.com>
We will not allow failures to happen when sending data from destination
to source via the return path. However it is possible that there can be
errors along the way. This patch allows the migrate_send_rp_message()
to return error when it happens, and further extended it to
migrate_send_rp_req_pages().
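A hedged sketch of how a caller propagates the new error return (message
type and argument names are illustrative):

ret = migrate_send_rp_message(mis, MIG_RP_MSG_REQ_PAGES, msglen, bufc);
if (ret < 0) {
    return ret;   /* let the caller tear down the return path */
}
return 0;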
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180208103132.28452-9-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
It was previously only used for quitting the page fault thread. Let it be
something more useful - now we can use it to notify a "wake" for the
page fault thread (for any reason), and it only means "quit" if
fault_thread_quit is set.
Since we changed what it does, rename it to userfault_event_fd.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180208103132.28452-3-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
If postcopy goes down for some reason, we can always see this on the destination:
qemu-system-x86_64: RP: Received invalid message 0x0000 length 0x0000
However in most cases that's not the real issue. The problem is that
qemu_get_be16() has no way to show whether the returned data is valid or
not, and we are _always_ assuming it is valid. That's possibly not wise.
The best approach to solve this would be to refactor the QEMUFile interface
to allow the APIs to return an error when there is one. However that needs
quite a bit of work and testing. For now, let's explicitly check the
validity first before using the data in all places for qemu_get_*().
This patch tries to fix most of the cases I can see. Only with this can
we make sure we are processing valid data, and also that we capture the
channel-down events correctly.
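The per-read validity check looks roughly like this (qemu_file_get_error()
and the mark_source_rp_bad() helper are assumptions used for illustration):

header_type = qemu_get_be16(rp);
header_len  = qemu_get_be16(rp);
if (qemu_file_get_error(rp)) {
    /* the channel is broken: do not interpret the values we just read */
    mark_source_rp_bad(ms);
    goto out;
}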
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180208103132.28452-2-peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The boot block used in the migration test is currently only
shipped as hex (with the source in the git commit message of ea0c6d62);
change this to actually include the source.
A script is added to rebuild the header but the expectation is that
the generated hex is shipped as well as the .s, so that
there's no requirement to have just the right assembler etc.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180213100606.5379-1-dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Removed blank line at end of script
QEMUFile uses buffered IO so when writing small amounts (such as the Xen
device state file), the actual write call and any errors that may occur
only happen as part of qemu_fclose(). Therefore, report IO errors when
saving the device state under Xen by checking the return value of
qemu_fclose().
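A hedged sketch of the check (variable names and the exact error message
are illustrative):

ret = qemu_fclose(f);   /* flushes buffered writes; negative on failure */
if (ret < 0) {
    error_setg_errno(errp, -ret, "Error writing Xen device state");
    return ret;
}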
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Message-Id: <20180206163039.23661-1-ross.lagerwall@citrix.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This patch implements the movep instruction. It moves data between a data
register and alternate bytes within the address space, starting at the
specified location and incrementing by two.
It was designed for the original 68000 and used in firmware for
interfacing 8-bit peripherals through the 16-bit data bus.
Without this patch the opcode for this instruction is recognized as some
bitop.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Signed-off-by: Mihail Abakumov <mikhail.abakumov@ispras.ru>
Tested-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180206124431.31433.91946.stgit@pasha-VirtualBox>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
This will keep checkpatch happy when the next patch does code motion.
Fix the include order to match HACKING when adding the needed header.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
We still use hacks like qmp("") to wait for an event, even though we
have qmp_eventwait() since commit 8fe941f, and qmp_eventwait_ref()
since commit 7ffe312. Both commits neglected to convert all the
existing hacks. Make up what they missed.
Bonus: gets rid of empty format strings. A step towards compile-time
format string checking without triggering -Wformat-zero-length.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
[thuth: dropped the hunks from the usb tests - not needed anymore]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Block layer patches
# gpg: Signature made Tue 13 Feb 2018 17:03:11 GMT
# gpg: using RSA key 7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (55 commits)
iotests: Add l2-cache-entry-size to iotest 137
iotests: Test downgrading an image using a small L2 slice size
iotests: Test valid values of l2-cache-entry-size
qcow2: Allow configuring the L2 slice size
qcow2: Rename l2_table in count_cow_clusters()
qcow2: Rename l2_table in count_contiguous_clusters_unallocated()
qcow2: Rename l2_table in count_contiguous_clusters()
qcow2: Rename l2_table in qcow2_alloc_compressed_cluster_offset()
qcow2: Update qcow2_truncate() to support L2 slices
qcow2: Update expand_zero_clusters_in_l1() to support L2 slices
qcow2: Prepare expand_zero_clusters_in_l1() for adding L2 slice support
qcow2: Read refcount before L2 table in expand_zero_clusters_in_l1()
qcow2: Update qcow2_update_snapshot_refcount() to support L2 slices
qcow2: Prepare qcow2_update_snapshot_refcount() for adding L2 slice support
qcow2: Update zero_single_l2() to support L2 slices
qcow2: Update discard_single_l2() to support L2 slices
qcow2: Update handle_alloc() to support L2 slices
qcow2: Update handle_copied() to support L2 slices
qcow2: Update qcow2_alloc_cluster_link_l2() to support L2 slices
qcow2: Update qcow2_get_cluster_offset() to support L2 slices
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
virtio,vhost,pci,pc: features, fixes and cleanups
- new stats in virtio balloon
- virtio eventfd rework for boot speedup
- vhost memory rework for boot speedup
- fixes and cleanups all over the place
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Tue 13 Feb 2018 16:29:55 GMT
# gpg: using RSA key 281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (22 commits)
virtio-balloon: include statistics of disk/file caches
acpi-test: update FADT
lpc: drop pcie host dependency
tests: acpi: fix FADT not being compared to reference table
hw/pci-bridge: fix pcie root port's IO hints capability
libvhost-user: Support across-memory-boundary access
libvhost-user: Fix resource leak
virtio-balloon: unref the memory region before continuing
pci: removed the is_express field since a uniform interface was inserted
virtio-blk: enable multiple vectors when using multiple I/O queues
pci/bus: let it has higher migration priority
pci-bridge/i82801b11: clear bridge registers on platform reset
vhost: Move log_dirty check
vhost: Merge and delete unused callbacks
vhost: Clean out old vhost_set_memory and friends
vhost: Regenerate region list from changed sections list
vhost: Merge sections added to temporary list
vhost: Simplify ring verification checks
vhost: Build temporary section list and deref after commit
virtio: improve virtio devices initialization time
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Previous commit ("tests: acpi: fix FADT not being compared to reference table")
started tracking changes to the FADT. Generate the expected FACP files -
apparently these weren't updated since 2013.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It turns out that the FADT isn't actually tested for changes
against the reference table, since it happens to be the first
table in the RSDT, which is currently ignored.
Fix it by making sure that all tables from the RSDT are added
to the test list.
NOTE: FADT contains guest allocated pointers to FACS/DSDT,
zero them out so that possible FACS/DSDT address change
won't affect test results.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The gen_pcie_root_port mem-reserve and pref32-reserve properties are
defined as size (so uint64_t), but passed as uint32_t when building
the 'IO hints' vendor specific capability.
Passing 4G (or more) gets truncated and passed as a zero reservation.
It is not a huge issue since the guest firmware will always compare the
hints with the default value and take the maximum.
Fix it by passing the values as uint64_t and failing to init the
gen_pcie_root_port if invalid values are used.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The sg list/indirect descriptor table may be contiguous
in GPA but not in the HVA address space, but libvhost-user
wasn't aware of that. This would cause out-of-bounds
access. A malicious guest could even use it to get
information from the vhost-user backend.
Introduce a plen parameter in vu_gpa_to_va() so we can
handle this case, returning the actual mapped length.
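A hedged sketch of a caller using the new parameter (descriptor handling
and vu_panic() usage are illustrative):

uint64_t len = desc->len;
void *va = vu_gpa_to_va(dev, &len, desc->addr);
if (!va || len != desc->len) {
    /* mapping missing or shorter than the descriptor: reject the request
     * instead of reading past the end of the mapped region */
    vu_panic(dev, "descriptor crosses a memory region boundary");
    return false;
}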
Signed-off-by: Yongji Xie <xieyongji@baidu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Free the mmapped memory when we need to mmap a new memory
space in vu_set_mem_table_exec() and vu_set_log_base_exec() to
avoid a memory leak.
Also close the corresponding fd after mmap() in
vu_set_log_base_exec() to avoid an fd leak.
Signed-off-by: Yongji Xie <xieyongji@baidu.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Block patches for the block queue
# gpg: Signature made Tue Feb 13 17:00:13 2018 CET
# gpg: using RSA key F407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* mreitz/tags/pull-block-2018-02-13: (40 commits)
iotests: Add l2-cache-entry-size to iotest 137
iotests: Test downgrading an image using a small L2 slice size
iotests: Test valid values of l2-cache-entry-size
qcow2: Allow configuring the L2 slice size
qcow2: Rename l2_table in count_cow_clusters()
qcow2: Rename l2_table in count_contiguous_clusters_unallocated()
qcow2: Rename l2_table in count_contiguous_clusters()
qcow2: Rename l2_table in qcow2_alloc_compressed_cluster_offset()
qcow2: Update qcow2_truncate() to support L2 slices
qcow2: Update expand_zero_clusters_in_l1() to support L2 slices
qcow2: Prepare expand_zero_clusters_in_l1() for adding L2 slice support
qcow2: Read refcount before L2 table in expand_zero_clusters_in_l1()
qcow2: Update qcow2_update_snapshot_refcount() to support L2 slices
qcow2: Prepare qcow2_update_snapshot_refcount() for adding L2 slice support
qcow2: Update zero_single_l2() to support L2 slices
qcow2: Update discard_single_l2() to support L2 slices
qcow2: Update handle_alloc() to support L2 slices
qcow2: Update handle_copied() to support L2 slices
qcow2: Update qcow2_alloc_cluster_link_l2() to support L2 slices
qcow2: Update qcow2_get_cluster_offset() to support L2 slices
...
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Now that the code is ready to handle L2 slices we can finally add an
option to allow configuring their size.
An L2 slice is the portion of an L2 table that is read by the qcow2
cache. Until now the cache was always reading full L2 tables, and
since the L2 table size is equal to the cluster size this was not very
efficient with large clusters. Here's a more detailed explanation of
why it makes sense to have smaller cache entries in order to load L2
data:
https://lists.gnu.org/archive/html/qemu-block/2017-09/msg00635.html
This patch introduces a new command-line option to the qcow2 driver
named l2-cache-entry-size (cf. l2-cache-size). The cache entry size
has the same restrictions as the cluster size: it must be a power of
two and it has the same range of allowed values, with the additional
requirement that it must not be larger than the cluster size.
The L2 cache entry size (L2 slice size) remains equal to the cluster
size for now by default, so this feature must be explicitly enabled.
Although my tests show that 4KB slices consistently improve
performance and give the best results, let's wait and make more tests
with different cluster sizes before deciding on an optimal default.
Now that the cache entry size is not necessarily equal to the cluster
size we need to reflect that in the MIN_L2_CACHE_SIZE documentation.
That minimum value is a requirement of the COW algorithm: we need to
read two L2 slices (and not two L2 tables) in order to do COW, see
l2_allocate() for the actual code.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: c73e5611ff4a9ec5d20de68a6c289553a13d2354.1517840877.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
The qcow2_truncate() code is mostly independent from whether
we're using L2 slices or full L2 tables, but in full and
falloc preallocation modes new L2 tables are allocated using
qcow2_alloc_cluster_link_l2(). Therefore the code needs to be
modified to ensure that all nb_clusters that are processed in each
call can be allocated with just one L2 slice.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 1fd7d272b5e7b66254a090b74cf2bed1cc334c0e.1517840877.git.berto@igalia.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
expand_zero_clusters_in_l1() expands zero clusters as a necessary step
to downgrade qcow2 images to a version that doesn't support metadata
zero clusters. This function takes an L1 table (which may or may not
be active) and iterates over all its L2 tables looking for zero
clusters.
Since we'll be loading L2 slices instead of full tables we need to add
an extra loop that iterates over all slices of each L2 table, and we
should also use the slice size when allocating the buffer used when
the L1 table is not active.
This function doesn't need any additional changes so apart from that
this patch simply updates the variable name from l2_table to l2_slice.
Finally, and since we have to touch the bdrv_read() / bdrv_write()
calls anyway, this patch takes the opportunity to replace them with
the byte-based bdrv_pread() / bdrv_pwrite().
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 43590976f730501688096cff103f2923b72b0f32.1517840877.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Adding support for L2 slices to expand_zero_clusters_in_l1() needs
(among other things) an extra loop that iterates over all slices of
each L2 table.
Putting all changes in one patch would make it hard to read because
all semantic changes would be mixed with pure indentation changes.
To make things easier this patch simply creates a new block and
changes the indentation of all lines of code inside it. Thus, all
modifications in this patch are cosmetic. There are no semantic
changes and no variables are renamed yet. The next patch will take
care of that.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: c2ae9f31ed5b6e591477ad4654448badd1c89d73.1517840877.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
At the moment it doesn't really make a difference whether we call
qcow2_get_refcount() before of after reading the L2 table, but if we
want to support L2 slices we'll need to read the refcount first.
This patch simply changes the order of those two operations to prepare
for that. The patch with the actual semantic changes will be easier to
read because of this.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 947a91d934053a2dbfef979aeb9568f57ef57c5d.1517840877.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
qcow2_update_snapshot_refcount() increases the refcount of all
clusters of a given snapshot. In order to do that it needs to load all
its L2 tables and iterate over their entries. Since we'll be loading
L2 slices instead of full tables we need to add an extra loop that
iterates over all slices of each L2 table.
This function doesn't need any additional changes so apart from that
this patch simply updates the variable name from l2_table to l2_slice.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 5f4db199b9637f4833b58487135124d70add8cf0.1517840877.git.berto@igalia.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Adding support for L2 slices to qcow2_update_snapshot_refcount() needs
(among other things) an extra loop that iterates over all slices of
each L2 table.
Putting all changes in one patch would make it hard to read because
all semantic changes would be mixed with pure indentation changes.
To make things easier this patch simply creates a new block and
changes the indentation of all lines of code inside it. Thus, all
modifications in this patch are cosmetic. There are no semantic
changes and no variables are renamed yet. The next patch will take
care of that.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 8ffaa5e55bd51121f80e498f4045b64902a94293.1517840877.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
zero_single_l2() limits the number of clusters to be zeroed to the
amount that fits inside an L2 table. Since we'll be loading L2 slices
instead of full tables we need to update that limit. The function is
renamed to zero_in_l2_slice() for clarity.
Apart from that, this function doesn't need any additional changes, so
this patch simply updates the variable name from l2_table to l2_slice.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: ebc16e7e79fa6969d8975ef487d679794de4fbcc.1517840877.git.berto@igalia.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
discard_single_l2() limits the number of clusters to be discarded
to the amount that fits inside an L2 table. Since we'll be loading
L2 slices instead of full tables we need to update that limit. The
function is renamed to discard_in_l2_slice() for clarity.
Apart from that, this function doesn't need any additional changes, so
this patch simply updates the variable name from l2_table to l2_slice.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 1cb44a5b68be5334cb01b97a3db3a3c5a43396e5.1517840877.git.berto@igalia.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
handle_alloc() loads an L2 table and limits the number of checked
clusters to the amount that fits inside that table. Since we'll be
loading L2 slices instead of full tables we need to update that limit.
Apart from that, this function doesn't need any additional changes, so
this patch simply updates the variable name from l2_table to l2_slice.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: b243299c7136f7014c5af51665431ddbf5e99afd.1517840877.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
handle_copied() loads an L2 table and limits the number of checked
clusters to the amount that fits inside that table. Since we'll be
loading L2 slices instead of full tables we need to update that limit.
Apart from that, this function doesn't need any additional changes, so
this patch simply updates the variable name from l2_table to l2_slice.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 541ac001a7d6b86bab2392554bee53c2b312148c.1517840877.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
There's a loop in this function that iterates over the L2 entries in a
table, so now we need to assert that it remains within the limits of
an L2 slice.
Apart from that, this function doesn't need any additional changes, so
this patch simply updates the variable name from l2_table to l2_slice.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: f9846a1c2efc51938e877e2a25852d9ab14797ff.1517840877.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
qcow2_get_cluster_offset() checks how many contiguous bytes are
available at a given offset. The returned number of bytes is limited
by the amount that can be addressed without having to load more than
one L2 table.
Since we'll be loading L2 slices instead of full tables this patch
changes the limit accordingly using the size of the L2 slice for the
calculations instead of the full table size.
One consequence of this is that with small L2 slices operations such
as 'qemu-img map' will need to iterate in more steps because each
qcow2_get_cluster_offset() call will potentially return a smaller
number. However the code is already prepared for that so this doesn't
break semantics.
The l2_table variable is also renamed to l2_slice to reflect this, and
offset_to_l2_index() is replaced with offset_to_l2_slice_index().
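For reference, a minimal sketch of such a slice index helper, assuming the
usual BDRVQcow2State fields:

    /* sketch: index of a guest offset within its L2 slice */
    static int offset_to_l2_slice_index(BDRVQcow2State *s, int64_t offset)
    {
        return (offset >> s->cluster_bits) & (s->l2_slice_size - 1);
    }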
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 6b602260acb33da56ed6af9611731cb7acd110eb.1517840877.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This patch updates l2_allocate() to support the qcow2 cache returning
L2 slices instead of full L2 tables.
The old code simply gets an L2 table from the cache and initializes it
with zeroes or with the contents of an existing table. With a cache
that returns slices instead of tables the idea remains the same, but
the code must now iterate over all the slices that are contained in an
L2 table.
Since now we're operating with slices the function can no longer
return the newly-allocated table, so it's up to the caller to retrieve
the appropriate L2 slice after calling l2_allocate() (note that with
this patch the caller is still loading full L2 tables, but we'll deal
with that in a separate patch).
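The shape of the new loop is roughly the following; this is a sketch with
assumed identifiers, not the exact patch:

    /* sketch: initialize the new L2 table one slice at a time */
    slice_size = s->l2_slice_size * sizeof(uint64_t);
    n_slices = s->cluster_size / slice_size;
    for (i = 0; i < n_slices; i++) {
        ret = qcow2_cache_get_empty(bs, s->l2_table_cache,
                                    l2_offset + i * slice_size,
                                    (void **) &l2_slice);
        if (ret < 0) {
            goto fail;
        }
        /* zero the slice or copy the matching part of the old table,
         * then mark it dirty and release it back to the cache */
    }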
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20fc0415bf0e011e29f6487ec86eb06a11f37445.1517840877.git.berto@igalia.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Adding support for L2 slices to l2_allocate() needs (among other
things) an extra loop that iterates over all slices of a new L2 table.
Putting all changes in one patch would make it hard to read because
all semantic changes would be mixed with pure indentation changes.
To make things easier this patch simply creates a new block and
changes the indentation of all lines of code inside it. Thus, all
modifications in this patch are cosmetic. There are no semantic
changes and no variables are renamed yet. The next patch will take
care of that.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: d0d7dca8520db304524f52f49d8157595a707a35.1517840877.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Each entry in the qcow2 L2 cache stores a full L2 table (which uses a
complete cluster in the qcow2 image). A cluster is usually too large
to be used efficiently as the size for a cache entry, so we want to
decouple both values by allowing smaller cache entries. Therefore the
qcow2 L2 cache will no longer return full L2 tables but slices
instead.
This patch updates l2_load() so it can handle L2 slices correctly.
Apart from the offset of the L2 table (which we already had) we also
need the guest offset in order to calculate which one of the slices
we need.
An L2 slice currently has the same size as an L2 table (one cluster),
so for now this function will load exactly the same data as before.
This patch also removes a stale comment about the return value being
a pointer to the L2 table. This function returns an error code since
55c17e9821.
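A sketch of the resulting lookup (identifiers assumed): the guest offset
selects which slice of the table to request from the cache.

    /* sketch: byte offset of the wanted slice within the L2 table */
    start_of_slice = sizeof(uint64_t) *
        (offset_to_l2_index(s, offset) - offset_to_l2_slice_index(s, offset));
    ret = qcow2_cache_get(bs, s->l2_table_cache, l2_offset + start_of_slice,
                          (void **) &l2_slice);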
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: b830aa1fc5b6f8e3cb331d006853fe22facca847.1517840877.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
The BDRVQcow2State structure contains an l2_size field, which stores
the number of 64-bit entries in an L2 table.
For efficiency reasons we want to be able to load slices instead of
full L2 tables, so we need to know how many entries an L2 slice can
hold.
An L2 slice is the portion of an L2 table that is loaded by the qcow2
cache. At the moment that cache can only load complete tables,
therefore an L2 slice has the same size as an L2 table (one cluster)
and l2_size == l2_slice_size.
Later we'll allow smaller slices, but until then we have to use this
new l2_slice_size field to make the rest of the code ready for that.
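A sketch of the initialization while slices still span a whole table (field
names assumed):

    /* sketch: a slice currently covers a full L2 table */
    s->l2_size = 1 << s->l2_bits;                           /* entries per table */
    s->l2_slice_size = s->cluster_size / sizeof(uint64_t);  /* entries per slice */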
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: adb048595f9fb5dfb110c802a8b3c3be3b937f37.1517840877.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Similar to offset_to_l2_index(), this function returns the index in
the L1 table for a given guest offset. This is only used in a couple
of places and it's not a particularly complex calculation, but it
makes the code a bit more readable.
Although in the qcow2_get_cluster_offset() case the old code was
taking advantage of the l1_bits variable, we're going to get rid of
the other uses of l1_bits in a later patch anyway, so it doesn't make
sense to keep it just for this.
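A minimal sketch of what such a helper looks like, assuming it sits next to
offset_to_l2_index():

    /* sketch: which L1 entry covers a given guest offset */
    static int offset_to_l1_index(BDRVQcow2State *s, uint64_t offset)
    {
        return offset >> (s->l2_bits + s->cluster_bits);
    }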
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: a5f626fed526b7459a0425fad06d823d18df8522.1517840877.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
The table size in the qcow2 cache is currently equal to the cluster
size. This doesn't allow us to use the cache memory efficiently,
particularly with large cluster sizes, so we need to be able to have
smaller cache tables that are independent from the cluster size. This
patch adds a new field to Qcow2Cache that we can use instead of the
cluster size.
The current table size is still being initialized to the cluster size,
so there are no semantic changes yet, but this patch will allow us to
prepare the rest of the code and simplify a few function calls.
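As an illustration, entry addresses inside the cache can then be derived from
the new field rather than the cluster size; a sketch with assumed names:

    /* sketch: locate a cache entry using the cache's own table size */
    static inline void *table_addr(Qcow2Cache *c, int i)
    {
        return (uint8_t *) c->table_array + (size_t) i * c->table_size;
    }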
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 67a1bf9e55f417005c567bead95a018dc34bc687.1517840876.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
To maintain loading and storing of disabled bitmaps there is a new approach:
- deprecate @autoload flag of block-dirty-bitmap-add, make it ignored
- store enabled bitmaps as "auto" to qcow2
- store disabled bitmaps without "auto" flag to qcow2
- on qcow2 open load "auto" bitmaps as enabled and others
as disabled (except in_use bitmaps)
Also, adjust iotests 165 and 176 appropriately.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-id: 20180202160752.143796-1-vsementsov@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
sd_prealloc() will now preallocate the area [old_size, new_size). As
before, it rounds to buf_size and may thus overshoot and preallocate
areas that were not requested to be preallocated. For image creation,
this is no change in behavior. For truncation, this is in accordance
with the documentation for preallocated truncation.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We want to use this function in sd_truncate() later on, so taking a
filename is not exactly ideal.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
By using qemu_gluster_do_truncate() in qemu_gluster_truncate(), we now
automatically have preallocated truncation.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of expecting the current size to be 0, query it and allocate
only the area [current_size, offset) if preallocation is requested.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Pull out the truncation code from the qemu_gluster_create() function so
we can later reuse it in qemu_gluster_truncate().
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
glfs_close() is a classical clean-up operation, as can be seen by the
fact that it is executed even if the truncation before it failed.
Also, moving it to clean-up makes it more clear that if it fails, we do
not want it to overwrite the current ret value if that signifies an
error already.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Now that memory_region_sync_dirty_bitmap is NULL, we can unify its
loop with memory_global_dirty_log_sync's. The only difference is
that memory_region_sync_dirty_bitmap will no longer call log_sync on
FlatRanges that do have a zero dirty_log_mask, but this is okay because
video memory is always registered with the dirty page logging mechanism.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Simplify the users of memory_region_snapshot_and_clear_dirty, so
that they do not have to call memory_region_sync_dirty_bitmap
explicitly.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This removes the last user of memory_region_test_and_clear_dirty
outside memory.c.
Tested-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The 64-bit ADMA address is not converted to the CPU endianness correctly.
This patch fixes the issue and uses a valid mask for the attribute data.
Signed-off-by: Sai Pavan Boddu <saipava@xilinx.com>
[AF: Re-write commit message]
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-Id: <20180208164818.7961-13-f4bug@amsat.org>
qemu-io puts the TTY into non-canonical mode, which means no EOF processing is
done and thus getchar() will never return the EOF constant. Instead we have to
query the TTY attributes to determine the configured EOF character (usually
Ctrl-D / 0x4), and then explicitly check for that value. This fixes the
regression that prevented Ctrl-D from triggering an exit of qemu-io that has
existed since readline was first added in
commit 0cf17e1817
Author: Stefan Hajnoczi <stefanha@redhat.com>
Date: Thu Nov 14 11:54:17 2013 +0100
qemu-io: use readline.c
It also ensures that a newline is printed when exiting, to complete the
line output by the "qemu-io> " prompt.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Only a few select machine types support floppy drives and there is
actually nothing preventing us from using virtio here, so let's do it.
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Even if an op blocker is present for BLOCK_OP_TYPE_MIRROR_SOURCE,
it is checked a bit late and the result is that the target is
created even if drive-mirror subsequently fails. Add an early
check to avoid this.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
g_realloc() aborts the program if it fails to allocate the required
amount of memory. We want to detect that scenario and return an error
instead, so let's use g_try_realloc().
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Split options out of the "@table @var" section and create a "@table
@option", then use whitespaces and blank lines consistently.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kashyap Chamarthy <kchamart@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This reverts commit 76bf133c4, which updated the reference output, and
instead fixes the reference image, because the code path we want to
exercise is actually the invalid image size.
The descriptor block in the image, which includes the CID to verify, has been
invalid since the reference image was added. Since commit 9877860e7b we report
this error earlier than the "file too large", so 059.out mismatches.
The binary change is generated with the following operations:
$ bunzip2 afl9.vmdk.bz2
$ qemu-img create -f vmdk fix.vmdk 1G
$ dd if=afl9.vmdk of=fix.vmdk bs=512 count=1 conv=notrunc
$ mv fix.vmdk afl9.vmdk
$ bzip2 afl9.vmdk
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Define two functions to update the interrupt state, and call them
on loadvm. This removes the need to migrate the state as part of
vmstate_kvaser_pci.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The core SJA1000 support is independent of the following
patches, which map the SJA1000 chip to PCI boards.
The work is based on Jin Yang's GSoC 2013 work, funded
by Google and mentored in the frame of the RTEMS project's GSoC
slot donated to QEMU.
Rewritten for QEMU 2.0+ versions, with architecture cleanup,
by Pavel Pisa (Czech Technical University in Prague).
Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Connection to a real host CAN bus network through the
SocketCAN network interface is available only on Linux
host systems. The mechanism is generic; support for other
CAN APIs and operating systems can be implemented in the future.
Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A CanBusState structure is created for each
emulated CAN channel. Individual clients/emulated
CAN interfaces or host interface connections register
with the bus via a CanBusClientState structure.
The CAN core is prepared to support connection to the
real host CAN bus network. The commit with such support
for Linux SocketCAN follows.
The implementation is as simple as possible. There is no state to be
migrated, and message prioritization and queuing are not considered
for now, but it is intended to be extended when the need arises.
Development repository and more documentation at
https://gitlab.fel.cvut.cz/canbus/qemu-canbus
The work is based on Jin Yang's GSoC 2013 work, funded
by Google and mentored in the frame of the RTEMS project's GSoC
slot donated to QEMU.
Rewritten for QEMU 2.0+ versions, with architecture cleanup,
by Pavel Pisa (Czech Technical University in Prague).
Signed-off-by: Pavel Pisa <pisa@cmp.felk.cvut.cz>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since HAX_VM_IOCTL_ALLOC_RAM takes a 32-bit size, it cannot handle
RAM blocks of 4GB or larger, which is why HAXM can only run guests
with less than 4GB of RAM. Solve this problem by utilizing the new
HAXM API, HAX_VM_IOCTL_ADD_RAMBLOCK, which takes a 64-bit size, to
register RAM blocks with the HAXM kernel module. The new API was
first added in HAXM 7.0.0, and its availability can be confirmed
by the presence of the HAX_CAP_64BIT_RAMBLOCK capability flag.
When the guest RAM size reaches 7GB, QEMU will ask HAXM to set up a
memory mapping that covers a 4GB region, which will fail, because
HAX_VM_IOCTL_SET_RAM also takes a 32-bit size. Work around this
limitation by splitting the large mapping into small ones and
calling HAX_VM_IOCTL_SET_RAM multiple times.
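The workaround amounts to a simple chunking loop; the following sketch uses an
assumed helper name and per-call limit, it is not the actual HAXM interface:

    /* sketch: split a large mapping so each piece fits the 32-bit size
     * field; hax_set_ram() and the 2 GB limit are illustrative only */
    static int set_ram_chunked(uint64_t start_pa, uint64_t size,
                               uint64_t host_va, uint8_t flags)
    {
        const uint64_t max_chunk = 0x80000000ULL;
        while (size > 0) {
            uint32_t chunk = (uint32_t) MIN(size, max_chunk);
            int ret = hax_set_ram(start_pa, chunk, host_va, flags);
            if (ret) {
                return ret;
            }
            start_pa += chunk;
            host_va += chunk;
            size -= chunk;
        }
        return 0;
    }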
Bug: https://bugs.launchpad.net/qemu/+bug/1735576
Signed-off-by: Yu Ning <yu.ning@intel.com>
Message-Id: <1515752555-12784-1-git-send-email-yu.ning@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The make rules for building QEMU are mostly silent by default. They can
be made verbose by setting the variable V=1. The default state does not
however correspond to a V=0 setting - $(V) must be undefined / empty to
get the default quiet build.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20180123164718.12714-3-berrange@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts commit 42a77f1ce4.
The primary intention of this change was to silence messages
like
make[1]: '/home/berrange/src/virt/qemu/capstone/libcapstone.a' is up to date.
which we get when calling make recursively with explicit
targets.
The problem is that this change affected every make target,
not merely the targets that triggered these "is up to date"
messages. As a result any targets that were not invoking
commands via "$(call quiet-command ...)" suddenly become
silent. This is particularly bad for "make install", which
now appears to do nothing.
Rather than go through every make rule and try to identify
places where we now need to explicitly print a message to
show work taking place, just revert the change.
To address the original problem of silencing "is up to date"
messages, we simply add --quiet to the SUBDIR_MAKEVARS
variable, so it only affects us on recursive make calls.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20180123164718.12714-2-berrange@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts commit 7e49f5e8e5.
This commit seems to break parallel 'make -j4 check';
revert it until we identify the problem.
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ppc patch queue 2018-02-12
Here are the accumulated ppc and pseries related patches for the last
while. Highlights are:
* A number of Macintosh / CUDA cleanups from Mark Cave-Ayland
* An important bug fix (missing "break;") for
H_GET_CPU_CHARACTERISTICS
* Yet another fix for SMT mode handling
* Assorted other cleanups and fixes
# gpg: Signature made Mon 12 Feb 2018 03:39:30 GMT
# gpg: using RSA key 6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.12-20180212:
misc: introduce new mos6522 VIA device and enable it for ppc builds
cuda: factor out timebase-derived counter value and load time
cuda: set timer 1 frequency property to CUDA_TIMER_FREQ
cuda: don't call cuda_update() when writing to ACR register
cuda: minor cosmetic tidy-ups to get_next_irq_time()
cuda: rename frequency property to tb_frequency
cuda: introduce CUDAState parameter to get_counter()
spapr: set vsmt to MAX(8, smp_threads)
cuda: don't allow writes to port output pins
cuda: do not use old_mmio accesses
hw/ppc: rename functions in comments
spapr: add missing break in h_get_cpu_characteristics()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The MOS6522 VIA forms the bridge part of several Mac devices, including the
Mac via-cuda and via-pmu devices. Introduce a standard mos6522 device that
can be shared amongst multiple implementations.
This is effectively taking the 6522 parts out of cuda.c and turning them
into a separate device whilst also applying some style tidy-ups and including
a conversion to trace-events.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit b981289c49 "PPC: Cuda: Use cuda timer to expose tbfreq to guest" altered
the timer calculations from those based upon the hardware CUDA clock frequency
to those based upon the CPU timebase frequency.
In fact we can isolate the differences to 2 simple changes: one to the counter
read value and another to the counter load time. Move these changes into
separate functions so the implementation can be swapped later.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Now that we have successfully decoupled the timebase frequency and the hardware
timer frequency, set the timer 1 frequency property to CUDA_TIMER_FREQ and alter
get_next_irq_time() to use it rather than the hard-coded constant.
In addition to this we must now switch the tb_diff calculation over to use the
timebase frequency now that the hardware clock frequency and the timebase
frequency are different.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
[dwg: Correct a conflict due to a bug in an earlier patch]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The wire protocol for reading data to/from the VIA is triggered by changing
inputs on port B rather than changing the timer configuration via the ACR.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This allows us to more easily differentiate between the timebase frequency used
to calibrate the MacOS timers and the actual frequency of the hardware clock as
indicated by CUDA_TIMER_FREQ.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
[dwg: Revert some extraneous changes which break compile]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This will be required shortly and also happens to match nicely with the
corresponding signature for set_counter().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We silently ignore the value of smp_threads when we set
the default VSMT value, and if smp_threads is greater than VSMT
the kernel runs into trouble later.
Fixes: 8904e5a750
("spapr: Adjust default VSMT value for better migration compatibility")
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Applied using the Coccinelle semantic patch scripts/coccinelle/use_osdep.cocci
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Applied using the Coccinelle semantic patch scripts/coccinelle/use_osdep.cocci
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Applied using the Coccinelle semantic patch scripts/coccinelle/use_osdep.cocci
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Applied using the Coccinelle semantic patch scripts/coccinelle/use_osdep.cocci
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
qemu-binfmt-conf.sh is used for the Linux usermode emulation, so
let's add this file to that section in the MAINTAINERS file.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Check for the presence of posix_memalign() in the configure script,
not using "defined(_POSIX_C_SOURCE) && !defined(__sun__)". This
lets qemu use posix_memalign() on NetBSD versions that have it,
instead of falling back to valloc() which is wasteful when the
required alignment is smaller than a page.
Signed-off-by: Andreas Gustafsson <gson@gson.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Kamil Rytarowski <n54@gmx.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Now that we have a website that accepts patches on the list, the
main project should make it easier to find information about that
process.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Fam Zheng <famz@redhat.com>
Even with --disable-git-update, ./configure tries updating the capstone
submodule instead of marking it "no"; this patch disables the capstone
submodule if git update is disabled.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Thomas Huth <thuth@redhat.com>
As was last done in 379e21c25, we don't want .git files for
submodules in the tarball, but we aren't presently excluding them for
capstone and keycodemapdb.
Rather than delete the offending files before archiving, ask tar
to --exclude=.git
Signed-off-by: Cole Robinson <crobinso@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Thomas Huth <thuth@redhat.com>
The spaces between the parameters in the chardev and tpmdev sections
are more confusing than helpful, and prevent the lists from being
copy-and-pasted easily for real usage. We also don't use such spaces
in other sections in the documentation, e.g. with the -netdev option,
so let's be consistent and remove the spaces in the chardev and tpmdev
sections, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
<memory.h> is a non-standard obsolete header that was long ago
replaced by <string.h>.
<malloc.h> is a non-standard header; it is not obsolete (we must
use it for malloc_trim, for example), but generally should not
be used in files that just need malloc() and friends, where
<stdlib.h> is the standard header.
And since osdep.h already guarantees string.h and stdlib.h, we
can drop these unusual system header includes as redundant
rather than replacing them.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
The "-machine xxx,help" prints kernel-irqchip possible values as
"OnOffSplit", this adds separators to the printed line.
Also, since only lower case letters are specified in qapi/common.json,
this changes the letter cases too.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Commit bcb5ce08cf ("spapr: Rename machine init functions for clarity")
renamed ppc_spapr_reset to spapr_machine_reset and ppc_spapr_init
to spapr_machine_init. Let's also rename the references in
comments.
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Detected by Coverity (CID 1385702). This fixes the recently added hypercall
to let guests properly apply Spectre and Meltdown workarounds.
Fixes: c59704b254 "target/ppc/spapr: Add H-Call H_GET_CPU_CHARACTERISTICS"
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We don't need the can_write_zeroes_with_unmap field in
BlockDriverInfo, because it is redundant information with
supported_zero_flags & BDRV_REQ_MAY_UNMAP. Note that
BlockDriverInfo and supported_zero_flags are both per-device
settings, rather than global state about the driver as a
whole, which means one or both of these bits of information
can already be conditional. Let's audit how they were set:
crypto: always setting can_write_ to false is pointless (the
struct starts life zero-initialized), no use of supported_
nbd: just recently fixed to set can_write_ if supported_
includes MAY_UNMAP (thus this commit effectively reverts
bca80059e and solves the problem mentioned there in a more
global way)
file-posix, iscsi, qcow2: can_write_ is conditional, while
supported_ was unconditional; but passing MAY_UNMAP would
fail with ENOTSUP if the condition wasn't met
qed: can_write_ is unconditional, but pwrite_zeroes lacks
support for MAY_UNMAP and supported_ is not set. Perhaps
support can be added later (since it would be similar to
qcow2), but for now claiming false is no real loss
all other drivers: can_write_ is not set, and supported_ is
either unset or a passthrough
Simplify the code by moving the conditional into
supported_zero_flags for all drivers, then dropping the
now-unused BDI field. For callers that relied on
bdrv_can_write_zeroes_with_unmap(), we return the same
per-device settings for drivers that had conditions (no
observable change in behavior there); and can now return
true (instead of false) for drivers that support passthrough
(for example, the commit driver) which gives those drivers
the same fix as nbd just got in bca80059e. For callers that
relied on supported_zero_flags, we now have a few more places
that can avoid a wasted call to pwrite_zeroes() that will
just fail with ENOTSUP.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180126193439.20219-1-eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
One patch to mitigate Travis timeouts
# gpg: Signature made Fri 09 Feb 2018 14:13:46 GMT
# gpg: using RSA key FBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>"
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-travis-speedup-090218-1:
.travis.yml: add --disable-linux-user for some jobs
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Miscellaneous patches for 2018-02-07
# gpg: Signature made Fri 09 Feb 2018 12:52:51 GMT
# gpg: using RSA key 3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-misc-2018-02-07-v4:
Move include qemu/option.h from qemu-common.h to actual users
Drop superfluous includes of qapi/qmp/qjson.h
Drop superfluous includes of qapi/qmp/dispatch.h
Include qapi/qmp/qnull.h exactly where needed
Include qapi/qmp/qnum.h exactly where needed
Include qapi/qmp/qbool.h exactly where needed
Include qapi/qmp/qstring.h exactly where needed
Include qapi/qmp/qdict.h exactly where needed
Include qapi/qmp/qlist.h exactly where needed
Include qapi/qmp/qobject.h exactly where needed
qdict qlist: Make most helper macros functions
Eliminate qapi/qmp/types.h
Typedef the subtypes of QObject in qemu/typedefs.h, too
Include qmp-commands.h exactly where needed
Drop superfluous includes of qapi/qmp/qerror.h
Include qapi/error.h exactly where needed
Drop superfluous includes of qapi-types.h and test-qapi-types.h
Clean up includes
Use #include "..." for our own headers, <...> for others
vnc: use stubs for CONFIG_VNC=n dummy functions
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The modules and co-routine builds are only really relevant to softmmu
builds and regularly time out on Travis. Let's disable linux-user
builds here for more headroom.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
target-arm queue:
* Support M profile derived exceptions on exception entry and exit
* Implement AArch64 v8.2 crypto insns (SHA-512, SHA-3, SM3, SM4)
* Implement working i.MX6 SD controller
* Various devices preparatory to i.MX7 support
* Preparatory patches for SVE emulation
* v8M: Fix bug in implementation of 'TT' insn
* Give useful error if user tries to use userspace GICv3 with KVM
# gpg: Signature made Fri 09 Feb 2018 11:01:23 GMT
# gpg: using RSA key 3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20180209: (30 commits)
hw/core/generic-loader: Allow PC to be set on command line
target/arm/translate.c: Fix missing 'break' for TT insns
target/arm/kvm: gic: Prevent creating userspace GICv3 with KVM
target/arm: Add SVE state to TB->FLAGS
target/arm: Add ZCR_ELx
target/arm: Add SVE to migration state
target/arm: Add predicate registers for SVE
target/arm: Expand vector registers for SVE
hw/arm: Move virt's PSCI DT fixup code to arm/boot.c
usb: Add basic code to emulate Chipidea USB IP
i.MX: Add implementation of i.MX7 GPR IP block
i.MX: Add i.MX7 GPT variant
i.MX: Add code to emulate GPCv2 IP block
i.MX: Add code to emulate i.MX7 SNVS IP-block
i.MX: Add code to emulate i.MX2 watchdog IP block
i.MX: Add code to emulate i.MX7 CCM, PMU and ANALOG IP blocks
hw: i.MX: Convert i.MX6 to use TYPE_IMX_USDHC
sdhci: Add i.MX specific subtype of SDHCI
target/arm: enable user-mode SHA-3, SM3, SM4 and SHA-512 instruction support
target/arm: implement SM4 instructions
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
qemu-common.h includes qemu/option.h, but most places that include the
former don't actually need the latter. Drop the include, and add it
to the places that actually need it.
While there, drop superfluous includes of both headers, and
separate #include from file comment with a blank line.
This cleanup makes the number of objects depending on qemu/option.h
drop from 4545 (out of 4743) to 284 in my "build everything" tree.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-20-armbru@redhat.com>
[Semantic conflict with commit bdd6a90a9e in block/nvme.c resolved]
This cleanup makes the number of objects depending on qapi/qmp/qdict.h
drop from 4550 (out of 4743) to 368 in my "build everything" tree.
For qapi/qmp/qobject.h, the number drops from 4552 to 390.
While there, separate #include from file comment with a blank line.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-13-armbru@redhat.com>
This cleanup makes the number of objects depending on qapi/qmp/qlist.h
drop from 4551 (out of 4743) to 16 in my "build everything" tree.
While there, separate #include from file comment with a blank line.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-12-armbru@redhat.com>
The macro expansions of qdict_put_TYPE() and qlist_append_TYPE() need
qbool.h, qnull.h, qnum.h and qstring.h to compile. We include qnull.h
and qnum.h in the headers, but not qbool.h and qstring.h. Works,
because we include those wherever the macros get used.
Open-coding these helpers is of dubious value. Turn them into
functions and drop the includes from the headers.
This cleanup makes the number of objects depending on qapi/qmp/qnum.h
from 4551 (out of 4743) to 46 in my "build everything" tree. For
qapi/qmp/qnull.h, the number drops from 4552 to 21.
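As an example of the conversion, one of these helpers turned into a function
looks roughly like this (a sketch, not a verbatim quote):

    /* sketch: formerly a macro in the header, now a function, so the
     * header no longer needs to pull in qbool.h */
    void qdict_put_bool(QDict *qdict, const char *key, bool value)
    {
        qdict_put(qdict, key, qbool_from_bool(value));
    }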
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-10-armbru@redhat.com>
qapi/qmp/types.h is a convenience header to include a number of
qapi/qmp/ headers. Since we rarely need all of the headers
qapi/qmp/types.h includes, we bypass it most of the time. Most of the
places that use it don't need all the headers, either.
Include the necessary headers directly, and drop qapi/qmp/types.h.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-9-armbru@redhat.com>
This cleanup makes the number of objects depending on qapi/error.h
drop from 1910 (out of 4743) to 1612 in my "build everything" tree.
While there, separate #include from file comment with a blank line,
and drop a useless comment on why qemu/osdep.h is included first.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-5-armbru@redhat.com>
[Semantic conflict with commit 34e304e975 resolved, OSX breakage fixed]
s390x updates:
- rework interrupt handling for tcg, smp is now considered non-experimental
- some general improvements in the flic
- improvements in the pci code, and wiring it up in tcg
- add PTFF subfunctions for multiple-epoch to the cpu model
- maintainership updates
- various other fixes and improvements
# gpg: Signature made Fri 09 Feb 2018 09:04:34 GMT
# gpg: using RSA key DECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>"
# gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# gpg: aka "Cornelia Huck <cohuck@kernel.org>"
# gpg: aka "Cornelia Huck <cohuck@redhat.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF
* remotes/cohuck/tags/s390x-20180209: (29 commits)
MAINTAINERS: add David as additional tcg/s390 maintainer
MAINTAINERS: reorganize s390-ccw bios maintainership
MAINTAINERS: add myself as overall s390x maintainer
s390x/pci: use the right pal and pba in reg_ioat()
s390x/pci: fixup global refresh
s390x/pci: fixup the code walking IOMMU tables
s390x/cpumodel: model PTFF subfunctions for Multiple-epoch facility
s390x/cpumodel: allow zpci features in qemu model
s390x/tcg: wire up pci instructions
s390x/sclp: fix event mask handling
s390x/flic: cache the common flic class in a central function
s390x/kvm: cache the kvm flic in a central function
s390x/tcg: cache the qemu flic in a central function
configure: s390x supports mttcg now
s390x/tcg: remove SMP warning
s390x/tcg: STSI overhaul
s390x: fix size + content of STSI blocks
s390x/flic: optimize CPU wakeup for TCG
s390x/flic: implement qemu_s390_clear_io_flic()
s390x/tcg: implement TEST PENDING INTERRUPTION
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The documentation for the generic loader claims that you can
set the PC for a CPU with an option of the form
-device loader,cpu-num=0,addr=0x10000004
However if you try this QEMU complains:
cpu_num must be specified when setting a program counter
This is because we were testing against 0 rather than CPU_NONE.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180205150426.20542-1-peter.maydell@linaro.org
The code where we added the TT instruction was accidentally
missing a 'break', which meant that after generating the code
to execute the TT we would fall through to 'goto illegal_op'
and generate code to take an UNDEF insn.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180206103941.13985-1-peter.maydell@linaro.org
KVM doesn't support emulating a GICv3 in userspace, only GICv2. We
currently attempt this anyway, and as a result a KVM guest doesn't
receive interrupts and the user is left wondering why. Report an error
to the user if this particular combination is requested.
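A sketch of the requested check (placement and exact wording assumed):

    /* sketch: refuse the unsupported userspace-GICv3-with-KVM combination */
    if (kvm_enabled() && !kvm_irqchip_in_kernel()) {
        error_setg(errp, "Userspace GICv3 is not supported with KVM");
        return;
    }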
Signed-off-by: Christoffer Dall <christoffer.dall@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180201205307.30343-1-christoffer.dall@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The memory writes done to push registers on the stack
on exception entry in M profile CPUs are supposed to
go via MPU permissions checks, which may cause us to
take a derived exception instead of the original one if
the MPU lookup fails. We were implementing these as
always-succeeds direct writes to physical memory.
Rewrite v7m_push_stack() to do the necessary checks.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1517324542-6607-5-git-send-email-peter.maydell@linaro.org
In the v8M architecture, if the process of taking an exception
results in a further exception this is called a derived exception
(for example, an MPU exception when writing the exception frame to
memory). If the derived exception happens while pushing the initial
stack frame, we must ignore any subsequent possible exception
pushing the callee-saves registers.
In preparation for making the stack writes check for exceptions,
add a return value from v7m_push_stack() and a new parameter to
v7m_exception_taken(), so that the former can tell the latter that
it needs to ignore failures to write to the stack. We also plumb
the argument through to v7m_push_callee_stack(), which is where
the code to ignore the failures will be.
(Note that the v8M ARM pseudocode structures this slightly differently:
derived exceptions cause the attempt to process the original
exception to be abandoned; then at the top level it calls
DerivedLateArrival to prioritize the derived exception and call
TakeException from there. We choose to let the NVIC do the prioritization
and continue forward with a call to TakeException which will then
take either the original or the derived exception. The effect is
the same, but this structure works better for QEMU because we don't
have a convenient top level place to do the abandon-and-retry logic.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1517324542-6607-4-git-send-email-peter.maydell@linaro.org
Currently armv7m_nvic_acknowledge_irq() does three things:
* make the current highest priority pending interrupt active
* return a bool indicating whether that interrupt is targeting
Secure or NonSecure state
* implicitly tell the caller which is the highest priority
pending interrupt by setting env->v7m.exception
We need to split these jobs, because v7m_exception_taken()
needs to know whether the pending interrupt targets Secure so
it can choose to stack callee-saves registers or not, but it
must not make the interrupt active until after it has done
that stacking, in case the stacking causes a derived exception.
Similarly, it needs to know the number of the pending interrupt
so it can read the correct vector table entry before the
interrupt is made active, because vector table reads might
also cause a derived exception.
Create a new armv7m_nvic_get_pending_irq_info() function which simply
returns information about the highest priority pending interrupt, and
use it to rearrange the v7m_exception_taken() code so we don't
acknowledge the exception until we've done all the things which could
possibly cause a derived exception.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1517324542-6607-3-git-send-email-peter.maydell@linaro.org
In order to support derived exceptions (exceptions generated in
the course of trying to take an exception), we need to be able
to handle prioritizing whether to take the original exception
or the derived exception.
We do this by introducing a new function
armv7m_nvic_set_pending_derived() which the exception-taking code in
helper.c will call when a derived exception occurs. Derived
exceptions are dealt with mostly like normal pending exceptions, so
we share the implementation with the armv7m_nvic_set_pending()
function.
Note that the way we structure this is significantly different
from the v8M Arm ARM pseudocode: that does all the prioritization
logic in the DerivedLateArrival() function, whereas we choose to
let the existing "identify highest priority exception" logic
do the prioritization for us. The effect is the same, though.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1517324542-6607-2-git-send-email-peter.maydell@linaro.org
Split it out from the s390-ccw-virtio machine, add Thomas as a
maintainer in addition to Christian.
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
When registering ioat, pba should consist of the leftmost 52 bits and
rightmost 12 binary zeros, and pal should consist of the leftmost 52
bits and rightmost 12 binary ones. The lower 12 bits of words 5 and 7
of the FIB are ignored by the facility. Let's fix this up.
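A sketch of the fixup (variable names assumed): force the low 12 bits of pba
to zero and the low 12 bits of pal to one.

    pba &= ~0xfffULL;   /* keep the leftmost 52 bits, low 12 bits zero */
    pal |=  0xfffULL;   /* keep the leftmost 52 bits, low 12 bits one  */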
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Message-Id: <20180205072258.5968-4-zyimin@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The VFIO common code doesn't provide the possibility to modify a
previous mapping entry in another way than unmapping and mapping again
with new properties.
To avoid -EEXIST DMA mapping errors, we introduce a GHashTable to store
S390IOTLBEntry instances in order to cache the mapped entries. When
intercepting the rpcit instruction, ignore identical mapped entries to
avoid doing map operations multiple times, and do unmap and re-map
operations when updating valid entries.
Acked-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Message-Id: <20180205072258.5968-3-zyimin@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The current s390x PCI IOMMU code lacks checks of several flags, including:
1) protection bit
2) table length
3) table offset
4) intermediate tables' invalid bit
5) format control bit
This patch introduces a new struct named S390IOTLBEntry, and adds
these missing checks. At the same time, it informs the guest with the
corresponding error number when a check fails. Finally, in order to
get the error number, we export s390_guest_io_table_walk().
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Message-Id: <20180205072258.5968-2-zyimin@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
For now, the kernel does not properly indicate configured CPU subfunctions
to the guest, but simply uses the host values (as support in KVM is still
missing). That's why we failed to model the PTFF subfunctions that come
with the Multiple-epoch facility.
Let's properly add these, along with a new feature group.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180205102935.14736-1-david@redhat.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
AEN and AIS can be provided unconditionally; ZPCI should be turned on
manually.
With -cpu qemu,zpci=on, the guest kernel can now successfully detect
virtio-pci devices under tcg.
Also fixup the order of the MSA_EXT_{3,4} flags while at it.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
On s390x, pci support is implemented via a set of instructions
(no mmio). Unfortunately, none of them are documented in the
PoP; the code is based upon the existing implementation for KVM
and the Linux zpci driver.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Current STSI implementation is a mess, so let's rewrite it.
Problems fixed by this patch:
1) The order of exceptions/when recognized is wrong.
2) We have to store to virtual address space, not absolute.
3) Alignment check of the block is missing.
4) The SMP information is not indicated.
While at it:
a) Make the code look nicer
- get rid of nesting levels
- use struct initialization instead of initializing to zero
- rename a misspelled field and rename function code defines
- use a union and have only one write statement
- use cpu_to_beX()
b) Indicate the VM name/extended name + UUID just like KVM does
c) Indicate that all LPAR CPUs we fake are dedicated
d) Add a comment why we fake being a KVM guest
e) Give our guest as default the name "TCGguest"
f) Fake the same CPU information we have in our Guest for all layers
While at it, get rid of "potential_page_fault()" by forwarding the
retaddr properly.
The result is best verified by looking at "/proc/sysinfo" in the guest
when specifying on the qemu command line
-uuid "74738ff5-5367-5958-9aee-98fffdcd1876" \
-name "extra long guest name"
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180129125623.21729-14-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
All blocks were defined as 4k in size, which is only true for two of
them right now. Also some reserved fields were wrong; fix this and
convert all reserved fields to u8.
This also fixes the LPAR part of the output in /proc/sysinfo under TCG
(previously, everything was indicated as 0).
While at it, introduce typedefs for these structs and use them in TCG/KVM
code.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180129125623.21729-13-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Move floating interrupt handling into the flic. Floating interrupts
will now be considered by all CPUs, not just CPU #0. While at it, convert
I/O interrupts to use a list and make sure we properly consider I/O
sub-classes in s390_cpu_has_io_int().
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180129125623.21729-9-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
This is a preparation for floating interrupt support and only applies to
MTTCG; single threaded TCG works just fine. If a floating interrupt wakes
up a VCPU and the CPU thinks it can run (clearing cs->halted), another
VCPU might already have picked up the interrupt by the point where it
would be delivered, resulting in a wakeup without an interrupt (and
execution of wrong code).
It is wrong to let the VCPU continue to execute (the WAIT PSW). Instead,
we have to put the VCPU back to sleep.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180129125623.21729-8-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let the flic device handle it internally. This will allow us to later
on store floating interrupts in the flic for the TCG case.
This now also simplifies kvm.c. All that's left is the fallback
interface for floating interrupts, which is now triggered directly via
the flic in case anything goes wrong.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180129125623.21729-6-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We currently only support CRW machine checks. This is a preparation for
real floating interrupt support.
Get rid of the queue and handle it via the bit INTERRUPT_MCHK. We don't
rename it for now, as it will be soon gone (when moving crw machine checks
into the flic).
Please note that this is also how KVM handles it: only one
instance of a machine check can be pending at a time, so there is no
need for a queue.
While at it, make sure we try to deliver only if env->cregs[14]
actually indicates that CRWs are accepted.
Drop two unused defines on the way (we already have PSW_MASK_...).
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180129125623.21729-5-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We have to consider all deliverable interrupts.
We now have to take care of the special scenario, where we first
inject an interrupt with a WAIT PSW, followed by a !WAIT PSW. (very
unlikely but possible)
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180129125623.21729-2-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
In the Alpine docker image the qemu-system-s390x build is broken;
it throws this error:
qemu-system-s390x: Initialization of device s390-ipl failed: could not
load bootloader 's390-ccw.img'
The grep command of busybox uses regex. This fails on binary data
(e.g. stops on every \0), so it does not identify the string
BiGeNdIaN in the test case big/little. Therefore, it assumes
that the architecture is little endian.
This fix solves the grep problem by printing the content of
TMPO with strings.
Signed-off-by: Alice Frosi <alice@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
[some changes to patch description, add -a option to strings]
Message-Id: <20180130133828.77336-2-borntraeger@de.ibm.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Clean up includes so that osdep.h is included first and headers
which it implies are not included manually.
This commit was created with scripts/clean-includes, with the change
to target/s390x/gen-features.c manually reverted, and blank lines
around deletions collapsed.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-3-armbru@redhat.com>
System headers should be included with <...>, our own headers with
"...". Offenders tracked down with an ugly, brittle and probably
buggy Perl script. Previous iteration was commit a9c94277f0.
Delete inclusions of "string.h" and "strings.h" instead of fixing them
to <string.h> and <strings.h>, because we always include these via
osdep.h.
Put the cleaned up system header includes first.
While there, separate #include from file comment with exactly one
blank line.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20180201111846.21846-2-armbru@redhat.com>
According to Eduardo Habkost's commit fd3b02c889, all PCIe devices now
implement INTERFACE_PCIE_DEVICE, so we don't need the is_express field
anymore.
Devices that implement only INTERFACE_PCIE_DEVICE (is_express == 1)
or
devices that implement only INTERFACE_CONVENTIONAL_PCI_DEVICE (is_express == 0)
were not affected by the change.
The only devices that were affected are those that are hybrid and also
had (is_express == 1) - therefore only:
- hw/vfio/pci.c
- hw/usb/hcd-xhci.c
- hw/xen/xen_pt.c
For those 3 I made sure that QEMU_PCI_CAP_EXPRESS is on in instance_init()
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Yoni Bettan <ybettan@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently the virtio-pci driver hardcodes 2 vectors for the virtio-blk
device. In a multiple I/O queues scenario, all the I/O queues will then
share one interrupt vector, so enable multiple vectors according to
the number of I/O queues.
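A sketch of the change (identifiers assumed): size the vector count from the
queue count when the user did not specify one.

    /* sketch: one vector per I/O queue plus one for config changes */
    if (vpci_dev->nvectors == DEV_NVECTORS_UNSPECIFIED) {
        vpci_dev->nvectors = conf->num_queues + 1;
    }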
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In the past, we prioritized IOMMU migration so that we have such a
priority order:
IOMMU > PCI Devices
When migrating a guest with both vIOMMU and a pcie-root-port, we'll
always migrate vIOMMU first, since pci buses will be seen to have the
same priority as general PCI devices.
That's problematic.
The thing is that PCI bus number information is stored in the root port,
and that is needed by vIOMMU during post_load(), e.g., to figure out
context entry for a device. If we don't have correct bus numbers for
devices, we won't be able to recover device state of the DMAR memory
regions, and things will be messed up.
So let's boost the PCIe root ports to be even with higher priority:
PCIe Root Port > IOMMU > PCI Devices
A smoke test shows that this patch fixes bug 1538953.
Also, apply this rule to all the PCI bus/bridge devices: ioh3420,
xio3130_downstream, xio3130_upstream, pcie_pci_bridge, pci-pci bridge,
i82801b11.
I noted that we set pcie_pci_bridge_dev_vmstate twice. Clean that up
together.
CC: Alex Williamson <alex.williamson@redhat.com>
CC: Marcel Apfelbaum <marcel@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Laurent Vivier <lvivier@redhat.com>
Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1538953
Reported-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The "i82801b11-bridge" device model is a descendant of "base-pci-bridge"
(TYPE_PCI_BRIDGE). However, unlike other similar devices, such as
- pci-bridge,
- pcie-pci-bridge,
- PCIE Root Port,
- xio3130 switch upstream and downstream ports,
- dec-21154-p2p-bridge,
- pbm-bridge,
- xilinx-pcie-root,
"i82801b11-bridge" does not clear the bridge specific registers at
platform reset.
This is a problem because devices on "i82801b11-bridge" continue to
respond to config space cycles after platform reset, when addressed with
the bus number that was previously programmed into the secondary bus
number register of "i82801b11-bridge". This error breaks OVMF's search for
extra (PXB) root buses, for example.
The device class reset method for "i82801b11-bridge" is currently NULL;
set it directly to pci_bridge_reset(), like the last three bridge models
in the above listing do.
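A sketch of the resulting class_init() change (other class fields
omitted):
static void i82801b11_bridge_class_init(ObjectClass *klass, void *data)
{
    DeviceClass *dc = DEVICE_CLASS(klass);

    /* clear bridge-specific registers (secondary bus number, ...) at reset */
    dc->reset = pci_bridge_reset;
}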
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: qemu-stable@nongnu.org
Ref: https://bugzilla.redhat.com/show_bug.cgi?id=1541839
Signed-off-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Move the log_dirty check into vhost_section.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Now that the old vhost_set_memory code is gone, the _nop and _add
callbacks are identical and can be merged. The _del callback is
no longer needed.
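In MemoryListener terms this amounts to something like the following
(the callback name is illustrative):
.region_add = vhost_region_addnop,
.region_nop = vhost_region_addnop,  /* identical handling, share one callback */
/* no .region_del: the section list rebuilt at commit time makes it unnecessary */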
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Remove the old update mechanism, vhost_set_memory, and the functions
and flags it used.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Compare the sections list that's just been generated, and if it's
different from the old one, regenerate the region list.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
As sections are reported by the listener to the _nop and _add
methods, add them to the temporary section list but now merge them
with the previous section if the new one abuts and the backend allows.
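A sketch of the merge step; can_merge_with_backend() and add_section()
stand in for the backend capability check and the plain append path:
/* if the new section starts exactly where the previous one ends, grow it */
if (prev_sec && can_merge_with_backend(prev_sec, section) &&
    int128_eq(int128_add(int128_make64(prev_sec->offset_within_address_space),
                         prev_sec->size),
              int128_make64(section->offset_within_address_space))) {
    prev_sec->size = int128_add(prev_sec->size, section->size);
} else {
    add_section(dev, section);  /* otherwise append as a new entry */
}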
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vhost_verify_ring_mappings() was used to verify that
rings are still accessible and that the related memory hasn't
been moved after the flatview is updated.
It did this by mapping the ring's GPA+len and
checking that the HVA hadn't changed with the new memory map.
To avoid a possibly expensive mapping call, we
identified the address range that changed and performed the
mapping only if the ring was in the changed range.
However, it's not necessary to perform the ring's GPA
mapping, as we already have its current HVA and all
we need is to verify that the ring's GPA translates to
the same HVA in the updated flatview.
This will allow the following patches to simplify the range
comparison that was previously needed to avoid expensive
verify_ring_mapping calls.
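A sketch of the cheaper check, using the uapi struct
vhost_memory_region fields (error handling trimmed):
/* reg covers the ring's GPA in the new memory map */
uint64_t offset = ring_gpa - reg->guest_phys_addr;

if (reg->userspace_addr + offset != (uintptr_t)ring_hva) {
    /* the ring's GPA no longer translates to the same HVA */
    return -EBUSY;
}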
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
with modifications by:
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Igor spotted that there's a race where a region that's unref'd
in a _del callback might be freed before the set_mem_table call in
the _commit callback, and thus vhost might end up using freed memory.
Fix this by building a complete temporary sections list, ref'ing every
section (during add and nop) and then unref'ing the whole list right
at the end of commit.
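Schematically, using the existing memory_region_ref()/unref() API:
/* _add/_nop: keep each section's region alive until the update is visible */
memory_region_ref(section->mr);

/* end of _commit, after the backend has switched to the new table */
for (i = 0; i < n_old_sections; i++) {
    memory_region_unref(old_sections[i].mr);
}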
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The loading time of a VM is quite significant when its virtio
devices use a large number of virtqueues (e.g. a virtio-serial
device with max_ports=511). Most of the time is spent in the
creation of all the required event notifiers (ioeventfd and memory
regions).
This patch packs all the changes to the memory regions into a
single memory transaction.
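Schematically, using the existing memory transaction API (the loop
shape is illustrative):
memory_region_transaction_begin();
for (n = 0; n < nvqs; n++) {
    r = virtio_bus_set_host_notifier(qbus, n, true);  /* MR updates are batched */
    if (r < 0) {
        break;
    }
}
memory_region_transaction_commit();  /* flush all the changes in one go */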
Reported-by: Sitong Liu
Reported-by: Xiaoling Gao
Signed-off-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
The virtio_bus_set_host_notifier function no longer calls
event_notifier_cleanup when an event notifier is removed.
This commit updates the code to match the new behavior and calls
virtio_bus_cleanup_host_notifier after the notifier has been
de-assigned and is no longer in use.
This change is a preparation to allow executing the
virtio_bus_set_host_notifier function in a memory region
transaction.
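The call sequence then becomes, roughly:
virtio_bus_set_host_notifier(bus, n, false);  /* de-assign; no implicit cleanup */
/* ... once the notifier can no longer be in use ... */
virtio_bus_cleanup_host_notifier(bus, n);     /* now release the event notifier */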
Signed-off-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This reverts commit 0750b06021.
Follow-up patches are reworking the memory listeners; the new mechanism
will add its own set of traces.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The x86 vector instruction set is extremely irregular. With newer
editions, Intel has filled in some of the blanks. However, we don't
get many 64-bit operations until SSE4.2, introduced in 2009.
The subsequent edition was for AVX1, introduced in 2011, which added
three-operand addressing, and adjusts how all instructions should be
encoded.
Given the relatively narrow 2 year window between possible to support
and desirable to support, and to vastly simplify code maintenance,
I am only planning to support AVX1 and later cpus.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Trivial move and constant propagation. Some identity and constant
function folding, but nothing that requires knowledge of the size
of the vector element.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Use dup to convert a non-constant scalar to a third vector.
Add addition, multiplication, and logical operations with an immediate.
Add addition, subtraction, multiplication, and logical operations with
a non-constant scalar. Allow for the front-end to build operations in
which the scalar operand comes first.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
No vector ops as yet. SSE only has direct support for 8- and 16-bit
saturation; handling 32- and 64-bit saturation is much more expensive.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Opcodes are added for scalar and vector shifts, but, considering their
varied semantics, they are not exposed to the front ends. They are
still provided in case they are needed for backend expansion.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Some functions use intN_t arguments, some use uintN_t, some just
used "unsigned". To aid putting function pointers in tables, we
need consistency.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
To make our efforts on QEMU testing easier for contributors to consume,
let's add a document. For example, Patchew reports build errors on
patches that should be relatively easy to reproduce with a few steps, and
it is much nicer if there is such documentation for it to refer to.
This focuses on how to run existing tests and how to write new test
cases, without going into the frameworks themselves.
The VM-based testing section is moved from tests/vm/README, which is now
a single line pointing to the new doc.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20180201022046.9425-1-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
This is a new protocol driver that exclusively opens a host NVMe
controller through VFIO. It achieves better latency than linux-aio by
completely bypassing host kernel vfs/block layer.
$rw-$bs-$iodepth     linux-aio    nvme://
----------------------------------------
randread-4k-1        10.5k        21.6k
randread-512k-1      745          1591
randwrite-4k-1       30.7k        37.0k
randwrite-512k-1     1945         1980
(unit: IOPS)
The driver also integrates with the polling mechanism of iothread.
This patch is co-authored by Paolo and me.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20180116060901.17413-4-famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Now that CoQueues can use a QemuMutex for thread-safety, there is no
need for curl to roll its own coroutine queue. Coroutines can be
placed directly on the queue instead of using a list of CURLAIOCBs.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180203153935.8056-6-pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
qemu_co_queue_next does not need to release and re-acquire the mutex,
because the queued coroutine does not run immediately. However, this
does not hold for qemu_co_enter_next. Now that qemu_co_queue_wait
can synchronize (via QemuLockable) with code that is not running in
coroutine context, it's important that code using qemu_co_enter_next
can easily use a standardized locking idiom.
First of all, qemu_co_enter_next must use aio_co_wake to restart the
coroutine. Second, the function gains a second argument, a QemuLockable*,
and the comments of qemu_co_queue_next and qemu_co_queue_restart_all
are adjusted to clarify the difference.
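A sketch of the resulting shape of the interface, based on the
description above (not the verbatim header):
/* Restart one queued coroutine, waking it with aio_co_wake(); 'lock' is the
 * QemuLockable held by the caller, matching what the waiters passed to
 * qemu_co_queue_wait(). */
bool qemu_co_enter_next_impl(CoQueue *queue, QemuLockable *lock);

#define qemu_co_enter_next(queue, lock) \
    qemu_co_enter_next_impl(queue, QEMU_MAKE_LOCKABLE(lock))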
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180203153935.8056-5-pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
There are cases in which a queued coroutine must be restarted from
non-coroutine context (with qemu_co_enter_next). In these cases,
qemu_co_enter_next also needs to be thread-safe, but it cannot use
a CoMutex and so cannot use qemu_co_queue_wait. Use QemuLockable so
that the CoQueue can interchangeably use CoMutex or QemuMutex.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180203153935.8056-4-pbonzini@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
QemuLockable is a polymorphic lock type that takes an object and
knows which function to use for locking and unlocking. The
implementation could use C11 _Generic, but since the support is
not very widespread I am instead using __builtin_choose_expr and
__builtin_types_compatible_p, which are already used by
include/qemu/atomic.h.
QemuLockable can be used to implement lock guards, or to pass around
a lock in such a way that a function can release it and re-acquire it.
The next patch will do this for CoQueue.
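The dispatch mechanism itself, reduced to a self-contained illustration
(the lock types and functions below are stand-ins, not the real QEMU
ones):
typedef struct { int v; } LockA;
typedef struct { int v; } LockB;

void lock_a(LockA *l);
void lock_b(LockB *l);

/* Pick the right function from the static type of the pointer, using the
 * same __builtin_choose_expr/__builtin_types_compatible_p trick as
 * include/qemu/atomic.h; everything is resolved at compile time. */
#define generic_lock(x)                                           \
    __builtin_choose_expr(                                        \
        __builtin_types_compatible_p(typeof(x), LockA *),         \
        lock_a((LockA *)(void *)(x)),                             \
        lock_b((LockB *)(void *)(x)))
With this, generic_lock(&a) expands to lock_a(&a) and generic_lock(&b)
to lock_b(&b).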
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20180203153935.8056-3-pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Using "fedora:latest" makes behavior different depending on when you
actually pulled the image from the docker repository. In my case,
the supposedly "latest" image was a Fedora 25 download from 8 months
ago, and the new "test-debug" test was failing.
Use "27" to improve reproducibility and make it clear when the image
is obsolete.
Cc: Fam Zheng <famz@redhat.com>
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <1515755504-21341-1-git-send-email-pbonzini@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
This implements the Windows Hypervisor Platform accelerator (WHPX)
target, which acts as a hypervisor accelerator for QEMU on the Windows
platform. It gives QEMU much greater speed than the emulated x86_64
paths taken on Windows today.
1. Adds support for vPartition management.
2. Adds support for vCPU management.
3. Adds support for MMIO/PortIO.
4. Registers the WHPX ACCEL_CLASS.
Signed-off-by: Justin Terry (VM) <juterry@microsoft.com>
Message-Id: <1516655269-1785-4-git-send-email-juterry@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since we have a separate handler for POLLHUP, which drops data after
closing the connection, we need to fix this test: it sends data and
instantly closes the socket, creating a race condition in which, in
some cases, the other end of the socket closes it before the data is
read. To prevent this, close the socket only after receiving.
Signed-off-by: Klim Kireev <klim.kireev@virtuozzo.com>
Message-Id: <20180201134831.17709-1-klim.kireev@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This will exercise the memfd memory backend and should generally be
better for testing than memory-backend-file (thanks to anonymous files
and sealing).
If memfd is available, it is preferred.
However, in order to check that file & memfd backends both work
correctly, the read-guest-mem test is checked explicitly for each.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180201132757.23063-8-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add a new memory backend, similar to hostmem-file, except that it
doesn't need to create files. It also enforces memory sealing.
This backend is mainly useful for sharing the memory with other
processes.
Note that Linux supports transparent huge pages of shmem/memfd memory
since 4.8. It is relatively easier to set up THP than a dedicated
hugepage mount point, by using "madvise" in
/sys/kernel/mm/transparent_hugepage/shmem_enabled.
Since 4.14, memfd allows setting the hugetlb requirement explicitly.
Pending for merge in 4.16 is memfd sealing support for hugetlb-backed
memory.
Usage:
-object memory-backend-memfd,id=mem1,size=1G
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180201132757.23063-5-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Linux commit 749df87bd7bee5a79cef073f5d032ddb2b211de8 (v4.14-rc1)
added a new flag, MFD_HUGETLB, to memfd_create() that specifies that
the file to be created resides in the hugetlbfs filesystem. This is
the generic hugetlbfs filesystem, not associated with any specific
mount point.
hugetlbfs does not support sealing operations in v4.14, therefore
specifying MFD_ALLOW_SEALING with MFD_HUGETLB will result in EINVAL.
However, I added sealing support in "[PATCH v3 0/9] memfd: add sealing
to hugetlb-backed memory" series, queued in -mm tree for v4.16.
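A sketch of how a caller can cope with that, assuming glibc >= 2.27
(which declares memfd_create() in <sys/mman.h>); older systems would
need a raw syscall wrapper:
#define _GNU_SOURCE
#include <sys/mman.h>
#include <errno.h>

static int hugetlb_memfd(const char *name)
{
    /* prefer a sealable hugetlb memfd... */
    int fd = memfd_create(name, MFD_HUGETLB | MFD_ALLOW_SEALING);

    if (fd < 0 && errno == EINVAL) {
        /* ...but fall back on kernels that reject MFD_ALLOW_SEALING
         * together with MFD_HUGETLB (anything before the sealing series) */
        fd = memfd_create(name, MFD_HUGETLB);
    }
    return fd;
}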
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180201132757.23063-3-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If no one joins the thread, its associated memory is leaked.
Reported-by: CheneyLin <linzc@zju.edu.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The effects of ivshmem_enable_irqfd() were not undone on device reset.
This manifested as:
ivshmem_add_kvm_msi_virq: Assertion `!s->msi_vectors[vector].pdev' failed.
when irqfd was enabled before reset and then enabled again after reset, making
ivshmem_enable_irqfd() run for the second time.
To reproduce, run:
ivshmem-server
and QEMU with:
-device ivshmem-doorbell,chardev=iv
-chardev socket,path=/tmp/ivshmem_socket,id=iv
then install the Windows driver, at the time of writing available at:
https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/ivshmem
and crash-reboot the guest by inducing a BSOD.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Message-Id: <20171211072110.9058-5-lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Adds a rollback path to ivshmem_enable_irqfd() and fixes
ivshmem_disable_irqfd() to bail if irqfd has not been enabled.
To reproduce, run:
ivshmem-server -n 0
and QEMU with:
-device ivshmem-doorbell,chardev=iv
-chardev socket,path=/tmp/ivshmem_socket,id=iv
then load, unload, and load again the Windows driver, at the time of writing
available at:
https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/ivshmem
The issue is believed to have been masked by other guest drivers, notably
Linux ones, not enabling MSI-X on the device.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20171211072110.9058-4-lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As of commit 660c97eef6 ("ivshmem: use kvm irqfd for msi notifications"),
QEMU crashes with:
ivshmem: msix_set_vector_notifiers failed
msix_unset_vector_notifiers: Assertion `dev->msix_vector_use_notifier && dev->msix_vector_release_notifier' failed.
if MSI-X is repeatedly enabled and disabled on the ivshmem device, for example
by loading and unloading the Windows ivshmem driver. This is because
msix_unset_vector_notifiers() doesn't call any of the release notifier callbacks
since MSI-X is already disabled at that point (msix_enabled() returning false
is how this transition is detected in the first place). Thus ivshmem_vector_mask()
doesn't run and when MSI-X is subsequently enabled again ivshmem_vector_unmask()
fails.
This is fixed by keeping track of unmasked vectors and making sure that
ivshmem_vector_mask() always runs on MSI-X disable.
Fixes: 660c97eef6 ("ivshmem: use kvm irqfd for msi notifications")
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20171211072110.9058-3-lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As of commit 660c97eef6 ("ivshmem: use kvm irqfd for msi notifications"),
QEMU crashes with:
kvm_irqchip_commit_routes: Assertion `ret == 0' failed.
if the ivshmem device is configured with more vectors than what the server
supports. This is caused by the ivshmem_vector_unmask() being called on
vectors that have not been initialized by ivshmem_add_kvm_msi_virq().
This commit fixes it by adding a simple check to the mask and unmask
callbacks.
Note that the opposite mismatch, if the server supplies more vectors than
what the device is configured for, is already handled and leads to output
like:
Too many eventfd received, device has 1 vectors
To reproduce the assert, run:
ivshmem-server -n 0
and QEMU with:
-device ivshmem-doorbell,chardev=iv
-chardev socket,path=/tmp/ivshmem_socket,id=iv
then load the Windows driver, at the time of writing available at:
https://github.com/virtio-win/kvm-guest-drivers-windows/tree/master/ivshmem
The issue is believed to have been masked by other guest drivers, notably
Linux ones, not enabling MSI-X on the device.
Fixes: 660c97eef6 ("ivshmem: use kvm irqfd for msi notifications")
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20171211072110.9058-2-lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The following behavior was observed for QEMU configured by libvirt
to use the guest agent as usual for guests without a virtio-serial
driver (Windows, or a guest still in the BIOS stage).
In QEMU, on the first connect to the listening character device socket,
the listening socket is removed from poll just after the accept().
virtio_serial_guest_ready() returns 0 and the descriptor
of the connected Unix socket is removed from poll; it will
not be present in poll() until the guest initializes the driver
and changes the state of the serial to "guest connected".
In libvirt, connect() to the guest agent is performed on restart and
runs under the VM state lock. connect() is blocking and can
wait forever. In this case libvirt cannot perform ANY operation on
that VM.
The bug can be easily reproduced this way:
Terminal 1:
qemu-system-x86_64 -m 512 -device pci-serial,chardev=serial1 -chardev socket,id=serial1,path=/tmp/console.sock,server,nowait
(virtio-serial and isa-serial also fit)
Terminal 2:
minicom -D unix\#/tmp/console.sock
(type something and press enter)
C-a x (to exit)
Do 3 times:
minicom -D unix\#/tmp/console.sock
C-a x
It needs 4 connections, because the first one is accepted by QEMU, then two are queued by
the kernel, and the 4th blocks.
The problem is that QEMU doesn't add a read watcher after a successful
read until the guest device wants to acquire the received data, so
I propose to install a separate POLLHUP watcher regardless of
whether the device waits for data or not.
Signed-off-by: Klim Kireev <klim.kireev@virtuozzo.com>
Message-Id: <20180125135129.9305-1-klim.kireev@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When unregistering memory listeners, we should call, e.g.,
region_del() (and possibly other undo operations) on every existing
memory region section there, otherwise we may leak resources that are
held during region_add(). This patch undoes that for the
listeners, which emulates the case where the address space goes from
its current state to an empty state.
I found this problem when debugging a refcount leak issue that leads to
a device unplug event being lost (please see the "Bug:" line below). In
that case, the leaked resource is the PCI BAR memory region refcount.
And since memory regions do not keep their own refcount but put it onto
their owners, the refcount of the vfio-pci device (which is the owner
of the PCI BAR memory regions) is leaked, and the event goes missing.
We had encountered similar issues before and fixed them in another
way (ee4c112846, "vhost: Release memory references on cleanup"). This
patch can be seen as a more high-level fix of similar problems that are
caused by resource leaks from memory listeners. So now we can remove
the explicit unref of memory regions, since that will be done
altogether during the unregistering of listeners.
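Schematically, the unregister path now does something like this (the
iteration helper is illustrative, not the exact memory.c internals):
MemoryRegionSection section;

/* pretend the listener's address space went from its current view to empty */
FOR_EACH_SECTION_IN(listener->address_space, section) {  /* illustrative macro */
    if (listener->region_del) {
        listener->region_del(listener, &section);
    }
}
if (listener->commit) {
    listener->commit(listener);
}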
Bug: https://bugzilla.redhat.com/show_bug.cgi?id=1531393
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180122060244.29368-5-peterx@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After the next patch, listener unregister will need the container to be
alive. Let's move this unregister phase to before unsetting the
container, since that operation will free the backend container in the
kernel; otherwise we'll get these after the next patch:
qemu-system-x86_64: VFIO_UNMAP_DMA: -22
qemu-system-x86_64: vfio_dma_unmap(0x559bf53a4590, 0x0, 0xa0000) = -22 (Invalid argument)
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180122060244.29368-4-peterx@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: Alex Williamson <alex.williamson@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is a preparation for a follow-up patch to call region_del() in
memory_listener_unregister(), otherwise all device addresses attached
to kvm_devices_head will be reset before calling kvm_arm_set_device_addr.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180122060244.29368-3-peterx@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It helps ASAN to detect more leaks on coroutine stacks, and to get rid
of some extra warnings.
Before:
tests/test-coroutine -p
/basic/lifecycle
/basic/lifecycle: ==20781==WARNING: ASan doesn't fully support
makecontext/swapcontext functions and may produce false positives in
some cases!
==20781==WARNING: ASan is ignoring requested __asan_handle_no_return:
stack top: 0x7ffcb184d000; bottom 0x7ff6c4cfd000; size: 0x0005ecb50000
(25446121472)
False positive error reports may follow
For details see https://github.com/google/sanitizers/issues/189
OK
After:
tests/test-coroutine -p /basic/lifecycle
/basic/lifecycle: ==21110==WARNING: ASan doesn't fully support
makecontext/swapcontext functions and may produce false positives in
some cases!
OK
Similar work would need to be done for sigaltstack & Windows fibers
to have similar coverage. Since ucontext is preferred, I didn't bother
checking the other coroutine implementations for now.
Update travis to fix the build with ASAN annotations.
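The annotations themselves amount to bracketing every stack switch with
the sanitizer fiber hooks, roughly as below (the CONFIG_ASAN guard name
is illustrative):
#include <stddef.h>
#ifdef CONFIG_ASAN
#include <sanitizer/common_interface_defs.h>
#endif

static void start_switch_fiber(void **fake_stack_save,
                               const void *bottom, size_t size)
{
#ifdef CONFIG_ASAN
    /* tell ASan we are about to run on the coroutine's stack */
    __sanitizer_start_switch_fiber(fake_stack_save, bottom, size);
#endif
}

static void finish_switch_fiber(void *fake_stack_save)
{
#ifdef CONFIG_ASAN
    /* back from the switch; the previous stack bounds are not needed here */
    __sanitizer_finish_switch_fiber(fake_stack_save, NULL, NULL);
#endif
}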
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180116151152.4040-4-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Migration pull 2018-02-06
This is based off Juan's last pull with a few extras, but
also removing:
Add migration xbzrle test
Add migration precopy test
As well as my normal test boxes, I also gave it a test
on a 32 bit ARM box and it seems happy (a Calxeda highbank)
and a big-endian power box.
Dave
# gpg: Signature made Tue 06 Feb 2018 15:33:31 GMT
# gpg: using RSA key 0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert/tags/pull-migration-20180206a:
migration: incoming postcopy advise sanity checks
migration: Don't leak IO channels
migration: Recover block devices if failure in device state
tests: Adjust sleeps for migration test
tests: Create migrate-start-postcopy command
tests: Add deprecated commands migration test
tests: Use consistent names for migration
tests: Consolidate accelerators declaration
tests: Remove deprecated migration tests commands
migration: Drop current address parameter from save_zero_page()
migration: use s->threshold_size inside migration_update_counters
migration/savevm.c: set MAX_VM_CMD_PACKAGED_SIZE to 1ul << 32
migration: Route errors down through migration_channel_connect
migration: Allow migrate_fd_connect to take an Error *
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Python queue, 2018-02-05
# gpg: Signature made Mon 05 Feb 2018 23:07:57 GMT
# gpg: using RSA key 2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/python-next-pull-request: (21 commits)
docker: change Fedora images to run with python3
travis: improve python version test coverage
ui: update keycodemapdb to get py3 fixes
input: add missing JIS keys to virtio input
qemu.py: don't launch again before shutdown()
qemu.py: cleanup redundant calls in launch()
qemu.py: use poll() instead of 'returncode'
qemu.py: always cleanup on shutdown()
qemu.py: refactor launch()
qemu.py: better control of created files
qemu.py: remove unused import
configure: allow use of python 3
scripts: ensure signrom treats data as bytes
qapi: force a UTF-8 locale for running Python
qapi: ensure stable sort ordering when checking QAPI entities
qapi: remove '-q' arg to diff when comparing QAPI output
qapi: Adapt to moved location of 'maketrans' function in py3
qapi: adapt to moved location of StringIO module in py3
qapi: Use OrderedDict from standard library if available
qapi: use items()/values() intead of iteritems()/itervalues()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
These quirks are necessary for GeForce, but not for Quadro/GRID/Tesla
assignment. Leaving them enabled is fully functional and provides the
most compatibility, but due to the unique NVIDIA MSI ACK behavior[1],
it also introduces latency in re-triggering the MSI interrupt. This
overhead is typically negligible, but has been shown to adversely
affect some (very) high interrupt rate applications. This adds the
vfio-pci device option "x-no-geforce-quirks=" which can be set to
"on" to disable this additional overhead.
A follow-on optimization for GeForce might be to make use of an
ioeventfd to allow KVM to trigger an irqfd in the kernel vfio-pci
driver, avoiding the bounce through userspace to handle this device
write.
[1] Background: the NVIDIA driver has been observed to issue a write
to the MMIO mirror of PCI config space in BAR0 in order to allow the
MSI interrupt for the device to retrigger. Older reports indicated a
write of 0xff to the (read-only) MSI capability ID register, while
more recently a write of 0x0 is observed at config space offset 0x704,
non-architected, extended config space of the device (BAR0 offset
0x88704). Virtualization of this range is only required for GeForce.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
There is already @hostwin in vfio_listener_region_add() so there is no
point in having the other one.
Fixes: 2e4109de8e ("vfio/spapr: Create DMA window dynamically (SPAPR IOMMU v2)")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Recently proposed vfio-pci kernel changes (v4.16) remove the
restriction preventing userspace from mmap'ing PCI BARs in areas
overlapping the MSI-X vector table. This change is primarily intended
to benefit host platforms which make use of system page sizes larger
than the PCI spec recommendation for alignment of MSI-X data
structures (ie. not x86_64). In the case of POWER systems, the SPAPR
spec requires the VM to program MSI-X using hypercalls, rendering the
MSI-X vector table unused in the VM view of the device. However,
ARM64 platforms also support 64KB pages and rely on QEMU emulation of
MSI-X. Regardless of the kernel driver allowing mmaps overlapping
the MSI-X vector table, emulation of the MSI-X vector table also
prevents direct mapping of device MMIO spaces overlapping this page.
Thanks to the fact that PCI devices have a standard self discovery
mechanism, we can try to resolve this by relocating the MSI-X data
structures, either by creating a new PCI BAR or extending an existing
BAR and updating the MSI-X capability for the new location. There's
even a very slim chance that this could benefit devices which do not
adhere to the PCI spec alignment guidelines on x86_64 systems.
This new x-msix-relocation option accepts the following choices:
off: Disable MSI-X relocation, use native device config (default)
auto: Use a known good combination for the platform/device (none yet)
bar0..bar5: Specify the target BAR for MSI-X data structures
If compatible, the target BAR will either be created or extended and
the new portion will be used for MSI-X emulation.
The first obvious user question with this option is how to determine
whether a given platform and device might benefit from this option.
In most cases, the answer is that it won't, especially on x86_64.
Devices often dedicate an entire BAR to MSI-X and therefore no
performance sensitive registers overlap the MSI-X area. Take for
example:
# lspci -vvvs 0a:00.0
0a:00.0 Ethernet controller: Intel Corporation I350 Gigabit Network Connection
...
Region 0: Memory at db680000 (32-bit, non-prefetchable) [size=512K]
Region 3: Memory at db7f8000 (32-bit, non-prefetchable) [size=16K]
...
Capabilities: [70] MSI-X: Enable+ Count=10 Masked-
Vector table: BAR=3 offset=00000000
PBA: BAR=3 offset=00002000
This device uses the 16K bar3 for MSI-X with the vector table at
offset zero and the pending bits array at offset 8K, fully honoring
the PCI spec alignment guidance. The data sheet specifically refers
to this as an MSI-X BAR. This device would not see a benefit from
MSI-X relocation regardless of the platform, regardless of the page
size.
However, here's another example:
# lspci -vvvs 02:00.0
02:00.0 Serial Attached SCSI controller: xxxxxxxx
...
Region 0: I/O ports at c000 [size=256]
Region 1: Memory at ef640000 (64-bit, non-prefetchable) [size=64K]
Region 3: Memory at ef600000 (64-bit, non-prefetchable) [size=256K]
...
Capabilities: [c0] MSI-X: Enable+ Count=16 Masked-
Vector table: BAR=1 offset=0000e000
PBA: BAR=1 offset=0000f000
Here the MSI-X data structures are placed on separate 4K pages at the
end of a 64KB BAR. If our host page size is 4K, we're likely fine,
but at 64KB page size, MSI-X emulation at that location prevents the
entire BAR from being directly mapped into the VM address space.
Overlapping performance sensitive registers then starts to be a very
likely scenario on such a platform. At this point, the user could
enable tracing on vfio_region_read and vfio_region_write to determine
more conclusively if device accesses are being trapped through QEMU.
Upon finding a device and platform in need of MSI-X relocation, the
next problem is how to choose a target PCI BAR to host the MSI-X data
structures. A few key rules to keep in mind for this selection
include:
* There are only 6 BAR slots, bar0..bar5
* 64-bit BARs occupy two BAR slots, 'lspci -vvv' lists the first slot
* PCI BARs are always a power of 2 in size, extending == doubling
* The maximum size of a 32-bit BAR is 2GB
* MSI-X data structures must reside in an MMIO BAR
Using these rules, we can evaluate each BAR of the second example
device above as follows:
bar0: I/O port BAR, incompatible with MSI-X tables
bar1: BAR could be extended, incurring another 64KB of MMIO
bar2: Unavailable, bar1 is 64-bit, this register is used by bar1
bar3: BAR could be extended, incurring another 256KB of MMIO
bar4: Unavailable, bar3 is 64bit, this register is used by bar3
bar5: Available, empty BAR, minimum additional MMIO
A secondary optimization we might wish to make in relocating MSI-X
is to minimize the additional MMIO required for the device, therefore
we might test the available choices in order of preference as bar5,
bar1, and finally bar3. The original proposal for this feature
included an 'auto' option which would choose bar5 in this case, but
various drivers have been found that make assumptions about the
properties of the "first" BAR or the size of BARs such that there
appears to be no foolproof automatic selection available, requiring
known good combinations to be sourced from users. This patch is
pre-enabled for an 'auto' selection making use of a validated lookup
table, but no entries are yet identified.
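For example, placing the MSI-X structures in the empty bar5 of the SAS
controller above would look like (host address illustrative):
-device vfio-pci,host=02:00.0,x-msix-relocation=bar5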
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The kernel provides similar emulation of PCI BAR register access to
QEMU, so up until now we've used that for things like BAR sizing and
storing the BAR address. However, if we intend to resize BARs or add
BARs that don't exist on the physical device, we need to switch to the
pure QEMU emulation of the BAR.
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Add one more layer to our stack of MemoryRegions, this base region
allows us to register BARs independently of the vfio region or to
extend the size of BARs which do map to a region. This will be
useful when we want hypervisor defined BARs or sections of BARs,
for purposes such as relocating MSI-X emulation. We therefore call
msix_init() based on this new base MemoryRegion, while the quirks,
which only modify regions, still operate on those sub-MemoryRegions.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
In order to enable TCE operations support in KVM, we have to inform
KVM about VFIO groups being attached to specific LIOBNs;
the necessary bits are implemented already by IOMMU MR and VFIO.
This defines get_attr() for the SPAPR TCE IOMMU MR which makes VFIO
call the KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE ioctl and establish
LIOBN-to-IOMMU link.
This changes spapr_tce_set_need_vfio() to avoid TCE table reallocation
if the kernel supports the TCE acceleration.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
[aw - remove unnecessary sys/ioctl.h include]
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
In order to enable TCE operations support in KVM, we have to inform
KVM about VFIO groups being attached to specific LIOBNs. KVM
already knows about VFIO groups; the only bit missing is which
in-kernel TCE table (the one with user-visible TCEs) should update
the attached groups. There is a KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE
attribute of the VFIO KVM device which receives a groupfd/tablefd couple.
This uses a new memory_region_iommu_get_attr() helper to get the IOMMU fd
and calls KVM to establish the link.
As get_attr() is not implemented yet, this should cause no behavioural
change.
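A sketch of the resulting call into KVM, using the uapi names from
linux/kvm.h (surrounding QEMU plumbing elided):
struct kvm_vfio_spapr_tce param = {
    .groupfd = groupfd,
    .tablefd = tablefd,
};
struct kvm_device_attr attr = {
    .group = KVM_DEV_VFIO_GROUP,
    .attr = KVM_DEV_VFIO_GROUP_SET_SPAPR_TCE,
    .addr = (uint64_t)(uintptr_t)&param,
};

if (ioctl(vfio_kvm_device_fd, KVM_SET_DEVICE_ATTR, &attr)) {
    error_report("Failed to link VFIO group and LIOBN in KVM");
}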
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
This adds get_attr() to IOMMUMemoryRegionClass, like
iommu_ops::domain_get_attr in the Linux kernel.
This defines the first attribute - IOMMU_ATTR_SPAPR_TCE_FD - which
will be used between the pSeries machine and VFIO-PCI.
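A sketch of the new hook and its first attribute, as described above:
enum IOMMUMemoryRegionAttr {
    IOMMU_ATTR_SPAPR_TCE_FD,
};

/* new member of IOMMUMemoryRegionClass, mirroring iommu_ops::domain_get_attr;
 * returns 0 on success, a negative errno if the attribute is unsupported */
int (*get_attr)(IOMMUMemoryRegion *iommu, enum IOMMUMemoryRegionAttr attr,
                void *data);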
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Replace a large number of the fprintf(stderr, "*\n" calls with
error_report(). The functions were renamed with these commands and then
compiler issues were manually fixed.
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
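The net effect on a typical call site (the message text is illustrative;
error_report() appends the newline itself and comes from
"qemu/error-report.h"):
/* before */
fprintf(stderr, "failed to initialize device\n");

/* after */
error_report("failed to initialize device");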
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Peter Crosthwaite <crosthwaite.peter@gmail.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Stefan Weil <sw@weilnetz.de>
Conversions that aren't followed by exit() dropped, because they might
be inappropriate.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180203084315.20497-14-armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Replace a large number of the fprintf(stderr, "*\n" calls with
error_report(). The functions were renamed with these commands and then
compiler issues were manually fixed.
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
Some lines were then manually tweaked to pass checkpatch.
xen_pt_log() was left with an fprintf(stderr,
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Conversions that aren't followed by exit() dropped, because they might
be inappropriate.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180203084315.20497-13-armbru@redhat.com>
Replace a large number of the fprintf(stderr, "*\n" calls with
error_report(). The functions were renamed with these commands and then
compiler issues were manually fixed.
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
Some lines were then manually tweaked to pass checkpatch.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Cc: Fabien Chouteau <chouteau@adacore.com>
Cc: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Cc: Artyom Tarasenko <atar4qemu@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180203084315.20497-12-armbru@redhat.com>
Replace a large number of the fprintf(stderr, "*\n" calls with
error_report(). The functions were renamed with these commands and then
compiler issues were manually fixed.
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
Some lines were then manually tweaked to pass checkpatch and some curly
braces were added to match QEMU style.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Cc: qemu-ppc@nongnu.org
Conversions that aren't followed by exit() dropped, because they might
be inappropriate.
Also trim trailing punctuation from error messages.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180203084315.20497-10-armbru@redhat.com>
Replace a large number of the fprintf(stderr, "*\n" calls with
error_report(). The functions were renamed with these commands and then
compiler issues were manually fixed.
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
Some lines were then manually tweaked to pass checkpatch.
A trailing '.' was removed in hw/pci/pci.c
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Conversions that aren't followed by exit() dropped, because they might
be inappropriate.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180203084315.20497-9-armbru@redhat.com>
Replace a large number of the fprintf(stderr, "*\n" calls with
error_report(). The functions were renamed with these commands and then
compiler issues were manually fixed.
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
Some lines were then manually tweaked to pass checkpatch.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Cc: Jia Liu <proljc@gmail.com>
Cc: Stafford Horne <shorne@gmail.com>
Acked-by: Stafford Horne <shorne@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180203084315.20497-8-armbru@redhat.com>
Replace a large number of the fprintf(stderr, "*\n" calls with
error_report(). The functions were renamed with these commands and then
compiler issues were manually fixed.
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
Some lines were then manually tweaked to pass checkpatch.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Cc: Anthony Green <green@moxielogic.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180203084315.20497-7-armbru@redhat.com>
Replace a large number of the fprintf(stderr, "*\n" calls with
error_report(). The functions were renamed with these commands and then
compiler issues were manually fixed.
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
Some lines were then manually tweaked to pass checkpatch.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paul Burton <paul.burton@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: "Hervé Poussineau" <hpoussin@reactos.org>
Conversions that aren't followed by exit() dropped, because they might
be inappropriate.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180203084315.20497-6-armbru@redhat.com>
Replace a large number of the fprintf(stderr, "*\n" calls with
error_report(). The functions were renamed with these commands and then
compiler issues were manually fixed.
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
Some lines were then manually tweaked to pass checkpatch.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Cc: Michael Walle <michael@walle.cc>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Michael Walle <michael@walle.cc>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180203084315.20497-5-armbru@redhat.com>
Replace a large number of the fprintf(stderr, "*\n" calls with
error_report(). The functions were renamed with these commands and then
compiler issues were manually fixed.
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
Some lines were then manually tweaked to pass checkpatch.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Conversions that aren't followed by exit() dropped, because they might
be inappropriate.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180203084315.20497-4-armbru@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Replace a large number of the fprintf(stderr, "*\n", ...) calls with
error_report(). The calls were converted with these commands and then
the remaining compiler issues were manually fixed.
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
Some lines were then manually tweaked to pass checkpatch.
The 'qemu: ' prefix was manually removed from the hw/arm/boot.c file.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Cc: qemu-arm@nongnu.org
Conversions that aren't followed by exit() dropped, because they might
be inappropriate.
Also trim trailing punctuation from error messages.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180203084315.20497-3-armbru@redhat.com>
Apparently we no longer support compilers that define _MSC_VER, and we always
require a C99 compiler (which means we always have __func__), so we don't
need a special AUDIO_FUNC macro. We can just replace AUDIO_FUNC with
__func__ instead.
Checkpatch failures were manually fixed.
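A minimal sketch of the conversion; the call site and message below are illustrative and not taken from the audio code:
```c
#include <stdio.h>

/* Before: an indirection macro kept around for pre-C99 compilers. */
#define AUDIO_FUNC __func__

static void report(void)
{
    fprintf(stderr, "%s: something went wrong\n", AUDIO_FUNC);  /* before */
    fprintf(stderr, "%s: something went wrong\n", __func__);    /* after  */
}
```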
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180203084315.20497-2-armbru@redhat.com>
If postcopy-ram was set on the source but not on the destination,
migration doesn't occur, the destination prints an error and boots
the guest:
qemu-system-ppc64: Expected vmdescription section, but got 0
We end up with two running instances.
This behaviour was introduced in 2.11 by commit 58110f0acb "migration:
split common postcopy out of ram postcopy" to prepare ground for the
upcoming dirty bitmap postcopy support. It adds a new case where the
source may send an empty postcopy advise because dirty bitmap doesn't
need to check page sizes like RAM postcopy does.
If the source has enabled postcopy-ram, then it sends an advise with
the page size values. If the destination hasn't enabled postcopy-ram,
then loadvm_postcopy_handle_advise() leaves the page size values on
the stream and returns. This confuses qemu_loadvm_state() later on
and causes the destination to start execution.
As discussed several times, postcopy-ram should be enabled on both sides
to be functional. This patch changes the destination to perform some
extra checks on the advise length to ensure this is the case. Otherwise
an error is returned and migration is aborted.
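A minimal sketch of the kind of check this describes, not the exact code; the real handler is loadvm_postcopy_handle_advise() in migration/savevm.c and the lengths it accepts may differ from this simplification:
```c
static int handle_advise_sketch(MigrationIncomingState *mis, uint16_t len)
{
    if (!migrate_postcopy_ram()) {
        /* Destination has no postcopy-ram: only an empty advise is valid. */
        if (len != 0) {
            error_report("postcopy advise carries page sizes, but "
                         "postcopy-ram is not enabled on the destination");
            return -EINVAL;
        }
        return 0;
    }
    if (len == 0) {
        error_report("postcopy-ram is enabled here, but the advise is empty");
        return -EINVAL;
    }
    /* ... read and validate the page-size values as before ... */
    return 0;
}
```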
Reported-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <151791621042.19120.3103118434734245776.stgit@bahia>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
In e91d895 I added the new pause-before-switchover mechanism
to allow migration completion to be delayed; this changes the
last state prior to completion to MIGRATE_STATUS_DEVICE rather
than MIGRATE_STATUS_ACTIVE.
Fix the failure path in migration_completion to recover the block
devices if it fails in MIGRATE_STATUS_DEVICE, not just the
MIGRATE_STATUS_ACTIVE that it previously had.
This corresponds to rh bz:
https://bugzilla.redhat.com/show_bug.cgi?id=1538494
whose symptom is an occasional source crash on a failed migration.
Fixes: e91d8951d5
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
We add deprecated commands on a new test, so we don't have to add it
on normal tests.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
It already has RAMBlock and offset, it can calculate it itself.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
MAX_VM_CMD_PACKAGED_SIZE is a constant used in qemu_savevm_send_packaged
and loadvm_handle_cmd_packaged to determine whether a package is too
big to be sent or received. qemu_savevm_send_packaged is called inside
postcopy_start (migration/migration.c) to send the MigrationState
in a single blob to the destination, using the MIG_CMD_PACKAGED subcommand,
which will read it up using loadvm_handle_cmd_packaged. If the blob is
larger than MAX_VM_CMD_PACKAGED_SIZE, an error is thrown and the postcopy
migration is aborted. Both MAX_VM_CMD_PACKAGED_SIZE and MIG_CMD_PACKAGED
were introduced by commit 11cf1d984b ("MIG_CMD_PACKAGED: Send a packaged
chunk ..."). The constant has its original value of 1ul << 24 (16MB).
The current MAX_VM_CMD_PACKAGED_SIZE value is not enough to support postcopy
migration of bigger pseries guests. The blob size for a postcopy migration of
a pseries guest with the following setup:
qemu-system-ppc64 --nographic -vga none -machine pseries,accel=kvm -m 64G \
-smp 1,maxcpus=32 -device virtio-blk-pci,drive=rootdisk \
-drive file=f27.qcow2,if=none,cache=none,format=qcow2,id=rootdisk \
-netdev user,id=u1 -net nic,netdev=u1
goes to around 12MB. Bumping the RAM to 128G makes the blob size go to 20MB.
With 256G the blob goes to 37MB - more than twice the current maximum size.
At this moment the pseries machine can handle guests with up to 1TB of RAM,
which would make this postcopy blob approximately 128MB in size.
Following the discussions made in [1], there is a need to understand what
devices are aggressively consuming the blob in that manner and see if that
can be mitigated. Until then, we can set MAX_VM_CMD_PACKAGED_SIZE to the
maximum value allowed. Since the size is a 32 bit int variable, we can set
it as 1ul << 32, giving a maximum blob size of 4G that is enough to support
postcopy migration of 32TB RAM guests given the above constraints.
[1] https://lists.nongnu.org/archive/html/qemu-devel/2018-01/msg06313.html
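A rough sketch of the change itself, using the constant named above (check migration/savevm.c for the actual definition):
```c
/* Hedged sketch only: bump the package size limit as described above. */
#define MAX_VM_CMD_PACKAGED_SIZE (1ul << 32)   /* previously (1ul << 24), i.e. 16MB */
```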
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Reported-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Route async errors (especially from sockets) down through
migration_channel_connect and on to migrate_fd_connect where they
can be cleaned up.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Allow whatever is performing the connection to pass migrate_fd_connect
an error to indicate there was a problem during connection, and allow
us to clean up.
The caller must free the error.
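A hedged sketch of the resulting interface shape (the real prototype lives in migration/migration.h):
```c
/*
 * Sketch only: connection code hands in whatever Error it hit, and
 * migrate_fd_connect() reports it and cleans up the migration state.
 * The caller still owns and must free the Error it passed in.
 */
void migrate_fd_connect(MigrationState *s, Error *error_in);
```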
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Currently travis declares that ancient python 2.4 is desired. Update that to
2.6, which is the oldest version any targeted distro still needs. If we
just list a python 3 version at the top level, this will double the
number of travis jobs we run, which is unreasonable.
So arbitrarily pick the clang test matrix entries to build with python
3.0 and 3.6, to extend coverage of python versions, without increasing
job count or build time.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20180116134217.8725-14-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
If a VM is launched, files are created and a cleanup is required before
a new launch. This cleanup is executed by shutdown(), so shutdown() must
be called even if the VM is manually terminated (i.e. using kill).
This patch creates a control to make sure launch() will not be executed
again if shutdown() is not called after the previous launch().
Signed-off-by: Amador Pahim <apahim@redhat.com>
Message-Id: <20180122205033.24893-7-apahim@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Now that shutdown() is guaranteed to always execute self._load_io_log()
and self._post_shutdown(), their calls in 'except' became redundant and
we can safely replace them with a call to shutdown().
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Amador Pahim <apahim@redhat.com>
Message-Id: <20180122205033.24893-6-apahim@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The 'returncode' Popen attribute is not guaranteed to be updated. It
actually depends on a call to either poll(), wait() or communicate().
On the other hand, poll() will: "Check if child process has terminated.
Set and return returncode attribute."
Let's use the poll() to check whether the process is running and to get
the updated process exit code, when the process is finished.
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Amador Pahim <apahim@redhat.com>
Message-Id: <20180122205033.24893-5-apahim@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Currently we only clean up on shutdown() if the VM is running.
To make sure we always clean up, this patch makes
self._load_io_log() and self._post_shutdown()
always be called on shutdown(), regardless of the VM running state.
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Amador Pahim <apahim@redhat.com>
Message-Id: <20180122205033.24893-4-apahim@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
To launch a VM, we need to create basically two files: the monitor
socket (if it's a UNIX socket) and the qemu log file.
For the qemu log file, we currently just open the path, which will
create the file if it does not exist or overwrite the file if it does
exist.
For the monitor socket, if it already exists, we are currently removing
it, even if it was not created by us.
This patch moves to _pre_launch() the responsibility to create a
temporary directory to host the files so we can remove the whole
directory on _post_shutdown().
Signed-off-by: Amador Pahim <apahim@redhat.com>
Message-Id: <20180122205033.24893-2-apahim@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Python 2 did not validate locale correctness when reading input data, so it
would happily read UTF-8 data in non-UTF-8 locales. Python 3 is strict, so
if you try to read UTF-8 data in the C locale, it will raise an error
for any UTF-8 bytes that aren't representable in 7-bit ASCII encoding.
e.g.
UnicodeDecodeError: 'ascii' codec can't decode byte 0xc3 in position 54: ordinal not in range(128)
Traceback (most recent call last):
File "/tmp/qemu-test/src/scripts/qapi-commands.py", line 317, in <module>
schema = QAPISchema(input_file)
File "/tmp/qemu-test/src/scripts/qapi.py", line 1468, in __init__
parser = QAPISchemaParser(open(fname, 'r'))
File "/tmp/qemu-test/src/scripts/qapi.py", line 301, in __init__
previously_included)
File "/tmp/qemu-test/src/scripts/qapi.py", line 348, in _include
exprs_include = QAPISchemaParser(fobj, previously_included, info)
File "/tmp/qemu-test/src/scripts/qapi.py", line 271, in __init__
self.src = fp.read()
File "/usr/lib64/python3.5/encodings/ascii.py", line 26, in decode
return codecs.ascii_decode(input, self.errors)[0]
More background on this can be seen in
https://www.python.org/dev/peps/pep-0538/
Many distros support a new C.UTF-8 locale that is like the C locale,
but with UTF-8 instead of 7-bit ASCII. That is not entirely portable
though. This patch thus sets LANG to "C", but overrides LC_CTYPE
to the en_US.UTF-8 locale. This gets us pretty close to C.UTF-8, but
in a way that should be portable to everywhere QEMU builds.
This patch only forces UTF-8 for QAPI scripts, since that is the one
showing the immediate error under Python3 with C locale, but potentially
we ought to force this for all python scripts used in the build process.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20180116134217.8725-9-berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Some early python 3.x versions will have different default
ordering when calling the 'values()' method on a dict, compared
to python 2.x and later 3.x versions. Explicitly sort the items
to get a stable ordering.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20180116134217.8725-8-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
When the qapi schema tests fail they merely print that the expected
output didn't match the actual output. This is largely useless when
trying to diagnose what went wrong. Removing the '-q' arg to diff
means that it is still silent on successful tests, but when it
fails we'll see details of the incorrect output.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20180116134217.8725-7-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The iteritems()/itervalues() methods are gone in py3, but the
items()/values() methods are still around. The latter are less
efficient than the former in py2, but this has unmeasurably
small impact on QEMU build time, so taking portability over
efficiency is a net win.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20180116134217.8725-3-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Python 3 no longer supports the bare "print" statement, it must be
called as a normal function with round brackets. It is possible to
opt-in to this new syntax with Python 2.6 onwards by importing the
"print_function" from the "__future__" module, making it easy to
support Python 2 and 3 in parallel.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20180116134217.8725-2-berrange@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
A gcc 5.4.0-6ubuntu1~16.04.5 build with UBSAN enabled reports this error:
CC hw/display/exynos4210_fimd.o
/home/petmay01/linaro/qemu-for-merges/hw/display/exynos4210_fimd.c: In
function ‘fimd_get_buffer_id’:
/home/petmay01/linaro/qemu-for-merges/hw/display/exynos4210_fimd.c:1105:5:
error: case label does not reduce to an integer constant
case FIMD_WINCON_BUF2_STAT:
Because the FIMD_WINCON_BUF2_STAT case label contains an integer
overflow, use the U suffix to get an unsigned type.
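A standalone illustration of the warning class being fixed; the macro names and values below are made up for the example:
```c
/* Shifting a set bit into bit 31 of a signed int overflows, so the old
 * form is not a valid integer constant expression for a case label under
 * UBSAN; the U suffix keeps the arithmetic unsigned and well defined. */
#define BUF_STAT_SIGNED   (0x3 << 30)    /* signed overflow */
#define BUF_STAT_UNSIGNED (0x3U << 30)   /* OK: unsigned arithmetic */
```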
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180116151152.4040-2-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The inet_parse() function looks for 'ipv4' and 'ipv6' flags, but only
treats them as bare bool flags. The normal QemuOpts parsing would allow
on/off values to be set too.
This updates inet_parse() so that its handling of the 'ipv4' and 'ipv6'
flags matches that done by QemuOpts.
This impacts the NBD block driver parsing the legacy filename syntax and
the migration code parsing the socket scheme.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20180125171412.21627-1-berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We dropped support for ia64 host CPUs in the 2.11 release (removing
the TCG backend for it, and advertising the support as being
completely removed in the changelog). However there are a few bits
and pieces of code still floating about. Remove those, too.
We can drop the check in configure for "ia64 or hppa host?"
entirely, because we don't support hppa hosts either any more.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1516897189-11035-1-git-send-email-peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
hvf.c and vmx.h contain code from hvdos.c that is released as public domain:
from hvdos github: https://github.com/mist64/hvdos
"License
See LICENSE.txt (2-clause-BSD).
In order to simplify use of this code as a template, you can consider any parts from "hvdos.c" and "interface.h" as being in the public domain."
Signed-off-by: Izik Eidus <izik@veertu.com>
Message-Id: <20180123123639.35255-2-izik@veertu.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The i2c core and the at24c EEPROM should only be compiled and linked
on the machines that support i2c. Otherwise it's quite strange to see
the at24c-eeprom being "available" on qemu-system-s390x, for example.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1516634853-15883-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The memory-internal.h header claims that it is for "obsolete
exec.c functions" which "will be removed soon". This statement
was added in 2011, six years ago, but the header is still here.
(Admittedly none of the prototypes added in commit 67d95c153b
are still in the header.)
It's convenient to have a place to put prototypes for functions
which are used internally to the various .c files of the memory
system or by the accel/tcg code, which is inevitably fairly
closely coupled. So keep the header but update the comments to
reflect what we're actually using it for.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1511276888-17834-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is required, otherwise python complains because of the
accented letter in Alex's last name:
Traceback (most recent call last):
File "scripts/qemu-gdb.py", line 29, in <module>
from qemugdb import aio, mtree, coroutine, tcg, timers
File "scripts/qemugdb/timers.py", line 1
SyntaxError: Non-ASCII character '\xc3' in file scripts/qemugdb/timers.py
on line 1, but no encoding declared;
see http://www.python.org/peps/pep-0263.html for details
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <151629549711.18276.15497684562308683805.stgit@bahia.lan>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since commit e5dc1a6c6c, QEMU aborts on exit if completion was used
in the monitor:
*** Error in `obj/ppc64-softmmu/qemu-system-ppc64': double free or
corruption (fasttop): 0x00000100331069d0 ***
/home/greg/Work/qemu/qemu-spapr/util/readline.c:514
/home/greg/Work/qemu/qemu-spapr/monitor.c:586
/home/greg/Work/qemu/qemu-spapr/monitor.c:4125
argv=<optimized out>, envp=<optimized out>) at
/home/greg/Work/qemu/qemu-spapr/vl.c:4795
Completion strings are not persistent across completions (why would
they be?). They are allocated under readline_completion(), which already
takes care of freeing them before returning.
Maybe all completion-related bits should be moved out of ReadLineState
to a dedicated structure?
In the meantime, let's drop the offending lines from readline_free()
to fix the crash.
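A hedged sketch of the resulting function, with the unrelated cleanup elided:
```c
void readline_free(ReadLineState *rs)
{
    /* ... history entries are still freed here, as before ... */

    /*
     * Removed: the loop that called g_free() on each rs->completions[i],
     * which double-freed strings already released by readline_completion().
     */
    g_free(rs);
}
```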
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <151627206353.4505.4602428849861610759.stgit@bahia.lan>
Fixes: e5dc1a6c6c
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This adds a tracepoint to trace the KVM_SET_USER_MEMORY_REGION ioctl
parameters, which is quite useful for debugging whether VFIO memory regions
are actually being registered with KVM.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20171215052326.21386-1-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The QOM API learning curve is quite steep, in particular when devices inherit
from an abstract parent.
To be more explicit about when a device class changes the parent hooks, add a
few helpers, hoping a device's class_init() will be easier to understand.
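A hedged sketch of the shape of one such helper (see include/hw/qdev-core.h for the real declarations): the parent's hook is saved before the class overrides it, so the override is explicit in class_init().
```c
void device_class_set_parent_reset(DeviceClass *dc,
                                   DeviceReset dev_reset,
                                   DeviceReset *parent_reset);
```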
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180114020412.26160-3-f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Merge tpm 2018/02/03 v1
# gpg: Signature made Sat 03 Feb 2018 14:02:35 GMT
# gpg: using RSA key 75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <stefanb@linux.vnet.ibm.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2018-02-03-1:
tpm: tis: move one-line function into caller
MAINTAINERS: add pointer to tpm-next repository
tpm: wrap stX_be_p in tpm_cmd_set_XYZ functions
tpm: Split off tpm_crb_reset function
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Wrap the calls to stl_be_p and stw_be_p in tpm_cmd_set_XYZ functions
that are similar to existing getters.
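A hedged sketch of the setter style, mirroring the existing getter; the real helpers live in hw/tpm/tpm_util.h:
```c
static inline void tpm_cmd_set_size(void *b, uint32_t size)
{
    stl_be_p(b + 2, size);   /* size field sits at offset 2, big-endian */
}
```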
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Split off the tpm_crb_reset function part from tpm_crb_realize
that we need to run every time the machine resets.
Also register our reset function with the system since TYPE_DEVICE
seems to not get a reset otherwise.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
This series is mostly about 9p request cancellation. It fixes a
long standing bug (read "specification violation") where the server
would send an invalid response when the client has cancelled an
in-flight request. This was causing annoying spurious EINTR returns
in Linux. The fix comes with some related testing in QTEST.
Other patches are code cleanup and improvements.
# gpg: Signature made Fri 02 Feb 2018 10:16:03 GMT
# gpg: using RSA key 71D4D5E5822F73D6
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg: aka "Gregory Kurz <gregory.kurz@free.fr>"
# gpg: aka "[jpeg image of size 3330]"
# Primary key fingerprint: B482 8BAF 9431 40CE F2A3 4910 71D4 D5E5 822F 73D6
* remotes/gkurz/tags/for-upstream:
tests/virtio-9p: explicitly handle potential integer overflows
tests: virtio-9p: add FLUSH operation test
libqos/virtio: return length written into used descriptor
tests: virtio-9p: add WRITE operation test
tests: virtio-9p: add LOPEN operation test
tests: virtio-9p: use the synth backend
tests: virtio-9p: wait for completion in the test code
tests: virtio-9p: move request tag to the test functions
9pfs: Correctly handle cancelled requests
9pfs: drop v9fs_register_transport()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Lots of litte miscellaneous fixes for the IPMI code, plus
add me as the IPMI maintainer.
# gpg: Signature made Thu 01 Feb 2018 18:44:55 GMT
# gpg: using RSA key 61F38C90919BFF81
# gpg: Good signature from "Corey Minyard <cminyard@mvista.com>"
# gpg: aka "Corey Minyard <minyard@acm.org>"
# gpg: aka "Corey Minyard <corey@minyard.net>"
# gpg: aka "Corey Minyard <minyard@mvista.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: FD0D 5CE6 7CE0 F59A 6688 2686 61F3 8C90 919B FF81
* remotes/cminyard/tags/for-release-20180201:
ipmi: Allow BMC device properties to be set
ipmi: disable IRQ and ATN on an external disconnect
ipmi: Fix macro issues
ipmi: Add the platform event message command
ipmi: Don't set the timestamp on add events that don't have it
ipmi: Fix SEL get/set time commands
Add maintainer for the IPMI code
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The idea is to send a victim request that will possibly block in the
server and to send a flush request to cancel the victim request.
This patch adds two tests to verify that:
- the server does not reply to a victim request that was actually
cancelled
- the server replies to the flush request after replying to the
victim request if it could not cancel it
9p request cancellation reference:
http://man.cat-v.org/plan_9/5/flush
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
(groug, change the test to only write a single byte to avoid
any alignment or endianness considerations)
When a 9p request is flushed (ie, cancelled) by the guest, the device
is expected to simply mark the request as used, without sending a 9p
reply (ie, without writing anything into the used buffer).
To be able to test this, we need access to the length written by the
device into the used descriptor. This patch adds a uint32_t * argument
to qvirtqueue_get_buf() and qvirtio_wait_used_elem() for this purpose.
All existing users are updated accordingly.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
virtio-gpu has a special code path that bypasses vIOMMU protection. So
for now let's disable iommu_platform for the device until we fully
support that (if needed).
After the patch, neither virtio-vga nor virtio-gpu will allow booting with
the iommu_platform parameter set.
CC: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-id: 20180131040401.3550-1-peterx@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
In this previous commit:
commit 8f61f1c5a6
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Mon Dec 18 19:12:20 2017 +0000
ui: track how much decoded data we consumed when doing SASL encoding
I attempted to fix a flaw with tracking how much data had actually been
processed when encoding with SASL. With that flaw, the VNC server could
mistakenly discard queued data that had not been sent.
The fix was not quite right though, because it merely decremented the
vs->output.offset value. This is effectively discarding data from the
end of the pending output buffer. We actually need to discard data from
the start of the pending output buffer. We also want to free memory that
is no longer required. The correct way to handle this is to use the
buffer_advance() helper method instead of directly manipulating the
offset value.
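A self-contained illustration of the difference; QEMU's real Buffer and buffer_advance() live in util/buffer.c and additionally shrink the allocation, which this sketch does not:
```c
#include <stddef.h>
#include <string.h>

typedef struct {
    unsigned char data[1024];
    size_t offset;              /* number of pending bytes */
} ExampleBuffer;

/* Wrong fix: drops 'len' bytes from the *end* of the pending data. */
static void example_truncate(ExampleBuffer *buf, size_t len)
{
    buf->offset -= len;
}

/* Right fix: discard the 'len' bytes already sent from the *front*. */
static void example_advance(ExampleBuffer *buf, size_t len)
{
    memmove(buf->data, buf->data + len, buf->offset - len);
    buf->offset -= len;
}
```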
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Message-id: 20180201155841.27509-1-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The VNC server already has the ability to listen on multiple sockets.
Converting it to use the QIONetListener APIs, though, will reduce the
amount of code in the VNC server and improve the clarity of what is
left.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20180201164514.10330-1-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The previous commit:
commit 2ec78706d1
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Wed Jan 17 16:47:15 2018 +0000
ui: convert GTK and SDL1 frontends to keycodemapdb
changed the x_keymap.c keymap so that its target was qcodes instead of
qnums. It updated the GTK frontend to take account of this change, but
forgot to update the SDL1 frontend. Thus the SDL frontend was getting
qcodes but dispatching them as if they were qnums. IOW, keyboard input
was completely hosed with SDL1. Since the keyboard layout tables are
still all based on qnums, it is easier to just keep SDL1 using qnums as
it will be deleted in a few releases time.
Reported-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-id: 20180201180033.14255-1-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Trivial test of a successful write.
Signed-off-by: Greg Kurz <groug@kaod.org>
(groug, handle potential overflow when computing request size,
add missing g_free(buf),
backend handles one written byte at a time to validate
the server doesn't do short-reads)
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
The purpose of virtio-9p-test is to test the virtio-9p device, especially
the 9p server state machine. We don't really care what fsdev backend we're
using. Moreover, if we want to be able to test the flush request or a
device reset with in-flight I/O, it is close to impossible to achieve
with a physical backend because we cannot reliably ask it to put an I/O
on hold at a specific point in time.
Fortunately, we can do that with the synthetic backend, which allows
registering callbacks on read/write accesses to a specific file. This will
be used by a later patch to test the 9P flush request.
The walk request test is converted to use the synth backend.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
In order to test request cancellation, we will need to send multiple
requests and wait for the associated replies. Since we poll the ISR
to know if a request completed, we may have several replies to parse
when we detect ISR was set to 1.
This patch moves the waiting out of the reply parsing path, up into
the functional tests.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
It doesn't really make sense to hide the request tag from the test
functions. It prevents testing the 9p server behavior when passed
a wrong tag (ie, still in use or different from P9_NOTAG for a
version request). Also the spec says that a tag is reusable as soon
as the corresponding request has been replied to or flushed: no need to
always increment tags like we do now. And finally, an upcoming test
of the flush command will need to manipulate tags explicitly.
This simply changes all request functions to have a tag argument.
Except for the version request which needs P9_NOTAG, all other
tests can pass 0 since they wait for the reply before sending
another request.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
# Background
I was investigating spurious non-deterministic EINTR returns from
various 9p file system operations in a Linux guest served from the
qemu 9p server.
## EINTR, ERESTARTSYS and the linux kernel
When a signal arrives that the Linux kernel needs to deliver to user-space
while a given thread is blocked (in the 9p case waiting for a reply to its
request in 9p_client_rpc -> wait_event_interruptible), it asks whatever
driver is currently running to abort its current operation (in the 9p case
causing the submission of a TFLUSH message) and return to user space.
In these situations, the error message reported is generally ERESTARTSYS.
If the userspace processes specified SA_RESTART, this means that the
system call will get restarted upon completion of the signal handler
delivery (assuming the signal handler doesn't modify the process state
in complicated ways not relevant here). If SA_RESTART is not specified,
ERESTARTSYS gets translated to EINTR and user space is expected to handle
the restart itself.
## The 9p TFLUSH command
The 9p TFLUSH command requests that the server abort an ongoing operation.
The man page [1] specifies:
```
If it recognizes oldtag as the tag of a pending transaction, it should
abort any pending response and discard that tag.
[...]
When the client sends a Tflush, it must wait to receive the corresponding
Rflush before reusing oldtag for subsequent messages. If a response to the
flushed request is received before the Rflush, the client must honor the
response as if it had not been flushed, since the completed request may
signify a state change in the server
```
In particular, this means that the server must not send a reply with the
original tag in response to the cancellation request, because the client is
obligated to interpret such a reply as a coincidental reply to the original
request.
# The bug
When qemu receives a TFlush request, it sets the `cancelled` flag on the
relevant pdu. This flag is periodically checked, e.g. in
`v9fs_co_name_to_path`, and if set, the operation is aborted and the error
is set to EINTR. However, the server then violates the spec, by returning
to the client an Rerror response, rather than discarding the message
entirely. As a result, the client is required to assume that said Rerror
response is a result of the original request, not a result of the
cancellation and thus passes the EINTR error back to user space.
This is not the worst thing it could do, however as discussed above, the
correct error code would have been ERESTARTSYS, such that user space
programs with SA_RESTART set get correctly restarted upon completion of
the signal handler.
Instead, such programs get spurious EINTR results that they were not
expecting to handle.
It should be noted that there are plenty of user space programs that do not
set SA_RESTART and do not correctly handle EINTR either. However, that is
then a userspace bug. It should also be noted that this bug has been
mitigated by a recent commit to the Linux kernel [2], which essentially
prevents the kernel from sending Tflush requests unless the process is about
to die (in which case the process likely doesn't care about the response).
Nevertheless, for older kernels and to comply with the spec, I believe this
change is beneficial.
# Implementation
The fix is fairly simple, just skipping notification of a reply if
the pdu was previously cancelled. We do, however, also notify the transport
layer that we're doing this, so it can clean up any resources it may be
holding. I also added a new trace event to distinguish
operations that caused an error reply from those that were cancelled.
One complication is that we only omit sending the message on EINTR errors in
order to avoid confusing the rest of the code (which may assume that a
client knows about a fid if it successfully passed it off to pdu_complete
without checking for cancellation status). This does mean that if the server
acts upon the cancellation flag, it always needs to set err to EINTR. I
believe this is true of the current code.
[1] https://9fans.github.io/plan9port/man/man9/flush.html
[2] https://github.com/torvalds/linux/commit/9523feac272ccad2ad8186ba4fcc891
Signed-off-by: Keno Fischer <keno@juliacomputing.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
[groug, send a zero-sized reply instead of detaching the buffer]
Signed-off-by: Greg Kurz <groug@kaod.org>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
No good reasons to do this outside of v9fs_device_realize_common().
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
On some architectures, qemu doesn't support the vmcoreinfo device,
and dump-guest-memory fails:
(gdb) dump-guest-memory /tmp/vmcore ppc64-le
guest RAM blocks:
target_start target_end host_addr message count
---------------- ---------------- ---------------- ------- -----
0000000000000000 0000000200000000 00003ffd86980000 added 1
0000200080000000 0000200080800000 00003ffd86170000 added 2
Python Exception <class 'gdb.error'> No symbol "vmcoreinfo_realize" in current context.:
Error occurred in Python command: No symbol "vmcoreinfo_realize" in current context.
Check that vmcoreinfo_realize symbol exists before evaluating an
expression with it.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
200 currently fails on tmpfs because it sets cache=none. However,
without that (and aio=native), the test still works now and it fails
before Jeff's series (on fc7dbc119e). So
we can probably remove the aio=native safely, and replace cache=none by
cache=$CACHEMODE.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20180117135015.15051-1-mreitz@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
We masked the wrong bits, which prevented the use of some of the
32-bit R registers, e.g. "fcnvxf,sgl,sgl fr22R,fr6R".
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that we have the prerequisites in target/hppa/,
implement the hardware for a PA7100LC.
This also enables build for hppa-softmmu.
Signed-off-by: Helge Deller <deller@gmx.de>
[rth: Since it is all new code, squashed all branch development
within hw/hppa/ to a single patch.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is an extension to the base ISA, but we can use this in
the kernel idle loop to reduce the host cpu time consumed.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
HP-UX 10.20 CD contains "add r0, r0, r27" in a delay slot,
which uses at least 5 temps.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Unknown why this works, but if we return EXCP_ITLB_MISS we
will triple-fault the first userland instruction fetch.
Is it something to do with having a combined I/DTLB?
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Linux sets sr4-sr7 all to the same value, which means that we
need not do any runtime computation to find out what space to
use in forming the GVA.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Real hardware would use an external device to control the power.
But for the moment let's invent instructions in reserved space,
to be used by our custom firmware.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Otherwise there's no way to clear them without an external command,
and it could lock the OS in the VM if they were stuck.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Macro parameters should almost always have () around them when used.
llvm reported an error on this.
Remove redundant parentheses and put parentheses around the entire
macros with assignments in case they are used in an expression.
The macros were doing ((v) & 1) for a binary input, but that only works
if v == 0 or if v & 1. Changed to !!(v) so they work for all values.
Remove some unused macros.
Reported in https://bugs.launchpad.net/bugs/1651167
An audit of these changes found no semantic changes; this is just
cleanups for proper style and to avoid a compiler warning.
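An illustration of both points with made-up macro names:
```c
/* Old style: unparenthesized expansion, and (v) & 1 misreads e.g. v == 2. */
#define SET_BIT0_OLD(reg, v) (reg) = ((reg) & ~1) | ((v) & 1)

/* New style: !!(v) folds any non-zero value to 1, and the outer
 * parentheses keep the assignment safe inside a larger expression. */
#define SET_BIT0_NEW(reg, v) ((reg) = ((reg) & ~1) | !!(v))
```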
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This lets an event be added to the SEL as if a sensor had generated
it. The OpenIPMI driver uses it for storing panic event information.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
According to the spec, from section "32.3 OEM SEL Record - Type
E0h-FFh", event types from 0x0e to 0xff do not have a timestamp.
So don't set it when adding those types. This required putting
the timestamp in a temporary buffer, since it's still required
to set the last addition time.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
The minimum message size was set on the wrong commands: for getting
the time it's zero and for setting the time it's 6.
Signed-off-by: Corey Minyard <cminyard@mvista.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
However since HPPA has a software-managed TLB, and the relevant
TLB manipulation instructions are not implemented, this does not
actually do anything.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Any one TB will have only one space value. If we change spaces,
we change TBs. Thus BE and BEV must exit the TB immediately.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These instructions force the destination privilege level
of the branch destination to be no higher than current.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This changes the system virtual address width to 64-bit and
incorporates the space registers into load/store operations.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
While the E bit is only used for pa2.0 mfctl,w from sar,
the otherwise reserved bit does not appear to be decoded.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Most aspects of privilege are not yet handled. But this
gives us the start from which to begin checking.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
For system mode, we will need 64-bit virtual addresses even when
we have 32-bit register sizes. Since the rest of QEMU equates
TARGET_LONG_BITS with the address size, redefine everything
related to register size in terms of a new TARGET_REGISTER_BITS.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We don't actually do anything with most of the bits yet,
but at least they have names and we have somewhere to
store them.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With the addition of default-configs/hppa-softmmu.mak, this
will compile. It is not enabled with this patch, however.
Signed-off-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tpm 2018/01/26 v2
# gpg: Signature made Mon 29 Jan 2018 22:20:05 GMT
# gpg: using RSA key 0x75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <stefanb@linux.vnet.ibm.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2018-01-26-2:
tpm: add CRB device
tpm: report backend request error
tpm: replace GThreadPool with AIO threadpool
tpm: lookup cancel path under tpm device class
tpm: fix alignment issues
tpm: Set the flags of the CMD_INIT command to 0
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The SPARC code in linux-user/signal.c defines a set of
MC_* constants. On some SPARC hosts these are also defined
by sys/ucontext.h, resulting in build failures:
linux-user/signal.c:2786:0: error: "MC_NGREG" redefined [-Werror]
#define MC_NGREG 19
In file included from /usr/include/signal.h:302:0,
from include/qemu/osdep.h:86,
from linux-user/signal.c:19:
/usr/include/sparc64-linux-gnu/sys/ucontext.h:59:0: note: this is the location of the previous definition
# define MC_NGREG __MC_NGREG
Rename all these constants to SPARC_MC_* to avoid the clash.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1517318239-15764-1-git-send-email-peter.maydell@linaro.org
tpm_crb is a device for TPM 2.0 Command Response Buffer (CRB)
Interface as defined in TCG PC Client Platform TPM Profile (PTP)
Specification Family “2.0” Level 00 Revision 01.03 v22.
The PTP allows a device implementation to switch between the TIS and CRB
models at run time, but given that CRB is a simpler device to
implement, I chose to implement it as a different device.
The device doesn't implement any locality other than 0 for now (my laptop
TPM doesn't either, so I assume this isn't so bad).
Tested with some success with Linux upstream and Windows 10, seabios &
modified ovmf. The device is recognized and correctly transmits
commands/responses with passthrough & emu. However, we are missing the PPI
ACPI part atm.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
The TPM backend uses a GThreadPool to handle IO in a separate
thread. However, GThreadPool isn't integrated with QEMU's main loops,
making it unnecessarily complicated to deal with.
QEMU has an AIO thread pool that is better integrated with the main loops
and various IO functions, and provides a completion BH by default, etc.
Remove the only user of GThreadPool from QEMU and use the AIO thread pool.
Note that the backend:
- no longer accepts queuing multiple requests (unneeded so far)
- increases its own reference count when handling a command, for extra safety
- tpm_backend_thread_end() is renamed tpm_backend_finish_sync() and
will wait for completion of the BH (request_completed), which will help
migration handling.
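A hedged sketch of the resulting call pattern; the helper names come from include/block/thread-pool.h, but the request plumbing below is simplified and not the backend's actual code:
```c
static int tpm_worker_fn(void *opaque)
{
    /* blocking I/O towards the TPM emulator/passthrough goes here */
    return 0;
}

static void tpm_worker_done(void *opaque, int ret)
{
    /* runs as a completion BH in the main loop context */
}

static void submit_request_sketch(void *request)
{
    ThreadPool *pool = aio_get_thread_pool(qemu_get_aio_context());

    thread_pool_submit_aio(pool, tpm_worker_fn, request,
                           tpm_worker_done, request);
}
```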
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Since Linux commit 313d21eeab9282e, tpm devices have their own device
class "tpm" and the cancel path must be looked up under
/sys/class/tpm/ instead of /sys/class/misc/.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
The new tpm-crb-test fails on sparc host:
TEST: tests/tpm-crb-test... (pid=230409)
/i386/tpm-crb/test:
Broken pipe
FAIL
GTester: last random seed: R02S29cea50247fe1efa59ee885a26d51a85
(pid=230423)
FAIL: tests/tpm-crb-test
and generates a new clang sanitizer runtime warning:
/home/petmay01/linaro/qemu-for-merges/hw/tpm/tpm_util.h:36:24: runtime
error: load of misaligned address 0x7fdc24c00002 for type 'const
uint32_t' (aka 'const unsigned int'), which requires 4 byte alignment
0x7fdc24c00002: note: pointer points here
<memory cannot be printed>
The sparc architecture does not allow misaligned loads and will
segfault if you try them. For example, this function:
static inline uint32_t tpm_cmd_get_size(const void *b)
{
return be32_to_cpu(*(const uint32_t *)(b + 2));
}
It should instead read:
return ldl_be_p(b + 2);
As a general rule, you can't take an arbitrary pointer into a byte
buffer and try to interpret it as a structure or as a pointer to
larger-than-byte-sized data simply by casting the pointer.
Use this cleanup as an opportunity to remove unnecessary temporary
buffers and casts.
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
The flags of the CMD_INIT control channel command were not
initialized properly. Fix this and set them to 0.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
# gpg: Signature made Mon 29 Jan 2018 08:14:19 GMT
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
MAINTAINERS: update Dmitry Fleytman email
qemu-doc: Get rid of "vlan=X" example in the documentation
net: Allow netdevs to be used with 'hostfwd_add' and 'hostfwd_remove'
net: Allow hubports to connect to other netdevs
colo: compare the packet based on the tcp sequence number
colo: modified the payload compare function
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ppc patch queue 2018-01-29
Here's another batch of patches for ppc, spapr and related things.
Highlights:
* Implement (with a bunch of necessary infrastructure) a hypercall
to let guests properly apply Spectre and Meltdown workarounds.
* Convert a number of old devices to trace events
* Fix some bugs
# gpg: Signature made Mon 29 Jan 2018 03:27:30 GMT
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.12-20180129:
target/ppc/spapr: Add H-Call H_GET_CPU_CHARACTERISTICS
target/ppc/spapr_caps: Add new tristate cap safe_indirect_branch
target/ppc/spapr_caps: Add new tristate cap safe_bounds_check
target/ppc/spapr_caps: Add new tristate cap safe_cache
target/ppc/spapr_caps: Add support for tristate spapr_capabilities
target/ppc/kvm: Add cap_ppc_safe_[cache/bounds_check/indirect_branch]
spapr_pci: fix MSI/MSIX selection
input: add missing newline from trace-events
uninorth: convert to trace-events
grackle: convert to trace-events
ppc: Deprecate qemu-system-ppcemb
ppc/pnv: fix PnvChip redefinition in <hw/ppc/pnv_xscom.h>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
C functions with no arguments must be declared foo(void) instead of
foo(). The tracetool argument list parser has never accepted an empty
argument list. This patch adds a clear error message for this error
case.
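The C rule behind the new error message, for reference:
```c
int foo();       /* not a prototype: the parameter list is left unspecified */
int foo(void);   /* a real prototype for a function taking no arguments */
```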
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20180110202553.31889-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The terminology used by tracetool is not consistent with C sprintf or
docs/devel/tracing.txt. The word "formats" is sometimes used to mean
"format strings".
This patch clarifies comments and error messages that contain this word.
Note that the error message lines are longer than 80 characters but I
have not wrapped them to aid grepping.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20180110202553.31889-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Include the file line number in the message that is printed when
trace-events parse errors are raised.
[Use enumerate(fobj, 1) to avoid having to increment a 0-based index
later, as suggested by Eric Blake.
--Stefan]
Suggested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20180110202553.31889-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Replace the keymap_qcode table with automatically generated
tables.
Missing entries in keymap_qcode now fixed:
Q_KEY_CODE_ASTERISK -> KEY_KPASTERISK
Q_KEY_CODE_KP_MULTIPLY -> KEY_KPASTERISK
Q_KEY_CODE_STOP -> KEY_STOP
Q_KEY_CODE_AGAIN -> KEY_AGAIN
Q_KEY_CODE_PROPS -> KEY_PROPS
Q_KEY_CODE_UNDO -> KEY_UNDO
Q_KEY_CODE_FRONT -> KEY_FRONT
Q_KEY_CODE_COPY -> KEY_COPY
Q_KEY_CODE_OPEN -> KEY_OPEN
Q_KEY_CODE_PASTE -> KEY_PASTE
Q_KEY_CODE_FIND -> KEY_FIND
Q_KEY_CODE_CUT -> KEY_CUT
Q_KEY_CODE_LF -> KEY_LINEFEED
Q_KEY_CODE_HELP -> KEY_HELP
Q_KEY_CODE_COMPOSE -> KEY_COMPOSE
Q_KEY_CODE_RO -> KEY_RO
Q_KEY_CODE_HIRAGANA -> KEY_HIRAGANA
Q_KEY_CODE_HENKAN -> KEY_HENKAN
Q_KEY_CODE_YEN -> KEY_YEN
Q_KEY_CODE_KP_COMMA -> KEY_KPCOMMA
Q_KEY_CODE_KP_EQUALS -> KEY_KPEQUAL
Q_KEY_CODE_POWER -> KEY_POWER
Q_KEY_CODE_SLEEP -> KEY_SLEEP
Q_KEY_CODE_WAKE -> KEY_WAKEUP
Q_KEY_CODE_AUDIONEXT -> KEY_NEXTSONG
Q_KEY_CODE_AUDIOPREV -> KEY_PREVIOUSSONG
Q_KEY_CODE_AUDIOSTOP -> KEY_STOPCD
Q_KEY_CODE_AUDIOPLAY -> KEY_PLAYPAUSE
Q_KEY_CODE_AUDIOMUTE -> KEY_MUTE
Q_KEY_CODE_VOLUMEUP -> KEY_VOLUMEUP
Q_KEY_CODE_VOLUMEDOWN -> KEY_VOLUMEDOWN
Q_KEY_CODE_MEDIASELECT -> KEY_MEDIA
Q_KEY_CODE_MAIL -> KEY_MAIL
Q_KEY_CODE_CALCULATOR -> KEY_CALC
Q_KEY_CODE_COMPUTER -> KEY_COMPUTER
Q_KEY_CODE_AC_HOME -> KEY_HOMEPAGE
Q_KEY_CODE_AC_BACK -> KEY_BACK
Q_KEY_CODE_AC_FORWARD -> KEY_FORWARD
Q_KEY_CODE_AC_REFRESH -> KEY_REFRESH
Q_KEY_CODE_AC_BOOKMARKS -> KEY_BOOKMARKS
NB, the virtio-input device reports a bitmask to the guest driver that
has a bit set for each Linux keycode that the host is able to send to
the guest.
Thus by adding these extra key mappings we are technically changing the
host<->guest ABI. This would also happen any time we defined new mappings
for QEMU keycodes in future.
When a keycode is removed from the list of possible keycodes that the host
can send to the guest, it means that the guest OS will think it is possible
to receive a key that in practice can never be generated, which is harmless.
When a keycode is added to the list of possible keycodes that the host can
send to the guest, it means that the guest OS can see an unexpected event.
The Linux virtio_input.c driver code simply forwards this event to the
input_event() method in the Linux input subsystem. This in turn calls
input_handle_event(), which then calls input_get_disposition(). This method
checks if the input event is present in the permitted keys bitmap, and if
not returns INPUT_IGNORE_EVENT. Thus the unexpected event will get dropped,
which is harmless.
If the guest OS reboots, or otherwise re-initializes the virtio-input device,
it will read the new keycode bitmap. No matter how many keys are defined,
the config space has a fixed 128 byte bitmap. There is, however, a size
field defined which says how many bytes in the bitmap are used. So the guest
OS reads the size of the bitmap, and then it reads the data from the bitmap up to
the designated size. So if the guest OS re-initializes at precisely the time
that QEMU is migrated across versions, in the worst case, it could conceivably
read the old size field, but then get the newly updated bitmap. If a key were
added this is harmless, since it simply means it may not process the newly
added key. If a key were removed, then it could be reading a byte from the
bitmap that was not initialized. Fortunately QEMU always memsets() the entire
bitmap to 0, prior to setting keybits. Thus the guest OS will simply read
zeros, which is again harmless.
Based on this analysis, it is believed that there is no need to preserve the
virtio-input-hid keymaps across migration, as the host<->guest ABI change is
harmless and self-resolving at time of guest reboot.
NB, this behaviour should perhaps be formalized in the virtio-input spec
to declare how guest OS drivers should be written to be robust in their
handling of the potentially changeable key bitmaps.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20180117164118.8510-5-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Replace the qcode_to_keycode_set1, qcode_to_keycode_set2,
and qcode_to_keycode_set3 tables with automatically
generated tables.
Missing entries in qcode_to_keycode_set1 now fixed:
- Q_KEY_CODE_SYSRQ -> 0x54
- Q_KEY_CODE_PRINT -> 0x54 (NB ignored due to special case)
- Q_KEY_CODE_AGAIN -> 0xe005
- Q_KEY_CODE_PROPS -> 0xe006
- Q_KEY_CODE_UNDO -> 0xe007
- Q_KEY_CODE_FRONT -> 0xe00c
- Q_KEY_CODE_COPY -> 0xe078
- Q_KEY_CODE_OPEN -> 0x64
- Q_KEY_CODE_PASTE -> 0x65
- Q_KEY_CODE_CUT -> 0xe03c
- Q_KEY_CODE_LF -> 0x5b
- Q_KEY_CODE_HELP -> 0xe075
- Q_KEY_CODE_COMPOSE -> 0xe05d
- Q_KEY_CODE_PAUSE -> 0xe046
- Q_KEY_CODE_KP_EQUALS -> 0x59
And some mistakes corrected:
- Q_KEY_CODE_HIRAGANA was mapped to 0x70 (Katakanahiragana)
instead of 0x77 (Hiragana)
- Q_KEY_CODE_MENU was incorrectly mapped to the compose
scancode (0xe05d) and is now mapped to 0xe01e
- Q_KEY_CODE_FIND was mapped to 0xe065 (Search) instead
of to 0xe041 (Find)
- Q_KEY_CODE_POWER, SLEEP & WAKE had 0x0e instead of 0xe0
as the prefix
Missing entries in qcode_to_keycode_set2 now fixed:
- Q_KEY_CODE_PRINT -> 0x7f (NB ignored due to special case)
- Q_KEY_CODE_COMPOSE -> 0xe02f
- Q_KEY_CODE_PAUSE -> 0xe077
- Q_KEY_CODE_KP_EQUALS -> 0x0f
And some mistakes corrected:
- Q_KEY_CODE_HIRAGANA was mapped to 0x13 (Katakanahiragana)
instead of 0x62 (Hiragana)
- Q_KEY_CODE_MENU was incorrectly mapped to the compose
scancode (0xe02f) and is now not mapped
- Q_KEY_CODE_FIND was mapped to 0xe010 (Search) and is now
not mapped.
- Q_KEY_CODE_POWER, SLEEP & WAKE had 0x0e instead of 0xe0
as the prefix
Missing entries in qcode_to_keycode_set3 now fixed:
- Q_KEY_CODE_ASTERISK -> 0x7e
- Q_KEY_CODE_SYSRQ -> 0x57
- Q_KEY_CODE_LESS -> 0x13
- Q_KEY_CODE_STOP -> 0x0a
- Q_KEY_CODE_AGAIN -> 0x0b
- Q_KEY_CODE_PROPS -> 0x0c
- Q_KEY_CODE_UNDO -> 0x10
- Q_KEY_CODE_COPY -> 0x18
- Q_KEY_CODE_OPEN -> 0x20
- Q_KEY_CODE_PASTE -> 0x28
- Q_KEY_CODE_FIND -> 0x30
- Q_KEY_CODE_CUT -> 0x38
- Q_KEY_CODE_HELP -> 0x09
- Q_KEY_CODE_COMPOSE -> 0x8d
- Q_KEY_CODE_AUDIONEXT -> 0x93
- Q_KEY_CODE_AUDIOPREV -> 0x94
- Q_KEY_CODE_AUDIOSTOP -> 0x98
- Q_KEY_CODE_AUDIOMUTE -> 0x9c
- Q_KEY_CODE_VOLUMEUP -> 0x95
- Q_KEY_CODE_VOLUMEDOWN -> 0x9d
- Q_KEY_CODE_CALCULATOR -> 0xa3
- Q_KEY_CODE_AC_HOME -> 0x97
And some mistakes corrected:
- Q_KEY_CODE_MENU was incorrectly mapped to the compose
scancode (0x8d) and is now 0x91
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20180117164118.8510-2-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
During QEMU guest migration, the destination process invokes the ps2
post_load function. There, if the 'rptr' and 'count' values were
invalid, they could lead to an OOB access or an infinite loop.
Add checks to avoid this.
Reported-by: Cyrille Chatras <cyrille.chatras@orange.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-id: 20171116075155.22378-1-ppandit@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
On Linux, a mouse event is generated for both down and up when the mouse
wheel is used. This caused virtio_input_send() to be called twice each
time the wheel was used.
This commit adds a check for the button down state and only calls
virtio_input_send() when it is true.
Signed-off-by: Miika S <miika9764@gmail.com>
Message-Id: <20171222152531.1849-4-miika9764@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The vlan concept is marked as deprecated, so we should not use
this for examples in the documentation anymore.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
It does not make much sense to limit these commands to the legacy 'vlan'
concept only; they should work with the modern netdevs, too. So now
it is possible to use this command with one, two or three parameters.
With one parameter, the command installs a hostfwd rule on the default
"user" network:
hostfwd_add tcp:...
With two parameters, the command installs a hostfwd rule on a netdev
(that's the new way of using this command):
hostfwd_add netdev_id tcp:...
With three parameters, the command installs a rule on a 'vlan' (aka hub):
hostfwd_add hub_id name tcp:...
Same applies to the hostfwd_remove command now.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
QEMU can emulate hubs to connect NICs and netdevs. This is currently
primarily used for the mis-named 'vlan' feature of the networking
subsystem. Now the 'vlan' feature has been marked as deprecated, since
its name is rather confusing and the users often rather mis-configure
their network when trying to use it. But while the 'vlan' parameter
should be removed at some point in time, the basic idea of emulating
a hub in QEMU is still good: It's useful for bundling up the output of
multiple NICs into one single l2tp netdev for example.
Now to be able to use the hubport feature without 'vlan's, there is one
missing piece: The possibility to connect a hubport to a netdev, too.
This patch adds this possibility by introducing a new "netdev=..."
parameter to the hubports.
To bundle up the output of multiple NICs into one socket netdev, you can
now run QEMU with these parameters for example:
qemu-system-ppc64 ... -netdev socket,id=s1,connect=:11122 \
-netdev hubport,hubid=1,id=h1,netdev=s1 \
-netdev hubport,hubid=1,id=h2 -device e1000,netdev=h2 \
-netdev hubport,hubid=1,id=h3 -device virtio-net-pci,netdev=h3
For using the socket netdev, you have got to start another QEMU as the
receiving side first, for example with network dumping enabled:
qemu-system-x86_64 -M isapc -netdev socket,id=s0,listen=:11122 \
-device ne2k_isa,netdev=s0 \
-object filter-dump,id=f1,netdev=s0,file=/tmp/dump.dat
After the ppc64 guest has tried to boot from both NICs, you can see in the
dump file (using Wireshark, for example) that the output of both NICs
(the e1000 and the virtio-net-pci) has been successfully transferred
via the socket netdev in this case.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Packet sizes can differ from time to time, for example when the network
is busy. Even when the payload is the same, the TCP protocol does not
guarantee that it is split into packets in the same way on both sides,
like this:
We send this payload:
------------------------------
| header |1|2|3|4|5|6|7|8|9|0|
------------------------------
primary:
ppkt1:
----------------
| header |1|2|3|
----------------
ppkt2:
------------------------
| header |4|5|6|7|8|9|0|
------------------------
secondary:
spkt1:
------------------------------
| header |1|2|3|4|5|6|7|8|9|0|
------------------------------
In the original method, ppkt1 and ppkt2 differ in size from spkt1, so
they cannot be compared, which triggers a checkpoint.
I have tested FTP transfers of 200M and 1G files many times and found
that the performance was less than 1% of native.
Now the comparison of TCP packets is reconstructed based on the TCP
sequence number. First of all, ppkt1 and spkt1 have the same starting
sequence number, so they can be compared even though their lengths
differ. The smaller payload length of ppkt1 is then used as the
comparison length; if the payloads match, ppkt1 is sent out and the
offset (the length of ppkt1's payload) into spkt1 is recorded. For the
next comparison, ppkt2 and spkt1 can be compared starting from the
recorded position in spkt1.
like that:
----------------
| header |1|2|3| ppkt1
---------|-----|
| |
---------v-----v--------------
| header |1|2|3|4|5|6|7|8|9|0| spkt1
---------------|\------------|
| \offset |
---------v-------------v
| header |4|5|6|7|8|9|0| ppkt2
------------------------
In this way, the performance reaches about 20% of native in my
repeated tests.
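A standalone sketch of the offset-based comparison described above (the
helper name and signature are illustrative, not the colo-compare code):
#include <stdbool.h>
#include <stddef.h>
#include <string.h>

static bool payload_matches(const unsigned char *ppkt, size_t plen,
                            const unsigned char *spkt, size_t slen,
                            size_t *offset)
{
    if (*offset + plen > slen) {
        return false;               /* secondary packet too short */
    }
    if (memcmp(ppkt, spkt + *offset, plen) != 0) {
        return false;               /* mismatch: trigger a checkpoint */
    }
    *offset += plen;                /* remember where the next ppkt starts */
    return true;                    /* forward ppkt, keep spkt for later */
}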
Cc: Zhang Chen <zhangckid@gmail.com>
Cc: Li Zhijian <lizhijian@cn.fujitsu.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Signed-off-by: Li Zhijian <lizhijian@cn.fujitsu.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Reviewed-by: Zhang Chen <zhangckid@gmail.com>
Tested-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The new H-Call H_GET_CPU_CHARACTERISTICS is used by the guest to query
behaviours and available characteristics of the cpu.
Implement the handler for this new H-Call which formulates its response
based on the setting of the spapr_caps cap-cfpc, cap-sbbc and cap-ibs.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
spapr_caps are used to represent the level of support for various
capabilities related to the spapr machine type. Currently there is
only support for boolean capabilities.
Add support for tristate capabilities by implementing their get/set
functions. These capabilities can have the values 0, 1 or 2
corresponding to broken, workaround and fixed.
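As a sketch, the three states could be represented like this (identifier
names are illustrative, not necessarily those used by spapr_caps):
typedef enum {
    SPAPR_CAP_SKETCH_BROKEN     = 0,  /* vulnerable, no workaround available */
    SPAPR_CAP_SKETCH_WORKAROUND = 1,  /* OS-visible workaround required */
    SPAPR_CAP_SKETCH_FIXED      = 2,  /* fixed in hardware/firmware */
} SpaprCapTristateSketch;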
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Add three new kvm capabilities used to represent the level of host support
for three corresponding workarounds.
Host support for each of the capabilities is queried through the
new ioctl KVM_PPC_GET_CPU_CHAR which returns four uint64 quantities. The
first two, character and behaviour, represent the available
characteristics of the cpu and the behaviour of the cpu respectively.
The second two, c_mask and b_mask, represent the mask of known bits for
the character and behaviour dwords respectively.
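A sketch of the returned data as described above (field names follow this
description; the actual kernel uapi structure may differ in naming):
#include <stdint.h>

struct kvm_ppc_cpu_char_sketch {
    uint64_t character;        /* available characteristics of the cpu */
    uint64_t behaviour;        /* expected behaviour of the cpu */
    uint64_t character_mask;   /* which character bits are known (c_mask) */
    uint64_t behaviour_mask;   /* which behaviour bits are known (b_mask) */
};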
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
[dwg: Correct some compile errors due to name change in final kernel
patch version]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In various place we don't correctly check if the device supports MSI or
MSI-X. This can cause devices to be advertised with MSI support, even
if they only support MSI-X (like virtio-pci-* devices for example):
ethernet@0 {
ibm,req#msi = <0x1>; <--- wrong!
.
ibm,loc-code = "qemu_virtio-net-pci:0000:00:00.0";
.
ibm,req#msi-x = <0x3>;
};
Worse, this can also cause the "ibm,change-msi" RTAS call to corrupt the
PCI status and cause migration to fail:
qemu-system-ppc64: get_pci_config_device: Bad config data: i=0x6
read: 0 device: 10 cmask: 10 wmask: 0 w1cmask:0
^^
PCI_STATUS_CAP_LIST bit which is assumed to be constant
This patch changes spapr_populate_pci_child_dt() to properly check for
MSI support using msi_present(): this ensures that PCIDevice::msi_cap
was set by msi_init() and that msi_nr_vectors_allocated() will look at
the right place in the config space.
Checking PCIDevice::msix_entries_nr is enough for MSI-X but let's add
a call to msix_present() there as well for consistency.
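A hedged sketch of that check (illustrative only, not the actual
spapr_populate_pci_child_dt() hunk; only the property-selection logic is
shown, the device tree emission is elided):
#include "hw/pci/msi.h"
#include "hw/pci/msix.h"

static void sketch_populate_msi_props(PCIDevice *dev)
{
    if (msi_present(dev)) {
        /* emit "ibm,req#msi" using msi_nr_vectors_allocated(dev) */
    }
    if (msix_present(dev)) {
        /* emit "ibm,req#msi-x" using dev->msix_entries_nr */
    }
}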
It also changes rtas_ibm_change_msi() to select the appropriate MSI
type in Function 1 instead of always selecting plain MSI. This new
behaviour is compliant with LoPAPR 1.1, as described in "Table 71.
ibm,change-msi Argument Call Buffer":
Function 1: If Number Outputs is equal to 3, request to set to a new
number of MSIs (including set to 0).
If the “ibm,change-msix-capable” property exists and Number
Outputs is equal to 4, request is to set to a new number of
MSI or MSI-X (platform choice) interrupts (including set to
0).
Since MSI is the platform default (LoPAPR 6.2.3 MSI Option), let's
check for MSI support first.
And finally, it checks the input parameters are valid, as described in
LoPAPR 1.1 "R1–7.3.10.5.1–3":
For the MSI option: The platform must return a Status of -3 (Parameter
error) from ibm,change-msi, with no change in interrupt assignments if
the PCI configuration address does not support MSI and Function 3 was
requested (that is, the “ibm,req#msi” property must exist for the PCI
configuration address in order to use Function 3), or does not support
MSI-X and Function 4 is requested (that is, the “ibm,req#msi-x” property
must exist for the PCI configuration address in order to use Function 4),
or if neither MSIs nor MSI-Xs are supported and Function 1 is requested.
This ensures that the ret_intr_type variable contains a valid MSI type
for this device, and that spapr_msi_setmsg() won't corrupt the PCI status.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
qemu-system-ppcemb was once split off from qemu-system-ppc to support
CPU page sizes < 4096 for some of the embedded 4xx PowerPC CPUs.
However, there was hardly any OS available in the wild that really
used such small page sizes (Linux uses 4096 on PPC), so there is
no known recent use case for this separate build anymore. It's
rather cumbersome to maintain a separate set of config switches for
this, and it's wasting compile and test time of all the developers
who have to build all QEMU targets to verify that their changes did
not break anything.
Except for the small CPU page sizes, qemu-system-ppc can be used as
a full replacement for qemu-system-ppcemb since it contains all the
embedded 4xx PPC boards and CPUs, too. Thus let's start the deprecation
process for qemu-system-ppcemb to see whether somebody still needs
the small page sizes or whether we could finally remove this unloved
separate build.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This redefinition generates warnings on some clang compilers and older
gcc4.4.
...include/hw/ppc/pnv_xscom.h:24:24: warning: redefinition of typedef 'PnvChip' is a C11
feature [-Wtypedef-redefinition]
typedef struct PnvChip PnvChip;
^
...include/hw/ppc/pnv.h:65:3: note: previous definition is here
} PnvChip;
^
1 warning generated.
CC ppc64-softmmu/hw/ppc/pnv_xscom.o
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Since the mirror job supports an efficient zero-out-target mechanism (see
mirror_dirty_init()), implement bdrv_get_info to make it work
over NBD. This improvement will allow using the largest chunk possible
and will decrease the number of NBD_CMD_WRITE_ZEROES requests on the wire.
Signed-off-by: Edgar Kaziakhmedov <edgar.kaziakhmedov@virtuozzo.com>
Message-Id: <20180118115158.17219-1-edgar.kaziakhmedov@virtuozzo.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Since everything else about the nbd-server-* QMP commands is
accessible from HMP, we might as well make removing an export
available as well. For now, I went with a bool flag rather
than a mode string for choosing between safe (default) and
hard modes.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180125144557.25502-1-eblake@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Add command for removing an export. It is needed for cases when we
don't want to keep the export after the operation on it was completed.
Another example is a temporary node created with blockdev-add.
If we want to delete it, we should first remove any corresponding
NBD export.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180119135719.24745-3-vsementsov@virtuozzo.com>
[eblake: drop dead nb_clients code]
Signed-off-by: Eric Blake <eblake@redhat.com>
Xilinx queue
# gpg: Signature made Fri 26 Jan 2018 10:17:01 GMT
# gpg: using RSA key 0x29C596780F6BCA83
# gpg: Good signature from "Edgar E. Iglesias (Xilinx key) <edgar.iglesias@xilinx.com>"
# gpg: aka "Edgar E. Iglesias <edgar.iglesias@gmail.com>"
# Primary key fingerprint: AC44 FEDC 14F7 F1EB EDBF 4151 29C5 9678 0F6B CA83
* remotes/edgar/tags/edgar/xilinx-next-2018-01-26.for-upstream:
xlnx-zynqmp: Connect the IPI device to the ZynqMP SoC
xlnx-zynqmp-pmu: Connect the IPI device to the PMU
xlnx-zynqmp-ipi: Initial version of the Xilinx IPI device
xlnx-zynqmp-pmu: Connect the PMU interrupt controller
xlnx-pmu-iomod-intc: Add the PMU Interrupt controller
aarch64-softmmu.mak: Use an ARM specific config
xlnx-zynqmp-pmu: Add the CPU and memory
xlnx-zynqmp-pmu: Initial commit of the ZynqMP PMU
microblaze: boot.c: Don't try to find NULL file
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Xilinx ZynqMP SoC has two main processing systems in it. The ARM
processing system (which is already modeled in QEMU) and the MicroBlaze
Power Management Unit (PMU). This is the initial work for adding support
for the PMU.
The PMU subsystem runs alongside the ARM system on hardware, but due
to architecture limitations in QEMU the two instances are separate for
the time being.
Let's follow the same setup we do with the ARM system, where there is an
SoC device and a ZCU102 board. Although the PMU is less board specific
we are still going to follow the same split as maybe in future we can
connect the PMU device to the ARM ZCU102 board. As the machine will be
fairly small let's keep them both together in one file.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Previously if no device tree was passed to microblaze_load_kernel() then
qemu_find_file() would try to find a NULL pointer. To avoid this put a
check around qemu_find_file().
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Replace init() of CCIDCardClass with realize, then convert
ccid_card_init(), ccid_card_initfn() and its callbacks to
take an Error** in order to report the error more clearly.
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20180125171432.13554-2-f4bug@amsat.org
[PMD: fixed s->card assignment in ccid_card_realize()]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Because usb-storage creates an internal scsi device, we should propagate
options. We already do so for bootindex etc, but failed to take care of
share-rw. Fix it in the obvious way: add a new parameter to
scsi_bus_legacy_add_drive and pass in s->conf.share_rw.
Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Message-id: 20180117005222.4781-1-famz@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
These options have been marked as deprecated since QEMU 2.10, and so far
nobody has complained that the host, serial, disk and net options are still
urgently required. So let's now get rid of at least this legacy pile, to
simplify the USB code quite a bit.
This patch removes the usbdevices host, serial, disk and net. These devices
use their own complicated parameter parsing mechanisms, so they are just
ugly to maintain, without real benefit for the users (the users can use the
corresponding "-device" parameters instead which have the same complexity
as the "-usbdevice" devices here).
Note that the other rather simple -usbdevice options (mouse, tablet, etc.)
are not removed yet (the code is really simple here, so it does not hurt
much to keep it), as well as the two devices "braille" and "bt" which are
easier to use with -usbdevice than with -device.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1515519171-20315-1-git-send-email-thuth@redhat.com
[kraxel] delete some usb_host_device_open() leftovers.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
target-arm queue:
* target/arm: Fix address truncation in 64-bit pagetable walks
* i.MX: Fix FEC/ENET receive functions
* target/arm: preparatory refactoring for SVE emulation
* hw/intc/arm_gic: Prevent the GIC from signaling an IRQ when it's "active and pending"
* hw/intc/arm_gic: Fix C_RPR value on idle priority
* hw/intc/arm_gic: Fix group priority computation for group 1 IRQs
* hw/intc/arm_gic: Fix the NS view of C_BPR when C_CTRL.CBPR is 1
* hw/arm/virt: Check that the CPU realize method succeeded
* sdhci: fix a NULL pointer dereference due to uninitialized AddressSpace object
* xilinx_spips: Correct usage of an uninitialized local variable
* pl110: Implement vertical compare/next base interrupts
# gpg: Signature made Thu 25 Jan 2018 12:59:25 GMT
# gpg: using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20180125: (21 commits)
pl110: Implement vertical compare/next base interrupts
xilinx_spips: Correct usage of an uninitialized local variable
sdhci: fix a NULL pointer dereference due to uninitialized AddressSpace object
hw/arm/virt: Check that the CPU realize method succeeded
hw/intc/arm_gic: Fix the NS view of C_BPR when C_CTRL.CBPR is 1
hw/intc/arm_gic: Fix group priority computation for group 1 IRQs
hw/intc/arm_gic: Fix C_RPR value on idle priority
hw/intc/arm_gic: Prevent the GIC from signaling an IRQ when it's "active and pending"
target/arm: Simplify fp_exception_el for user-only
target/arm: Hoist store to flags output in cpu_get_tb_cpu_state
target/arm: Move cpu_get_tb_cpu_state out of line
target/arm: Add ARM_FEATURE_SVE
vmstate: Add VMSTATE_UINT64_SUB_ARRAY
target/arm: Add aa{32, 64}_vfp_{dreg, qreg} helpers
target/arm: Change the type of vfp.regs
target/arm: Use pointers in neon tbl helper
target/arm: Use pointers in neon zip/uzp helpers
target/arm: Use pointers in crypto helpers
target/arm: Mark disas_set_insn_syndrome inline
i.MX: Fix FEC/ENET receive functions
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The instruction "moves" can select source and destination
address space (user or kernel). This patch modifies
all the load/store functions to be able to provide
the address space the caller wants to use instead
of using the current one. All the callers are modified
to provide the default address space to these functions.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20180118193846.24953-5-laurent@vivier.eu>
Only add MC68040 MMU page table processing and related
registers (Special Status Word, Translation Control Register,
User Root Pointer and Supervisor Root Pointer).
Transparent Translation Registers, DFC/SFC and pflush/ptest
will be added later.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20180118193846.24953-3-laurent@vivier.eu>
The MC68040 MMU provides the size of the access that
triggers the page fault.
This size is set in the Special Status Word which
is written in the stack frame of the access fault
exception.
So we need the size in m68k_cpu_unassigned_access() and
m68k_cpu_handle_mmu_fault().
To be able to do that, this patch modifies the prototype of
handle_mmu_fault handler, tlb_fill() and probe_write().
do_unassigned_access() already includes a size parameter.
This patch also updates handle_mmu_fault handlers and
tlb_fill() of all targets (only parameter, no code change).
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20180118193846.24953-2-laurent@vivier.eu>
Drop no_frame flag from sdl_display_init argument list, use a global
variable instead. This is temporary until -no-frame support is dropped
altogether when we remove sdl1 support.
Remove any traces of noframe from sdl2 code. It is just dead code as
sdl2 doesn't support the SDL_NOFRAME window flag any more.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20180115154855.30850-3-kraxel@redhat.com
The SDL 2.0 release was made in Aug, 2013:
https://www.libsdl.org/release/
That will soon be 4 + 1/2 years ago, which is enough time to consider
the 2.0 series widely supported.
Thus we deprecate the SDL 1.2 support, which will allow us to delete it
in the last release of 2018. By this time, SDL 2.0 will be more than 5
years old.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20180115142533.24585-1-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The x_keycode_to_pc_keycode and evdev_keycode_to_pc_keycode
tables are replaced with automatically generated tables.
In addition the X11 heuristics are improved to detect running
on XQuartz and XWin X11 servers, to activate the correct OS-X
and Win32 keycode maps.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20180117164717.15855-3-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Pixman returns a signed int for the image width/height, but the VNC
protocol only permits an unsigned int16. The effective framebuffer size
is determined by the guest, limited by the video RAM size, so the
dimensions are unlikely to exceed the range of an unsigned int16,
but this is not currently validated.
With the current use of 'int' for client width/height, the calculation
of offsets in vnc_update_throttle_offset() suffers from integer size
promotion and sign extension, causing coverity warnings
*** CID 1385147: Integer handling issues (SIGN_EXTENSION)
/ui/vnc.c: 979 in vnc_update_throttle_offset()
973 * than that the client would already suffering awful audio
974 * glitches, so dropping samples is no worse really).
975 */
976 static void vnc_update_throttle_offset(VncState *vs)
977 {
978 size_t offset =
>>> CID 1385147: Integer handling issues (SIGN_EXTENSION)
>>> Suspicious implicit sign extension:
"vs->client_pf.bytes_per_pixel" with type "unsigned char" (8 bits,
unsigned) is promoted in "vs->client_width * vs->client_height *
vs->client_pf.bytes_per_pixel" to type "int" (32 bits, signed), then
sign-extended to type "unsigned long" (64 bits, unsigned). If
"vs->client_width * vs->client_height * vs->client_pf.bytes_per_pixel"
is greater than 0x7FFFFFFF, the upper bits of the result will all be 1.
979 vs->client_width * vs->client_height * vs->client_pf.bytes_per_pixel;
Change client_width / client_height to be a size_t to avoid sign
extension and integer promotion. Then validate that dimensions are in
range wrt the RFB protocol u16 limits.
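A standalone sketch of the two fixes described above (names are
illustrative): widen the local type so the multiplication is done in
size_t, and reject dimensions that do not fit the RFB u16 fields.
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

static bool sketch_check_and_size(size_t width, size_t height,
                                  uint8_t bytes_per_pixel, size_t *bytes)
{
    if (width > UINT16_MAX || height > UINT16_MAX) {
        return false;                              /* outside RFB u16 limits */
    }
    *bytes = width * height * bytes_per_pixel;     /* size_t arithmetic */
    return true;
}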
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20180118155254.17053-1-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
This inbuilt device contains a single 4-byte register, of which bit 24 is used
to power down the machine on a real Ultra 5.
The power device exists at offset 0x724000 on a real machine, but due to the
current configuration of the BARs in QEMU it must be located lower in PCI IO
space.
For the moment we place the power device at offset 0x7240 as a reminder of its
original location and raise the base PCI IO address from 0x4000 to 0x8000.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
This implements rudimentary support for interrupt generation on the
PL110. I am working on a new DRI/KMS driver for Linux and since that
uses the blanking interrupt, we need something to fire here. Without
any interrupt support Linux waits for a while and then gives ugly
messages about the vblank not working in the console (it does not
hang perpetually or anything though, DRI is pretty forgiving).
I solved it for now by setting up a timer to fire at 60Hz and pull
the interrupts for "vertical compare" and "next memory base"
at this interval. This works fine and fires roughly the same number
of IRQs on QEMU as on the hardware and leaves the console clean
and nice.
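A hedged sketch of that timer arrangement using QEMU's timer API (the state
struct and callback name are illustrative, not the pl110.c patch):
#include "qemu/osdep.h"
#include "qemu/timer.h"

typedef struct {
    QEMUTimer *vblank_timer;
    /* interrupt status/enable state would live here */
} PL110SketchState;

static void vblank_timer_cb(void *opaque)
{
    PL110SketchState *s = opaque;

    /* raise the "vertical compare" and "next memory base" interrupts here,
     * then re-arm for roughly 60Hz operation */
    timer_mod(s->vblank_timer, qemu_clock_get_ns(QEMU_CLOCK_VIRTUAL) +
                               NANOSECONDS_PER_SECOND / 60);
}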
People who want to create more accurate emulation can probably work
on top of this if need be. It is certainly closer to the hardware
behaviour than what we have today anyway.
Cc: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Linus Walleij <linus.walleij@linaro.org>
Message-id: 20180123225654.5764-1-linus.walleij@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: folded long lines]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We were passing a NULL error pointer to the object_property_set_bool()
call that realizes the CPU object. This meant that we wouldn't detect
failure, and would plough blindly on to crash later trying to use a
NULL CPU object pointer. Detect errors and fail instead.
In particular, this will be necessary to detect the user error
of using "-cpu host" without "-enable-kvm" once we make the host
CPU type be registered unconditionally rather than only in
kvm_arch_init().
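A hedged sketch of the resulting pattern (the object name is illustrative;
the property-setter argument order is that of the QEMU version contemporary
with this series):
#include "qemu/osdep.h"
#include "qapi/error.h"
#include "qom/object.h"

static void sketch_realize_cpu(Object *cpuobj)
{
    /* &error_fatal makes QEMU exit with a diagnostic if realize fails,
     * instead of silently continuing with an unusable CPU object */
    object_property_set_bool(cpuobj, true, "realized", &error_fatal);
}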
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When determining the group priority of a group 1 IRQ, if C_CTRL.CBPR is
0, the non-secure BPR value is used. However, this value must be
incremented by one so that it matches the secure world number of
implemented priority bits (NS world has one less priority bit compared
to the Secure world).
Signed-off-by: Luc MICHEL <luc.michel@git.antfield.fr>
Message-id: 20180119145756.7629-5-luc.michel@greensocs.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
[PMM: add assert, as the gicv3 code has]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When there is no active interrupts in the GIC, a read to the C_RPR
register should return the value of the "Idle priority", which is either
the maximum value an IRQ priority field can be set to, or 0xff.
Since the QEMU GIC model implements all the 8 priority bits, the Idle
priority is 0xff.
Internally, when there is no active interrupt, the running priority
value is 0x100. The gic_get_running_priority function returns a uint8_t
and thus truncates this value to 0x00 when returning it. This is wrong since
a value of 0x00 corresponds to the maximum possible priority.
This commit fixes the returned value when the internal value is 0x100.
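A standalone sketch of the fix (names are illustrative):
#include <stdint.h>

static uint8_t sketch_read_rpr(int running_priority)
{
    if (running_priority > 0xff) {
        return 0xff;   /* idle priority, not the truncated 0x00 */
    }
    return running_priority;
}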
Note that it is correct for the Non-Secure view to return 0xff even
though from the NS world point of view, only 7 priority bits are
implemented. The specification states that the Idle priority can be 0xff
even when not all the 8 priority bits are implemented. This has been
verified against a real GICv2 hardware on a Xilinx ZynqMP based board.
Regarding the ARM11MPCore version of the GIC, the specification is not
clear on that point, so this commit does not alter its behavior.
Signed-off-by: Luc MICHEL <luc.michel@git.antfield.fr>
Message-id: 20180119145756.7629-4-luc.michel@greensocs.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In the GIC, when an IRQ is acknowledged, its state goes from "pending"
to:
- "active" if the corresponding IRQ pin has been de-asserted
- "active and pending" otherwise.
The GICv2 manual states that when an IRQ becomes active (or active and
pending), the GIC should either signal another (higher priority) IRQ to
the CPU if there is one, or de-assert the CPU IRQ pin.
The current implementation of the GIC in QEMU does not check if the
IRQ is already active when looking for pending interrupts with
sufficient priority in gic_update(). This can lead to signaling an
interrupt that is already active.
This usually happens when splitting priority drop and interrupt
deactivation. On priority drop, the IRQ stays active until deactivation.
If it becomes pending again, chances are that it will be incorrectly
selected as best_irq in gic_update().
This commit fixes this by checking if the IRQ is not already active when
looking for best_irq in gic_update().
Note that regarding the ARM11MPCore GIC version, the corresponding
manual is not clear on that point, but it has no priority
drop/interrupt deactivation separation, so this case should not happen.
Signed-off-by: Luc MICHEL <luc.michel@git.antfield.fr>
Message-id: 20180119145756.7629-3-luc.michel@greensocs.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The current imx_eth_enable_rx() function is buggy.
It updates s->regs[ENET_RDAR] after calling qemu_flush_queued_packets().
qemu_flush_queued_packets() is going to call imx_XXX_receive() which itself
is going to call imx_eth_enable_rx().
By updating s->regs[ENET_RDAR] after calling qemu_flush_queued_packets()
we end up updating the register with an outdated value which might
lead to disabling the receive function in the i.MX FEC/ENET device.
This patch changes the place where the register update is done so that the
register value stays up to date and the receive function can keep
running.
Reported-by: Fyleo <fyleo45@gmail.com>
Tested-by: Fyleo <fyleo45@gmail.com>
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Message-id: 20180113113445.2705-1-jcd@tribudubois.net
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Tested-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit ("3b39d734141a target/arm: Handle page table walk load failures
correctly") modified both versions of the page table walking code (i.e.,
arm_ldl_ptw and arm_ldq_ptw) to record the result of the translation in
a temporary 'data' variable so that it can be inspected before being
returned. However, arm_ldq_ptw() returns a uint64_t, and using a
temporary uint32_t variable truncates the upper bits, corrupting the
result. This causes problems when using more than 4 GB of memory in
a TCG guest. So use a uint64_t instead.
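The effect is easy to reproduce in isolation (standalone illustration, not
the arm_ldq_ptw() code):
#include <stdint.h>
#include <stdio.h>

int main(void)
{
    uint64_t descriptor = 0x1234567890ULL;  /* needs more than 32 bits */
    uint32_t truncated = descriptor;        /* upper bits silently dropped */
    uint64_t kept = descriptor;             /* the fix: a 64-bit temporary */

    printf("truncated=%llx kept=%llx\n",
           (unsigned long long)truncated, (unsigned long long)kept);
    return 0;
}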
Signed-off-by: Ard Biesheuvel <ard.biesheuvel@linaro.org>
Message-id: 20180119194648.25501-1-ard.biesheuvel@linaro.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Start a VM with qemu-kvm -enable-kvm -vnc :66 -smp 1 -m 1024 -hda
redhat_5.11.qcow2 -device pcnet -vga cirrus,
then use a VNC client to connect to the VM; executing the code below in the
guest OS will lead to a QEMU crash:
int main()
{
iopl(3);
srand(time(NULL));
int a,b;
while(1){
a = rand()%0x100;
b = 0x3c0 + (rand()%0x20);
outb(a,b);
}
return 0;
}
The above code writes to the VGA registers randomly.
We can write VGA CRT controller registers index 0x0C or 0x0D
(which form the start address register) to modify the
display memory address of the upper left pixel
or character of the screen. That address may be out of the
range of the VGA RAM. So we should validate the memory address
when reading or writing it to avoid a segfault.
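A standalone sketch of such a bounds check (illustrative, not the actual
vga.c hunk):
#include <stddef.h>
#include <stdint.h>

static uint8_t sketch_vram_read(const uint8_t *vram, size_t vram_size,
                                size_t addr)
{
    if (addr >= vram_size) {
        return 0xff;   /* out-of-range reads return a harmless value */
    }
    return vram[addr];
}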
Signed-off-by: linzhecheng <linzhecheng@huawei.com>
Message-id: 20180111132724.13744-1-linzhecheng@huawei.com
Fixes: CVE-2018-5683
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Block layer patches
# gpg: Signature made Tue 23 Jan 2018 12:38:36 GMT
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (29 commits)
iotests: Disable some tests for compat=0.10
iotests: Split 177 into two parts for compat=0.10
iotests: Make 059 pass on machines with little RAM
iotests: Filter compat-dependent info in 198
iotests: Make 191 work with qcow2 options
iotests: Make 184 image-less
iotests: Make 089 compatible with compat=0.10
iotests: Fix 067 for compat=0.10
iotests: Fix 059's reference output
iotests: Fix 051 for compat=0.10
iotests: Fix 020 for vmdk
iotests: Skip 103 for refcount_bits=1
iotests: Forbid 020 for non-file protocols
iotests: Drop format-specific in _filter_img_info
iotests: Fix _img_info for backslashes
block/vmdk: Add blkdebug events
block/qcow: Add blkdebug events
qcow2: No persistent dirty bitmaps for compat=0.10
block/vmdk: Fix , instead of ; at end of line
qemu-iotests: Fix locking issue in 102
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is the final stage in correcting the naming convention with respect to
sabre, APB and PBM. It is effectively a file rename from apb.c to sabre.c
along with touching up a few constants to remove the remaining references
to APB.
Note that as part of the rename process the configuration variable
CONFIG_PCI_APB is changed to CONFIG_PCI_SABRE.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Artyom Tarasenko <atar4qemu@gmail.com>
In order to reflect the previous change of TYPE_APB to TYPE_SABRE, update
the corresponding variable names to keep the terminology consistent.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Artyom Tarasenko <atar4qemu@gmail.com>
This is the proper name for the PBM host bridge as referenced in the Sun
documentation.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Artyom Tarasenko <atar4qemu@gmail.com>
As hinted in the comment at the top of the file, the naming convention for the
APB types/QOM functions isn't correct. As a starting point we can at least
rename the APB type and related functions to improve the readability of apb.c.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Artyom Tarasenko <atar4qemu@gmail.com>
Here we rename PBMPCIBridge to SimbaPCIBridge and the QOM type from
TYPE_PBM_PCI_BRIDGE to TYPE_SIMBA_PCI_BRIDGE to improve the clarity
of the device name.
Also touch up the relevant spots in apb.c and various other function
names as appropriate.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Artyom Tarasenko <atar4qemu@gmail.com>
Move the QOM type and macros into a new include/hw/pci-bridge/simba.h
file, and add a new CONFIG_SIMBA Makefile.objs variable which is enabled
for sparc64-softmmu builds only.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
CC: Michael S. Tsirkin <mst@redhat.com>
CC: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Artyom Tarasenko <atar4qemu@gmail.com>
With the LEON3 IRQ controller IRQs can be acknowledged 2 ways:
* Explicitly by software writing to the CLEAR_OFFSET register
* Implicitly when the processor is done running the trap handler attached
to the IRQ.
The current IRQMP code only allows the implicit, processor-triggered IRQ ack.
If software writes explicitly to the CLEAR_OFFSET register, this will clear
the pending bit in the register value but it will not lower the IRQ that is
already raised with the processor. The IRQ will be kept raised to the LEON
processor until the related trap handler is run and the processor implicitly
acks the interrupt. So with the current IRQMP code, trap handlers have to be
run even if the software has already done its job by clearing the pending bit.
This behaviour has been tested on another LEON3 simulator (tsim_leon3 from
Gaisler) and it turns out that the QEMU implementation is not equivalent to
the tsim one. In tsim, if software clears a pending interrupt before
the related interrupt handler is triggered, that interrupt handler will
not be called.
This patch brings the QEMU IRQMP implementation in line with the tsim
implementation by allowing an IRQ to be acknowledged by software alone.
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Reviewed-by: Fabien Chouteau <chouteau@adacore.com>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Pull request
v2:
* Drop merge failure from a previous pull request that broke virtio-blk on ARM
guests
* Add Parallels XML patch series
# gpg: Signature made Mon 22 Jan 2018 16:00:40 GMT
# gpg: using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/block-pull-request:
block/parallels: add backing support to readv/writev
block/parallels: replace some magic numbers
block/parallels: move some structures into header
configure: add dependency
docs/interop/prl-xml: description of Parallels Disk format
block: add block_set_io_throttle virtio-blk-pci QMP example
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If multiple guest threads in user-mode emulation write to a
page which QEMU has marked read-only because of cached TCG
translations, the threads can race in page_unprotect:
* threads A & B both try to do a write to a page with code in it at
the same time (ie which we've made non-writeable, so SEGV)
* they race into the signal handler with this faulting address
* thread A happens to get to page_unprotect() first and takes the
mmap lock, so thread B sits waiting for it to be done
* A then finds the page, marks it PAGE_WRITE and mprotect()s it writable
* A can then continue OK (returns from signal handler to retry the
memory access)
* ...but when B gets the mmap lock it finds that the page is already
PAGE_WRITE, and so it exits page_unprotect() via the "not due to
protected translation" code path, and wrongly delivers the signal
to the guest rather than just retrying the access
In particular, this meant that trying to run 'javac' in user-mode
emulation would fail with a spurious guest SIGSEGV.
Handle this by making page_unprotect() assume that a call for a page
which is already PAGE_WRITE is due to a race of this sort and return
a "fault handled" indication.
Since this would cause an infinite loop if we ever called
page_unprotect() for some other kind of fault than "write failed due
to bad access permissions", tighten the condition in
handle_cpu_signal() to check the signal number and si_code, and add a
comment so that if somebody does ever find themselves debugging an
infinite loop of faults they have some clue about why.
(The trick for identifying the correct setting for
current_tb_invalidated for thread B (needed to handle the precise-SMC
case) is due to Richard Henderson. Paolo Bonzini suggested just
relying on si_code rather than trying anything more complicated.)
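A standalone sketch of that early return (the flag value here is
illustrative; the real code tests QEMU's PAGE_WRITE flag):
#define SKETCH_PAGE_WRITE 0x2   /* stand-in for QEMU's PAGE_WRITE */

/* Return 1 ("fault handled") when another thread already made the page
 * writable, instead of falling through to the path that would forward
 * the signal to the guest. */
static int sketch_page_unprotect(int page_flags)
{
    if (page_flags & SKETCH_PAGE_WRITE) {
        return 1;
    }
    /* otherwise: invalidate cached translations, mprotect() the page
     * writable, and return 1 as before */
    return 1;
}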
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1511879725-9576-3-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Currently all the architecture/OS specific cpu_signal_handler()
functions call handle_cpu_signal() without passing it the
siginfo_t. We're going to want that so we can look at the si_code
to determine whether this is a SEGV_ACCERR access violation or
some other kind of fault, so change the functions to pass through
the pointer to the siginfo_t rather than just the si_addr value.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1511879725-9576-2-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
sched_get/setaffinity linux-user syscalls were missing conversions for
little/big endian, which is hairy since longs may not be the same size
either.
For simplicity, this just introduces loops to convert bit by bit like is
done for select.
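A standalone sketch of the idea: walk the mask bit by bit so that the host
"long" width never has to match the target "long" width (the per-word byte
swap needed when host and target endianness differ is omitted here):
#include <stddef.h>
#include <stdint.h>

static void host_to_target_bits(uint32_t *target, size_t target_words,
                                const uint64_t *host, size_t host_words)
{
    size_t nbits = host_words * 64;

    for (size_t i = 0; i < nbits && i < target_words * 32; i++) {
        int set = (host[i / 64] >> (i % 64)) & 1;
        if (set) {
            target[i / 32] |= UINT32_C(1) << (i % 32);
        } else {
            target[i / 32] &= ~(UINT32_C(1) << (i % 32));
        }
    }
}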
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20180109201643.1479-1-samuel.thibault@ens-lyon.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
mmap() is required by the linux kernel ABI and POSIX to return a
non-NULL address when the implementation chooses a start address for the
mapping.
The current implementation of mmap_find_vma_reserved() can return NULL
as start address of a mapping which leads to subsequent crashes inside
the guests glibc, e.g. output of qemu-arm-static --strace executing a
test binary stx_test:
1879 mmap2(NULL,8388608,PROT_READ|PROT_WRITE,MAP_PRIVATE|MAP_ANONYMOUS|0x20000,-1,0) = 0x00000000
1879 write(2,0xf6fd39d0,79) stx_test: allocatestack.c:514: allocate_stack: Assertion `mem != NULL' failed.
This patch fixes mmap_find_vma_reserved() by skipping NULL as start
address while searching for a suitable mapping start address.
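A minimal sketch of the skip (standalone and simplified; the real code works
on guest addresses inside mmap_find_vma_reserved()):
#include <stdint.h>

static uintptr_t sketch_next_candidate(uintptr_t addr, uintptr_t align)
{
    if (addr == 0) {
        addr += align;   /* never offer NULL as a chosen start address */
    }
    return addr;
}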
CC: Riku Voipio <riku.voipio@iki.fi>
CC: Laurent Vivier <laurent@vivier.eu>
CC: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Maximilian Riemensberger <riemensberger@cadami.net>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <1515286904-86418-1-git-send-email-riemensberger@cadami.net>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The Linux struct cmsghdr is already guaranteed to be sufficiently
aligned that CMSG_ALIGN(sizeof struct cmsghdr) is always equal
to sizeof struct cmsghdr. Stop doing the unnecessary alignment
arithmetic for host and target cmsghdr.
This follows kernel commit 1ff8cebf49ed9e9ca2 and brings our
TARGET_CMSG_* macros back into line with the kernel ones,
as well as making them easier to understand.
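The property relied on here can be checked in isolation (standalone,
Linux-specific):
#include <assert.h>
#include <sys/socket.h>

int main(void)
{
    /* CMSG_ALIGN of the header size is already a no-op */
    assert(CMSG_ALIGN(sizeof(struct cmsghdr)) == sizeof(struct cmsghdr));
    return 0;
}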
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <1513345976-22958-3-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The handling of length calculations in host_to_target_cmsg()
was rather confused:
* when checking for whether the target cmsg header fit in
the remaining buffer, we were using the host struct size,
not the target size
* we were setting tgt_len to "target payload + header length"
but then using it as if it were the target payload length alone
* in various message type cases we weren't handling the possibility
that host or target buffers were truncated
Fix these problems. The second one in particular is liable
to result in us overrunning the guest provided buffer,
since we will try to convert more data than is actually
present.
Fixes: https://bugs.launchpad.net/qemu/+bug/1701808
Reported-by: Bruno Haible <bruno@clisp.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1513345976-22958-2-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
When we do a fork() in usermode emulation, we need to be in
a start/end exclusive section, so that we can ensure that no
other thread is in an RCU section. Otherwise you can get this
deadlock:
- fork thread: has mmap_lock, waits for rcu_sync_lock
(because rcu_init_lock() is registered as a pthread_atfork() hook)
- RCU thread: has rcu_sync_lock, waits for rcu_read_(un)lock
- another CPU thread: in RCU critical section, waits for mmap_lock
This can show up if you have a heavily multithreaded guest program
that does a fork().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reported-by: Stuart Monteith <stuart.monteith@linaro.org>
Message-Id: <1512650481-1723-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Our locking order is that the tb lock should be taken
inside the mmap_lock, but fork_start() grabs locks the
other way around. This means that if a heavily multithreaded
guest process (such as Java) calls fork() it can deadlock,
with the thread that called fork() stuck in fork_start()
with the tb lock and waiting for the mmap lock, but some
other thread in tb_find() with the mmap lock and waiting
for the tb lock. The cpu_list_lock() should also always be
taken last, not first.
Fix this by making fork_start() grab the locks in the
right order. The order in which we drop locks doesn't
matter, so we leave fork_end() the way it is.
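Illustrative only (not a quote of the linux-user code; the prototypes are
repeated here just to keep the sketch self-contained): the corrected
acquisition order, outermost lock first.
void mmap_lock(void);
void tb_lock(void);
void cpu_list_lock(void);

static void fork_start_sketch(void)
{
    mmap_lock();       /* outermost: the mmap lock */
    tb_lock();         /* then the tb lock */
    cpu_list_lock();   /* the CPU list lock is always taken last */
}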
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-stable@nongnu.org
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <1512397331-15238-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Block patches
# gpg: Signature made Tue Jan 23 12:35:11 2018 CET
# gpg: using RSA key F407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* mreitz/tags/pull-block-2018-01-23: (25 commits)
iotests: Disable some tests for compat=0.10
iotests: Split 177 into two parts for compat=0.10
iotests: Make 059 pass on machines with little RAM
iotests: Filter compat-dependent info in 198
iotests: Make 191 work with qcow2 options
iotests: Make 184 image-less
iotests: Make 089 compatible with compat=0.10
iotests: Fix 067 for compat=0.10
iotests: Fix 059's reference output
iotests: Fix 051 for compat=0.10
iotests: Fix 020 for vmdk
iotests: Skip 103 for refcount_bits=1
iotests: Forbid 020 for non-file protocols
iotests: Drop format-specific in _filter_img_info
iotests: Fix _img_info for backslashes
block/vmdk: Add blkdebug events
block/qcow: Add blkdebug events
qcow2: No persistent dirty bitmaps for compat=0.10
block/vmdk: Fix , instead of ; at end of line
qemu-iotests: Fix locking issue in 102
...
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When originally written, test 177 explicitly took care to run
with compat=0.10. Then I botched my own test in commit
81c219ac and f0a9c18f, by adding additional actions that require
v3 images. Split out the new code into a new v3-only test, 204,
and revert 177 back to its original state other than a new comment.
Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20180117165420.15946-2-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
There is a bit of image-specific information which depends on the qcow2
compat level. Filter it so that 198 works with compat=0.10 (and any
refcount_bits value).
Note that we cannot simply drop the --format-specific switch because we
do need the "encrypt" information.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20171123020832.8165-18-mreitz@redhat.com
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
In order for 191 to work with an explicit refcount_bits or compat=0.10,
we should strip format-specific information from the output--and we can
do so by using _filter_img_info.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20171123020832.8165-17-mreitz@redhat.com
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
051 has both compat=1.1 and compat=0.10 tests (once it uses
lazy_refcounts, once it tests that setting them does not work).
For the compat=0.10 tests, it already explicitly creates a suitable
image. So let's just ignore the user-specified compat level for the
lazy_refcounts test and explicitly create a compat=1.1 image there, too.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20171123020832.8165-12-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This test does funny things like TEST_IMG="TEST_IMG.base" _make_test_img
that usually only work with the file protocol. More specifically, they
do not work with the most interesting non-file protocols, so we might as
well skip this for anything but file.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20171123020832.8165-8-mreitz@redhat.com
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
_filter_img_info should remove format-specific information, too. We
already have such a filter in _img_info, and it is very useful for
query-block-named-block-nodes (etc.), too.
However, in 198 we need that information (but we still want the rest of
the filter), so make that filtering optional. Note that "the rest of
the filter" includes filtering of the test directory, so we can drop the
_filter_testdir from 198 at the same time.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20171123020832.8165-7-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Persistent dirty bitmaps require a properly functioning
autoclear_features field, or we cannot track when an unsupporting
program might overwrite them. Therefore, we cannot support them for
compat=0.10 images.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20171123020832.8165-3-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
102 truncates a qcow2 file (the raw file) on purpose while a VM is
running. However, image locking will usually prevent exactly this.
The fact that most people have not noticed until now (I suppose you may
have seen sporadic failures, but not taken them too seriously, like me)
further shows that this truncation is actually not really done
concurrently, but that the VM is still starting up by this point and has
not yet opened the image. Remedy this by waiting for the monitor shell
to appear before the qemu-img invocation so we know the VM is up.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20171129185102.29390-1-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Now that iotest 093 test proves that the throttling configuration
survives a blockdev-remove-medium/blockdev-insert-medium pair, the
original reason for declaring these commands experimental is gone
(see commit 6e0abc251d).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20171110224302.14424-5-mreitz@redhat.com
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
This patch implements a test case for the scenario that was failing
prior to the patch "migration/ram.c: do not set 'postcopy_running' in
POSTCOPY_INCOMING_END", commit acab30b85d.
This new test file 201 was derived from the test file 181 authored
by Kevin Wolf.
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Cleber Rosa <crosa@redhat.com>
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The pin-based interrupt of the NVMe controller did not work properly
because it was using the obsolete function pci_irq_pulse().
To fix this, use pci_irq_assert() / pci_irq_deassert()
instead of pci_irq_pulse().
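A hedged sketch of the level-triggered update (illustrative, not the
hw/block/nvme.c hunk):
#include "qemu/osdep.h"
#include "hw/pci/pci.h"

static void sketch_update_irq(PCIDevice *pci_dev, bool pending)
{
    if (pending) {
        pci_irq_assert(pci_dev);     /* keep the pin raised while work is pending */
    } else {
        pci_irq_deassert(pci_dev);   /* lower it once the queue drains */
    }
}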
Signed-off-by: Hikaru Nishida <hikarupsp@gmail.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We could hit a lock failure if a signal makes fcntl return
-1 with errno set to EINTR. In this case we should retry.
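A standalone sketch of the retry loop:
#include <errno.h>
#include <fcntl.h>

static int fcntl_setlk_retry(int fd, int cmd, struct flock *fl)
{
    int ret;

    do {
        ret = fcntl(fd, cmd, fl);
    } while (ret == -1 && errno == EINTR);

    return ret;
}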
Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Pull request for various patches that have been reviewed and
lying on the mailing list for a while, but apparently no
maintainer feels really responsible for picking up.
# gpg: Signature made Mon 22 Jan 2018 11:10:16 GMT
# gpg: using RSA key 0x2ED9D774FE702DB5
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>"
# gpg: aka "Thomas Huth <thuth@redhat.com>"
# gpg: aka "Thomas Huth <huth@tuxfamily.org>"
# gpg: aka "Thomas Huth <th.huth@posteo.de>"
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* remotes/huth/tags/pull-request-2018-01-22:
hw/isa: Replace fprintf(stderr, "*\n" with error_report()
hw/ipmi: Replace fprintf(stderr, "*\n" with error_report()
hw/bt: Replace fprintf(stderr, "*\n" with error_report()
Fixes after renaming __FUNCTION__ to __func__
Replace all occurances of __FUNCTION__ with __func__
tests/cpu-plug-test: Test CPU hot-plugging on s390x
tests/cpu-plug-test: Check CPU hot-plugging on ppc64, too
tests/cpu-plug-test: Check the CPU hot-plugging with device_add, too
tests: Rename pc-cpu-test.c to cpu-plug-test.c
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This reverts commits
ca6011c migration: add postcopy total blocktime into query-migrate
5f32dc8 migration: add blocktime calculation into migration-test
2f7dae9 migration: postcopy_blocktime documentation
3be98be migration: calculate vCPU blocktime on dst side
01a87f0 migration: add postcopy blocktime ctx into MigrationIncomingState
31bf06a migration: introduce postcopy-blocktime capability
as they don't build on ppc32 due to trying to do atomic accesses
on types that are larger than the host pointer type.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Coverity warnings CID 1385146, 1385148, 1385149 and 1385150 point out that
xtensa_opcode_num_operands and xtensa_format_num_slots may return -1
even when xtensa_opcode_decode and xtensa_format_decode succeed. In that
case unsigned counters used to iterate through operands/slots will not
do the right thing.
Make counters and loop bounds signed to fix the warnings.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
The sample_controller core is a simple noMMU general purpose core, a modern
analog of de212. It is used as the default core in the xtensa port of
Zephyr.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Define a default core for noMMU configurations and use that core as
the machine default for the noMMU XTFPGA machines.
This is done to avoid offering a non-working configuration (an MMU core on a
noMMU machine) as the default.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
The block_set_io_throttle command can look up BlockBackends by the
attached qdev device ID. virtio-blk-pci is a special case because the
actual VirtIOBlock device is the "/virtio-backend" child of the PCI
adapter device.
Add a QMP schema example so clients will know how to use
block_set_io_throttle on the virtio-blk-pci device.
The alternative is to implement some sort of aliasing for qmp_get_blk()
but that is likely to cause confusion and could break future use cases.
Let's not go there.
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20180117090700.25811-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
ppc patch queue 2018-01-21
This request supersedes the one from 2018-01-19. The only difference
is that the patch deprecating ppcemb-softmmu, and thereby creating
many annoying warnings from make check, has been removed.
Highlights are:
* Significant TCG speedup by optimizing cmp generation
* Fix a regression caused by recent change to set compat mode on
hotplugged cpus
* Cleanup of default configs
* Some implementation of msgsnd/msgrcv instructions for server chips
# gpg: Signature made Sun 21 Jan 2018 05:30:54 GMT
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.12-20180121:
target/ppc/spapr_caps: Add macro to generate spapr_caps migration vmstate
target/ppc: add support for hypervisor doorbells on book3s CPUs
sii3112: Add explicit type casts to avoid unintended sign extension
sm501: Add missing break to case
target-ppc: optimize cmp translation
spapr: fix device tree properties when using compatibility mode
spapr: drop duplicate variable in spapr_core_plug()
target/ppc: msgsnd and msgclr instructions need hypervisor privilege
target/ppc: fix doorbell and hypervisor doorbell definitions
hw/ppc/Makefile: Add a way to disable the PPC4xx boards
default-configs/ppc-softmmu: Restructure the switches according to the machines
default-configs/ppc64-softmmu: Include 32-bit configs instead of copying them
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We need to handle the bpb control on reset and migration. Normally
stfle.82 is transparent (and the normal guest part works without
hypervisor activity). To prevent any issues we require full
host kernel support for this feature.
Cc: qemu-stable@nongnu.org
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20180118085628.40798-3-borntraeger@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[CH: 'Branch Prediction Blocking' -> 'Branch prediction blocking']
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
CC == 2 can only happen due to a protection exception, not if memory is
not available (PGM_ADDRESSING). So all PGM_ADDRESSING exceptions have to
be forwarded to the guest.
Compared to the initial definition of TEST PROTECTION, we now read globals
(e.g. the PSW mask), so we have to correctly mark the instruction
(otherwise, e.g., booting Fedora 27 fails).
Also, the architecture explicitly specifies which exceptions are
forwarded to the guest; following that makes the code a little nicer.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20180112125452.8569-1-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Fix storage attribute migration so that it does not fail for guests
with more than a few GB of RAM.
With such guests, the index in the buffer would go out of bounds,
usually by large amounts, thus receiving -EFAULT from the kernel.
Migration itself would be successful, but storage attributes would then
not be migrated completely.
This patch fixes the out-of-bounds access, and thus migration of all
storage attributes when the guest has large amounts of memory.
Cc: qemu-stable@nongnu.org
Signed-off-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Fixes: 903fd80b03 ("s390x/migration: Storage attributes device")
Message-Id: <1516297904-18188-1-git-send-email-imbrenda@linux.vnet.ibm.com>
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Linux crashes right now if maxmem > mem is specified on the command line.
On s390x, the guest can hotplug memory itself right now - very weird -
and e.g. Fedora 27 will simply add all memory it can when booting.
So now, we have at least the same behavior on TCG and KVM.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171218224616.21030-3-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Linux uses TEST PROTECTION to sense for available memory locations.
Let's implement what we can for now (just as for the other instructions,
excluding AR mode and special protection mechanisms).
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171218224616.21030-2-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Replace a large number of the fprintf(stderr, "*\n" calls with
error_report(). The functions were renamed with these commands and then
compiler issues were manually fixed.
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
Some lines were then manually tweaked to pass checkpatch.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Replace a large number of the fprintf(stderr, "*\n" calls with
error_report(). The functions were renamed with these commands and then
compiler issues were manually fixed.
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
Some lines were then manually tweaked to pass checkpatch.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Replace a large number of the fprintf(stderr, "*\n" calls with
error_report(). The functions were renamed with these commands and then
compiler issues were manually fixed.
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
Some lines were then manually tweaked to pass checkpatch.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
[THH: Changed one missing fprintf into an error_report, too]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Replace all occurrences of __FUNCTION__, except for the check in checkpatch,
with the non-GCC-specific __func__.
One line in hcd-musb.c was manually tweaked to pass checkpatch.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
[THH: Removed hunks related to pxa2xx_mmci.c (fixed already)]
Signed-off-by: Thomas Huth <thuth@redhat.com>
CPU hot-plugging on s390x is possible with both "cpu-add"
and "device_add", so test both.
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Hot plugging on ppc64 is possible via "device_add", too. Unlike x86,
we must not specify a 'socket-id' and 'thread-id' here, so this needs
to be done with a separate function that just specifies the 'core-id'
during the "device_add".
Reviewed-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
Tested-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Using 'device_add' instead of 'cpu-add' is the new way for
hot-plugging CPUs, so we should test this regularly, too.
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Tested-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The test will be extended to work on other architectures, too, so let's
use a more generic name for the file and the functions in here first.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The vmstate description and the contained 'needed' function for migration
of spapr_caps are the same for each cap, with only the name of the cap
substituted. As such, introduce a macro to allow for easier generation of
these.
Convert the three existing spapr_caps (htm, vsx, and dfp) to use this
macro.
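Roughly, the macro boils down to something like this (a hedged sketch; the
actual macro name, arguments and field path in QEMU may differ):
/* Hedged sketch of a vmstate-generating macro for one spapr capability.
 * The mig.caps[...] field path and the _needed callback naming are
 * illustrative assumptions. */
#define SPAPR_CAP_MIG_STATE(ccap, cap)                          \
    const VMStateDescription vmstate_spapr_cap_##ccap = {       \
        .name = "spapr/cap/" #ccap,                             \
        .version_id = 1,                                        \
        .minimum_version_id = 1,                                \
        .needed = spapr_cap_##ccap##_needed,                    \
        .fields = (VMStateField[]) {                            \
            VMSTATE_UINT8(mig.caps[cap], sPAPRMachineState),    \
            VMSTATE_END_OF_LIST()                               \
        },                                                      \
    }
SPAPR_CAP_MIG_STATE(htm, SPAPR_CAP_HTM);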
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The hypervisor doorbells are used by skiboot and Linux on POWER9
processors to wake up secondaries.
This adds processor control support to the Server architecture by
reusing the Embedded support. They are very similar; only the bit
definitions of the CPU identifier differ.
Still to be done is message broadcast to all threads of the same
processor.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We know that only one bit (in addition to SO) is going to be set in
the condition register, so do two movconds instead of three setconds,
three shifts and two ORs.
For ppc64-linux-user, the code size reduction is around 5% and the
performance improvement slightly less than 10%. For softmmu, the
improvement is around 5%.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit 51f84465dd changed the compatibility mode setting logic:
- machine reset only sets compatibility mode for the boot CPU
- compatibility mode is set for other CPUs when they are put online
by the guest with the "start-cpu" RTAS call
This causes a regression for machines started with max-cpu-compat:
the device tree nodes related to secondary CPU cores contain wrong
"cpu-version" and "ibm,pa-features" values, as shown below.
Guest started on a POWER8 host with:
-smp cores=2 -machine pseries,max-cpu-compat=compat7
ibm,pa-features = [18 00 f6 3f c7 c0 80 f0 80 00
00 00 00 00 00 00 00 00 80 00 80 00 80 00 00 00];
cpu-version = <0x4d0200>;
^^^
second CPU core
ibm,pa-features = <0x600f63f 0xc70080c0>;
cpu-version = <0xf000003>;
^^^
boot CPU core
The second core is advertised in raw POWER8 mode. This happens because
CAS assumes all CPUs to have the same compatibility mode. Since the
boot CPU already has the requested compatibility mode, the CAS code
does not set it for the secondary one, and exposes the bogus device
tree properties in the CAS response to the guest.
A similar situation is observed when hot-plugging a CPU core. The
related device tree properties are generated and exposed to guest
with the "ibm,configure-connector" RTAS before "start-cpu" is called.
The CPU core is advertised to the guest in raw mode as well.
In both cases, it boils down to the fact that "start-cpu" happens too
late. This can be fixed globally by propagating the compatibility mode
of the boot CPU to the other CPUs during reset. For this to work, the
compatibility mode of the boot CPU must be set before the machine code
actually resets all CPUs.
It is not needed to set the compatibility mode in "start-cpu" anymore,
so the code is dropped.
Fixes: 51f84465dd
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
A variable is already defined at the beginning of the function to
hold a pointer to the CPU core object:
sPAPRCPUCore *core = SPAPR_CPU_CORE(OBJECT(dev));
No need to define it again in the pre-2.10 compatibility code snippet.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
commit f03a1af581 ("ppc: Fix POWER7 and POWER8 exception definitions")
introduced definitions for the server doorbell exceptions by reusing
the embedded definitions, but this adds complexity to the powerpc_excp()
routine. Let's introduce specific definitions for the Server doorbell
exceptions.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We've got the config switch CONFIG_PPC4XX, so we should use it
in the Makefile accordingly and only include the PPC4xx boards
if this switch has been enabled. (Note: Unfortunately, the files
ppc4xx_devs.c and ppc405_uc.c still have to be included in the
build anyway to fulfil some complicated linker dependencies ...
so these are subject to a more thorough clean-up later)
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Order the CONFIG switches in ppc-softmmu.mak according to the machine
classes where they are used (embedded, Mac or PReP), so that it is
easier for the users to disable a set of switches completely if they
are not needed.
Also add the missing CONFIG_IDE_SII3112 switch to the embedded section
which was previously only added to ppcemb-softmmu.mak.
And while we're at it, also remove the CONFIG_IDE_CMD646 switch since
this controller does not seem to be used by any ppc machine in QEMU.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
qemu-softmmu-ppc64 is supposed to be a superset of qemu-softmmu-ppc.
However, instead of simply including the 32-bit config file, we've
duplicated all CONFIG_xxx settings there instead. This way, we've missed
some CONFIG switches in ppc64-softmmu.mak which were only added to the
32-bit config file (e.g. CONFIG_SUNGEM). Let's fix this problem by
including the 32-bit config file into the 64-bit config file instead
of duplicating all the CONFIG switches there.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The same definitions can also be found in include/hw/ide/ahci.h
so let's remove these #defines from ahci_internal.h.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1512457825-3847-1-git-send-email-thuth@redhat.com
[Maintainer edit: publicize object names, privatize object macros.]
Signed-off-by: John Snow <jsnow@redhat.com>
ATA8-ACS3, 7.9 DATA SET MANAGEMENT - 06h, DMA
7.9.5 Error Outputs
If the Trim bit is set to one and:
a) the device detects an invalid LBA Range Entry; or
b) count is greater than IDENTIFY DEVICE data word 105
(see 7.16.7.55),
then the device shall return command aborted.
A device may trim one or more LBA Range Entries before it returns
command aborted. See table 209.
This check is not in the common ide_dma_cb() as the range for TRIM
is harder to reach: it is not in LBA/count registers and the buffer has
to be parsed first.
Signed-off-by: Anton Nefedov <anton.nefedov@virtuozzo.com>
Message-id: 1512735034-35327-4-git-send-email-anton.nefedov@virtuozzo.com
Signed-off-by: John Snow <jsnow@redhat.com>
When all the fw_cfg slots are used, a write is made outside the
bounds of the fw_cfg files array as part of the sort algorithm.
Fix it by avoiding an unnecessary array element move.
Also fix an assert while at it.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Message-Id: <20180108215007.46471-1-marcel@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Remove dependency of possible_cpus on 1st CPU instance,
which decouples configuration data from CPU instances that
are created using that data.
It will also be used later for enabling early cpu-to-numa-node
configuration at runtime: qmp_query_hotpluggable_cpus() should
provide a list of available cpu slots at an early stage,
before machine_init() is called and the 1st cpu is created,
so that mgmt might be able to call it and use the output to set
the numa mapping.
Use MachineClass::possible_cpu_arch_ids() callback to set
cpu type info, along with the rest of possible cpu properties,
to let machine define which cpu type* will be used.
* for SPAPR it will be a spapr core type and for ARM/s390x/x86
a respective descendant of CPUClass.
Move parse_numa_opts() in vl.c after cpu_model is parsed into
cpu_type so that possible_cpu_arch_ids() would know which
cpu_type to use during layout initialization.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <1515597770-268979-1-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Currently the only vNVDIMM backend that can guarantee guest write
persistence is device DAX on Linux, because no host-side kernel cache
is involved in guest access to it. The approach to detect whether
the backend is device DAX needs to access sysfs, which may not work
with SELinux.
Instead, we add the 'unarmed' option to device 'nvdimm', so that users
or management utils, which have enough knowledge about the backend,
can control the unarmed flag in guest ACPI NFIT via this option. The
guest Linux NVDIMM driver, for example, will mark the corresponding
vNVDIMM device read-only if the unarmed flag in guest NFIT is set.
The default value of the 'unarmed' option is 'off' in order to keep
backwards compatibility.
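For example (a hypothetical command line; the machine options, backend
object and sizes are illustrative):
qemu-system-x86_64 -machine pc,nvdimm=on -m 4G,slots=2,maxmem=8G \
    -object memory-backend-file,id=mem1,share=on,mem-path=/tmp/nvdimm.img,size=1G \
    -device nvdimm,id=nvdimm1,memdev=mem1,unarmed=on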
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20171211072806.2812-4-haozhong.zhang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
When mmap(2)-ing the backend files, QEMU uses the host page size
(getpagesize(2)) by default as the alignment of the mapping address.
However, some backends may require alignments different from the page
size. For example, mmap-ing a device DAX (e.g., /dev/dax0.0) on Linux
kernel 4.13 to an address which is 4K-aligned but not 2M-aligned
fails with a kernel message like
[617494.969768] dax dax0.0: qemu-system-x86: dax_mmap: fail, unaligned vma (0x7fa37c579000 - 0x7fa43c579000, 0x1fffff)
Because there is no common approach to obtain such an alignment requirement,
we add the 'align' option to 'memory-backend-file', so that users or
management utils, which have enough knowledge about the backend, can
specify a proper alignment via this option.
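For example (hypothetical values; the 2M alignment matches the device DAX
case above):
-object memory-backend-file,id=mem1,share=on,mem-path=/dev/dax0.0,size=4G,align=2M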
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20171211072806.2812-2-haozhong.zhang@intel.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[ehabkost: fixed typo, fixed error_setg() format string]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The qdev_unplug() function contains a g_assert(hotplug_ctrl) statement,
so QEMU crashes when the user tries to device_add + device_del a device
that does not have a corresponding hotplug controller. This could be
provoked for a couple of devices in the past (see commit 4c93950659
or 84ebd3e8c7 for example), and can currently for example also be
triggered like this:
$ s390x-softmmu/qemu-system-s390x -M none -nographic
QEMU 2.10.50 monitor - type 'help' for more information
(qemu) device_add qemu-s390x-cpu,id=x
(qemu) device_del x
**
ERROR:qemu/qdev-monitor.c:872:qdev_unplug: assertion failed: (hotplug_ctrl)
Aborted (core dumped)
So devices clearly need a hotplug controller when they should be usable
with device_add.
The code in qdev_device_add() already checks whether the bus has a proper
hotplug controller, but for devices that do not have a corresponding bus,
there is no appropriate check available yet. In that case we should check
whether the machine itself provides a suitable hotplug controller and
refuse to plug the device if none is available.
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1509617407-21191-3-git-send-email-thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
pc, pci, virtio: features, fixes, cleanups
A bunch of fixes, cleanups and new features all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Thu 18 Jan 2018 20:41:03 GMT
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (29 commits)
vhost: remove assertion to prevent crash
vhost-user: fix misaligned access to payload
vhost-user: factor out msg head and payload
tests: acpi: add comments to fetch_rsdt_referenced_tables/data->tables usage
tests: acpi: rename test_acpi_tables()/test_dst_table() to reflect its usage
tests: acpi: init table descriptor in test_dst_table()
tests: acpi: move tested tables array allocation outside of test_acpi_dsdt_table()
x86_iommu: check if machine has PCI bus
x86_iommu: Move machine check to x86_iommu_realize()
vhost-user-test: use init_virtio_dev in multiqueue test
vhost-user-test: make features mask an init_virtio_dev() argument
vhost-user-test: setup virtqueues in all tests
vhost-user-test: extract read-guest-mem test from main loop
vhost-user-test: fix features mask
hw/acpi-build: Make next_base easy to follow
ACPI/unit-test: Add a testcase for RAM allocation in numa node
hw/pci-bridge: fix QEMU crash because of pcie-root-port
intel-iommu: Extend address width to 48 bits
intel-iommu: Redefine macros to enable supporting 48 bit address width
vhost-user: fix multiple queue specification
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
QEMU will assert on vhost-user backed virtio device hotplug if QEMU is
using more RAM regions than VHOST_MEMORY_MAX_NREGIONS (for example if
it were started with a lot of DIMM devices).
Fix it by returning an error instead of asserting, and let callers of
vhost_set_mem_table() handle the error condition gracefully.
Cc: qemu-stable@nongnu.org
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We currently take a pointer to a misaligned field of a packed structure.
clang reports this as a build warning.
A fix is to keep the payload in a separate structure, and access it
from there using a vectored write.
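The idea, sketched (struct and field names are illustrative, not the
actual vhost-user code):
/* Hedged sketch: send the message header and the payload as separate
 * iovec elements, so no pointer into a packed structure is taken. */
struct iovec iov[] = {
    { .iov_base = &msg.hdr,     .iov_len = sizeof(msg.hdr) },
    { .iov_base = &msg.payload, .iov_len = msg.hdr.size    },
};
/* ...then hand iov[] to a vectored write such as writev()... */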
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The main purpose of test_dst_table() is loading a table from QEMU
and checking that the checksum in the header matches the actual one;
rename it to reflect the main action it performs.
Likewise, the test_acpi_tables() name is too broad, while the function
only loads tables referenced by the RSDT, so rename it to reflect that.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Remove code duplication and make sure that the table descriptor
passed in for initialization is in the expected state.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
At best, it is confusing that the array for the list of tables to be tested
against reference tables is allocated within test_acpi_dsdt_table(),
and at worst it would just overwrite the list of tables if they were
added before test_acpi_dsdt_table().
Move the array initialization to test_acpi_one() before we start
processing tables.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Starting qemu with
qemu-system-x86_64 -S -M isapc -device {amd|intel}-iommu
leads to a segfault. The code assumes a PCI bus is present and
tries to access the bus structure without checking.
Since Intel VT-d and AMDVI should only work with PCI, add a
check for the PCI bus and return an error if it is not present.
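A hedged sketch of such a check (local variable names and the error
message are assumptions, not the exact QEMU code):
/* Illustrative sketch: refuse to realize the IOMMU when the machine
 * provides no PCI bus (e.g. -M isapc). */
if (!pcms || !pcms->bus) {
    error_setg(errp, "Machine-type '%s' not supported by IOMMU", mc->name);
    return;
}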
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Instead of having the same error checks in vtd_realize()
and amdvi_realize(), move them over to the generic
x86_iommu_realize().
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Mohammed Gamal <mgamal@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Now that init_virtio_dev() has been generalized to all cases,
use it in test_multiqueue() to avoid code duplication.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The goal is to generalize the use of [un]init_virtio_dev() to
all tests, which do not necessarily expose the same feature
set.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Only the multiqueue test sets up the virtqueues.
This patch generalizes the setup of virtqueues to all tests.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
This patch makes read-guest-test consistent with other tests,
i.e. the test server is created in the test function.
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
It may be hard to read the assignment statement of "next_base", so replace
next_base += (1ULL << 32) - pcms->below_4g_mem_size;
with
next_base = mem_base + mem_len;
for readability.
No functionality change.
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
As QEMU supports memory-less nodes, it is possible that there is
no RAM in the first NUMA node (also called node0), e.g.:
... \
-m 128,slots=3,maxmem=1G \
-numa node -numa node,mem=128M \
But this makes it hard for QEMU to build a known-to-work ACPI SRAT
table, and only fixing that is not enough.
Add a testcase for this situation to make sure the ACPI table is
correct for the guest.
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If we try to use more pcie_root_ports than available slots
and an IO hint is passed to the port, QEMU crashes because
we try to init the "IO hint" capability even if the device
is not created.
Fix it by checking for error before adding the capability,
so QEMU can fail gracefully.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The current implementation of the Intel IOMMU code only supports a 39-bit
iova address width. This patch provides a new parameter (x-aw-bits)
for intel-iommu to extend its address width to 48 bits while keeping the
default the same (39 bits). The reason for not changing the default
is to avoid potential compatibility problems with live migration of
intel-iommu enabled QEMU guests. The only valid values for the 'x-aw-bits'
parameter are 39 and 48.
After enabling the larger address width (48), we should be able to map
larger iova addresses in the guest, for example in a QEMU guest that
is configured with large memory (>= 1TB). To check whether the 48-bit
address width is enabled, we can grep the guest dmesg output for the line
"DMAR: Host address width 48".
Signed-off-by: Prasad Singamsetty <prasad.singamsety@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The current implementation of the Intel IOMMU code only supports a 39-bit
host/iova address width, so a number of macros use hard-coded values based
on that. This patch redefines them so they can be used with
variable address widths. This patch doesn't add any new functionality
but enables adding support for a 48-bit address width.
Signed-off-by: Prasad Singamsetty <prasad.singamsety@oracle.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The number of queues supported by the slave is queried with
message VHOST_USER_GET_QUEUE_NUM, not with message
VHOST_USER_GET_PROTOCOL_FEATURES.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This function should be declared in a generic header file so we can
utilize it.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Yuval Shaia <yuval.shaia@oracle.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The loading time of a VM is quite significant when its virtio
devices use a large number of virtqueues (e.g. a virtio-serial
device with max_ports=511). Most of the time is spent in the
creation of all the required event notifiers (ioeventfd and memory
regions).
This patch packs all the changes to the memory regions into a
single memory transaction.
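Conceptually the change looks like this (a hedged fragment; loop variables
are illustrative and the actual QEMU code paths differ):
/* Wrap the per-queue notifier setup in one memory region transaction so
 * the memory topology is rebuilt once, not once per queue. */
memory_region_transaction_begin();
for (i = 0; i < nvqs; i++) {
    virtio_bus_set_host_notifier(bus, i, true);  /* assign ioeventfd for queue i */
}
memory_region_transaction_commit();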
Reported-by: Sitong Liu <siliu@redhat.com>
Reported-by: Xiaoling Gao <xiagao@redhat.com>
Signed-off-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Use the EventNotifier's cleanup callback function to execute the
event_notifier_cleanup function after kvm unregistered the eventfd.
This change supports running the virtio_bus_set_host_notifier
function inside a memory region transaction. Otherwise, a closed
fd is sent to kvm, which results in a failure.
Signed-off-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add a cleanup callback function to the EventNotifier struct,
which allows users to execute event_notifier_cleanup in a
different context.
Signed-off-by: Gal Hammer <ghammer@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This commit introduces a vhost-user-blk backend device; it uses a UNIX
domain socket to communicate with QEMU. The vhost-user-blk sample
application should be used with the QEMU vhost-user-blk-pci device.
To use it, compile it with:
make vhost-user-blk
and start it like this:
vhost-user-blk -b /dev/sdb -s /path/vhost.socket
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Enable VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages in the
libvhost-user library, so users can implement their own I/O target
based on the library. This enables the virtio config space to be
delivered between the QEMU host device and the I/O target.
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This commit introduces a new vhost-user device for block; it uses a
chardev to connect to the backend. As with the QEMU virtio-blk device,
the guest OS still uses the virtio-blk frontend driver.
To use it, start QEMU with a command line like this:
qemu-system-x86_64 \
-chardev socket,id=char0,path=/path/vhost.socket \
-device vhost-user-blk-pci,chardev=char0,num-queues=2, \
bootindex=2... \
Users can use different parameters for `num-queues` and `bootindex`.
Unlike the existing QEMU virtio-blk host device, it makes it easier
for users to implement their own I/O processing logic, such as an all
user-space I/O stack against a hardware block device. It uses the new
vhost messages (VHOST_USER_GET_CONFIG) to get block virtio config
information from the backend process.
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add VHOST_USER_GET_CONFIG/VHOST_USER_SET_CONFIG messages, which can be
used for live migration of vhost-user devices; vhost-user devices can
also benefit from these messages to get/set the virtio config space
from/to the I/O target. To support virtio config space changes, the
VHOST_USER_SLAVE_CONFIG_CHANGE_MSG message is added as the event notifier
for the case where the virtio config space changes in the slave I/O target.
Signed-off-by: Changpeng Liu <changpeng.liu@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
x86 queue, 2018-01-17
Highlight: new CPU models that expose CPU features that guests
can use to mitigate CVE-2017-5715 (Spectre variant #2).
# gpg: Signature made Thu 18 Jan 2018 02:00:03 GMT
# gpg: using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/x86-pull-request:
i386: Add EPYC-IBPB CPU model
i386: Add new -IBRS versions of Intel CPU models
i386: Add FEAT_8000_0008_EBX CPUID feature word
i386: Add spec-ctrl CPUID bit
i386: Add support for SPEC_CTRL MSR
i386: Change X86CPUDefinition::model_id to const char*
target/i386: add clflushopt to "Skylake-Server" cpu model
pc: add 2.12 machine types
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ppc patch queue 2018-01-17
Another pull request for ppc related patches. The most interesting
thing here is the new capabilities framework for the pseries machine
type. This gives us better handling of several existing
incompatibilities between TCG, PR and HV KVM, as well as new ones that
arise with POWER9. Further, it will allow reasonable handling of the
advertisement of features necessary to mitigate the recent CVEs
(Spectre and Meltdown).
In addition there's:
* Improved handling of different "vsmt" modes
* Significant enhancements to the "pnv" machine type
* Assorted other bugfixes
# gpg: Signature made Wed 17 Jan 2018 02:21:50 GMT
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.12-20180117: (22 commits)
target-ppc: Fix booke206 tlbwe TLB instruction
target/ppc: add support for POWER9 HILE
ppc/pnv: change initrd address
ppc/pnv: fix XSCOM core addressing on POWER9
ppc/pnv: introduce pnv*_is_power9() helpers
ppc/pnv: change core mask for POWER9
ppc/pnv: use POWER9 DD2 processor
tests/boot-serial-test: fix powernv support
ppc/pnv: Update skiboot firmware image
spapr: Adjust default VSMT value for better migration compatibility
spapr: Allow some cases where we can't set VSMT mode in the kernel
target/ppc: Clarify compat mode max_threads value
ppc: Change Power9 compat table to support at most 8 threads/core
spapr: Remove unnecessary 'options' field from sPAPRCapabilityInfo
hw/ppc/spapr_caps: Rework spapr_caps to use uint8 internal representation
spapr: Handle Decimal Floating Point (DFP) as an optional capability
spapr: Handle VMX/VSX presence as an spapr capability flag
target/ppc: Clean up probing of VMX, VSX and DFP availability on KVM
spapr: Validate capabilities on migration
spapr: Treat Hardware Transactional Memory (HTM) as an optional capability
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Rather than making every callsite perform length sanity checks
and error reporting, add the helper functions nbd_opt_read()
and nbd_opt_drop() that use the length stored in the client
struct; also add an assertion that optlen is 0 before any
option (ie. any previous option was fully handled), complementing
the assertion added in an earlier patch that optlen is 0 after
all negotiation completes.
Note that the call in nbd_negotiate_handle_export_name() does
not use the new helper (in part because the server cannot
reply to NBD_OPT_EXPORT_NAME - it either succeeds or the
connection drops).
Based on patches by Vladimir Sementsov-Ogievskiy.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180110230825.18321-6-eblake@redhat.com>
When a client abruptly disconnects before we've finished reading
the name sent with NBD_OPT_EXPORT_NAME, we are better off logging
the failure as EIO (we can't communicate with the client), rather
than EINVAL (the client sent bogus data).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180110230825.18321-4-eblake@redhat.com>
Instead of passing the currently negotiated option and its length to
many of the negotiation functions, let's just store them in the NBDClient
struct as state variables of the negotiation phase.
This unifies the semantics of the negotiation functions and allows
tracking changes of the remaining option length in future patches.
Assert that optlen is back to 0 after negotiation (including
old-style connections which don't negotiate), although we need
more patches before we can assert optlen is 0 between options
during negotiation.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171122101958.17065-2-vsementsov@virtuozzo.com>
[eblake: rebase, commit message tweak, assert !optlen after
negotiation completes]
Signed-off-by: Eric Blake <eblake@redhat.com>
No semantic change, but will make it easier for an upcoming patch
to refactor code without having to add forward declarations. Fix
a poor comment while at it.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20180110230825.18321-2-eblake@redhat.com>
The new IA32_SPEC_CTRL MSR was introduced by a recent Intel
microcode update and can be used by OSes to mitigate
CVE-2017-5715. Unfortunately we can't change the existing CPU
models without breaking existing setups, so users need to
explicitly update their VM configuration to use the new *-IBRS
CPU model if they want to expose IBRS to guests.
The new CPU models are simple copies of the existing CPU models,
with just CPUID_7_0_EDX_SPEC_CTRL added and model_id updated.
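For example, a configuration would switch from -cpu Skylake-Server to
something like this (hypothetical command-line fragment):
-cpu Skylake-Server-IBRS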
Cc: Jiri Denemark <jdenemar@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-6-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
It is valid to have a 48-character model ID on CPUID; however, the
definition of X86CPUDefinition::model_id is char[48], which can
make the compiler drop the null terminator from the string.
If a CPU model happens to have 48 bytes on model_id, "-cpu help"
will print garbage and the object_property_set_str() call at
x86_cpu_load_def() will read data outside the model_id array.
We could increase the array size to 49, but this would mean the
compiler would not issue a warning if a 49-char string is used by
mistake for model_id.
To make things simpler, simply change model_id to be const char*,
and validate the string length using an assert() on
x86_register_cpudef_type().
Reported-by: "Dr. David Alan Gilbert" <dgilbert@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20180109154519.25634-2-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
When overwriting a valid TLB entry with a new one, the previous page
was not flushed from the QEMU TLB, leading to an incoherent mapping. This
commit fixes that.
Signed-off-by: Luc MICHEL <luc.michel@git.antfield.fr>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When skiboot starts, it first clears the CPU structs for all possible
CPUs on a system:
for (i = 0; i <= cpu_max_pir; i++)
memset(&cpu_stacks[i].cpu, 0, sizeof(struct cpu_thread));
On POWER9, cpu_max_pir is quite big, 0x7fff, and the skiboot cpu_stacks
array overlaps with the memory region in which QEMU maps the initramfs
file. Move it upwards in memory to keep it safe.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The XSCOM base address of the core chiplet was wrongly calculated. Use
the OPAL macros to fix that and do a couple of renames.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
These are useful when instantiating device models which are shared
between the POWER8 and the POWER9 processor families.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When addressed by XSCOM, the first core has the 0x20 chiplet ID but
the CPU PIR can start at 0x0.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
commit 1ed9c8af50 ("target/ppc: Add POWER9 DD2.0 model information")
deprecated the POWER9 model v1.0.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
A recent commit introduced the skiboot 5.9 firmware image, which
has a different first line of output.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This is skiboot 5.9 (commit e0ee24c2). It brings improved POWER9
support among many other things. Built from submodule.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
fa98fbfc "PC: KVM: Support machine option to set VSMT mode" introduced the
"vsmt" parameter for the pseries machine type, which controls the spacing
of the vcpu ids of thread 0 for each virtual core. This was done to bring
some consistency and stability to how that was done, while still allowing
backwards compatibility for migration and otherwise.
The default value we used for vsmt was set to the max of the host's
advertised default number of threads and the number of vthreads per vcore
in the guest. This was done to continue running without extra parameters
on older KVM versions which don't allow the VSMT value to be changed.
Unfortunately, even that smaller-than-before leakage of host configuration
into guest-visible configuration still breaks things. Specifically, a guest
with 4 (or fewer) vthreads/vcore will get a different vsmt value when
running on a POWER8 (vsmt==8) and a POWER9 (vsmt==4) host. That means the
vcpu ids don't line up, so you can't migrate between them, though you should
be able to.
Long term we really want to make vsmt == smp_threads for sufficiently
new machine types. However, that means that qemu will then require a
sufficiently recent KVM (one which supports changing VSMT) - that's still
not widely enough deployed to be really comfortable to do.
In the meantime we need some default that will work as often as
possible. This patch changes that default to 8 in all circumstances.
This does change guest visible behaviour (including for existing
machine versions) for many cases - just not the most common/important
case.
Following is a case-by-case justification for why this is still the least
bad option. Note that any of the old behaviours can still be duplicated
after this patch; it just requires manual intervention by setting the
vsmt property on the command line.
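For instance, the spacing previously seen by default on a POWER9 host can
be restored with a fragment like (hypothetical):
-machine pseries,vsmt=4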
KVM HV on POWER8 host:
This is the overwhelmingly common case in production setups, and is
unchanged by design. POWER8 hosts will advertise a default VSMT mode
of 8, and > 8 vthreads/vcore isn't permitted
KVM HV on POWER7 host:
Will break, but POWER7s allowing KVM were never released to the public.
KVM HV on POWER9 host:
Not yet released to the public, breaking this now will reduce other
breakage later.
KVM HV on PowerPC 970:
Will theoretically break it, but it was barely supported to begin with
and already required various user visible hacks to work. Also so old
that I just don't care.
TCG:
This is the nastiest one; it means migration of TCG guests (without
manual vsmt setting) will break. Since TCG is rarely used in production
I think this is worth it for the other benefits. It does also remove
one more barrier to TCG<->KVM migration which could be interesting for
debugging applications.
KVM PR:
As with TCG, this will break migration of existing configurations,
without adding extra manual vsmt options. As with TCG, it is rare in
production so I think the benefits outweigh breakages.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
At present if we require a vsmt mode that's not equal to the kernel's
default, and the kernel doesn't let us change it (e.g. because it's an old
kernel without support) then we always fail.
But in fact we can cope with the kernel having a different vsmt as long as
a) it's >= the actual number of vthreads/vcore (so that guest threads
that are supposed to be on the same core act like it)
b) it's a submultiple of the requested vsmt mode (so that guest threads
spaced by the vsmt value will act like they're on different cores)
Allowing this case gives us a bit more freedom to adjust the vsmt behaviour
without breaking existing cases.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Tested-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
We recently had some discussions that were sidetracked for a while, because
nearly everyone misapprehended the purpose of the 'max_threads' field in
the compatibility modes table. It's all about guest expectations, not host
expectations or support (that's handled elsewhere).
In an attempt to avoid a repeat of that confusion, rename the field to
'max_vthreads' and add an explanatory comment.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
Increases the max SMT mode to 8 for POWER9. That's because KVM supports
SMT emulation on this platform, so QEMU should allow users to use it as
well.
Today, if we try to pass -smp ...,threads=8, QEMU will silently truncate
it to SMT4 mode, which may cause a crash if we try to perform a cpu
hotplug.
Signed-off-by: Jose Ricardo Ziviani <joserz@linux.vnet.ibm.com>
[dwg: Added an explanatory comment]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The options field here is intended to list the available values for the
capability. It's not used yet, because the existing capabilities are
boolean.
We're going to add capabilities that aren't, but in that case the info on
the possible values can be folded into the .description field.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Currently spapr_caps are tied to boolean values (on or off). This patch
reworks the caps so that they can have any uint8 value. This allows more
capabilities with various values to be represented in the same way
internally. Capabilities are numbered in ascending order. The internal
representation of capability values is an array of uint8s in the
sPAPRMachineState, indexed by capability number.
Capabilities can have their own name, description, options, getter and
setter functions, type and allow functions. They also each have their own
section in the migration stream. Capabilities are only migrated if they
were explicitly set on the command line, with the assumption that
otherwise the default will match.
On migration we ensure that the capability value on the destination
is greater than or equal to the capability value from the source. As
long as this remains the case, the migration is considered
compatible and allowed to continue.
This patch implements generic getter and setter functions for boolean
capabilities. It also converts the existing cap-htm, cap-vsx and
cap-dfp capabilities to this new format.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Decimal Floating Point has been available on POWER7 and later (server)
cpus. However, it can be disabled on the hypervisor, meaning that it's
not available to guests.
We currently handle this by conditionally advertising DFP support in the
device tree depending on whether the guest CPU model supports it - which
can also depend on what's allowed in the host for -cpu host. That can lead
to confusion on migration, since host properties are silently affecting
guest visible properties.
This patch handles it by treating it as an optional capability for the
pseries machine type.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
We currently have some conditionals in the spapr device tree code to decide
whether or not to advertise the availability of the VMX (aka Altivec) and
VSX vector extensions to the guest, based on whether the guest cpu has
those features.
This can lead to confusion and subtle failures on migration, since it makes
a guest visible change based only on host capabilities. We now have a
better mechanism for this, in spapr capabilities flags, which explicitly
depend on user options rather than host capabilities.
Rework the advertisement of VSX and VMX based on a new VSX capability. We
no longer bother with a conditional for VMX support, because every CPU
that's ever been supported by the pseries machine type supports VMX.
NOTE: Some userspace distributions (e.g. RHEL7.4) already rely on
availability of VSX in libc, so using cap-vsx=off may lead to a fatal
SIGILL in init.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
When constructing the "host" cpu class we modify whether the VMX and VSX
vector extensions and DFP (Decimal Floating Point) are available
based on whether KVM can support those instructions. This can depend on
policy in the host kernel as well as on the actual host cpu capabilities.
However, the way we probe for this is not very nice: we explicitly check
the host's device tree. That works in practice, but it's not really
correct, since the device tree is a property of the host kernel's platform
which we don't really know about. We get away with it because the only
modern POWER platforms happen to encode VMX, VSX and DFP availability in
the device tree in the same way.
Arguably we should have an explicit KVM capability for this, but we haven't
needed one so far. Barring specific KVM policies which don't yet exist,
each of these instruction classes will be available in the guest if and
only if they're available in the qemu userspace process. We can determine
that from the ELF AUX vector we're supplied with.
Once reworked like this, there are no more callers for kvmppc_get_vmx() and
kvmppc_get_dfp() so remove them.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Now that the "pseries" machine type implements optional capabilities (well,
one so far) there's the possibility of having different capabilities
available at either end of a migration. Although arguably a user error,
it would be nice to catch this situation and fail as gracefully as we can.
This adds code to migrate the capabilities flags. These aren't pulled
directly into the destination's configuration since what the user has
specified on the destination command line should take precedence. However,
they are checked against the destination capabilities.
If the source was using a capability which is absent on the destination,
we fail the migration, since that could easily cause a guest crash or other
bad behaviour. If the source lacked a capability which is present on the
destination we warn, but allow the migration to proceed.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
This adds an spapr capability bit for Hardware Transactional Memory. It is
enabled by default for pseries-2.11 and earlier machine types, with POWER8
or later CPUs (as it must be, since earlier qemu versions would implicitly
allow it). However, it is disabled by default for the latest pseries-2.12
machine type.
This means that with the latest machine type, HTM will not be available,
regardless of CPU, unless it is explicitly enabled on the command line.
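With the new machine type, something like this (hypothetical command-line
fragment) is needed to get HTM back:
-machine pseries-2.12,cap-htm=on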
That change is made on the basis that:
* This way running with -M pseries,accel=tcg will start with whatever cpu
and will provide the same guest visible model as with accel=kvm.
- More specifically, this means existing make check tests don't have
to be modified to use cap-htm=off in order to run with TCG
* We hope to add a new "HTM without suspend" feature in the not too
distant future which could work on both POWER8 and POWER9 cpus, and
could be enabled by default.
* Best guesses suggest that future POWER cpus may well only support the
HTM-without-suspend model, not the (frankly, horribly overcomplicated)
POWER8 style HTM with suspend.
* Anecdotal evidence suggests that problems with HTM being enabled when it
wasn't wanted are more common than problems with it being missing when it was wanted.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Because PAPR is a paravirtual environment, access to certain CPU (or other)
facilities can be blocked by the hypervisor. PAPR provides ways to
advertise in the device tree whether or not those features are available to
the guest.
In some places we automatically determine whether to make a feature
available based on whether our host can support it; in most cases this is
based on limitations in the available KVM implementation.
Although we correctly advertise this to the guest, it means that host
factors might change the guest-visible environment, which is bad:
as well as generally reducing reproducibility, it means that a migration
between different host environments can easily go bad.
We've mostly gotten away with it because the environments considered mature
enough to be well supported (basically, KVM on POWER8) have had consistent
feature availability. But it's still not right, and some limitations on
POWER9 are going to make it more of an issue in the future.
This introduces an infrastructure for defining "sPAPR capabilities". These
are set by default based on the machine version, masked by the capabilities
of the chosen cpu, but can be overridden with machine properties.
The intention is that at reset time we verify that the requested capabilities
can be supported on the host (considering TCG, KVM and/or host cpu
limitations). If not, we simply fail, rather than silently modifying the
advertised featureset to the guest.
This does mean that certain configurations that "worked" may now fail, but
such configurations were already more subtly broken.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
As stated in the 1ad9f0a464 commit log, the returned entries are not
a whole PTEG. This was not a problem before 1ad9f0a464, as the code would read
a single record and assume it contained a whole PTEG; now the code tries
to read the entire PTEG, and "if ((n - i) < invalid)" produces negative
values which are then converted to size_t for memset(), causing a
segmentation fault.
This fixes the math.
While here, fix the last @i increment as well.
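The underlying pitfall, in isolation (a standalone illustration, not the kvmppc code itself):

    #include <stdio.h>
    #include <string.h>

    int main(void)
    {
        char buf[16];
        int n = 2, i = 0, invalid = 5;
        int len = n - i - invalid;                /* negative: -3 */

        printf("as size_t: %zu\n", (size_t)len);  /* wraps to a huge value */
        /* memset(buf, 0, len); would therefore write far past 'buf'
           and fault, which is the crash described above. */
        (void)buf;
        return 0;
    }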
Fixes: 1ad9f0a464 "target/ppc: Fix KVM-HV HPTE accessors"
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We recently relaxed the limit of the number of opcodes that can
appear in a TranslationBlock. In certain cases this has resulted
in relocation overflow.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The code sequence we were generating was only good for unsigned
comparisons. For signed comparisons, use the sequence from gcc.
Fixes booting of ppc64 firmware, with a patch changing the code
sequence for ppc comparisons.
Tested-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
target-arm queue:
* SDHCI: cleanups and minor bug fixes
* target/arm: minor refactor preparatory to fp16 support
* omap_ssd, ssi-sd, pl181, milkymist-memcard: reset the SD
card on controller reset (fixes migration failures)
* target/arm: Handle page table walk load failures correctly
* hw/arm/virt: Add virt-2.12 machine type
* get_phys_addr_pmsav7: Support AP=0b111 for v7M
* hw/intc/armv7m: Support byte and halfword accesses to CFSR
# gpg: Signature made Tue 16 Jan 2018 13:33:31 GMT
# gpg: using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20180116: (24 commits)
sdhci: add a 'dma' property to the sysbus devices
sdhci: fix the PCI device, using the PCI address space for DMA
sdhci: Implement write method of ACMD12ERRSTS register
sdhci: fix CAPAB/MAXCURR registers, both are 64bit and read-only
sdhci: rename the SDHC_CAPAB register
sdhci: move MASK_TRNMOD with other SDHC_TRN* defines in "sd-internal.h"
sdhci: convert the DPRINT() calls into trace events
sdhci: use qemu_log_mask(UNIMP) instead of fprintf()
sdhci: refactor common sysbus/pci unrealize() into sdhci_common_unrealize()
sdhci: refactor common sysbus/pci realize() into sdhci_common_realize()
sdhci: refactor common sysbus/pci class_init() into sdhci_common_class_init()
sdhci: use DEFINE_SDHCI_COMMON_PROPERTIES() for common sysbus/pci properties
sdhci: remove dead code
sdhci: clean up includes
target/arm: Add fp16 support to vfp_expand_imm
target/arm: Split out vfp_expand_imm
hw/sd/omap_mmc: Reset SD card on controller reset
hw/sd/ssi-sd: Reset SD card on controller reset
hw/sd/milkymist-memcard: Reset SD card on controller reset
hw/sd/pl181: Reset SD card on controller reset
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This script allows analysis of mutex acquisition and hold times based
on a trace file. Given a trace control file of:
qemu_mutex_lock
qemu_mutex_locked
qemu_mutex_unlock
And running with:
$QEMU $QEMU_ARGS -trace events=./lock-trace
You can analyse the results with:
./scripts/analyse-locks-simpletrace.py trace-events-all ./trace-21812
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Flushing the TB cache is required because TB keys in the cache may match
different code which existed in the previous state.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Signed-off-by: Maria Klimushenkova <maria.klimushenkova@ispras.ru>
Message-Id: <20180110134846.12940.99993.stgit@pasha-VirtualBox>
[Add comment suggested by Peter Maydell. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
The dirty bitmaps are built from 'long's and there is fast-path code
for synchronising the case where the RAMBlock is aligned to the start
of a long boundary. Align the allocation to this boundary
to cause the fast path to be used.
Offsets before change:
11398@1515169675.018566:find_ram_offset size: 0x1e0000 @ 0x8000000
11398@1515169675.020064:find_ram_offset size: 0x20000 @ 0x81e0000
11398@1515169675.020244:find_ram_offset size: 0x20000 @ 0x8200000
11398@1515169675.024343:find_ram_offset size: 0x1000000 @ 0x8220000
11398@1515169675.025154:find_ram_offset size: 0x10000 @ 0x9220000
11398@1515169675.027682:find_ram_offset size: 0x40000 @ 0x9230000
11398@1515169675.032921:find_ram_offset size: 0x200000 @ 0x9270000
11398@1515169675.033307:find_ram_offset size: 0x1000 @ 0x9470000
11398@1515169675.033601:find_ram_offset size: 0x1000 @ 0x9471000
after change:
10923@1515169108.818245:find_ram_offset size: 0x1e0000 @ 0x8000000
10923@1515169108.819410:find_ram_offset size: 0x20000 @ 0x8200000
10923@1515169108.819587:find_ram_offset size: 0x20000 @ 0x8240000
10923@1515169108.823708:find_ram_offset size: 0x1000000 @ 0x8280000
10923@1515169108.824503:find_ram_offset size: 0x10000 @ 0x9280000
10923@1515169108.827093:find_ram_offset size: 0x40000 @ 0x92c0000
10923@1515169108.833045:find_ram_offset size: 0x200000 @ 0x9300000
10923@1515169108.833504:find_ram_offset size: 0x1000 @ 0x9500000
10923@1515169108.833787:find_ram_offset size: 0x1000 @ 0x9540000
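The rounding itself is straightforward; a hypothetical helper (not the actual find_ram_offset() change) that produces the kind of offsets seen in the second trace would look like:

    #include <stdint.h>

    #define TARGET_PAGE_BITS 12                      /* assumes 4K pages */
    #define BITS_PER_LONG    (8 * sizeof(unsigned long))

    /* Round a candidate RAM offset up so its first page starts on a
       dirty-bitmap 'long' boundary (0x40000 with 64-bit longs). */
    static uint64_t align_for_dirty_bitmap(uint64_t offset)
    {
        uint64_t align = (uint64_t)BITS_PER_LONG << TARGET_PAGE_BITS;
        return (offset + align - 1) & ~(align - 1);
    }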
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20180105170138.23357-3-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This code has an optimised, word aligned version, and a boring
unaligned version. My commit f70d345 fixed one alignment issue, but
there's another.
The optimised version operates on 'longs' dealing with (typically) 64
pages at a time, replacing the whole long by a 0 and counting the bits.
If the RAMBlock is shorter than the 64 pages covered by one long, that long can
contain bits representing two different RAMBlocks, but the code will update the
bmap belonging to the 1st RAMBlock only while having updated the total
dirty page count for both.
This probably didn't matter prior to 6b6712ef, which split the dirty
bitmap by RAMBlock, but now that the bitmaps are per RAMBlock we end up
with a count that doesn't match the state in the bitmaps.
Symptom:
Migration showing a few dirty pages left to be sent constantly
Seen on aarch64 and x86 with x86+ovmf
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reported-by: Wei Huang <wei@redhat.com>
Fixes: 6b6712efcc
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use of a loop construct for code that is not intended to repeat
does not make much idiomatic sense, except in one place: it is a
common usage in macros in order to wrap arbitrary code with
single-statement semantics. But when used in a macro, it is more
typical for the caller to supply the trailing ';' when calling
the macro.
Although qemu coding style frowns on bare:
if (cond)
statement1;
else
statement2;
where extra semicolons actually cause syntax errors, we still
want our macro styles to be easily copied to other projects.
Thus, declare it an error if we encounter any form of 'while (0)'
with a semicolon in the same line.
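For illustration, a macro written in the accepted style leaves the ';' to the call site (this example is made up for this note, not taken from the patch):

    #include <stdio.h>

    /* No semicolon after 'while (0)': the caller supplies it. */
    #define DPRINTF(fmt, ...)                        \
        do {                                         \
            fprintf(stderr, fmt, ## __VA_ARGS__);    \
        } while (0)

    /* At a brace-less call site the macro then behaves like a single
       statement:
           if (verbose)
               DPRINTF("value = %d\n", x);
           else
               do_something_else();
    */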
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171201232433.25193-8-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The point of writing a macro embedded in a 'do { ... } while (0)'
loop (particularly if the macro has multiple statements or would
otherwise end with an 'if' statement) is so that the macro can be
used as a drop-in statement with the caller supplying the
trailing ';'. Although our coding style frowns on brace-less 'if':
if (cond)
statement;
else
something else;
that is the classic case where failure to use do/while(0) wrapping
would cause the 'else' to pair with any embedded 'if' in the macro
rather than the intended outer 'if'. But conversely, if the macro
includes an embedded ';', then the same brace-less coding style
would now have two statements, making the 'else' a syntax error
rather than pairing with the outer 'if'. Thus, even though our
coding style with required braces is not impacted, ending a macro
with ';' makes our code harder to port to projects that use
brace-less styles.
The change should have no semantic impact. I was not able to
fully compile-test all of the changes (as some of them are
examples of the ugly bit-rotting debug print statements that are
completely elided by default, and I didn't want to recompile
with the necessary -D witnesses - cleaning those up is left as a
bite-sized task for another day); I did, however, audit that for
all files touched, all callers of the changed macros DID supply
a trailing ';' at the callsite, and did not appear to be used
as part of a brace-less conditional.
Found mechanically via: $ git grep -B1 'while (0);' | grep -A1 \\\\
Signed-off-by: Eric Blake <eblake@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20171201232433.25193-7-eblake@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use of a do/while(0) loop as a way to allow break statements in
the middle of execute-once code is unusual. More typical is
the use of goto for early exits, with a label at the end of
the execute-once code, rather than nesting code in a scope;
however, the comment at the end of the existing code makes this
alternative a bit impractical.
So, to avoid false positives from a future syntax check about
'while (false);', and to keep the loop form (in case someone
ever does add DONTWAIT support, where they can just as easily
manipulate the initial loop condition or add an if around the
final 'break'), I opted to use the form of a while(1) loop (the
break as an early exit is more idiomatic there), coupled with
a final break preserving the original comment.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171201232433.25193-6-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Use of a do/while(0) control flow in order to permit an early break
is an unusual paradigm, and triggers a false positive with a planned
future syntax check against 'while (0);'. Rewrite the code to use a
goto instead. This patch temporarily keeps an extra level of
indentation to highlight the change; the next patch cleans it up.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171201232433.25193-4-eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is more typical for the caller of a macro to provide the ';'
than to embed it in the macro itself; this is because syntax
highlight engines can get confused if a macro is called without
a semicolon before the closing '}'.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20171201232433.25193-3-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
For a couple of macros in pcnet.c, we have to provide a new scope
to avoid compiler warnings about declarations in the middle of a
switch statement that aren't in a sub-scope. But use of
'do { ... } while (0);' merely to provide that new scope is arcane
overkill, compared to just using '{ ... }'.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20171201232433.25193-2-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Except for round-robin TCG, every other accelerator is using more or
less the same code around qemu_wait_io_event_common. The exception
is HAX, which also has to eat the dummy APC that is queued by
qemu_cpu_kick_thread.
We can add the SleepEx call to qemu_wait_io_event under "if
(!tcg_enabled())", since that is the condition that is used in
qemu_cpu_kick_thread, and unify the function for KVM, HAX, HVF and
multi-threaded TCG. Single-threaded TCG code can also be simplified
since it is only used in the round-robin, sleep-if-all-CPUs-idle case.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch adds saving and restoring of the icount warp
timers in the vmstate.
It is needed because these timers affect the virtual clock value.
Therefore determinism of the execution in icount record/replay mode
depends on determinism of the timers.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Acked-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Pavel Dovgalyuk <Pavel.Dovgaluk@ispras.ru>
This introduces the qemu-gdb command "qemu timers" which will dump the
state of the main timers in the system.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
x86_update_hflags references env->efer, which is updated in hax_get_msrs,
so it has to be called after hax_get_msrs. This fixes the bug where
dump_state sometimes shows 32-bit registers even in 64-bit mode.
Signed-off-by: Tao Wu <lepton@google.com>
Message-Id: <20180110195056.85403-3-lepton@google.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
scsi_write_same_complete() can retry the write if the request was
unaligned. Make sure to release the AioContext when that code path is
taken!
This patch fixes a hang when QEMU terminates after an unaligned WRITE
SAME request has been processed with dataplane. The hang occurs because
iothread_stop_all() cannot acquire the AioContext lock that was leaked
by the IOThread in scsi_write_same_complete().
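Generic illustration of the bug class (using a plain mutex to stand in for the AioContext lock; this is not the scsi-disk code):

    #include <pthread.h>
    #include <stdbool.h>

    static pthread_mutex_t ctx_lock = PTHREAD_MUTEX_INITIALIZER;

    static void write_same_complete(bool unaligned_retry)
    {
        pthread_mutex_lock(&ctx_lock);
        if (unaligned_retry) {
            /* ... resubmit the request ... */
            pthread_mutex_unlock(&ctx_lock);   /* the release that was missing */
            return;
        }
        /* ... normal completion path ... */
        pthread_mutex_unlock(&ctx_lock);
    }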
Fixes: b9e413dd37 ("block: explicitly acquire aiocontext in aio callbacks that need it").
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org
Reported-by: Cong Li <coli@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20180104142502.15175-1-stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Spotted thanks to ASAN:
==25226==ERROR: AddressSanitizer: global-buffer-overflow on address 0x556715a1f120 at pc 0x556714b6f6b1 bp 0x7ffcdfac1360 sp 0x7ffcdfac1350
READ of size 1 at 0x556715a1f120 thread T0
#0 0x556714b6f6b0 in init_disasm /home/elmarco/src/qemu/disas/s390.c:219
#1 0x556714b6fa6a in print_insn_s390 /home/elmarco/src/qemu/disas/s390.c:294
#2 0x55671484d031 in monitor_disas /home/elmarco/src/qemu/disas.c:635
#3 0x556714862ec0 in memory_dump /home/elmarco/src/qemu/monitor.c:1324
#4 0x55671486342a in hmp_memory_dump /home/elmarco/src/qemu/monitor.c:1418
#5 0x5567148670be in handle_hmp_command /home/elmarco/src/qemu/monitor.c:3109
#6 0x5567148674ed in qmp_human_monitor_command /home/elmarco/src/qemu/monitor.c:613
#7 0x556714b00918 in qmp_marshal_human_monitor_command /home/elmarco/src/qemu/build/qmp-marshal.c:1704
#8 0x556715138a3e in do_qmp_dispatch /home/elmarco/src/qemu/qapi/qmp-dispatch.c:104
#9 0x556715138f83 in qmp_dispatch /home/elmarco/src/qemu/qapi/qmp-dispatch.c:131
#10 0x55671485cf88 in handle_qmp_command /home/elmarco/src/qemu/monitor.c:3839
#11 0x55671514e80b in json_message_process_token /home/elmarco/src/qemu/qobject/json-streamer.c:105
#12 0x5567151bf2dc in json_lexer_feed_char /home/elmarco/src/qemu/qobject/json-lexer.c:323
#13 0x5567151bf827 in json_lexer_feed /home/elmarco/src/qemu/qobject/json-lexer.c:373
#14 0x55671514ee62 in json_message_parser_feed /home/elmarco/src/qemu/qobject/json-streamer.c:124
#15 0x556714854b1f in monitor_qmp_read /home/elmarco/src/qemu/monitor.c:3881
#16 0x556715045440 in qemu_chr_be_write_impl /home/elmarco/src/qemu/chardev/char.c:172
#17 0x556715047184 in qemu_chr_be_write /home/elmarco/src/qemu/chardev/char.c:184
#18 0x55671505a8e6 in tcp_chr_read /home/elmarco/src/qemu/chardev/char-socket.c:440
#19 0x5567150943c3 in qio_channel_fd_source_dispatch /home/elmarco/src/qemu/io/channel-watch.c:84
#20 0x7fb90292b90b in g_main_dispatch ../glib/gmain.c:3182
#21 0x7fb90292c7ac in g_main_context_dispatch ../glib/gmain.c:3847
#22 0x556715162eca in glib_pollfds_poll /home/elmarco/src/qemu/util/main-loop.c:214
#23 0x556715163001 in os_host_main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:261
#24 0x5567151631fa in main_loop_wait /home/elmarco/src/qemu/util/main-loop.c:515
#25 0x556714ad6d3b in main_loop /home/elmarco/src/qemu/vl.c:1950
#26 0x556714ade329 in main /home/elmarco/src/qemu/vl.c:4865
#27 0x7fb8fe5c9009 in __libc_start_main (/lib64/libc.so.6+0x21009)
#28 0x5567147af4d9 in _start (/home/elmarco/src/qemu/build/s390x-softmmu/qemu-system-s390x+0xf674d9)
0x556715a1f120 is located 32 bytes to the left of global variable 'char_hci_type_info' defined in '/home/elmarco/src/qemu/hw/bt/hci-csr.c:493:23' (0x556715a1f140) of size 104
0x556715a1f120 is located 8 bytes to the right of global variable 's390_opcodes' defined in '/home/elmarco/src/qemu/disas/s390.c:860:33' (0x556715a15280) of size 40600
This fix is based on Andreas Arnez <arnez@linux.vnet.ibm.com> upstream
commit:
https://sourceware.org/git/gitweb.cgi?p=binutils-gdb.git;a=commitdiff;h=9ace48f3d7d80ce09c5df60cccb433470410b11b
2014-08-19 Andreas Arnez <arnez@linux.vnet.ibm.com>
* s390-dis.c (init_disasm): Simplify initialization of
opc_index[]. This also fixes an access after the last element
of s390_opcodes[].
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180104160523.22995-19-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The coroutine is not finished by the time the test ends, resulting in
ASAN warning:
==7005==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 312 byte(s) in 1 object(s) allocated from:
#0 0x7fd35290fa38 in __interceptor_calloc (/lib64/libasan.so.4+0xdea38)
#1 0x7fd3506c5f75 in g_malloc0 ../glib/gmem.c:124
#2 0x55994af03e47 in qemu_coroutine_new /home/elmarco/src/qemu/util/coroutine-ucontext.c:144
#3 0x55994aefed99 in qemu_coroutine_create /home/elmarco/src/qemu/util/qemu-coroutine.c:76
#4 0x55994ac1eb50 in verify_entered_step_1 /home/elmarco/src/qemu/tests/test-coroutine.c:80
#5 0x55994af03c75 in coroutine_trampoline /home/elmarco/src/qemu/util/coroutine-ucontext.c:119
#6 0x7fd34ec02bef (/lib64/libc.so.6+0x50bef)
Do not yield() to let the coroutine terminate.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20180104160523.22995-17-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Direct leak of 160 byte(s) in 4 object(s) allocated from:
#0 0x55ed7678cda8 in calloc (/home/elmarco/src/qq/build/x86_64-softmmu/qemu-system-x86_64+0x797da8)
#1 0x7f3f5e725f75 in g_malloc0 /home/elmarco/src/gnome/glib/builddir/../glib/gmem.c:124
#2 0x55ed778aa3a7 in query_option_descs /home/elmarco/src/qq/util/qemu-config.c:60:16
#3 0x55ed778aa307 in get_drive_infolist /home/elmarco/src/qq/util/qemu-config.c:140:19
#4 0x55ed778a9f40 in qmp_query_command_line_options /home/elmarco/src/qq/util/qemu-config.c:254:36
#5 0x55ed76d4868c in qmp_marshal_query_command_line_options /home/elmarco/src/qq/build/qmp-marshal.c:3078:14
#6 0x55ed77855dd5 in do_qmp_dispatch /home/elmarco/src/qq/qapi/qmp-dispatch.c:104:5
#7 0x55ed778558cc in qmp_dispatch /home/elmarco/src/qq/qapi/qmp-dispatch.c:131:11
#8 0x55ed768b592f in handle_qmp_command /home/elmarco/src/qq/monitor.c:3840:11
#9 0x55ed7786ccfe in json_message_process_token /home/elmarco/src/qq/qobject/json-streamer.c:105:5
#10 0x55ed778fe37c in json_lexer_feed_char /home/elmarco/src/qq/qobject/json-lexer.c:323:13
#11 0x55ed778fdde6 in json_lexer_feed /home/elmarco/src/qq/qobject/json-lexer.c:373:15
#12 0x55ed7786cd83 in json_message_parser_feed /home/elmarco/src/qq/qobject/json-streamer.c:124:12
#13 0x55ed768b559e in monitor_qmp_read /home/elmarco/src/qq/monitor.c:3882:5
#14 0x55ed77714f29 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:167:9
#15 0x55ed77714fde in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:179:9
#16 0x55ed7772ffad in tcp_chr_read /home/elmarco/src/qq/chardev/char-socket.c:440:13
#17 0x55ed7777113b in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84:12
#18 0x7f3f5e71d90b in g_main_dispatch /home/elmarco/src/gnome/glib/builddir/../glib/gmain.c:3182
#19 0x7f3f5e71e7ac in g_main_context_dispatch /home/elmarco/src/gnome/glib/builddir/../glib/gmain.c:3847
#20 0x55ed77886ffc in glib_pollfds_poll /home/elmarco/src/qq/util/main-loop.c:214:9
#21 0x55ed778865fd in os_host_main_loop_wait /home/elmarco/src/qq/util/main-loop.c:261:5
#22 0x55ed77886222 in main_loop_wait /home/elmarco/src/qq/util/main-loop.c:515:11
#23 0x55ed76d2a4df in main_loop /home/elmarco/src/qq/vl.c:1995:9
#24 0x55ed76d1cb4a in main /home/elmarco/src/qq/vl.c:4914:5
#25 0x7f3f555f6039 in __libc_start_main (/lib64/libc.so.6+0x21039)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180104160523.22995-14-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Fixes leaks such as:
Direct leak of 2 byte(s) in 1 object(s) allocated from:
#0 0x7eff58beb850 in malloc (/lib64/libasan.so.4+0xde850)
#1 0x7eff57942f0c in g_malloc ../glib/gmem.c:94
#2 0x7eff579431cf in g_malloc_n ../glib/gmem.c:331
#3 0x7eff5795f6eb in g_strdup ../glib/gstrfuncs.c:363
#4 0x55db720f1d46 in readline_hist_add /home/elmarco/src/qq/util/readline.c:258
#5 0x55db720f2d34 in readline_handle_byte /home/elmarco/src/qq/util/readline.c:387
#6 0x55db71539d00 in monitor_read /home/elmarco/src/qq/monitor.c:3896
#7 0x55db71f9be35 in qemu_chr_be_write_impl /home/elmarco/src/qq/chardev/char.c:167
#8 0x55db71f9bed3 in qemu_chr_be_write /home/elmarco/src/qq/chardev/char.c:179
#9 0x55db71fa013c in fd_chr_read /home/elmarco/src/qq/chardev/char-fd.c:66
#10 0x55db71fe18a8 in qio_channel_fd_source_dispatch /home/elmarco/src/qq/io/channel-watch.c:84
#11 0x7eff5793a90b in g_main_dispatch ../glib/gmain.c:3182
#12 0x7eff5793b7ac in g_main_context_dispatch ../glib/gmain.c:3847
#13 0x55db720af3bd in glib_pollfds_poll /home/elmarco/src/qq/util/main-loop.c:214
#14 0x55db720af505 in os_host_main_loop_wait /home/elmarco/src/qq/util/main-loop.c:261
#15 0x55db720af6d6 in main_loop_wait /home/elmarco/src/qq/util/main-loop.c:515
#16 0x55db7184e0de in main_loop /home/elmarco/src/qq/vl.c:1995
#17 0x55db7185e956 in main /home/elmarco/src/qq/vl.c:4914
#18 0x7eff4ea17039 in __libc_start_main (/lib64/libc.so.6+0x21039)
(while at it, use g_new0(ReadLineState), it's a bit easier to read)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180104160523.22995-11-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Note that data_dir[] will now point to allocated strings.
Fixes:
Direct leak of 16 byte(s) in 1 object(s) allocated from:
#0 0x7f1448181850 in malloc (/lib64/libasan.so.4+0xde850)
#1 0x7f1446ed8f0c in g_malloc ../glib/gmem.c:94
#2 0x7f1446ed91cf in g_malloc_n ../glib/gmem.c:331
#3 0x7f1446ef739a in g_strsplit ../glib/gstrfuncs.c:2364
#4 0x55cf276439d7 in main /home/elmarco/src/qq/vl.c:4311
#5 0x7f143dfad039 in __libc_start_main (/lib64/libc.so.6+0x21039)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20180104160523.22995-10-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
/public/qobject_is_equal_conversion: OK
=================================================================
==14396==ERROR: LeakSanitizer: detected memory leaks
Direct leak of 56 byte(s) in 1 object(s) allocated from:
#0 0x7f07682c5850 in malloc (/lib64/libasan.so.4+0xde850)
#1 0x7f0767d12f0c in g_malloc ../glib/gmem.c:94
#2 0x7f0767d131cf in g_malloc_n ../glib/gmem.c:331
#3 0x562bd767371f in do_test_equality /home/elmarco/src/qq/tests/check-qobject.c:49
#4 0x562bd7674a35 in qobject_is_equal_dict_test /home/elmarco/src/qq/tests/check-qobject.c:267
#5 0x7f0767d37b04 in test_case_run ../glib/gtestutils.c:2237
#6 0x7f0767d37ec4 in g_test_run_suite_internal ../glib/gtestutils.c:2321
#7 0x7f0767d37f6d in g_test_run_suite_internal ../glib/gtestutils.c:2333
#8 0x7f0767d38184 in g_test_run_suite ../glib/gtestutils.c:2408
#9 0x7f0767d36e0d in g_test_run ../glib/gtestutils.c:1674
#10 0x562bd7674e75 in main /home/elmarco/src/qq/tests/check-qobject.c:327
#11 0x7f0766009039 in __libc_start_main (/lib64/libc.so.6+0x21039)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20180104160523.22995-9-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
zero-initialize ADMADescr 'dscr' in sdhci_do_adma() to avoid:
hw/sd/sdhci.c: In function ‘sdhci_do_adma’:
hw/sd/sdhci.c:714:29: error: ‘dscr.addr’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
trace_sdhci_adma("link", s->admasysaddr);
^
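One way to silence such a warning (an assumption about the shape of the fix, not a quote of it) is to zero the descriptor at its declaration:

    ADMADescr dscr = {0};   /* or memset(&dscr, 0, sizeof(dscr)); */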
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Acked-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 20180115182436.2066-9-f4bug@amsat.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Since omap_mmc is still using the legacy SD card API, the SD
card created by sd_init() is not plugged into any bus. This
means that the controller has to reset it manually.
Failing to do this mostly didn't affect the guest since the
guest typically does a programmed SD card reset as part of
its SD controller driver initialization, but would mean that
migration fails because it's only in sd_reset() that we
set up the wpgrps_size field.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1515506513-31961-5-git-send-email-peter.maydell@linaro.org
Since ssi-sd is still using the legacy SD card API, the SD
card created by sd_init() is not plugged into any bus. This
means that the controller has to reset it manually.
Failing to do this mostly didn't affect the guest since the
guest typically does a programmed SD card reset as part of
its SD controller driver initialization, but meant that
migration failed because it's only in sd_reset() that we
set up the wpgrps_size field.
In the case of ssi-sd, we have to implement an entire
reset function since there wasn't one previously, and
that requires a QOM cast macro that got omitted when this
device was QOMified.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1515506513-31961-4-git-send-email-peter.maydell@linaro.org
Since milkymist-memcard is still using the legacy SD card API,
the SD card created by sd_init() is not plugged into any bus.
This means that the controller has to reset it manually.
Failing to do this mostly didn't affect the guest since the
guest typically does a programmed SD card reset as part of
its SD controller driver initialization, but meant that
migration failed because it's only in sd_reset() that we
set up the wpgrps_size field.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1515506513-31961-3-git-send-email-peter.maydell@linaro.org
Since pl181 is still using the legacy SD card API, the SD
card created by sd_init() is not plugged into any bus. This
means that the controller has to reset it manually.
Failing to do this mostly didn't affect the guest since the
guest typically does a programmed SD card reset as part of
its SD controller driver initialization, but meant that
migration failed because it's only in sd_reset() that we
set up the wpgrps_size field.
Cc: qemu-stable@nongnu.org
Fixes: https://bugs.launchpad.net/qemu/+bug/1739378
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1515506513-31961-2-git-send-email-peter.maydell@linaro.org
Instead of ignoring the response from address_space_ld*()
(indicating an attempt to read a page table descriptor from
an invalid physical address), use it to report the failure
correctly.
Since this is another couple of locations where we need to
decide the value of the ARMMMUFaultInfo ea bit based on a
MemTxResult, we factor out that operation into a helper
function.
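A sketch of such a helper (the name and exact classification are assumptions, not quoted from the patch; MemTxResult and MEMTX_DECODE_ERROR come from exec/memattrs.h):

    #include <stdbool.h>
    #include "exec/memattrs.h"

    /* Derive the ARMMMUFaultInfo 'ea' bit from the bus-level result of
       the descriptor load; decode errors and slave errors are given
       different IMPDEF external-abort classes. */
    static bool extabort_type_from_result(MemTxResult result)
    {
        return result != MEMTX_DECODE_ERROR;
    }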
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For PMSAv7, the v7A/R Arm ARM defines that setting AP to 0b111
is an UNPREDICTABLE reserved combination. However, for v7M
this value is documented as having the same behaviour as 0b110:
read-only for both privileged and unprivileged. Accept this
value on an M profile core rather than treating it as a guest
error and a no-access page.
Reported-by: Andy Gross <andy.gross@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1512742402-31669-1-git-send-email-peter.maydell@linaro.org
migration/next for 20180115
# gpg: Signature made Mon 15 Jan 2018 11:51:00 GMT
# gpg: using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg: aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* remotes/juanquintela/tags/migration/20180115: (27 commits)
migration: remove notify in fd_error
migration: remove some block_cleanup_parameters()
migration: put the finish part into a new function
migration: major cleanup for migrate iterations
migration: cleanup stats update into function
migration: use switch at the end of migration
migration: introduce migrate_calculate_complete
migration: introduce downtime_start
migration: move vm_old_running into global state
migration: split use of MigrationState.total_time
migration: remove "enable_colo" var
migration: qemu_savevm_state_cleanup() in cleanup
migration: assert colo instead of check
migration: finalize current_migration object
migration: Guard ram_bytes_remaining against early call
migration: add postcopy total blocktime into query-migrate
migration: add blocktime calculation into migration-test
migration: postcopy_blocktime documentation
migration: calculate vCPU blocktime on dst side
migration: add postcopy blocktime ctx into MigrationIncomingState
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Keeping the one in migrate_fd_cleanup() is enough. Remove the other
two.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This patch only moves the last part of migration_thread() into a new
function, migration_iteration_finish(), to make it much shorter. With
the previous work removing some local variables, this is now fairly easy
to do.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The major work of the migration iterations is to move RAM/block/... data
via qemu_savevm_state_iterate(). Generalize that part into a single
function.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
We have quite a few lines in migration_thread() that calculate some
statistics for the migration iterations. Isolate them into a single
function to improve readability.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
It converts the old if clauses into a switch that explicitly mentions the
possible migration states. The old nested "if"s were not clear about what
we do in the different states.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Generalize the calculation part performed when migration completes into a
function to simplify migration_thread().
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Introduce MigrationState.downtime_start to replace the local variable
"start_time" in migration_thread to avoid passing things around.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Firstly, it was passed around. Let's just move it into MigrationState,
just like many other migration state variables, renaming it to
vm_was_running.
One thing to mention is that for postcopy we actually don't need this
knowledge at all, since postcopy can't resume a VM even if it fails (we
can see that from the old code too: when we try to resume we also check
the "entered_postcopy" variable). So further we do this:
- in postcopy_start(), we don't update vm_old_running since it is useless there
- in migration_thread(), we don't need to check entered_postcopy when
resuming, since it's only used for precopy.
This is also noted in a comment at the variable definition.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
It was used either to:
1. store the initial timestamp of the migration start, and
2. store the total time used by the last migration
Let's provide a separate field for each of them. Mixed use of the two is
slightly misleading.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Move all existing callers into migrate_fd_cleanup(). It simplifies
migration_thread() a bit.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
When reaching here, if we are still "active" it means we must be in the COLO
state. After a quick discussion offlist, we decided to use the safer
error_report().
Finally, I want to use "switch" here rather than lots of complicated if
clauses.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
current_migration has .instance_finalize callback, but it is not
called, because nobody unrefs current_migration. Fix that.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Calling ram_bytes_remaining during the early part of setup is unsafe
because the ram_state isn't yet initialised.
This can happen in the sequence:
migrate
migrate_cancel
info migrate
if the migrate sticks trying to connect (e.g. to an unresponsive
destination due to the connect timeout). Here 'info migrate' sees
a state of CANCELLING and so assumes the migrate has partially happened.
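The guard itself is small; a sketch (field names inferred from context, not quoted from the patch):

    uint64_t ram_bytes_remaining(void)
    {
        /* ram_state is only allocated once setup has progressed far
           enough, so report 0 rather than dereferencing NULL. */
        return ram_state ? ram_state->migration_dirty_pages * TARGET_PAGE_SIZE
                         : 0;
    }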
partial fix for:
RH bz: https://bugzilla.redhat.com/show_bug.cgi?id=1525899
Reported-by: Xianxian Wang <xianwang@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Postcopy total blocktime is available on the destination side only,
but query-migrate was possible only for the source. This patch
adds the ability to call query-migrate on the destination.
To be able to see postcopy blocktime, you need to request the
postcopy-blocktime capability.
The query-migrate command will show the following sample result:
{"return": {
    "postcopy-vcpu-blocktime": [115, 100],
    "status": "completed",
    "postcopy-blocktime": 100
}}
postcopy-vcpu-blocktime contains a list, where the first item corresponds to
the first vCPU in QEMU.
This patch has a drawback: it combines the states of incoming and
outgoing migration. The ongoing migration state will overwrite the incoming
state. It would probably be better to separate query-migrate for incoming and
outgoing migration, or to add a parameter to indicate the type of migration.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This patch just requests blocktime calculation,
and checks it in the case when the UFFD_FEATURE_THREAD_ID feature is set
on the host.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This patch provides blocktime calculation per vCPU,
as a summary and as an overlapped value for all vCPUs.
This approach was suggested by Peter Xu as an improvement over the
previous approach, where QEMU kept a tree with the faulted page address and a
cpus bitmask in it. Now QEMU keeps an array with the faulted page address as
value and the vCPU as index. It helps to find the proper vCPU at UFFD_COPY
time. It also keeps a list of blocktime per vCPU (which can be traced with
page_fault_addr).
Blocktime will not be calculated if the postcopy_blocktime field of
MigrationIncomingState wasn't initialized.
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
This patch adds a request to kernel space for UFFD_FEATURE_THREAD_ID, in
case this feature is provided by the kernel.
PostcopyBlocktimeContext is encapsulated inside postcopy-ram.c,
due to it being a postcopy-only feature.
It also defines the lifetime of the PostcopyBlocktimeContext instance.
Information from the PostcopyBlocktimeContext instance will be provided
long after postcopy migration ends; the instance of PostcopyBlocktimeContext
will live until QEMU exits, but the parts of it (vcpu_addr,
page_fault_vcpu_time) used only during calculation will be released
when postcopy has ended or failed.
To enable postcopy blocktime calculation on the destination, you need to
request the proper capability (the patch for documentation is at the
tail of the patch set).
As an example, the following command enables that capability, assuming QEMU
was started with the
-chardev socket,id=charmonitor,path=/var/lib/migrate-vm-monitor.sock
option to control it:
[root@host]#printf "{\"execute\" : \"qmp_capabilities\"}\r\n \
{\"execute\": \"migrate-set-capabilities\" , \"arguments\": {
\"capabilities\": [ { \"capability\": \"postcopy-blocktime\", \"state\":
true } ] } }" | nc -U /var/lib/migrate-vm-monitor.sock
Or just with HMP
(qemu) migrate_set_capability postcopy-blocktime on
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Right now it can be used on the destination side to
enable vCPU blocktime calculation for postcopy live migration.
vCPU blocktime is the time from when a vCPU thread was put into
interruptible sleep until the memory page was copied and the thread woke up.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Since commit 3a38429748 ("Add a "no HPT" encoding to HTAB migration stream")
the HTAB migration stream contains a header set to "-1", meaning there
is no HPT. Teach analyze-migration.py to ignore the section in this case.
Without this fix, the script fails with a dump from a POWER9 guest:
Traceback (most recent call last):
File "./qemu/scripts/analyze-migration.py", line 602, in <module>
dump.read(dump_memory = args.memory)
File "./qemu/scripts/analyze-migration.py", line 539, in read
section.read()
File "./qemu/scripts/analyze-migration.py", line 250, in read
self.file.readvar(n_valid * self.HASH_PTE_SIZE_64)
File "./qemu/scripts/analyze-migration.py", line 64, in readvar
raise Exception("Unexpected end of %s at 0x%x" % (self.filename, self.file.tell()))
Exception: Unexpected end of migrate.dump at 0x1d4763ba
Fixes: 3a38429748 ("Add a "no HPT" encoding to HTAB migration stream")
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Otherwise, we can't use it after calling socket_start_incoming_migration
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
We use int for everything (int64_t), and then we check that the value is
between 0 and 255. Change it to the valid types.
This change only affects HMP. QMP always uses bytes and similar.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Host: Mac OS 10.12.5
Compiler: Apple LLVM version 8.1.0 (clang-802.0.42)
slirp/ip6_icmp.c:80:38: warning: taking address of packed member 'ip_src' of class or
structure 'ip6' may result in an unaligned pointer value
[-Waddress-of-packed-member]
IN6_IS_ADDR_UNSPECIFIED(&ip->ip_src)) {
^~~~~~~~~~
/usr/include/netinet6/in6.h:238:42: note: expanded from macro 'IN6_IS_ADDR_UNSPECIFIED'
((*(const __uint32_t *)(const void *)(&(a)->s6_addr[0]) == 0) && \
^
Reported-by: John Arbuckle <programmingkidx@gmail.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
sdl2: bugfixes.
spice: cleanups.
input: mem leak fix.
gtk: deprecate 2.x support.
# gpg: Signature made Fri 12 Jan 2018 14:52:39 GMT
# gpg: using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138
* remotes/kraxel/tags/ui-20180112-pull-request:
sdl2: Ignore UI hotkeys after a focus change when GUI modifier is held
sdl2 uses surface relative coordinates
sdl2: Do not hide the cursor on auxilliary windows
spice: remove unused timer list
spice: remove only written event_mask field
spice: remove unused watch list
spice: remove QXLWorker interface field
ui: deprecate use of GTK 2.x in favour of 3.x series
input: fix memory leak
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
vnc: limit memory usage (CVE-2017-15124)
# gpg: Signature made Fri 12 Jan 2018 12:57:22 GMT
# gpg: using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138
* remotes/kraxel/tags/vnc-20180112-pull-request:
ui: fix misleading comments & return types of VNC I/O helper methods
ui: add trace events related to VNC client throttling
ui: place a hard cap on VNC server output buffer size
ui: fix VNC client throttling when forced update is requested
ui: fix VNC client throttling when audio capture is active
ui: refactor code for determining if an update should be sent to the client
ui: correctly reset framebuffer update state after processing dirty regions
ui: introduce enum to track VNC client framebuffer update request state
ui: track how much decoded data we consumed when doing SASL encoding
ui: avoid pointless VNC updates if framebuffer isn't dirty
ui: remove redundant indentation in vnc_client_update
ui: remove unreachable code in vnc_update_client
ui: remove 'sync' parameter from vnc_update_client
vnc: fix debug spelling
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When SDL2 windows change focus while a key is held, the window that
receives the focus also receives a new KeyDown event, without an
autorepeat flag. This means that if a WM places the qemu console
over the main window after Ctrl-Alt-2, the console closes immediately
after opening. Then, the main window receives the KeyDown event again
and the whole process repeats.
This patch makes the SDL2 UI ignore the KeyDown events on a window that
just received the focus, if the GUI modifier was held. The ignore flag
is reset on a first KeyUp event. This effectively works around the issue
above.
Signed-off-by: Jindrich Makovicka <makovick@gmail.com>
Message-Id: <20171117112258.5888-4-makovick@gmail.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Some older versions of gcc complain if a typedef is defined twice:
target/xtensa/translate.c:81: error: redefinition of typedef 'DisasContext'
target/xtensa/cpu.h:339: note: previous declaration of 'DisasContext' was here
Remove the now-redundant typedef from the definition of the struct in
translate.c.
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1515762528-22818-1-git-send-email-peter.maydell@linaro.org
While the QIOChannel APIs for reading/writing data return ssize_t, with negative
value indicating an error, the VNC code passes this return value through the
vnc_client_io_error() method. This detects the error condition, disconnects the
client and returns 0 to indicate error. Thus all the VNC helper methods should
return size_t (unsigned), and misleading comments which refer to the possibility
of negative return values need fixing.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-14-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The previous patches fix problems with throttling of forced framebuffer updates
and audio data capture that would cause the QEMU output buffer size to grow
without bound. Those fixes are graceful in that once the client catches up with
reading data from the server, everything continues operating normally.
There is some data which the server sends to the client that is impractical to
throttle. Specifically there are various pseudo framebuffer update encodings to
inform the client of things like desktop resizes, pointer changes, audio
playback start/stop, LED state and so on. These generally only involve sending
a very small amount of data to the client, but a malicious guest might be able
to do things that trigger these changes at a very high rate. Throttling them is
not practical as missed or delayed events would cause broken behaviour for the
client.
This patch thus takes a more forceful approach of setting an absolute upper
bound on the amount of data we permit to be present in the output buffer at
any time. The previous patch set a threshold for throttling the output buffer
by allowing an amount of data equivalent to one complete framebuffer update and
one second's worth of audio data. On top of this it allowed for one further
forced framebuffer update to be queued.
To be conservative, we thus take that throttling threshold and multiply it by
5 to form an absolute upper bound. If this bound is hit during vnc_write() we
forcibly disconnect the client, refusing to queue further data. This limit is
high enough that it should never be hit unless a malicious client is trying to
exploit the server, or the network is completely saturated, preventing any sending
of data on the socket.
This completes the fix for CVE-2017-15124 started in the previous patches.
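Schematically, the cap amounts to an extra check at the head of vnc_write() (identifiers below are assumptions based on this description, not the literal patch):

    void vnc_write(VncState *vs, const void *data, size_t len)
    {
        if (vs->disconnecting) {
            return;
        }
        /* Hard cap: 5x the normal throttling threshold. */
        if (vs->throttle_output_offset != 0 &&
            vs->output.offset > vs->throttle_output_offset * 5) {
            vnc_disconnect_start(vs);   /* forcibly drop the client */
            return;
        }
        buffer_reserve(&vs->output, len);
        buffer_append(&vs->output, data, len);
    }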
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-12-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The VNC server must throttle data sent to the client to prevent the 'output'
buffer size growing without bound, if the client stops reading data off the
socket (either maliciously or due to stalled/slow network connection).
The current throttling is very crude because it simply checks whether the
output buffer offset is zero. This check is disabled if the client has requested
a forced update, because we want to send these as soon as possible.
As a result, the VNC client can cause QEMU to allocate arbitrary amounts of RAM.
They can first start something in the guest that triggers lots of framebuffer
updates eg play a youtube video. Then repeatedly send full framebuffer update
requests, but never read data back from the server. This can easily make QEMU's
VNC server send buffer consume 100MB of RAM per second, until the OOM killer
starts reaping processes (hopefully the rogue QEMU process, but it might pick
others...).
To address this we make the throttling more intelligent, so we can throttle
full updates. When we get a forced update request, we keep track of exactly how
much data we put on the output buffer. We will not process a subsequent forced
update request until this data has been fully sent on the wire. We always allow
one forced update request to be in flight, regardless of what data is queued
for incremental updates or audio data. The slight complication is that we do
not initially know how much data an update will send, as this is done in the
background by the VNC job thread. So we must track the fact that the job thread
has an update pending, and not process any further updates until this job
has been completed and has put data on the output buffer.
This unbounded memory growth affects all VNC server configurations supported by
QEMU, with no workaround possible. The mitigating factor is that it can only be
triggered by a client that has authenticated with the VNC server, and who is
able to trigger a large quantity of framebuffer updates or audio samples from
the guest OS. Mostly they'll just succeed in getting the OOM killer to kill
their own QEMU process, but it's possible other processes can get taken out as
collateral damage.
This is a more general variant of the similar unbounded memory usage flaw in
the websockets server, that was previously assigned CVE-2017-15268, and fixed
in 2.11 by:
commit a7b20a8efa
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Mon Oct 9 14:43:42 2017 +0100
io: monitor encoutput buffer size from websocket GSource
This new general memory usage flaw has been assigned CVE-2017-15124, and is
partially fixed by this patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-11-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The VNC server must throttle data sent to the client to prevent the 'output'
buffer size growing without bound, if the client stops reading data off the
socket (either maliciously or due to stalled/slow network connection).
The current throttling is very crude because it simply checks whether the
output buffer offset is zero. This check must be disabled if audio capture is
enabled, because when streaming audio the output buffer offset will rarely be
zero due to queued audio data, and so this would starve framebuffer updates.
As a result, the VNC client can cause QEMU to allocate arbitrary amounts of RAM.
They can first start something in the guest that triggers lots of framebuffer
updates eg play a youtube video. Then enable audio capture, and simply never
read data back from the server. This can easily make QEMU's VNC server send
buffer consume 100MB of RAM per second, until the OOM killer starts reaping
processes (hopefully the rogue QEMU process, but it might pick others...).
To address this we make the throttling more intelligent, so we can throttle
when audio capture is active too. To determine how to throttle incremental
updates or audio data, we calculate a size threshold. Normally the threshold is
the approximate number of bytes associated with a single complete framebuffer
update, i.e. width * height * bytes per pixel. We'll send incremental updates
until we hit this threshold, at which point we'll stop sending updates until
data has been written to the wire, causing the output buffer offset to fall
back below the threshold.
If audio capture is enabled, we increase the size of the threshold to also
allow for up to 1 second's worth of audio data samples, i.e. nchannels * bytes
per sample * frequency. This allows the output buffer to have a mixture of
incremental framebuffer updates and audio data queued, but once the threshold
is exceeded, audio data will be dropped and incremental updates will be
throttled.
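In arithmetic terms (placeholder variable names, only restating the formula above):

    #include <stdbool.h>
    #include <stddef.h>

    static size_t output_throttle_threshold(size_t width, size_t height,
                                            size_t bytes_per_pixel,
                                            bool audio_capture,
                                            size_t channels,
                                            size_t bytes_per_sample,
                                            size_t frequency)
    {
        size_t threshold = width * height * bytes_per_pixel;
        if (audio_capture) {
            threshold += channels * bytes_per_sample * frequency;  /* ~1s of audio */
        }
        return threshold;
    }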
This unbounded memory growth affects all VNC server configurations supported by
QEMU, with no workaround possible. The mitigating factor is that it can only be
triggered by a client that has authenticated with the VNC server, and who is
able to trigger a large quantity of framebuffer updates or audio samples from
the guest OS. Mostly they'll just succeed in getting the OOM killer to kill
their own QEMU process, but it's possible other processes can get taken out as
collateral damage.
This is a more general variant of the similar unbounded memory usage flaw in
the websockets server, that was previously assigned CVE-2017-15268, and fixed
in 2.11 by:
commit a7b20a8efa
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Mon Oct 9 14:43:42 2017 +0100
io: monitor encoutput buffer size from websocket GSource
This new general memory usage flaw has been assigned CVE-2017-15124, and is
partially fixed by this patch.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-10-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
According to the RFB protocol, a client sends one or more framebuffer update
requests to the server. The server can reply with a single framebuffer update
response, that covers all previously received requests. Once the client has
read this update from the server, it may send further framebuffer update
requests to monitor future changes. The client is free to delay sending the
framebuffer update request if it needs to throttle the amount of data it is
reading from the server.
The QEMU VNC server, however, has never correctly handled the framebuffer
update requests. Once QEMU has received an update request, it will continue to
send client updates forever, even if the client hasn't asked for further
updates. This prevents the client from throttling back data it gets from the
server. This change fixes the flawed logic such that after a set of updates are
sent out, QEMU waits for a further update request before sending more data.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-8-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
When we encode data for writing with SASL, we encode the entire pending output
buffer. The subsequent write, however, may not be able to send the full encoded
data in one go, particularly with a slow network. So we delay setting the
output buffer offset back to zero until all the SASL encoded data is sent.
Between encoding the data and completing sending of the SASL encoded data,
however, more data might have been placed on the pending output buffer. So it
is not valid to set the offset back to zero. Instead we must keep track of how much
data we consumed during encoding and subtract only that amount.
With the current bug we would be throwing away some pending data without having
sent it at all. By sheer luck this did not previously cause any serious problem
because appending data to the send buffer is always an atomic action, so we
only ever throw away complete RFB protocol messages. In the case of frame buffer
updates we'd catch up fairly quickly, so no obvious problem was visible.
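A simplified sketch of the bookkeeping described above (the buffer type and field
names are illustrative, not the actual QEMU structures):

#include <stddef.h>
#include <string.h>

typedef struct {
    unsigned char *buffer;  /* pending output bytes */
    size_t offset;          /* how much pending data is queued */
    size_t encoded_offset;  /* how much of it the last SASL encode consumed */
} Output;

/* Called once the SASL-encoded data has been fully written to the socket. */
static void sasl_encoded_sent(Output *o)
{
    /* Not o->offset = 0: that would discard data appended after encoding. */
    memmove(o->buffer, o->buffer + o->encoded_offset,
            o->offset - o->encoded_offset);
    o->offset -= o->encoded_offset;
    o->encoded_offset = 0;
}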
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-6-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The vnc_update_client() method checks the 'has_dirty' flag to see if there are
dirty regions that are pending to send to the client. Regardless of this flag,
if a forced update is requested, updates must be sent. For unknown reasons
though, the code also tries to send updates if audio capture is enabled. This
makes no sense as audio capture state does not impact framebuffer contents, so
this check is removed.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-5-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
A previous commit:
commit 5a8be0f73d
Author: Gerd Hoffmann <kraxel@redhat.com>
Date: Wed Jul 13 12:21:20 2016 +0200
vnc: make sure we finish disconnect
Added a check for vs->disconnecting at the very start of the
vnc_update_client method. This means that the very next "if"
statement, which checks for !vs->disconnecting, always evaluates to true
and is thus redundant. This in turn means the vs->disconnecting
check at the very end of the method never evaluates to true, and
is thus unreachable code.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20171218191228.31018-3-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
When --enable-debug is turned on, configure doesn't set -O level, and
uses default compiler -O0 level, which is slow.
Instead, use -Og if supported by the compiler (optimize debugging
experience), or -O1 (keeps code somewhat debuggable and works around
compiler bugs).
Unfortunately, gcc has many false-positive maybe-uninitialized
errors with -Og and -O1 (f27 gcc 7.2.1 20170915):
/home/elmarco/src/qemu/hw/ipmi/isa_ipmi_kcs.c: In function ‘ipmi_kcs_ioport_read’:
/home/elmarco/src/qemu/hw/ipmi/isa_ipmi_kcs.c:279:12: error: ‘ret’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
return ret;
^~~
cc1: all warnings being treated as errors
make: *** [/home/elmarco/src/qemu/rules.mak:66: hw/ipmi/isa_ipmi_kcs.o] Error 1
make: *** Waiting for unfinished jobs....
/home/elmarco/src/qemu/hw/ide/ahci.c: In function ‘ahci_populate_sglist’:
/home/elmarco/src/qemu/hw/ide/ahci.c:903:58: error: ‘tbl_entry_size’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
if ((off_idx == -1) || (off_pos < 0) || (off_pos > tbl_entry_size)) {
~~~~~~~~~^~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
make: *** [/home/elmarco/src/qemu/rules.mak:66: hw/ide/ahci.o] Error 1
/home/elmarco/src/qemu/hw/display/qxl.c: In function ‘qxl_add_memslot’:
/home/elmarco/src/qemu/hw/display/qxl.c:1397:52: error: ‘pci_start’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
memslot.virt_end = virt_start + (guest_end - pci_start);
~~~~~~~~~~~~~^~~~~~~~~~~~
/home/elmarco/src/qemu/hw/display/qxl.c:1389:9: error: ‘pci_region’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
qxl_set_guest_bug(d, "%s: pci_region = %d", __func__, pci_region);
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
cc1: all warnings being treated as errors
There seems to be a long list of related bugs in upstream GCC, some of
them are being fixed very recently:
https://gcc.gnu.org/bugzilla/show_bug.cgi?id=24639
For now, let's work around it by using -Wno-maybe-uninitialized (gcc-only).
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20180104160523.22995-5-marcandre.lureau@redhat.com>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When linking qemu-ga under some configurations (when gthread-2.0.pc
doesn't have -pthread, as currently happens with the meson build), you may
hit this linking issue:
/usr/bin/ld: libqemuutil.a(qemu-thread-posix.o): undefined reference to symbol 'pthread_setname_np@@GLIBC_2.12'
/usr/lib64/libpthread.so.0: error adding symbols: DSO missing from command line
Make sure qemu-ga links with the pthread library, by adding correct
flags to libs_qga.
This is really a QEMU bug, because it's QEMU code that's using pthread
functions, and so we must explicitly link against pthreads. The bug
was just masked by the fact that often some pkg-config or another for
one of our dependencies will add -pthread to the link line anyway.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20180104160523.22995-2-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is a replacement for g_timeout_add[_seconds]() for chardevs. Chardevs
can now have a dedicated gcontext, and we should always bind chardev tasks
onto that gcontext rather than the default main context. Since there
are quite a few g_timeout_add[_seconds]() callers, a new function
qemu_chr_timeout_add_ms() is introduced.
One thing to mention: terminal3270 still always runs on the main
gcontext. Convert it as well, since it is still part of the chardev
code and we might otherwise miss it when it is eventually moved off
the main gcontext too.
Also, convert all the timers from GSource tags into GSource pointers.
GSource tag IDs and g_source_remove() only work with the default
gcontext, while now these GSources can logically be attached to other
contexts. So let's use an explicit g_source_destroy() plus
g_source_unref() to remove a timer.
Note: when in the timer handler, we don't need the g_source_destroy()
any more, since that is done automatically if the timer handler
returns false (and that's what all the current handlers do).
Yet another note: in pty_chr_rearm_timer() we took special care of the
ms=1000 case. This patch merges the two cases into one.
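A sketch of the general shape of such a helper, using only public GLib calls
(this is not the exact qemu_chr_timeout_add_ms() implementation; the chardev
gcontext parameter is assumed):

#include <glib.h>

static GSource *chr_timeout_add_ms(GMainContext *chr_context, guint ms,
                                   GSourceFunc func, gpointer opaque)
{
    GSource *source = g_timeout_source_new(ms);

    g_source_set_callback(source, func, opaque, NULL);
    /* Attach to the chardev's context rather than the default main context. */
    g_source_attach(source, chr_context);
    return source;
}

Cancelling such a timer before it fires then uses g_source_destroy() followed by
g_source_unref() on the returned pointer; if the callback returns false, the
destroy happens automatically and only the unref remains.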
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180104141835.17987-4-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The idle task will be attached to the main gcontext even if the chardev
backend is running in another gcontext. Fix the only caller by
extending the g_idle_add() logic into the more powerful
g_source_attach(). It's basically the g_idle_add_full() implementation,
but with the chardev's gcontext passed in.
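In GLib terms, the pattern looks roughly like the following sketch (not the
actual patch):

#include <glib.h>

static guint idle_add_with_context(GMainContext *ctx, GSourceFunc func,
                                   gpointer opaque)
{
    GSource *source = g_idle_source_new();
    guint tag;

    g_source_set_callback(source, func, opaque, NULL);
    tag = g_source_attach(source, ctx);  /* ctx == NULL means default context */
    g_source_unref(source);              /* the context now owns a reference */
    return tag;
}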
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180104141835.17987-3-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In commit 6bbb6c0644 ("chardev: use per-dev context for
io_add_watch_poll", 2017-09-22) all the chardev watches were converted to
use the per-chardev gcontext so that a chardev can run outside the default
main thread. However, one call from the frontend code was still
missed. Touch that up.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20180104141835.17987-2-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Certain PMU-related MSRs are not supported for CPUs with PMU
architecture below version 2. KVM rejects any access to them (see
intel_is_valid_msr_idx routine in KVM), and QEMU fails on the following
assertion:
kvm_put_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
QEMU could also fail if KVM exposes fewer than 3 fixed counters. This could
happen if the host system runs inside another hypervisor which tweaks the
PMU-related CPUID. To prevent a possible failure, the number of fixed counters
is now obtained in the same way as the number of GP counters.
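For reference, a hedged sketch of reading both counter counts from CPUID leaf
0xA, the architectural performance monitoring leaf (generic code, not the QEMU
implementation):

#include <cpuid.h>

static void pmu_counts(unsigned *version, unsigned *num_gp, unsigned *num_fixed)
{
    unsigned eax = 0, ebx = 0, ecx = 0, edx = 0;

    __get_cpuid(0xA, &eax, &ebx, &ecx, &edx);
    *version   = eax & 0xff;         /* EAX[7:0]:  PMU architecture version */
    *num_gp    = (eax >> 8) & 0xff;  /* EAX[15:8]: general-purpose counters */
    *num_fixed = edx & 0x1f;         /* EDX[4:0]:  fixed-function counters */
}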
Reviewed-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Jan Dakinevich <jan.dakinevich@virtuozzo.com>
Message-Id: <1514383466-7257-1-git-send-email-jan.dakinevich@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
HPET saves its state by calculating the current time and recovers the timer
offset using this calculated value. But these calculations include
divisions and multiplications, so the timer state cannot be recovered
precisely enough.
This patch introduces saving of the original value of the offset to
preserve the determinism of the timer.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Signed-off-by: Maria Klimushenkova <maria.klimushenkova@ispras.ru>
Reviewed-by: Juan Quintela <quintela@redhat.com>
--
v3: Added compat property for correct migration.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
pc, pci, virtio: features, fixes, cleanups
A bunch of fixes, cleanups and new features all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Thu 11 Jan 2018 20:04:57 GMT
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (23 commits)
smbus: do not immediately complete commands
dump-guest-memory.py: fix "You can't do that without a process to debug"
virtio-pci: Don't force Subsystem Vendor ID = Vendor ID
intel_iommu: fix error param in string
intel_iommu: remove X86_IOMMU_PCI_DEVFN_MAX
vhost-user: document memory accesses
vhost-user: fix indentation in protocol specification
hw/pci-host/xilinx: QOM'ify the AXI-PCIe host bridge
hw/pci-host/piix: QOM'ify the IGD Passthrough host bridge
tests/pxe-test: Add some extra tests
tests/pxe-test: Test net booting over IPv6 in some cases
tests/pxe-test: Use table of testcases rather than open-coding
tests/pxe-test: Remove unnecessary special case test functions
virtio_error: don't invoke status callbacks
pci: Eliminate pci_find_primary_bus()
pci: Eliminate redundant PCIDevice::bus pointer
pci: Add pci_dev_bus_num() helper
pci: Move bridge data structures from pci_bus.h to pci_bridge.h
pci: Rename root bus initialization functions for clarity
tests: add test to check VirtQueue object
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When -no-acpi option is used with Q35 machine type, no guest ACPI is
built, but the ACPI device is still created, so only checking the
presence of ACPI device before memory plug/unplug is not enough in
such cases. Check whether ACPI is disabled globally in addition and
fail memory plug/unplug if it's disabled.
Signed-off-by: Haozhong Zhang <haozhong.zhang@intel.com>
Message-Id: <20171222015120.31730-1-haozhong.zhang@intel.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
scsi_disk_emulate_command passes in_buf == NULL when sent a REQUEST
SENSE command. Check for in_len == 0 before dereferencing in_buf.
Fixes: f68d98b21f
Reported-by: Roman Kagan <rkagan@virtuozzo.com>
Tested-by: Roman Kagan <rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Add the property to the device model, then parse it by calling
blkconf_apply_backend_options().
In addition to blk_set_perm(), the called function also handles error
options and wce. For error options we've already checked that the
default values are used, for wce we don't have the option either so it
is always the default (true). In other words there is no change of
behavior in these regards.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20171205151553.7834-1-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
DE212 is a noMMU core supported in Linux. Import this core to provide
a true noMMU configuration for xtensa Linux to run on QEMU.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Cores with and without MMU have system RAM and ROM at different locations.
Also, with noMMU cores the system IO region is accessible through two physical
address ranges.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Extract flash configuration into a separate structure to make it easier
to share between MMU and noMMU configurations.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
XTFPGA boards should populate core memory regions the same way sim
machine does. Move xtensa_create_memory_regions implementation to a
separate file and use it to create instruction and data memory regions
on XTFPGA boards.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Function/structure naming inconsistently uses lx, lx60 and xtensa
prefixes where xtfpga would be appropriate. Fix that.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Don't load the jump target into the CPU config; instead put it and the
initial a2 as literals into the mini bootloader and use l32r to load them
natively. With these changes it should be possible to do a warm reboot of
the guest.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Refactor disas_thumb2_insn() so that it generates the code for raising
an UNDEF exception for invalid insns, rather than returning a flag
which the caller must check to see if it needs to generate the UNDEF
code. This brings the function into line with the behaviour of
disas_thumb_insn() and disas_arm_insn().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1513080506-17703-1-git-send-email-peter.maydell@linaro.org
Our copy of the nwfpe code for emulating the old FPA11 floating
point unit doesn't check the coprocessor number in the instruction
when it emulates it. This means that we might treat some
instructions which should really UNDEF as being FPA11 instructions by
accident.
The kernel's copy of the nwfpe code doesn't make this error; I suspect
the bug was noticed and fixed as part of the process of mainlining
the nwfpe code more than a decade ago.
Add a check that the coprocessor number (which is always in bits
[11:8] of the instruction) is either 1 or 2, which is where the
FPA11 lives.
Reported-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In the current implementation, the packet queue flushing logic seems to suffer
from a deadlock-like scenario if a packet is received by the interface
before the Rx ring is initialized by the guest's driver. Consider the
following sequence of events:
1. A QEMU instance is started against a TAP device on Linux
host, running Linux guest, e. g., something to the effect
of:
qemu-system-arm \
-net nic,model=imx.fec,netdev=lan0 \
-netdev tap,id=lan0,ifname=tap0,script=no,downscript=no \
... rest of the arguments ...
2. Once QEMU starts, but before the guest reaches the point where
the FEC driver is done initializing the HW, the guest, via the TAP
interface, receives a number of multicast MDNS packets from the
Host (not necessarily true for every OS, but it happens at
least on Fedora 25)
3. Receiving a packet in such a state results in
imx_eth_can_receive() returning '0', which in turn causes
tap_send() to disable the corresponding event (tap.c:203)
4. Once the guest's driver reaches the point where it is ready to
receive packets, it prepares Rx ring descriptors and writes
ENET_RDAR_RDAR to the ENET_RDAR register to indicate to the HW that
more descriptors are ready. At this point the emulation
layer does this:
s->regs[index] = ENET_RDAR_RDAR;
imx_eth_enable_rx(s);
which, combined with:
if (!s->regs[ENET_RDAR]) {
qemu_flush_queued_packets(qemu_get_queue(s->nic));
}
results in the Rx queue never being flushed and the corresponding
I/O event remaining disabled.
To prevent the problem, change the code to always flush packet queue
when ENET_RDAR transitions 0 -> ENET_RDAR_RDAR.
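A self-contained sketch of the intended behaviour; the register index, bit value
and state type below are stand-ins rather than the real imx_fec definitions:

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define ENET_RDAR       4           /* register index, stand-in value */
#define ENET_RDAR_RDAR  (1u << 24)  /* "receive descriptors active", stand-in */

typedef struct {
    uint32_t regs[16];
    bool rx_packets_queued;         /* models packets queued while Rx was stalled */
} FecState;

static void flush_queued_packets(FecState *s)
{
    if (s->rx_packets_queued) {
        s->rx_packets_queued = false;
        printf("flushing queued Rx packets\n");
    }
}

static void enet_rdar_write(FecState *s, uint32_t value)
{
    bool was_idle = (s->regs[ENET_RDAR] == 0);

    s->regs[ENET_RDAR] = value & ENET_RDAR_RDAR;
    /* Flush on every 0 -> ENET_RDAR_RDAR transition, so packets received
     * before the guest driver set up the Rx ring are finally delivered. */
    if (was_idle && s->regs[ENET_RDAR]) {
        flush_queued_packets(s);
    }
}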
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: qemu-devel@nongnu.org
Cc: qemu-arm@nongnu.org
Cc: yurovsky@gmail.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Andrey Smirnov <andrew.smirnov@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
acpi_data_push uses g_array_set_size to resize the array. If there
is not enough contiguous memory, the underlying address will change. If we
then use the old pointer value, the following assertion fires:
qemu-kvm: hw/acpi/bios-linker-loader.c:214: bios_linker_loader_add_checksum:
Assertion `start_offset < file->blob->len' failed.`
This issue currently only happens when building the SRAT table, but here we unify
the pattern for other tables as well to avoid possible issues in the future.
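The safer pattern amounts to remembering offsets rather than pointers across
pushes; a small GLib-only sketch (not the actual QEMU ACPI code):

#include <glib.h>

static void *data_push(GArray *blob, unsigned size)
{
    unsigned offset = blob->len;

    g_array_set_size(blob, offset + size);      /* may reallocate blob->data */
    return blob->data + offset;
}

static void build_table(GArray *table)         /* e.g. g_array_new(FALSE, TRUE, 1) */
{
    unsigned header_offset = table->len;        /* remember an offset... */

    data_push(table, 36);                       /* ...because each push... */
    data_push(table, 64);                       /* ...may move the buffer */

    /* Recompute the pointer from the saved offset only when it is needed. */
    char *header = table->data + header_offset;
    header[0] = 0;
}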
Signed-off-by: Zhaoshenglong <zhaoshenglong@huawei.com>
Reviewed-by: Andrew Jones <drjones@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ldxp loads two consecutive doublewords from memory regardless of CPU
endianness. On store, stlxp currently assumes it works with a 128-bit
value and consequently switches the order in big-endian mode. With this
change it packs the doublewords in reverse order in anticipation of the
128bit big-endian store operation interposing them so they end up in
memory in the right order. This makes it work for both MTTCG and !MTTCG.
It effectively implements the ARM ARM STLXP operation pseudo-code:
data = if BigEndian() then el1:el2 else el2:el1;
With this change an aarch64_be Linux 4.14.4 kernel succeeds to boot up
in system emulation mode.
Signed-off-by: Michael Weiser <michael.weiser@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Give big-endian arm and aarch64 CPUs their own family in
qemu-binfmt-conf.sh to make sure we register qemu-user for binaries of
the opposite endianness on arm and aarch64. Apart from the family
assignments of the magic values, qemu_get_family() needs to be able to
distinguish the two and recognise aarch64{,_be} as well.
Signed-off-by: Michael Weiser <michael.weiser@gmx.de>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-id: 20171220212308.12614-7-michael.weiser@gmx.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Since for aarch64 the signal trampoline is synthesized directly into the
signal frame we need to make sure the instructions end up little-endian.
Otherwise the wrong endianness will cause a SIGILL upon return from the
signal handler on big-endian targets.
Signed-off-by: Michael Weiser <michael.weiser@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20171220212308.12614-4-michael.weiser@gmx.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Enable big-endian mode for data accesses on aarch64 for big-endian linux
user mode. Activate it for all exception levels as documented by ARM:
Set the SCTLR EE bit for ELs 1 through 3. Additionally set bit E0E in
EL1 to enable it in EL0 as well.
Signed-off-by: Michael Weiser <michael.weiser@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20171220212308.12614-2-michael.weiser@gmx.de
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ppc patch queue 2018-01-11
This pull request supersedes ppc-for-2.12-20180108 and several before
it. The earlier pull request included a patch which exposed a bug in
the ARM TCG backend. I've pulled that out and will repost once the
ARM bug is fixed (a patch has been posted by Richard Henderson).
Highlights from this series:
* SLOF update
* Several new devices for embedded platforms
* Fix to correctly set compatibility mode for hotplugged CPUs
* dtc compile fix for older MacOS versions
# gpg: Signature made Thu 11 Jan 2018 04:58:11 GMT
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.12-20180111:
spapr: Correct compatibility mode setting for hotplugged CPUs
hw/ppc: Remove the deprecated spapr-pci-vfio-host-bridge device
Update dtc to fix compilation problem on Mac OS 10.6
target/ppc: more use of the PPC_*() macros
ppc/pnv: change powernv_ prefix to pnv_ for overall naming consistency
hw/ide: Emulate SiI3112 SATA controller
spapr_pci: use warn_report()
ppc4xx_i2c: Implement basic I2C functions
sm501: Add some more unimplemented registers
sm501: Add panel hardware cursor registers also to read function
pseries: Update SLOF firmware image to qemu-slof-20171214
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
qemu-sparc update
# gpg: Signature made Tue 09 Jan 2018 22:12:22 GMT
# gpg: using RSA key 0x5BC2C56FAE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C C9C4 5BC2 C56F AE0F 321F
* remotes/mcayland/tags/qemu-sparc-signed: (25 commits)
sun4u_iommu: add trace event for IOMMU translations
sun4u_iommu: convert from IOMMU_DPRINTF to trace-events
sun4u_iommu: update to reflect IOMMU is no longer part of the APB device
sun4u: split IOMMU device out from apb.c to sun4u_iommu.c
apb: QOMify IOMMU
sun4m: remove include/hw/sparc/sun4m.h and all references to it
sun4m: move IOMMU declarations from sun4m.h to sun4m_iommu.h
sun4m: move sun4m_iommu.c from hw/dma to hw/sparc
sun4u: switch from EBUS_DPRINTF() macro to trace-events
sparc64: introduce trace-events for hw/sparc64
apb: replace OBIO interrupt numbers in pci_pbmA_map_irq() with constants
ebus: wire up OBIO interrupts to APB pbm via qdev GPIOs
apb: remove busA property from PBMPCIBridge state
apb: split pci_pbm_map_irq() into separate functions for bus A and bus B
apb: remove pci_apb_init() and instantiate APB device using qdev
apb: move the two secondary PCI bridges objects into APBState
apb: use gpios to wire up the apb device to the SPARC CPU IRQs
apb: return APBState from pci_apb_init() rather than PCIBus
apb: APB QOMify tidy-up
sun4u: move initialisation of all ISABus devices into ebus_realize()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently the pseries machine sets the compatibility mode for the
guest's cpus in two places: 1) at machine reset and 2) after CAS
negotiation.
This means that if we set or negotiate a compatibility mode, then
hotplug a CPU, the hotplugged CPU doesn't get the right mode set and
will incorrectly have the full native features.
To correct this, we set the compatibility mode on a cpu when it is
brought online with the 'start-cpu' RTAS call. Since we no
longer need to set the compatibility mode on all CPUs at machine
reset, we change that code to only set the mode for the boot CPU.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Tested-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
It's a deprecated dummy device since QEMU v2.6.0. That should have
been enough time to allow the users to update their scripts in case
they still use it, so let's remove this legacy code now.
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Currently QEMU does not build on Mac OS 10.6
because of a missing patch in the dtc
subproject. Updating dtc to make the patch
available fixes this problem.
Signed-off-by: John Arbuckle <programmingkidx@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Also introduce utilities to manipulate bitmasks (originally from OPAL)
which will be used in the model of the XIVE interrupt controller.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The 'pnv' prefix is now used for all and the routines populating the
device tree start with 'pnv_dt'. The handler of the PnvXScomInterface
is also renamed to 'dt_xscom' which should reflect that it is
populating the device tree under the 'xscom@' node of the chip.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This is a common generic PCI SATA controller that is also used in PCs,
but more importantly, guests running on the Sam460ex board prefer this
card and have a driver for it (unlike other SATA controllers
already emulated).
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Acked-by: John Snow <jsnow@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
These two are definitely warnings. Let's use the appropriate API.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
These are not really implemented (they just return zero or default values),
but add them so that guests accessing them can run.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
These were forgotten when adding panel layer support in ffd3925701
"SM501 emulation for R2D-SH4".
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
[dwg: Added reference to earlier commit in message]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The main changes are:
- able to handle more devices with specified bootindex;
- implements flattened device tree rendering, for both QEMU and the guest kernel.
The full list is:
> boot: use a temporary bootdev-buf
> boot: do not concatenate bootdev
> libvirtio: Mark struct virtio_scsi_req_cmd as packed
> fdt: Implement "fdt-fetch" method for client interface
> rtas: Store RTAS address and entry in the device tree
> board-qemu: Fix slof-build-id length
> fdt: Pass the resulting device tree to QEMU
> fdt: Fix version and add a word for FDT header size
> tree: Rework set-chosen-cpu and store /chosen ihandle and phandle
> node: Add some documentation
> Revert various SLOF-to-QEMU private hypercalls
> Use input-device and output-device
> netboot: Create bootp-response when bootp is used
> libnet/ipv6: assign times_asked value directly
> usb-xhci: Reset ERSTSZ together with ERSTBA
> virtio-net: rework the driver to support multiple open
> board-qemu: add private hcall to inform host on "phandle" update
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
By separating the sun4u IOMMU device into new sun4u_iommu.c and sun4u_iommu.h
files we noticeably simplify apb.c whilst bringing sun4u in line with all the
other IOMMU-supporting architectures.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Acked-by: Artyom Tarasenko <atar4qemu@gmail.com>
This is in preparation to split the IOMMU device out of the APB. As part of
this commit we also enforce separation of the IOMMU and APB devices by using
a QOM object link to pass the IOMMU reference and accessing the IOMMU registers
via a separate memory region mapped into the APB config space rather than
directly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Acked-by: Artyom Tarasenko <atar4qemu@gmail.com>
With the previous commit there is now nothing left in sun4m.h so it can be
removed, along with all remaining references to it.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Acked-by: Artyom Tarasenko <atar4qemu@gmail.com>
This seems more appropriate and brings sun4m in line with the other
architectures.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Acked-by: Artyom Tarasenko <atar4qemu@gmail.com>
This is in preparation for switching code in hw/sparc64 from DPRINTF over to
trace events.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Following on from the previous commit, we can also do the same with
the legacy OBIO interrupts in pci_pbmA_map_irq().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This enables us to remove the static array mapping in the ISA IRQ
handler (and the embedded reference to the APB device) by formalising
the interrupt wiring via the qdev GPIO API.
For more clarity we replace the APB OBIO interrupt numbers with constants
designating the interrupt source, and rename isa_irq_handler() to
ebus_isa_irq_handler().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Since the previous commit the only remaining use of the qdev busA property is
to configure the PCI bridge in front of the onboard ebus devices differently
to allow early OpenBIOS serial console access.
Instead we can now manually update the PCI configuration for bridge A in
pci_pbm_reset() and thus completely remove the busA property from the
PBMPCIBridge state.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
After the previous refactoring it is now possible to use separate functions
to improve the clarity of the interrupt paths.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
By making the special_base and mem_base values qdev properties, we can move
the remaining parts of pci_apb_init() into the pbm init() and realize()
functions.
This finally allows us to instantiate the APB directly using standard qdev
create/init functions in sun4u.c.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Use DeviceClass rather than SysBusDeviceClass in pbm_host_class_init() and
adjust pci_pbm_init_device() accordingly.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This is initialisation that should really take place in the ebus realize
function. As part of this we also rework the ebus IRQ mapping so that
instead of having to pass in the array of pbm_irqs, we obtain a reference
to them by looking up the APB device during ebus realize.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Since the EBus is effectively a PCI-ISA bridge then the underlying ISA bus
should be contained within the PCI bridge itself.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
The main change here is to introduce the proper TYPE_EBUS/EBUS QOM macros
and remove the use of DO_UPCAST.
Alongside this there are a couple of minor cosmetic changes and a rename
of pci_ebus_realize() to ebus_realize(), since the ebus device is always what
is effectively a PCI-ISA bridge.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This code was preventing the MMU debug code from displaying virtual
mappings of IO devices (anything that is not located in RAM).
Before this patch, QEMU would output 0xffffffffffffffff (-1) as the
physical address corresponding to an IO device virtual address.
With this patch the intended physical address is displayed.
Signed-off-by: Jean-Christophe Dubois <jcd@tribudubois.net>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
const16 is an opcode that shifts the 16 lower bits of an address register
into the 16 upper bits and puts its immediate operand into the lower 16
bits. It is not controlled by an Xtensa option and doesn't have a fixed
opcode.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
GPIO32 is not in the core ISA, but it was widely used in Diamond Cores.
This implementation doesn't do actual I/O and doesn't handle the case of
GPIO32 state being part of a coprocessor.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Add two special registers: MMID and DDR:
- MMID is write-only and the only side effect of writing to it is output
to the trace port, which is not emulated;
- DDR is only accessible in debug mode, which is not emulated.
Add two debug-mode-only opcodes:
- rfdd and rfdo return from debug mode, which is not emulated.
Add three internal opcodes for full MMU:
- hwwdtlba and hwwitlba are internal opcodes that write a value into an
autoupdate DTLB or ITLB entry.
- ldpte is an internal opcode that loads the PTE entry that covers the most
recent page fault address.
None of these three opcodes may appear in a valid instruction.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
It doesn't help much; the always-set bit 0 of the LITBASE SR is easy to
compensate for by decrementing the l32r immediate argument.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
memctl SR is not available on dc232b, as it was introduced in a more
recent hardware release. Now that this information is available through
the libisa the test fails. Fix the test.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Replace manual opcode analysis with libisa-based code. This makes it
possible to support variable-encoding instructions of the core ISA, like
const16, and will allow supporting advanced Xtensa features, like FLIX
and TIE.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
- Aneesh no longer listed in MAINTAINERS,
- deprecation of the handle backend,
- improved error reporting, especially when the local backend fails to
open the VirtFS root,
- virtio-9p-test to behave more like a real virtio guest driver: set
DRIVER_OK when ready to use the device and process the used ring
for completed requests,
- cosmetic fixes (mostly coding style related).
# gpg: Signature made Mon 08 Jan 2018 10:19:18 GMT
# gpg: using RSA key 0x71D4D5E5822F73D6
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg: aka "Gregory Kurz <gregory.kurz@free.fr>"
# gpg: aka "[jpeg image of size 3330]"
# Primary key fingerprint: B482 8BAF 9431 40CE F2A3 4910 71D4 D5E5 822F 73D6
* remotes/gkurz/tags/for-upstream:
MAINTAINERS: Drop Aneesh as 9pfs maintainer
9pfs: deprecate handle backend
fsdev: improve error handling of backend init
fsdev: improve error handling of backend opts parsing
tests: virtio-9p: set DRIVER_OK before using the device
tests: virtio-9p: fix ISR dependence
9pfs: make pdu_marshal() and pdu_unmarshal() static functions
9pfs: fix error path in pdu_submit()
9pfs: fix type in *_parse_opts declarations
9pfs: handle: fix type definition
9pfs: fix some type definitions
fsdev: fix some type definitions
9pfs: fix XattrOperations typedef
virtio-9p: move unrealize/realize after virtio_9p_transport definition
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The find_desc_by_name() from util/qemu-option.c relies on the .name not being
NULL to call strcmp(). This check becomes unsafe when the list is not
NULL-terminated, which is the case of nbd_runtime_opts in block/nbd.c, and can
result in a segmentation fault when strcmp() tries to access invalid memory:
#0 0x00007fff8c75f7d4 in __strcmp_power9 () from /lib64/libc.so.6
#1 0x00000000102d3ec8 in find_desc_by_name (desc=0x1036d6f0, name=0x28e46670 "server.path") at util/qemu-option.c:166
#2 0x00000000102d93e0 in qemu_opts_absorb_qdict (opts=0x28e47a80, qdict=0x28e469a0, errp=0x7fffec247c98) at util/qemu-option.c:1026
#3 0x000000001012a2e4 in nbd_open (bs=0x28e42290, options=0x28e469a0, flags=24578, errp=0x7fffec247d80) at block/nbd.c:406
#4 0x00000000100144e8 in bdrv_open_driver (bs=0x28e42290, drv=0x1036e070 <bdrv_nbd_unix>, node_name=0x0, options=0x28e469a0, open_flags=24578, errp=0x7fffec247f50) at block.c:1135
#5 0x0000000010015b04 in bdrv_open_common (bs=0x28e42290, file=0x0, options=0x28e469a0, errp=0x7fffec247f50) at block.c:1395
From gdb, the desc[i].name was not NULL and resulted in strcmp() accessing
invalid memory:
>>> p desc[5]
$8 = {
name = 0x1037f098 "R27A",
type = 1561964883,
help = 0xc0bbb23e <error: Cannot access memory at address 0xc0bbb23e>,
def_value_str = 0x2 <error: Cannot access memory at address 0x2>
}
>>> p desc[6]
$9 = {
name = 0x103dac78 <__gcov0.do_qemu_init_bdrv_nbd_init> "\001",
type = 272101528,
help = 0x29ec0b754403e31f <error: Cannot access memory at address 0x29ec0b754403e31f>,
def_value_str = 0x81f343b9 <error: Cannot access memory at address 0x81f343b9>
}
This patch fixes the segmentation fault in strcmp() by adding a NULL element at
the end of the nbd_runtime_opts.desc list, which is the common practice for most
other structs, like runtime_opts in block/null.c. Thus, the desc[i].name != NULL
check becomes safe because it will not evaluate to true when the .desc list
reaches its end.
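Schematically, the fix makes the descriptor list look like the following sketch;
the struct and option names are simplified stand-ins based on the fields shown in
the gdb dump above, not QEMU's actual definitions:

#include <stddef.h>
#include <string.h>

typedef struct OptDesc {
    const char *name;
    int type;
    const char *help;
    const char *def_value_str;
} OptDesc;

static const OptDesc nbd_opt_descs[] = {
    { .name = "host",   .help = "TCP host to connect to" },
    { .name = "port",   .help = "TCP port to connect to" },
    { .name = "export", .help = "NBD export name" },
    { /* end of list: .name == NULL stops the loop below */ },
};

static const OptDesc *find_desc_by_name_sketch(const OptDesc *desc, const char *name)
{
    for (size_t i = 0; desc[i].name != NULL; i++) {
        if (strcmp(desc[i].name, name) == 0) {
            return &desc[i];
        }
    }
    return NULL;
}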
Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com>
Buglink: https://bugs.launchpad.net/qemu/+bug/1727259
Signed-off-by: Murilo Opsfelder Araujo <muriloo@linux.vnet.ibm.com>
Message-Id: <20180105133241.14141-2-muriloo@linux.vnet.ibm.com>
CC: qemu-stable@nongnu.org
Fixes: 7ccc44fd7d
Signed-off-by: Eric Blake <eblake@redhat.com>
If we are careful to handle 0-length read requests correctly,
we can optimize our sparse read to send the NBD_REPLY_FLAG_DONE
bit on our last OFFSET_DATA or OFFSET_HOLE chunk rather than
needing a separate chunk.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171107030912.23930-3-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
The reason that NBD added structured reply in the first place was
to allow for efficient reads of sparse files, by allowing the
reply to include chunks to quickly communicate holes to the client
without sending lots of zeroes over the wire. Time to implement
this in the server; our client can already read such data.
We can only skip holes insofar as the block layer can query them;
and only if the client is okay with a fragmented request (if a
client requests NBD_CMD_FLAG_DF and the entire read is a hole, we
could technically return a single NBD_REPLY_TYPE_OFFSET_HOLE, but
that's a fringe case not worth catering to here). Sadly, the
control flow is a bit wonkier than I would have preferred, but
it was minimally invasive to have a split in the action between
a fragmented read (handled directly where we recognize
NBD_CMD_READ with the right conditions, and sending multiple
chunks) vs. a single read (handled at the end of nbd_trip, for
both simple and structured replies, when we know there is only
one thing being read). Likewise, I didn't make any effort to
optimize the final chunk of a fragmented read to set the
NBD_REPLY_FLAG_DONE, but unconditionally send that as a separate
NBD_REPLY_TYPE_NONE.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171107030912.23930-2-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Block layer patches
# gpg: Signature made Fri 22 Dec 2017 14:09:01 GMT
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (35 commits)
block: Keep nodes drained between reopen_queue/multiple
commit: Simplify reopen of base
test-bdrv-drain: Test graph changes in drained section
block: Allow graph changes in subtree drained section
test-bdrv-drain: Recursive draining with multiple parents
test-bdrv-drain: Test behaviour in coroutine context
test-bdrv-drain: Tests for bdrv_subtree_drain
block: Add bdrv_subtree_drained_begin/end()
block: Don't notify parents in drain call chain
test-bdrv-drain: Test nested drain sections
block: Nested drain_end must still call callbacks
block: Don't block_job_pause_all() in bdrv_drain_all()
test-bdrv-drain: Test drain vs. block jobs
blockjob: Pause job on draining any job BDS
test-bdrv-drain: Test bs->quiesce_counter
test-bdrv-drain: Test callback for bdrv_drain
block: Make bdrv_drain() driver callbacks non-recursive
block: Assert drain_all is only called from main AioContext
block: Remove unused bdrv_requests_pending
block: Mention -drive cyls/heads/secs/trans/serial/addr in deprecation chapter
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Initial support for the HVF accelerator
# gpg: Signature made Sat 23 Dec 2017 07:51:18 GMT
# gpg: using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream-hvf:
i386: hvf: cleanup x86_gen.h
i386: hvf: remove VM_PANIC from "in"
i386: hvf: remove addr_t
i386: hvf: simplify flag handling
i386: hvf: abort on decoding error
i386: hvf: remove ZERO_INIT macro
i386: hvf: remove more dead emulator code
i386: hvf: unify register enums between HVF and the rest
i386: hvf: header cleanup
i386: hvf: move all hvf files in the same directory
i386: hvf: inject General Protection Fault when vmexit through vmcall
i386: hvf: refactor event injection code for hvf
i386: hvf: implement vga dirty page tracking
i386: refactor KVM cpuid code so that it applies to hvf as well
i386: hvf: implement hvf_get_supported_cpuid
i386: hvf: use new helper functions for put/get xsave
i386: hvf: fix licensing issues; isolate task handling code (GPL v2-only)
i386: hvf: add code base from Google's QEMU repository
apic: add function to apic that will be used by hvf
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Aneesh has been working on other topics for some time now. Let's reflect
that in the MAINTAINERS file, so that people stop Cc'ing him.
Signed-off-by: Greg Kurz <groug@kaod.org>
Acked-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
This backend raises some concerns:
- doesn't support symlinks
- fails over 100 tests in the PJD POSIX file system test suite [1]
- requires the QEMU process to run with the CAP_DAC_READ_SEARCH
capability, which isn't recommended for security reasons
This backend should not be used and will be removed. The 'local'
backend is the recommended alternative.
[1] https://www.tuxera.com/community/posix-test-suite/
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Aneesh Kumar K.V <aneesh.kumar@linux.vnet.ibm.com>
This patch changes some error messages in the backend init code and
converts backends to propagate QEMU Error objects instead of calling
error_report().
One notable improvement is that the local backend now provides a more
detailed error report when it fails to open the shared directory.
Signed-off-by: Greg Kurz <groug@kaod.org>
This patch changes some error messages in the backend opts parsing
code and converts backends to propagate QEMU Error objects instead
of calling error_report().
Signed-off-by: Greg Kurz <groug@kaod.org>
Like other virtio tests, use the used ring APIs instead of assuming ISR
being set means the request has completed.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
If we receive an unsupported request id, we first decide to
return -ENOTSUPP to the client, but since the request id
causes is_read_only_op() to return false, we change the
error to be -EROFS if the fsdev is read-only. This doesn't
make sense since we don't know what the client asked for.
This patch ensures that -EROFS can only be returned if the
request id is supported.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Merge tpm 2017/12/22 v1
# gpg: Signature made Fri 22 Dec 2017 20:03:37 GMT
# gpg: using RSA key 0x75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <stefanb@linux.vnet.ibm.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2017-12-22-1:
acpi: Update TPM2 ACPI table to more recent specs
tpm: Implement tpm_sized_buffer_reset
tpm_tis: merge r/w_offset into rw_offset
tpm_tis: move r/w_offsets to TPMState
tpm_tis: merge read and write buffer into single buffer
tpm_tis: move buffers from localities into common location
tpm_tis: remove TPMSizeBuffer usage
tpm_tis: limit size of buffer from backend
tpm_tis: convert uint32_t to size_t
tpm_emulator: Add a caching layer for the TPM Established flag
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# gpg: Signature made Fri 22 Dec 2017 02:12:29 GMT
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
qemu-doc: Update the deprecation information of -tftp, -bootp, -redir and -smb
qemu-doc: The "-net nic" option can be used with "netdev=...", too
net: Remove the legacy "-net channel" parameter
net: remove unused compute_mcast_idx() function
rtl8139: use inline net_crc32() and bitshift instead of compute_mcast_idx()
ne2000: use inline net_crc32() and bitshift instead of compute_mcast_idx()
ftgmac100: use inline net_crc32() and bitshift instead of compute_mcast_idx()
lan9118: use inline net_crc32() and bitshift instead of compute_mcast_idx()
opencores_eth: use inline net_crc32() and bitshift instead of compute_mcast_idx()
eepro100: use inline net_crc32() and bitshift instead of compute_mcast_idx()
sungem: fix multicast filter CRC calculation
sunhme: switch sunhme over to use net_crc32_le()
eepro100: switch eepro100 e100_compute_mcast_idx() over to use net_crc32()
pcnet: switch pcnet over to use net_crc32_le()
net: introduce net_crc32_le() function
net: move CRC32 calculation from compute_mcast_idx() into its own net_crc32() function
e1000: Separate TSO and non-TSO contexts, fixing UDP TX corruption
e1000, e1000e: Move per-packet TX offload flags out of context state
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Some cleanup, and this allows SR to be moved from any addressing mode.
The previous code was wrong for ColdFire: ColdFire also allows using an
addressing mode to set SR/CCR. It only supports a data register
to get SR/CCR (move from).
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20180104012913.30763-15-laurent@vivier.eu>
As gen_helper_get_ccr() is able to compute CCR from cc_op and
flags, we don't need to flush the flags before calling it.
flush_flags() and get_ccr() both use COMPUTE_CCR() to compute
flags; get_ccr() computes the CCR value,
whereas flush_flags() updates the live cc_op and flags.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20180104012913.30763-3-laurent@vivier.eu>
If the script is run with a core (no running process), it produces an
error:
(gdb) dump-guest-memory /tmp/vmcore X86_64
guest RAM blocks:
target_start target_end host_addr message count
---------------- ---------------- ---------------- ------- -----
0000000000000000 00000000000a0000 00007f7935800000 added 1
00000000000a0000 00000000000b0000 00007f7934200000 added 2
00000000000c0000 00000000000ca000 00007f79358c0000 added 3
00000000000ca000 00000000000cd000 00007f79358ca000 joined 3
00000000000cd000 00000000000e8000 00007f79358cd000 joined 3
00000000000e8000 00000000000f0000 00007f79358e8000 joined 3
00000000000f0000 0000000000100000 00007f79358f0000 joined 3
0000000000100000 0000000080000000 00007f7935900000 joined 3
00000000fd000000 00000000fe000000 00007f7934200000 added 4
00000000fffc0000 0000000100000000 00007f7935600000 added 5
Python Exception <class 'gdb.error'> You can't do that without a process to debug.:
Error occurred in Python command: You can't do that without a process
to debug.
Replace the object_resolve_path_type() function call with a local
volatile variable.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Use the function argument "name" instead of hardcoded
"VMCOREINFO". All callers use "VMCOREINFO" as argument, so this isn't
an exposed bug, thankfully.
Simplify the code a little bit while touching this.
Suggested-by: Andrew Jones <drjones@redhat.com>
Reported-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
We already handle this in the backends, and the lifetime datum
for the TCGOp is already large enough.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We had two fields specific to INDEX_op_call. Rename these and
add some macros so that the fields may be reused for other opcodes.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
With no fixed array allocation, we can't overflow a buffer.
This will be important as optimizations related to host vectors
may expand the number of ops used.
Use QTAILQ to link the ops together.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These are now trivial sets and tests against NULL. Unwrap.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
cpu_restore_state officially supports being passed an address it can't
resolve the state for. As a result the checks in the helpers are
superfluous and can be removed. This makes the code consistent with
other users of cpu_restore_state.
Of course this does nothing to address what to do if cpu_restore_state
can't resolve the state but so far it seems this is handled elsewhere.
The change was made with the included Coccinelle script.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
[rth: Fixed up comment indentation. Added second hunk to script to
combine cpu_restore_state and cpu_loop_exit.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
More recent specs of the TPM2 ACPI table add fields for the log area
start address and the log area minimum size, which we already use
for the TCPA table.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The bdrv_reopen*() implementation doesn't like it if the graph is
changed between queuing nodes for reopen and actually reopening them
(one of the reasons is that queuing can be recursive).
So instead of draining the device only in bdrv_reopen_multiple(),
require that callers already drained all affected nodes, and assert this
in bdrv_reopen_queue().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Since commit bde70715, base is the only node that is reopened in
commit_start(). This means that the code, which still involves an
explicit BlockReopenQueue, can now be simplified by using bdrv_reopen().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
We need to remember how many of the drain sections a node is in were
recursive (i.e. subtree drain rather than node drain), so that they
can be correctly applied when children are added or removed during the
drained section.
With this change, it is safe to modify the graph even inside a
bdrv_subtree_drained_begin/end() section.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If bdrv_do_drained_begin/end() are called in coroutine context, they
first use a BH to get out of the coroutine context. Call some existing
tests again from a coroutine to cover this code path.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_drained_begin() waits for the completion of requests in the whole
subtree, but it only actually keeps its immediate bs parameter quiesced
until bdrv_drained_end().
Add a version that keeps the whole subtree drained. As of this commit,
graph changes cannot be allowed during a subtree drained section, but
this will be fixed soon.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is in preparation for subtree drains, i.e. drained sections that
affect not only a single node, but recursively all child nodes, too.
Calling the parent callbacks for drain is pointless when we just came
from that parent node recursively and leads to multiple increases of
bs->quiesce_counter in a single drain call. Don't do it.
In order for this to work correctly, the parent callback must be called
for every bdrv_drain_begin/end() call, not only for the outermost one:
If we have a node N with two parents A and B, recursive draining of A
should cause the quiesce_counter of B to increase because its child N is
drained independently of B. If now B is recursively drained, too, A must
increase its quiesce_counter because N is drained independently of A
only now, even if N is going from quiesce_counter 1 to 2.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_do_drained_begin() restricts the call of parent callbacks and
aio_disable_external() to the outermost drain section, but the block
driver callbacks are always called. bdrv_do_drained_end() must match
this behaviour, otherwise nodes stay drained even if begin/end calls
were balanced.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Block jobs are already paused using the BdrvChildRole drain callbacks,
so we don't need an additional block_job_pause_all() call.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Block jobs already paused themselves when their main BlockBackend
entered a drained section. This is not good enough: We also want to
pause a block job and may not submit new requests if, for example, the
mirror target node should be drained.
This implements .drained_begin/end callbacks in child_job in order to
consider all block nodes related to the job, and removes the
BlockBackend callbacks which are unnecessary now because the root of the
job main BlockBackend is always referenced with a child_job, too.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is currently only working correctly for bdrv_drain(), not for
bdrv_drain_all(). Leave a comment for the drain_all case, we'll address
it later.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The existing test is for bdrv_drain_all_begin/end() only. Generalise the
test case so that it can be run for the other variants as well. At the
moment this is only bdrv_drain_begin/end(), but in a while, we'll add
another one.
Also, add a backing file to the test node to test whether the operations
work recursively.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_drained_begin() doesn't increase bs->quiesce_counter recursively
and also doesn't notify other parent nodes of children, which both means
that the child nodes are not actually drained, and bdrv_drained_begin()
is providing useful functionality only on a single node.
To keep things consistent, we also shouldn't call the block driver
callbacks recursively.
A proper recursive drain version that provides an actually working
drained section for child nodes will be introduced later.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Looks like we forgot to announce the deprecation of these options in
the corresponding chapter of the qemu-doc text, so let's do that now.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It's been marked as deprecated since QEMU v2.10.0, and so far nobody
complained that we should keep it, so let's remove this legacy option
now to simplify the code quite a bit.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Management tools create overlays of running guests with qemu-img:
$ qemu-img create -b /image/in/use.qcow2 -f qcow2 /overlay/image.qcow2
but this doesn't work anymore due to image locking:
qemu-img: /overlay/image.qcow2: Failed to get shared "write" lock
Is another process using the image?
Could not open backing image to determine size.
Use the force share option to allow this use case again.
Cc: qemu-stable@nongnu.org
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Add trace output for commands, errors, and undefined behavior.
Add guest error log output for undefined behavior.
Report invalid undefined accesses to MMIO.
Annotate unlikely error checks with unlikely.
Signed-off-by: Doug Gale <doug16k@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Removing a quorum child node with x-blockdev-change results in a quorum
driver state that cannot be recreated with create options because it
would require a list with gaps. This causes trouble in at least
.bdrv_refresh_filename().
Document this problem so that we won't accidentally mark the command
stable without having addressed it.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Since bdrv_co_preadv() does all necessary checks, including reads past
the end of the backing file, avoid duplicating that verification before
the bdrv_co_preadv() call.
Signed-off-by: Edgar Kaziakhmedov <edgar.kaziakhmedov@virtuozzo.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit 15afd94a04 added code to acquire and release the AioContext in
qemuio_command(). This means that the lock is taken twice now in the
call path from hmp_qemu_io(). This causes BDRV_POLL_WHILE() to hang for
any requests issued to nodes in a non-mainloop AioContext.
Dropping the first locking from hmp_qemu_io() fixes the problem.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Drain requests are propagated to child nodes, parent nodes and directly
to the AioContext. The order in which this happened was different
between all combinations of drain/drain_all and begin/end.
The correct order is to keep children only drained when their parents
are also drained. This means that at the start of a drained section, the
AioContext needs to be drained first, the parents second and only then
the children. The correct order for the end of a drained section is the
opposite.
This patch changes the three other functions to follow the example of
bdrv_drained_begin(), which is the only one that got it right.
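As a sketch of that ordering (hypothetical helper names, not the real block/io.c functions):

typedef struct Node Node;

/* Hypothetical helpers standing in for the real operations. */
void aio_context_disable_external(Node *bs);
void aio_context_enable_external(Node *bs);
void parents_drained_begin(Node *bs);
void parents_drained_end(Node *bs);
void children_drained_begin(Node *bs);
void children_drained_end(Node *bs);

static void drained_begin_order(Node *bs)
{
    aio_context_disable_external(bs);  /* AioContext first */
    parents_drained_begin(bs);         /* parents second   */
    children_drained_begin(bs);        /* children last    */
}

static void drained_end_order(Node *bs)
{
    children_drained_end(bs);          /* strictly the reverse at the end */
    parents_drained_end(bs);
    aio_context_enable_external(bs);
}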
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
The device is drained, so there is no point in waiting for requests at
the end of the drained section. Remove the bdrv_drain_recurse() calls
there.
The bdrv_drain_recurse() calls were introduced in commit 481cad48e5
in order to call the .bdrv_co_drain_end() driver callback. This is now
done by a separate bdrv_drain_invoke() call.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Now that the bdrv_drain_invoke() calls are pulled up to the callers of
bdrv_drain_recurse(), the 'begin' parameter isn't needed any more.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
This adds a test case checking that the BlockDriver callbacks for drain
are called in bdrv_drain_all_begin/end(), and that both of them are
called exactly once.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
bdrv_drain_all_begin() used to call the .bdrv_co_drain_begin() driver
callback inside its polling loop. This means that how many times it got
called for each node depended on how long it had to poll the event loop.
This is obviously not right and results in nodes that stay drained even
after bdrv_drain_all_end(), which calls .bdrv_co_drain_end() once per
node.
Fix bdrv_drain_all_begin() to call the callback only once, too.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
This change separates bdrv_drain_invoke(), which calls the BlockDriver
drain callbacks, from bdrv_drain_recurse(). Instead, the function
performs its own recursion now.
One reason for this is that bdrv_drain_recurse() can be called multiple
times by bdrv_drain_all_begin(), but the callbacks may only be called
once. The separation is necessary to fix this bug.
The other reason is that we intend to go to a model where we call all
driver callbacks first, and only then start polling. This is not fully
achieved yet with this patch, as bdrv_drain_invoke() contains a
BDRV_POLL_WHILE() loop for the block driver callbacks, which can still
call callbacks for any unrelated event. It's a step in this direction
anyway.
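Roughly, the separated callback invocation looks like this sketch (hypothetical types, not the real bdrv_drain_invoke()):

#include <stdbool.h>

typedef struct Node Node;
struct Node {
    void (*drv_co_drain_begin)(Node *bs);
    void (*drv_co_drain_end)(Node *bs);
    Node **children;
    int nb_children;
};

/* Recurses on its own, so each driver callback runs exactly once per
 * node, no matter how often the caller polls afterwards. */
static void drain_invoke(Node *bs, bool begin)
{
    if (begin && bs->drv_co_drain_begin) {
        bs->drv_co_drain_begin(bs);
    } else if (!begin && bs->drv_co_drain_end) {
        bs->drv_co_drain_end(bs);
    }
    for (int i = 0; i < bs->nb_children; i++) {
        drain_invoke(bs->children[i], begin);
    }
}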
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
VPC has some difficulty creating geometries of particular size.
However, we can indeed force it to use a literal one, so let's
do that for the sake of test 197, which is testing some specific
offsets.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Lukáš Doktor <ldoktor@redhat.com>
Commit 1f4ad7d fixed 'qemu-img info' for raw images that are currently
in use as a mirror target. It is not enough for image formats, though,
as these still unconditionally request BLK_PERM_CONSISTENT_READ.
As this permission is geared towards whether the guest-visible data is
consistent, and has no impact on whether the metadata is sane, and
'qemu-img info' does not read guest-visible data (except for the raw
format), it makes sense to not require BLK_PERM_CONSISTENT_READ if there
is not going to be any guest I/O performed, regardless of image format.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Rather than unsupported situations, some VM_PANIC calls actually
are caused by internal errors. Convert them to just abort.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch refactors the event-injection code for hvf by using the
appropriate fields already provided by CPUX86State. At vmexit, it fills
these fields so that hvf_inject_interrupts can just retrieve them without
calling into hvf.
Signed-off-by: Sergio Andres Gomez Del Real <Sergio.G.DelReal@gmail.com>
Message-Id: <20170913090522.4022-14-Sergio.G.DelReal@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch replaces the license header for those files that were either
GPL v2-or-v3, or GPL v2-only; the replacing license is GPL v2-or-later.
The code for task switching/handling, which is derived from KVM and
hence is GPL v2-only, is isolated in the new files (with this license)
x86_task.c/.h, and the corresponding compilation rule is added to
target/i386/hvf-utils/Makefile.objs.
Signed-off-by: Sergio Andres Gomez Del Real <Sergio.G.DelReal@gmail.com>
Message-Id: <20170913090522.4022-4-Sergio.G.DelReal@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This file begins tracking the files that will be the code base for HVF
support in QEMU. This code base is part of Google's QEMU version of
their Android emulator, and can be found at
https://android.googlesource.com/platform/external/qemu/+/emu-master-dev
This code is based on Veertu Inc's vdhh (Veertu Desktop Hosted
Hypervisor), found at https://github.com/veertuinc/vdhh. Everything is
appropriately licensed under GPL v2-or-later, except for the code inside
x86_task.c and x86_task.h, which, deriving from KVM (the Linux kernel),
is licensed GPL v2-only.
This code base already implements a great deal of functionality,
although Google's version removed Veertu's support for the APIC page and
Hyper-V-related features. According to the Android Emulator Release
Notes, Revision 26.1.3 (August 2017), "Hypervisor.framework is now
enabled by default on macOS for 32-bit x86 images to improve performance
and macOS compatibility", although we had better use it with caution
because, as the same revision warns, "If you experience issues with it
specifically, please file a bug report...". The code hasn't seen many
updates in the last 5 months, so I think we can develop it further,
occasionally visiting Google's repository to see if there have been any
updates.
On top of Google's code, the following changes were made:
- add code to the configure script to support the --enable-hvf argument.
If the OS is Darwin, it checks for presence of HVF in the system. The
patch also adds strings related to HVF in the file qemu-options.hx.
QEMU will only support the modern syntax style '-M accel=hvf' to enable
hvf; the legacy '-enable-hvf' will not be supported.
- fix styling issues
- add glue code to cpus.c
- move the HVFX86EmulatorState field to CPUX86State, changing the
emulation functions to have a parameter with signature 'CPUX86State *'
instead of 'CPUState *' so we don't have to get the 'env'.
Signed-off-by: Sergio Andres Gomez Del Real <Sergio.G.DelReal@gmail.com>
Message-Id: <20170913090522.4022-2-Sergio.G.DelReal@gmail.com>
Message-Id: <20170913090522.4022-3-Sergio.G.DelReal@gmail.com>
Message-Id: <20170913090522.4022-5-Sergio.G.DelReal@gmail.com>
Message-Id: <20170913090522.4022-6-Sergio.G.DelReal@gmail.com>
Message-Id: <20170905035457.3753-7-Sergio.G.DelReal@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move the definition of TPMSizedBuffer out of tpm_tis.c into tpm_util.h
and implement tpm_sized_buffer_reset() for the following patches to use.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
We can now merge the r_offset and w_offset into a single rw_offset.
This is possible since, while the offset is used for writing in the
RECEPTION state, reads are ignored. Conversely, while the offset is
used for reading in the COMPLETION state, writes are ignored.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Now that we have a single buffer, we also only need a single set of
read/write offsets into that buffer. This works since only one
locality can be active.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
One read buffer and one write buffer are sufficient for all localities.
The localities cannot all be active at the same time, and only the active
locality can use the r/w buffers. An inactive locality first requires the
COMMAND_READY flag to be set in the STS register to move to the READY
state, which then allows it to use the buffer for writing a command,
while all other localities are inactive.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Remove usage of TPMSizedBuffer. The size of the buffers is now limited
by s->be_buffer_size, which is the size of the buffer the TIS has
negotiated with the backend.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
This is a preparatory patch for the subsequent ones where we
get rid of the flexibility of supporting any kind of buffer size
that the backend may support. We keep the size at 4096, which is
also the size the external emulator supports. So, limit the size
of the buffer we can support and pass it back to the backend.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Add a caching layer for the TPM established flag so that we don't
need to go to the emulator every time the flag is read by accessing
the REG_ACCESS register.
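A minimal sketch of such a cache (field and function names are made up; the real code lives in the TIS and backend files):

#include <stdbool.h>

typedef struct {
    bool established;          /* cached flag value               */
    bool established_cached;   /* has the value been fetched yet? */
} TISCacheSketch;

bool backend_get_tpm_established(void);   /* expensive call-out */

static bool tis_get_established(TISCacheSketch *s)
{
    if (!s->established_cached) {
        s->established = backend_get_tpm_established();
        s->established_cached = true;
    }
    return s->established;   /* later REG_ACCESS reads hit the cache */
}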
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The information on how to update the deprecated parameters was too
scarce, so some people have not updated to the new syntax yet. Provide
some more information to make it clear how to move from the old syntax
to the new one.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Looks like we forgot to document that it is also possible to specify
a netdev with "-net nic" - which is very useful if you want to
configure your on-board NIC to use a backend that has been specified
with "-netdev".
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
It has never been documented, so hardly anybody knows about this
parameter, and it has been marked as deprecated since QEMU v2.6.
Time to let it go now.
Reviewed-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Now that all of the callers have been converted to compute the multicast
index inline using the new net CRC functions, this function can be dropped.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This makes it much easier to compare the multicast CRC calculation endian and
bitshift against the Linux driver implementation.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This makes it much easier to compare the multicast CRC calculation endian and
bitshift against the Linux driver implementation.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This makes it much easier to compare the multicast CRC calculation endian and
bitshift against the Linux driver implementation.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This makes it much easier to compare the multicast CRC calculation endian and
bitshift against the Linux driver implementation.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This makes it much easier to compare the multicast CRC calculation endian and
bitshift against the Linux driver implementation.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Jason Wang <jasowang@redhat.com>
This makes it much easier to compare the multicast CRC calculation endian and
bitshift against the Linux driver implementation.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: Jason Wang <jasowang@redhat.com>
From the Linux sungem driver, we know that the multicast filter CRC is
implemented using ether_crc_le() which isn't the same as calling zlib's
crc32() function (the zlib implementation requires a complemented initial value
and also returns the complemented result).
Fix the multicast filter by simply using the new net_crc32_le() function.
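For reference, a self-contained little-endian Ethernet CRC along these lines (a sketch equivalent in spirit to the new helper; note how it differs from zlib's crc32(), which complements both the initial value and the result):

#include <stdint.h>

static uint32_t crc32_le_sketch(const uint8_t *p, int len)
{
    uint32_t crc = 0xffffffff;          /* plain initial value */

    for (int i = 0; i < len; i++) {
        crc ^= p[i];
        for (int bit = 0; bit < 8; bit++) {
            crc = (crc & 1) ? (crc >> 1) ^ 0xedb88320 : crc >> 1;
        }
    }
    return crc;   /* NOT complemented, unlike zlib's crc32() */
}

A NIC model then takes a few high or low bits of this value as the multicast hash index.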
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Instead of sunhme_crc32_le() using its own implementation, we can simply call
net_crc32_le() directly and apply the bit shift inline.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Instead of e100_compute_mcast_idx() using its own implementation, we can
simply call net_crc32() directly and apply the bit shift inline.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Instead of lnc_mchash() using its own implementation, we can simply call
net_crc32_le() directly and apply the bit shift inline.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Separate out the standard ethernet CRC32 calculation into a new net_crc32()
function, renaming the constant POLYNOMIAL to POLYNOMIAL_BE to make it clear
that this is a big-endian CRC32 calculation.
As part of the constant rename, remove the duplicate definition of POLYNOMIAL
from eepro100.c and use the new POLYNOMIAL_BE constant instead.
Once this is complete remove the existing CRC32 implementation from
compute_mcast_idx() and call the new net_crc32() function in its place.
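For comparison, the big-endian variant modeled on the long-standing compute_mcast_idx() loop looks roughly like this (a sketch; the real helper and the POLYNOMIAL_BE constant live in the net code):

#include <stdint.h>

static uint32_t crc32_be_sketch(const uint8_t *p, int len)
{
    uint32_t crc = 0xffffffff;

    for (int i = 0; i < len; i++) {
        uint8_t b = p[i];
        for (int j = 0; j < 8; j++) {
            int carry = ((crc & 0x80000000UL) ? 1 : 0) ^ (b & 0x01);
            crc <<= 1;
            b >>= 1;
            if (carry) {
                crc = (crc ^ 0x04c11db7) | carry;   /* POLYNOMIAL_BE */
            }
        }
    }
    return crc;   /* callers derive the multicast index from the top bits */
}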
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The device is supposed to maintain two distinct contexts for transmit
offloads: one has parameters for both segmentation and checksum
offload, the other only for checksum offload. The guest driver can
send two context descriptors, one for each context (the TSE flag
specifies which). Then the guest can refer to one or the other context
in subsequent transmit data descriptors, depending on what offloads it
wants applied to each packet.
Currently the e1000 device stores just one context, and misinterprets
the TSE flags in the context and data descriptors. This is often okay:
Linux happens to send a fresh context descriptor before every data
descriptor, so forgetting the other context doesn't matter. Windows
does rely on separate contexts for TSO vs. non-TSO packets, but for
mostly-TCP traffic the two contexts have identical TCP-specific
offload parameters so confusing them doesn't matter.
One case where this confusion matters is when a Windows guest sets up
a TSO context for TCP and a non-TSO context for UDP, and then
transmits both TCP and UDP traffic in parallel. The e1000 device
sometimes ends up using TCP-specific parameters while doing checksum
offload on a UDP datagram: it writes the checksum to offset 16 (the
correct location for a TCP checksum), stomping on two bytes of UDP
data, and leaving the wrong value in the actual UDP checksum field at
offset 6. (Even worse, the host network stack may then recompute the
UDP checksum, "correcting" it to match the corrupt data before sending
it out a physical interface.)
Correct this by tracking the TSO context independently of the non-TSO
context, and selecting the appropriate context based on the TSE flag
in each transmit data descriptor.
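A rough sketch of what that amounts to (hypothetical names; the real structures are in the e1000 emulation):

#include <stdbool.h>

typedef struct {
    /* segmentation and checksum offload parameters from a context
     * descriptor (MSS, header lengths, checksum offsets, ...) */
    int placeholder;
} OffloadContextSketch;

typedef struct {
    OffloadContextSketch tso_ctx;      /* programmed with TSE set   */
    OffloadContextSketch plain_ctx;    /* programmed with TSE clear */
} TxSketch;

/* The TSE flag of each transmit *data* descriptor picks which of the
 * two previously programmed contexts applies to that packet. */
static const OffloadContextSketch *ctx_for_packet(const TxSketch *tx, bool tse)
{
    return tse ? &tx->tso_ctx : &tx->plain_ctx;
}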
Signed-off-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
sum_needed and cptse flags are received from the guest within each
transmit data descriptor. They are not part of the offload context;
instead, they determine how to apply a previously received context to
the packet being transmitted:
- If cptse is set, perform both segmentation and checksum offload
using the parameters in the TSO context; otherwise just do checksum
offload. (Currently the e1000 device incorrectly stores only one
context, which will be fixed in a subsequent patch.)
- Depending on the bits set in sum_needed, possibly perform L4
checksum offload and/or IP checksum offload, using the parameters in
the appropriate context.
Move these flags out of struct e1000x_txd_props, which is otherwise
dedicated to storing values from a context descriptor, and into the
per-packet TX struct.
Signed-off-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
PIIX4 errata says that "immediate polling of the Host Status Register BUSY
bit may indicate that the SMBus is NOT busy."
Due to this, some code does the following steps:
(a) set parameters
(b) start command
(c) check for smbus busy bit set (to know that command started)
(d) check for smbus busy bit not set (to know that command finished)
Let (c) happen by immediately setting the busy bit, and really execute
the command only once the status register has been read.
This fixes a problem with AMIBIOS, which can now properly initialize the PIIX4.
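A simplified sketch of the new behaviour (hypothetical names and bit layout; the real code is in the PIIX4 SMBus model):

#include <stdbool.h>
#include <stdint.h>

#define STS_HOST_BUSY 0x01   /* assumed bit position for this sketch */

typedef struct {
    uint8_t smb_stat;
    bool op_done;
} SMBusSketch;

void smbus_execute_command(SMBusSketch *s);   /* does the real transfer */

static void smbus_start_command(SMBusSketch *s)
{
    /* Step (b): report busy immediately so step (c) can see it. */
    s->smb_stat |= STS_HOST_BUSY;
    s->op_done = false;
}

static uint8_t smbus_read_status(SMBusSketch *s)
{
    uint8_t val = s->smb_stat;

    if (!s->op_done) {
        /* First status read: run the command now, so the next read
         * (step (d)) sees the busy bit cleared. */
        smbus_execute_command(s);
        s->smb_stat &= ~STS_HOST_BUSY;
        s->op_done = true;
    }
    return val;
}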
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
If the script is run with a core (no running process), it produces an
error:
(gdb) dump-guest-memory /tmp/vmcore X86_64
guest RAM blocks:
target_start target_end host_addr message count
---------------- ---------------- ---------------- ------- -----
0000000000000000 00000000000a0000 00007f7935800000 added 1
00000000000a0000 00000000000b0000 00007f7934200000 added 2
00000000000c0000 00000000000ca000 00007f79358c0000 added 3
00000000000ca000 00000000000cd000 00007f79358ca000 joined 3
00000000000cd000 00000000000e8000 00007f79358cd000 joined 3
00000000000e8000 00000000000f0000 00007f79358e8000 joined 3
00000000000f0000 0000000000100000 00007f79358f0000 joined 3
0000000000100000 0000000080000000 00007f7935900000 joined 3
00000000fd000000 00000000fe000000 00007f7934200000 added 4
00000000fffc0000 0000000100000000 00007f7935600000 added 5
Python Exception <class 'gdb.error'> You can't do that without a process to debug.:
Error occurred in Python command: You can't do that without a process
to debug.
Replace the object_resolve_path_type() function call with a
local volatile variable.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The statement being removed doesn't change anything as virtio PCI devices already
have Subsystem Vendor ID set to pci_default_sub_vendor_id (0x1af4), same as Vendor
ID. And the Virtio spec does not require the two to be equal, either:
"The PCI Subsystem Vendor ID and the PCI Subsystem Device ID MAY reflect the PCI
Vendor and Device ID of the environment (for informational purposes by the driver)."
Background:
Following the recent virtio-win licensing change, several vendors are planning to
ship their own certified version of Windows guest Virtio drivers, potentially taking
advantage of Windows Update as a distribution channel. It is therefore critical that
each vendor uses their own PCI Subsystem Vendor ID for Virtio devices to prevent
drivers from other vendors binding to it.
This would be trivially done by adding:
k->subsystem_vendor_id = ...
to virtio_pci_class_init(). Except for the problematic statement deleted by this
patch, which reverts the Subsystem Vendor ID back to 0x1af4 for legacy devices for
no good reason.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
The vhost-user protocol specification does not define "guest address"
and "user address". It does not explain how to access memory given such
addresses.
This patch explains how memory access works, including the IOTLB.
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Maxime Coquelin <maxime.coquelin@redhat.com>
Cc: Wei Wang <wei.w.wang@intel.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Acked-by: Maxime Coquelin <maxime.coquelin@redhat.com>
When compiled with anything other than the 'log' trace backend, we have:
error: implicit declaration of function 'qemu_log_mask'
error: 'LOG_UNIMP' undeclared (first use in this function)
This patch adds the missing include.
Fixes: 7299e1a411
("hw/i386/vmport: replace fprintf() by trace events or LOG_UNIMP")
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-id: 20171221211103.30311-1-laurent@vivier.eu
[PMM: fixed commit message description of when problem occurs]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We can output a character quite easily here with some few lines of
assembly that we provide as a mini-kernel for this board.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1512031988-32490-4-git-send-email-thuth@redhat.com>
[lv: add boot-serial-test in check-qtest-m68k]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
The first call of set_cc_op() in a new translation sequence
is done with old_op set to CC_OP_DYNAMIC (-1).
This will do an out of bound access to the array cc_op_live[].
We fix that by adding an entry in cc_op_live[] for CC_OP_DYNAMIC.
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20171221160558.14151-1-laurent@vivier.eu>
It makes the code clearer to separate the bus implementation
from the devices one.
Replace ADB_DPRINTF() with trace events (and adding new ones in adb-kbd.c).
Some minor changes to make checkpatch.pl happy.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Message-Id: <20171220121406.24056-1-laurent@vivier.eu>
* NBD and chardev conversion to QIONetListener (Daniel)
* MTTCG fixes (David)
* Hyper-V fixes (Roman, Evgeny)
* share-rw option (Fam)
* Mux chardev event bugfix (Marc-André)
* Add systemd unit files in contrib/ (me)
* SCSI and block/iscsi.c bugfixes (me, Peter L.)
* unassigned_mem_ops fixes (Peter M.)
* VEX decoding fix (Peter M.)
* "info pic" and "info irq" improvements (Peter Xu)
* vmport trace events (Philippe)
* Braille chardev bugfix (Samuel)
* Compiler warnings fix (Stefan)
* initial support for TCG smoke test of more boards (Thomas)
* New CPU features (Yang)
* Reduce startup memory usage (Yang)
* QemuThread race fix (linhecheng)
# gpg: Signature made Thu 21 Dec 2017 08:30:49 GMT
# gpg: using RSA key 0xBFFBD25F78C7AE83
# gpg: Good signature from "Paolo Bonzini <bonzini@gnu.org>"
# gpg: aka "Paolo Bonzini <pbonzini@redhat.com>"
# Primary key fingerprint: 46F5 9FBD 57D6 12E7 BFD4 E2F7 7E15 100C CD36 69B1
# Subkey fingerprint: F133 3857 4B66 2389 866C 7682 BFFB D25F 78C7 AE83
* remotes/bonzini/tags/for-upstream: (41 commits)
chardev: convert the socket server to QIONetListener
blockdev: convert qemu-nbd server to QIONetListener
blockdev: convert internal NBD server to QIONetListener
test: add some chardev mux event tests
chardev: fix backend events regression with mux chardev
rcu: reduce more than 7MB heap memory by malloc_trim()
checkpatch: volatile with a comment or sig_atomic_t is okay
i8259: move TYPE_INTERRUPT_STATS_PROVIDER upper
kvm-i8259: support "info pic" and "info irq"
i8259: generalize statistics into common code
i8259: use DEBUG_IRQ_COUNT always
i8259: convert DPRINTFs into trace
Remove legacy -no-kvm-pit option
scsi: replace hex constants with #defines
scsi: provide general-purpose functions to manage sense data
hw/i386/vmport: replace fprintf() by trace events or LOG_UNIMP
hw/mips/boston: Remove workaround for writes to ROM aborting
exec: Don't reuse unassigned_mem_ops for io_mem_rom
block/iscsi: only report an iSCSI Failure if we don't handle it gracefully
block/iscsi: dont leave allocmap in an invalid state on UNMAP failure
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Instead of creating a QIOChannelSocket directly for the chardev
server socket, use a QIONetListener. This provides the ability
to listen on multiple sockets at the same time, so enables
full support for IPv4/IPv6 dual stack.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20171218135417.28301-2-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of creating a QIOChannelSocket directly for the NBD
server socket, use a QIONetListener. This provides the ability
to listen on multiple sockets at the same time, so enables
full support for IPv4/IPv6 dual stack. This also means we can
honour multiple FDs received during socket activation.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20171218101643.20360-3-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Instead of creating a QIOChannelSocket directly for the NBD
server socket, use a QIONetListener. This provides the ability
to listen on multiple sockets at the same time, so enables
full support for IPv4/IPv6 dual stack.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20171218101643.20360-2-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Check the expected behaviour of qemu_chr_be_event() on a mux chardev.
For some reason, sending the event on the base chardev broadcasts it to
all frontends, while sending it on the mux chardev itself should
trigger the event only on the currently focused chardev frontend.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20171103152824.21948-3-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Kirill noticed that on recent versions of QEMU he was not able to
trigger SysRq to invoke the debug capabilities of the Linux kernel. He
tracked it down to qemu_chr_be_event() ignoring CHR_EVENT_BREAK due to
s->be being NULL. The bug was introduced in 2.8, commit a4afa548fc
("char: move front end handlers in CharBackend"). Since that commit,
qemu_chr_be_event() has failed to deliver CHR_EVENT_BREAK because
qemu_chr_fe_init() does not set s->be in the mux case.
Let's fix this by teaching mux to send an event to the frontend with
the focus.
Reported-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Fixes: a4afa548fc ("char: move front end handlers in CharBackend")
Message-Id: <20171103152824.21948-2-marcandre.lureau@redhat.com>
Tested-by: Kirill A. Shutemov <kirill.shutemov@linux.intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
glibc's memory alloc/free mechanism has issues with small chunks: if
QEMU frequently allocates and frees small chunks of memory, glibc does
not serve them from its free list but keeps allocating from the OS,
which makes the heap grow bigger and bigger.
This patch introduces malloc_trim(), which frees heap memory back to the
OS when there is no RCU callback to run during an iteration of the RCU
thread loop. malloc_trim() can be enabled/disabled with
--enable-malloc-trim/--disable-malloc-trim on the QEMU configure command
line, and is enabled by default when building against glibc.
Below are test results from smaps file.
(1)without patch
55f0783e1000-55f07992a000 rw-p 00000000 00:00 0 [heap]
Size: 21796 kB
Rss: 14260 kB
Pss: 14260 kB
(2)with patch
55cc5fadf000-55cc61008000 rw-p 00000000 00:00 0 [heap]
Size: 21668 kB
Rss: 6940 kB
Pss: 6940 kB
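A minimal sketch of the mechanism described above (the CONFIG_MALLOC_TRIM macro name and the pad value are assumptions for this sketch; the exact placement in the RCU thread loop is simplified):

#if defined(CONFIG_MALLOC_TRIM)   /* assumed macro set by --enable-malloc-trim */
#include <malloc.h>
#endif

static void rcu_thread_iteration_sketch(int pending_callbacks)
{
    if (pending_callbacks == 0) {
#if defined(CONFIG_MALLOC_TRIM)
        /* No RCU work this round: return free heap pages to the OS.
         * The pad argument here is illustrative. */
        malloc_trim(4 * 1024 * 1024);
#endif
    } else {
        /* ... run the queued call_rcu() callbacks ... */
    }
}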
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <1513775806-19779-1-git-send-email-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This assumes that the comment gives some justification;
"volatile sig_atomic_t" is also self-explanatory and usually
correct.
Discussed in:
'[Qemu-devel] [PATCH] dump-guest-memory.py: fix "You can't do that without a process to debug"'
Suggested-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20171215181810.4122-1-marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Let's leverage the i8259 common code for kvm-i8259 too.
I think it's still possible that stats can be lost when the i8259 is in
the kernel and irqfd is in use, e.g. by vfio or vhost devices.
However, that should be rare IMHO, since such users should mostly be
using MSIs if they really want performance (that's why people use vhost
and device assignment), and no legacy INTx should be involved. As long
as the INTx users are emulated in QEMU, the stats will be correct.
For "info pic", it should be always accurate since we fetch kvm regs
before dump.
More importantly, it's just too simple to do this now - it's only 10+
LOC to gain this feature.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20171210063819.14892-5-peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It's not really scary to just enable it forever. After all, it's the
i8259, and not even the in-kernel one.
We can then remove quite a few lines to make the code cleaner, and "info
irq" will always work for it.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20171210063819.14892-3-peterx@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Sense keys have nice #defines in scsi/constants.h, use them.
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Extract the common parts of scsi_sense_buf_to_errno, scsi_convert_sense
and scsi_target_send_command's REQUEST SENSE handling into two new
functions scsi_parse_sense_buf and scsi_build_sense_buf.
Fix a bug in scsi_target_send_command along the way; the length was
written in buf[10] rather than buf[7].
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Fixes: b07fbce634 ("scsi-bus: correct responses for INQUIRY and REQUEST SENSE")
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We set up the io_mem_rom special memory region using the
unassigned_mem_ops structure; this is then used when a guest tries to
write to ROM. This is incorrect, because the behaviour of unassigned
memory may be different from that of ROM for writes. In particular,
on some architectures writing to unassigned memory generates a guest
exception, whereas writing to ROM is generally ignored. Use a
special readonly_mem_ops for this purpose instead, so writes to
ROM are ignored for all guest CPUs.
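A minimal sketch of a write-ignoring MemoryRegionOps used for this purpose (hedged: the real readonly_mem_ops in exec.c carries a few more fields):

#include "exec/memory.h"

/* Writes to ROM become a silent no-op instead of an unassigned access. */
static void readonly_mem_write_sketch(void *opaque, hwaddr addr,
                                      uint64_t data, unsigned size)
{
}

static const MemoryRegionOps readonly_mem_ops_sketch = {
    .write = readonly_mem_write_sketch,
    .endianness = DEVICE_NATIVE_ENDIAN,
};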
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1513187549-2435-2-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We currently report an "iSCSI Failure" in iscsi_co_generic_cb if the task
hasn't completed with SCSI_STATUS_GOOD. However, we expect a failure in
some cases and handle it gracefully. This is the case for misaligned UNMAPs
and WRITESAME10/16 calls without UNMAP. In these cases a failure in the
logs can be quite misleading.
While we are at it improve the logging to reveal which operation failed
at what LBA.
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <1512733868-9009-3-git-send-email-pl@kamp.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Normally we create an address space for that CPU and pass that address
space into the function. Let's just do it inside to unify address space
creations. It'll simplify my next patch to rename those address spaces.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <20171123092333.16085-3-peterx@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The moxiesim machine already defines a memory region for a firmware,
but does not provide the possibility to load an image via "-bios" yet.
This will be needed for the boot-serial tester, so let's add support
for "-bios" here now.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1512031988-32490-6-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QEMU only ships with some few firmware images, i.e. we can currently run
the boot-serial test only on a very limited set of machines. But writing
some characters to the default UART of a machine can often be done with
some few lines of assembly, so we add the possibility to the boot-serial
tester to use its own mini-kernels or mini-firmwares. We write such images
then into a file that we can load with the "-kernel" or "-bios" parameter
when we launch QEMU.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1512031988-32490-3-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If the guest continuously writes characters to the UART, we never leave
the inner while loop and thus never check whether we've reached the
timeout value. So if we fail to find the expected string in the UART
output, the test just hangs and never finishes. Use a counter to regularly
break out of the while loop to check the timeout.
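The change boils down to bounding the inner read loop so the timeout check is reached regularly, along these lines (a sketch; names differ from the real test code):

#include <stdbool.h>

bool read_byte_and_match(int fd, const char *want);   /* hypothetical helper */

static bool wait_for_string_sketch(int fd, const char *want, int timeout_loops)
{
    for (int elapsed = 0; elapsed < timeout_loops; elapsed++) {
        /* Read at most a bounded number of bytes per iteration instead
         * of looping for as long as the guest keeps producing output. */
        for (int i = 0; i < 512; i++) {
            if (read_byte_and_match(fd, want)) {
                return true;
            }
        }
    }
    return false;   /* timeout: the expected string never appeared */
}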
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1512031988-32490-2-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
In commit e3af7c788b we
replaced direct calls to cpu_ld*_code() with calls
to the x86_ld*_code() wrappers which incorporate an
advance of s->pc. Unfortunately we didn't notice that
in one place the old code was deliberately not incrementing
s->pc:
@@ -4501,7 +4528,7 @@ static target_ulong disas_insn(DisasContext *s, CPUState *cpu)
static const int pp_prefix[4] = {
0, PREFIX_DATA, PREFIX_REPZ, PREFIX_REPNZ
};
- int vex3, vex2 = cpu_ldub_code(env, s->pc);
+ int vex3, vex2 = x86_ldub_code(env, s);
if (!CODE64(s) && (vex2 & 0xc0) != 0xc0) {
/* 4.1.4.6: In 32-bit mode, bits [7:6] must be 11b,
This meant we were mishandling this set of instructions.
Remove the manual advance of s->pc for the "is VEX" case
(which is now done by x86_ldub_code()) and instead rewind
PC in the case where we decide that this isn't really VEX.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Cc: qemu-stable@nongnu.org
Reported-by: Alexandro Sanchez Bach <alexandro@phi.nz>
Message-Id: <1513163959-17545-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When listening on unix/tcp sockets there was optional code that would update
the original SocketAddress struct with the info about the actual address that
was listened on. Since the conversion of everything to QIOChannelSocket, no
remaining caller made use of this feature. It has been replaced with the ability
to query the listen address after the fact using the function
qio_channel_socket_get_local_address. This is a better model when the input
address can result in listening on multiple distinct sockets.
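A caller that still needs the actual bound address can fetch it afterwards, roughly like this (a sketch; error handling is minimal):

#include "io/channel-socket.h"
#include "qapi/error.h"

static SocketAddress *bound_address_sketch(QIOChannelSocket *listen_ioc)
{
    Error *err = NULL;
    SocketAddress *addr;

    /* Ask the channel for the address it actually listens on. */
    addr = qio_channel_socket_get_local_address(listen_ioc, &err);
    if (!addr) {
        error_free(err);
    }
    return addr;   /* caller frees it with qapi_free_SocketAddress() */
}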
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20171212111219.32601-1-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These gcc warnings are fixed:
target/i386/translate.c:4461:12: warning:
variable 'prefixes' might be clobbered by 'longjmp' or 'vfork' [-Wclobbered]
target/i386/translate.c:4466:9: warning:
variable 'rex_w' might be clobbered by 'longjmp' or 'vfork' [-Wclobbered]
target/i386/translate.c:4466:16: warning:
variable 'rex_r' might be clobbered by 'longjmp' or 'vfork' [-Wclobbered]
Tested with x86_64-w64-mingw32-gcc from Debian stretch.
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-Id: <20171113064845.29142-1-sw@weilnetz.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The conditional memory barrier not only looks strange but actually is
wrong.
On s390x, I can reproduce interrupts via cpu_interrupt() not leading to
a proper kick out of emulation every now and then. cpu_interrupt() is
especially used for inter CPU communication via SIGP (esp. external
calls and emergency interrupts).
With this patch, I was not able to reproduce. (esp. no stalls or hangs
in the guest).
My setup is s390x MTTCG with 16 VCPUs on 8 CPU host, running make -j16.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171129191319.11483-1-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
pause_all_cpus() is sometimes called from a VCPU thread (e.g. s390x
during special reset). It cannot deal with multiple VCPUs per Thread
(single threaded TCG) yet.
Booting an s390x guest with -smp 2 and single threaded TCG from disk
currently fails. The DIAG 308 will issue a pause_all_cpus() and wait
forever for the CPUs to actually stop. But it is waiting for itself.
So let's stop all VCPUs belonging to the current thread. Factor out
stopping of a VCPU.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171129191215.11323-1-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The value of HV_X64_MSR_SVERSION is initialized once at vcpu init, and
is reset to zero on vcpu reset, which is wrong.
It is supposed to be a constant, so drop the field from X86CPU, set the
msr with the constant value, and don't bother getting it.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20171122181418.14180-4-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Hyper-V has a notion of partition-wide MSRs. Those MSRs are read and
written as usual on each VCPU, however the hypervisor maintains a single
global value for all VCPUs. Thus writing such an MSR from any single
VCPU affects the global value that is read by all other VCPUs.
This leads to an issue during VCPU hotplug: the zero-initialized values
of those MSRs get synced into KVM and override the global values that
have already been set by the guest.
This change makes the partition-wide MSRs only be synchronized on the
first vcpu.
Signed-off-by: Evgeny Yakovlev <eyakovlev@virtuozzo.com>
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20171122181418.14180-2-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The Intel IceLake CPU adds the new CPU features AVX512_VBMI2/GFNI/
VAES/VPCLMULQDQ/AVX512_VNNI/AVX512_BITALG. Those new CPU features
need to be exposed to the guest VM.
The bit definitions:
CPUID.(EAX=7,ECX=0):ECX[bit 06] AVX512_VBMI2
CPUID.(EAX=7,ECX=0):ECX[bit 08] GFNI
CPUID.(EAX=7,ECX=0):ECX[bit 09] VAES
CPUID.(EAX=7,ECX=0):ECX[bit 10] VPCLMULQDQ
CPUID.(EAX=7,ECX=0):ECX[bit 11] AVX512_VNNI
CPUID.(EAX=7,ECX=0):ECX[bit 12] AVX512_BITALG
The release document is available at the link below:
https://software.intel.com/sites/default/files/managed/c5/15/\
architecture-instruction-set-extensions-programming-reference.pdf
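For illustration, the bit positions above expressed as C masks (these macro names are made up for the sketch; QEMU defines its own feature-word constants):

/* Sketch only: CPUID.(EAX=7,ECX=0):ECX bits listed above. */
#define SKETCH_7_0_ECX_AVX512_VBMI2   (1u << 6)
#define SKETCH_7_0_ECX_GFNI           (1u << 8)
#define SKETCH_7_0_ECX_VAES           (1u << 9)
#define SKETCH_7_0_ECX_VPCLMULQDQ     (1u << 10)
#define SKETCH_7_0_ECX_AVX512_VNNI    (1u << 11)
#define SKETCH_7_0_ECX_AVX512_BITALG  (1u << 12)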
Signed-off-by: Yang Zhong <yang.zhong@intel.com>
Message-Id: <1511335676-20797-1-git-send-email-yang.zhong@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
scsi-block doesn't use the DEFINE_BLOCK_PROPERTIES() macro, so it didn't
gain the share-rw option back when it was added to all other storage
devices. This option is meaningful here, and needs to be used when
attaching shared storage to a guest.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20171205071928.30242-1-famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Previously virtio-net was only tested for ppc64 in "slow" mode. That
doesn't make much sense, since virtio-net is used much more often in
practice than the spapr-vlan device, which was always tested. So, move
virtio-net to always be tested on ppc64.
We had no tests at all for the q35 machine, which doesn't seem wise
given its increasing prominence. Add a couple of tests for it,
including testing the newer e1000e adapter.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This adds IPv6 net boot testing (in addition to IPv4) when in slow test
mode on ppc64 or s390. IPv6 PXE doesn't seem to work on x86, I'm guessing
our BIOS image doesn't support it.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently pxe-tests open codes the list of tests for each architecture.
This changes it to use tables of test parameters, somewhat similar to
boot-serial-test.
This adds the machine type into the table as well, giving us the ability
to perform tests on multiple machine types for architectures where there's
more than one machine type that matters.
NOTE: This changes the names of the tests in the output, to include the
machine type and IPv4 vs. IPv6. I'm not sure if this has the
potential to break existing tooling.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
All of the x86 and some of the other test cases here use a common test
function, test_pxe_ipv4(), but one ppc and one s390 test use different
functions.
In the s390 case, this is completely pointless: the right parameter to
test_pxe_ipv4() will already do exactly the right thing. For the
spapr-vlan case there's a slight difference - it will use IPv6 instead of
IPv4.
But testing just one case with IPv6 (and NOT IPv4) is rather haphazard.
Change everything to use the common test function, until we have a better
way of testing IPv6 across the board.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This lets distros standardize on how QEMU should install systemd
services for qemu-ga and qemu-pr-helper.
The qemu-ga unit file comes from Fedora, but I checked that
Debian is using the same path for the virtio-serial port.
I would like to include this in 2.11, so that the qemu-pr-helper
socket can be standardized across distros. Note however that
the files are not installed. We can add a configure option
in 2.12 perhaps, but it's too late now; documenting the files
in the release notes should do.
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Message-Id: <20171124164422.3960-1-pbonzini@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
1) Return a generic sense if TEST UNIT READY does not provide one;
2) Fix two mistakes in copying from the spec.
Cc: qemu-stable@nongnu.org
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
If we create a thread with QEMU_THREAD_DETACHED mode, QEMU may get a segfault with low probability.
The backtrace is:
#0 0x00007f46c60291d7 in __GI_raise (sig=sig@entry=6) at ../nptl/sysdeps/unix/sysv/linux/raise.c:56
#1 0x00007f46c602a8c8 in __GI_abort () at abort.c:90
#2 0x00000000008543c9 in PAT_abort ()
#3 0x000000000085140d in patchIllInsHandler ()
#4 <signal handler called>
#5 pthread_detach (th=139933037614848) at pthread_detach.c:50
#6 0x0000000000829759 in qemu_thread_create (thread=thread@entry=0x7ffdaa8205e0, name=name@entry=0x94d94a "io-task-worker", start_routine=start_routine@entry=0x7eb9a0 <qio_task_thread_worker>,
arg=arg@entry=0x3f5cf70, mode=mode@entry=1) at util/qemu_thread_posix.c:512
#7 0x00000000007ebc96 in qio_task_run_in_thread (task=0x31db2c0, worker=worker@entry=0x7e7e40 <qio_channel_socket_connect_worker>, opaque=0xcd23380, destroy=0x7f1180 <qapi_free_SocketAddress>)
at io/task.c:141
#8 0x00000000007e7f33 in qio_channel_socket_connect_async (ioc=ioc@entry=0x626c0b0, addr=<optimized out>, callback=callback@entry=0x55e080 <qemu_chr_socket_connected>, opaque=opaque@entry=0x42862c0,
destroy=destroy@entry=0x0) at io/channel_socket.c:194
#9 0x000000000055bdd1 in socket_reconnect_timeout (opaque=0x42862c0) at qemu_char.c:4744
#10 0x00007f46c72483b3 in g_timeout_dispatch () from /usr/lib64/libglib-2.0.so.0
#11 0x00007f46c724799a in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0
#12 0x000000000076c646 in glib_pollfds_poll () at main_loop.c:228
#13 0x000000000076c6eb in os_host_main_loop_wait (timeout=348000000) at main_loop.c:273
#14 0x000000000076c815 in main_loop_wait (nonblocking=nonblocking@entry=0) at main_loop.c:521
#15 0x000000000056a511 in main_loop () at vl.c:2076
#16 0x0000000000420705 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4940
The cause of this problem is a glibc bug; for more information, see
https://sourceware.org/bugzilla/show_bug.cgi?id=19951.
The solution for this bug is to use pthread_attr_setdetachstate.
There is a similar issue with pthread_setname_np, which is moved
from the creating thread to the created thread.
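The core of the fix is creating the thread detached from the start (a sketch of the approach, not the exact util/qemu-thread-posix.c code):

#include <pthread.h>

static int thread_create_sketch(pthread_t *tid, void *(*fn)(void *),
                                void *arg, int detached)
{
    pthread_attr_t attr;
    int err;

    pthread_attr_init(&attr);
    if (detached) {
        /* Ask for a detached thread up front instead of calling
         * pthread_detach() on a thread that may already have exited. */
        pthread_attr_setdetachstate(&attr, PTHREAD_CREATE_DETACHED);
    }
    err = pthread_create(tid, &attr, fn, arg);
    pthread_attr_destroy(&attr);
    return err;
}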
Signed-off-by: linzhecheng <linzhecheng@huawei.com>
Message-Id: <20171128044656.10592-1-linzhecheng@huawei.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
[Simplify the code by removing qemu_thread_set_name, and free the arguments
before invoking the start routine. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Recent glibc added memfd_create in sys/mman.h. This conflicts with
the definition in util/memfd.c:
/builddir/build/BUILD/qemu-2.11.0-rc1/util/memfd.c:40:12: error: static declaration of memfd_create follows non-static declaration
Fix the configure test, and remove the sys/memfd.h inclusion since the
file actually does not exist---it is a typo in the memfd_create(2) man
page.
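The configure probe amounts to a compile/link test against the libc header, roughly (a sketch of the idea, not the literal configure fragment):

#define _GNU_SOURCE
#include <sys/mman.h>

/* If this builds, the libc already declares memfd_create() and QEMU's
 * own fallback wrapper must not be compiled in. */
int main(void)
{
    return memfd_create("probe", 0) < 0;
}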
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QAPI patches for 2017-12-20
# gpg: Signature made Wed 20 Dec 2017 18:53:28 GMT
# gpg: using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-qapi-2017-12-20:
qmp: remove qmp_cpu
qapi-docs: fix a comment typo
qapi2texi: De-duplicate code to add blank line before symbol
qapi: Rename QAPIDoc.parser, .section to ._parser, ._section
qapi2texi: Simplify representation of section text
qapi: Simplify representation of QAPIDoc section text
qapi: Unify representation of doc section without name
qapi2texi: Clean up texi_sections()
tests/qapi-schema/doc-bad-section: New, factored out of doc-good
qapi: Make cur_doc local to QAPISchemaParser.__init__()
qapi: Eliminate QAPISchemaParser.__init__()'s local fname
qapi: Stop rejecting #optional
qapi-schema: Fix query-vm-generation-id's doc comment markup
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
'qmp_cpu' was implemented in commit 755f196898 ("qapi: Convert the cpu
command") as a functional no-op, a QMP call that does nothing and
returns success. The idea, apparently, was to provide a counterpart
for the HMP 'hmp_cpu' command, introduced in the same commit.
Six years after its creation, qmp_cpu remains a functional no-op
that does nothing and has no value for any caller/user. A proposal
was sent to implement qmp_cpu the way hmp_cpu works, but it was denied
[1]. The reason is that QMP must be as stateless as possible, and a
function that changes its state (the current CPU monitor in the case
of qmp_cpu) goes against that. Any QMP command that needs a specific
monitor CPU setup must provide it in its arguments, instead of relying
on the current QMP monitor state.
After the discussions that happened in [2] it was decided that a command
that has done nothing since its birth, that no one uses for anything,
and that will not be implemented should be deprecated and erased. Given
that we will *not* provide any replacement for qmp_cpu and we believe
that there is no user relying on it, there is no point in adding a
deprecation delay for it.
So, this patch nukes qmp_cpu from QEMU code, removing both its blank
implementation in qmp.c and its doc in qapi-schema.json.
[1] https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg02283.html
[2] https://lists.gnu.org/archive/html/qemu-devel/2017-12/msg03696.html
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
CC: Markus Armbruster <armbru@redhat.com>
CC: Eric Blake <eblake@redhat.com>
CC: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20171220102304.8288-1-danielhb@linux.vnet.ibm.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Repurposing the function parameter doc for stepping through
doc.sections.__str__() is not nice. Use new variable @text instead.
While there, eliminate variables name and func.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20171002141341.24616-7-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Commit 1d8bda1 got rid of #optional tags, and added a check to keep
them from getting added back, to make sure patches then in flight
don't add them back. It's been six months, time to drop that check.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20171002141341.24616-3-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
This reverts commit 5e8a7fe673.
It's hard to get all images to have all these packages, and the usual
"FEATURES" and "require" mechanism doesn't scale to so many features.
With that change, the test basically only works on Ubuntu.
Until a better way comes up, leave the feature enabling to ./configure
detection.
But keep the "-e" removal.
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20171018082002.9406-1-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Their last user went away in commit f51074cdc6, "pci-hotplug-old: Has
been dead for five major releases, bury", v2.3.0. Remove them, as new
code should use QemuOpts or maybe keyval_parse() instead.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20171006131645.17729-1-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Commit 0f5314a (v1.0) added section "Device URL Syntax" to
qemu-options.hx. It's enclosed in STEXI..ETEXI, thus affects only
qemu-options.texi, not --help. It appears as a subsection under
section "Invocation". Similarly, qemu.1 has it as a subsection under
"OPTIONS".
Commit f9dadc9 (v1.1.0) dropped new option -iscsi into the middle of
this section. No effect on qemu-options.texi. It appears in --help
run together with the "Bluetooth(R) options:" header.
Commit c70a01e (v1.5.0) gives it its own heading in --help by moving
commit 0f5314a's DEFHEADING(Device URL Syntax:) outside STEXI..ETEXI.
Trouble is the heading makes no sense for -iscsi.
Move all of the "Device URL Syntax" Texinfo to qemu-doc.texi. Mark it
for inclusion in qemu.1 with '@c man begin NOTES'. This turns it into
a separate section outside the list of options both in qemu-doc and in
qemu.1.
There's substantial overlap with the existing qemu-doc section "Disk
Images". Mark with a TODO comment.
Output of --help will be fixed next.
Cc: Ronnie Sahlberg <ronniesahlberg@gmail.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20171002140307.5292-4-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Ronnie Sahlberg <ronniesahlberg@gmail.com>
[Unwanted @node dropped]
Commit 43f187a broke --help: it put colons into blank lines. It
removed the colon from DEFHEADING(TITLE:) and added it back in the
macro expansion of DEFHEADING(TITLE), so hxtool can emit "@subsection
TITLE" more easily. Trouble is it's added back even for the blank
lines made with DEFHEADING().
Put the colons back where they were before commit 43f187a, and strip
them in hxtool instead.
Cc: Paolo Bonzini <pbonzini@redhat.com>
CC: qemu-stable@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20171002140307.5292-2-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Backends don't need to know which frontend requested a reset,
and notifying them from virtio_error is messy because
virtio_error itself might be invoked from the backend.
Let's just set the status directly.
Cc: qemu-stable@nongnu.org
Reported-by: Ilya Maximets <i.maximets@samsung.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Queued target/sh4 patches
# gpg: Signature made Mon 18 Dec 2017 22:36:42 GMT
# gpg: using RSA key 0x1388C0F899E8336B
# gpg: Good signature from "Aurelien Jarno <aurelien@aurel32.net>"
# gpg: aka "Aurelien Jarno <aurelien@jarno.fr>"
# gpg: aka "Aurelien Jarno <aurel32@debian.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 7746 2642 A9EF 94FD 0F77 196D BA9C 7806 1DDD 8C9B
# Subkey fingerprint: 52BC 8695 BE34 F90A D7D4 0CB8 1388 C0F8 99E8 336B
* remotes/aurel/tags/pull-target-sh4-20171218:
target/sh4: Convert to DisasContextBase
target/sh4: Do not singlestep after exceptions
target/sh4: Convert to DisasJumpType
target/sh4: Use cmpxchg for movco when parallel_cpus
target/sh4: fix TCG leak during gusa sequence
target/sh4: add missing tcg_temp_free() in _decode_opc()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Merge tpm 2017/12/19 v1
# gpg: Signature made Tue 19 Dec 2017 11:51:13 GMT
# gpg: using RSA key 0x75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <stefanb@linux.vnet.ibm.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2017-12-19-1:
tpm: move qdev_prop_tpm to hw/tpm/
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Building with --disable-tpm yields
../hw/core/qdev-properties-system.o: In function `set_tpm':
/home/cohuck/git/qemu/hw/core/qdev-properties-system.c:274: undefined reference to `qemu_find_tpm_be'
/home/cohuck/git/qemu/hw/core/qdev-properties-system.c:278: undefined reference to `tpm_backend_init'
../hw/core/qdev-properties-system.o: In function `release_tpm':
/home/cohuck/git/qemu/hw/core/qdev-properties-system.c:291: undefined reference to `tpm_backend_reset'
Move the implementation of DEFINE_PROP_TPMBE to hw/tpm/ so that it is
only built when tpm is actually configured, and build tpm_util in every
case.
Fixes: 493b783035 ("qdev: add DEFINE_PROP_TPMBE")
Reported-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
There is a small chance that iothread_stop() hangs as follows:
Thread 3 (Thread 0x7f63eba5f700 (LWP 16105)):
#0 0x00007f64012c09b6 in ppoll () at /lib64/libc.so.6
#1 0x000055959992eac9 in ppoll (__ss=0x0, __timeout=0x0, __nfds=<optimized out>, __fds=<optimized out>) at /usr/include/bits/poll2.h:77
#2 0x000055959992eac9 in qemu_poll_ns (fds=<optimized out>, nfds=<optimized out>, timeout=<optimized out>) at util/qemu-timer.c:322
#3 0x0000559599930711 in aio_poll (ctx=0x55959bdb83c0, blocking=blocking@entry=true) at util/aio-posix.c:629
#4 0x00005595996806fe in iothread_run (opaque=0x55959bd78400) at iothread.c:59
#5 0x00007f640159f609 in start_thread () at /lib64/libpthread.so.0
#6 0x00007f64012cce6f in clone () at /lib64/libc.so.6
Thread 1 (Thread 0x7f640b45b280 (LWP 16103)):
#0 0x00007f64015a0b6d in pthread_join () at /lib64/libpthread.so.0
#1 0x00005595999332ef in qemu_thread_join (thread=<optimized out>) at util/qemu-thread-posix.c:547
#2 0x00005595996808ae in iothread_stop (iothread=<optimized out>) at iothread.c:91
#3 0x000055959968094d in iothread_stop_iter (object=<optimized out>, opaque=<optimized out>) at iothread.c:102
#4 0x0000559599857d97 in do_object_child_foreach (obj=obj@entry=0x55959bdb8100, fn=fn@entry=0x559599680930 <iothread_stop_iter>, opaque=opaque@entry=0x0, recurse=recurse@entry=false) at qom/object.c:852
#5 0x0000559599859477 in object_child_foreach (obj=obj@entry=0x55959bdb8100, fn=fn@entry=0x559599680930 <iothread_stop_iter>, opaque=opaque@entry=0x0) at qom/object.c:867
#6 0x0000559599680a6e in iothread_stop_all () at iothread.c:341
#7 0x000055959955b1d5 in main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4913
The relevant code from iothread_run() is:
while (!atomic_read(&iothread->stopping)) {
aio_poll(iothread->ctx, true);
and iothread_stop():
iothread->stopping = true;
aio_notify(iothread->ctx);
...
qemu_thread_join(&iothread->thread);
The following scenario can occur:
1. IOThread:
while (!atomic_read(&iothread->stopping)) -> stopping=false
2. Main loop:
iothread->stopping = true;
aio_notify(iothread->ctx);
3. IOThread:
aio_poll(iothread->ctx, true); -> hang
The bug is explained by the AioContext->notify_me doc comments:
"If this field is 0, everything (file descriptors, bottom halves,
timers) will be re-evaluated before the next blocking poll(), thus the
event_notifier_set call can be skipped."
The problem is that "everything" does not include checking
iothread->stopping. This means iothread_run() will block in aio_poll()
if aio_notify() was called just before aio_poll().
This patch fixes the hang by replacing aio_notify() with
aio_bh_schedule_oneshot(). This forces aio_poll() or g_main_loop_run()
to return.
Implementing this properly required a new bool running flag. The new
flag prevents races that are tricky if we try to use iothread->stopping.
Now iothread->stopping is purely for iothread_stop() and
iothread->running is purely for the iothread_run() thread.
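A minimal sketch of the new stop path described above; the field and helper
names are assumed for illustration and may not match the upstream code exactly:
static void iothread_stop_bh(void *opaque)
{
    IOThread *iothread = opaque;

    iothread->running = false;   /* the iothread_run() loop now terminates */
}

void iothread_stop(IOThread *iothread)
{
    if (!iothread->ctx || iothread->stopping) {
        return;
    }
    iothread->stopping = true;
    /* unlike aio_notify(), a scheduled BH always forces aio_poll() or
     * g_main_loop_run() to return */
    aio_bh_schedule_oneshot(iothread->ctx, iothread_stop_bh, iothread);
    qemu_thread_join(&iothread->thread);
}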
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20171207201320.19284-6-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When a node is already associated with a BlockBackend the
x-blockdev-set-iothread command refuses to set the IOThread. This is to
prevent accidentally changing the IOThread when the nodes are in use.
When the nodes are created with -drive they automatically get a
BlockBackend. In that case we know nothing is using them yet and it's
safe to set the IOThread. Add a force boolean to override the check.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20171207201320.19284-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
BDRV_POLL_WHILE() does not support recursive AioContext locking. It
only releases the AioContext lock once regardless of how many times the
caller has acquired it. This results in a hang since the IOThread does
not make progress while the AioContext is still locked.
The following steps trigger the hang:
$ qemu-system-x86_64 -M accel=kvm -m 1G -cpu host \
-object iothread,id=iothread0 \
-device virtio-scsi-pci,iothread=iothread0 \
-drive if=none,id=drive0,file=test.img,format=raw \
-device scsi-hd,drive=drive0 \
-drive if=none,id=drive1,file=test.img,format=raw \
-device scsi-hd,drive=drive1
$ qemu-system-x86_64 ...same options... \
-incoming tcp::1234
(qemu) migrate tcp:127.0.0.1:1234
...hang...
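Illustrative only (not taken from the patch): the kind of recursive
acquisition BDRV_POLL_WHILE() cannot cope with, because it releases the
lock just once:
aio_context_acquire(ctx);
aio_context_acquire(ctx);      /* second, recursive acquisition */
bdrv_flush(bs);                /* BDRV_POLL_WHILE() inside drops the lock once... */
aio_context_release(ctx);      /* ...so the IOThread still sees ctx locked and hangs */
aio_context_release(ctx);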
Tested-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20171207201320.19284-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
QMP 'transaction' blockdev-snapshot-sync with multiple disks in an
IOThread is an untested code path. Several bugs have been found in
connection with this command. This patch adds a test case to prevent
future regressions.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20171206144550.22295-10-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Currently there is no easy way for iotests to ensure that a BDS is bound
to a particular IOThread. Normally the virtio-blk device calls
blk_set_aio_context() when dataplane is enabled during guest driver
initialization. This never happens in iotests since -machine
accel=qtest means there is no guest activity (including device driver
initialization).
This patch adds a QMP command to explicitly assign IOThreads in test
cases. See qapi/block-core.json for a description of the command.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20171206144550.22295-9-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
It is not necessary to hold AioContext across transactions anymore since
bdrv_drained_begin/end() is used to keep the nodes quiesced. In fact,
using the AioContext lock for this purpose was always buggy.
This patch reduces the scope of AioContext locked regions. This is not
just a cleanup but also fixes hangs that occur in BDRV_POLL_WHILE()
because it is unaware of recursive locking and does not release the
AioContext the necessary number of times to allow progress to be made.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20171206144550.22295-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
bdrv_unref() requires the AioContext lock because bdrv_flush() uses
BDRV_POLL_WHILE(), which assumes the AioContext is currently held. If
BDRV_POLL_WHILE() runs without AioContext held the
pthread_mutex_unlock() call in aio_context_release() fails.
This patch moves bdrv_unref() into the AioContext locked region to solve
the following pthread_mutex_unlock() failure:
#0 0x00007f566181969b in raise () at /lib64/libc.so.6
#1 0x00007f566181b3b1 in abort () at /lib64/libc.so.6
#2 0x00005592cd590458 in error_exit (err=<optimized out>, msg=msg@entry=0x5592cdaf6d60 <__func__.23977> "qemu_mutex_unlock") at util/qemu-thread-posix.c:36
#3 0x00005592cd96e738 in qemu_mutex_unlock (mutex=mutex@entry=0x5592ce9505e0) at util/qemu-thread-posix.c:96
#4 0x00005592cd969b69 in aio_context_release (ctx=ctx@entry=0x5592ce950580) at util/async.c:507
#5 0x00005592cd8ead78 in bdrv_flush (bs=bs@entry=0x5592cfa87210) at block/io.c:2478
#6 0x00005592cd89df30 in bdrv_close (bs=0x5592cfa87210) at block.c:3207
#7 0x00005592cd89df30 in bdrv_delete (bs=0x5592cfa87210) at block.c:3395
#8 0x00005592cd89df30 in bdrv_unref (bs=0x5592cfa87210) at block.c:4418
#9 0x00005592cd6b7f86 in qmp_transaction (dev_list=<optimized out>, has_props=<optimized out>, props=<optimized out>, errp=errp@entry=0x7ffe4a1fc9d8) at blockdev.c:2308
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20171206144550.22295-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The AioContext pointer argument to co_aio_sleep_ns() is only used for
the sleep timer. It does not affect where the caller coroutine is
resumed.
Due to changes to coroutine and AIO APIs it is now possible to drop the
AioContext pointer argument. This is safe to do since no caller has
specific requirements for which AioContext the timer must run in.
This patch drops the AioContext pointer argument and renames the
function to simplify the API.
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20171109102652.6360-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
FPU2000 implements basic single-precision floating point operations and
can be replaced with a different implementation, like DFPU or HiFi. Move
FPU2000 opcode translators into separate functions and list them in a
separate array.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Move implementations of core opcodes into separate translation
functions. Introduce data structures for mapping opcode name to
translator function. Make an array of core opcode/translator structures.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
The canonical way of dealing with Xtensa instruction decoding and
encoding is through the libisa. Libisa is a configuration-independent
library with a stable interface plus generated configuration-specific
xtensa-modules.c file with implementations of decoding and encoding
functions. Libisa is MIT-licensed, and the originally distributed
xtensa-modules.c files are also MIT-licensed and are available as a
part of xtensa configuration overlay.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
Currently the 'entry' opcode helper accepts the frame size divided by 8, as
it is encoded in the opcode. Make it more natural and accept the actual
frame size instead.
Signed-off-by: Max Filippov <jcmvbkbc@gmail.com>
If curl_global_init() fails, per the documentation no other curl
functions may be called, so make sure to check the return value.
Also, some minor changes to the initialization latch variable 'inited':
- Make it static in the file, for clarity
- Change the name for clarity
- Make it a bool
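A self-contained sketch of that check against the public libcurl API; the
latch variable and function names here are illustrative, not the driver's
actual ones:
#include <curl/curl.h>
#include <errno.h>
#include <stdbool.h>

static bool curl_global_inited;   /* static latch, renamed, bool */

static int curl_driver_global_init(void)
{
    if (!curl_global_inited) {
        /* if this fails, no other curl function may be called */
        if (curl_global_init(CURL_GLOBAL_ALL) != CURLE_OK) {
            return -EIO;
        }
        curl_global_inited = true;
    }
    return 0;
}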
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard W.M. Jones <rjones@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
'tag' is already checked in the lines immediately preceding this check,
and set to non-NULL if NULL. No need to check again, it hasn't changed.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
If users set an unreasonably low speed (like one byte per second), the
calculated delay may exceed many hours. While we like to punish users
for asking for stupid things, we do also like to allow users to correct
their wicked ways.
When a user provides a new speed, kick the job to allow it to recalculate
its delay.
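A hedged sketch of the idea, assuming block_job_enter() is the kick that
wakes a sleeping job (the exact placement inside block_job_set_speed() may
differ):
void block_job_set_speed(BlockJob *job, int64_t speed, Error **errp)
{
    /* ... validate and store the new speed ... */
    job->speed = speed;

    /* kick the job so a long delay computed from the old speed ends now
     * and is recalculated from the new one */
    block_job_enter(job);
}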
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20171213204611.26276-1-jsnow@redhat.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
Set fake progress for non-dirty clusters in copy_bitmap initialization,
too. This simplifies the code and allows further refactoring.
This patch changes the user's view of backup progress, but formally nothing
has changed: the progress hops are just moved to the beginning.
Actually it's just a point of view: when do we actually skip clusters?
We can say in the very beginning, that we skip these clusters and do
not think about them later.
Of course, if we go through the disk sequentially, it's logical to say that
we skip the clusters between the copied portions to their left and right.
But even now the copying progress is not sequential because of write
notifiers. Future patches will introduce a new backup architecture that
copies in several coroutines in parallel, so it will make no sense to
publish fake progress piecemeal in parallel with other copying requests.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Message-id: 20171012135313.227864-5-vsementsov@virtuozzo.com
Signed-off-by: Jeff Cody <jcody@redhat.com>
Recent Linux kernels provide a separate tracefs which doesn't need to be
mounted on debugfs. Although most systems mount it at the traditional
place under debugfs, it's safer to check tracefs first.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
find_debugfs() can be shared to find a different filesystem like tracefs,
so make it more general and rename it to find_mount().
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The return value of find_debugfs() is 1 if it could find a mount point of
debugfs. It can be saved in the while loop instead of checking it again.
Signed-off-by: Namhyung Kim <namhyung@gmail.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
It's an x86-only device, so it does not make sense to keep it
in the shared misc folder.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
It's an x86-only device, so it does not make sense to keep it
in the shared misc folder.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
and remove the old i386/pc dependency.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
since the VGACommonState struct has a GraphicHwOps *hw_ops member,
remove the now unnecessary includes.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
this allows removing the old i386/pc dependency on acpi/core.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
instead move them to the source file
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
this file is included in "target/i386/hax-i386.h":
#ifdef CONFIG_WIN32
#include "target/i386/hax-windows.h"
#endif
which guarantees that sysemu/os-win32.h has previously been included (CONFIG_WIN32)
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
and fix a typo in the "PC Chipset" section
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Add support for these keys: audiomute volumedown volumeup power.
Tested with "sendkey" command in monitor and verify the behavior
in guest OS.
Signed-off-by: Tao Wu <lepton@google.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This allows this header to be used in qtests.
This fixes:
CC tests/test.o
include/hw/registerfields.h:32:41: error: implicit declaration of function ‘MAKE_64BIT_MASK’ [-Werror=implicit-function-declaration]
MAKE_64BIT_MASK(shift, length)};
^
include/hw/registerfields.h:39:5: error: implicit declaration of function ‘extract64’; [-Werror=implicit-function-declaration]
extract64((storage), R_ ## reg ## _ ## field ## _SHIFT,
^
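Presumably the fix amounts to having the header include what it uses,
along these lines (a guess at the shape, not a quote of the patch):
/* include/hw/registerfields.h */
#include "qemu/bitops.h"   /* provides MAKE_64BIT_MASK() and extract64() */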
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The cpu-exec-common.c file includes memory-internal.h, but it doesn't
actually use anything from that header. Remove the unnecessary include.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
For some systems (e.g. FreeBSD) the default 'make' is not compatible with the
GNU extensions used by QEMU makefiles.
Calling GNU make (gmake) works; however, the help displayed refers to the
host 'make', and copy/paste leads to a lot of non-obvious errors:
$ gmake check-help
[...]
make check Run all tests
$ make check
make: "Makefile" line 28: Missing dependency operator
make: "Makefile" line 37: Need an operator
make: "Makefile" line 41: warning: duplicate script for target "git-submodule-update" ignored
make: "rules.mak" line 70: warning: duplicate script for target "%.o" ignored
make: Unknown modifier ' '
make: Unclosed substitution for eval modules (= missing)
make: "tests/Makefile.include" line 24: Variable/Value missing from "export"
make: "tests/" line 1: warning: Zero byte read from file, skipping rest of line.
make: "tests/" line 1: Need an operator
make: "Makefile" line 660: warning: duplicate script for target "ifneq" ignored
make: "Makefile" line 78: warning: using previous script for "ifneq" defined here
make: Fatal errors encountered -- cannot continue
Using the $(MAKE) variable, the help displayed is consistent with the 'make'
program used.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This was never used since its introduction in commit
196ea13104 ("memory: Add global-locking property to memory
regions").
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When executing 'configure' in a fresh QEMU clone, in a fresh
OS install running in a ppc64le host, this is the error
shown:
-----
../configure --enable-trace-backend=simple --enable-debug
--target-list=ppc64-softmmu
ERROR: Unsupported CPU = ppc64le, try --enable-tcg-interpreter
-----
This isn't true: the ppc64le host CPU is supported. This happens because,
in a fresh install, we don't have a C compiler to autodetect
the $cpu variable as "ppc64".
This patch moves the CC available check up a bit, just before verifying
the host CPU. This ensures that we bail out with a $CC not available
error instead of unsupported CPU (the host CPU detection without
the compiler wouldn't work properly anyway). It also allows --help to
keep working without a C compiler. With this patch, in the same ppc64le
host without gcc:
$ ../configure --enable-trace-backend=simple --enable-debug
--target-list=ppc64-softmmu
ERROR: "cc" either does not exist or does not work
$ ../configure --help
Usage: configure [options]
Options: [defaults in brackets after descriptions]
Standard options:
--help print this message
--prefix=PREFIX install in PREFIX [/usr/local]
--interp-prefix=PREFIX where to find shared libraries, etc.
(...)
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Thanks to Laszlo Ersek for spotting the double semicolon in target/i386/kvm.c.
I have trivially grepped the tree for ';;' in C files.
Suggested-by: Laszlo Ersek <lersek@redhat.com>
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Merge tpm 2017/12/15 v1
# gpg: Signature made Fri 15 Dec 2017 04:44:15 GMT
# gpg: using RSA key 0x75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <stefanb@linux.vnet.ibm.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2017-12-15-1: (32 commits)
tpm: tpm_passthrough: Fail startup if FE buffer size < BE buffer size
tpm: tpm_emulator: get and set buffer size of device
tpm: tpm_passthrough: Read the buffer size from the host device
tpm: pull tpm_util_request() out of tpm_util_test()
tpm: Move getting TPM buffer size to backends
tpm: remove tpm_register_model()
tpm-tis: use DEFINE_PROP_TPMBE
qdev: add DEFINE_PROP_TPMBE
tpm-tis: check that at most one TPM device exists
tpm-tis: remove redundant 'tpm_tis:' in error messages
tpm-emulator: add a FIXME comment about blocking cancel
acpi: change TPM TIS data conditions
tpm: add tpm_cmd_get_size() to tpm_util
tpm: add TPM interface to lookup TPM version
tpm: lookup the the TPM interface instead of TIS device
tpm: rename qemu_find_tpm() -> qemu_find_tpm_be()
tpm-tis: simplify header inclusion
tpm-passthrough: workaround a possible race
tpm-passthrough: simplify create()
tpm-passthrough: make it safer to destroy after creation
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
SPARC Linux has an oddity that it insists that mmap()
of MAP_FIXED memory must be at an alignment defined by
SHMLBA, which is more aligned than the page size
(typically, SHMLBA alignment is to 16K, and pages are 8K).
This is a relic of ancient hardware that had cache
aliasing constraints, but even on modern hardware the
kernel still insists on the alignment.
To ensure that we get mmap() alignment sufficient to
make the kernel happy, change QEMU_VMALLOC_ALIGN,
qemu_fd_getpagesize() and qemu_mempath_getpagesize()
to use the maximum of getpagesize() and SHMLBA.
In particular, this allows 'make check' to pass on Sparc:
we were previously failing the ivshmem tests.
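For illustration, the computation boils down to something like this (the
helper name is invented; the real patch changes QEMU_VMALLOC_ALIGN and the
two getpagesize helpers in place):
#include <unistd.h>
#include <sys/shm.h>      /* SHMLBA */

static long qemu_mmap_alignment(void)
{
    long pagesize = getpagesize();

    /* SPARC Linux requires MAP_FIXED mappings aligned to SHMLBA,
     * which is stricter than the page size (e.g. 16K vs 8K pages) */
    return pagesize > SHMLBA ? pagesize : SHMLBA;
}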
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1512752248-17857-1-git-send-email-peter.maydell@linaro.org
The existing QIOChannelSocket class provides the ability to
listen on a single socket at a time. This patch introduces
a QIONetListener class that provides a higher level API
concept around listening for network services, allowing
for listening on multiple sockets.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
s390x changes for 2.12:
- Lots of tcg improvements: ccw hotplug is now working and we can run
a Linux kernel built for z12 under tcg
- zPCI improvements to get virtio-pci working
- get rid of the cssid restrictions for virtual and non-virtual channel
devices
- we now support 8TB+ systems
- 2.12 compat machine
- fixes and cleanups
# gpg: Signature made Fri 15 Dec 2017 10:57:01 GMT
# gpg: using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>"
# gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# gpg: aka "Cornelia Huck <cohuck@kernel.org>"
# gpg: aka "Cornelia Huck <cohuck@redhat.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF
* remotes/cohuck/tags/s390x-20171215-v2: (46 commits)
s390-ccw-virtio: allow for systems larger that 7.999TB
s390x: change the QEMU cpu model to a stripped down z12
s390x/tcg: we already implement the Set-Program-Parameter facility
s390x/tcg: implement extract-CPU-time facility
s390x/tcg: Implement SIGNAL ADAPTER instruction
s390x/tcg: Implement STORE CHANNEL PATH STATUS
s390x/tcg: wire up SET CHANNEL MONITOR
s390x/tcg: wire up SET ADDRESS LIMIT
s390x/tcg: implement Interlocked-Access Facility 2
s390x/tcg: ASI/ASGI/ALSI/ALSGI are atomic with Interlocked-acccess facility 1
s390x/tcg: wire up STORE CHANNEL REPORT WORD
s390x/tcg: indicate value of TODPR in STCKE
s390x/tcg: implement SET CLOCK PROGRAMMABLE FIELD
s390x/tcg: fix and cleanup mcck injection
s390x/kvm: factor out build_channel_report_mcic() into cpu.h
s390x/css: attach css bridge
s390x: deprecate s390-squash-mcss machine prop
s390x/css: unrestrict cssids
s390x/pci: search for subregion inside the BARs
s390x/pci: move the memory region write from pcistg
...
# Conflicts:
# include/hw/compat.h
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ppc patch queue 2017-12-15
First pull request for qemu-2.12. This has quite a bit of stuff
accumulated while 2.11 was finalizing. Highlights are:
* Some preliminary work towards implementing the "XIVE" POWER9
interrupt controller
* Some fixes for problems during reboot with MTTCG
* A substantial TCG performance improvement via
tcg_get_lookup_and_goto_ptr
* Numerous assorted cleanups and bugfixes that weren't urgent enough
for 2.11
# gpg: Signature made Fri 15 Dec 2017 03:14:12 GMT
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.12-20171215: (24 commits)
spapr: don't initialize PATB entry if max-cpu-compat < power9
spapr: Assume msi_nonbroken
spapr: Rename machine init functions for clarity
target/ppc: introduce the PPC_BIT() macro
spapr_events: drop bogus cell from "interrupt-ranges" property
spapr: fix LSI interrupt specifiers in the device tree
spapr: replace numa_get_node() with lookup in pc-dimm list
spapr: introduce a spapr_qirq() helper
spapr: introduce a spapr_irq_set_lsi() helper
spapr: move the IRQ allocation routines under the machine
ppc/xics: assign of the CPU 'intc' pointer under the core
ppc/xics: introduce an icp_create() helper
spapr/rtas: do not reset the MSR in stop-self command
spapr/rtas: fix reboot of a a SMP TCG guest
spapr/rtas: disable the decrementer interrupt when a CPU is unplugged
e500: fix pci host bridge class/type
openpic: debug w/ info_report()
pcc: define the Power-saving mode Exit Cause Enable bits in PowerPCCPUClass
nvram: add AT24Cx i2c eeprom
e500: name openpic and pci host bridge
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
KVM does not allow memory regions > KVM_MEM_MAX_NR_PAGES, basically
limiting the memory per slot to 8TB-4k. As memory slots on s390/kvm must
be a multiple of 1MB, we need to start a new memory region if we cross
8TB-1M.
With that (and optimistic overcommitment in the kernel) I was able to
start a 24TB guest on a 1TB system.
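For illustration, the arithmetic behind those numbers (the macro value
mirrors the kernel's definition; the 1ULL is the 32-bit host build fix
mentioned in the bracketed note below):
#define KVM_MEM_MAX_NR_PAGES ((1ULL << 31) - 1)              /* as in the kernel */
#define KVM_SLOT_MAX_BYTES (KVM_MEM_MAX_NR_PAGES * 4096ULL)  /* 8TB - 4KiB */
/* s390 memory slots must be a multiple of 1MB, so each slot can cover at
 * most 8TB - 1MB; anything above that spills into a new memory region */
#define KVM_S390_SLOT_MAX_BYTES (KVM_SLOT_MAX_BYTES & ~((1ULL << 20) - 1))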
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20171211122146.162430-1-borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
[CH: 1UL -> 1ULL in KVM_MEM_MAX_NR_PAGES; build fix on 32 bit hosts]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
If the requested buffer size of the frontend is smaller than the fixed
buffer size of the host's TPM, fail the startup_tpm() interface function,
which will make the device unusable. We fail it because the backend TPM
could produce larger packets than what the frontend could pass to the OS.
The current combination of TIS frontend and either passthrough or emulator
backend will not lead to this case since the TIS can support any size of
buffer.
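A minimal sketch of such a startup_tpm() check; the type, field and function
names here are assumptions for illustration:
static int tpm_passthrough_startup_tpm(TPMBackend *tb, size_t buffersize)
{
    TPMPassthruState *tpm_pt = TPM_PASSTHROUGH(tb);

    /* the host TPM's buffer is fixed; refuse a smaller frontend buffer */
    if (buffersize && buffersize < tpm_pt->tpm_buffersize) {
        error_report("TPM backend buffer is larger than the requested "
                     "buffer size of %zu bytes", buffersize);
        return -1;    /* startup fails, the device becomes unusable */
    }
    return 0;
}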
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Convert the tpm_emulator backend to get the current buffer size
of the external device and set it to the buffer size that the
frontend (TIS) requests.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Rather than hard coding the buffer size in the tpm_passthrough
backend read the TPM I/O buffer size from the host device.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Rather than setting the size of the TPM buffer in the front-end,
query the backend for the size of the buffer. In this patch we
just move the hard-coded buffer size of 4096 to the backends.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The reported error message is already prefixed with the -device
name & arguments.
Before:
qemu-system-x86_64: -device tpm-tis,id=foo,tpmdev=foo,irq=21: tpm_tis: IRQ 21 is outside valid range of 0 to 15
After:
qemu-system-x86_64: -device tpm-tis,id=foo,tpmdev=foo,irq=21: IRQ 21 is outside valid range of 0 to 15
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
The device should be exposed if present. It shouldn't have an
undefined version (or else backend init failed, and the device should fail
too). Finally, make the fields specific to the TIS device model.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
The TPM backend processing thread has common shared-variable race
issues. (They should not be easy to reach, since guest interaction
with the device is slow compared to host emulation.)
An obvious one is setting op_cancelled from the device thread after
calling write(cancel_fd). The backend thread may return before the
device thread has set the variable. Instead, set it before
cancellation. Even if the write() failed, the end result is that the
command possibly gets cancelled (even if cancellation came from external
sources it doesn't matter much).
It's worth considering removing the backend processing thread for now.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
QEMU code doesn't generally have assert() for mandatory
callbacks/function pointers, probably because the crash is pretty
obvious. Document the methods instead of going into the code.
Make get_tpm_options() mandatory to implement (since all
backend implementations have it).
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
The value is not needed later, and may leak if the free visitor doesn't
consider it because has_cancel_path is false. And for consistency with
"path", it shouldn't be returned in get_tpm_options().
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Lift from the backend implementations the responsibility to call the
request_completed() callback outside of thread context. This also
simplifies frontend/interface work, as they no longer need to care
whether the callback is called from a different thread.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Store the TPM interface; the actual object may be different from
TPMState. Keep a reference on the interface, and check that the backend
wasn't already initialized.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
The passed-through device might be an express device. In this case the
old code allocated a too small emulated config space in
pci_config_alloc() since pci_config_size() returned the size for a
non-express device. This leads to an out-of-bound write in
xen_pt_config_reg_init(), which sometimes results in crashes. So set
is_express as already done for KVM in vfio-pci.
Shortened ASan report:
==17512==ERROR: AddressSanitizer: heap-buffer-overflow on address 0x611000041648 at pc 0x55e0fdac51ff bp 0x7ffe4af07410 sp 0x7ffe4af07408
WRITE of size 2 at 0x611000041648 thread T0
#0 0x55e0fdac51fe in memcpy /usr/include/x86_64-linux-gnu/bits/string3.h:53
#1 0x55e0fdac51fe in stw_he_p include/qemu/bswap.h:330
#2 0x55e0fdac51fe in stw_le_p include/qemu/bswap.h:379
#3 0x55e0fdac51fe in pci_set_word include/hw/pci/pci.h:490
#4 0x55e0fdac51fe in xen_pt_config_reg_init hw/xen/xen_pt_config_init.c:1991
#5 0x55e0fdac51fe in xen_pt_config_init hw/xen/xen_pt_config_init.c:2067
#6 0x55e0fdabcf4d in xen_pt_realize hw/xen/xen_pt.c:830
#7 0x55e0fdf59666 in pci_qdev_realize hw/pci/pci.c:2034
#8 0x55e0fdda7d3d in device_set_realized hw/core/qdev.c:914
[...]
0x611000041648 is located 8 bytes to the right of 256-byte region [0x611000041540,0x611000041640)
allocated by thread T0 here:
#0 0x7ff596a94bb8 in __interceptor_calloc (/usr/lib/x86_64-linux-gnu/libasan.so.4+0xd9bb8)
#1 0x7ff57da66580 in g_malloc0 (/lib/x86_64-linux-gnu/libglib-2.0.so.0+0x50580)
#2 0x55e0fdda7d3d in device_set_realized hw/core/qdev.c:914
[...]
Signed-off-by: Simon Gaiser <hw42@ipsumj.de>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
If the frontend requests raw pointers, the input handlers must be
activated to have the input events delivered to the xenfb backend.
Without activation, the input events are delivered to handlers
registered earlier, which would be the emulated USB tablet or
emulated PS/2 mouse.
HVM xen_kbdfront can incorrectly scale absolute coordinates when
the display resolution is not 800x600.
Signed-off-by: Owen Smith <owen.smith@citrix.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Writes "feature-raw-pointer" during init to indicate the backend
can pass raw unscaled values for absolute axes to the frontend.
Frontends set "request-raw-pointer" to indicate the backend should
not attempt to scale absolute values to console size.
"request-raw-pointer" is only valid if "request-abs-pointer" is
also set. Raw unscaled pointer values are in the range [0, 0x7fff].
"feature-raw-pointer" and "request-raw-pointer" added to Xen
header in commit 7868654ff7fe5e4a2eeae2b277644fa884a5031e
Signed-off-by: Owen Smith <owen.smith@citrix.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Avoid the unnecessary calls through the input-legacy.c file by
using the qemu_input_handler_*() calls directly. This did require
reworking the event and sync handlers to use the reverse mapping
from qcode to linux using qemu_input_qcode_to_linux().
Removes the scancode2linux mapping and supporting documentation.
Signed-off-by: Owen Smith <owen.smith@citrix.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
This patch allocates an IOThread object for each xen_disk instance and
sets the AIO context appropriately on connect. This allows processing
of I/O to proceed in parallel.
The patch also adds tracepoints into xen_disk to make it possible to
follow the state transitions of an instance in the log.
Signed-off-by: Paul Durrant <paul.durrant@citrix.com>
Acked-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
if KVM is enabled and the KVM MMU radix capability is available,
the partition table entry (patb_entry) for radix mode is
initialized by default in ppc_spapr_reset().
This is a problem if we want to migrate the guest to a POWER8 host
while the kernel has not started yet and set the value to the one
expected for a POWER8 CPU.
The "-machine max-cpu-compat=power8" option should allow migrating
from a POWER9 KVM host to a POWER8 KVM host, but because patb_entry
is set, the destination QEMU tries to enable radix mode on the
POWER8 host. This fails and cancels the migration:
Process table config unsupported by the host
error while loading state for instance 0x0 of device 'spapr'
load of migration failed: Invalid argument
This patch doesn't set the PATB entry if the user provides
a CPU compatibility mode that doesn't support radix mode.
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We conditionally adjust part of the guest device tree based on the
global msi_nonbroken flag. However, the main machine type code
initializes msi_nonbroken to true and there's nothing that would set
it to false again.
So replace the test with an assert().
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Machine objects have two init functions - the generic QOM level
instance_init which should only do static object initialization, and
the Machine specific MachineClass::init which does the actual
construction of the machine.
In spapr the functions implementing these two have names -
ppc_machine_initfn() and ppc_spapr_init() - which don't correspond closely
to either of those. To prevent people (read, me) from confusing which is
which, rename them spapr_instance_init() and spapr_machine_init() to
make it clearer which is which.
While we're there rename ppc_spapr_reset() to spapr_machine_reset() to
match.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
and use them in a couple of obvious places. Other macros will be used
in the model of the XIVE interrupt controller.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
According to LoPAPR 1.1 B.6.12, the "/event-sources" node has an "interrupt-
ranges" property, the format of which is described in B.6.9.1.2 as follows:
“interrupt-ranges”
Standard property name that defines the interrupt number(s) and range(s)
handled by this unit.
prop-encoded-array: List of (int-number, range) specifications.
Int-number is encoded as with encode-int.
Range is encoded as with encode-int.
The first entry in this list shall contain the int-number associated with
the first “reg” property entry. The int-number is the value representing
the interrupt source as would appear in the PowerPC External Interrupt
Architecture XISR. The range shall be the number of sequential interrupt
numbers which this unit can generate.
There's no such thing as a cell count at the end of the array, like the
one introduced by commit ffbb1705a3 in QEMU 2.8. It doesn't seem it had
any impact on existing guests and I couldn't find any related workaround
in linux. So, let's just drop the bogus lines.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
LoPAPR 1.1 B.6.9.1.2 describes the "#interrupt-cells" property of the
PowerPC External Interrupt Source Controller node as follows:
“#interrupt-cells”
Standard property name to define the number of cells in an interrupt-
specifier within an interrupt domain.
prop-encoded-array: An integer, encoded as with encode-int, that denotes
the number of cells required to represent an interrupt specifier in its
child nodes.
The value of this property for the PowerPC External Interrupt option shall
be 2. Thus all interrupt specifiers (as used in the standard “interrupts”
property) shall consist of two cells, each containing an integer encoded
as with encode-int. The first integer represents the interrupt number the
second integer is the trigger code: 0 for edge triggered, 1 for level
triggered.
This patch fixes the interrupt specifiers in the "interrupt-map" property
of the PHB node, that were setting the second cell to 8 (confusion with
IRQ_TYPE_LEVEL_LOW ?) instead of 1.
VIO devices and RTAS event sources use the same format for interrupt
specifiers: while here, we introduce a common helper to handle the
encoding details.
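A sketch of what such an encoding helper might look like, using the helper
name from the v3 note below and the two-cell layout quoted above
(illustrative, not the exact patch):
void spapr_dt_xics_irq(uint32_t *intspec, int irq, bool is_lsi)
{
    intspec[0] = cpu_to_be32(irq);
    intspec[1] = cpu_to_be32(is_lsi ? 1 : 0);   /* 1 = level, 0 = edge */
}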
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Tested-by: Cédric Le Goater <clg@kaod.org>
--
v3: - reference public LoPAPR instead of internal PAPR+ in changelog
- change helper name to spapr_dt_xics_irq()
v2: - drop the erroneous changes to the "interrupts" prop in PCI device nodes
- introduce a common helper to encode interrupt specifiers
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
SPAPR is the last user of numa_get_node() and of a bunch of
supporting code that maintains the numa_info[x].addr list.
Get the LMB node id from the pc-dimm list, which allows
removing ~80 LOC maintaining the dynamic address range
lookup list.
It also removes the pc-dimm dependency on numa_[un]set_mem_node_id(),
makes pc-dimms the sole source of information about which
node they belong to, and removes duplicate data from the global
numa_info.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
xics_get_qirq() is only used by the sPAPR machine. Let's move it there
and change its name to reflect its scope. It will be useful for XIVE
support which will use its own set of qirqs.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
It will make synchronisation easier with the XIVE interrupt mode when
available. The 'irq' parameter refers to the global IRQ number space.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Also change the prototype to use a sPAPRMachineState and prefix them
with spapr_irq_. It will let us synchronise the IRQ allocation with
the XIVE interrupt mode when available.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The 'intc' pointer of the CPU references the interrupt presenter in
the XICS interrupt mode. When the XIVE interrupt mode is available and
activated, the machine will need to reassign this pointer to reflect
the change.
Moving this assignment under the realize routine of the CPU will ease
the process when the interrupt mode is toggled.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The sPAPR and the PowerNV core objects create the interrupt presenter
object of the CPUs in a very similar way. Let's provide a common
routine in which we use the presenter 'type' as a child identifier.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When a CPU is stopped with the 'stop-self' RTAS call, its state
'halted' is switched to 1 and, in this case, the MSR is not taken into
account anymore in the cpu_has_work() routine. Only the pending
hardware interrupts are checked with their LPCR:PECE* enablement bit.
The CPU is now also protected from the decrementer interrupt by the
LPCR:PECE* bits which are disabled in the 'stop-self' RTAS
call. Resetting the MSR is pointless.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Just like for hot-unplugged CPUs, when a guest is rebooted, the secondary
CPUs can be woken by the decrementer and start entering SLOF at the
same time as the boot CPU.
To be safe, let's disable on the secondaries all the exceptions which
can cause an exit while the CPU is in power-saving mode.
Based on previous work from Nikunj A Dadhania <nikunj@linux.vnet.ibm.com>
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When a CPU is stopped with the 'stop-self' RTAS call, its state
'halted' is switched to 1 and, in this case, the MSR is not taken into
account anymore in the cpu_has_work() routine. Only the pending
hardware interrupts are checked with their LPCR:PECE* enablement bit.
If the DECR timer fires after 'stop-self' is called and before the CPU
'stop' state is reached, the nearly-dead CPU will have some work to do
and the guest will crash. This case happens very frequently with the
not yet upstream P9 XIVE exploitation mode. In XICS mode, the DECR is
occasionally fired but after 'stop' state, so no work is to be done
and the guest survives.
I suspect there is a race between the QEMU mainloop triggering the
timers and the TCG CPU thread but I could not quite identify the root
cause. To be safe, let's disable in the LPCR all the exceptions which
can cause an exit while the CPU is in power-saving mode and reenable
them when the CPU is started.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Correct some confusion wrt. the PCI facing
side of the PCI host bridge (not PCIe root complex).
The ref. manual for the mpc8533 (as well as
mpc8540 and mpc8540) gives the class code as
PCI_CLASS_PROCESSOR_POWERPC.
While the PCI_HEADER_TYPE field is oddly omitted,
the tables in the "PCI Configuration Header"
section shows a type 0 layout using all 6 BAR
registers (as 2x 32, and 2x 64 bit regions)
So 997505065d
seems to be in error. Although there was
perhaps some confusion as the mpc8533
has a separate PCIe root complex.
With PCIe, a root complex has PCI_HEADER_TYPE=1.
Neither the PCI host bridge, nor the PCIe
root complex advertise class PCI_CLASS_BRIDGE_PCI.
This was confusing Linux guests, which try
to interpret the host bridge as a pci-pci
bridge, but get confused and re-enumerate
the bus when the primary/secondary/subordinate
bus registers don't have valid values.
Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
and use the value to define precisely the default value of the LPCR in
the helper routine cpu_ppc_set_papr()
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The current code assumes that only the CPU core object holds a
reference on each individual CPU object, and happily frees their
allocated memory when the core is unrealized. This is dangerous
as some other code can legitimately keep a pointer to a CPU if it
calls object_ref(), but it would end up with a dangling pointer.
Let's allocate all CPUs with object_new() and let QOM free them
when their reference count reaches zero. This greatly simplifies the
code as we don't have to fiddle with the instance size anymore.
Signed-off-by: Greg Kurz <groug@kaod.org>
Acked-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
While we're at it fix a couple of small errors in the 2.11 and 2.10 models
(they didn't have any real effect, but don't quite match the template).
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The previous code section uses a 'first < 0' test and returns. Therefore,
there is no need to test the 'first' variable against '>= 0' afterwards.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We are good enough to boot upstream Linux kernels / Fedora 26/27. That
should be sufficient for now.
As the QEMU CPU model is migration safe, let's add compatibility code.
Generate the feature list to reduce the chance of messing things up in the
future.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171208165529.14124-1-david@redhat.com>
[CH: squashed 's390x/cpumodel: make qemu cpu model play with "none" machine'
(20171213132407.5227-1-david@redhat.com) and 's390x/tcg: don't include z13
features in the qemu model' (20171213171512.17601-1-david@redhat.com) into
patch]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The Set-Program-Parameter facility (also known as Load-Program-Parameter
facility) provides the LPP instruction used to load the program
parameter. We already implement that instruction in TCG, so add it to our
list.
Note: Not documented in the PoP but in "The Load-Program-Parameter and
CPU-Measurement Facilities" (SA23-2260-05) document.
While at it, make the whole list ordered (according to cpu_features_def.h).
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171208160207.26494-14-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
It only provides the EXTRACT CPU TIME instruction. We can reuse the stpt
helper, which calculates the CPU timer value.
The instruction is not privileged, but since we don't have a CPU timer
value in the case of linux-user, we simply reuse cpu_get_host_ticks() to
produce some descending value.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171208160207.26494-13-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Just like KVM does, we should suppress this instruction:
When this instruction is not provided, it is
checked for privileged operation exception and the
instruction is suppressed by the machine
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171208160207.26494-11-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's handle it just like KVM:
Depending on the model, this instruction may not be
provided. When this instruction is not provided, it is
checked for operand exception and privileged-opera-
tion exception, and then is suppressed.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171208160207.26494-9-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The semantics of ASI/ASGI/ALSI/ALSGI changed. Let's implement them just
like LOAD AND ADD, so they are atomic. Emulate old behavior.
This fixes random crashes when booting a Linux kernel compiled for
z196+ with SMP + MTTCG.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171208160207.26494-7-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The architecture mode indication wasn't stored. The split of certain
64bit fields was unnecessary. Also, the complete clock comparator, not
just bits 0-55 (starting at byte 1), was stored.
We now generate a proper MCIC via the same helper we use for KVM.
There is more to clean up, but we will change the other parts later on
either way.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171208160207.26494-3-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
With the cssids unrestricted (commit "s390x/css: unrestrict cssids") the
s390-squash-mcss machine property should not be used. Actually Libvirt
never supported this, so the expectation is that removing it should be
pretty painless. But let's play nice and deprecate it first.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Message-Id: <20171206144438.28908-3-pasic@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The default css 0xfe is currently restricted to virtual subchannel
devices. The hope when that decision was made was that non-virtual
subchannel devices would come around once guests could exploit multiple
channel subsystems. Since guests generally don't, the pain
of the partitioned (cssid) namespace outweighs the gain.
Let us remove the corresponding restrictions (virtual devices
can be put only in 0xfe and non-virtual devices in any css except
0xfe -- while s390-squash-mcss then remaps everything to cssid 0).
At the same time, change our schema for generating css bus ids to put
both virtual and non-virtual devices into the default css (spilling over
into other css images, if needed). The intention is to deprecate
s390-squash-mcss. With this change devices without a specified devno
won't end up hidden from guests not supporting multiple channel subsystems,
unless this cannot be avoided (default css full).
Let us also advertise the changes to the management software (so it can
tell whether cssids are unrestricted or restricted).
The adverse effect of getting rid of the restriction on migration should
not be too severe. Vfio-ccw devices are not live-migratable yet, and for
virtual devices using the extra freedom would only make sense with the
aforementioned guest support in place.
The auto-generated bus ids are affected by both changes. We hope to not
encounter any auto-generated bus ids in production as Libvirt is always
explicit about the bus id. Since 8ed179c937 ("s390x/css: catch section
mismatch on load", 2017-05-18) the worst that can happen because the same
device ended up having a different bus id is a cleanly failed migration.
I find it hard to reason about the impact of changed auto-generated bus
ids on migration for command line users, as I don't know which rules
such a user is supposed to follow.
Another pain-point is down- or upgrade of QEMU for command line users.
The old way and the new way of doing vfio-ccw are mutually incompatible.
Libvirt is only going to support the new way, so for libvirt users, the
possible problems at QEMU downgrade are the following. If a domain
contains virtual devices placed into a css different than 0xfe the domain
will refuse to start with a QEMU not having this patch. Putting devices
into a css different from 0xfe, however, won't make much sense in the near
future (guest support). Libvirt will refuse to do vfio-ccw with a QEMU
not having this patch. This is business as usual.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20171206144438.28908-2-pasic@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
When dispatching a memory access to a PCI BAR region, we must
look for possible subregions, used by the PCI device to map
different memory areas inside the same PCI BAR.
Since the data offset we received is calculated starting at the
region's start address, we need to adjust the offset for the subregion.
The data offset inside the subregion is calculated by subtracting
the subregion's starting address from the data offset in the region.
The access to the MSIX region is now handled in a generic way,
we do not need the specific trap_msix() function anymore.
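An illustrative sketch of that adjustment, poking directly at MemoryRegion
internals for brevity (the actual lookup in the patch may differ):
MemoryRegion *subregion;

QTAILQ_FOREACH(subregion, &mr->subregions, subregions_link) {
    if (offset >= subregion->addr &&
        offset < subregion->addr + int128_get64(subregion->size)) {
        /* make the offset relative to the subregion's start address */
        offset -= subregion->addr;
        mr = subregion;
        break;
    }
}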
Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Message-Id: <1512046530-17773-8-git-send-email-pmorel@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Enhance the fault detection.
Fix up the precedence to check the destination path existence
before checking for the source accessibility.
Add the maxstbl entry to both the Query PCI Function Group
response and the PCIBusDevice structure.
Initialize the maxstbl to 128 by default until we get
the actual data from the hardware.
Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Message-Id: <1512046530-17773-5-git-send-email-pmorel@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
There are two places where the same endianness conversion
is done.
Let's factor this out into a static function.
Note that the conversion must always be done for data in a register:
the S390 BE guest converts the data to LE before issuing the instruction.
After interception in a BE host:
ZPCI VFIO using pwrite must convert the data back for the BE kernel.
The kernel will do the BE-to-LE translation when loading the register for
the real instruction.
After interception in an LE host:
TCG stores a BE register in LE, swapping bytes.
But since the data in the register was already LE, it is now BE, and
ZPCI VFIO must convert it to LE before writing to the PCI memory.
In both cases ZPCI VFIO must swap the bytes from the register.
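A minimal sketch of what such a shared byte-swap helper could look like; the
function name is invented here and the compiler builtins stand in for
whatever swap primitives the real code uses:

    #include <stdint.h>

    /* Swap the low 'len' bytes of a register value, since in both the KVM
     * and the TCG interception cases the bytes coming from the register
     * need to be swapped before they reach PCI memory. */
    static uint64_t zpci_swap_reg(uint64_t data, int len)
    {
        switch (len) {
        case 1:
            return data & 0xff;
        case 2:
            return __builtin_bswap16((uint16_t)data);
        case 4:
            return __builtin_bswap32((uint32_t)data);
        case 8:
            return __builtin_bswap64(data);
        default:
            return data;   /* unsupported access size, left unchanged */
        }
    }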
Signed-off-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Message-Id: <1512046530-17773-2-git-send-email-pmorel@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
s390_cpu_virt_mem_rw() must always return, so callers can react to
an exception (e.g. see ioinst_handle_stcrw()).
Therefore, using program_interrupt() is wrong. Fix that up.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171130162744.25442-9-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
s390_cpu_virt_mem_rw() must always return, so callers can react to
an exception (e.g. see ioinst_handle_stcrw()).
However, for TCG we always have to exit the cpu loop (and restore the
cpu state before that) if we injected a program interrupt. So let's
introduce and use s390_cpu_virt_mem_handle_exc() in code that is not
purely KVM.
Directly pass the retaddr we already have available in these functions.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171130162744.25442-8-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Once we wire up TCG, we will need the retaddr to correctly inject
program interrupts. As we want to get rid of the function
program_interrupt(), convert PCI code too.
For KVM, we can simply use RA_IGNORED.
Convert program_interrupt() to s390_program_interrupt() directly, making
use of the passed address.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171130162744.25442-6-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
TCG needs the retaddr when injecting an interrupt. Let's just pass it
along and use RA_IGNORED for KVM, where the value will be completely
ignored.
Convert program_interrupt() to s390_program_interrupt() directly, making
use of the passed address.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171130162744.25442-5-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
This allows us to easily convert more callers of program_interrupt() and to
easily introduce new exceptions without forgetting about the cpu state
reset.
Use s390_program_interrupt() in places where we already had the same
pattern. We will later get rid of program_interrupt().
RA != 0 checks are already done behind the scenes.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171130162744.25442-2-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The QEMU ELF loader does not zero the bss segment.
This resulted in several bugs, e.g. see
commit 5d739a4787 (s390-ccw.img: Fix sporadic errors with ccw boot image - initialize css)
commit 6a40fa2669d3 (s390-ccw.img: Initialize next_idx)
commit 8775d91a0f (pc-bios/s390-ccw: Fix problem with invalid virtio-scsi LUN when rebooting)
Let's fix this once and forever by letting the BIOS zero the bss itself.
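The fix boils down to something like the following sketch, run early in the
BIOS startup path; the __bss_start/__bss_end symbol names follow the usual
linker-script convention and are assumed here, not taken from the actual
image:

    /* Symbols provided by the linker script around the .bss section. */
    extern char __bss_start[];
    extern char __bss_end[];

    /* Zero .bss ourselves, since the QEMU ELF loader does not do it. */
    static void clear_bss(void)
    {
        char *p;

        for (p = __bss_start; p < __bss_end; p++) {
            *p = 0;
        }
    }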
Suggested-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20171122142627.73170-3-borntraeger@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
valgrind pointed out that we call KVM_S390_GET_IRQ_STATE with an
undefined value for flags. Kernels prior to 4.15 did not use that
field, and later kernels ignore it for compatibility reasons, but we
had better play it safe.
The same is true for SET_IRQ_STATE. We should make sure not to use the
flags field there, either.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20171122142627.73170-2-borntraeger@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
target-arm queue:
* xilinx_spips: set reset values correctly
* MAINTAINERS: fix an email address
* hw/display/tc6393xb: limit irq handler index to TC6393XB_GPIOS
* nvic: Make systick banked for v8M
* refactor get_phys_addr() so we can return the right format PAR
for ATS operations
* implement v8M TT instruction
* fix some minor v8M bugs
* Implement reset for GICv3 ITS
* xlnx-zcu102: Add support for the ZynqMP QSPI
# gpg: Signature made Wed 13 Dec 2017 18:01:31 GMT
# gpg: using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20171213: (43 commits)
xilinx_spips: Use memset instead of a for loop to zero registers
xilinx_spips: Set all of the reset values
xilinx_spips: Update the QSPI Mod ID reset value
MAINTAINERS: replace the unavailable email address
hw/display/tc6393xb: limit irq handler index to TC6393XB_GPIOS
nvic: Make systick banked
nvic: Make nvic_sysreg_ns_ops work with any MemoryRegion
target/arm: Extend PAR format determination
target/arm: Remove fsr argument from get_phys_addr() and arm_tlb_fill()
target/arm: Ignore fsr from get_phys_addr() in do_ats_write()
target/arm: Use ARMMMUFaultInfo in deliver_fault()
target/arm: Convert get_phys_addr_pmsav8() to not return FSC values
target/arm: Convert get_phys_addr_pmsav7() to not return FSC values
target/arm: Convert get_phys_addr_pmsav5() to not return FSC values
target/arm: Convert get_phys_addr_lpae() to not return FSC values
target/arm: Convert get_phys_addr_v6() to not return FSC values
target/arm: Convert get_phys_addr_v5() to not return FSC values
target/arm: Remove fsr argument from arm_ld*_ptw()
target/arm: Provide fault type enum and FSR conversion functions
target/arm: Implement TT instruction
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Provide HMP monitor command execution result as it would be seen
by a user who established an HMP monitor session.
Currently many commands may silently fail without any sign of that.
This patch lets this info be printed when the test is running in
verbose mode.
In the future it might be useful to fail the test if a command has
failed; however, that would require a bit of rework inside the test
engine itself.
A simple example of a silent, unreported failure would be to
add some non-existent HMP command to the 'hmp_cmds' list. In this case
the test will report that it successfully passed without error.
Signed-off-by: Vadim Galitsyn <vadim.galitsyn@profitbricks.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: qemu-devel@nongnu.org
Message-Id: <20171023151310.6462-5-vadim.galitsyn@profitbricks.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
It's easy to use device_add and device_del as replacement instead.
The usb_add and usb_del commands are deprecated since QEMU 2.10,
and nobody complained that they are still needed, so let's get rid
of them now to make the HMP interface a little bit less overloaded.
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1512073140-17672-1-git-send-email-thuth@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
For the v8M security extension, there should be two systick
devices, which use separate banked systick exceptions. The
register interface is banked in the same way as for other
banked registers, including the existence of an NS alias
region for secure code to access the nonsecure timer.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1512154296-5652-3-git-send-email-peter.maydell@linaro.org
Generalize nvic_sysreg_ns_ops so that we can pass it an
arbitrary MemoryRegion which it will use as the underlying
register implementation to apply the NS-alias behaviour
to. We'll want this so we can do the same with systick.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1512154296-5652-2-git-send-email-peter.maydell@linaro.org
Now that do_ats_write() is entirely in control of whether to
generate a 32-bit PAR or a 64-bit PAR, we can make it use the
correct (complicated) condition for doing so.
Signed-off-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1512503192-2239-13-git-send-email-peter.maydell@linaro.org
[PMM: Rebased Edgar's patch on top of get_phys_addr() refactoring;
use arm_s1_regime_using_lpae_format() rather than
regime_using_lpae_format() because the latter will assert
if passed ARMMMUIdx_S12NSE0 or ARMMMUIdx_S12NSE1;
updated commit message appropriately]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In do_ats_write(), rather than using the FSR value from get_phys_addr(),
construct the PAR values using the information in the ARMMMUFaultInfo
struct. This allows us to create a PAR of the correct format regardless
of what the translation table format is.
For the moment we leave the condition for "when should this be a
64 bit PAR" as it was previously; this will need to be fixed to
properly support AArch32 Hyp mode.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>
Message-id: 1512503192-2239-11-git-send-email-peter.maydell@linaro.org
Now that ARMMMUFaultInfo is guaranteed to have enough information
to construct a fault status code, we can pass it in to the
deliver_fault() function and let it generate the correct type
of FSR for the destination, rather than relying on the value
provided by get_phys_addr().
I don't think there are any cases the old code was getting
wrong, but this is more obviously correct.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>
Message-id: 1512503192-2239-10-git-send-email-peter.maydell@linaro.org
Make get_phys_addr_pmsav5() return a fault type in the ARMMMUFaultInfo
structure, which we convert to the FSC at the callsite.
Note that PMSAv5 does not define any guest-visible fault status
register, so the different "fsr" values we were previously
returning are entirely arbitrary. So we can just switch to using
the most appropriate fi->type values without worrying that we
need to special-case FaultInfo->FSC conversion for PMSAv5.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>
Message-id: 1512503192-2239-7-git-send-email-peter.maydell@linaro.org
All the callers of arm_ldq_ptw() and arm_ldl_ptw() ignore the value
that those functions store in the fsr argument on failure: if they
return failure to their callers they will always overwrite the fsr
value with something else.
Remove the argument from these functions and S1_ptw_translate().
This will simplify removing fsr from the calling functions.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>
Message-id: 1512503192-2239-3-git-send-email-peter.maydell@linaro.org
Currently get_phys_addr() and its various subfunctions return
a hard-coded fault status register value for translation
failures. This is awkward because FSR values these days may
be either long-descriptor format or short-descriptor format.
Worse, the right FSR type to use doesn't depend only on the
translation table being walked -- some cases, like fault
info reported to AArch32 EL2 for some kinds of ATS operation,
must be in long-descriptor format even if the translation
table being walked was short format. We can't get those cases
right with our current approach.
Provide fields in the ARMMMUFaultInfo struct which allow
get_phys_addr() to provide sufficient information for a caller to
construct an FSR value themselves, and utility functions which do
this for both long and short format FSR values, as a first step in
switching get_phys_addr() and its children to only returning the
failure cause in the ARMMMUFaultInfo struct.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>
Message-id: 1512503192-2239-2-git-send-email-peter.maydell@linaro.org
For the TT instruction we're going to need to do an MPU lookup that
also tells us which MPU region the access hit. This requires us
to do the MPU lookup without first doing the SAU security access
check, so pull the MPU lookup parts of get_phys_addr_pmsav8()
out into their own function.
The TT instruction also needs to know the MPU region number which
the lookup hit, so provide this information to the caller of the
MPU lookup code, even though get_phys_addr_pmsav8() doesn't
need to know it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1512153879-5291-7-git-send-email-peter.maydell@linaro.org
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
The TT instruction is going to need to look up the MMU index
for a specified security and privilege state. Refactor the
existing arm_v7m_mmu_idx_for_secstate() into a version that
lets you specify the privilege state and one that uses the
current state of the CPU.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1512153879-5291-6-git-send-email-peter.maydell@linaro.org
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
For M profile, we currently have an mmu index MNegPri for
"requested execution priority negative". This fails to
distinguish "requested execution priority negative, privileged"
from "requested execution priority negative, usermode", but
the two can return different results for MPU lookups. Fix this
by splitting MNegPri into MNegPriPriv and MNegPriUser, and
similarly for the Secure equivalent MSNegPri.
This takes us from 6 M profile MMU modes to 8, which means
we need to bump NB_MMU_MODES; this is OK since the point
where we are forced to reduce TLB sizes is 9 MMU modes.
(It would in theory be possible to stick with 6 MMU indexes:
{mpu-disabled,user,privileged} x {secure,nonsecure} since
in the MPU-disabled case the result of an MPU lookup is
always the same for both user and privileged code. However
we would then need to rework the TB flags handling to put
user/priv into the TB flags separately from the mmuidx.
Adding an extra couple of mmu indexes is simpler.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1512153879-5291-5-git-send-email-peter.maydell@linaro.org
In ARMv7M the CPU ignores explicit writes to CONTROL.SPSEL
in Handler mode. In v8M the behaviour is slightly different:
writes to the bit are permitted but will have no effect.
We've already done the hard work to handle the value in
CONTROL.SPSEL being out of sync with what stack pointer is
actually in use, so all we need to do to fix this last loose
end is to update the condition we use to guard whether we
call write_v7m_control_spsel() on the register write.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1512153879-5291-3-git-send-email-peter.maydell@linaro.org
For v8M it is possible for the CONTROL.SPSEL bit value and the
current stack to be out of sync. This means we need to update
the checks used in reads and writes of the PSP and MSP special
registers to use v7m_using_psp() rather than directly checking
the SPSEL bit in the control register.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1512153879-5291-2-git-send-email-peter.maydell@linaro.org
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Voiding the ITS caches is not supposed to happen via
individual register writes. So we introduced a dedicated
ITS KVM device ioctl to perform a cold reset of the ITS:
KVM_DEV_ARM_VGIC_GRP_CTRL/KVM_DEV_ARM_ITS_CTRL_RESET. Let's
use this latter if the kernel supports it.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1511883692-11511-5-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
At the moment the ITS is not properly reset and this causes
various bugs on save/restore. We implement a minimalist reset
through individual register writes, but for kernel versions
before v4.15 this fails to void the vITS cache. We cannot
claim we have a comprehensive reset (hence the error message)
but that's better than nothing.
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1511883692-11511-3-git-send-email-eric.auger@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When support for multiple mappings per region was added, this was
left behind; let's finish the job and remove the unused bits.
Fixes: db0da029a1 ("vfio: Generalize region support")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The vfio_iommu_spapr_tce driver advertises the kernel's support for
the v1 and v2 IOMMU types; however, it is not always possible to use
the requested IOMMU type. For example, a pseries host platform does not
support dynamic DMA windows, so v2 cannot initialize and QEMU fails to
start.
This adds a fallback to the v1 IOMMU if v2 cannot be used.
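A rough sketch of that fallback, using the ioctls from linux/vfio.h; error
handling is trimmed and the function name is invented, so treat this as an
illustration rather than the actual QEMU code:

    #include <sys/ioctl.h>
    #include <linux/vfio.h>

    /* Prefer the v2 sPAPR TCE IOMMU, but fall back to v1 when v2 cannot
     * actually be enabled on this host (e.g. no dynamic DMA windows). */
    static int spapr_select_iommu_type(int container_fd)
    {
        int type = VFIO_SPAPR_TCE_v2_IOMMU;

        if (ioctl(container_fd, VFIO_CHECK_EXTENSION, type) <= 0 ||
            ioctl(container_fd, VFIO_SET_IOMMU, type) != 0) {
            type = VFIO_SPAPR_TCE_IOMMU;
            if (ioctl(container_fd, VFIO_SET_IOMMU, type) != 0) {
                return -1;   /* neither IOMMU type could be enabled */
            }
        }
        return type;
    }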
Fixes: 318f67ce13 ("vfio: spapr: Add DMA memory preregistering (SPAPR IOMMU v2)")
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Commit 8c37faa475 ("vfio-pci, ppc64/spapr: Reorder group-to-container
attaching") moved registration of groups with the vfio-kvm device from
vfio_get_group() to vfio_connect_container(), but it missed the case
where a group is attached to an existing container and takes an early
exit. Perhaps this is a less common case on ppc64/spapr, but on x86
(without viommu) all groups are connected to the same container and
thus only the first group gets registered with the vfio-kvm device.
This becomes a problem if we then hot-unplug the devices associated
with that first group and we end up with KVM being misinformed about
any vfio connections that might remain. Fix by including the call to
vfio_kvm_device_add_group() in this early exit path.
Fixes: 8c37faa475 ("vfio-pci, ppc64/spapr: Reorder group-to-container attaching")
Cc: qemu-stable@nongnu.org # qemu-2.10+
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
The refactoring of commit 296e5a0a6c has a nasty bug:
it accidentally dropped the generation of code to raise
the UNDEF exception when disas_thumb2_insn() returns nonzero.
This means that 32-bit Thumb2 instruction patterns that
ought to UNDEF just act like nops instead. This is likely
to break any number of things, including the kernel's "disable
the FPU and use the UNDEF exception to identify when to turn
it back on again" trick.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1513006964-3371-1-git-send-email-peter.maydell@linaro.org
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
pci_find_primary_bus() only has one user, in pc_xen_hvm_init(). That's
inside the machine construction code, so it already has easy access to the
machine's primary PCI bus.
Get it directly, and thereby remove pci_find_primary_bus(). This removes
one of only a handful of users of the ugly pci_host_bridges global.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
The bus pointer in PCIDevice is basically redundant with QOM information.
It's always initialized to the qdev_get_parent_bus(), the only difference
is the type.
Therefore this patch eliminates the field, instead creating a pci_get_bus()
helper to do the type mangling to derive it conveniently from the QOM
Device object underneath.
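The helper is presumably a thin wrapper around the QOM parent-bus lookup,
along these lines (a sketch built on QEMU's existing qdev_get_parent_bus()
and the DEVICE/PCI_BUS cast macros):

    #include "hw/pci/pci.h"

    /* Derive the PCIBus from the QOM parent bus instead of caching a
     * redundant pointer inside PCIDevice. */
    static inline PCIBus *pci_get_bus(PCIDevice *dev)
    {
        return PCI_BUS(qdev_get_parent_bus(DEVICE(dev)));
    }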
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
A fair proportion of the users of pci_bus_num() want to get the bus
number on a specific device, so first have to look up the bus from the
device then call it. This adds a helper to do that (since we're going
to make looking up the bus slightly more verbose).
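Presumably the new helper is just the composition of the two lookups,
roughly as in this sketch:

    /* Sketch: bus number of the bus a device sits on, built on top of a
     * pci_get_bus()-style helper. */
    static inline int pci_dev_bus_num(PCIDevice *dev)
    {
        return pci_bus_num(pci_get_bus(dev));
    }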
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
include/hw/pci/pci_bus.h contains several data structures related to PCI
bridges that aren't needed by most users of pci_bus.h. We already have
a pci_bridge.h, so move them there.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
pci_bus_init(), pci_bus_new_inplace(), pci_bus_new() and pci_register_bus()
are misleadingly named. They're not used for initializing *any* PCI bus,
but only for a root PCI bus.
Non-root buses - i.e. ones under a logical PCI to PCI bridge - are instead
created with a direct qbus_create_inplace() (see pci_bridge_initfn()).
This patch renames the functions to make it clear they're only used for
a root bus.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
An uninitialised VirtQueue object or one with Vring.align field
set to zero (0) could lead to arithmetic exceptions. Add a unit
test to validate it.
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Switch vmgenid device to use the UUID property type introduced in the
previous patch for its 'guid' property.
One semantic change it introduces is that post-realize modification of
'guid' via HMP or QMP will now be rejected with an error; however,
according to docs/specs/vmgenid.txt this is actually desirable.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Ben Warren <ben@skyportsystems.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
UUIDs (GUIDs) are widely used in VMBus-related stuff, so a dedicated
property type becomes helpful.
The property accepts a string-formatted UUID or a special keyword "auto"
meaning a randomly generated UUID; the latter is also the default when
the property is not given a value explicitly.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The cloud-init program currently allows fetching of its data by repurposing
the 'system' type's 'serial' field. This is a clear abuse of the serial field
that would clash with other valid uses a virt management app might have for
that field.
Fortunately SMBIOS defines an "OEM Strings" table whose purpose is to allow
exposing arbitrary vendor-specific strings to the operating system. This is
perfect for use with cloud-init, or as a way to pass arguments to OS installers
such as anaconda.
This patch makes it easier to support this with QEMU. e.g.
$QEMU -smbios type=11,value=Hello,value=World,value=Tricky,,value=test
Which results in the guest seeing dmidecode data
Handle 0x0E00, DMI type 11, 5 bytes
OEM Strings
String 1: Hello
String 2: World
String 3: Tricky,value=test
It is suggested that any app wanting to make use of this OEM strings capability
for accepting data from the host mgmt layer should use its name as a string
prefix. E.g. to expose OEM strings targeting both cloud-init and anaconda in
parallel the mgmt app could set
$QEMU -smbios type=11,value=cloud-init:ds=nocloud-net;s=http://10.10.0.1:8000/,\
value=anaconda:method=http://dl.fedoraproject.org/pub/fedora/linux/releases/25/x86_64/os
which would appear as
Handle 0x0E00, DMI type 11, 5 bytes
OEM Strings
String 1: cloud-init:ds=nocloud-net;s=http://10.10.0.1:8000/
String 2: anaconda:method=http://dl.fedoraproject.org/pub/fedora/linux/releases/25/x86_64/os
Use of such string prefixes means the app won't have to care which string slot
its data appears in.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Commit 5c0919d020 ("virtio-scsi: Add virtqueue_size parameter allowing
virtqueue size to be set.") introduced a new parameter to virtio-scsi.
Later, commit 9200361060 ("vhost-user-scsi: add missing virtqueue_size
param") added that parameter to the new vhost-user-scsi interface but
neglected the existing vhost-scsi interface it was built on.
Apply the same change to vhost-scsi, so that we can boot a guest with
a device defined. This also avoids crashing a guest when hotplugging
a vhost-scsi device.
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Message-id: 20171201151538.6844-2-farman@linux.vnet.ibm.com
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ppc patch queue 2017-12-05
Alas, this is yet another fix for ppc that I think is worth
squeezing into 2.11. It's a really ugly fix for some pretty ugly
code, but it does seem to address a real problem. It's also a problem
that's appeared relatively recently, since it was either created by,
or made much easier to trigger by, the merge of MTTCG.
# gpg: Signature made Tue 05 Dec 2017 05:24:04 GMT
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.11-20171205:
target/ppc: Fix system lockups caused by interrupt_request state corruption
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Occasionally in Linux guests on x86_64 we're seeing logs like:
ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 00000100req 00000004
when they should read:
ppc_set_irq: 0x55b4e0d562f0 n_IRQ 8 level 1 => pending 00000100req 00000002
The "00000004" is CPU_INTERRUPT_EXITTB yet the code calls
cpu_interrupt(cs, CPU_INTERRUPT_HARD) ("00000002") in this function
just before the log message. Something is causing the HARD bit setting
to get lost.
The knock on effect of losing that bit is the decrementer timer interrupts
don't get delivered which causes the guest to sit idle in its idle handler
and 'hang'.
The issue occurs due to races from code which sets CPU_INTERRUPT_EXITTB.
Rather than poking directly into cs->interrupt_request, that code needs to:
a) hold BQL
b) use the cpu_interrupt() helper
This patch fixes the call sites to do this, fixing the hang. The calls
are made from a variety of contexts so a helper function is added to handle
the necessary locking. This can likely be improved and optimised in the future
but it ensures the code is correct and doesn't lock up as it stands today.
Signed-off-by: Richard Purdie <richard.purdie@linuxfoundation.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Block layer patches for 2.11.0-rc4
# gpg: Signature made Mon 04 Dec 2017 16:46:07 GMT
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
blockjob: Make block_job_pause_all() keep a reference to the jobs
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Starting from commit 40840e419b we are
pausing all block jobs during bdrv_reopen_multiple() to prevent any of
them from finishing and removing nodes from the graph while they are
being reopened.
It turns out that pausing a block job doesn't necessarily prevent it
from finishing: a paused block job can still run its exit function
from the main loop and call block_job_completed(). The mirror block
job in particular always goes to the main loop while it is paused (by
virtue of the bdrv_drained_begin() call in mirror_run()).
Destroying a paused block job during bdrv_reopen_multiple() has two
consequences:
1) The references to the nodes involved in the job are released,
possibly destroying some of them. If those nodes were in the
reopen queue this would trigger the problem originally described
in commit 40840e419b, crashing QEMU.
2) At the end of bdrv_reopen_multiple(), bdrv_drain_all_end() would
not be doing all necessary bdrv_parent_drained_end() calls.
I can reproduce problem 1) easily with iotest 030 by increasing
STREAM_BUFFER_SIZE from 512KB to 8MB in block/stream.c, or by tweaking
the iotest like in this example:
https://lists.gnu.org/archive/html/qemu-block/2017-11/msg00934.html
This patch keeps an additional reference to all block jobs between
block_job_pause_all() and block_job_resume_all(), guaranteeing that
they are kept alive.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
pc, pci, virtio: fixes for rc3
A bunch of fixes all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 01 Dec 2017 17:06:33 GMT
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
pc: fix crash on attempted cpu unplug
virtio: check VirtQueue Vring object is set
vhost: fix error check in vhost_verify_ring_mappings()
dump-guest-memory.py: fix No symbol "vmcoreinfo_find"
vhost: restore avail index from vring used index on disconnection
virtio: Add queue interface to restore avail index from vring used index
i386/msi: Correct mask of destination ID in MSI address
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ppc patch queue 2017-12-04
We are, alas, not yet to the bottom of ppc bugs. This pull request
fixes several more. I believe they're important enough to include in
2.11 despite the late date.
# gpg: Signature made Mon 04 Dec 2017 03:40:56 GMT
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.11-20171204:
spapr: Include "pre-plugged" DIMMS in ram size calculation at reset
target-ppc: Don't invalidate non-supported msr bits
pseries: fix TCG migration
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
At guest reset time, we allocate a hash page table (HPT) for the guest
based on the guest's RAM size. If dynamic HPT resizing is not available we
use the maximum RAM size, if it is we use the current RAM size.
But the "current RAM size" calculation is incorrect - we just use the
"base" ram_size from the machine structure. This doesn't include any
pluggable DIMMs that are already plugged at reset time.
This means that if you try to start a 'pseries' machine with a DIMM
specified on the command line that's much larger than the "base" RAM size,
then the guest will get a woefully inadequate HPT. This can lead to a
guest freeze during boot as it runs out of HPT space during initial MMU
setup.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
When qemu is started with the '-no-acpi' CLI option, an attempt
to unplug a CPU using device_del results in a null pointer
dereference at:
#0 object_get_class
#1 pc_machine_device_unplug_request_cb
#2 qmp_marshal_device_del
which is caused by pcms->acpi_dev == NULL due to ACPI support
being disabled.
Considering that ACPI support is necessary for unplug to work,
check that it's enabled and fail the unplug request gracefully
if no ACPI device is found.
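The shape of the check is presumably something like the following fragment
inside pc_machine_device_unplug_request_cb(); the error message text and the
surrounding error-propagation details are assumptions:

    /* Fail gracefully instead of dereferencing pcms->acpi_dev == NULL. */
    if (!pcms->acpi_dev) {
        error_setg(errp,
                   "CPU hot unplug requires ACPI, but ACPI is disabled");
        return;
    }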
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
A guest could attempt to use an uninitialised VirtQueue object
or an unset Vring.align, leading to an arithmetic exception. Add a
check to avoid it.
Reported-by: Zhangboxian <zhangboxian@huawei.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Since commit f1f9e6c5 "vhost: adapt vhost_verify_ring_mappings() to
virtio 1 ring layout", we check the mapping of each part (descriptor
table, available ring and used ring) of each virtqueue separately.
The checking of a part is done by the vhost_verify_ring_part_mapping()
function: it returns either 0 on success or a negative errno if the
part cannot be mapped at the same place.
Unfortunately, the vhost_verify_ring_mappings() function checks its
return value the other way round. It means that we either:
- only verify the descriptor table of the first virtqueue, and if it
is valid we ignore all the other mappings
- or ignore all broken mappings until we reach a valid one
i.e., we only raise an error if all mappings are broken, and we consider
all mappings valid otherwise (false success), which is obviously
wrong.
This patch ensures that vhost_verify_ring_mappings() only returns
success if ALL mappings are okay.
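In other words, the aggregation must propagate the first failure instead of
only remembering the last result. A self-contained sketch of that control
flow, with made-up names:

    /* check_part() returns 0 on success or a negative errno when the part
     * cannot be mapped at the same place; verify_all_parts() succeeds only
     * if every part of every virtqueue checks out. */
    static int verify_all_parts(int nvqs, int nparts,
                                int (*check_part)(int vq, int part, void *opaque),
                                void *opaque)
    {
        for (int vq = 0; vq < nvqs; vq++) {
            for (int part = 0; part < nparts; part++) {
                int r = check_part(vq, part, opaque);
                if (r < 0) {
                    return r;   /* report the first broken mapping */
                }
            }
        }
        return 0;               /* ALL mappings are okay */
    }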
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
When qemu is compiled without debug, the dump gdb python script can fail with:
Error occurred in Python command: No symbol "vmcoreinfo_find" in current context.
Because vmcoreinfo_find() is inlined and not exported.
Use the underlying object_resolve_path_type() to get the instance instead.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vhost_virtqueue_stop() gets the avail index value from the backend,
unless the backend is not responding.
This happens when the backend crashes; in this case, the internal
state of the virtio queue is inconsistent, causing packets
to corrupt the vring state.
With a Linux guest, it results in following error message on
backend reconnection:
[ 22.444905] virtio_net virtio0: output.0:id 0 is not a head!
[ 22.446746] net enp0s3: Unexpected TXQ (0) queue failure: -5
[ 22.476360] net enp0s3: Unexpected TXQ (0) queue failure: -5
Fixes: 283e2c2adc ("net: virtio-net discards TX data after link down")
Cc: qemu-stable@nongnu.org
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In case of backend crash, it is not possible to restore internal
avail index from the backend value as vhost_get_vring_base
callback fails.
This patch provides a new interface to restore internal avail index
from the vring used index, as done by some vhost-user backend on
reconnection.
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
According to SDM 10.11.1, only bits [19:12] of the MSI address are
the Destination ID. Change the mask to avoid ambiguity, since the VT-d
spec uses bit 4 to indicate a remappable interrupt request.
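For illustration, a mask restricted to bits [19:12] looks roughly like this;
the macro names mirror common usage but are reproduced here only as an
example:

    #include <stdint.h>

    #define MSI_ADDR_DEST_ID_SHIFT 12
    #define MSI_ADDR_DEST_ID_MASK  0x000ff000u   /* bits [19:12] only */

    static uint8_t msi_dest_id(uint32_t msi_addr)
    {
        /* Bit 4 is deliberately left out of the mask: the VT-d spec uses
         * it to flag a remappable interrupt request, not as part of the
         * Destination ID. */
        return (msi_addr & MSI_ADDR_DEST_ID_MASK) >> MSI_ADDR_DEST_ID_SHIFT;
    }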
Signed-off-by: Chao Gao <chao.gao@intel.com>
Signed-off-by: Lan Tianyu <tianyu.lan@intel.com>
Reviewed-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The msr invalidation code (commits 993eb and 2360b) inverts all
bits except MSR_TGPR and MSR_HVB. On non-PowerPC-601 processors
this leads to an incorrect change of excp_prefix in the hreg_store_msr()
function. The problem is that the new msr value gets multiplied by msr_mask
while the inverted msr does not; thus the values of the MSR_EP bit in the
new msr value and in the inverted msr are distinct, so excp_prefix changes
when it should not.
Signed-off-by: Kurban Mallachiev <mallachiev@ispras.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Migration of pseries is broken with TCG because
QEMU tries to restore KVM MMU state unconditionally.
The result is a SIGSEGV in kvm_vm_ioctl():
#0 kvm_vm_ioctl (s=0x0, type=-2146390353)
at qemu/accel/kvm/kvm-all.c:2032
#1 0x00000001003e3e2c in kvmppc_configure_v3_mmu (cpu=<optimized out>,
radix=<optimized out>, gtse=<optimized out>, proc_tbl=<optimized out>)
at qemu/target/ppc/kvm.c:396
#2 0x00000001002f8b88 in spapr_post_load (opaque=0x1019103c0,
version_id=<optimized out>) at qemu/hw/ppc/spapr.c:1578
#3 0x000000010059e4cc in vmstate_load_state (f=0x106230000,
vmsd=0x1009479e0 <vmstate_spapr>, opaque=0x1019103c0,
version_id=<optimized out>) at qemu/migration/vmstate.c:165
#4 0x00000001005987e0 in vmstate_load (f=<optimized out>, se=<optimized out>)
at qemu/migration/savevm.c:748
This patch fixes the problem by not calling the KVM function when running
in TCG mode.
Fixes: d39c90f5f3 ("spapr: Fix migration of Radix guests")
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
One block patch for 2.11.0-rc3
# gpg: Signature made Wed Nov 29 15:28:38 2017 CET
# gpg: using RSA key F407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* mreitz/tags/pull-block-2017-11-29:
block/nfs: fix nfs_client_open for filesize greater than 1TB
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This reverts the effects of commit 4afeffc857 ("blockjob: do not allow
coroutine double entry or entry-after-completion", 2017-11-21)
This fixed the symptom of a bug rather than the root cause. Canceling the
wait on a sleeping blockjob coroutine is generally fine, we just need to
make it work correctly across AioContexts. To do so, use a QEMUTimer
that calls block_job_enter. Use a mutex to ensure that block_job_enter
synchronizes correctly with block_job_sleep_ns.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-By: Jeff Cody <jcody@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Hide the clearing of job->busy in a single function, and set it
in block_job_enter. This lets block_job_do_yield verify that
qemu_coroutine_enter is not used while job->busy = false.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Tested-By: Jeff Cody <jcody@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All callers are using QEMU_CLOCK_REALTIME, and it will not be possible to
support more than one clock when block_job_sleep_ns switches to a single
timer stored in the BlockJob struct.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Tested-By: Jeff Cody <jcody@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The .drained_begin/end callbacks can (directly or indirectly via
aio_poll()) cause block nodes to be removed or the current BdrvChild to
point to a different child node.
Use QLIST_FOREACH_SAFE() to make sure we don't access invalid
BlockDriverStates or accidentally continue iterating the parents of the
new child node instead of the node we actually came from.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When destroying a block job in block_job_unref() we should remove it
from the job list before calling block_job_remove_all_bdrv().
This is because removing the BDSs can trigger an aio_poll() and wake
up other jobs that might attempt to use the block job list. If that
happens the job we're currently destroying should not be in that list
anymore.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Introduced in commit f37708f6b8 (2.10). The NBD spec says a client
can request export names up to 4096 bytes in length, even though
they should not expect success on names longer than 256. However,
qemu hard-codes the limit of 256, and fails to filter out a client
that probes for a longer name; the result is a stack smash that can
potentially give an attacker arbitrary control over the qemu
process.
The smash can be easily demonstrated with this client:
$ qemu-io -f raw nbd://localhost:10809/$(printf %3000d 1 | tr ' ' a)
If the qemu NBD server binary (whether the standalone qemu-nbd, or
the builtin server of QMP nbd-server-start) was compiled with
-fstack-protector-strong, the ability to exploit the stack smash
into arbitrary execution is a lot more difficult (but still
theoretically possible to a determined attacker, perhaps in
combination with other CVEs). Still, crashing a running qemu (and
losing the VM) is bad enough, even if the attacker did not obtain
full execution control.
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
The NBD spec gives us permission to abruptly disconnect on clients
that send outrageously large option requests, rather than having
to spend the time reading to the end of the option. No real
option request requires that much data anyways; and meanwhile, we
already have the practice of abruptly dropping the connection on
any client that sends NBD_CMD_WRITE with a payload larger than 32M.
For comparison, nbdkit drops the connection on any request with
more than 4096 bytes; however, that limit is probably too low
(as the NBD spec states an export name can theoretically be up
to 4096 bytes, which means a valid NBD_OPT_INFO could be even
longer) - even if qemu doesn't permit exports longer than 256
bytes.
It could be argued that a malicious client trying to get us to
read nearly 4G of data on a bad request is a form of denial of
service. In particular, if the server requires TLS, but a client
that does not know the TLS credentials sends any option (other
than NBD_OPT_STARTTLS or NBD_OPT_EXPORT_NAME) with a stated
payload of nearly 4G, then the server was keeping the connection
alive trying to read all the payload, tying up resources that it
would rather be spending on a client that can get past the TLS
handshake. Hence, this warranted a CVE.
Present since at least 2.5 when handling known options, and made
worse in 2.6 when fixing support for NBD_FLAG_C_FIXED_NEWSTYLE
to handle unknown options.
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
If socket_listen_cleanup is passed an invalid FD, then querying the socket
local address will fail. We must thus be prepared for the returned addr to
be NULL.
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
# gpg: Signature made Tue 28 Nov 2017 03:58:11 GMT
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
virtio-net: don't touch virtqueue if vm is stopped
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Guest state should not be touched if the VM is stopped; unfortunately we
didn't check the running state and tried to drain the tx queue
unconditionally in virtio_net_set_status(). A crash was then noticed on a
migration destination when the user typed quit after the virtqueue state
was loaded but before the region cache was initialized. In this case,
virtio_net_drop_tx_queue_data() tries to access the uninitialized
region cache.
Fix this by only dropping tx queue data when the vm is running.
Fixes: 283e2c2adc ("net: virtio-net discards TX data after link down")
Cc: Yuri Benditovich <yuri.benditovich@daynix.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Stefan Hajnoczi <stefanha@redhat.com>
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: qemu-stable@nongnu.org
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
When you cancel an in-progress 'mirror' job (or "active `block-commit`")
with QMP `block-job-cancel`, it emits the event: BLOCK_JOB_CANCELLED.
However, when `block-job-cancel` is issued *after* `drive-mirror` has
indicated (via the event BLOCK_JOB_READY) that the source and
destination have reached synchronization:
[...] # Snip `drive-mirror` invocation & outputs
{
"execute":"block-job-cancel",
"arguments":{
"device":"virtio0"
}
}
{"return": {}}
It (`block-job-cancel`) will counterintuitively emit the event
'BLOCK_JOB_COMPLETED':
{
"timestamp":{
"seconds":1510678024,
"microseconds":526240
},
"event":"BLOCK_JOB_COMPLETED",
"data":{
"device":"virtio0",
"len":41126400,
"offset":41126400,
"speed":0,
"type":"mirror"
}
}
But this is expected behaviour, where the _COMPLETED event indicates
that synchronization has successfully ended (and the destination now has
a point-in-time copy, which is at the time of cancel).
So add a small note to this effect in 'block-core.json'. While at it,
also update the "Live disk synchronization -- drive-mirror and
blockdev-mirror" section in 'live-block-operations.rst'.
(Thanks: Max Reitz for reminding me of this caveat on IRC.)
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This documents the image locking feature and explains when and how
related options can be used.
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Both of these tests are for formats which now stipulate that they are
read-only. Adjust the tests to match.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Lukáš Doktor <ldoktor@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
cpu->compat_pvr is used to store the current compat mode of the cpu.
On the receiving side during incoming migration we check compatibility
with the compat mode by calling ppc_set_compat(). However we fail to set
the compat mode with the hypervisor since the "new" compat mode doesn't
differ from the current (due to a "cpu->compat_pvr != compat_pvr" check).
This means that kvm runs the vcpus without a compat mode, which is
incorrect behaviour. The implication is that a compatibility mode
will never be in effect after migration.
To fix this so that the compat mode is correctly set with the
hypervisor, store the desired compat mode and reset cpu->compat_pvr to
zero before calling ppc_set_compat().
Fixes: 5dfaa532 ("ppc: fix ppc_set_compat() with KVM PR")
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The patb_entry is used to store the location of the process table in
guest memory. The msb is also used to indicate the mmu mode of the
guest, that is patb_entry & 1 << 63 ? radix_mode : hash_mode.
Currently we set this to zero in spapr_setup_hpt_and_vrma() since if
this function gets called then we know we're hash. However some code
paths, such as setting up the hpt on incoming migration of a hash guest,
call spapr_reallocate_hpt() directly, bypassing this higher level
function. Since we assume radix if the host is capable, this results in
the msb in patb_entry being left set, so in spapr_post_load() we call
kvmppc_configure_v3_mmu() and tell the host we're radix, which as
expected means addresses cannot be translated once we actually run the cpu.
To fix this move the zeroing of patb_entry into spapr_reallocate_hpt().
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In our various supported host OSes, the time_t type may be either 32
or 64 bit, and could in theory also be either signed or unsigned.
Notably, in OpenBSD time_t is a 64 bit type even if 'long' is 32
bits, so using LONG_MAX for TIME_MAX is incorrect.
Use an approach suggested by Paolo Bonzini which calculates
the maximum value of the type rather than hardcoding it;
to do this we use the TYPE_MAXIMUM macro from Gnulib.
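The Gnulib-style macro derives the maximum from the type's width and
signedness instead of assuming it equals LONG_MAX; a sketch of the idea:

    #include <limits.h>
    #include <time.h>

    #define TYPE_SIGNED(t)  (!((t)0 < (t)-1))
    #define TYPE_WIDTH(t)   (sizeof(t) * CHAR_BIT)
    #define TYPE_MAXIMUM(t)                                   \
        ((t)(!TYPE_SIGNED(t)                                  \
             ? (t)-1                                          \
             : ((((t)1 << (TYPE_WIDTH(t) - 2)) - 1) * 2 + 1)))

    /* time_t may be 32 or 64 bit, signed or unsigned; this works for all. */
    #define TIME_MAX TYPE_MAXIMUM(time_t)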
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1511452598-6077-1-git-send-email-peter.maydell@linaro.org
We no longer support the old s390 transport, and neither does the newest
Linux kernel. Remove it from the linux header script as well as the
s390x virtio code. We should still handle the VIRTIO_NOTIFY hypercall,
to tolerate early printk on older guest kernels without an sclp console.
We continue to ignore these events.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20171115154223.109991-1-borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Commit 2726627197 started to use tb_unlock() and tlb_set_dirty() in
non-TCG code. Add the functions as stubs, so that builds with TCG
disabled continue to compile.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Acked-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
[PMM: tweaked commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When migrating a VM with 'migrate_set_capability postcopy-ram on'
a postcopy_state is set during the process, ending up with the
state POSTCOPY_INCOMING_END when the migration is over. This
postcopy_state is taken into account inside ram_load to check
how it will load the memory pages. This same ram_load is also called
by the loadvm command.
Inside ram_load, the logic to see if we're at postcopy_running state
is:
postcopy_running = postcopy_state_get() >= POSTCOPY_INCOMING_LISTENING
postcopy_state_get() returns this enum type:
typedef enum {
POSTCOPY_INCOMING_NONE = 0,
POSTCOPY_INCOMING_ADVISE,
POSTCOPY_INCOMING_DISCARD,
POSTCOPY_INCOMING_LISTENING,
POSTCOPY_INCOMING_RUNNING,
POSTCOPY_INCOMING_END
} PostcopyState;
In the case where ram_load is executed and postcopy_state is
POSTCOPY_INCOMING_END, postcopy_running will be set to 'true' and
ram_load will behave like a postcopy is in progress. This scenario isn't
achievable in a migration but it is reproducible when executing
savevm/loadvm after migrating with 'postcopy-ram on', causing loadvm
to fail with Error -22:
Source:
(qemu) migrate_set_capability postcopy-ram on
(qemu) migrate tcp:127.0.0.1:4444
Dest:
(qemu) migrate_set_capability postcopy-ram on
(qemu)
ubuntu1704-intel login:
Ubuntu 17.04 ubuntu1704-intel ttyS0
ubuntu1704-intel login: (qemu)
(qemu) savevm test1
(qemu) loadvm test1
Unknown combination of migration flags: 0x4 (postcopy mode)
error while loading state for instance 0x0 of device 'ram'
Error -22 while loading VM state
(qemu)
This patch fixes this problem by changing the existing logic for
postcopy_advised and postcopy_running in ram_load, making them
'false' if we're at POSTCOPY_INCOMING_END state.
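Using the enum quoted above, the adjusted checks presumably take a shape
like the following sketch (the exact variable handling inside ram_load()
may differ):

    /* Treat POSTCOPY_INCOMING_END as "no postcopy in progress". */
    PostcopyState ps = postcopy_state_get();

    bool postcopy_advised = (ps >= POSTCOPY_INCOMING_ADVISE &&
                             ps < POSTCOPY_INCOMING_END);
    bool postcopy_running = (ps >= POSTCOPY_INCOMING_LISTENING &&
                             ps < POSTCOPY_INCOMING_END);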
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
CC: Juan Quintela <quintela@redhat.com>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reported-by: Balamuruhan S <bala24@linux.vnet.ibm.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Migration of a system under stress (for example, with
"stress-ng --numa 2") triggers on the destination
some kernel watchdog messages like:
NMI watchdog: BUG: soft lockup - CPU#0 stuck for 3489660870s!
NMI watchdog: BUG: soft lockup - CPU#1 stuck for 3489660884s!
This problem appears with the changes introduced by
42043e4 spapr: clock should count only if vm is running
I think this commit merely exposes the problem rather than causing it.
Kernel computes the soft lockup duration using the
Virtual Timebase register (VTB), not using the Timebase
Register (TBR, the one 42043e4 stops).
It appears VTB is not migrated, so this patch adds it to
the list of SPRs to migrate, which fixes the problem.
For the migration, I've tested a migration from qemu-2.8.0 and
pseries-2.8.0 to a patched master (qemu-2.11.0-rc1). The received
VTB is 0 (as it is not initialized by qemu-2.8.0), but the value
seems to be ignored by KVM and a non-zero VTB is used by the kernel.
I have no explanation for that, but as the original problem appears
only with an SMP system under stress, I suspect some problems in KVM
(I think because VTB is shared by all threads of a core).
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The spapr-vty device implements the PAPR defined virtual console,
which is also implemented by IBM's proprietary PowerVM hypervisor.
PowerVM's implementation has a bug where it inserts an extra \0 after
every \r going to the guest. Because of that Linux's guest side
driver has a workaround which strips \0 characters that appear
immediately after a \r.
That means that when running under qemu, sending a binary stream from
host to guest via spapr-vty which happens to include a \r\0 sequence
will get corrupted by that workaround.
To deal with that, this patch duplicates PowerVM's bug, inserting an
extra \0 after each \r. Ugly, but the best option available.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
LUNs >= 256 have to be encoded with the so-called "flat space
addressing method" for virtio-scsi, where an additional bit has to
be set. SLOF already took care of this with the following commit:
https://git.qemu.org/?p=SLOF.git;a=commitdiff;h=f72a37713fea47da
(see https://bugzilla.redhat.com/show_bug.cgi?id=1431584 for details)
But QEMU does not use this encoding yet for device tree paths
that have to be handed over to SLOF to deal with the "bootindex"
property, so SLOF currently fails to boot from virtio-scsi devices
with LUNs >= 256 in the right boot order. Fix it by using the bit
to indicate the "flat space addressing method" for LUNs >= 256.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When doing a live migration of a Xen guest with libxl, the images for
block devices are locked by the original QEMU process, and this prevents
the QEMU at the destination from taking the lock, so the migration fails.
From QEMU's point of view, once the RAM of a domain is migrated, there are
two QMP commands, "stop" then "xen-save-devices-state", at which point a
new QEMU is spawned at the destination.
Release the locks in "xen-save-devices-state" so the destination can take
them, if it's a live migration.
This patch adds the "live" parameter to "xen-save-devices-state", which
defaults to true so that older versions of libxenlight can work with newer
versions of QEMU.
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
# gpg: Signature made Tue 21 Nov 2017 17:01:33 GMT
# gpg: using RSA key 0xBDBE7B27C0DE3057
# gpg: Good signature from "Jeffrey Cody <jcody@redhat.com>"
# gpg: aka "Jeffrey Cody <jeff@codyprime.org>"
# gpg: aka "Jeffrey Cody <codyprime@gmail.com>"
# Primary key fingerprint: 9957 4B4D 3474 90E7 9D98 D624 BDBE 7B27 C0DE 3057
* remotes/cody/tags/block-pull-request:
qemu-iotest: add test for blockjob coroutine race condition
qemu-iotests: add option in common.qemu for mismatch only
coroutine: abort if we try to schedule or enter a pending coroutine
blockjob: do not allow coroutine double entry or entry-after-completion
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add an option to echo the response to a QMP / HMP command only on mismatch.
This is useful for ignoring all normal responses while still catching
things like segfaults.
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
The previous patch fixed a race condition in which coroutines could be
entered twice, or entered after coroutine deletion.
We can detect common scenarios when this happens, and print an error
message and abort before we corrupt memory / data, or segfault.
This patch will abort if an attempt is made to enter a coroutine while it
is currently pending execution, either in a specific AioContext bh or
pending execution via a timer. It will also abort if a coroutine is
scheduled before a prior scheduled run has occurred.
We cannot rely on the existing co->caller check for recursive re-entry
to catch this, as the coroutine may run and exit with
COROUTINE_TERMINATE before the scheduled coroutine executes.
(This is the scenario that was occurring and fixed in the previous
patch).
This patch also re-orders the Coroutine struct elements in an attempt to
optimize caching.
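A minimal sketch of the scheduling guard, assuming a 'scheduled' marker on the Coroutine as described in the text (field and helper names are assumptions, not the exact implementation):

    static void coroutine_enter_checked(Coroutine *co)
    {
        /* 'scheduled' records who scheduled the coroutine (bh or timer);
         * entering it again before that run happens would be fatal. */
        const char *scheduled = atomic_read(&co->scheduled);

        if (scheduled) {
            fprintf(stderr, "Cannot enter a coroutine that was previously "
                    "scheduled by %s\n", scheduled);
            abort();
        }
        qemu_coroutine_enter(co);
    }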
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
When block_job_sleep_ns() is called, the co-routine is scheduled for
future execution. If we allow the job to be re-entered prior to the
scheduled time, we present a race condition in which a coroutine can be
entered recursively, or even entered after the coroutine is deleted.
The job->busy flag is used by blockjobs when a coroutine is busy
executing. The function 'block_job_enter()' obeys the busy flag,
and will not enter a coroutine if set. If we sleep a job, we need to
leave the busy flag set, so that subsequent calls to block_job_enter()
are prevented.
This changes the prior behavior of block_job_cancel() being able to
immediately wake up and cancel a job; in practice, this should not be an
issue, as the coroutine sleep times are generally very small, and the
cancel will occur the next time the coroutine wakes up.
This fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1508708
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Block layer patches for 2.11.0-rc2
# gpg: Signature made Tue 21 Nov 2017 15:09:12 GMT
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
iotests: Fix 176 on 32-bit host
block: Close a BlockDriverState completely even when bs->drv is NULL
block: Error out on load_vm with active dirty bitmaps
block: Add errp to bdrv_all_goto_snapshot()
block: Add errp to bdrv_snapshot_goto()
block: Don't request I/O permission with BDRV_O_NO_IO
block: Don't use BLK_PERM_CONSISTENT_READ for format probing
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Developers sometimes mistakenly run 'make test' instead of 'make check'.
'make test' triggers the ancient, unmaintained tcg unit tests in
tests/tcg/Makefile which have long since ceased compiling.
Even if someone fixes the TCG tests, it makes little sense to put
them in a 'make test' target; rather, they should be 'make check-tcg',
possibly wired up as a dependency of 'make check'.
In the meantime, this patch disarms the 'make test' trap by simply
deleting it so users get an immediate error. This should be enough
for them to remember to type 'make check' instead (or 'make help'
to learn). It also deletes 'make speed' which is another route
into the tcg tests.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Message-id: 20171121142538.22072-1-berrange@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Block patches for 2.11.0-rc2
# gpg: Signature made Tue Nov 21 14:54:28 2017 CET
# gpg: using RSA key F407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* mreitz/tags/pull-block-2017-11-21:
iotests: Fix 176 on 32-bit host
block: Close a BlockDriverState completely even when bs->drv is NULL
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The contents of a qcow2 bitmap are rounded up to a size that
matches the number of bits available for the granularity, but
that granularity differs for 32-bit hosts (our default 64k
cluster allows for 2M bitmap coverage per 'long') and 64-bit
hosts (4M bitmap per 'long'). If the image is a multiple of
2M but not 4M, then the number of bytes occupied by the array
of longs in memory differs between architectures, thus
resulting in different SHA256 hashes.
Furthermore (but untested by me), if our computation of the
SHA256 hash is at all endian-dependent because of how we store
data in memory, that's another variable we'd have to account
for (ideally, we specified the bitmap stored in qcow2 as
fixed-endian on disk, because the same qcow2 file must be
usable across any architecture; but that says nothing about
how we represent things in memory). But we already have test
165 to validate that bitmaps are stored correctly on disk,
while this test is merely testing that the bitmap exists.
So for this test, the easiest solution is to filter out the
actual hash value. Broken in commit 4096974e.
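A standalone illustration of that rounding (not the iotest code): the in-memory array is padded to whole 'unsigned long' words, 4 bytes on 32-bit hosts and 8 bytes on 64-bit ones, so an image that is a multiple of 2M but not of 4M yields different array sizes and hence different hashes:

    #include <stddef.h>
    #include <stdint.h>

    /* Bytes occupied by the in-memory bitmap, padded to whole longs. */
    static size_t bitmap_mem_bytes(uint64_t image_size, uint64_t granularity)
    {
        uint64_t bits = (image_size + granularity - 1) / granularity;
        size_t bytes = (size_t)((bits + 7) / 8);
        size_t word = sizeof(unsigned long);   /* 4 on 32-bit, 8 on 64-bit */

        return ((bytes + word - 1) / word) * word;
    }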
Reported-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20171117190422.23626-1-eblake@redhat.com
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
bdrv_close() skips much of its logic when bs->drv is NULL. This is
fine when we're closing a BlockDriverState that has just been created
(because e.g the initialization process failed), but it's not enough
in other cases.
For example, when a valid qcow2 image is found to be corrupted then
QEMU marks it as such in the file header and then sets bs->drv to
NULL in order to make the BlockDriverState unusable. When that BDS is
later closed, many of its data structures are not freed (leaking
their memory) and none of its children are detached. This results in
bdrv_close_all() failing to close all BDSs and making this assertion
fail when QEMU is being shut down:
bdrv_close_all: Assertion `QTAILQ_EMPTY(&all_bdrv_states)' failed.
This patch makes bdrv_close() do the full uninitialization process
in all cases. This fixes the problem with corrupted images and still
works fine with freshly created BDSs.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 20171106145345.12038-1-berto@igalia.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Loading a snapshot invalidates the bitmap. Just marking all blocks dirty
is not a useful response in practice, instead the user needs to be aware
that we switch to a completely different state. If they are okay with
losing the dirty bitmap, they can just explicitly delete it.
This effectively reverts commit 04dec3c3ae.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
'qemu-img info' makes sense even when BLK_PERM_CONSISTENT_READ cannot be
granted because of a block job in a running qemu process. It already
sets BDRV_O_NO_IO to indicate that it doesn't access the guest visible
data at all.
Check the BDRV_O_NO_IO flags in blk_new_open(), so that I/O related
permissions are not unnecessarily requested and 'qemu-img info' can work
even if BLK_PERM_CONSISTENT_READ cannot be granted.
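The gist of the check, as a hedged sketch (the exact code in blk_new_open() may differ):

    uint64_t perm = 0;

    /* Sketch: BDRV_O_NO_IO callers such as 'qemu-img info' get no
     * I/O-related permissions at all. */
    if (!(flags & BDRV_O_NO_IO)) {
        perm |= BLK_PERM_CONSISTENT_READ;
        if (flags & BDRV_O_RDWR) {
            perm |= BLK_PERM_WRITE | BLK_PERM_RESIZE;
        }
    }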
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
For format probing, we don't really care whether all of the image
content is consistent. The only thing we're looking at is the image
header, and specifically the magic numbers that are expected to never
change, no matter how inconsistent the guest visible disk content is.
Therefore, don't request BLK_PERM_CONSISTENT_READ. This allows format
probing to be used, e.g. in the context of 'qemu-img info', even while the
guest visible data in the image is inconsistent during a running block
job.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
qemu.org enabled HTTPS in 2017 and it should be used instead of HTTP.
There are also URLs to json.org, openvpn.net, and other domains that
support HTTPS.
This patch updates the qemu.org domains everywhere and also third-party
domains that I have checked.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20171121120435.28728-3-stefanha@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The owner of qemu.org has delegated authority to modify DNS records to
the QEMU Project. This has allowed us to use the domain name without
worries about IP address changes or technical issues disrupting service.
The issues described in commit 8593898109
("Use qemu-project.org domain name") have therefore been mitigated.
This patch switches back to consistently using qemu.org instead of
qemu-project.org in documentation, version.rc, and the Windows installer
script.
The git submodules and SeaBIOS still use qemu-project.org for the time
being. This will be fixed in the QEMU 2.12 release cycle.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20171121120435.28728-2-stefanha@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
To do a write to memory that is marked as notdirty, we need
to invalidate any TBs we have cached for that memory, and
update the cpu physical memory dirty flags for VGA and migration.
The slowpath code in notdirty_mem_write() does all this correctly,
but the new atomic handling code in atomic_mmu_lookup() doesn't
do anything at all, it just clears the dirty bit in the TLB.
The effect of this bug is that if the first write to a notdirty
page for which we have cached TBs is by a guest atomic access,
we fail to invalidate the TBs and subsequently will execute
incorrect code. This can be seen by trying to run 'javac' on AArch64.
Use the new notdirty_call_before() and notdirty_call_after()
functions to correctly handle the update to notdirty memory
in the atomic codepath.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1511201308-23580-3-git-send-email-peter.maydell@linaro.org
The function notdirty_mem_write() has a sequence of actions
it has to do before and after the actual business of writing
data to host RAM to ensure that dirty flags are correctly
updated and we flush any TCG translations for the region.
We need to do this also in other places that write directly
to host RAM, most notably the TCG atomic helper functions.
Pull out the before and after pieces into their own functions.
We use an API where the prepare function stashes the various
bits of information about the write into a struct for the
complete function to use, because in the calls for the atomic
helpers the place where the complete function will be called
doesn't have the information to hand.
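A usage sketch of the API named above; the struct contents and exact argument lists are assumptions, not the final signatures:

    NotDirtyInfo ndi;

    /* Sketch: bracket a direct store into host RAM so TB invalidation and
     * dirty-flag updates happen around it. */
    notdirty_call_before(&ndi, cpu, ram_addr, mem_vaddr, size);
    /* ... perform the (atomic) store into host memory here ... */
    notdirty_call_after(&ndi);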
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 1511201308-23580-2-git-send-email-peter.maydell@linaro.org
The counters obtained by GetIfEntry are 32 bits wide and may overflow.
Thus use GetIfEntry2 instead of GetIfEntry.
Signed-off-by: ZhiPeng Lu <lu.zhipeng@zte.com.cn>
*avoid CamelCase variable names
*update field names for MIB_IFROW -> MIB_IF_ROW2
*dynamically probe for GetIfIndex2 to deal with older OSs
*check return value from get_interface_index
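For reference, a hedged, self-contained sketch of the Win32 call in isolation (the guest-agent code itself differs): MIB_IF_ROW2, filled in by GetIfEntry2, carries 64-bit octet counters, so the totals no longer wrap at 4 GiB the way the 32-bit MIB_IFROW fields can:

    #include <winsock2.h>
    #include <iphlpapi.h>
    #include <netioapi.h>

    static int query_out_octets(NET_IFINDEX if_index, ULONG64 *out_octets)
    {
        MIB_IF_ROW2 row = { .InterfaceIndex = if_index };  /* rest zeroed */

        if (GetIfEntry2(&row) != NO_ERROR) {
            return -1;
        }
        *out_octets = row.OutOctets;    /* 64-bit counter */
        return 0;
    }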
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
ppc patch queue 2017-11-20
Here's the current queue of ppc patches. These 2 patches are both
more complex than I'd ideally like this late in the 2.11 cycle.
However, they do fix important bugs, so I think it's worth it on
balance.
# gpg: Signature made Mon 20 Nov 2017 03:27:19 GMT
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.11-20171120:
spapr: reset DRCs after devices
target/ppc: Update setting of cpu features to account for compat modes
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Contains the following commit:
- pc-bios/s390-ccw: Fix problem with invalid virtio-scsi LUN when rebooting
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
# gpg: Signature made Mon 20 Nov 2017 03:28:54 GMT
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
hw/net/vmxnet3: Fix code to work on big endian hosts, too
net: Transmit zero UDP checksum as 0xFFFF
MAINTAINERS: Add missing entry for eepro100 emulation
hw/net/eepro100: Fix endianness problem on big endian hosts
Revert "Add new PCI ID for i82559a"
colo-compare: fix the dangerous assignment
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In commit 7c4ee5bcc8 we changed the order in which we construct
the AUXV, but forgot to adjust the calculation of the length. The
result is that we set info->auxv_len to a bogus and negative value,
and then later on the code in open_self_auxv() gets confused and
ends up presenting the guest with an empty file.
Since we now have to calculate the auxv length up-front as part
of figuring out how much we're going to put on the stack, set
info->auxv_len then; this allows us to assert that we put the
same number of entries into auxv as we pre-calculated, rather
than merely having a comment saying we need to do that.
Fixes: https://bugs.launchpad.net/qemu/+bug/1728116
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
The ASPEED hardware contains a lock register for the SCU that disables
any writes to the SCU when it is locked. The machine comes up with the
lock enabled, but on all known hardware u-boot will unlock it and leave
it unlocked when loading the kernel.
This means the kernel expects the SCU to be unlocked. When booting from
an emulated ROM the normal u-boot unlock path is executed. Things don't
go well when booting using the -kernel command line, as u-boot does not
run first.
Change the behaviour so that when a kernel is passed to the machine, the
reset value of the SCU is set to unlocked.
Signed-off-by: Joel Stanley <joel@jms.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Message-id: 20171114122018.12204-1-joel@jms.id.au
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In do_ats_write(), rather than using extended_addresses_enabled() to
decide whether the value we get back from get_phys_addr() is a 64-bit
format PAR or a 32-bit one, use arm_s1_regime_using_lpae_format().
This is not really the correct answer, because the PAR format
depends on the AT instruction being used, not just on the
translation regime. However getting this correct requires a
significant refactoring, so that get_phys_addr() returns raw
information about the fault which the caller can then assemble
into a suitable FSR/PAR/syndrome for its purposes, rather than
get_phys_addr() returning a pre-formatted FSR.
However this change at least improves the situation by making
the PAR work correctly for address translation operations done
at AArch64 EL2 on the EL2 translation regime. In particular,
this is necessary for Xen to be able to run in our emulation,
so this seems like a safer interim fix given that we are in freeze.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>
Message-id: 1509719814-6191-1-git-send-email-peter.maydell@linaro.org
The CPU ID registers ID_AA64PFR0_EL1, ID_PFR1_EL1 and ID_PFR1
have a field for reporting presence of GICv3 system registers.
We need to report this field correctly in order for Xen to
work as a guest inside QEMU emulation. We mustn't incorrectly
claim the sysregs exist when they don't, though, or Linux will
crash.
Unfortunately the way we've designed the GICv3 emulation in QEMU
puts the system registers as part of the GICv3 device, which
may be created after the CPU proper has been realized. This
means that we don't know at the point when we define the ID
registers what the correct value is. Handle this by switching
them to calling a function at runtime to read the value, where
we can fill in the GIC field appropriately.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Stefano Stabellini <sstabellini@kernel.org>
Message-id: 1510066898-3725-1-git-send-email-peter.maydell@linaro.org
When rebooting a guest that has a virtio-scsi disk, the s390-ccw
bios sometimes bails out with an error message like this:
! SCSI cannot report LUNs: STATUS=02 RSPN=70 KEY=05 CODE=25 QLFR=00, sure !
Enabling the scsi_req* tracing in QEMU shows that the ccw bios is
trying to execute the REPORT LUNS SCSI command with a LUN != 0, and
this causes the SCSI command to fail.
Looks like we neither clear the BSS of the s390-ccw bios during reboot,
nor do we explicitly set the default_scsi_device.lun value to 0, so
this variable can contain random values from the OS after the reboot.
By setting this variable explicitly to 0, the problem is fixed and
the reboots always succeed.
Buglink: https://bugzilla.redhat.com/show_bug.cgi?id=1514352
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1510942228-22822-1-git-send-email-thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Currently, multi threaded TCG with > 1 VCPU gets stuck during IPL, when
the bios tries to switch to the loaded kernel via DIAG 308.
As run_on_cpu() is used, we run into a deadlock after handling the reset.
We need the iolock (just like KVM).
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171116170526.12643-4-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Since commit ab06ec4357 we test the vmxnet3 device in the
pxe-tester, too (when running "make check SPEED=slow"). This now
revealed that the code is not working there if the host is a big
endian machine (for example ppc64 or s390x) - "make check SPEED=slow"
is now failing on such hosts.
The vmxnet3 code lacks endianness conversions in a couple of places.
Interestingly, the bitfields in the structs in vmxnet3.h already tried to
take care of the *bit* endianness of the C compilers - but the code failed
to change the *byte* endianness when reading or writing the corresponding
structs. So the bitfields are now wrapped into unions which allow the byte
endianness to be changed at runtime via the non-bitfield member of the union.
With these changes, "make check SPEED=slow" now properly works on big endian
hosts, too.
Reported-by: David Gibson <dgibson@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: David Gibson <dgibson@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The checksum algorithm used by IPv4, TCP and UDP allows a zero value
to be represented by either 0x0000 or 0xFFFF. But per RFC 768, a zero
UDP checksum must be transmitted as 0xFFFF because 0x0000 is a special
value meaning no checksum.
Substitute 0xFFFF whenever a checksum is computed as zero when
modifying a UDP datagram header. Doing this on IPv4 and TCP checksums
is unnecessary but legal. Add a wrapper for net_checksum_finish() that
makes the substitution.
(We can't just change net_checksum_finish(), as that function is also
used by receivers to verify checksums, and in that case the expected
value is always 0x0000.)
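A standalone sketch of the substitution the wrapper performs (the helper name here is a placeholder, not necessarily the one added by the patch):

    #include <stdint.h>

    /* Per RFC 768, 0x0000 in a UDP header means "no checksum", so a
     * checksum that computes to zero is transmitted as 0xFFFF instead. */
    static uint16_t checksum_finish_nozero(uint16_t csum)
    {
        return csum ? csum : 0xFFFF;
    }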
Signed-off-by: Ed Swierk <eswierk@skyportsystems.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Since commit 1865e288a8 ("Fix eepro100 simple transmission
mode"), the test/pxe-test is broken for the eepro100 device on big
endian hosts. However, it seems like that commit did not introduce the
problem, but just uncovered it: The EEPRO100State->tx.tbd_array_addr and
EEPRO100State->tx.tcb_bytes fields are already in host byte order, since
they have already been byte-swapped in the read_cb() function.
Thus byte-swapping them in tx_command() again results in the wrong
endianness. Removing the byte-swapping here fixes the pxe-test.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
A DRC with a pending unplug request releases its associated device at
machine reset time.
In the case of LMB, when all DRCs for a DIMM device have been reset,
the DIMM gets unplugged, causing guest memory to disappear. This may
be very confusing for anything still using this memory.
This is exactly what happens with vhost backends, and QEMU aborts
with:
qemu-system-ppc64: used ring relocated for ring 2
qemu-system-ppc64: qemu/hw/virtio/vhost.c:649: vhost_commit: Assertion
`r >= 0' failed.
The issue is that each DRC registers a QEMU reset handler, and we
don't control the order in which these handlers are called (ie,
a LMB DRC will unplug a DIMM before the virtio device using the
memory on this DIMM could stop its vhost backend).
To avoid such situations, let's reset DRCs after all devices
have been reset.
Reported-by: Mallesh N. Koti <mallesh@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Reviewed-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The device tree nodes ibm,arch-vec-5-platform-support and ibm,pa-features
are used to communicate features of the cpu to the guest operating
system. The properties of each of these are determined based on the
selected cpu model and the availability of hypervisor features.
Currently the compatibility mode of the cpu is not taken into account.
The ibm,arch-vec-5-platform-support node is used to communicate the
level of support for various ISAv3 processor features to the guest
before CAS to inform the guest's request. The available mmu mode should
only be hash unless the cpu is a POWER9 which is not in a pre-POWER9
compat mode, in which case the available modes depend on the
accelerator and the hypervisor capabilities.
The ibm,pa-features node is used to communicate the level of cpu support
for various features to the guest OS. This should only contain features
relevant to the operating mode of the processor, that is the selected
cpu model taking into account any compat mode. This means that the
compat mode should be taken into account when choosing the properties of
ibm,pa-features, and they should match the compat mode selected, or the
cpu model selected if there is no compat mode.
Update the setting of these cpu features in the device tree as described
above to properly take into account any compat mode. We use the
ppc_check_compat function which takes into account the current processor
model and the cpu compat mode.
Signed-off-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Block layer patches for 2.11.0-rc2
# gpg: Signature made Fri 17 Nov 2017 17:58:36 GMT
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (25 commits)
iotests: Make 087 pass without AIO enabled
block: Make bdrv_next() keep strong references
qcow2: Fix overly broad madvise()
qcow2: Refuse to get unaligned offsets from cache
qcow2: Add bounds check to get_refblock_offset()
block: Guard against NULL bs->drv
qcow2: Unaligned zero cluster in handle_alloc()
qcow2: check_errors are fatal
qcow2: reject unaligned offsets in write compressed
iotests: Add test for failing qemu-img commit
tests: Add check-qobject for equality tests
iotests: Add test for non-string option reopening
block: qobject_is_equal() in bdrv_reopen_prepare()
qapi: Add qobject_is_equal()
qapi/qlist: Add qlist_append_null() macro
qapi/qnull: Add own header
qcow2: fix image corruption on commit with persistent bitmap
iotests: test clearing unknown autoclear_features by qcow2
block: Fix permissions in image activation
qcow2: fix image corruption after committing qcow2 image into base
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Block patches for 2.11.0-rc2
# gpg: Signature made Fri Nov 17 18:22:07 2017 CET
# gpg: using RSA key F407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* mreitz/tags/pull-block-2017-11-17:
iotests: Make 087 pass without AIO enabled
block: Make bdrv_next() keep strong references
qcow2: Fix overly broad madvise()
qcow2: Refuse to get unaligned offsets from cache
qcow2: Add bounds check to get_refblock_offset()
block: Guard against NULL bs->drv
qcow2: Unaligned zero cluster in handle_alloc()
qcow2: check_errors are fatal
qcow2: reject unaligned offsets in write compressed
iotests: Add test for failing qemu-img commit
tests: Add check-qobject for equality tests
iotests: Add test for non-string option reopening
block: qobject_is_equal() in bdrv_reopen_prepare()
qapi: Add qobject_is_equal()
qapi/qlist: Add qlist_append_null() macro
qapi/qnull: Add own header
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
On one hand, it is a good idea for bdrv_next() to return a strong
reference because ideally nearly every pointer should be refcounted.
This fixes intermittent failure of iotest 194.
On the other, it is absolutely necessary for bdrv_next() itself to keep
a strong reference to both the BB (in its first phase) and the BDS (at
least in the second phase) because when called the next time, it will
dereference those objects to get a link to the next one. Therefore, it
needs these objects to stay around until then. Just storing the pointer
to the next in the iterator is not really viable because that pointer
might become invalid as well.
Both arguments taken together mean we should probably just invoke
bdrv_ref() and blk_ref() in bdrv_next(). This means we have to assert
that bdrv_next() is always called from the main loop, but that was
probably necessary already before this patch and judging from the
callers, it also looks to actually be the case.
Keeping these strong references means however that callers need to give
them up if they decide to abort the iteration early. They can do so
through the new bdrv_next_cleanup() function.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20171110172545.32609-1-mreitz@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
@mem_size and @offset are both size_t, thus subtracting them from one
another will just return a big size_t if mem_size < offset -- even more
obvious here because the result is stored in another size_t.
Checking that the result is positive is therefore not sufficient to
exclude the case that offset > mem_size. Thus, we currently sometimes
issue an madvise() over a very large address range.
This is triggered by iotest 163, but with -m64, this does not result in
tangible problems. But with -m32, this test produces three segfaults,
all of which are fixed by this patch.
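A standalone illustration of the pitfall (not the qcow2 code): with unsigned size_t operands the subtraction wraps, so a "result is positive" test passes even when offset exceeds mem_size; compare the operands directly instead:

    #include <stdbool.h>
    #include <stddef.h>

    /* Wrong: (mem_size - offset > 0) also holds when offset > mem_size,
     * because the unsigned subtraction wraps around. */
    static bool range_is_valid(size_t mem_size, size_t offset)
    {
        return offset < mem_size;  /* valid length is then mem_size - offset */
    }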
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20171114184127.24238-1-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Darren Kenny <darren.kenny@oracle.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Instead of using an assertion, it is better to emit a corruption event
here. Checking all offsets for correct alignment can be tedious and it
is easily possible to forget to do so. qcow2_cache_do_get() is a
function every L2 and refblock access has to go through, so this is a
good central point to add such a check.
And for good measure, let us also add an assertion that the offset is
non-zero. Making this a corruption event is not feasible, because a
zero offset usually means something special (such as the cluster is
unused), so all callers should be checking this anyway. If they do not,
it is their fault, hence the assertion here.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20171110203111.7666-6-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
We currently do not guard everywhere against a NULL bs->drv where we
should be doing so. Most of the places fixed here just do not care
about that case at all.
Some care implicitly, e.g. through a prior function call to
bdrv_getlength() which would always fail for an ejected BDS. Add an
assert there to make it more obvious.
Other places seem to care, but do so insufficiently: Freeing clusters in
a qcow2 image is an error-free operation, but it may leave the image in
an unusable state anyway. Giving qcow2_free_clusters() an error code is
not really viable; it is much easier to note that bs->drv may be NULL
even after a successful driver call. This concerns bdrv_co_flush(), and
the way the check is added to bdrv_co_pdiscard() (in every iteration
instead of only once).
Finally, some places employ at least an assert(bs->drv); somewhere, that
may be reasonable (such as in the reopen code), but in
bdrv_has_zero_init(), it is definitely not. Returning 0 there in case
of an ejected BDS saves us much headache instead.
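As a simplified sketch of the bdrv_has_zero_init() behaviour described here (the real function also considers other details such as backing files):

    int bdrv_has_zero_init_sketch(BlockDriverState *bs)
    {
        if (!bs->drv) {
            /* Ejected/unusable BDS: make no zero-initialization promise. */
            return 0;
        }
        if (bs->drv->bdrv_has_zero_init) {
            return bs->drv->bdrv_has_zero_init(bs);
        }
        return 0;
    }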
Reported-by: R. Nageswara Sastry <nasastry@in.ibm.com>
Buglink: https://bugs.launchpad.net/qemu/+bug/1728660
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20171110203111.7666-4-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Currently, bdrv_reopen_prepare() assumes that all BDS options are
strings. However, this is not the case if the BDS has been created
through the json: pseudo-protocol or blockdev-add.
Note that the user-invokable reopen command is an HMP command, so you
can only specify strings there. Therefore, specifying a non-string
option with the "same" value as it was when originally created will now
return an error because the values are supposedly different (and there is
no way for the user to circumvent this but to just not specify the
option again -- however, this is still strictly better than just
crashing).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20171114180128.17076-5-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
If an image contains persistent bitmaps, we cannot use the
fast path of bdrv_make_empty() to clear the image during
qemu-img commit, because that will lose the clusters related
to the bitmaps.
Also leave a comment in qcow2_read_extensions to remind future
feature additions to think about fast-path removal, since we
just barely fixed the same bug for LUKS encryption.
It's a pain that qemu-img has not yet been taught to manipulate,
or even at a very minimum display, information about persistent
bitmaps; instead, we have to use QMP commands. It's also a
pain that only query-block and x-debug-block-dirty-bitmap-sha256
will allow bitmap introspection; but the former requires the
node to be hooked to a block device, and the latter is experimental.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Test clearing unknown autoclear_features by qcow2 on incoming
migration.
[ kwolf: Fixed wait for destination VM startup ]
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Inactive images generally request fewer permissions for their image files
than they would if they were active (in particular, write permissions).
Activating the image therefore involves extending the permissions.
drv->bdrv_invalidate_cache() can already require write access to the
image file, so we have to update the permissions earlier than that.
The current code does it only later, so we have to move up this part.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
nbd patches for 2017-11-17
Eric Blake - nbd: Don't crash when server reports NBD_CMD_READ failure
Eric Blake - nbd/client: Use error_prepend() correctly
Eric Blake - nbd/client: Don't hard-disconnect on ESHUTDOWN from server
Eric Blake - nbd/server: Fix error reporting for bad requests
# gpg: Signature made Fri 17 Nov 2017 14:53:30 GMT
# gpg: using RSA key 0xA7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>"
# gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>"
# gpg: aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A
* remotes/ericb/tags/pull-nbd-2017-11-17:
nbd/server: Fix error reporting for bad requests
nbd/client: Don't hard-disconnect on ESHUTDOWN from server
nbd/client: Use error_prepend() correctly
nbd: Don't crash when server reports NBD_CMD_READ failure
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The NBD spec says an attempt to NBD_CMD_TRIM on a read-only
export should fail with EPERM, as a trim has the potential
to change disk contents, but we were relying on the block
layer to catch that for us, which might not always give the
right error (and even if it does, it does not let us pass
back a sane message for structured replies).
The NBD spec says an attempt to NBD_CMD_WRITE_ZEROES out of
bounds should fail with ENOSPC, not EINVAL.
Our check for u64 offset + u32 length wraparound up front is
pointless; nothing uses offset until after the second round
of sanity checks, and we can just as easily ensure there is
no wraparound by checking whether offset is in bounds (since
a disk size cannot exceed off_t which is 63 bits, adding a
32-bit number for a valid offset can't overflow). Bonus:
dropping the up-front check lets us keep the connection alive
after NBD_CMD_WRITE, whereas before we would drop the
connection (of course, any client sending a packet that would
trigger the failure is already buggy, so it's also okay to
drop the connection, but better quality-of-implementation
never hurts).
Solve all of these issues by some code motion and improved
request validation.
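A self-contained sketch of the overflow-free bounds check (the server wraps this in fuller request validation): since an export size fits in 63 bits, adding a 32-bit length cannot wrap once the offset itself is known to be within the export:

    #include <stdbool.h>
    #include <stdint.h>

    static bool nbd_request_in_bounds(uint64_t export_size,
                                      uint64_t offset, uint32_t len)
    {
        return offset <= export_size && len <= export_size - offset;
    }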
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171115213557.3548-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
The NBD spec says that a server may fail any transmission request
with ESHUTDOWN when it is apparent that no further request from
the client can be successfully honored. The client is supposed
to then initiate a soft shutdown (wait for all remaining in-flight
requests to be answered, then send NBD_CMD_DISC). However, since
qemu's server never uses ESHUTDOWN errors, this code was mostly
untested since its introduction in commit b6f5d3b5.
More recently, I learned that nbdkit as the NBD server is able to
send ESHUTDOWN errors, so I finally tested this code, and noticed
that our client was special-casing ESHUTDOWN to cause a hard
shutdown (immediate disconnect, with no NBD_CMD_DISC), but only
if the server sends this error as a simple reply. Further
investigation found that commit d2febedb introduced a regression
where structured replies behave differently than simple replies -
but that the structured reply behavior is more in line with the
spec (even if we still lack code in nbd-client.c to properly quit
sending further requests). So this patch reverts the portion of
b6f5d3b5 that introduced an improper hard-disconnect special-case
at the lower level, and leaves the future enhancement of a nicer
soft-disconnect at the higher level for another day.
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171113194857.13933-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
When using error_prepend(), it is necessary to end with a space
in the format string; otherwise, messages come out incorrectly,
such as when connecting to a socket that hangs up immediately:
can't open device nbd://localhost:10809/: Failed to read dataUnexpected end-of-file before all bytes were read
Originally botched in commit e44ed99d, then several more instances
added in the meantime.
Pre-existing and not fixed here: we are inconsistent on capitalization;
some of our messages start with lower case, and others start with upper,
although the use of error_prepend() is much nicer to read when all
fragments consistently start with lower.
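The pattern the fix applies, sketched with an illustrative message rather than a specific call site from the patch:

    /* End the prepended fragment with a separator so the two messages do
     * not run together, e.g. "Failed to read data: Unexpected end-of-file
     * before all bytes were read". */
    error_prepend(errp, "Failed to read data: ");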
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171113152424.25381-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
If a server fails a read, for example with EIO, but the connection
is still live, then we would crash trying to print a non-existent
error message in nbd_client_co_preadv(). For consistency, also
change the error printout in nbd_read_reply_entry(), although that
instance does not crash. Bug introduced in commit f140e300.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171112013936.5942-1-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
After committing the qcow2 image contents into the base image, qemu-img
will call bdrv_make_empty to drop the payload in the layered image.
When this is done for qcow2 images, it blows away the LUKS encryption
header, making the resulting image unusable. There are two codepaths
for emptying a qcow2 image, and the second (slower) codepath leaves
the LUKS header intact, so force use of that codepath.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_set_read_only() is used by some block drivers to override the
read-only option given by the user. This is not how read-only images
generally work in QEMU: Instead of second guessing what the user really
meant (which currently includes making an image read-only even if the
user didn't merely rely on the default but explicitly said read-only=off), we
should error out if we can't provide what the user requested.
This adds deprecation warnings to all callers of bdrv_set_read_only() so
that the behaviour can be corrected after the usual deprecation period.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Currently, if trying to change encryption parameters on a qcow2 image, qemu-img
will abort. We already explicitly check for an attempt to change encrypt.format
but missed other parameters like encrypt.key-secret. Rather than list each
parameter, just blacklist changing of all parameters with an 'encrypt.' prefix.
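A hedged sketch of the check (the qemu-img amend code differs in detail; 'desc' stands for the option descriptor being inspected and the snippet relies on GLib's g_str_has_prefix()):

    /* Refuse to change any option under the "encrypt." prefix. */
    if (g_str_has_prefix(desc->name, "encrypt.")) {
        error_setg(errp, "Changing the encryption parameters is not supported");
        return -ENOTSUP;
    }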
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This avoids that random UI frontend error messages end up in the output.
In particular, we were seeing this line in CI error logs:
+Unable to init server: Could not connect: Connection refused
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Kashyap Chamarthy <kchamart@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
replication_child_perm() requests write
permissions for all children, which will lead bdrv_check_perm() to fail.
replication_child_perm() should request write
permissions only if the node is writable itself.
Signed-off-by: Wang Guang <wang.guang55@zte.com.cn>
Signed-off-by: Wang Yong <wang.yong155@zte.com.cn>
Reviewed-by: Xie Changlong <xiechanglong@cmss.chinamobile.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
sdl2 fixes for 2.11
# gpg: Signature made Fri 17 Nov 2017 10:06:27 GMT
# gpg: using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138
* remotes/kraxel/tags/ui-20171117-pull-request:
sdl2: Fix broken display updating after the window is hidden
sdl2: Do not leave grab when fullscreen
sdl2: Fix dead keyboard after fullsceen
sdl2: Use the same pointer show/hide logic for absolute and relative mode
sdl2: Do not quit the emulator when an auxilliary window is closed
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
pc, pci, virtio: fixes for rc1
A bunch of fixes all over the place.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Thu 16 Nov 2017 16:37:21 GMT
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
tests/bios-tables-test: Fix endianess problems when passing data to iasl
build-sys: restrict vmcoreinfo to fw_cfg+dma capable targets
vmcoreinfo: put it in the 'misc' device category
NUMA: Enable adding NUMA node implicitly
tests/acpi-test-data: update _CRS in DSDT
hw/pcie-pci-bridge: restrict to X86 and ARM
hw/pci-host: Fix x86 Host Bridges 64bit PCI hole
pci: Initialize pci_dev->name before use
fix: unrealize virtio device if we fail to hotplug it
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The bios-tables-test was writing out files that we pass to iasl in
with the wrong endianness in the header when running on a big endian
host. So instead of storing mixed endian information in our structures,
let's keep everything in little endian and byte-swap it only when we
need a value in the code.
Reported-by: Daniel P. Berrange <berrange@redhat.com>
Buglink: https://bugs.launchpad.net/qemu/+bug/1724570
Suggested-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Tested-by: "Daniel P. Berrange" <berrange@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vmcoreinfo is built for all targets. However, it requires fw_cfg with
DMA operations support (write operation). Restrict vmcoreinfo exposure
to architectures that support FW_CFG_DMA, that is arm-virt and
x86 only at the moment.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Tested-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Linux and Windows need an ACPI SRAT table to make memory hotplug work properly,
however currently QEMU doesn't create an SRAT table if NUMA options aren't
present on the CLI, which breaks both Linux and Windows guests in certain
conditions:
* Windows: won't enable memory hotplug without an SRAT table at all
* Linux: if QEMU is started with all initial memory below 4Gb and no SRAT table
present, the guest kernel will use nommu DMA ops, which breaks 32bit hw drivers
when memory is hotplugged and the guest tries to use it with those drivers.
Fix the above issues by automatically creating a NUMA node when QEMU is started
with memory hotplug enabled but without '-numa' options on the CLI.
(PS: auto-create the NUMA node only for new machine types so as not to break
migration.)
This provides an SRAT table to guests without explicit -numa options on the CLI
and allows:
* Windows: to enable memory hotplug
* Linux: to switch to SWIOTLB DMA ops, bouncing DMA transfers to 32bit allocated
buffers that legacy drivers/hw can handle.
[Rewritten by Igor]
Reported-by: Thadeu Lima de Souza Cascardo <cascardo@canonical.com>
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Cc: Thomas Huth <thuth@redhat.com>
Cc: Alistair Francis <alistair23@gmail.com>
Cc: Takao Indoh <indou.takao@jp.fujitsu.com>
Cc: Izumi Taku <izumi.taku@jp.fujitsu.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
commit dadf988e81b15065ac1d6dbaf4b87b5b80c7b670
hw/pci-host: Fix x86 Host Bridges 64bit PCI hole
Added a 64 bit hole to _CRS of PCI0.
Update the expected files accordingly.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently there is no MMIO range over 4G
reserved for PCI hotplug. Since the 32bit PCI hole
depends on the number of cold-plugged PCI devices
and other factors, it is very possible that it is too small
to hotplug PCI devices with large BARs.
Fix it by reserving 2G for the i440FX chipset
in order to comply with older Win32 Guest OSes
and 32G for Q35 chipset.
Even if the new defaults of pci-hole64-size will appear in
"info qtree" also for older machines, the property was
not implemented so no changes will be visible to guests.
Note this is a regression, since previous QEMU versions had
some range reserved for 64bit PCI hotplug.
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This moves pci_dev->name initialization earlier so
pci_dev->bus_master_as could get a name instead of an empty string.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If we fail to hotplug a virtio-blk device and then suspend or shut down
the VM, qemu is likely to crash.
Reproduction steps:
1. Run VM named vm001
2. Create a virtio-blk.xml which contains wrong configurations:
<disk device="lun" rawio="yes" type="block">
<driver cache="none" io="native" name="qemu" type="raw" />
<source dev="/dev/mapper/11-dm" />
<target bus="virtio" dev="vdx" />
</disk>
3. Run command : virsh attach-device vm001 virtio-blk.xml
error: Failed to attach device from blk-scsi.xml
error: internal error: unable to execute QEMU command 'device_add': Please set scsi=off for virtio-blk devices in order to use virtio 1.0
This means that hotplugging the virtio-blk device failed.
4. Suspending or shutting down the VM then leads to a qemu crash.
The problem happens in virtio_vmstate_change, which is called by
vm_state_notify:
vdev's parent_bus is NULL, so qdev_get_parent_bus(DEVICE(vdev)) will crash.
virtio_vmstate_change is added to the list vm_change_state_head at
virtio_blk_device_realize (virtio_init), but after the virtio-blk hotplug
fails, virtio_vmstate_change is not removed from vm_change_state_head.
Adding an unrealize function for the virtio-blk device solves this problem.
Signed-off-by: linzhecheng <linzhecheng@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
tg->any_timer_armed[] must be cleared when detaching pending timers from
the AioContext. Failure to do so leads to hung I/O because it looks
like there are still timers pending when in fact they have been removed.
Other ThrottleGroupMembers might have requests pending too so it's
necessary to schedule the next TGM so it can set a timer.
This patch fixes hung I/O when QEMU is launched with drives that are in
the same throttling group:
(guest)$ dd if=/dev/zero of=/dev/vdb oflag=direct bs=512 &
(guest)$ dd if=/dev/zero of=/dev/vdc oflag=direct bs=512 &
(qemu) stop
(qemu) cont
...I/O is stuck...
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20171116112150.27607-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Merge tpm 2017/11/15 v1
# gpg: Signature made Wed 15 Nov 2017 11:51:47 GMT
# gpg: using RSA key 0x75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <stefanb@linux.vnet.ibm.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2017-11-15-1:
tpm_tis: Return 0 for every register in case of failure mode
tpm_tis: Return TPM_VERSION_UNSPEC in case of BE failure
tpm-emulator: protect concurrent ctrl_chr access
specs: Extend TPM spec with TPM emulator description
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This fixes a crash caused by picking the wrong memory region in
address_space_lookup_region seen with client code accessing a device
model that uses alias memory regions. The expensive part of
address_space_lookup_region is phys_page_find anyway; performance-wise
it is okay to repeat the subsequent subpage lookup.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Message-Id: <20171114225941.072707456B5@zero.eik.bme.hu>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Rather than returning ~0, return 0 for every register in case of failure
mode. The '0' better indicates that there's no device there, and it avoids
SeaBIOS detecting a device and getting stuck on it trying to read and write
its registers.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
In case the backend has a failure, such as the tpm_emulator's CMD_INIT
failing, the TIS goes into failure mode and does not respond to reads
or writes to MMIO registers. In this case we need to prevent the ACPI
table from being added and the straight-forward way is to indicate that
there's no known TPM version being used.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The control chardev is used from the data thread to set the
locality of the next request. Although the chr has a write mutex, we
may potentially read the reply from another thread's request.
Add a mutex to protect against concurrent control commands.
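A sketch of the serialization described above; the mutex field and the control-command helper are assumptions, not the actual code:

    /* Serialize control-channel transactions so one thread cannot read the
     * reply belonging to another thread's request. */
    qemu_mutex_lock(&tpm_emu->ctrl_mutex);
    ret = tpm_emulator_ctrlcmd(tpm_emu, cmd, &msg, msg_len_in, msg_len_out);
    qemu_mutex_unlock(&tpm_emu->ctrl_mutex);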
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Following the recent extension of QEMU with a TPM emulator device,
update the specs describing how to interact with the device.
The results of commands run inside a Linux VM are expected to be
similar to those when the TPM passthrough device is used, so we
just reuse that.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
We use raw memory primitives along the !parallel_cpus paths in order to
simplify the endianness handling. Because of that, we did not benefit
from the generic changes to cpu_ldst_user_only_template.h.
The simplest fix is to manipulate helper_retaddr here.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When we handle a signal from a fault within a user-only memory helper,
we cannot cpu_restore_state with the PC found within the signal frame.
Use a TLS variable, helper_retaddr, to record the unwind start point
to find the faulting guest insn.
Tested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Block patches for 2.11.0-rc1
# gpg: Signature made Tue 14 Nov 2017 17:22:17 GMT
# gpg: using RSA key 0xF407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* remotes/maxreitz/tags/pull-block-2017-11-14:
qemu-iotests: update unsupported image formats in 194
block/parallels: add migration blocker
block/parallels: Do not update header or truncate image when INMIGRATE
block/vhdx.c: Don't blindly update the header
iotests: 077: Filter out 'resume' lines
block/snapshot: dirty all dirty bitmaps on snapshot-switch
qcow2: Check that corrupted images can be repaired in iotest 060
iotests: Use new-style NBD connections
iotests: Make 136 less flaky
iotests: Make 083 less flaky
iotests: Make 055 less flaky
iotests: Add missing 'blkdebug::' in 040
iotests: Make 030 less flaky
qcow2: Assert that the crypto header does not overlap other metadata
qcow2: Add iotest for an empty refcount table
qcow2: Add iotest for an image with header.refcount_table_offset == 0
qcow2: Don't open images with header.refcount_table_clusters == 0
qcow2: Prevent allocating compressed clusters at offset 0
qcow2: Prevent allocating L2 tables at offset 0
qcow2: Prevent allocating refcount blocks at offset 0
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The VHDX specification requires that before user data modification of
the vhdx image, the VHDX header file and data GUIDs need to be updated.
In vhdx_open(), if the image is set to RDWR, we go ahead and update the
header.
However, just because the image is set to RDWR does not mean we can go
ahead and write at this point - specifically, if the QEMU run state is
INMIGRATE, the underlying file BS may be set to inactive via the BDS
open flag of BDRV_O_INACTIVE. Attempting to write under this condition
will cause an assert in bdrv_co_pwritev().
We can alternatively latch the first time the image is written. And lo
and behold, we do just that, via vhdx_user_visible_write() in
vhdx_co_writev(). This means the call to vhdx_update_headers() in
vhdx_open() is likely just vestigial, and can be removed.
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Message-id: 659e4cdba6ef4c651737852777c8c93d27b38040.1510059970.git.jcody@redhat.com
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Denis V. Lunev <den@openvz.org>
Signed-off-by: Max Reitz <mreitz@redhat.com>
In the "Overlapping multiple requests" cases, the 3rd reqs (the break
point B) doesn't wait for the 2nd, and once resumed the I/O will just
continue. This is because the 2nd is already waiting for the 1st, and
in wait_serialising_requests() there is:
    /* If the request is already (indirectly) waiting for us, or
     * will wait for us as soon as it wakes up, then just go on
     * (instead of producing a deadlock in the former case). */
    if (!req->waiting_for) {
        /* actually break */
        ...
    }
Consequently, the following "sleep 100; resume A" command races with the
completion of that request, and sometimes results in an unexpected
order of output:
> @@ -56,9 +56,9 @@
> wrote XXX/XXX bytes at offset XXX
> XXX bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> blkdebug: Resuming request 'B'
> +blkdebug: Resuming request 'A'
> wrote XXX/XXX bytes at offset XXX
> XXX bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> -blkdebug: Resuming request 'A'
> wrote XXX/XXX bytes at offset XXX
> XXX bytes, X ops; XX:XX:XX.X (XXX YYY/sec and XXX ops/sec)
> wrote XXX/XXX bytes at offset XXX
Filter out the "Resuming request" lines to make the output
deterministic.
Reported-by: Patchew <no-reply@patchew.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 20171113150026.4743-1-famz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Old-style NBD is deprecated upstream (it is documented, but no
longer implemented in the reference implementation), and it is
severely limited (it cannot support structured replies, which
means it cannot support efficient handling of zeroes), when
compared to new-style NBD. We are better off having our iotests
favor new-style everywhere (although some explicit tests,
particularly 83, still cover old-style for back-compat reasons);
this is as simple as supplying the empty string as the default
export name, as it does not change the URI needed to connect a
client to the server. This also gives us more coverage of the
just-added structured reply code, when not overriding $QEMU_NBD
to intentionally point to an older server.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-id: 20171109221216.10248-1-eblake@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
136 executes some AIO requests without a final aio_flush; then it
advances the virtual clock and thus expects the last access time of the
device to be less than the current time when queried (i.e. idle_time_ns
to be greater than 0). However, without the aio_flush, some requests
may be settled after the clock_step invocation. In that case,
idle_time_ns would be 0 and the test fails.
Fix this by adding a final aio_flush if any AIO request other than an
aio_flush has been executed.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20171109203025.27493-6-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
083 has (at least) two issues:
1. By launching the nbd-fault-injector in background, it may not be
scheduled until the first grep on its output file is executed.
However, until then, that file may not have been created yet -- so it
either does not exist yet (thus making the grep emit an error), or it
does exist but contains stale data (thus making the rest of the test
case connect to a wrong address).
Fix this by explicitly overwriting the output file before executing
nbd-fault-injector.
2. The nbd-fault-injector prints things other than "Listening on...".
It also prints a "Closing connection" message from time to time. We
currently invoke sed on the whole file in the hope of it only
containing the "Listening on..." line yet. That hope is sometimes
shattered by the brutal reality of race conditions, so make the sed
script more robust.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20171109203025.27493-5-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
First of all, test 055 does a valiant job of invoking pause_drive()
sometimes, but that is worth nothing without blkdebug. So the first
thing to do is to sprinkle a couple of "blkdebug::" in there -- with the
exception of the transaction tests, because the blkdebug break points
make the transaction QMP command hang (which is bad). In that case, we
can get away with throttling the block job so that it is effectively
paused.
Then, 055 usually does not pause the drive before starting a block job
that should be cancelled. This means that the backup job might be
completed already before block-job-cancel is invoked; thus making the
test either fail (currently) or moot if cancel_and_wait() ignored this
condition. Fix this by pausing the drive before starting the job.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20171109203025.27493-4-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
This patch fixes two race conditions in 030:
1. The first is in TestENOSPC.test_enospc(). After resuming the job,
querying it to confirm it is no longer paused may fail because in the
meantime it might have completed already. The same was fixed in
TestEIO.test_ignore() already (in commit
2c3b44da07).
2. The second is in TestSetSpeed.test_set_speed_invalid(): Here, a
stream job is started on a drive without any break points, with a
block-job-set-speed invoked subsequently. However, without any break
points, the job might have completed in the meantime (on tmpfs at
least); or it might complete before cancel_and_wait() which expects
the job to still exist. This can be fixed like everywhere else by
pausing the drive (installing break points) before starting the job
and letting cancel_and_wait() resume it.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20171109203025.27493-2-mreitz@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
The crypto header is initialized only when QEMU is creating a new
image, so there's no chance of this happening on a corrupted image.
If QEMU is really trying to allocate the header overlapping other
existing metadata sections then this is a serious bug in QEMU itself
so let's add an assertion.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: ae3d77f312fc0c5e0ac2bbd71676c0112eebe2e5.1509718618.git.berto@igalia.com
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
qcow2_do_open() is checking that header.refcount_table_clusters is not
too large, but it doesn't check that it's greater than zero. Apart
from the fact that an image like that is obviously corrupted, trying
to use it crashes QEMU since we end up with a null s->refcount_table
after qcow2_refcount_init().
These images can however be repaired, so allow opening them if the
BDRV_O_CHECK flag is set; see the sketch below.
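A sketch of the check this implies in qcow2_do_open(); the exact error
message and placement are assumptions:
    /* sketch: a zero-sized refcount table is only acceptable when we
     * were opened for checking/repair */
    if (header.refcount_table_clusters == 0 && !(flags & BDRV_O_CHECK)) {
        error_setg(errp, "Image does not contain a reference count table");
        ret = -EINVAL;
        goto fail;
    }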
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: f9750f50c80359babba11062e88f5075a47e8e16.1509718618.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
If the refcount data is corrupted then we can end up trying to
allocate a new compressed cluster at offset 0 in the image, triggering
an assertion in qcow2_alloc_bytes() that would crash QEMU:
qcow2_alloc_bytes: Assertion `offset' failed.
This patch adds an explicit check for this scenario and a new test
case.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: fb53467cf48e95ff3330def1cf1003a5b862b7d9.1509718618.git.berto@igalia.com
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
If the refcount data is corrupted then we can end up trying to
allocate a new L2 table at offset 0 in the image, triggering an
assertion in the qcow2 cache that would crash QEMU:
qcow2_cache_entry_mark_dirty: Assertion `c->entries[i].offset != 0' failed
This patch adds an explicit check for this scenario and a new test
case.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 92dac37191ae7844a2da22c122204eb493cc3133.1509718618.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Each entry in the qcow2 cache contains an offset field indicating the
location of the data in the qcow2 image. If the offset is 0 then it
means that the entry contains no data and is available to be used when
needed.
Because of that it is not possible to store in the cache the first
cluster of the qcow2 image (offset = 0). This is not a problem because
that cluster always contains the qcow2 header and we're not using this
cache for that.
However, if the qcow2 image is corrupted it can happen that we try to
allocate a new refcount block at offset 0, triggering this assertion
and crashing QEMU:
qcow2_cache_entry_mark_dirty: Assertion `c->entries[i].offset != 0' failed
This patch adds an explicit check for this scenario and a new test
case.
This problem was originally reported here:
https://bugs.launchpad.net/qemu/+bug/1728615
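The same guard pattern applies to all three allocation paths (refcount
blocks, L2 tables, compressed clusters); a sketch for the refcount-block
case, with the message text and return path as assumptions:
    /* sketch: a corrupted refcount table must never hand out cluster 0,
     * which holds the qcow2 header and cannot live in this cache */
    if (new_block == 0) {
        qcow2_signal_corruption(bs, true, -1, -1, "Preventing invalid "
                                "allocation of refcount block at offset 0");
        return -EIO;
    }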
Reported-by: R.Nageswara Sastry <nasastry@in.ibm.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 92a2fadd10d58b423f269c1d1a309af161cdc73f.1509718618.git.berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Pull request
The following disk I/O throttling fixes solve recent bugs.
# gpg: Signature made Tue 14 Nov 2017 10:37:12 GMT
# gpg: using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/block-pull-request:
qemu-iotests: Test I/O limits with removable media
block: Leave valid throttle timers when removing a BDS from a backend
block: Check for inserted BlockDriverState in blk_io_limits_disable()
throttle-groups: drain before detaching ThrottleState
block: all I/O should be completed before removing throttle timers.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Update our pre-release seabios snapshot to the final release.
git shortlog
============
Gerd Hoffmann (1):
sercon: Disable ScreenAndDebug in case both serial console and serial debug are active
Kevin O'Connor (2):
timer: Avoid integer overflows in usec and nsec calculations
docs: Note v1.11.0 release
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
# gpg: Signature made Tue 14 Nov 2017 02:05:34 GMT
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
net/socket: fix coverity issue
Add new PCI ID for i82559a
Fix eepro100 simple transmission mode
colo: Consolidate the duplicate code chunk into a routine
colo-compare: Fix comments
colo-compare: compare the packet in a specified Connection
colo-compare: Insert packet into the suitable position of packet queue directly
net: fix check for number of parameters to -netdev socket
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch ensures that icount_decr.u32.high is clear before calling
cpu_exec_nocache when an exception is pending, because the exception is
caused by the first instruction in the block and that instruction cannot
be executed without resetting the flag.
There are two parts in the fix. First, clear icount_decr.u32.high in
cpu_handle_interrupt (just before processing the "dependent" request,
stored in cpu->interrupt_request or cpu->exit_request) rather than
cpu_loop_exec_tb; this ensures that cpu_handle_exception is always
reached with zero icount_decr.u32.high unless another interrupt has
happened in the meanwhile.
Second, try to cause the exception at the beginning of
cpu_handle_exception, and exit immediately if the TB cannot
execute. With this change, interrupts are processed and
cpu_exec_nocache can make progress.
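A sketch of the first part, in cpu_handle_interrupt(); the field below is
the high 16-bit half referred to above as icount_decr.u32.high, and the
exact placement is approximate:
    /* sketch: reset the exit-request flag before handling the pending
     * cpu->interrupt_request / cpu->exit_request, so the next TB (or a
     * TB produced by cpu_exec_nocache) is allowed to run */
    atomic_set(&cpu->icount_decr.u16.high, 0);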
Signed-off-by: Maria Klimushenkova <maria.klimushenkova@ispras.ru>
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20171114081818.27640.33165.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch adds a condition before overwriting the exception_index field.
It is needed when exception_index is already set to some meaningful value.
Signed-off-by: Pavel Dovgalyuk <pavel.dovgaluk@ispras.ru>
Message-Id: <20171114081812.27640.26372.stgit@pasha-VirtualBox>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 5c0919d0 [1] introduced the virtqueue_size parameter for the
common virtio-scsi path without updating the vhost-user-scsi code, so
vhost-user-scsi devices currently report size 0 for each vq.
This patch introduces the virtqueue_size parameter to vhost-user-scsi,
so that it can now be set by the user. Most importantly, it now has a
default value of 128 (the same as QEMU's virtio-scsi).
[1] 5c0919d0 ("virtio-scsi: Add virtqueue_size parameter
allowing virtqueue size to be set.")
Change-Id: I70e87eab702ebf1196c028dbf17d54fdc0c89a14
Signed-off-by: Dariusz Stojaczyk <dariuszx.stojaczyk@intel.com>
Message-Id: <1510676916-76409-1-git-send-email-dariuszx.stojaczyk@intel.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Using obscure black magic introduced in eaa2ddbb76 :)
In an out-of-tree directory, running "../configure && make help" will generate
some required files (.mak), then clone some submodules, compile at least
the capstone submodule, generate QMP and Trace files, and finally display
the help.
On an outdated computer (Sun Blade workstation), running "make help" took
more than 5h :) With this patch it took roughly 37min.
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20171108032052.20029-1-f4bug@amsat.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
target-arm queue:
* translate-a64.c: silence gcc5 warning
* highbank: validate register offset before access
* MAINTAINERS: Add entries for Smartfusion2
* accel/tcg/translate-all: expand cpu_restore_state addr check
(so usermode insn aborts don't crash with an assertion failure)
* fix TCG initialization of some Arm boards by allowing them
to specify min/default number of CPUs to create
# gpg: Signature made Mon 13 Nov 2017 14:11:09 GMT
# gpg: using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20171113:
accel/tcg/translate-all: expand cpu_restore_state addr check
hw: add .min_cpus and .default_cpus fields to machine_class
xlnx-zcu102: Specify the max number of CPUs for the EP108
xlnx-zcu102: Add an info message deprecating the EP108
xlnx-zynqmp: Properly support the smp command line option
qom: move CPUClass.tcg_initialize to a global
MAINTAINERS: Add entries for Smartfusion2
highbank: validate register offset before access
arm/translate-a64: mark path as unreachable to eliminate warning
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When using the emulated XICS, the 'info pic' monitor command shows:
CPU 0 XIRR=ff000000 ((nil)) PP=ff MFRR=ff
ICS 1000..13ff 0x10040060340
1000 MSI 05 00
1001 MSI 05 00
1002 MSI 05 00
1003 MSI ff 00
1004 LSI ff 00
1005 LSI ff 00
1006 LSI ff 00
1007 LSI ff 00
1008 MSI 05 00
1009 MSI 05 00
100a MSI 05 00
100b MSI 05 00
100c MSI 05 00
but when using the in-kernel XICS with the very same guest, we get:
CPU 0 XIRR=00000000 ((nil)) PP=ff MFRR=ff
ICS 1000..13ff 0x10032e00340
1000 MSI ff 00
1001 MSI ff 00
1002 MSI ff 00
1003 MSI ff 00
1004 LSI ff 00
1005 LSI ff 00
1006 LSI ff 00
1007 LSI ff 00
1008 MSI ff 00
1009 MSI ff 00
100a MSI ff 00
100b MSI ff 00
100c MSI ff 00
ie, all irqs are masked and XIRR is null, while we should get the
same output as with the emulated XICS.
If the guest is then migrated, 'info pic' shows the expected values
on both source and destination.
The problem is that QEMU doesn't synchronize with KVM before printing
the XICS state. Migration happens to fix the output because it enforces
synchronization with KVM.
To fix the invalid output of 'info pic', this patch introduces a new
synchronize_state operation for both ICPStateClass and ICSStateClass.
The ICP operation relies on run_on_cpu() in order to kick the vCPU
and avoid sleeping on KVM_GET_ONE_REG.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
KVM HV will soon support running a guest in hash mode on a POWER9 host
running in radix mode (see [1]), however the guest currently fails to
boot.
This is because the "htab_shift" value (the size of the MMU's hash
table) is added to the device tree before KVM has had a chance to
change it. If the host is in hash mode, KVM does not need to change it
and so the problem is not seen, but when the host is in radix mode a
change is required and we see a problem.
To fix this, move the call spapr_setup_hpt_and_vrma() (where
htab_shift could be changed) up a little so that it's called before
spapr_h_cas_compose_response() (where htab_shift is added to the
device tree).
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[1] See http://www.spinics.net/lists/kvm-ppc/msg13057.html
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This test hotplugs a CD drive to a VM and checks that I/O limits can
be set only when the drive has media inserted and that they are kept
when the media is replaced.
This also tests the removal of a device with valid I/O limits set but
no media inserted. This involves deleting and disabling the limits
of a BlockBackend without BlockDriverState, a scenario that has been
crashing until the fixes from the last couple of patches.
[Python PEP8 fixup: "Don't use spaces are the = sign when used to
indicate a keyword argument or a default parameter value"
--Stefan]
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 071eb397118ed207c5a7f01d58766e415ee18d6a.1510339534.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If a BlockBackend has I/O limits set then its ThrottleGroupMember
structure uses the AioContext from its attached BlockDriverState.
Those two contexts must be kept in sync manually. This is not
ideal and will be fixed in the future by removing the throttling
configuration from the BlockBackend and storing it in an implicit
filter node instead, but for now we have to live with this.
When you remove the BlockDriverState from the backend then the
throttle timers are destroyed. If a new BlockDriverState is later
inserted then they are created again using the new AioContext.
There are a couple of problems with this:
a) The code manipulates the timers directly, leaving the
ThrottleGroupMember.aio_context field in an inconsistent state.
b) If you remove the I/O limits (e.g by destroying the backend)
when the timers are gone then throttle_group_unregister_tgm()
will attempt to destroy them again, crashing QEMU.
While b) could be fixed easily by allowing the timers to be freed
twice, this would result in a situation in which we can no longer
guarantee that a valid ThrottleState has a valid AioContext and
timers.
This patch ensures that the timers and AioContext are always valid
when I/O limits are set, regardless of whether the BlockBackend has a
BlockDriverState inserted or not.
[Fixed "There'a" typo as suggested by Max Reitz <mreitz@redhat.com>
--Stefan]
Reported-by: sochin jiang <sochin.jiang@huawei.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: e089c66e7c20289b046d782cea4373b765c5bc1d.1510339534.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When you set I/O limits using block_set_io_throttle or the command
line throttling.* options they are kept in the BlockBackend regardless
of whether a BlockDriverState is attached to the backend or not.
Therefore when removing the limits using blk_io_limits_disable() we
need to check if there's a BDS before attempting to drain it, else it
will crash QEMU. This can be reproduced very easily using HMP:
(qemu) drive_add 0 if=none,throttling.iops-total=5000
(qemu) drive_del none0
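A sketch of the guard this describes; the body of the function is
abbreviated and the throttling teardown step is only hinted at:
    void blk_io_limits_disable(BlockBackend *blk)
    {
        BlockDriverState *bs = blk_bs(blk);   /* may be NULL: no medium */

        if (bs) {
            bdrv_drained_begin(bs);           /* only drain what exists */
        }
        /* ... drop the throttling configuration from the BlockBackend ... */
        if (bs) {
            bdrv_drained_end(bs);
        }
    }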
Reported-by: sochin jiang <sochin.jiang@huawei.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Message-id: 0d3a67ce8d948bb33e08672564714dcfb76a3d8c.1510339534.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
I/O requests hang after stop/cont commands at least since QEMU 2.10.0
with -drive iops=100:
(guest)$ dd if=/dev/zero of=/dev/vdb oflag=direct count=1000
(qemu) stop
(qemu) cont
...I/O is stuck...
This happens because blk_set_aio_context() detaches the ThrottleState
while requests may still be in flight:
    if (tgm->throttle_state) {
        throttle_group_detach_aio_context(tgm);
        throttle_group_attach_aio_context(tgm, new_context);
    }
This patch encloses the detach/attach calls in a drained region so no
I/O request is left hanging. Also add assertions so we don't make the
same mistake again in the future.
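A sketch of the resulting drained region in blk_set_aio_context() (the
assertions mentioned above are omitted):
    if (tgm->throttle_state) {
        /* bs here stands for the backend's current BDS, i.e. blk_bs(blk) */
        bdrv_drained_begin(bs);   /* flush in-flight throttled requests */
        throttle_group_detach_aio_context(tgm);
        throttle_group_attach_aio_context(tgm, new_context);
        bdrv_drained_end(bs);
    }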
Reported-by: Yongxue Hong <yhong@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 20171110151934.16883-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In blk_remove_bs, all I/O should be completed before removing the throttle
timers. If there is inflight I/O, removing the throttle timers at that
point will cause the inflight I/O to never return.
This patch adds bdrv_drained_begin before throttle_timers_detach_aio_context
to let all I/O complete before removing the throttle timers.
[Moved declaration of bs as suggested by Alberto Garcia
<berto@igalia.com>.
--Stefan]
Signed-off-by: Zhengui <lizhengui@huawei.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Message-id: 1508564040-120700-1-git-send-email-lizhengui@huawei.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
We are still seeing signals during translation time when we walk over
a page protection boundary. This expands the check to ensure the host
PC is inside the code generation buffer. The original suggestion was
to check versus tcg_ctx.code_gen_ptr but as we now segment the
translation buffer we have to settle for just a general check for
being inside.
I've also fixed up the declaration to make it clear it can deal with
invalid addresses. A later patch will fix up the call sites.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reported-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20171108153245.20740-2-alex.bennee@linaro.org
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Tested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
max_cpus needs to be an upper bound on the number of vCPUs
initialized; otherwise TCG region initialization breaks.
Some boards initialize a hard-coded number of vCPUs, which is not
captured by the global max_cpus and therefore breaks TCG initialization.
Fix it by adding the .min_cpus field to machine_class.
This commit also changes some user-facing behaviour: we now die if
-smp is below this hard-coded vCPU minimum instead of silently
ignoring the passed -smp value (sometimes announcing this by printing
a warning). However, the introduction of .default_cpus lessens the
likelihood that users will notice this: if -smp isn't set, we now
assign the value in .default_cpus to both smp_cpus and max_cpus. IOW,
if a user does not set -smp, they always get a correct number of vCPUs.
This change fixes 3468b59 ("tcg: enable multiple TCG contexts in
softmmu", 2017-10-24), which broke TCG initialization for some
ARM boards.
Fixes: 3468b59e18
Reported-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-id: 1510343626-25861-6-git-send-email-cota@braap.org
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The EP108 was an early access development board that is no longer used.
Add an info message to convert any users to the ZCU102 instead. On QEMU
they are both identical.
This patch also updates the qemu-doc.texi file to indicate that the
EP108 has been deprecated.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Message-id: 1510343626-25861-4-git-send-email-cota@braap.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
55c3cee ("qom: Introduce CPUClass.tcg_initialize", 2017-10-24)
introduces a per-CPUClass bool that we check so that the target CPU
is initialized for TCG only once. This works well except when
we end up creating more than one CPUClass, in which case we end
up incorrectly initializing TCG more than once, i.e. once for
each CPUClass.
This can be replicated with:
$ aarch64-softmmu/qemu-system-aarch64 -machine xlnx-zcu102 -smp 6 \
-global driver=xlnx,,zynqmp,property=has_rpu,value=on
In this case the class name of the "RPUs" is prefixed by "cortex-r5-",
whereas the "regular" CPUs are prefixed by "cortex-a53-". This
results in two CPUClass instances being created.
Fix it by introducing a static variable, so that only the first
target CPU being initialized will initialize the target-dependent
part of TCG, regardless of CPUClass instances.
Fixes: 55c3ceef61
Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Tested-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1510343626-25861-2-git-send-email-cota@braap.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Fixes the following warning when compiling with gcc 5.4.0 with -O1
optimizations and --enable-debug:
target/arm/translate-a64.c: In function ‘aarch64_tr_translate_insn’:
target/arm/translate-a64.c:2361:8: error: ‘post_index’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
if (!post_index) {
^
target/arm/translate-a64.c:2307:10: note: ‘post_index’ was declared here
bool post_index;
^
target/arm/translate-a64.c:2386:8: error: ‘writeback’ may be used uninitialized in this function [-Werror=maybe-uninitialized]
if (writeback) {
^
target/arm/translate-a64.c:2308:10: note: ‘writeback’ was declared here
bool writeback;
^
Note that idx comes from selecting 2 bits, and therefore its value
can be at most 3.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Acked-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1510087611-1851-1-git-send-email-cota@braap.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Adds a new PCI ID for the i82559a (0x8086 0x1030) interface. The
"x-use-alt-device-id" property controls whether this new ID is to be
used, and is true by default, and set to false in a compat entry.
Signed-off-by: Mike Nawrocki <michael.nawrocki@gtri.gatech.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The simple transmission mode was treating the area immediately after the
transmit command block (TCB) as if it were a transmit buffer descriptor,
when in reality it is simply the packet data. This change simply copies
the data following the TCB into the packet buffer.
Signed-off-by: Mike Nawrocki <michael.nawrocki@gtri.gatech.edu>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
A packet from pri_indev or sec_indev only belongs to a particular
Connection, so we only need to compare it against the packets in that
Connection's primary_list and secondary_list, rather than walking the
whole Connection list for every comparison, which is time-consuming and
unnecessary.
Fewer checkpoints, more efficiency.
Cc: Zhang Chen <zhangckid@gmail.com>
Cc: Li Zhijian <lizhijian@cn.fujitsu.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Currently, a packet from pri_dev or sec_dev is first pushed at the
tail of the primary or secondary packet queue and then sorted by TCP
sequence number.
Now, this patch uses g_queue_insert_sorted to insert the packet directly
into the suitable position, avoiding re-ordering all packets each time
a new packet comes in, thereby increasing efficiency.
In addition, consolidate the code that adds a packet to the list of a
Connection (primary or secondary) into a separate routine, colo_insert_packet(),
since the same chunk of code is called from two places.
Cc: Zhang Chen <zhangckid@gmail.com>
Cc: Li Zhijian <lizhijian@cn.fujitsu.com>
Cc: Jason Wang <jasowang@redhat.com>
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Signed-off-by: Zhang Chen <zhangckid@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Since commit 0f8c289ad "net: fix -netdev socket,fd= for UDP sockets"
we allow more than one parameter for -netdev socket. But now
we run into an assert when no parameter at all is specified
> qemu-system-x86_64 -netdev socket
socket.c:729: net_init_socket: Assertion `sock->has_udp' failed.
Fix this by reverting the change of the if condition done in 0f8c289ad.
Cc: Jason Wang <jasowang@redhat.com>
Cc: qemu-stable@nongnu.org
Fixes: 0f8c289ad5
Reported-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Merge qcrypto 2017/11/08 v1
# gpg: Signature made Wed 08 Nov 2017 11:06:38 GMT
# gpg: using RSA key 0xBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* remotes/berrange/tags/pull-qcrypto-2017-11-08-1:
crypto: afalg: fix a NULL pointer dereference
tests: Run the luks tests in test-crypto-block only if encryption is available
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ppc patch queue 2017-11-08
Here's the current set of accumulated ppc patches for qemu-2.11.
Since we're now in hard freeze these are all bugfixes (although some
fix a bug by way of a cleanup).
# gpg: Signature made Wed 08 Nov 2017 08:10:38 GMT
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.11-20171108:
e500: ppce500_init_mpic() return device instead of IRQ array
hw/display/sm501: Fix comment in sm501_sysbus_class_init()
ppc: fix setting of compat mode
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
98c63057d2 ('slirp: Factorizing
tcpiphdr structure with an union') introduced a memset call to clear
possibly-undefined fields in ti. This however overwrites src/dst/pr which
are used below.
So let us clear only the unused fields.
This should fix some rare cases (some RST cases, keep alive probes)
where packets would be sent to 0.0.0.0.
Signed-off-by: Tao Wu <lepton@google.com>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
The NBD spec was recently clarified to state that a read of length 0
should not be attempted by a compliant client; but that a server must
still handle it correctly in an unspecified manner (that is, either
a successful no-op or an error reply, but not a crash) [1]. However,
it also implies that NBD_REPLY_TYPE_OFFSET_DATA must have a non-zero
payload length, but our existing code was replying with a chunk
that a picky client could reject as invalid because it was missing
a payload (our own client implementation was recently patched to be
that picky, after first fixing it to not send 0-length requests).
We are already doing successful no-ops for 0-length writes and for
non-structured reads; so for consistency, we want structured reply
reads to also be a no-op. The easiest way to do this is to return
a NBD_REPLY_TYPE_NONE chunk; this is best done via a new helper
function (especially since future patches for other structured
replies may benefit from using the same helper).
[1] https://github.com/NetworkBlockDevice/nbd/commit/ee926037
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171108215703.9295-8-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Ensure that the server is not sending unexpected chunk lengths
for either the NONE or the OFFSET_DATA chunk, nor unexpected
hole length for OFFSET_HOLE. This will flag any server as
broken that responds to a zero-length read with an OFFSET_DATA
(what our server currently does, but that's about to be fixed)
or with OFFSET_HOLE, even though we previously fixed our client
to never be able to send such a request over the wire.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171108215703.9295-7-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
The NBD spec was recently clarified to state that clients should
not send 0-length requests to the server, as the server behavior
is undefined [1]. We know that qemu-nbd's behavior is a successful
no-op (once it has filtered for read-only exports), but other NBD
implementations might return an error. To avoid any questionable
server implementations, it is better to just short-circuit such
requests on the client side (we are relying on the block layer to
already filter out requests such as invalid offset, write to a
read-only volume, and so forth); do the short-circuit as late as
possible to still benefit from protections from assertions that
the block layer is not violating our assumptions.
[1] https://github.com/NetworkBlockDevice/nbd/commit/ee926037
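Schematically, the client-side short-circuit amounts to an early return in
the request path, after the block layer has validated the request (variable
name assumed):
    /* sketch: never put a 0-length request on the wire; the NBD spec
     * leaves the server behaviour for such requests undefined */
    if (bytes == 0) {
        return 0;    /* successful no-op on the client side */
    }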
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171108215703.9295-6-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
A closer read of the NBD spec shows that a structured reply chunk
for a hole is not quite identical to the prefix of a data chunk,
because the hole has to also send a 32-bit size field. Although
we do not yet send holes, we should fix the misleading information
in our header and make it easier for a future patch to support
sparse reads. Messed up in commit bae245d1.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171108215703.9295-5-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
The NBD spec says that clients should not try to write/trim to
an export advertised as read-only by the server. But we failed
to check that, and would allow the block layer to use NBD with
BDRV_O_RDWR even when the server is read-only, which meant we
were depending on the server sending a proper EPERM failure for
various commands, and also exposes a leaky abstraction: using
qemu-io in read-write mode would succeed on 'w -z 0 0' because
of local short-circuiting logic, but 'w 0 0' would send a
request over the wire (where it then depends on the server, and
fails at least for qemu-nbd but might pass for other NBD
implementations).
With this patch, a client MUST request read-only mode to access
a server that is doing a read-only export, or else it will get
a message like:
can't open device nbd://localhost:10809/foo: request for write access conflicts with read-only export
It is no longer possible to even attempt writes over the wire
(including the corner case of 0-length writes), because the block
layer enforces the explicit read-only request; this matches the
behavior of qcow2 when backed by a read-only POSIX file.
Fix several iotests to comply with the new behavior (since
qemu-nbd of an internal snapshot, as well as nbd-server-add over QMP,
default to a read-only export, we must tell blockdev-add/qemu-io to
set up a read-only client).
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171108215703.9295-3-eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
When cross compiling QEMU for Windows we need to specify the cross
version of ranlib to avoid build errors when building capstone. This
patch ensures we use the same cross prefix on ranlib as other toolchain
components. The build failure was seen on:
- Fedora23 mingw
- RHEL-7.2 with mingw packages from epel:
LINK qemu-img.exe
build-win64/capstone/capstone.lib: error adding symbols: Archive has no
index; run ranlib to add one
collect2: error: ld returned 1 exit status
$ x86_64-w64-mingw32-ar --version
GNU ar (GNU Binutils) 2.25
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <e457d4e906dceea4de6c3431813a06b137c1ab9c.1510103351.git.alistair.francis@xilinx.com>
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This feature is present for some targets in the bfd disassembler(s).
Implement it generically for all capstone users.
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
namelen should be used here; length is unrelated, and always 0 at this
point. Broken since its introduction in commit f37708f6, but mostly
harmless (replying with '' as the name does not violate protocol,
and does not confuse qemu as the nbd client since our implementation
does not ask for the name; but might confuse some other client that
does ask for the name especially if the default export is different
than the export name being queried).
Adding an assert makes it obvious that we are not skipping any bytes
in the client's message, as well as making it obvious that we were
using the wrong variable.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
CC: qemu-stable@nongnu.org
Message-Id: <20171101154204.27146-1-vsementsov@virtuozzo.com>
[eblake: improve commit message, squash in assert addition]
Signed-off-by: Eric Blake <eblake@redhat.com>
Commit b7a745d added a qemu_bh_cancel call to the completion function
as an optimization to prevent it from unnecessarily rescheduling itself.
This completion function is scheduled from worker_thread, after setting
the state of a ThreadPoolElement to THREAD_DONE.
This was considered to be safe, as the completion function restarts the
loop just after the call to qemu_bh_cancel. But, as this loop lacks a HW
memory barrier, the read of req->state may actually happen _before_ the
call, seeing it still as THREAD_QUEUED, and ending the completion
function without having processed a pending TPE linked at pool->head:
    worker thread                        |        I/O thread
    -------------------------------------+----------------------------------
                                         | speculatively read req->state
    req->state = THREAD_DONE;            |
    qemu_bh_schedule(p->completion_bh)   |
      bh->scheduled = 1;                 |
                                         | qemu_bh_cancel(p->completion_bh)
                                         |   bh->scheduled = 0;
                                         | if (req->state == THREAD_DONE)
                                         |   // sees THREAD_QUEUED
The source of the misunderstanding was that qemu_bh_cancel is now being
used by the _consumer_ rather than the producer, and therefore now needs
to have acquire semantics just like e.g. aio_bh_poll.
In some situations, if there are no other independent requests in the
same aio context that could eventually trigger the scheduling of the
completion function, the omitted TPE and all operations pending on it
will get stuck forever.
[Added Sergio's updated wording about the HW memory barrier.
--Stefan]
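A sketch of what the fix boils down to in util/async.c; whether the real
patch uses atomic_mb_set or an explicit barrier is an assumption here:
    /* sketch: give the consumer-side cancel full-barrier semantics so a
     * later read of req->state cannot be hoisted above it */
    void qemu_bh_cancel(QEMUBH *bh)
    {
        atomic_mb_set(&bh->scheduled, 0);
    }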
Signed-off-by: Sergio Lopez <slp@redhat.com>
Message-id: 20171108063447.2842-1-slp@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Test-crypto-hash calls qcrypto_hash_bytesv/digest/base64 with
errp=NULL, this will cause a NULL pointer dereference if afalg_driver
doesn't support requested algos:
    ret = qcrypto_hash_afalg_driver.hash_bytesv(alg, iov, niov,
                                                result, resultlen,
                                                errp);
    if (ret == 0) {
        return ret;
    }
    error_free(*errp); // <--- here
Because the error message is thrown away immediately, we should
just pass NULL to hash_bytesv(). The same problem exists in the
afalg-backend cipher & hmac, so let's fix them all together.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reported-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Longpeng <longpeng2@huawei.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The test-crypto-block currently fails if encryption has not been
compiled into QEMU:
TEST: tests/test-crypto-block... (pid=22231)
/crypto/block/qcow: OK
/crypto/block/luks/default:
Unexpected error in qcrypto_pbkdf2() at qemu/crypto/pbkdf-stub.c:41:
FAIL
GTester: last random seed: R02Sbbb5b6f299c6727f41bb50ba4aa6ef5c
(pid=22237)
/crypto/block/luks/aes-256-cbc-plain64:
Unexpected error in qcrypto_pbkdf2() at qemu/crypto/pbkdf-stub.c:41:
FAIL
GTester: last random seed: R02S3e27992a5ab4cc95e141c4ed3c7f0d2e
(pid=22239)
/crypto/block/luks/aes-256-cbc-essiv:
Unexpected error in qcrypto_pbkdf2() at qemu/crypto/pbkdf-stub.c:41:
FAIL
GTester: last random seed: R02S51b52bb02a66c42d8b331fd305384f53
(pid=22241)
FAIL: tests/test-crypto-block
So run the luks test only if the required encryption support is available.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Currently, to enable a pci device in the guest, the user has to issue
echo 1 > /sys/bus/pci/slots/00000000/power. This is not what people
expect. On an LPAR, the user can put a PCI device in configured or
deconfigured state via IOCDS. The "start in deconfigured state" can be
used for "sharing" a pci function across LPARs. This is not what we are
going to use in KVM, so always start configured.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Message-Id: <20171107175455.73793-2-borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
test_multi_co_schedule_entry() set to_schedule[id] in the final loop
iteration before terminating the coroutine. There is a race condition
where the main thread attempts to enter the terminating or terminated
coroutine when signalling coroutines to stop:
atomic_mb_set(&now_stopping, true);
for (i = 0; i < NUM_CONTEXTS; i++) {
ctx_run(i, finish_cb, NULL); <--- enters dead coroutine!
to_schedule[i] = NULL;
}
Make sure only to set to_schedule[id] if this coroutine really needs to
be scheduled!
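A sketch of the fixed loop in test_multi_co_schedule_entry();
schedule_next() and the globals come from the existing test, and the exact
shape of the real fix may differ:
    /* sketch: publish this coroutine in to_schedule[] only while it is
     * still going to yield, i.e. never in the terminating iteration */
    while (!atomic_mb_read(&now_stopping)) {
        int n = g_test_rand_int_range(0, NUM_CONTEXTS);

        schedule_next(n);
        atomic_mb_set(&to_schedule[id], qemu_coroutine_self());
        qemu_coroutine_yield();
        g_assert(to_schedule[id] == NULL);
    }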
Reported-by: "R.Nageswara Sastry" <nasastry@in.ibm.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Message-id: 20171106190233.1175-1-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In Makefiles the $ must be escaped as $$ in shell uses.
Since 8a2390a4f4:
$ make docker
[...]
NETWORK=1 Enable virtual network interface with default backend.
NETWORK=ACKEND Enable virtual network interface with ACKEND.
Once escaped:
$ make docker
[...]
NETWORK=1 Enable virtual network interface with default backend.
NETWORK=$BACKEND Enable virtual network interface with $BACKEND.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Fam Zheng <famz@redhat.com>
Message-Id: <20171108024719.8389-1-f4bug@amsat.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
When a base image locally defined by QEMU, such as in the debian images,
is updated, the dockerfile checksum mechanism in docker.py still skips
updating the derived image, because it only looks at the literal content
of the dockerfile, without considering changes to the base image.
For example we have a recent fix e58c1f9b35 that fixed
debian-win64-cross by updating its base image, debian8-mxe, but due to
above "feature" of docker.py the image in question is automatically NOT
rebuilt unless you add NOCACHE=1. It is noticed on Shippable:
https://app.shippable.com/github/qemu/qemu/runs/541/2/console
because after the fix is merged, the error still occurs, and the log
shows the container image is, as explained above, not updated.
This is because at the time docker.py was written, there wasn't any
dependencies between QEMU's docker images.
Now improve this to preprocess any "FROM qemu:*" directives in the
dockerfiles while doing checksum, and inline the base image's dockerfile
content, recursively. This ensures any changes on the depended _QEMU_
images are taken into account.
This means for external images that we expect to retrieve from docker
registries, we still do it as before. It is not perfect, because
registry images can get updated too. Technically we could substitute the
image name with its hex ID as obtained with $(docker images $IMAGE
--format="{{.Id}}"), but --format is not supported by RHEL 7, so leave
it for now.
Reported-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20171103131229.4737-1-famz@redhat.com>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Actual number of interrupt pins isn't known
in ppce500_init_mpic() so a hardcoded number
was used, which causes a crash with older openpic.
Instead, return the DeviceState* and change ppce500_init()
to call qdev_get_gpio_in() to get only the irq pins
which are needed.
Signed-off-by: Michael Davidsaver <mdavidsaver@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The "cannot_instantiate_with_device_add_yet" flag has been renamed
to "user_creatable" a while ago.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
While trying to make KVM PR usable again, commit 5dfaa532ae introduced a
regression: the current compat_pvr value is passed to KVM instead of the
new one. This means that we always pass 0 instead of the max-cpu-compat
PVR during the initial machine reset. And at CAS time, we either pass
the PVR from the command line or even don't call kvmppc_set_compat() at
all, ie, the PCR will not be set as expected.
For example if we start a big endian fedora26 guest in power7 compat
mode on a POWER8 host, we get this in the guest:
$ cat /proc/cpuinfo
processor : 0
cpu : POWER7 (architected), altivec supported
clock : 4024.000000MHz
revision : 2.0 (pvr 004d 0200)
timebase : 512000000
platform : pSeries
model : IBM pSeries (emulated by qemu)
machine : CHRP IBM pSeries (emulated by qemu)
MMU : Hash
but the guest can still execute POWER8 instructions, and the following
program succeeds:
    int main()
    {
        asm("vncipher 0,0,0"); // ISA 2.07 instruction
    }
Let's pass the new compat_pvr to kvmppc_set_compat(); the program then
fails with SIGILL as expected.
Reported-by: Nageswara R Sastry <rnsastry@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
SPARC is like Alpha in its handling of the rt_sigaction syscall:
it takes an extra parameter 'restorer' which needs to be copied
into the sa_restorer field of the sigaction struct. The order
of the arguments differs slightly between SPARC and Alpha but
the implementation is otherwise the same. (Compare the
rt_sigaction() functions in arch/sparc/kernel/sys_sparc_64.c
and arch/alpha/kernel/signal.c.)
Note that this change is somewhat moot until SPARC acquires
support for actually delivering RT signals.
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
In the user-mode-only version of sparc_cpu_handle_mmu_fault(),
we must save the fault address for a data fault into the CPU
state's mmu registers, because the code in linux-user/main.c
expects to find it there in order to populate the si_addr
field of the guest siginfo.
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
For faults on loads and stores, ppc_cpu_handle_mmu_fault() in
target/ppc/user_only_helper.c stores the offending address
in env->spr[SPR_DAR]. Report this correctly to the guest
in si_addr, rather than incorrectly using the address of the
instruction that caused the fault.
This fixes the test case in
https://bugs.launchpad.net/qemu/+bug/1077116
for ppc, ppc64 and ppc64le.
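Schematically, signal delivery for a data fault now uses the saved DAR
(the siginfo field naming below is simplified, as a sketch only):
    /* sketch: report the faulting data address, not the address of the
     * instruction that caused the fault */
    info.si_addr = env->spr[SPR_DAR];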
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
For s390x, the address passed to a signal handler in the
siginfo_t si_addr field is masked (in the kernel this is done in
do_sigbus() and do_sigsegv() in arch/s390/mm/fault.c). Implement
this architecture-specific oddity in linux-user.
This is one of the issues described in
https://bugs.launchpad.net/qemu/+bug/1705118
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
If an application tries to install a seccomp filter using
prctl(PR_SET_SECCOMP), the filter is likely for the target instead of the host
architecture. This will probably cause qemu to be immediately killed when it
executes another syscall.
Prevent this from happening by returning EINVAL from both seccomp prctl
calls. This is the error returned by the kernel when seccomp support is
disabled.
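A sketch of the prctl handling this describes (constants from
<linux/prctl.h>; placement inside the big syscall switch is approximate):
    case PR_GET_SECCOMP:
    case PR_SET_SECCOMP:
        /* Disable seccomp to prevent the guest installing a filter for
         * the target architecture and killing QEMU's own syscalls;
         * mirror the error of a kernel built without seccomp. */
        return -TARGET_EINVAL;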
Fixes: https://bugs.launchpad.net/qemu/+bug/1726394
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: James Cowgill <james.cowgill@mips.com>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Add the missing defines and for TARGET_MAP_STACK and TARGET_MAP_HUGETLB
for alpha, mips, ppc, x86, hppa. Fix the mmap_flags translation table
to translate MAP_HUGETLB between host and target architecture, and to
drop MAP_STACK.
Signed-off-by: Helge Deller <deller@gmx.de>
Message-Id: <20170311183016.GA20514@ls3530.fritz.box>
[rth: Drop MAP_STACK instead of translating it, since it is ignored
in the kernel anyway. Fix tabs to spaces.]
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
By failing to return from the syscall in the child, the child
issues another clone syscall and hilarity ensues.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Reading and writing to an sa_restorer member that isn't supposed to
exist corrupts user memory. Introduce TARGET_ARCH_HAS_SA_RESTORER,
similar to the kernel's __ARCH_HAS_SA_RESTORER.
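A sketch of the resulting guard in the signal code, shown for one copy
direction only:
    #ifdef TARGET_ARCH_HAS_SA_RESTORER
        /* only targets that really have sa_restorer may touch the field */
        k->sa_restorer = act->sa_restorer;
    #endif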
Reported-by: Helge Deller <deller@gmx.de>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
If we iterate over the full port range without successfully binding+listening
on the socket, we'll try the next address, whereupon we overwrite the slisten
file descriptor variable without closing it.
Rather than having two places where we open + close socket FDs on different
iterations of nested for loops, re-arrange the code to always open+close
within the same loop iteration.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
target-arm queue:
* arm_gicv3_its: Don't abort on table save failure
* arm_gicv3_its: Fix the VM termination in vm_change_state_handler()
* translate.c: Fix usermode big-endian AArch32 LDREXD and STREXD
* hw/arm: Mark the "fsl,imx31/25/6" devices with user_creatable = false
* arm: implement cache/shareability attribute bits for PAR registers
# gpg: Signature made Tue 07 Nov 2017 13:33:58 GMT
# gpg: using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20171107:
hw/intc/arm_gicv3_its: Don't abort on table save failure
hw/intc/arm_gicv3_its: Fix the VM termination in vm_change_state_handler()
translate.c: Fix usermode big-endian AArch32 LDREXD and STREXD
hw/arm: Mark the "fsl,imx31" device with user_creatable = false
hw/arm: Mark the "fsl,imx25" device with user_creatable = false
hw/arm: Mark the "fsl,imx6" device with user_creatable = false
arm: implement cache/shareability attribute bits for PAR registers
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The ITS is not fully properly reset at the moment. Caches are
not emptied.
After a reset, in case we attempt to save the state before
the bound devices have registered their MSIs and after the
1st level table has been allocated by the ITS driver
(device BASER is valid), the first level entries are still
invalid. If the device cache is not empty (devices registered
before the reset), vgic_its_save_device_tables fails with -EINVAL.
This causes a QEMU abort().
Cc: qemu-stable@nongnu.org
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Reported-by: wanghaibin <wanghaibin.wang@huawei.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The commit cddafd8f35 ("hw/intc/arm_gicv3_its: Implement state save
/restore") breaks the backward compatibility with the older kernels
where vITS save/restore support is not available. The vmstate function
vm_change_state_handler() should not be registered if the running kernel
doesn't support ITS save/restore feature. Otherwise VM instance will be
killed whenever vmstate callback function is invoked.
A virtual machine shutdown (instead of a reboot) was observed with
QEMU-2.10+linux-4.11 when testing the reboot command
"virsh reboot <domain> --mode acpi".
KVM Error: 'KVM_SET_DEVICE_ATTR failed: Group 4 attr 0x00000000000001'
Signed-off-by: Shanker Donthineni <shankerd@codeaurora.org>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1509712671-16299-1-git-send-email-shankerd@codeaurora.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For AArch32 LDREXD and STREXD, architecturally the 32-bit word at the
lowest address is always Rt and the one at addr+4 is Rt2, even if the
CPU is big-endian. Our implementation does these with a single
64-bit store, so if we're big-endian then we need to put the two
32-bit halves together in the opposite order to little-endian,
so that they end up in the right places. We were trying to do
this with the gen_aa32_frob64() function, but that is not correct
for the usermode emulator, because there there is a distinction
between "load a 64 bit value" (which does a BE 64-bit access
and doesn't need swapping) and "load two 32 bit values as one
64 bit access" (where we still need to do the swapping, like
system mode BE32).
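A minimal sketch of the ordering rule, using an invented helper name (this is not the translate.c change itself):

#include <stdbool.h>
#include <stdint.h>

/* Build the 64-bit value that a single store will write so that the
 * 32-bit word at the lower address is always Rt and the one at
 * addr+4 is Rt2, regardless of CPU endianness. */
static uint64_t pair_for_single_store(uint32_t rt, uint32_t rt2, bool big_endian)
{
    if (big_endian) {
        /* A BE 64-bit store puts the high half at the lower address. */
        return ((uint64_t)rt << 32) | rt2;
    }
    /* An LE 64-bit store puts the low half at the lower address. */
    return ((uint64_t)rt2 << 32) | rt;
}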
Fixes: https://bugs.launchpad.net/qemu/+bug/1725267
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1509622400-13351-1-git-send-email-peter.maydell@linaro.org
QEMU currently crashes when the user tries to instantiate the fsl,imx31
device manually:
$ aarch64-softmmu/qemu-system-aarch64 -M kzm -device fsl,,imx31
**
ERROR:/home/thuth/devel/qemu/tcg/tcg.c:538:tcg_register_thread:
assertion failed: (n < max_cpus)
Aborted (core dumped)
The kzm board (which is the one that uses this CPU type) only supports
one CPU, and the realize function of the "fsl,imx31" device also uses
serial_hds[] directly, so this device clearly can not be instantiated
twice and thus we should mark it with user_creatable = false.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1509519537-6964-4-git-send-email-thuth@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
QEMU currently crashes when the user tries to instantiate the fsl,imx25
device manually:
$ aarch64-softmmu/qemu-system-aarch64 -S -M imx25-pdk -device fsl,,imx25
**
ERROR:/home/thuth/devel/qemu/tcg/tcg.c:538:tcg_register_thread:
assertion failed: (n < max_cpus)
The imx25-pdk board (which is the one that uses this CPU type) only
supports one CPU, and the realize function of the "fsl,imx25" device
also uses serial_hds[] directly, so this device clearly can not be
instantiated twice and thus we should mark it with user_creatable = false.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1509519537-6964-3-git-send-email-thuth@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This device causes QEMU to abort if the user tries to instantiate it:
$ qemu-system-aarch64 -M sabrelite -smp 1,maxcpus=2 -device fsl,,imx6
Unexpected error in qemu_chr_fe_init() at chardev/char-fe.c:222:
qemu-system-aarch64: -device fsl,,imx6: Device 'serial0' is in use
Aborted (core dumped)
The device uses serial_hds[] directly in its realize function, so it
can not be instantiated again by the user.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1509519537-6964-2-git-send-email-thuth@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
On a successful address translation instruction, PAR is supposed to
contain cacheability and shareability attributes determined by the
translation. We previously returned 0 for these bits (in line with the
general strategy of ignoring caches and memory attributes), but some
guest OSes may depend on them.
This patch collects the attribute bits in the page-table walk, and
updates PAR with the correct attributes for all LPAE translations.
Short descriptor formats still return 0 for these bits, as in the
prior implementation.
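For reference, a rough sketch of the packing, assuming the 64-bit (LPAE-format) PAR layout where the memory attributes sit in PAR[63:56] and the shareability field in PAR[8:7]; the helper name is invented:

#include <stdint.h>

static uint64_t par_set_attrs(uint64_t par, uint8_t memattr, uint8_t sh)
{
    par &= ~(0xffULL << 56);            /* clear the ATTR field */
    par &= ~(0x3ULL << 7);              /* clear the SH field */
    par |= (uint64_t)memattr << 56;     /* attributes from the table walk */
    par |= ((uint64_t)sh & 0x3) << 7;   /* shareability from the table walk */
    return par;
}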
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-id: 20171031223830.4608-1-Andrew.Baumann@microsoft.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
cocoa queue:
* make scrolling work in GUI monitor windows
* change ungrab to ctrl-alt-g (matching gtk)
* pass unused ctrl-alt combos to guest
# gpg: Signature made Tue 07 Nov 2017 10:15:00 GMT
# gpg: using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-cocoa-20171107:
ui/cocoa.m: Send ctrl-alt key combos to guest if QEMU isn't using them
ui/cocoa.m: move ungrab to ctrl-alt-g
ui/cocoa.m: Make scrolling work again in GUI monitor windows
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Merge build 2017/11/07 v1
# gpg: Signature made Tue 07 Nov 2017 10:14:49 GMT
# gpg: using RSA key 0xBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* remotes/berrange/tags/pull-build-2017-11-07-1:
build: remove use of MAKELEVEL optimization in submodule handling
build: delay check for empty git submodule list
build: don't fail if given a git submodule which does not exist
build: allow automatic git submodule updates to be disabled
build: don't create temporary files in source dir
build: allow setting a custom GIT binary for transparent proxying
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This fixes a bad errno returned to the guest and a trivial coding style nit.
# gpg: Signature made Mon 06 Nov 2017 18:09:24 GMT
# gpg: using RSA key 0x71D4D5E5822F73D6
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg: aka "Gregory Kurz <gregory.kurz@free.fr>"
# gpg: aka "[jpeg image of size 3330]"
# Primary key fingerprint: B482 8BAF 9431 40CE F2A3 4910 71D4 D5E5 822F 73D6
* remotes/gkurz/tags/for-upstream:
9pfs: fix v9fs_mark_fids_unreclaim() return value
9pfs: drop one user of struct V9fsFidState
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Send those ctrl-alt key combos that QEMU doesn't treat specially to
the guest rather than ignoring them.
All the cases where we do special handling of ctrl-alt-X exit the

event handling using a "return" statement, so we can simply allow
the rest to fall through into the normal key handling by deleting
the now-spurious "else".
We take the opportunity to clean up some oddly-formatted and
now rather uninformative comments by removing them.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The return value of v9fs_mark_fids_unreclaim() is then propagated to
pdu_complete(). It should be a negative errno, not -1.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
MIPS patches 2017-11-06
Changes:
Update email addresses of Yongbok Kim, James Hogan and Paul Burton.
# gpg: Signature made Mon 06 Nov 2017 15:38:58 GMT
# gpg: using RSA key 0x2238EB86D5F797C2
# gpg: Good signature from "Yongbok Kim <yongbok.kim@mips.com>"
# gpg: aka "Yongbok Kim <yongbok.kim@imgtec.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 8600 4CF5 3415 A5D9 4CFA 2B5C 2238 EB86 D5F7 97C2
* remotes/yongbok/tags/mips-20171106:
MAINTAINERS: Update Paul Burton's email address
MAINTAINERS: Update James Hogan's email address
MAINTAINERS: Update Yongbok Kim's email address
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The Makefile attempts to optimize the handling of submodules by using MAKELEVEL
to only check the submodule status when running from the top level make
invocation. This causes problems for people who use a makefile of their
own to in turn invoke QEMU's makefile, as MAKELEVEL is already set to 1 (or
more) when QEMU's makefile runs.
This optimization should not really be needed, since the git-submodule.sh
script is already used to detect if a submodule update is required. Thus by
removing the MAKELEVEL check, we at most add an extra 'git-submodule.sh status'
call to each make level, the overhead of which is lost in the noise of building
QEMU.
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We short circuit the git submodule update when passed an empty module list.
This accidentally causes the 'status' command to write to the status file. The
test needs to be delayed into the individual commands to avoid this premature
writing of the status file.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If going back in time in git history, across a commit that introduces a new
submodule, the 'git-submodule.sh' script will fail, causing the rebuild to fail.
This is because config-host.mak contains a GIT_SUBMODULES variable that lists
a submodule that only exists in the later commit. config-host.mak won't get
repopulated until config.status is invoked, but make won't get this far due to
the submodule error.
This change makes 'git-submodule.sh' check whether each module is known to git
and drops any which are not present. A warning message will be printed when any
submodule is dropped in this manner.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some people building QEMU use VPATH builds where the source directory is on a
read-only volume. In such a case 'scripts/git-submodules.sh update' will always
fail and users are required to run it manually themselves on their original
writable source directory.
While this is already supported, it is nice to give users a command line flag
to configure to permanently disable automatic submodule updates, as it means
they won't get hard-to-diagnose failures from git-submodule.sh at an arbitrary
later date.
This patch thus introduces a flag '--disable-git-update' which will prevent
'make' from ever running 'scripts/git-submodules.sh update'. It will still run
the 'status' command to determine if a submodule update is needed, but when it
does this it will simply stop and print a message instructing the developer
what to do, e.g.:
$ ./configure --target-list=x86_64-softmmu --disable-git-update
...snip...
$ make
GEN config-host.h
GEN trace/generated-tcg-tracers.h
GEN trace/generated-helpers-wrappers.h
GEN trace/generated-helpers.h
GEN trace/generated-helpers.c
GEN module_block.h
GIT submodule checkout is out of date. Please run
scripts/git-submodule.sh update ui/keycodemapdb
from the source directory checkout /home/berrange/src/virt/qemu
make: *** [Makefile:31: git-submodule-update] Error 1
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
There are cases where users do VPATH builds with the source directory being on
a read-only volume. In such a case they have to manually run the command
'git-submodule.sh ...modules...' ahead of time. When checking the status we
should therefore not write into the source dir.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some users can't run a bare 'git' command, due to the need for a transparent
proxying solution such as 'tsocks'. This adds an argument to configure to
let users specify such a thing:
./configure --with-git="tsocks git"
The submodule script is also updated to give the user a hint about using this
flag, if we fail to checkout modules.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
GCC 4.9 and newer stopped warning for missing braces around the
"universal" C zero initializer {0}. One such initializer sneaked
into scsi/qemu-pr-helper.c and is breaking the build with those older
GCC versions.
Detect the lack of support for the idiom, and disable the warning
in that case.
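A minimal reproducer of the kind of construct involved (hypothetical struct names, not the qemu-pr-helper.c code); with -Wall, GCC older than 4.9 emits a -Wmissing-braces warning here, while newer compilers accept it silently:

struct inner {
    int a;
    int b;
};

struct outer {
    struct inner first;   /* aggregate first member is what trips the warning */
    int c;
};

int main(void)
{
    struct outer o = {0}; /* pre-4.9 GCC wants {{0, 0}, 0} to stay quiet */
    return o.c;
}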
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Legacy PCI device assignment has been removed from Linux in 4.12,
and had been deprecated 2 years ago there. We can remove it from
QEMU as well.
The ROM loading code was shared with Xen PCI passthrough, so move
it to hw/xen.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Queued tcg patches
# gpg: Signature made Fri 03 Nov 2017 08:37:58 GMT
# gpg: using RSA key 0x64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth/tags/pull-tcg-20171103:
cpu-exec: Exit exclusive region on longjmp from step_atomic
tcg/s390x: Use constant pool for prologue
tcg: Allow constant pool entries in the prologue
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit ac03ee5331 narrowed the scope of the exclusive
region so it only covers when we're executing the TB, not when
we're generating it. However it missed that there is more than
one execution path out of cpu_tb_exec -- if the atomic insn
causes an exception then the code will longjmp out, skipping
the code to end the exclusive region. This causes QEMU to hang
the next time the CPU calls start_exclusive(), waiting for
itself to exit the region.
Move the "end the region" code out to the end of the
function so that it is run for both normal exit and also
for exit-via-longjmp. We have to use a volatile bool flag
to decide whether we need to end the region, because we
can longjump out of the codegen as well as the execution.
(For some reason this only reproduces for me with a clang
optimized build, not a gcc debug build.)
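The shape of the fix, as a hedged sketch with partly invented names (not the actual cpu-exec.c code):

#include <setjmp.h>
#include <stdbool.h>

extern sigjmp_buf exit_jmp;               /* assumed: set up by the caller */
extern void start_exclusive(void);
extern void end_exclusive(void);
extern void exec_atomic_insn(void);       /* may siglongjmp(exit_jmp, 1) */

static void step_atomic_sketch(void)
{
    /* volatile: a siglongjmp back past the sigsetjmp must not lose the
     * information that we already entered the exclusive region. */
    volatile bool in_exclusive_region = false;

    if (sigsetjmp(exit_jmp, 0) == 0) {
        start_exclusive();
        in_exclusive_region = true;
        exec_atomic_insn();               /* exception path longjmps out */
    }

    /* Runs on both the normal path and the exit-via-longjmp path. */
    if (in_exclusive_region) {
        end_exclusive();
    }
}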
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Fixes: ac03ee5331
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1509640536-32160-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rather than have separate code only used for guest_base,
rely on a recent change to handle constant pool entries.
Cc: qemu-s390x@nongnu.org
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Both ARMv6 and AArch64 currently may drop complex guest_base values
into the constant pool. But generic code wasn't expecting that, and
the pool is not emitted. Correct that.
Tested-by: Emilio G. Cota <cota@braap.org>
Tested-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Reported-by: Laurent Desnogues <laurent.desnogues@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since 927128222b QEMU depends on keycodemapdb, which uses the Python 'csv'
module from the stdlib to parse keymaps.csv.
Without this package the build fails:
GEN ui/input-keymap-linux-to-qcode.c
Traceback (most recent call last):
File "ui/keycodemapdb/tools/keymap-gen", line 15, in <module>
import csv
ImportError: No module named csv
GEN ui/input-keymap-qcode-to-qnum.c
Traceback (most recent call last):
File "ui/keycodemapdb/tools/keymap-gen", line 15, in <module>
import csv
ImportError: No module named csv
[...]
CC ui/input-keymap.o
ui/input-keymap.c:8:44: fatal error: ui/input-keymap-linux-to-qcode.c: No such file or directory
make: *** [ui/input-keymap.o] Error 1
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
qemu-sparc update
# gpg: Signature made Tue 31 Oct 2017 17:43:11 GMT
# gpg: using RSA key 0x5BC2C56FAE0F321F
# gpg: Good signature from "Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>"
# Primary key fingerprint: CC62 1AB9 8E82 200D 915C C9C4 5BC2 C56F AE0F 321F
* remotes/mcayland/tags/qemu-sparc-signed:
sun4m: change TYPE_SUN4M_IOMMU macro from "iommu" to "sun4m-iommu"
sun4m_iommu: remove legacy sparc_iommu_memory_rw() function
sparc32_dma: switch over to using IOMMU memory region and DMA API
sun4m: implement IOMMU translation using IOMMU memory region
sparc32_dma: add len to esp/le DMA memory tracing
sparc32_dma: remove is_ledma hack and replace with memory region alias
sparc32_dma: introduce new SPARC32_DMA type container object
sparc32_dma: make lance device child of ledma device
lance: move TYPE_LANCE and SysBusPCNetState from lance.c to lance.h
sparc32_dma: make esp device child of espdma device
esp: move TYPE_ESP and SysBusESPState from esp.c to esp.h
sparc32_dma: use object link instead of qdev property to pass IOMMU reference
sun4m_iommu: move TYPE_SUN4M_IOMMU declaration to sun4m.h
sun4m: move DMA device wiring from sparc32_dma_init() to sun4m_hw_init()
sparc32_dma: move type declarations from sparc32_dma.c to sparc32_dma.h
sparc32_dma: split esp and le into separate DMA devices
sparc32_dma: rename SPARC32_DMA type to SPARC32_DMA_DEVICE
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The allwinner code is only needed for the allwinner board (for which
we also have a separate CONFIG_ALLWINNER_A10 config switch), so it
does not make sense that we compile this for all the other boards
that need AHCI, too. Let's move it to a separate file that is only
compiled when CONFIG_ALLWINNER_A10 is set.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1508784509-29377-1-git-send-email-thuth@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
This is a legacy artifact from when the sun4m IOMMU implementation was
the only IOMMU available within QEMU.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
This hack originated from before the memory region API was introduced, and
increased the size of the ledma DMA device to capture incorrect accesses
beyond the end of the ledma device. A full analysis can be found on Artyom's
blog at http://tyom.blogspot.co.uk/2010/10/bug-in-all-solaris-versions-after-57.html.
With the memory API we can now simply alias the incorrect access onto its
intended destination allowing us to remove the hack.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Create a new SPARC32_DMA container object (including an appropriate container
memory region) and add instances of the SPARC32_ESPDMA_DEVICE and
SPARC32_LEDMA_DEVICE as child objects. The benefit is that most of the gpio
wiring complexity between esp/espdma and lance/ledma is now hidden within the
SPARC32_DMA realize function.
Since the sun4m IOMMU is already QOMified we can find a reference to
it using object_resolve_path_type() allowing us to completely remove all external
references to the iommu pointer.
Finally we rework sun4m's sparc32_dma_init() to invoke the new SPARC32_DMA object
and wire up the remaining board memory regions/IRQs.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
Acked-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This enables them to be used outside of lance.c. We also update the comment to
refer to the SPARC32 lance device rather than the AMD PCNet-II device (of which
lance is a register-compatible subset).
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
CC: Jason Wang <jasowang@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This makes it possible to reference the esp device from the espdma device as
required, and by wiring up the device ourselves in sun4m.c we can drop use
of the esp_init() function.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
Acked-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This enables us to remove the last remaining (opaque) qdev property. Whilst we
are here, also update iommu_init() to use TYPE_SUN4M_IOMMU instead of a
hardcoded string.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
By using the sysbus interface it is possible to wire up the esp/le devices
to the sun4m DMA controller directly during sun4m_hw_init() instead of
passing qemu_irqs into the sparc32_dma_init() function.
This is an intermediate step to allow further reorganisation as more logic
is moved into the relevant SPARC32 DMA devices; there will be a final
refactoring of sparc32_dma_init() once this work is complete.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Due to slight differences in behaviour accessing the registers for the
esp and le devices, create two separate SPARC32_DMA_DEVICE types and
update the sun4m machine to use them.
Note that by using different device types we already know the size of
the register block and the value of is_ledma at init time, allowing us to
drop the SPARC32_DMA_DEVICE realize function and the is_ledma device
property.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
Acked-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Also update the function names to match as appropriate. While we're
here rename the type from sparc32_dma to sparc32-dma in order to
match the current QOM convention.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
target-arm queue:
* fix instruction-length bit in syndrome for WFI/WFE traps
* xlnx-zcu102: Specify the max number of CPUs
* msf2: Remove dead code reported by Coverity
* msf2: Wire up SYSRESETREQ in SoC for system reset
* hw/pci-host/gpex: Improve INTX to gsi routing error checking
# gpg: Signature made Tue 31 Oct 2017 13:10:02 GMT
# gpg: using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20171031:
hw/pci-host/gpex: Improve INTX to gsi routing error checking
msf2: Wire up SYSRESETREQ in SoC for system reset
msf2: Remove dead code reported by Coverity
xlnx-zcu102: Specify the max number of CPUs
fix WFI/WFE length in syndrome register
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We exposed gpex_set_irq_num() for machines to set the INTx to
GSI routing. However if the machine forgets to call that
function, we currently do not check that the association was properly
done. Let's initialize the gsi values to -1 and if this value is
found in gpex_route_intx_pin_to_irq, set the routing mode as
disabled.
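A rough sketch of that pattern with invented names (not the actual gpex code):

#define GPEX_NUM_IRQS 4

typedef struct IntxGsiMap {
    int gsi[GPEX_NUM_IRQS];
} IntxGsiMap;

static void intx_gsi_map_init(IntxGsiMap *m)
{
    for (int i = 0; i < GPEX_NUM_IRQS; i++) {
        m->gsi[i] = -1;                   /* machine has not wired this pin */
    }
}

/* Returns the gsi for an INTx pin, or -1 meaning "routing disabled". */
static int intx_gsi_route(const IntxGsiMap *m, int pin)
{
    if (pin < 0 || pin >= GPEX_NUM_IRQS || m->gsi[pin] < 0) {
        return -1;
    }
    return m->gsi[pin];
}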
Signed-off-by: Eric Auger <eric.auger@redhat.com>
Message-id: 1508776211-22175-1-git-send-email-eric.auger@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
WFI/E are often, but not always, 4 bytes long. When they are, we need to
set ARM_EL_IL_SHIFT in the syndrome register.
Pass the instruction length to HELPER(wfi), use it to decrement pc
appropriately and to pass an is_16bit flag to syn_wfx, which sets
ARM_EL_IL_SHIFT if needed.
Set dc->insn in both arm_tr_translate_insn and thumb_tr_translate_insn.
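For orientation, a sketch of the syndrome encoding involved (illustrative only; it assumes the architectural ESR_ELx layout with the EC field in bits [31:26] and the IL bit at bit 25, which is what ARM_EL_IL_SHIFT refers to):

#include <stdbool.h>
#include <stdint.h>

#define ARM_EL_IL_SHIFT 25                /* ESR_ELx.IL: 1 = 32-bit insn */

static uint32_t syn_wfx_sketch(uint32_t ec_wfx, bool is_16bit)
{
    uint32_t syn = ec_wfx << 26;          /* EC field, bits [31:26] */
    if (!is_16bit) {
        syn |= 1u << ARM_EL_IL_SHIFT;     /* trapped insn was 4 bytes long */
    }
    return syn;
}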
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Message-id: alpine.DEB.2.10.1710241055160.574@sstabellini-ThinkPad-X260
[PMM: move setting of dc->insn for Thumb so it is correct for 32 bit insns]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Minimal implementation: for a structured error, only error_report() the
error message.
Note that test 83 is now more verbose, because the implementation
prints more warnings about unexpected communication errors; perhaps
future patches should tone things down by converting those warnings
into trace messages, but the common case of successful communication
is no noisier than before.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171027104037.8319-13-eblake@redhat.com>
An upcoming change to block/nbd-client.c will want to read the
tail of a structured reply chunk directly from the wire. Move
this function to make it easier.
Based on a patch from Vladimir Sementsov-Ogievskiy.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20171027104037.8319-12-eblake@redhat.com>
In the following patch nbd_receive_reply() will be used for receiving both
simple and structured reply headers.
NBDReply is altered into a union of the simple reply header and the structured
reply chunk header; the simple error translation is moved to block/nbd-client
to be consistent with the upcoming structured reply error translation.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171027104037.8319-11-eblake@redhat.com>
The NBD spec permits including a human-readable error string if
structured replies are in force, so we might as well send the
client the message that we logged on any error.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20171027104037.8319-9-eblake@redhat.com>
Minimal implementation of structured read: one structured reply chunk,
no segmentation.
Minimal structured error implementation: no text message.
Support the DF flag, but just ignore it, as there is no segmentation
anyway.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171027104037.8319-8-eblake@redhat.com>
Consolidate the response for a non-zero-length option payload
into a new function, nbd_reject_length(). This check will
also be used when introducing support for structured replies.
Note that the STARTTLS response differs based on timing: if the connection
is still unencrypted, we set fatal to true (a client that can't
request TLS correctly may still think that we are ready to start
the TLS handshake, so we must disconnect); while if the connection
is already encrypted, the client is sending a bogus request but
is no longer at risk of being confused by continuing the connection.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171027104037.8319-7-eblake@redhat.com>
[eblake: correct return value on STARTTLS]
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Instead of making each caller check whether a transmission error
occurred, we can sink a common error check to the end of the loop.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20171027104037.8319-6-eblake@redhat.com>
[eblake: squash in compiler warning fix]
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
When the server is read-only, we were already reporting an error
message for NBD_CMD_WRITE_ZEROES, but failed to set errp for a
similar NBD_CMD_WRITE. This will matter more once structured
replies allow the server to propagate the errp information back
to the client. While at it, use an error message that makes a
bit more sense if viewed on the client side.
Note that when using qemu-io to test qemu-nbd behavior, it is
rather difficult to convince qemu-io to send protocol violations
(such as a read beyond bounds), because we have a lot of active
checking on the client side that a qemu-io request makes sense
before it ever goes over the wire to the server. The case of a
client attempting a write when the server is started as
'qemu-nbd -r' is one of the few places where we can easily test
error path handling, without having to resort to hacking in known
temporary bugs to either the server or client. [Maybe we want a
future patch to the client to do up-front checking on writes to a
read-only export, the way it does up-front bounds checking; but I
don't see anything in the NBD spec that points to a protocol
violation in our current behavior.]
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20171027104037.8319-5-eblake@redhat.com>
Upcoming patches will implement the NBD structured reply
extension [1] for both client and server roles. Declare the
constants, structs, and lookup routines that will be valuable
whether the server or client code is backported in isolation.
This includes moving one constant from an internal header to
the public header, as part of the structured read processing
will be done in block/nbd-client.c rather than nbd/client.c.
[1] https://github.com/NetworkBlockDevice/nbd/blob/extension-structured-reply/doc/proto.md
Based on patches from Vladimir Sementsov-Ogievskiy.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20171027104037.8319-4-eblake@redhat.com>
This is needed in preparation for structured reply handling,
as we will be performing the translation from NBD error to
system errno value higher in the stack at block/nbd-client.c.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20171027104037.8319-3-eblake@redhat.com>
NBD errors were originally sent over the wire based on Linux errno
values; but not all the world is Linux, and not all platforms share
the same values. Since a number isn't very easy to decipher on all
platforms, update the trace messages to include the name of NBD
errors being sent/received over the wire. Tweak the trace messages
to be at the point where we are using the NBD error, not the
translation to the host errno values.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20171027104037.8319-2-eblake@redhat.com>
If a CPU selected with the "cpu" command is hot-unplugged then "info cpus"
causes QEMU to exit:
(qemu) device_del cpu1
(qemu) info cpus
qemu:qemu_cpu_kick_thread: No such process
This happens because "cpu" stores the pointer to the selected CPU into
the monitor structure. When the CPU is hot-unplugged, we end up with a
dangling pointer. The "info cpus" command then does:
hmp_info_cpus()
monitor_get_cpu_index()
mon_get_cpu()
cpu_synchronize_state() <--- called with dangling pointer
This could cause a QEMU crash as well.
This patch switches the monitor to store the QOM path instead of a
pointer to the current CPU. The path is then resolved when needed.
If the resolution fails, we assume that the CPU was removed and the
path is reset to the default (i.e., the path of first_cpu).
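A generic sketch of the approach, with invented helper names standing in for the QOM path resolution:

#include <stdlib.h>
#include <string.h>

typedef struct MonState {
    char *cpu_path;                       /* stable identifier, not a pointer */
} MonState;

extern void *resolve_cpu_path(const char *path);   /* NULL if the CPU is gone */
extern const char *first_cpu_path(void);

static void *mon_get_cpu_sketch(MonState *mon)
{
    void *cpu = mon->cpu_path ? resolve_cpu_path(mon->cpu_path) : NULL;

    if (!cpu) {
        /* CPU was hot-unplugged (or never selected): fall back to default. */
        free(mon->cpu_path);
        mon->cpu_path = strdup(first_cpu_path());
        cpu = resolve_cpu_path(mon->cpu_path);
    }
    return cpu;
}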
Reported-by: Satheesh Rajendran <sathnaga@linux.vnet.ibm.com>
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <150822818243.26242.12993827911736928961.stgit@bahia.lan>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
s390x: fixups for 2.11
- missing \r in the BIOS console output
- CPU type name is now "s390x-cpu"
- fixup for the host-model on z14 and older machine versions
# gpg: Signature made Mon 30 Oct 2017 08:34:15 GMT
# gpg: using RSA key 0x117BBC80B5A61C7C
# gpg: Good signature from "Christian Borntraeger (IBM) <borntraeger@de.ibm.com>"
# Primary key fingerprint: F922 9381 A334 08F9 DBAB FBCA 117B BC80 B5A6 1C7C
* remotes/borntraeger/tags/s390x-20171030:
s390-*.img: update s390 bios with latest fixes
s390-ccw: print carriage return with new lines
s390x/kvm: use cpu model for gscb on compat machines
target/s390x: change CPU type name to "s390x-cpu"
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
migration/next for 20171029
# gpg: Signature made Sun 29 Oct 2017 13:07:43 GMT
# gpg: using RSA key 0xF487EF185872D723
# gpg: Good signature from "Juan Quintela <quintela@redhat.com>"
# gpg: aka "Juan Quintela <quintela@trasno.org>"
# Primary key fingerprint: 1899 FF8E DEBF 58CC EE03 4B82 F487 EF18 5872 D723
* remotes/juanquintela/tags/migration/20171029:
tests: check that migration parameters are really assigned
tests: Don't abuse global_qtest
tests: Factorize out migrate_test_start/end
tests: Refactor setting of parameters/capabilities
tests: rename postcopy-test to migration-test
migration: Make xbzrle_cache_size a migration parameter
migration: No need to return the size of the cache
migration: Don't play games with the requested cache size
migration: Make sure that we pass the right cache size
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
x86/cpu/numa queue, 2017-10-27
# gpg: Signature made Fri 27 Oct 2017 15:17:12 BST
# gpg: using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/x86-and-machine-pull-request: (39 commits)
x86: Skip check apic_id_limit for Xen
numa: fixup parsed NumaNodeOptions earlier
mips: r4k: replace cpu_model with cpu_type
mips: mipssim: replace cpu_model with cpu_type
mips: Magnum/Acer Pica 61: replace cpu_model with cpu_type
mips: fulong2e: replace cpu_model with cpu_type
mips: malta/boston: replace cpu_model with cpu_type
mips: use object_new() instead of gnew()+object_initialize()
sparc: leon3: use generic cpu_model parsing
sparc: sparc: use generic cpu_model parsing
sparc: sun4u/sun4v/niagara: use generic cpu_model parsing
sparc: cleanup cpu type name composition
tricore: use generic cpu_model parsing
tricore: cleanup cpu type name composition
unicore32: use generic cpu_model parsing
unicore32: cleanup cpu type name composition
xtensa: lx60/lx200/ml605/kc705: use generic cpu_model parsing
xtensa: sim: use generic cpu_model parsing
xtensa: cleanup cpu type name composition
sh4: remove SuperHCPUClass::name field
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
includes
7618c0aefe ("s390-ccw: print carriage return with new lines")
a8fbbf1db7 ("s390: set DHCP client architecure id for netboot")
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
The sclp console in the s390 bios writes raw data, leading
console emulators (such as virsh console) to treat a new line
('\n') as a bare line feed rather than as a Unix line ending
with an implied carriage return. Because of this, output
appears in a "staircase" pattern.
Let's print \r\n on every occurrence of a new line in the string
passed to write to amend this issue.
This is in sync with the guest Linux code in
drivers/s390/char/sclp_vt220.c which also does a line feed
conversion in the console part of the driver.
This fixes the s390-ccw and s390-netboot output like
$ virsh start test --console
Domain test started
Connected to domain test
Escape character is ^]
Network boot starting...
Using MAC address: 02:01:02:03:04:05
Requesting information via DHCP: 010
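The conversion itself is tiny; a minimal sketch assuming a raw write primitive (invented names, not the actual pc-bios/s390-ccw code):

#include <stddef.h>

extern void sclp_write_raw(const char *buf, size_t len);   /* assumed */

static void sclp_print(const char *str)
{
    for (size_t i = 0; str[i] != '\0'; i++) {
        if (str[i] == '\n') {
            sclp_write_raw("\r", 1);      /* carriage return before the LF */
        }
        sclp_write_raw(&str[i], 1);
    }
}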
Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Message-Id: <1509120893-28054-1-git-send-email-walling@linux.vnet.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Alexander Graf <agraf@suse.de>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Starting a guest with
<os>
<type arch='s390x' machine='s390-ccw-virtio-2.9'>hvm</type>
</os>
<cpu mode='host-model'/>
on an IBM z14 results in
"qemu-system-s390x: Some features requested in the CPU model are not
available in the configuration: gs"
This is because guarded storage is fenced for compat machines that did
not have guarded storage support. While this prevents future migration
abort (by not starting the guest at all), not being able to start a
"host-model" guest is very much unexpected. As it turns out, even if we
would modify libvirt to not expand the cpu model to contain "gs" for
compat machines, it cannot guarantee that a migration will succeed. For
example if the kernel changes its features (or the user has nested=1 on
one host but not on the other) the migration will fail nevertheless. So
instead of fencing "gs" for machines <= 2.9, let's allow it for all
machine types that support the CPU model. This will make "host-model"
runnable all the time, while relying on the CPU model to reject invalid
migration attempts. We also need to change the migration for guarded
storage.
Additional discussions about host-model are still pending but are out
of scope of this patch.
Suggested-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: David Hildenbrand <david@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Until now, type names such as host-s390-cpu weren't exposed to the user:
cpu-add, -cpu and the CPU model QMP interfaces didn't care about the actual
type, as that information was hidden.
This changed with CPU hotplug via device_add. Now the type is visible to
the user. Before we get that supported in a stable version, this is our
last chance to change it.
So change it from "s390-cpu" to "s390x-cpu", to match the architecture
name. Example names are then e.g. z14-s390x-cpu or qemu-s390x-cpu.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171020115803.14093-1-david@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
As we have two guests running, always pass explicitly which guest we want to
send a message to. While at it, refactor return_or_event() into wait_command().
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Factor the code out instead of repeating it, as we are going to do more tests in this file.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Right now it is a variable in MigrationState instead of a
MigrationParameter. The change allows setting it like the rest of the
migration parameters, from the command line, with
query_migration_parameters, set_migrate_parameters, etc.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
After the previous commits, we make sure that the value passed is
valid, or else we report an error. So now we return either when there is
an error or when we have correctly set up the value passed.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
--
Improve error message
Return 0 always for success
Now that we check that the value passed is a power of 2, we don't need
to play games when computing the size that the cache is going to
take.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Instead of silently rounding down the number of pages, make it an
error if the cache size is not a power of 2.
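The validation is a simple power-of-two test; a hedged sketch (not the migration code itself, and the page-size lower bound is an assumption):

#include <stdbool.h>
#include <stdint.h>

static bool is_power_of_2_u64(uint64_t v)
{
    return v != 0 && (v & (v - 1)) == 0;
}

/* Returns 0 if new_size is acceptable, -1 if the caller should report
 * "cache size must be a power of two" instead of rounding it down. */
static int check_cache_size(uint64_t new_size, uint64_t page_size)
{
    if (new_size < page_size || !is_power_of_2_u64(new_size)) {
        return -1;
    }
    return 0;
}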
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The numa 'mem' option, with or without a suffix, is possible only on the
CLI/HMP. Instead of fixing up the special suffix-less CLI case deep in
parse_numa_node(), do it earlier, right after the option is parsed into
NumaNodeOptions with OptsVisitor, so that the rest of the code uses
valid values in NumaNodeOptions and won't have to reparse QemuOpts.
This will help to isolate the CLI/HMP parts in parse_numa() and split out
the parsed NumaNodeOptions processing into a separate function that could
be reused by a QMP handler, where we have only NumaNodeOptions and don't
need any fixups.
While at it, reuse qemu_strtosz_MiB() instead of manually checking for
suffixes.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1507801198-98182-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
object_initialize() is intended for in-place initialization of
objects, but here the object is first allocated with g_new0() and then
initialized with object_initialize(). QEMU already has an API
to do this (object_new()), so create the object with the API that
suits this use case.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1507211474-188400-36-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Introduce the TRICORE_CPU_TYPE_NAME macro and use it to construct
cpu type names. While at it, move the cpu type_infos into one
array and register it directly with type_init_from_array()
instead of the custom tricore_cpu_register_types()/cpu_register().
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1507211474-188400-30-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The field contains the upper-cased cpu model name and is used
for printing supported cpu model names for '-cpu help'.
Considering that the cpu model lookup in superh_cpu_class_by_name()
is case-insensitive, we can drop the upper-casing when
printing the supported cpu list and use the cpu type directly
to do the same, by cutting out SUPERH_CPU_TYPE_SUFFIX from the
typename.
This allows us to remove SuperHCPUClass::name, which practically
duplicates the names defined by the TYPE_SH*_CPU definitions, and to
simplify sh*_class_init()/SuperHCPUClass a bit.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1507211474-188400-24-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Currently for sh4 the cpu_model argument of the '-cpu' option
could be either a 'cpu model' name or a cpu typename.
However, typically '-cpu' takes a 'cpu model' name, and
the cpu type for the sh4 target isn't advertised publicly
('-cpu help' prints only 'cpu model' names), so we
shouldn't care about this use case (it's more of a bug).
1. Drop '-cpu cpu_typename' to align with the rest of the
targets.
2. Compose the searched-for typename from the cpu model and use
it with object_class_by_name() directly instead of the
over-complicated
object_class_get_list()
g_slist_find_custom() + superh_cpu_name_compare()
With #1 dropped, #2 can be used for both lookups, which
simplifies superh_cpu_class_by_name() quite a bit.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1507211474-188400-23-git-send-email-imammedo@redhat.com>
[ehabkost: Include fixup sent by Igor]
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Introduce the SUPERH_CPU_TYPE_NAME macro and use it to construct
cpu type names. While at it, move the cpu type_infos into one
array and register it directly with type_init_from_array()
instead of the custom superh_cpu_register_types().
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1507211474-188400-22-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
It 'works' with the default CPU only because of a bug in
moxie_cpu_class_by_name(), which treats cpu_model
as a type name, and the default cpu_model also happens to be
a type name. But explicitly specifying a cpu on the CLI,
e.g. '-cpu MoxieLite', makes QEMU fail since
moxie_cpu_class_by_name() doesn't translate cpu_model
to a cpu type and fails to find the corresponding object class.
Fix moxie_cpu_class_by_name() to do a proper
cpu_model -> cpu type
translation, and fix the default cpu_model to be a cpu model
instead of a typename.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1507211474-188400-15-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Introduce the ALPHA_CPU_TYPE_NAME macro to replace the rather non-unique
TYPE macro that alpha uses. With the new macro it will follow
the same naming convention as other targets.
While at it, put the scattered TypeInfo into one array, which places the
type descriptions in one place and reduces the code a bit.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1507211474-188400-5-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Xen 2017/10/26
# gpg: Signature made Thu 26 Oct 2017 23:57:16 BST
# gpg: using RSA key 0x894F8F4870E1AE90
# gpg: Good signature from "Stefano Stabellini <stefano.stabellini@eu.citrix.com>"
# gpg: aka "Stefano Stabellini <sstabellini@kernel.org>"
# Primary key fingerprint: D04E 33AB A51F 67BA 07D3 0AEA 894F 8F48 70E1 AE90
* remotes/sstabellini/tags/xen-20171026-tag:
xen: Log errno rather than return value
xen: dont try setting max grants multiple times
xen: add a global indicator for grant copy being available
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Block layer patches
# gpg: Signature made Thu 26 Oct 2017 14:02:54 BST
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (35 commits)
iotests: Add cluster_size=64k to 125
qcow2: Always execute preallocate() in a coroutine
qcow2: Fix unaligned preallocated truncation
qcow2: Emit errp when truncating the image tail
iotests: Filter actual image size in 184 and 191
iotests: Pull _filter_actual_image_size from 67/87
iotests: Add test for dataplane mirroring
qcow2: Use BDRV_SECTOR_BITS instead of its literal value
qemu-img.1: Image invalidation on qemu-img commit
qemu-io: Relax 'alloc' now that block-status doesn't assert
qcow2: Reduce is_zero() rounding
block: Reduce bdrv_aligned_preadv() rounding
block: Align block status requests
qemu-img: Change img_compare() to be byte-based
qemu-img: Change img_rebase() to be byte-based
qemu-img: Change compare_sectors() to be byte-based
qemu-img: Change check_empty_sectors() to byte-based
qemu-img: Drop redundant error message in compare
qemu-img: Add find_nonzero()
qemu-img: Speed up compare on pre-allocated larger file
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In one case we misconstrue a BOOL return as an HRESULT, and in the
other case we don't check the BOOL return from LookupAccountSidW()
before extracting the HRESULT from GetLastError(). Both can lead to
getNameByStringSID() misreporting an error.
Reported-by: Chen Hanxiao <chenhanxiao@gmail.com>
Suggested-by: Tomáš Golembiovský <tgolembi@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
xen_modified_memory() sets errno to communicate what went wrong so log
this rather than the return value which is not interesting.
Signed-off-by: Ross Lagerwall <ross.lagerwall@citrix.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Trying to call xengnttab_set_max_grants() with the same file handle
might fail on some kernels, as this operation is allowed only once.
This is a problem for the qdisk backend as blk_connect() can be
called multiple times for a domain, e.g. in case grub-xen is being
used to boot it.
So instead of letting the generic backend code open the gnttab device,
do it in blk_connect() and close it again in blk_disconnect().
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
The Xen qdisk backend needs to test whether grant copy operations are
available in the kernel. Unfortunately this collides with using
xengnttab_set_max_grants() on some kernels as this operation has to
be the first one after opening the gnttab device.
In order to solve this problem test for the availability of grant copy
in xen_be_init() opening the gnttab device just for that purpose and
closing it again afterwards. Advertise the availability via a global
flag and use that flag in the qdisk backend.
Signed-off-by: Juergen Gross <jgross@suse.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Block patches
# gpg: Signature made Thu Oct 26 15:01:20 2017 CEST
# gpg: using RSA key F407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* mreitz/tags/pull-block-2017-10-26:
iotests: Add cluster_size=64k to 125
qcow2: Always execute preallocate() in a coroutine
qcow2: Fix unaligned preallocated truncation
qcow2: Emit errp when truncating the image tail
iotests: Filter actual image size in 184 and 191
iotests: Pull _filter_actual_image_size from 67/87
iotests: Add test for dataplane mirroring
qcow2: Use BDRV_SECTOR_BITS instead of its literal value
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tests 067 and 087 filter the actual image size because it depends on the
host filesystem (and is not part of the respective test). Since this is
generally true, we should have a common filter function for this, so
let's pull out the sed line from both tests into such a function.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20171009163456.485-2-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
BDRV_SECTOR_BITS is defined to be 9 in block.h (and BDRV_SECTOR_SIZE
is calculated from that), but there are still a couple of places where
we are using the literal value instead of the macro.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 20171009153856.20387-1-berto@igalia.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
qemu-img commit invalidates all images between base and top. This
should be mentioned in the man page.
Suggested-by: Ping Li <pingl@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Previously, the alloc command required that input parameters be
sector-aligned and clamped to 32 bits, because the underlying
bdrv_is_allocated used a 32-bit parameter and asserted aligned
inputs. But now that we have fixed block status to report a
64-bit bytes value, and to properly round requests on behalf of
guests, we can pass any values, and can use qemu-io to add
coverage that our rounding is correct regardless of the guest
alignment constraints.
Update iotest 177 to intentionally probe block status at
unaligned boundaries as well as with a bytes value that does not
map to 32-bit sectors, which also required tweaking the image
prep to leave an unallocated portion to the image under test.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Now that bdrv_is_allocated accepts non-aligned inputs, we can
remove the TODO added in earlier refactoring.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Now that bdrv_is_allocated accepts non-aligned inputs, we can
remove the TODO added in commit d6a644bb.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Any device that has request_alignment greater than 512 should be
unable to report status at a finer granularity; it may also be
simpler for such devices to be guaranteed that the block layer
has rounded things out to the granularity boundary (the way the
block layer already rounds all other I/O out). Besides, getting
the code correct for super-sector alignment also benefits us
for the fact that our public interface now has byte granularity,
even though none of our drivers have byte-level callbacks.
Add an assertion in blkdebug that proves that the block layer
never requests status of unaligned sections, similar to what it
does on other requests (while still keeping the generic helper
in place for when future patches add a throttle driver). Note
that iotest 177 already covers this (it would fail if you use
just the blkdebug.c hunk without the io.c changes). Meanwhile,
we can drop assertions in callers that no longer have to pass
in sector-aligned addresses.
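The rounding itself is the usual align-down/align-up computation; a sketch assuming a power-of-two request_alignment (invented name, not the io.c code):

#include <stdint.h>

/* Widen [offset, offset + bytes) to the enclosing align-sized region.
 * Assumes offset and bytes are non-negative and align is a power of two. */
static void widen_to_alignment(int64_t offset, int64_t bytes, int64_t align,
                               int64_t *aligned_offset, int64_t *aligned_bytes)
{
    int64_t end = offset + bytes;

    *aligned_offset = offset & ~(align - 1);                 /* round down */
    *aligned_bytes = ((end + align - 1) & ~(align - 1)) - *aligned_offset;
}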
There is a mid-function scope added for 'count' and 'longret',
for a couple of reasons: first, an upcoming patch will add an
'if' statement that checks whether a driver has an old- or
new-style callback, and can conveniently use the same scope for
less indentation churn at that time. Second, since we are
trying to get rid of sector-based computations, wrapping things
in a scope makes it easier to group and see what will be
deleted in a final cleanup patch once all drivers have been
converted to the new-style callback.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In the continuing quest to make more things byte-based, change
the internal iteration of img_compare(). We can finally drop the
TODO assertions added earlier, now that the entire algorithm is
byte-based and no longer has to shift from bytes to sectors.
Most of the change is mechanical ('total_sectors' becomes
'total_size', 'sector_num' becomes 'offset', 'nb_sectors' becomes
'chunk', 'progress_base' goes from sectors to bytes); some of it
is also a cleanup (sectors_to_bytes() is now unused, loss of
variable 'count' added earlier in commit 51b0a488).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In the continuing quest to make more things byte-based, change
the internal iteration of img_rebase(). We can finally drop the
TODO assertion added earlier, now that the entire algorithm is
byte-based and no longer has to shift from bytes to sectors.
Most of the change is mechanical ('num_sectors' becomes 'size',
'sector' becomes 'offset', 'n' goes from sectors to bytes); some
of it is also a cleanup (use of MIN() instead of open-coding,
loss of variable 'count' added earlier in commit d6a644bb).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In the continuing quest to make more things byte-based, change
compare_sectors(), renaming it to compare_buffers() in the
process. Note that one caller (qemu-img compare) only cares
about the first difference, while the other (qemu-img rebase)
cares about how many consecutive sectors have the same
equal/different status; however, this patch does not bother to
micro-optimize the compare case to avoid the comparisons of
sectors beyond the first mismatch. Both callers are always
passing valid buffers in, so the initial check for buffer size
can be turned into an assertion.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Continue on the quest to make more things byte-based instead of
sector-based.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If a read error is encountered during 'qemu-img compare', we
were printing the "Error while reading offset ..." message twice;
this was because our helper function was awkward, printing output
on some but not all paths. Fix it to consistently report errors
on all paths, so that the callers do not risk a redundant message,
and update the testsuite for the improved output.
Further simplify the code by hoisting the conversion from an error
message to an exit code into the helper function, rather than
repeating that logic at all callers (yes, the helper function is
now less generic, but it's a net win in lines of code).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
During 'qemu-img compare', when we are checking that an allocated
portion of one file is all zeros, we don't need to waste time
computing how many additional sectors after the first non-zero
byte are also non-zero. Create a new helper find_nonzero() to do
the check for a first non-zero sector, and rebase
check_empty_sectors() to use it.
The new interface intentionally uses bytes in its interface, even
though it still crawls the buffer a sector at a time; it is robust
to a partial sector at the end of the buffer.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
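A rough sketch of the idea, using invented names rather than the real
helper: scan sector-sized chunks for the first non-zero byte, coping with
a partial sector at the end of the buffer:
#include <stdint.h>
#define EXAMPLE_SECTOR_SIZE 512
/* Return the byte offset of the first sector containing a non-zero byte,
 * or -1 if the whole buffer reads as zero. */
static int64_t example_find_nonzero(const uint8_t *buf, int64_t n)
{
    int64_t i;

    for (i = 0; i < n; i += EXAMPLE_SECTOR_SIZE) {
        int64_t chunk = n - i < EXAMPLE_SECTOR_SIZE ? n - i
                                                    : EXAMPLE_SECTOR_SIZE;
        int64_t j;

        for (j = 0; j < chunk; j++) {
            if (buf[i + j]) {
                return i;
            }
        }
    }
    return -1;
}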
Compare the following images with all-zero contents:
$ truncate --size 1M A
$ qemu-img create -f qcow2 -o preallocation=off B 1G
$ qemu-img create -f qcow2 -o preallocation=metadata C 1G
On my machine, the difference is noticeable for pre-patch speeds,
with more than an order of magnitude in difference caused by the
choice of preallocation in the qcow2 file:
$ time ./qemu-img compare -f raw -F qcow2 A B
Warning: Image size mismatch!
Images are identical.
real 0m0.014s
user 0m0.007s
sys 0m0.007s
$ time ./qemu-img compare -f raw -F qcow2 A C
Warning: Image size mismatch!
Images are identical.
real 0m0.341s
user 0m0.144s
sys 0m0.188s
Why? Because bdrv_is_allocated() returns false for image B but
true for image C, throwing away the fact that both images know
via lseek(SEEK_HOLE) that the entire image still reads as zero.
From there, qemu-img ends up calling bdrv_pread() for every byte
of the tail, instead of quickly looking for the next allocation.
The solution: use block_status instead of is_allocated, giving:
$ time ./qemu-img compare -f raw -F qcow2 A C
Warning: Image size mismatch!
Images are identical.
real 0m0.014s
user 0m0.011s
sys 0m0.003s
which is on par with the speeds for no pre-allocation.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
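A rough sketch of the principle with a hypothetical status callback
(invented names and flags, not QEMU's actual API): when the status query
reports a range as reading zero, the comparison can skip it without
issuing any read:
#include <stdbool.h>
#include <stdint.h>
#define EX_BLOCK_ZERO 0x1   /* range is known to read as zero */
/* hypothetical status callback: fills *pnum, returns flags (or < 0) */
typedef int (*ex_status_fn)(void *opaque, int64_t offset, int64_t bytes,
                            int64_t *pnum);
/* Return true if [offset, offset+bytes) is known to read as zero. */
static bool ex_range_reads_zero(ex_status_fn status, void *opaque,
                                int64_t offset, int64_t bytes)
{
    while (bytes > 0) {
        int64_t pnum;
        int ret = status(opaque, offset, bytes, &pnum);

        if (ret < 0 || pnum == 0 || !(ret & EX_BLOCK_ZERO)) {
            return false;   /* would need an actual read to decide */
        }
        offset += pnum;
        bytes -= pnum;
    }
    return true;
}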
As long as we are querying the status for a chunk smaller than
the known image size, we are guaranteed that a successful return
will have set pnum to a non-zero size (pnum is zero only for
queries beyond the end of the file). Use that to slightly
simplify the calculation of the current chunk size being compared.
Likewise, we don't have to shrink the amount of data operated on
until we know we have to read the file, and therefore have to fit
in the bounds of our buffer. Also, note that 'total_sectors_over'
is equivalent to 'progress_base'.
With these changes in place, sectors_to_process() is now dead code,
and can be removed.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. In the common case, allocation is unlikely to ever use
values that are not naturally sector-aligned, but it is possible
that byte-based values will let us be more precise about allocation
at the end of an unaligned file that can do byte-based access.
Changing the name of the function from bdrv_get_block_status_above()
to bdrv_block_status_above() ensures that the compiler enforces that
all callers are updated. Likewise, since a byte interface allows
an offset mapping that might not be sector aligned, split the mapping
out of the return value and into a pass-by-reference parameter. For
now, the io.c layer still assert()s that all uses are sector-aligned,
but that can be relaxed when a later patch implements byte-based
block status in the drivers.
For the most part this patch is just the addition of scaling at the
callers followed by inverse scaling at bdrv_block_status(), plus
updates for the new split return interface. But some code,
particularly bdrv_block_status(), gets a lot simpler because it no
longer has to mess with sectors. Likewise, mirror code no longer
computes s->granularity >> BDRV_SECTOR_BITS, and can therefore drop
an assertion about alignment because the loop no longer depends on
alignment (never mind that we don't really have a driver that
reports sub-sector alignments, so it's not really possible to test
the effect of sub-sector mirroring). Fix a neighboring assertion to
use is_power_of_2 while there.
For ease of review, bdrv_get_block_status() was tackled separately.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
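A minimal sketch of that scaling glue, with invented names rather than the
real QEMU prototypes: a still sector-based wrapper scales its arguments up
to bytes and scales the answer back down to sectors:
#include <stdint.h>
#define EX_SECTOR_BITS 9
/* Byte-based query (stubbed out here): sets *pnum in bytes, returns flags. */
static int ex_block_status(int64_t offset, int64_t bytes, int64_t *pnum)
{
    *pnum = bytes;              /* pretend the whole range has one status */
    return 0;
}
/* Sector-based wrapper kept for callers that have not been converted yet. */
static int ex_get_block_status(int64_t sector_num, int nb_sectors,
                               int *pnum_sectors)
{
    int64_t pnum_bytes;
    int ret = ex_block_status(sector_num << EX_SECTOR_BITS,
                              (int64_t)nb_sectors << EX_SECTOR_BITS,
                              &pnum_bytes);

    if (ret < 0) {
        *pnum_sectors = 0;
        return ret;
    }
    /* inverse scaling: the byte-based answer is still sector-aligned */
    *pnum_sectors = pnum_bytes >> EX_SECTOR_BITS;
    return ret;
}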
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based. Convert another internal
type (no semantic change), and rename it to match the corresponding
public function rename.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based. Convert another internal
function (no semantic change).
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based. Convert another internal
type (no semantic change), and rename it to match the corresponding
public function rename.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based. Convert another internal
function (no semantic change); and as with its public counterpart,
rename to bdrv_co_block_status() and split the offset return, to
make the compiler enforce that we catch all uses. For now, we
assert that callers and the return value still use aligned data,
but ultimately, this will be the function where we hand off to a
byte-based driver callback, and will eventually need to add logic
to ensure we round calls according to the driver's
request_alignment then touch up the result handed back to the
caller, to start permitting a caller to pass unaligned offsets.
Note that we are now prepared to accept 'bytes' larger than INT_MAX;
this is okay as long as we clamp things internally before violating
any 32-bit limits, and makes no difference to how a client will
use the information (clients looping over the entire file must
already be prepared for consecutive calls to return the same status,
as drivers are already free to return shorter-than-maximal status
due to any other convenient split points, such as when the L2 table
crosses cluster boundaries in qcow2).
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually moving away from sector-based interfaces, towards
byte-based. In the common case, allocation is unlikely to ever use
values that are not naturally sector-aligned, but it is possible
that byte-based values will let us be more precise about allocation
at the end of an unaligned file that can do byte-based access.
Changing the name of the function from bdrv_get_block_status() to
bdrv_block_status() ensures that the compiler enforces that all
callers are updated. For now, the io.c layer still assert()s that
all callers are sector-aligned, but that can be relaxed when a later
patch implements byte-based block status in the drivers.
There was an inherent limitation in returning the offset via the
return value: we only have room for BDRV_BLOCK_OFFSET_MASK bits, which
means an offset can only be mapped for sector-aligned queries (or,
if we declare that non-aligned input is at the same relative position
modulo 512 of the answer), so the new interface also changes things to
return the offset via output through a parameter by reference rather
than mashed into the return value. We'll have some glue code that
munges between the two styles until we finish converting all uses.
For the most part this patch is just the addition of scaling at the
callers followed by inverse scaling at bdrv_block_status(), coupled
with the tweak in calling convention. But some code, particularly
bdrv_is_allocated(), gets a lot simpler because it no longer has to
mess with sectors.
For ease of review, bdrv_get_block_status_above() will be tackled
separately.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based. Continue by converting
an internal function (no semantic change), and simplifying its
caller accordingly.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based. Change the internal
loop iteration of zeroing a device to track by bytes instead of
sectors (although we are still guaranteed that we iterate by steps
that are sector-aligned).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based. Convert another internal
function (no semantic change), and rename it to is_zero() in the
process.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In the process of converting sector-based interfaces to bytes,
I'm finding it easier to represent a byte count as a 64-bit
integer at the block layer (even if we are internally capped
by SIZE_MAX or even INT_MAX for individual transactions, it's
still nicer to not have to worry about truncation/overflow
issues on as many variables). Update the signature of
bdrv_round_to_clusters() to uniformly use int64_t, matching
the signature already chosen for bdrv_is_allocated and the
fact that off_t is also a signed type, then adjust clients
according to the required fallout (even where the result could
now exceed 32 bits, no client is directly assigning the result
into a 32-bit value without breaking things into a loop first).
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
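The int64_t-based contract amounts to something like the following sketch
(invented names; the real helper derives the cluster size from the block
driver's info rather than taking it as a parameter):
#include <stdint.h>
static void example_round_to_clusters(int64_t cluster_size,
                                      int64_t offset, int64_t bytes,
                                      int64_t *aligned_offset,
                                      int64_t *aligned_bytes)
{
    /* assumes cluster_size is a power of two */
    int64_t start = offset & ~(cluster_size - 1);
    int64_t end = (offset + bytes + cluster_size - 1) & ~(cluster_size - 1);

    *aligned_offset = start;
    *aligned_bytes = end - start;
}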
Not all callers care about which BDS owns the mapping for a given
range of the file, or where the zeroes lie within that mapping. In
particular, bdrv_is_allocated() cares more about finding the
largest run of allocated data from the guest perspective, whether
or not that data is consecutive from the host perspective, and
whether or not the data reads as zero. Therefore, doing subsequent
refinements such as checking how much of the format-layer
allocation also satisfies BDRV_BLOCK_ZERO at the protocol layer is
wasted work - in the best case, it just costs extra CPU cycles
during a single bdrv_is_allocated(), but in the worst case, it
results in a smaller *pnum, and forces callers to iterate through
more status probes when visiting the entire file for even more
extra CPU cycles.
This patch only optimizes the block layer (no behavior change when
want_zero is true, but skip unnecessary effort when it is false).
Then when subsequent patches tweak the driver callback to be
byte-based, we can also pass this hint through to the driver.
Tweak BdrvCoGetBlockStatusData to declare arguments in parameter
order, rather than mixing things up (minimizing padding is not
necessary here).
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Not all callers care about which BDS owns the mapping for a given
range of the file. This patch merely simplifies the callers by
consolidating the logic in the common call point, while guaranteeing
a non-NULL file to all the driver callbacks, for no semantic change.
The only caller that does not care about pnum is bdrv_is_allocated,
as invoked by vvfat; we can likewise add assertions that the rest
of the stack does not have to worry about a NULL pnum.
Furthermore, this will also set the stage for a future cleanup: when
a caller does not care about which BDS owns an offset, it would be
nice to allow the driver to optimize things to not have to return
BDRV_BLOCK_OFFSET_VALID in the first place. In the case of fragmented
allocation (for example, it's fairly easy to create a qcow2 image
where consecutive guest addresses are not at consecutive host
addresses), the current contract requires bdrv_get_block_status()
to clamp *pnum to the limit where host addresses are no longer
consecutive, but allowing a NULL file means that *pnum could be
set to the full length of known-allocated data.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This changes test case 191 to include a backing image that has
backing_fmt set in the image file, but is referenced by node name in the
qemu command line.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
When referring to a backing file of an image via node name
bdrv_open_backing_file would add the 'driver' option to the option list
filling it with the backing format driver. This breaks construction of
the backing chain via -blockdev, as bdrv_open_inherit reports an error
if both 'reference' and 'options' are provided.
$ qemu-img create -f raw /tmp/backing.raw 64M
$ qemu-img create -f qcow2 -F raw -b /tmp/backing.raw /tmp/test.qcow2
$ qemu-system-x86_64 \
-blockdev driver=file,filename=/tmp/backing.raw,node-name=backing \
-blockdev driver=qcow2,file.driver=file,file.filename=/tmp/test.qcow2,node-name=root,backing=backing
qemu-system-x86_64: -blockdev driver=qcow2,file.driver=file,file.filename=/tmp/test.qcow2,node-name=root,backing=backing: Could not open backing file: Cannot reference an existing block device with additional options or a new filename
Signed-off-by: Peter Krempa <pkrempa@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Do not require the submodule, but use it if present. Allow the
command-line to override system or git submodule either way.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Merge tpm 2017/10/24 v1
# gpg: Signature made Wed 25 Oct 2017 06:06:55 BST
# gpg: using RSA key 0x75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <stefanb@linux.vnet.ibm.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2017-10-24-1:
tpm: print buffers received from TPM when debugging
vl: remove unnecessary #ifdef CONFIG_TPM
tpm: remove unnecessary #ifdef CONFIG_TPM
tpm: add stubs
tpm: add missing include
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We can get the network interface statistics inside a virtual machine with
the guest-network-get-interfaces command. It is very useful for monitoring
and analyzing network traffic.
Signed-off-by: ZhiPeng Lu <lu.zhipeng@zte.com.cn>
* don't rely on sizeof(wchar[]) for wchar[] indexing
* avoid camelCase variable names
* fix up getline() usage
* condensed commit subject line
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
When a VM is under heavy I/O, if the command "guest-fsfreeze-freeze"
is executed, VSS may time out when trying to hold writes.
Inside the guest, Event ID 12298 (VSS_ERROR_HOLD_WRITES_TIMEOUT)
is logged in the Event Viewer.
At that time, if we call AbortBackup, qga may hang forever.
This patch solves this issue.
Cc: Michael Roth <mdroth@linux.vnet.ibm.com>
Cc: Tomoki Sekiyama <tomoki.sekiyama@gmail.com>
Signed-off-by: Chen Hanxiao <chenhanxiao@gmail.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
TCG patch queue
# gpg: Signature made Wed 25 Oct 2017 10:30:18 BST
# gpg: using RSA key 0x64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth/tags/pull-tcg-20171025: (51 commits)
translate-all: exit from tb_phys_invalidate if qht_remove fails
tcg: Initialize cpu_env generically
tcg: enable multiple TCG contexts in softmmu
tcg: introduce regions to split code_gen_buffer
translate-all: use qemu_protect_rwx/none helpers
osdep: introduce qemu_mprotect_rwx/none
tcg: allocate optimizer temps with tcg_malloc
tcg: distribute profiling counters across TCGContext's
tcg: introduce **tcg_ctxs to keep track of all TCGContext's
gen-icount: fold exitreq_label into TCGContext
tcg: define tcg_init_ctx and make tcg_ctx a pointer
tcg: take tb_ctx out of TCGContext
translate-all: report correct avg host TB size
exec-all: rename tb_free to tb_remove
translate-all: use a binary search tree to track TBs in TBContext
tcg: Remove CF_IGNORE_ICOUNT
tcg: Add CF_LAST_IO + CF_USE_ICOUNT to CF_HASH_MASK
cpu-exec: lookup/generate TB outside exclusive region during step_atomic
tcg: check CF_PARALLEL instead of parallel_cpus
target/sparc: check CF_PARALLEL instead of parallel_cpus
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Even though there is only one monitor, and thus no race on this
global data object, there is also no point in having it. We can
just as well record the decision in the read_memory_function that
we select.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
If configured, prefer this over our rather dated copy of the
GPLv2-only binutils. This will be especially apparent with
the proposed vector extensions to TCG, as disas/i386.c does
not handle AVX.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The Capstone disassembler has its own big-endian fixup.
Doing this twice does not work, of course. Move our current
fixup from target/arm/cpu.c to disas/arm.c.
This makes read_memory_inner_func unused, so it can be removed.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Two or more threads might race while invalidating the same TB. We currently
do not check for this at all despite taking tb_lock, which means we would
wrongly invalidate the same TB more than once. This bug has actually been
hit by users: I recently saw a report on IRC, although I have yet to see
the corresponding test case.
Fix this by using qht_remove as the synchronization point; if it fails,
that means the TB has already been invalidated, and therefore there
is nothing left to do in tb_phys_invalidate.
Note that this solution works now that we still have tb_lock, and will
continue working once we remove tb_lock.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Message-Id: <1508445114-4717-1-git-send-email-cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
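The same "first thread to remove it wins" pattern can be sketched with a
C11 compare-and-swap on a flag; this is only an analogy with invented
names, not the actual fix, which uses qht_remove's return value as the
arbiter:
#include <stdatomic.h>
#include <stdbool.h>
struct ex_tb {
    atomic_bool invalidated;
    /* ... translated-code bookkeeping ... */
};
/* Only the thread that flips the flag first does the actual work. */
static void ex_tb_phys_invalidate(struct ex_tb *tb)
{
    bool expected = false;

    if (!atomic_compare_exchange_strong(&tb->invalidated, &expected, true)) {
        return;     /* another thread already invalidated this TB */
    }
    /* ... unlink from page lists, remove jumps, flush lookup caches ... */
}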
This is identical for each target. So, move the initialization to
common code. Move the variable itself out of tcg_ctx and name it
cpu_env to minimize changes within targets.
This also means we can remove tcg_global_reg_new_{ptr,i32,i64},
since there are no longer global-register temps created by targets.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This enables parallel TCG code generation. However, we do not take
advantage of it yet since tb_lock is still held during tb_gen_code.
In user-mode we use a single TCG context; see the documentation
added to tcg_region_init for the rationale.
Note that targets do not need any conversion: targets initialize a
TCGContext (e.g. defining TCG globals), and after this initialization
has finished, the context is cloned by the vCPU threads, each of
them keeping a separate copy.
TCG threads claim one entry in tcg_ctxs[] by atomically increasing
n_tcg_ctxs. Do not be too annoyed by the subsequent atomic_read's
of that variable and tcg_ctxs; they are there just to play nice with
analysis tools such as thread sanitizer.
Note that we do not allocate an array of contexts (we allocate
an array of pointers instead) because when tcg_context_init
is called, we do not know yet how many contexts we'll use since
the bool behind qemu_tcg_mttcg_enabled() isn't set yet.
Previous patches folded some TCG globals into TCGContext. The non-const
globals remaining are only set at init time, i.e. before the TCG
threads are spawned. Here is a list of these set-at-init-time globals
under tcg/:
Only written by tcg_context_init:
- indirect_reg_alloc_order
- tcg_op_defs
Only written by tcg_target_init (called from tcg_context_init):
- tcg_target_available_regs
- tcg_target_call_clobber_regs
- arm: arm_arch, use_idiv_instructions
- i386: have_cmov, have_bmi1, have_bmi2, have_lzcnt,
have_movbe, have_popcnt
- mips: use_movnz_instructions, use_mips32_instructions,
use_mips32r2_instructions, got_sigill (tcg_target_detect_isa)
- ppc: have_isa_2_06, have_isa_3_00, tb_ret_addr
- s390: tb_ret_addr, s390_facilities
- sparc: qemu_ld_trampoline, qemu_st_trampoline (build_trampolines),
use_vis3_instructions
Only written by tcg_prologue_init:
- 'struct jit_code_entry one_entry'
- aarch64: tb_ret_addr
- arm: tb_ret_addr
- i386: tb_ret_addr, guest_base_flags
- ia64: tb_ret_addr
- mips: tb_ret_addr, bswap32_addr, bswap32u_addr, bswap64_addr
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
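A sketch of the claiming scheme with invented names: each generator thread
atomically bumps a counter to reserve its own slot in an array of context
pointers, so readers can later iterate over [0, n) without locking:
#include <stdatomic.h>
#include <stdlib.h>
#define EX_MAX_CTXS 64
struct ex_ctx { int dummy; };
static _Atomic(struct ex_ctx *) ex_ctxs[EX_MAX_CTXS];
static atomic_uint ex_n_ctxs;
/* Called once per generator thread; returns that thread's private context. */
static struct ex_ctx *ex_claim_context(void)
{
    unsigned slot = atomic_fetch_add(&ex_n_ctxs, 1);
    struct ex_ctx *ctx = calloc(1, sizeof(*ctx));

    if (slot >= EX_MAX_CTXS || !ctx) {
        abort();
    }
    atomic_store(&ex_ctxs[slot], ctx);
    return ctx;
}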
This is groundwork for supporting multiple TCG contexts.
The naive solution here is to split code_gen_buffer statically
among the TCG threads; this however results in poor utilization
if translation needs are different across TCG threads.
What we do here is to add an extra layer of indirection, assigning
regions that act just like pages do in virtual memory allocation.
(BTW if you are wondering about the chosen naming, I did not want
to use blocks or pages because those are already heavily used in QEMU).
We use a global lock to serialize allocations as well as statistics
reporting (we now export the size of the used code_gen_buffer with
tcg_code_size()). Note that for the allocator we could just use
a counter and atomic_inc; however, that would complicate the gathering
of tcg_code_size()-like stats. So given that the region operations are
not a fast path, a lock seems the most reasonable choice.
The effectiveness of this approach is clear after seeing some numbers.
I used the bootup+shutdown of debian-arm with '-tb-size 80' as a benchmark.
Note that I'm evaluating this after enabling per-thread TCG (which
is done by a subsequent commit).
* -smp 1, 1 region (entire buffer):
qemu: flush code_size=83885014 nb_tbs=154739 avg_tb_size=357
qemu: flush code_size=83884902 nb_tbs=153136 avg_tb_size=363
qemu: flush code_size=83885014 nb_tbs=152777 avg_tb_size=364
qemu: flush code_size=83884950 nb_tbs=150057 avg_tb_size=373
qemu: flush code_size=83884998 nb_tbs=150234 avg_tb_size=373
qemu: flush code_size=83885014 nb_tbs=154009 avg_tb_size=360
qemu: flush code_size=83885014 nb_tbs=151007 avg_tb_size=370
qemu: flush code_size=83885014 nb_tbs=151816 avg_tb_size=367
That is, 8 flushes.
* -smp 8, 32 regions (80/32 MB per region) [i.e. this patch]:
qemu: flush code_size=76328008 nb_tbs=141040 avg_tb_size=356
qemu: flush code_size=75366534 nb_tbs=138000 avg_tb_size=361
qemu: flush code_size=76864546 nb_tbs=140653 avg_tb_size=361
qemu: flush code_size=76309084 nb_tbs=135945 avg_tb_size=375
qemu: flush code_size=74581856 nb_tbs=132909 avg_tb_size=375
qemu: flush code_size=73927256 nb_tbs=135616 avg_tb_size=360
qemu: flush code_size=78629426 nb_tbs=142896 avg_tb_size=365
qemu: flush code_size=76667052 nb_tbs=138508 avg_tb_size=368
Again, 8 flushes. Note how buffer utilization is not 100%, but it
is close. Smaller region sizes would yield higher utilization,
but we want region allocation to be rare (it acquires a lock), so
we do not want to go too small.
* -smp 8, static partitioning of 8 regions (10 MB per region):
qemu: flush code_size=21936504 nb_tbs=40570 avg_tb_size=354
qemu: flush code_size=11472174 nb_tbs=20633 avg_tb_size=370
qemu: flush code_size=11603976 nb_tbs=21059 avg_tb_size=365
qemu: flush code_size=23254872 nb_tbs=41243 avg_tb_size=377
qemu: flush code_size=28289496 nb_tbs=52057 avg_tb_size=358
qemu: flush code_size=43605160 nb_tbs=78896 avg_tb_size=367
qemu: flush code_size=45166552 nb_tbs=82158 avg_tb_size=364
qemu: flush code_size=63289640 nb_tbs=116494 avg_tb_size=358
qemu: flush code_size=51389960 nb_tbs=93937 avg_tb_size=362
qemu: flush code_size=59665928 nb_tbs=107063 avg_tb_size=372
qemu: flush code_size=38380824 nb_tbs=68597 avg_tb_size=374
qemu: flush code_size=44884568 nb_tbs=79901 avg_tb_size=376
qemu: flush code_size=50782632 nb_tbs=90681 avg_tb_size=374
qemu: flush code_size=39848888 nb_tbs=71433 avg_tb_size=372
qemu: flush code_size=64708840 nb_tbs=119052 avg_tb_size=359
qemu: flush code_size=49830008 nb_tbs=90992 avg_tb_size=362
qemu: flush code_size=68372408 nb_tbs=123442 avg_tb_size=368
qemu: flush code_size=33555560 nb_tbs=59514 avg_tb_size=378
qemu: flush code_size=44748344 nb_tbs=80974 avg_tb_size=367
qemu: flush code_size=37104248 nb_tbs=67609 avg_tb_size=364
That is, 20 flushes. Note how a static partitioning approach uses
the code buffer poorly, leading to many unnecessary flushes.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
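A simplified sketch of the region idea with invented names (the real code
also tracks per-region usage for statistics and triggers a flush when the
regions run out):
#include <pthread.h>
#include <stddef.h>
struct ex_region_state {
    char *buf;              /* the whole code_gen_buffer-like area  */
    size_t region_size;     /* size of each region                  */
    size_t n_regions;       /* how many regions the buffer holds    */
    size_t next;            /* next unclaimed region index          */
    pthread_mutex_t lock;   /* assumed initialized by the caller    */
};
/* Return a fresh region, or NULL when all regions are in use (at which
 * point the caller would flush translations and start over). */
static void *ex_region_alloc(struct ex_region_state *s)
{
    void *ret = NULL;

    pthread_mutex_lock(&s->lock);
    if (s->next < s->n_regions) {
        ret = s->buf + s->next * s->region_size;
        s->next++;
    }
    pthread_mutex_unlock(&s->lock);
    return ret;
}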
The helpers require the address and size to be page-aligned, so
do that before calling them.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
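The alignment fixup amounts to something like this sketch (invented helper
name; assumes the page size is a power of two):
#include <stddef.h>
#include <stdint.h>
#include <unistd.h>
static void ex_page_align_range(uintptr_t addr, size_t size,
                                uintptr_t *start, size_t *aligned_size)
{
    uintptr_t page = (uintptr_t)sysconf(_SC_PAGESIZE);
    uintptr_t end = (addr + size + page - 1) & ~(page - 1);

    *start = addr & ~(page - 1);
    *aligned_size = end - *start;
}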
Groundwork for supporting multiple TCG contexts.
While at it, also allocate temps_used directly as a bitmap of the
required size, instead of using a bitmap of TCG_MAX_TEMPS via
TCGTempSet.
Performance-wise we lose about 1.12% in a translation-heavy workload
such as booting+shutting down debian-arm:
Performance counter stats for 'taskset -c 0 arm-softmmu/qemu-system-arm \
-machine type=virt -nographic -smp 1 -m 4096 \
-netdev user,id=unet,hostfwd=tcp::2222-:22 \
-device virtio-net-device,netdev=unet \
-drive file=die-on-boot.qcow2,id=myblock,index=0,if=none \
-device virtio-blk-device,drive=myblock \
-kernel kernel.img -append console=ttyAMA0 root=/dev/vda1 \
-name arm,debug-threads=on -smp 1' (10 runs):
exec time (s) Relative slowdown wrt original (%)
---------------------------------------------------------------
original 20.213321616 0.
tcg_malloc 20.441130078 1.1270214
TCGContext 20.477846517 1.3086662
g_malloc 20.780527895 2.8061013
The other two alternatives shown in the table are:
- TCGContext: embed temps[TCG_MAX_TEMPS] and TCGTempSet used_temps
in TCGContext. This is simple enough but it isn't faster than using
tcg_malloc; moreover, it wastes memory.
- g_malloc: allocate/deallocate both temps and used_temps every time
tcg_optimize is executed.
Suggested-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is groundwork for supporting multiple TCG contexts.
To avoid scalability issues when profiling info is enabled, this patch
makes the profiling info counters distributed via the following changes:
1) Consolidate profile info into its own struct, TCGProfile, which
TCGContext also includes. Note that tcg_table_op_count is brought
into TCGProfile after dropping the tcg_ prefix.
2) Iterate over the TCG contexts in the system to obtain the total counts.
This change also requires updating the accessors to TCGProfile fields to
use atomic_read/set whenever there may be conflicting accesses (as defined
in C11) to them.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
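A sketch of the distributed-counter pattern with invented names: each
context updates only its own counters with relaxed atomics, and a reader
sums across all contexts to obtain the totals:
#include <stdatomic.h>
#include <stdint.h>
#define EX_MAX_CTXS 64
struct ex_profile {
    _Atomic uint64_t tb_count;
    _Atomic uint64_t op_count;
};
struct ex_ctx {
    struct ex_profile prof;
    /* ... per-thread translation state ... */
};
static struct ex_ctx *ex_ctxs[EX_MAX_CTXS];
static unsigned ex_n_ctxs;
/* Writer side: a thread only ever touches its own context's counters. */
static void ex_profile_count_tb(struct ex_ctx *ctx)
{
    atomic_fetch_add_explicit(&ctx->prof.tb_count, 1, memory_order_relaxed);
}
/* Reader side: sum over all contexts to get the totals. */
static void ex_profile_totals(uint64_t *tbs, uint64_t *ops)
{
    unsigned i;

    *tbs = 0;
    *ops = 0;
    for (i = 0; i < ex_n_ctxs; i++) {
        *tbs += atomic_load_explicit(&ex_ctxs[i]->prof.tb_count,
                                     memory_order_relaxed);
        *ops += atomic_load_explicit(&ex_ctxs[i]->prof.op_count,
                                     memory_order_relaxed);
    }
}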
Groundwork for supporting multiple TCG contexts.
Note that having n_tcg_ctxs is unnecessary. However, it is
convenient to have it, since it will simplify iterating over the
array: we'll have just a for loop instead of having to iterate
over a NULL-terminated array (which would require n+1 elems)
or having to check with ifdef's for usermode/softmmu.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Groundwork for supporting multiple TCG contexts.
The core of this patch is this change to tcg/tcg.h:
> -extern TCGContext tcg_ctx;
> +extern TCGContext tcg_init_ctx;
> +extern TCGContext *tcg_ctx;
Note that for now we set *tcg_ctx to whatever TCGContext is passed
to tcg_context_init -- in this case &tcg_init_ctx.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Since commit 6e3b2bfd6 ("tcg: allocate TB structs before the
corresponding translated code") we are not fully utilizing
code_gen_buffer for translated code, and therefore are
incorrectly reporting the amount of translated code as well as
the average host TB size. Address this by:
- Making the conscious choice of misreporting the total translated code;
doing otherwise would mislead users into thinking "-tb-size" is not
honoured.
- Expanding tb_tree_stats to accurately count the bytes of translated code on
the host, and using this for reporting the average tb host size,
as well as the expansion ratio.
In the future we might want to consider reporting the accurate numbers for
the total translated code, together with a "bookkeeping/overhead" field to
account for the TB structs.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We don't really free anything in this function anymore; we just remove
the TB from the binary search tree.
Suggested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is a prerequisite for supporting multiple TCG contexts, since
we will have threads generating code in separate regions of
code_gen_buffer.
For this we need a new field (.size) in struct tb_tc to keep
track of the size of the translated code. This field uses a size_t
to avoid adding a hole to the struct, although really an unsigned
int would have been enough.
The comparison function we use is optimized for the common case:
insertions. Profiling shows that upon booting debian-arm, 98%
of comparisons are between existing tb's (i.e. a->size and b->size
are both !0), which happens during insertions (and removals, but
those are rare). The remaining cases are lookups. From reading the glib
sources we see that the first key is always the lookup key. However,
the code does not assume this to always be the case because this
behaviour is not guaranteed in the glib docs. However, we embed
this knowledge in the code as a branch hint for the compiler.
Note that tb_free does not free space in the code_gen_buffer anymore,
since we cannot easily know whether the tb is the last one inserted
in code_gen_buffer. The next patch in this series renames tb_free
to tb_remove to reflect this.
Performance-wise, lookups in tb_find_pc are the same as before:
O(log n). However, insertions are O(log n) instead of O(1), which
results in a small slowdown when booting debian-arm:
Performance counter stats for 'build/arm-softmmu/qemu-system-arm \
-machine type=virt -nographic -smp 1 -m 4096 \
-netdev user,id=unet,hostfwd=tcp::2222-:22 \
-device virtio-net-device,netdev=unet \
-drive file=img/arm/jessie-arm32.qcow2,id=myblock,index=0,if=none \
-device virtio-blk-device,drive=myblock \
-kernel img/arm/aarch32-current-linux-kernel-only.img \
-append console=ttyAMA0 root=/dev/vda1 \
-name arm,debug-threads=on -smp 1' (10 runs):
- Before:
8048.598422 task-clock (msec) # 0.931 CPUs utilized ( +- 0.28% )
16,974 context-switches # 0.002 M/sec ( +- 0.12% )
0 cpu-migrations # 0.000 K/sec
10,125 page-faults # 0.001 M/sec ( +- 1.23% )
35,144,901,879 cycles # 4.367 GHz ( +- 0.14% )
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
65,758,252,643 instructions # 1.87 insns per cycle ( +- 0.33% )
10,871,298,668 branches # 1350.707 M/sec ( +- 0.41% )
192,322,212 branch-misses # 1.77% of all branches ( +- 0.32% )
8.640869419 seconds time elapsed ( +- 0.57% )
- After:
8146.242027 task-clock (msec) # 0.923 CPUs utilized ( +- 1.23% )
17,016 context-switches # 0.002 M/sec ( +- 0.40% )
0 cpu-migrations # 0.000 K/sec
18,769 page-faults # 0.002 M/sec ( +- 0.45% )
35,660,956,120 cycles # 4.378 GHz ( +- 1.22% )
<not supported> stalled-cycles-frontend
<not supported> stalled-cycles-backend
65,095,366,607 instructions # 1.83 insns per cycle ( +- 1.73% )
10,803,480,261 branches # 1326.192 M/sec ( +- 1.95% )
195,601,289 branch-misses # 1.81% of all branches ( +- 0.39% )
8.828660235 seconds time elapsed ( +- 0.38% )
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
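A sketch of such a comparison function with invented names: entries carry
the start of their translated code and its size, a lookup key has size
zero and matches any entry whose range contains the address, and a
GCC-style branch hint marks the insertion case as the likely one:
#include <stddef.h>
#include <stdint.h>
#define ex_likely(x) __builtin_expect(!!(x), 1)
struct ex_tb_tc {
    uintptr_t ptr;   /* start of the translated code */
    size_t size;     /* 0 when used as a lookup key  */
};
static int ex_tb_tc_cmp(const struct ex_tb_tc *a, const struct ex_tb_tc *b)
{
    if (ex_likely(a->size && b->size)) {
        /* common case: comparing two existing entries */
        if (a->ptr == b->ptr) {
            return 0;
        }
        return a->ptr < b->ptr ? -1 : 1;
    }
    /* lookup: one side is a bare address; match if it falls in the range */
    if (a->size == 0) {
        if (a->ptr >= b->ptr && a->ptr < b->ptr + b->size) {
            return 0;
        }
        return a->ptr < b->ptr ? -1 : 1;
    }
    if (b->ptr >= a->ptr && b->ptr < a->ptr + a->size) {
        return 0;
    }
    return a->ptr < b->ptr ? -1 : 1;
}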
Now that we have curr_cflags, we can include CF_USE_ICOUNT
early and then remove it as necessary.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These flags are used by target/*/translate.c,
and affect code generation.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Now that all code generation has been converted to check CF_PARALLEL, we can
generate !CF_PARALLEL code without having yet set !parallel_cpus --
and therefore without having to be in the exclusive region during
cpu_exec_step_atomic.
While at it, merge cpu_exec_step into cpu_exec_step_atomic.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Thereby decoupling the resulting translated code from the current state
of the system.
The tb->cflags field is not passed to tcg generation functions. So
we add a field to TCGContext, storing there a copy of tb->cflags.
Most architectures have <= 32 registers, which results in a 4-byte hole
in TCGContext. Use this hole for the new field.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Thereby decoupling the resulting translated code from the current state
of the system.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Convert all existing readers of tb->cflags to tb_cflags, so that we
use atomic_read and therefore avoid undefined behaviour in C11.
Note that the remaining setters/getters of the field are protected
by tb_lock, and therefore do not need conversion.
Luckily all readers access the field via 'tb->cflags' (so no foo.cflags,
bar->cflags in the code base), which makes the conversion easily
scriptable:
FILES=$(git grep 'tb->cflags' target include/exec/gen-icount.h \
accel/tcg/translator.c | cut -f1 -d':' | sort | uniq)
perl -pi -e 's/([^.>])tb->cflags/$1tb_cflags(tb)/g' $FILES
perl -pi -e 's/([a-z->.]*)(->|\.)tb->cflags/tb_cflags($1$2tb)/g' $FILES
Then manually fixed the few errors that checkpatch reported.
Compile-tested for all targets.
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
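The accessor boils down to something like the following sketch (invented
names; the real helper lives in QEMU's headers and uses QEMU's own atomic
macros):
#include <stdatomic.h>
#include <stdint.h>
struct ex_tb {
    _Atomic uint32_t cflags;
    /* ... */
};
static inline uint32_t ex_tb_cflags(struct ex_tb *tb)
{
    /* a plain read turned into an explicit (relaxed) atomic load */
    return atomic_load_explicit(&tb->cflags, memory_order_relaxed);
}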
We were generating code during tb_invalidate_phys_page_range,
check_watchpoint, cpu_io_recompile, and (seemingly) discarding
the TB, assuming that it would magically be picked up during
the next iteration through the cpu_exec loop.
Instead, record the desired cflags in CPUState so that we request
the proper TB so that there is no more magic.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This will enable us to decouple code translation from the value
of parallel_cpus at any given time. It will also help us minimize
TB flushes when generating code via EXCP_ATOMIC.
Note that the declaration of parallel_cpus is brought to exec-all.h
to be able to define there the "curr_cflags" inline.
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Using the offset of a temporary, relative to TCGContext, rather than
its index means that we don't use 0. That leaves offset 0 free for
a NULL representation without having to leave index 0 unused.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When we used structures for TCGv_*, we needed a macro in order to
perform a comparison. Now that we use pointers, this is just clutter.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The GET and MAKE functions weren't really specific enough.
We now have a full complement of functions that convert exactly
between temporaries, arguments, tcgv pointers, and indices.
The target/sparc change is also a bug fix, which would have affected
a host that defines TCG_TARGET_HAS_extr[lh]_i64_i32, i.e. MIPS64.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Transform TCGv_* to an "argument" or a temporary.
For now, an argument is simply the temporary index.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
While we're touching many of the lines anyway, adjust the naming
of the functions to better distinguish when "TCGArg" vs "TCGTemp"
should be used.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Copy s->nb_globals or s->nb_temps to a local variable for the purposes
of iteration. This should allow the compiler to use low-overhead
looping constructs on some hosts.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This avoids having to allocate external memory for each temporary.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
At the same time, drop the TCGContext argument and use tcg_ctx instead.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This avoids needing to test the index of a temp against nb_globals.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Rather than have a separate buffer of 10*max_ops entries,
give each opcode 10 entries. The result is actually a bit
smaller and should have slightly more cache locality.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Otherwise a file including "sysemu/tpm.h" fails to compile:
In file included from qemu/stubs/tpm.c:2:0:
qemu/include/sysemu/tpm.h:36:19: error: implicit declaration of function ‘object_resolve_path_type’ [-Werror=implicit-function-declaration]
Object *obj = object_resolve_path_type("", TYPE_TPM_TIS, NULL);
^~~~~~~~~~~~~~~~~~~~~~~~
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Richard W.M. Jones <rjones@redhat.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
input: fixes for ui input code and ps/2 keyboard (mostly sysrq key)
# gpg: Signature made Mon 23 Oct 2017 10:19:22 BST
# gpg: using RSA key 0x4CB6D8EED3E87138
# gpg: Good signature from "Gerd Hoffmann (work) <kraxel@redhat.com>"
# gpg: aka "Gerd Hoffmann <gerd@kraxel.org>"
# gpg: aka "Gerd Hoffmann (private) <kraxel@gmail.com>"
# Primary key fingerprint: A032 8CFF B93A 17A7 9901 FE7D 4CB6 D8EE D3E8 7138
* remotes/kraxel/tags/input-20171023-pull-request:
ui: pull in latest keycodemapdb
ui: normalize the 'sysrq' key into the 'print' key
ps2: fix scancodes sent for Ctrl+Pause key combination
ps2: fix scancodess sent for Pause key in AT set 1
ps2: fix scancodes sent for Shift/Ctrl+Print key combination
ps2: fix scancodes sent for Alt-Print key combination (aka SysRq)
ui: use correct union field for key number
ui: fix crash with sendkey and raw key numbers
input: use hex in ps2 keycode trace events
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
OpenRISC SMP patchset 20171021
# gpg: Signature made Fri 20 Oct 2017 22:51:16 BST
# gpg: using RSA key 0xC3B31C2D5E6627E4
# gpg: Good signature from "Stafford Horne <shorne@gmail.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: D9C4 7354 AEF8 6C10 3A25 EFF1 C3B3 1C2D 5E66 27E4
* remotes/shorne/tags/openrisc-20171021-smp-pr:
openrisc: Only kick cpu on timeout, not on update
openrisc: Initial SMP support
openrisc/cputimer: Perparation for Multicore
target/openrisc: Make coreid and numcores variable
openrisc/ompic: Add OpenRISC Multicore PIC (OMPIC)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We now report errors also when we finish migration, not only on info
migrate. We plan to use this error from several places, and we want
the first error that happens to win, so we add a mutex to order it.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
This patch adds the ability to track already received pages; this is
necessary for calculating vCPU block time in the postcopy migration
feature, and for recovery after a postcopy migration failure.
It is also necessary for solving the shared memory issue in postcopy
live migration. Information about received pages will be transferred
to the software virtual bridge (e.g. OVS-VSWITCHD), to avoid fallocate
(unmap) for already received pages. The fallocate syscall is required
for remapped shared memory, because remapping itself blocks
ioctl(UFFDIO_COPY); the ioctl in that case fails with an EEXIST
error (the struct page already exists after the remap).
The bitmap is placed into RAMBlock alongside the other
postcopy/precopy related bitmaps.
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
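A minimal sketch of such a per-block bitmap with invented names: one bit
per target page, set when the page arrives and queried later to skip work
for pages that are already present (error handling omitted):
#include <stdbool.h>
#include <stdint.h>
#include <stdlib.h>
#define EX_BITS_PER_LONG (8 * sizeof(unsigned long))
struct ex_block {
    uint64_t nr_pages;
    unsigned long *receivedmap;   /* one bit per target page */
};
static void ex_block_init_receivedmap(struct ex_block *b)
{
    size_t words = (b->nr_pages + EX_BITS_PER_LONG - 1) / EX_BITS_PER_LONG;

    b->receivedmap = calloc(words, sizeof(unsigned long));
}
static void ex_mark_page_received(struct ex_block *b, uint64_t page)
{
    b->receivedmap[page / EX_BITS_PER_LONG] |=
        1UL << (page % EX_BITS_PER_LONG);
}
static bool ex_page_received(const struct ex_block *b, uint64_t page)
{
    return b->receivedmap[page / EX_BITS_PER_LONG] &
           (1UL << (page % EX_BITS_PER_LONG));
}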
This just places auxiliary operations inside a helper; auxiliary
operations such as tracking received pages and notifying about the
copy operation are added in further patches.
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Rearrange the bitmap initialization and the first sync. While at it,
make sure the locks are taken/released in the correct order (I moved
the RCU unlock up, though it may not affect much).
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Let's further simplify ram_init_all() and ram_save_cleanup() by abstracting
all the XBZRLE-related code into its own functions.
When allocating the XBZRLE cache, we are always very careful about -ENOMEM,
which makes sense. Replace the last g_malloc0() with g_try_malloc0(),
then refactor the logic a bit.
This patch should also fix some memory leaks that occurred in the past
when a memory allocation failed for XBZRLE.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
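A sketch of the allocation pattern with invented names: every buffer comes
from a fallible allocator, and any partial allocation is unwound on
failure so nothing leaks:
#include <glib.h>
#include <stdint.h>
#include <string.h>
struct ex_xbzrle {
    uint8_t *encoded_buf;
    uint8_t *current_buf;
    uint8_t *zero_target_page;
};
static void ex_xbzrle_cleanup(struct ex_xbzrle *x)
{
    g_free(x->encoded_buf);
    g_free(x->current_buf);
    g_free(x->zero_target_page);
    memset(x, 0, sizeof(*x));
}
static int ex_xbzrle_init(struct ex_xbzrle *x, size_t page_size)
{
    memset(x, 0, sizeof(*x));
    x->encoded_buf = g_try_malloc0(page_size);
    x->current_buf = g_try_malloc0(page_size);
    x->zero_target_page = g_try_malloc0(page_size);
    if (!x->encoded_buf || !x->current_buf || !x->zero_target_page) {
        ex_xbzrle_cleanup(x);   /* no leak on partial failure */
        return -1;
    }
    return 0;
}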
There are two mutexes that are created but never destroyed for
RAMState. Fix that.
While we are at it, provide a helper function to clean up RAMState.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
The old ram_state_init() does not really initialize only the RAMState,
but includes lots of other RAM-related setup. Rename it to
ram_init_all(). Instead, provide a real ram_state_init().
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add pause-before-switchover support for postcopy.
After starting postcopy it will transition
active->pre-switchover->postcopy_active
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
If a migration_cancel is issued during the new paused state,
kick the pause_sem to get it to unpause so it can cancel.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Wait for a semaphore before completing the migration,
if the previously added capability was enabled.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Add two statuses for use when the 'pause-before-switchover'
capability is enabled.
'pre-switchover' is the state that we wait in for management
to allow us to continue.
'device' is the state we enter while serialising the devices
after management gives us the OK.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
When 'pause-before-switchover' is enabled, the outgoing migration
will pause before invalidating the block devices and serializing
the device state.
At this point the management layer gets the chance to clean up any
device jobs or other device users before the migration completes.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Once there, take a total size instead of the size of the pages. We
move the check that the new_size is bigger than one page from
xbzrle_cache_resize().
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Fix typo spotted by Peter Xu
It was not used at all since commit:
27af7d6ea5
which replaced its use by the dirty sync count.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Latest keycodemapdb has a fix for Sun keyboard Pause mapping
and backcompat fix for QEMU's treatment of 0xb7 as an alternative
to 0x54 for triggering Print/SysRq
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20171019142848.572-10-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The 'sysrq' key was mistakenly added to QEMU to deal with incorrect handling
of the 'print' key in the ps2 device:
commit f2289cb692
Author: balrog <balrog@c046a42c-6fe2-441c-8c8c-71466251a162>
Date: Wed Jun 4 10:14:16 2008 +0000
Add sysrq to key names known by "sendkey".
Adding sysrq keycode to the table enabling running sysrq debugging in
the guest via the monitor sendkey command, like:
(qemu) sendkey alt-sysrq-t
Tested on x86-64 target and Linux guest.
Signed-off-by: Ryan Harper <ryanh@us.ibm.com>
The ps2 device is now fixed wrt modifiers and the 'print' key. Further the
handling of the 'sysrq' key has some problems of its own, documented in the
previous commit. To cleanup this mess, we convert any use of 'sysrq' into
'print' prior to dispatching the event to device models.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20171019142848.572-9-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The 'Pause' key is special in the AT set 1 / set 2 scancode definitions.
An unmodified 'Pause' key is supposed to send
AT Set 1: e1 1d 45 91 9d c5 (Down) <nothing> (Up)
AT Set 2: e1 14 77 e1 f0 14 f0 77 (Down) <nothing> (Up)
which QEMU gets right. When combined with Ctrl (both left and right variants),
a different sequence is expected
AT Set 1: e0 46 e0 c6 (Down) <nothing> (Up)
AT Set 2: e0 7e e0 f0 73 (Down) <nothing> (Up)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20171019142848.572-8-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The ps2 device was previously fixed to send the special Pause/Print
scancode sequences in:
commit 8c10e0baf0
Author: Hervé Poussineau <hpoussin@reactos.org>
Date: Thu Sep 15 22:06:26 2016 +0200
ps2: use QEMU qcodes instead of scancodes
The sequence used for Pause had a small typo in the AT set 1, with a 0xe1
accidentally changed to 0x91. This is not immediately visible with Linux
guests since they run the ps2 device with AT set 2 scancodes.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20171019142848.572-7-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The 'Print' key is special in the AT set 1 / set 2 scancode definitions.
An unmodified 'Print' key is supposed to send
AT Set 1: e0 2a e0 37 (Down) e0 b7 e0 aa (Up)
AT Set 2: e0 12 e0 7c (Down) e0 f0 7c e0 f0 12 (Up)
which QEMU gets right. When combined with Shift/Ctrl (both left and right
variants), the leading two bytes should be dropped, resulting in
AT Set 1: e0 37 (Down) e0 b7 (Up)
AT Set 2: e0 7c (Down) e0 f0 7c (Up)
This difference is pretty benign, since of all the operating systems I have
checked (Linux, FreeBSD and OpenSolaris), none bother to check the leading two
bytes anyway. This change none the less makes the ps2 device better follow real
hardware behaviour.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20171019142848.572-6-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The 'Print' key is special in the AT set 1 / set 2 scancode definitions.
An unmodified 'Print' key is supposed to send
AT Set 1: e0 2a e0 37 (Down) e0 b7 e0 aa (Up)
AT Set 2: e0 12 e0 7c (Down) e0 f0 7c e0 f0 12 (Up)
which QEMU gets right. When pressed in combination with the 'Alt_L' or 'Alt_R'
keys (which signify SysRq), the scancodes are required to follow a different
scheme. With Alt_L, the expected sequences are
AT set 1: 38, 54 (Down) d4, b8 (Up)
AT set 2: 11, 84 (Down) f0 84, f0 11 (Up)
And with Alt_R
AT set 1: e0 38, 54 (Down) d4, e0 b8 (Up)
AT set 2: e0 11, 84 (Down) f0 84, f0 e0 11 (Up)
It is actually slightly more complicated than that, because (according to the
results of 'showkey -s') keyboards will in fact first release the currently pressed
modifier before sending the sequence above (which effectively re-presses &
then releases the modifier) and finally re-press the original modifier
afterwards. IOW, with Alt_L we need to send
AT set 1: b8, 38, 54 (Down) d4, b8, 38 (Up)
AT set 2: f0 11, 11, 84 (Down) f0 84, f0 11, 11 (Up)
And with Alt_R
AT set 1: e0 b8, e0 38, 54 (Down) d4, e0 b8, e0 38 (Up)
AT set 2: e0 f0 11, e0 11, 84 (Down) f0 84, e0 f0 11, e0 11 (Up)
The AT set 3 scancodes have no special handling for Alt-Print.
Rather than fixing the handling of the 'print' key in the ps2 driver to consider
the Alt modifiers, way back, a patch was commited that defined an extra 'sysrq'
key name:
commit f2289cb692
Author: balrog <balrog@c046a42c-6fe2-441c-8c8c-71466251a162>
Date: Wed Jun 4 10:14:16 2008 +0000
Add sysrq to key names known by "sendkey".
Adding sysrq keycode to the table enabling running sysrq debugging in
the guest via the monitor sendkey command, like:
(qemu) sendkey alt-sysrq-t
Tested on x86-64 target and Linux guest.
Signed-off-by: Ryan Harper <ryanh@us.ibm.com>
With this patch QEMU would send
AT set 1: 38, 54 (Down) d4, b8 (Up)
AT set 2: 11, 84 (Down) f0 84, f0 11 (Up)
but this doesn't match what actual real keyboards send, as it is not releasing
the original modifier & pressing it again afterwards. In addition the original
problem remains, and a new problem was added:
- The sequence 'alt-print-t' is still broken, acting as if 'print-t' was
requested
- The sequence 'sysrq-t' is broken, injecting an undefined scancode sequence
to the guest OS (bare 0x54)
To deal with this mess we make these changes to the ps2 code, so that we track
the state of modifier keys (Alt, Shift, Ctrl - both left & right). Then we can
vary what scancodes are sent for Q_KEY_CODE_PRINT according to the Alt key
modifier state.
Interestingly, it appears that of the operating systems I've checked (Linux,
FreeBSD and OpenSolaris), none of them actually bother to validate the full
sequences for an unmodified 'Print' key. They all just ignore the leading
"e0 2a" and trigger based off "e0 37" alone. The latter two-byte sequence is
what keyboards send when 'Print' is combined with 'Shift' or 'Ctrl' modifiers.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20171019142848.572-5-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The code converting key numbers to QKeyCode in the 'input-send-event'
command mistakenly accessed the key->u.qcode union field instead of
the key->u.number field. This is harmless because the fields use the
same size datatype in both cases, but none the less it should be fixed
to avoid confusion.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20171019142848.572-4-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Previously we enforced that all key events are using QKeyCodes
at the time they are sent:
commit af07e5ff02
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Fri Sep 29 11:12:00 2017 +0100
ui: convert key events to QKeyCodes immediately
This commit forgot to fix the code for the legacy 'sendkey'
command, which still accepts key numbers from the user; these
then need converting to QKeyCodes.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20171019142848.572-3-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
We don't need qemu-keymap when we build only linux-user QEMU.
When we compile in static mode, libxkbcommon is detected
by configure if the shared library is available, but cannot
be linked if the static version is not available.
As we don't need it for qemu-linux-user, and we generally need
a static link to use it in a chroot, disable qemu-keymap in
this case.
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-id: 20171019191606.14129-1-laurent@vivier.eu
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Previously we were kicking the cpu on every update. This caused
problems noticeable in SMP configurations where one CPU got pinned
continuously servicing timer exceptions.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Stafford Horne <shorne@gmail.com>
Wire in ompic and add basic support for SMP. The OpenRISC is special in
that interrupts for devices are routed to each core's PIC. This is
achieved using the qemu_irq_split utility, but this currently limits
OpenRISC to 2 cores.
This models the reference architecture described in the OpenRISC spec
1.2 proposal.
https://github.com/stffrdhrn/doc/raw/arch-1.2-proposal/openrisc-arch-1.2-rev0.pdf
The changes to the initialization of the sim include:
CPU Reset
o Reset each cpu to the bootstrap PC rather than only a single cpu as
done before.
o During Kernel loading the bootstrap PC is saved in a static global.
Network Initialization
o Connect the interrupt to each CPU
o Use the simpler sysbus_mmio_map() rather than memory_region_add_subregion()
Sim Initialization
o Initialize the pic and tick timer per cpu
o Wire in the OMPIC if SMP is enabled
o Wire the serial irq to each CPU using qemu_irq_split()
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Stafford Horne <shorne@gmail.com>
In order to support multicore systems we move some of the previously
static state variables into the state of each core.
On the other hand, in order to allow timers to be synced between each
core, the ttcr (tick timer count register) is moved out of the core.
This is not as per the real hardware spec, which has a separate timer counter
per core, but it seems the simplest way to keep each clock in sync.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Stafford Horne <shorne@gmail.com>
Previously coreid and numcores were hard coded as 0 and 1 respectively
as OpenRISC QEMU did not have multicore support.
Multicore support is now being added so these registers need to have
configured values.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Stafford Horne <shorne@gmail.com>
Add OpenRISC Multicore PIC which handles inter processor interrupts
(IPI) between cores. In OpenRISC all device interrupts are routed to
each core, which keeps this device simple.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Stafford Horne <shorne@gmail.com>
The last big chunk of s390x changes:
- (experimental) smp support under tcg
- provide the virtio-input devices for virtio-ccw
- improve error handling in the css code
- enable some simple virtio tests for s390x
- low-address protection in tcg
- some more cleanups and fixes
# gpg: Signature made Fri 20 Oct 2017 12:49:22 BST
# gpg: using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>"
# gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# gpg: aka "Cornelia Huck <cohuck@kernel.org>"
# gpg: aka "Cornelia Huck <cohuck@redhat.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF
* remotes/cohuck/tags/s390x-20171020: (46 commits)
s390x/tcg: low-address protection support
accel/tcg: allow to invalidate a write TLB entry immediately
tests: Enable the very simple virtio tests on s390x, too
libqtest: Add qtest_[v]startf()
s390x: refactor error handling for MSCH handler
s390x: refactor error handling for HSCH handler
s390x: refactor error handling for CSCH handler
s390x: refactor error handling for XSCH handler
s390x: improve error handling for SSCH and RSCH
s390x/css: IO instr handler ending control
s390x: move s390x_new_cpu() into board code
s390x: fix cpu object referrence leak in s390x_new_cpu()
s390x/event-facility: variable-length event masks
s390x/MAINTAINERS: add mailing list
virtio-ccw: Add the virtio-input devices for CCW bus
target/s390x: special handling when starting a CPU with WAIT PSW
s390x/tcg: refactor stfl(e) to use s390_get_feat_block()
s390x/tcg: unlock NMI
s390x/cpumodel: allow to enable SENSE RUNNING STATUS for qemu
s390x/tcg: switch to new SIGP handling code
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# gpg: Signature made Fri 20 Oct 2017 07:30:45 BST
# gpg: using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6
* remotes/famz/tags/docker-pull-request:
docker: Fix PATH for ccache
docker: fix out-of-tree 'make docker-test-build@debian-powerpc-cross'
docker: allow running from srcdir != builddir build
docker: cleanup temp directory after test
docker: Don't allocate tty unless DEBUG=1
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This is a neat way to implement low address protection, whereby
only the first 512 bytes of the first two pages (each 4096 bytes) of
every address space are protected.
Store a tec of 0 for the access exception; this is what is defined by
Enhanced Suppression on Protection in case of a low address protection
(Bit 61 set to 0, rest undefined).
We have to make sure to pass the access address, not the masked page
address into mmu_translate*().
Drop the check from testblock. So we can properly test this via
kvm-unit-tests.
This will check every access going through one of the MMUs.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171016202358.3633-3-david@redhat.com>
[CH: restored error message for access register mode]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Background: s390x implements Low-Address Protection (LAP). If LAP is
enabled, writing to effective addresses (before any translation)
0-511 and 4096-4607 triggers a protection exception.
So we have subpage protection on the first two pages of every address
space (where the lowcore - the CPU private data - resides).
By immediately invalidating the write entry but allowing the caller to
continue, we force every write access to these first two pages into
the slow path. We will get a TLB fault with the specific accessed
address and can then evaluate whether protection applies or not.
We have to make sure to ignore the invalid bit if tlb_fill() succeeds.
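As a rough standalone illustration of the protected ranges quoted above
(nothing here reflects the actual s390x helper names), the check boils
down to:

#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* LAP protects bytes 0-511 and 4096-4607 of every address space, i.e. the
 * first 512 bytes of each of the first two 4 KiB pages. */
static bool lowcore_protected(uint64_t addr)
{
    return addr < 512 || (addr >= 4096 && addr < 4608);
}

int main(void)
{
    assert(lowcore_protected(0));
    assert(lowcore_protected(511));
    assert(!lowcore_protected(512));
    assert(lowcore_protected(4096));
    assert(lowcore_protected(4607));
    assert(!lowcore_protected(4608));
    return 0;
}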
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171016202358.3633-2-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We have several callers that were formatting the argument strings
themselves; consolidate this effort by adding new convenience
functions directly in libqtest, and update some call-sites that
can benefit from it.
Note that the new functions qtest_startf() and qtest_vstartf()
behave more like qtest_init() (the caller must assign global_qtest
after the fact, rather than getting it implicitly set). This helps
us prepare for future patches that get rid of the global variable,
by explicitly highlighting which tests still depend on it now.
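A minimal sketch of the printf-style wrapper pattern these helpers
consolidate, written here with plain stdarg/vsnprintf (the real libqtest
code presumably uses GLib helpers such as g_strdup_vprintf, as the note
about g_free(args) below suggests; the function names are stand-ins):

#include <stdarg.h>
#include <stdio.h>
#include <stdlib.h>

/* Stub standing in for the existing non-variadic start function. */
static void start_with_args(const char *args)
{
    printf("would start QEMU with: %s\n", args);
}

/* Format the extra QEMU arguments once, then hand the string over. */
static void startf(const char *fmt, ...)
{
    va_list ap;
    char *args;
    int len;

    va_start(ap, fmt);
    len = vsnprintf(NULL, 0, fmt, ap);
    va_end(ap);

    args = malloc(len + 1);
    va_start(ap, fmt);
    vsnprintf(args, len + 1, fmt, ap);
    va_end(ap);

    start_with_args(args);
    free(args);
}

int main(void)
{
    startf("-machine %s -device %s", "pc", "virtio-net-pci");
    return 0;
}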
Signed-off-by: Eric Blake <eblake@redhat.com>
[thuth: Dropped the hunks that do not apply cleanly to qemu master
yet and added the missing g_free(args) in qtest_vstartf()]
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1508336428-20511-2-git-send-email-thuth@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Simplify the error handling of the SSCH and RSCH handler avoiding
arbitrary and cryptic error codes being used to tell how the instruction
is supposed to end. Let the code detecting the condition tell how it's
to be handled in a less ambiguous way. It's best to handle SSCH and RSCH
in one go as the emulation of the two shares a lot of code.
For passthrough this change isn't pure refactoring, but changes the way
kernel reported EFAULT is handled. After clarifying the kernel interface
we decided that EFAULT shall be mapped to unit exception. Same goes for
unexpected error codes and absence of required ORB flags.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Message-Id: <20171017140453.51099-4-pasic@linux.vnet.ibm.com>
Tested-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
[CH: cosmetic changes]
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
CSS code needs to tell the IO instruction handlers located in ioinst.c
how the emulated instruction should be ended. Currently this is done by
returning generic (POSIX) error codes, and mapping them to outcomes like
condition codes. This makes bugs easy to create and hard to recognize.
As a preparation for moving away from (mis)using generic error codes for
flow control let us introduce a type which tells the instruction
handler function how to end the instruction, in a more straight-forward
and less ambiguous way.
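A hypothetical sketch of what such a type can look like; the enumerator
names below are made up for illustration and are not the actual ones
introduced by this patch:

#include <stdio.h>

/* Instead of returning -EINVAL/-EBUSY/... and mapping those to condition
 * codes later, the CSS code returns a value that directly names how the
 * I/O instruction should end. */
typedef enum {
    IOINST_CC_EXPECTED = 0,     /* set cc 0: function initiated */
    IOINST_CC_STATUS_PRESENT,   /* set cc 1 */
    IOINST_CC_BUSY,             /* set cc 2 */
    IOINST_CC_NOT_OPERATIONAL,  /* set cc 3 */
    IOINST_PGM_OPERAND,         /* inject an operand program exception */
} IOInstEndingSketch;

static void end_instruction(IOInstEndingSketch how)
{
    if (how == IOINST_PGM_OPERAND) {
        printf("inject operand program exception\n");
    } else {
        printf("set condition code %d\n", (int)how);
    }
}

int main(void)
{
    end_instruction(IOINST_CC_BUSY);
    return 0;
}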
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Message-Id: <20171017140453.51099-3-pasic@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
[CH: cosmetic changes]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
object_new() returns the cpu with refcnt == 1 and after realize
refcnt == 2*. s390x_new_cpu(), as the owner of the first refcnt,
should have released it on exit in both cases (on error and
success) to avoid leaking it. Do so for both cases.
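The ownership rule is easier to see on a toy refcounted object (this is
only an abstract sketch, not QOM itself):

#include <stdio.h>
#include <stdlib.h>

/* Whoever creates an object owns one reference and must drop it once
 * ownership has been handed over (or on the error path), otherwise the
 * object leaks. */
typedef struct {
    int refcnt;
} Obj;

static Obj *obj_new(void)
{
    Obj *o = calloc(1, sizeof(*o));
    o->refcnt = 1;              /* creator's reference */
    return o;
}

static void obj_ref(Obj *o) { o->refcnt++; }

static void obj_unref(Obj *o)
{
    if (--o->refcnt == 0) {
        free(o);
        printf("object freed\n");
    }
}

/* Stand-in for realize/attach: the "machine" takes its own reference. */
static void attach_to_parent(Obj *o) { obj_ref(o); }

static Obj *new_cpu(void)
{
    Obj *cpu = obj_new();       /* refcnt == 1, owned by us */
    attach_to_parent(cpu);      /* refcnt == 2, parent owns one */
    obj_unref(cpu);             /* drop our reference on every exit path */
    return cpu;                 /* caller uses the parent-held object */
}

int main(void)
{
    Obj *cpu = new_cpu();
    obj_unref(cpu);             /* parent tears it down eventually */
    return 0;
}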
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1508247680-98800-2-git-send-email-imammedo@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
When we try to start a CPU with a WAIT PSW, we have to take care that
TCG will actually try to continue executing instructions.
We must therefore really only unhalt the CPU if we don't have a WAIT
PSW. Also document the special order for restart interrupts, which
load a new PSW and change the state to operating.
To keep KVM working, simply don't have a look at the WAIT bit when
loading the PSW. Otherwise the behavior of a restart interrupt when
a CPU is stopped would be changed.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170928203708.9376-31-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Refactor it to use s390_get_feat_block(). Directly write into the mapped
lowcore with stfl and make sure it is really only compiled if needed.
While at it, add an alignment check for STFLE and avoid
potential_page_fault() by properly restoring the CPU state.
Due to s390_get_feat_block(), we will now also indicate the
"Configuration-z-architectural-mode", which is with new SIGP code the
right thing to do.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170928203708.9376-30-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
This effectively enables experimental SMP support. Floating interrupts are
still a mess, so allow it but print a big warning. There also seems
to be a problem with CPU hotplug (after the main loop started).
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170928203708.9376-27-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
[CH: changed insn-data.def as pointed out by Richard]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Implement them like KVM implements/handles them. Both can only be
triggered via SIGP instructions. RESET has (almost) the lowest priority if
the CPU is running, and the highest if the CPU is STOPPED. This is handled
in SIGP code already. On delivery, we only have to care about the
"CPU running" scenario.
STOP is defined to be delivered after all other interrupts have been
delivered. Therefore it has the actual lowest priority.
As both can wake up a CPU if sleeping, indicate them correctly to
external code (e.g. cpu_has_work()).
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170928203708.9376-25-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We want to use the same code base for TCG, so let's cleanly factor it
out.
The sigp mutex is currently not really needed, as everything is
protected by the iothread mutex. But this could change later, so leave
it in place and initialize it properly from common code.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170928203708.9376-17-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
KVM handles the wait PSW itself and triggers a WAIT ICPT in case it
really wants to sleep (disabled wait).
This will later allow us to change the order of loading a restart
interrupt and setting a CPU to OPERATING on SIGP RESTART without
changing KVM behavior.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170928203708.9376-11-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
If we encounter a WAIT PSW, we have to halt immediately. Using
cpu_loop_exit() at this point feels wrong. Simply leaving
cs->exception_index set doesn't result in an immediate stop.
This is also necessary to properly handle SIGP STOP interrupts later.
The CPU_INTERRUPT_HALT will be processed immediately and properly set
the CPU to halted (also resetting cs->exception_index to EXCP_HLT)
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170928203708.9376-10-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Going to OPERATING here looks wrong. A CPU should never even be
!OPERATING at this point. Unhalting will already be done in
cpu_handle_halt() if there is work, so we can drop this statement
completely.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170928203708.9376-8-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Currently, enabling/disabling of interrupts is not really supported.
Let's improve interrupt handling code by explicitly checking for
deliverable interrupts only. This is the first step. Checking for
external interrupt subclasses will be done next.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170928203708.9376-5-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
There are still some leftovers from old virtio interrupts in there.
Most importantly, we don't have to queue service interrupts anymore.
Just like KVM, we can simply multiplex the SCLP service interrupts and
avoid the queue.
Also, now only valid parameters/cpu_addr will be stored on service
interrupts.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170928203708.9376-3-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
External interrupts are currently all handled like floating external
interrupts, they are queued. Let's prepare for a split of floating
and local interrupts by turning INTERRUPT_EXT into a mask.
While we can have various floating external interrupts of one kind, there
is usually only one (or a fixed number) of the local external interrupts.
So turn INTERRUPT_EXT into a mask and properly indicate the kind of
external interrupt. Floating interrupts will have to be moved out of
one CPU instance later once we have SMP support.
The only floating external interrupts used right now are SERVICE
interrupts, so let's use that name. Following patches will clean up
SERVICE interrupt injection.
This gets rid of the ugly special handling for cpu timer and clock
comparator interrupts. And we really only store the parameters as
defined by the PoP.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170928203708.9376-2-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Calling do_subchannel_work with no function control flags set in SCSW is
a programming error. Currently we handle this differently in
do_subchannel_work_virtual and do_subchannel_work_passthrough. Let's be
consistent and guard with a common assert against this programming error.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Message-Id: <20171004154144.88995-2-pasic@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Merge tpm 2017/10/19 v1
# gpg: Signature made Thu 19 Oct 2017 16:42:39 BST
# gpg: using RSA key 0x75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <stefanb@linux.vnet.ibm.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2017-10-19-1: (21 commits)
tpm: move recv_data_callback to TPM interface
tpm: add a QOM TPM interface
tpm-tis: fold TPMTISEmuState in TPMState
tpm-tis: remove tpm_tis.h header
tpm-tis: move TPMState to TIS header
tpm: remove locty_data from TPMState
tpm-emulator: fix error handling
tpm: add TPMBackendCmd to hold the request state
tpm: remove locty argument from receive_cb
tpm: remove needless cast
tpm: remove unused TPMBackendCmd
tpm: remove configure_tpm() hop
tpm: remove init() class method
tpm: remove TPMDriverOps
tpm: move TPMSizedBuffer to tpm_tis.h
tpm: remove tpm_register_driver()
tpm: replace tpm_get_backend_driver() to drop be_drivers
tpm: lookup tpm backend class in tpm_driver_find_by_type()
tpm: make tpm_get_backend_driver() static
tpm-tis: remove RAISE_STS_IRQ
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
gcc warning:
/qemu/util/oslib-posix.c:304:11: error:
variable ‘addr’ might be clobbered by ‘longjmp’ or ‘vfork’
[-Werror=clobbered]
Also fix some related data types:
numpages and hpagesize are used as pointer offsets.
Always use size_t for them and also for the derived
numpages_per_thread and size_per_thread.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Message-id: 20171016202912.1117-1-sw@weilnetz.de
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Before bcd7f06f57 we sourced /etc/profile,
so the PATH included the right paths to ccache binaries. Now we need to
update $PATH explicitly from the run script.
Keep the old /usr/lib path around just so that, in the future, ccache from
32-bit images will just work.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20171018073841.30062-1-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
The existence of a tty in the container seems to urge gcc into colorizing
the errors, but the escape chars will clutter the report once turned
into email replies on patchew. Move -t to debug mode.
Reported-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20171013011954.9975-1-famz@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
This was introduced by:
commit aef45d51d1
Author: Daniel P. Berrange <berrange@redhat.com>
Date: Fri Sep 29 11:11:56 2017 +0100
build: automatically handle GIT submodule checkout for dtc
On my system, I see the following with a fresh clone:
% ./configure --disable-gtk --target-list=aarch64-softmmu
% make -j8
GEN aarch64-softmmu/config-devices.mak.tmp
GEN config-host.h
mkdir -p dtc/libfdt
GIT ui/keycodemapdb dtc
mkdir -p dtc/tests
GEN qemu-options.def
[snip]
GEN migration/trace.h
make: *** [git-submodule-update] Error 1
make: *** Waiting for unfinished jobs....
Upon closer inspection, the root cause of the error is:
% git submodule update --init ui/keycodemapdb dtc
fatal: destination path 'dtc' already exists and is not an empty directory.
Clone of 'git://git.qemu-project.org/dtc.git' into submodule path 'dtc' failed
This patch fixes this race condition by forcing the 'dtc/%' rule, which
caused 'dtc' to be non-empty, to wait on '.git-submodule-status'.
Signed-off-by: Aaron Lindsay <alindsay@codeaurora.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Acked-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1508352023-28591-1-git-send-email-alindsay@codeaurora.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This will simplify the relationship between backend and interface objects, so the
frontend interface will simply have to implement the TPM QOM interface.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
The previous patch cleaned up error handling a bit, and exposed an
existing bug: error_report_err() could be called with a NULL error.
Instead, make tpm_emulator_set_locality() set the error.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
This is the seabios update for qemu 2.11. Well, almost: seabios is in
freeze for the upcoming 1.11 release. This updates seabios to the current
git master snapshot, and it will be updated again to 1.11 final before
the 2.11 release.
With this two-step approach seabios gets some wider testing before the actual
release, and the update to 1.11 final (which will most likely happen
after the qemu freeze) should contain only bugfix patches.
git shortlog
============
Aleksandr Bezzubikov (3):
pci: refactor pci_find_capapibilty to get bdf as the first argument instead of the whole pci_device
pci: add QEMU-specific PCI capability structure
pci: enable RedHat PCI bridges to reserve additional resources on PCI init
Ben Warren (5):
QEMU DMA: Add DMA write capability
romfile-loader: Switch to using named structs
QEMU fw_cfg: Add command to write back address of file
QEMU fw_cfg: Add functions for accessing files by key
QEMU fw_cfg: Write fw_cfg back on S3 resume
Daniel Verkamp (5):
nvme: support NVMe 1.0 controllers
nvme: extend command timeout to 5 seconds
nvme: fix reversed loop condition in cmd_readwrite
nvme: fix extraction of status code bits
nvme: fix copy-paste mistake in comment
Filippo Sironi (1):
nvme: Use the Maximum Queue Entries Supported (MQES) to initialize I/O queues
Gerd Hoffmann (7):
usb: add hub portmap
usb-xhci: use hub portmap
std: add cp437 to unicode map
kbd: make enqueue_key public, add ascii_to_keycode
romfile: add support for constant files.
paravirt: serial console configuration.
add serial console support
Igor Mammedov (1):
drop "etc/boot-cpus" fw_cfg file and reuse legacy QEMU_CFG_NB_CPUS
Jason Wang (1):
virtio: IOMMU support
Julian Stecklina (2):
block: add NVMe boot support
nvme: fix out of memory behavior
Julius Werner (1):
coreboot: Adapt to upstream CBMEM console changes
Kevin O'Connor (26):
usb: Make usb_time_sigatt variable static
tpm: Add comment banners to tcg.c separating major parts of spec
tpm: Don't call tpm_set_failure() from tpm12_get_capability()
tpm: Move code around in tcgbios.c to keep like code together
acpi: Generalize find_fadt() and find_tcpa_by_rsdp() into find_acpi_table()
tpm: Don't call tpm_build_and_send_cmd() from tpm20_stirrandom()
tpm: Rework tpm_build_and_send_cmd() into tpm_simple_cmd()
ps2port: Disable keyboard/mouse prior to resetting ps2 controller
docs: Note release dates for 1.10.1 and 1.10.2
resume: Don't attempt to use generic reboot mechanisms on QEMU
boot: Increase description size in boot menu
src: Minor - remove tab characters that slipped into SeaBIOS C code
NVMe: Allow NVMe to be enabled on real hardware
smm: Backup and restore A20 on an SMI based mode switch
stacks: Make sure to initialize Call16Data
stacks: Don't update the A20 settings if they haven't changed
stacks: There is no need to disable NMI if it is already disabled
vga: Fix bug in stdvga_get_linesize()
docs: Fix typos in Memory_Model.md
tcgbios: Fix use of unitialized variable
boot: Rename drive_g to drive
disk: Don't require the 'struct drive_s' to be in the f-segment
block: Rename disk_op_s->drive_gf to drive_fl
virtio: Allocate drive_s storage in low memory
xhci: Build TRBs directly in xhci_trb_queue()
xhci: Verify the device is still present in xhci_cmd_submit()
Ladi Prosek (1):
ahci: Set upper 32-bit registers to zero
Patrick Rudolph (4):
SeaVGABios/cbvga: Advertise correct pixel format
SeaVGABIOS/vbe: Query driver for scanline pitch v2
SeaVGABios/cbvga: Use active mode to clear screen
SeaVGABios/cbvga: Advertise compatible VESA modes
Paul Menzel (1):
vgasrc: Increase debug level
Petr Berky (1):
config: Add function to check if fw_cfg exists
Ricardo Ribalda Delgado (1):
serialio: Support for mmap serial ports
Roman Kagan (11):
blockcmd: accept only disks and CD-ROMs
blockcmd: generic SCSI luns enumeration
virtio-scsi: enumerate luns with REPORT LUNS
esp-scsi: enumerate luns with REPORT LUNS
usb-uas: enumerate luns with REPORT LUNS
pvscsi: fix the comment about lun enumeration
mpt-scsi: try to enumerate luns with REPORT LUNS
lsi-scsi: reset in case of a serious problem
lsi-scsi: try to enumerate luns with REPORT LUNS
blockcmd: start REPORT_LUNS with the smallest buffer
Revert "lsi-scsi: reset in case of a serious problem"
Stefan Berger (1):
tpm: Log TPM 2 digest structure in little endian format
Youness Alaoui (1):
nvme: Enable NVMe support for non-qemu hardware
Zeh, Werner (1):
ahci: Disable Native Command Queueing
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Commit 8d93297 introduced a bug whereby non-inbuilt NICs are realized before
the default MAC address is set, causing an assert. Switch NIC creation
over from pci_create_simple() to pci_create(), which works exactly the
same except that it omits the realize step, as originally intended.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
This patch updates the sun4u model to being much closer to a real Ultra 5
by moving devices behind the 2 simba PCI bridges (A and B) as found on real
hardware.
The most noticeable change introduced by this patchset is that in-built devices
are no longer attached to the PCI root bus, but instead behind PCI bridge A.
Along with this the interrupt routing is updated accordingly to match the
official documentation.
Since the existing code currently bypasses the PCI bridge interrupt
swizzling, the interrupt mapping functions are reorganised so that
pci_pbm_map_irq() is used by the PCI bridges and pci_apb_map_irq() is
used by the PCI host bridge.
Behind the sabre PCI host bridge, the PCI IO space now needs to be
split into two separate halves at 0x8000000. Therefore we also setup a new
PCI IO space region of increased size on the PCI host bridge and enable
32-bit PCI IO accesses to allow IO accesses to reach devices behind PCI
bridge B correctly.
As part of this change we also combine the onboard sunhme NIC and the ebus
into a single multi-function device as done on a real Ultra 5. For other
NICs the existing behaviour is preserved, i.e. we initialise them and
place them into the next free slot on PCI bus B.
Finally we mark the physically unavailable slots (plus slot 0 in busA) as
reserved to ensure that users can't plug devices into non-existent slots,
which would break interrupt routing.
Note: since this commit changes PCI topology and interrupt routing, an
updated openbios-sparc64 binary is included with this commit containing the
associated changes to maintain bisectability.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Artyom Tarasenko <atar4qemu@gmail.com>
Logical block size of a SCSI disk should never be larger than
physical block size. From an ATA/SCSI perspective, it makes no sense
to have the logical block size greater than the physical block size,
and it cannot even be effectively expressed in the command set. The
whole point of adding the physical block size to the ATA/SCSI command
set was to communicate a desire for a larger block size (than logical),
while maintaining backwards compatibility with legacy 512 byte block
size.
When setting logical_block_size > physical_block_size, QEMU cannot express
it in READ CAPACITY(16) output, and all it can do is set the physical
block exponent to 0 (i.e. logical_block_size == physical_block_size).
Reporting the error properly, however, is better.
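A minimal sketch of the constraint: READ CAPACITY(16) encodes the physical
block size as a power-of-two exponent relative to the logical block size,
so logical > physical cannot be expressed and should be rejected up front
(illustrative only, not the actual hw/scsi code):

#include <stdint.h>
#include <stdio.h>

/* physical = logical << exponent, so reject logical > physical. */
static int check_block_sizes(uint32_t logical, uint32_t physical)
{
    unsigned exp = 0;

    if (logical > physical) {
        fprintf(stderr,
                "logical_block_size (%u) must not exceed physical_block_size (%u)\n",
                logical, physical);
        return -1;
    }
    while ((logical << exp) < physical) {
        exp++;
    }
    printf("physical block exponent: %u\n", exp);
    return 0;
}

int main(void)
{
    check_block_sizes(512, 4096);   /* exponent 3 */
    check_block_sizes(4096, 512);   /* rejected */
    return 0;
}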
Signed-off-by: Mark Kanda <mark.kanda@oracle.com>
Reviewed-by: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com>
Reviewed-by: Martin K. Petersen <martin.petersen@oracle.com>
Message-Id: <1508185024-5840-1-git-send-email-mark.kanda@oracle.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
DEVICE_DELETED is currently emitted when a Device is unparented, as
opposed to when it is finalized. The main design motivation for this
seems to be that after unparent()/unrealize(), the Device is no
longer visible to the guest, and thus the operation is complete
from the perspective of management.
However, there are cases where remaining host-side cleanup is also
pertinent to management. This is generally handled by treating these
resources as aspects of the "backend", which can be managed via
separate interfaces/events, such as blockdev_add/del, netdev_add/del,
object_add/del, etc, but some devices do not have this level of
compartmentalization, namely vfio-pci, and may not lend themselves
well to it.
In the case of vfio-pci, the "backend" cleanup happens as part of
the finalization of the vfio-pci device itself, in particular the
cleanup of the VFIO group FD. Failing to wait for this cleanup can
result in tools like libvirt attempting to rebind the device to
the host while it's still being used by VFIO, which can result in
host crashes or other misbehavior depending on the host driver.
Deferring DEVICE_DELETED still affords us the ability to manage backends
explicitly, while also addressing cases like vfio-pci's, so we
implement that approach here.
An alternative proposal involving having VFIO emit a separate event
to denote completion of host-side cleanup was discussed, but the
prevailing opinion seems to be that it is not worth the added
complexity, and leaves the issue open for other Device implementations
to solve in the future.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20171016222315.407-4-mdroth@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts commit abed886ec6.
This patch originally addressed an issue where a DEVICE_DELETED
event could be emitted (in device_unparent()) before a Device's
QemuOpts were cleaned up (in device_finalize()), leading to a
"duplicate ID" error if management attempted to immediately add
a device with the same ID in response to the DEVICE_DELETED event.
An alternative will be implemented in a subsequent patch where we
defer the DEVICE_DELETED event until device_finalize(), which would
also prevent the race, so we revert the original fix in preparation.
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20171016222315.407-3-mdroth@linux.vnet.ibm.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
device_unparent(dev, ...) is called when a device is unparented,
either directly, or as a result of a parent device being
finalized, and handles some final cleanup for the device. Part
of this includes emitting a DEVICE_DELETED QMP event to notify
management, which includes the device's path in the composition
tree as provided by object_get_canonical_path().
object_get_canonical_path() assumes the device is still connected
to the machine/root container, and will assert otherwise, but
in some situations this isn't the case:
If the parent is finalized as a result of object_unparent(), it
will still be attached to the composition tree at the time any
children are unparented as a result of that same call to
object_unparent(). However, in some cases, object_unparent()
will complete without finalizing the parent device, due to
lingering references that won't be released till some time later.
One such example is if the parent has MemoryRegion children (which
take a ref on their parent), who in turn have AddressSpace's (which
take a ref on their regions), since those AddressSpaces get cleaned
up asynchronously by the RCU thread.
In this case qdev:device_unparent() may be called for a child Device
that no longer has a path to the root/machine container, causing
object_get_canonical_path() to assert.
Fix this by storing the canonical path during realize() so the
information will still be available for device_unparent() in such
cases.
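The shape of the fix, sketched on a toy device structure (field and
function names here are illustrative, not the real qdev ones):

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Compute and cache the canonical path while the device is still attached
 * (realize), and use the cached copy when emitting DEVICE_DELETED later,
 * when the path may no longer be derivable. */
typedef struct {
    char *canonical_path;
} DeviceSketch;

static char *get_canonical_path(const DeviceSketch *dev)
{
    /* In QEMU this walks the composition tree; here it is a placeholder. */
    (void)dev;
    return strdup("/machine/peripheral/dev0");
}

static void device_realize(DeviceSketch *dev)
{
    dev->canonical_path = get_canonical_path(dev);   /* cache while attached */
}

static void device_unparent(DeviceSketch *dev)
{
    /* Safe even if the device is no longer reachable from the root. */
    printf("DEVICE_DELETED path=%s\n", dev->canonical_path);
    free(dev->canonical_path);
    dev->canonical_path = NULL;
}

int main(void)
{
    DeviceSketch dev = { 0 };
    device_realize(&dev);
    device_unparent(&dev);
    return 0;
}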
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Michael Roth <mdroth@linux.vnet.ibm.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Message-Id: <20171016222315.407-2-mdroth@linux.vnet.ibm.com>
[Clear dev->canonical_path at the post_realize_fail label, which is
cleaner. Suggested by David Gibson. - Paolo]
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
libmultipath has recently changed its API. The new API supports multi-threaded
clients better. Unfortunately there is no backwards-compatibility, so we just
switch to the new one. Running QEMU compiled with the new library on the old
library will likely crash, while doing the opposite will cause QEMU not to
start at all (because udev, get_multipath_config and put_multipath_config
are undefined).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Aligned 8-byte memory writes by a 64-bit target on a 64-bit host should
always turn into atomic 8-byte writes on the host, however a write
watchpoint would end up tearing the 8-byte write into two 4-byte
writes in access_with_adjusted_size().
Reported-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Aligned 8-byte memory writes by a 64-bit target on a 64-bit host should
always turn into atomic 8-byte writes on the host, however if we missed
in the softmmu, and the TLB line was marked as not dirty, then we
would end up tearing the 8-byte write into two 4-byte writes in
access_with_adjusted_size().
Signed-off-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Message-Id: <20171013181913.7556-1-Andrew.Baumann@microsoft.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Attributes are not updated via region_add()/region_del(). Attribute changes
lead to a delete first, followed by a new add.
If this would ever not be the case, we would get an error when trying to
register the new slot.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171016144302.24284-6-david@redhat.com>
Tested-by: Joe Clifford <joeclifford@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It might be confusing for some listener implementations that implement
both region_add and log_start (e.g. KVM), if we call log_start before an
actual region was added using region_add.
This makes current KVM code trigger an assertion
("kvm_section_update_flags: error finding slot"). So let's just reverse
the order instead of tolerating log_start on yet unknown regions.
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20171016144302.24284-2-david@redhat.com>
Tested-by: Joe Clifford <joeclifford@gmail.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The TARGET_MTIOCTOP/TARGET_MTIOCGET/TARGET_MTIOCPOS values
were being defined in terms of host struct types, but
these structures are such that their size might differ
on different hosts. Switch to using a target struct
definition instead.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
This adds the -dfilter support to linux-user. There is a minor
checkpatch complaint about formatting which I've ignored for aesthetic
reasons.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
ppc patch queue 2017-10-17
Here's the currently accumulated set of ppc patches for qemu.
* The biggest set here is the ppc parts of Igor Mammedov's cleanups
to cpu model handling
* The above also includes some generic patches which are required as
prerequisites for the ppc parts. They don't seem to have been
merged by Eduardo yet, so I hope they're ok to include here.
* Apart from that it's basically just assorted bug fixes and cleanups
# gpg: Signature made Tue 17 Oct 2017 05:20:03 BST
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.11-20171017: (34 commits)
spapr_cpu_core: rewrite machine type sanity check
spapr_pci: fail gracefully with non-pseries machine types
spapr: Correct RAM size calculation for HPT resizing
ppc: pnv: consolidate type definitions and batch register them
ppc: pnv: drop PnvChipClass::cpu_model field
ppc: pnv: define core types statically
ppc: pnv: drop PnvCoreClass::cpu_oc field
ppc: pnv: normalize core/chip type names
ppc: pnv: use generic cpu_model parsing
ppc: spapr: use generic cpu_model parsing
ppc: move ppc_cpu_lookup_alias() before its first user
ppc: spapr: use cpu model names as tcg defaults instead of aliases
ppc: spapr: register 'host' core type along with the rest of core types
ppc: spapr: use cpu type name directly
ppc: spapr: define core types statically
ppc: move '-cpu foo,compat=xxx' parsing into ppc_cpu_parse_featurestr()
ppc: spapr: replace ppc_cpu_parse_features() with cpu_parse_cpu_model()
ppc: 40p/prep: replace cpu_model with cpu_type
ppc: virtex-ml507: replace cpu_model with cpu_type
ppc: replace cpu_model with cpu_type on ref405ep,taihu boards
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Merge QIO 2017/10/16 v1
# gpg: Signature made Mon 16 Oct 2017 17:10:54 BST
# gpg: using RSA key 0xBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* remotes/berrange/tags/pull-qio-2017-10-16-1:
io: fix mem leak in websock error path
io: add trace points for websocket HTTP protocol headers
io: cope with websock 'Connection' header having multiple values
io: get rid of bounce buffering in websock write path
io: pass a struct iovec into qio_channel_websock_encode
io: get rid of qio_channel_websock_encode helper method
io: simplify websocket ping reply handling
io: monitor encoutput buffer size from websocket GSource
sockets: Handle race condition between binds to the same port
sockets: factor out create_fast_reuse_socket
sockets: factor out a new try_bind() function
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This fixes a potential data leak to the guest.
# gpg: Signature made Mon 16 Oct 2017 16:08:25 BST
# gpg: using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg: aka "Greg Kurz <groug@free.fr>"
# gpg: aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg: aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg: aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894 DBA2 02FC 3AEB 0101 DBC2
* remotes/gkurz/tags/for-upstream:
9pfs: use g_malloc0 to allocate space for xattr
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
egl_texture_blit() blits a texture, similar to egl_fb_blit() but by
rendering the texture to the screen instead of using a framebuffer blit.
egl_texture_blend() renders a texture with alpha blending; it will be used
to render the cursor to the screen.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20171010135453.6704-6-kraxel@redhat.com
Add vertex shader which flips the texture upside down while blitting it.
Add argument to qemu_gl_run_texture_blit() to enable flipping.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20171010135453.6704-4-kraxel@redhat.com
With the upcoming dmabuf support in qemu there will be more users of the
shaders than just console-gl.c. So rename ConsoleGLState to
QemuGLShader, rename some functions too, move code from console-gl.c to
shaders.c.
No functional change.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20171010135453.6704-3-kraxel@redhat.com
This patch adds support for dma-bufs to the qemu console interfaces.
It adds a new "struct QemuDmaBuf" to represent a dmabuf with associated
metadata (size, format). It adds three functions (and
DisplayChangeListenerOps operations) to set a dma-buf as display
scanout, as cursor and to release a dmabuf.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20171010135453.6704-2-kraxel@redhat.com
Commit "3d90c62548 vga: stop passing pointers to vga_draw_line*
functions" is incomplete. It doesn't handle the case that the vga
rendering code tries to create a shared surface, i.e. a pixman image
backed by vga video memory. That can not work in case the guest display
wraps from end of video memory to the start. So force shadowing in that
case. Also adjust the snapshot region calculation.
Can trigger with cirrus only, when programming vbe modes using the bochs
api (stdvga, also qxl and virtio-vga in vga compat mode) wrap arounds
can't happen.
Fixes: CVE-2017-13672
Fixes: 3d90c62548
Cc: P J P <ppandit@redhat.com>
Reported-by: David Buchanan <d@vidbuchanan.co.uk>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20171010141323.14049-3-kraxel@redhat.com
This makes the code easier to understand and it is consistent with what
we already do for PHBs.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
QEMU currently crashes when the user tries to add an spapr-pci-host-bridge
on a non-pseries machine:
$ qemu-system-ppc64 -M ppce500 -device spapr-pci-host-bridge,index=1
hw/ppc/spapr_pci.c:1535:spapr_phb_realize:
Object 0x1003dacae60 is not an instance of type spapr-machine
Aborted (core dumped)
The same thing happens with the deprecated but still available child type
spapr-pci-vfio-host-bridge.
Fix both by checking the machine type with object_dynamic_cast().
Reviewed-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In order to prevent the guest from forcing the allocation of large amounts
of qemu memory (or host kernel memory, in the case of KVM HV), we limit
the size of the Hashed Page Table (HPT) it is allowed to allocate, based on
its RAM size.
However, the current calculation is not correct: it only adds up the size
of plugged memory, ignoring the base memory size. This patch corrects it.
While we're there, use get_plugged_memory_size() instead of directly
calling pc_existing_dimms_capacity(). The only difference is that it
will abort on failure, which is right: a failure here indicates something
wrong within qemu.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
deduce core type directly from chip type instead of
maintaining type mapping in PnvChipClass::cpu_model.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
pnv core type definition doesn't have any fields that
require it to be defined at runtime. So replace code
that fills in TypeInfo at runtime with a static TypeInfo
array that does the same at compile time.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
deduce cpu type directly from core type instead of
maintaining type mapping in PnvCoreClass::cpu_oc and doing
extra cpu_model parsing in pnv_core_class_init()
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
typically, for cpu/core type names the following convention is used:
new_type_prefix-superclass_typename
make PNV core/chip follow the common convention.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
use common cpu_model parsing in vl.c and set default cpu_model
using generic MachineClass::default_cpu_type.
Besides switching to generic infrastructure it solves several
issues.
* ppc_cpu_class_by_name() is used to deal with lower/upper case
and alias translations into actual cpu type, which fixes
'-M powernv -cpu power8' and '-M powernv -cpu power9_v1.0'
usecases which error out with:
'invalid CPU model 'FOO' for powernv machine'
* allows to switch to lower-case typenames in pnv chip/core name
(by convention typenames should be lower-case)
* replace aliased names /power8, power9, .../ with exact cpu model
names (i.e. typenames should be stable but aliases might decide to
point to another cpu model within the family or be changed by kvm). It will
also help to simplify pnv_chip/core code and get rid of dependency
on cpu_model parsing.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
[dwg: Updated to make DD2.0 as default POWER9 chip]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
use generic cpu_model parsing introduced by
(6063d4c0f vl.c: convert cpu_model to cpu type and set of global properties before machine_init())
it allows us to:
* replace sPAPRMachineClass::tcg_default_cpu with
MachineClass::default_cpu_type
* drop cpu_parse_cpu_model() from hw/ppc/spapr.c and reuse the
one in vl.c
* simplify spapr_get_cpu_core_type() by removing the
no longer needed recursion, since alias lookup
happens earlier in vl.c and spapr_get_cpu_core_type()
works only with the cpu type that results from it.
* spapr no longer needs to parse/depend on the being-phased-out
MachineState::cpu_model; all that parsing is done by generic
code and a target specific callback.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
[dwg: Correct minor compile error]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
the next commit will drop the ppc_cpu_lookup_alias() declaration from the header
and make it static, which will break its last user, ppc_cpu_class_by_name(),
since ppc_cpu_class_by_name() is defined before ppc_cpu_lookup_alias().
To avoid this, move ppc_cpu_lookup_alias() right before
ppc_cpu_class_by_name().
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
spapr core type definition doesn't have any fields that
require it to be defined at runtime. So replace code
that fills in TypeInfo at runtime with a static TypeInfo
array that does the same at compile time.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
there is a dedicated callback, CPUClass::parse_features,
whose purpose is to convert -cpu features into a set of
global properties AND deal with compat/legacy features
that couldn't be directly translated into CPU properties.
Create a ppc variant of it (ppc_cpu_parse_featurestr) and
move 'compat=val' handling from spapr_cpu_core.c into it.
That removes a dependency of board/core code on cpu_model
parsing and would let us reuse the common -cpu parsing
introduced by 6063d4c0.
Set the "max-cpu-compat" property only if it exists; in practice
this should limit the 'compat' hack to the spapr machine and allow
us to avoid including machine/spapr headers in target/ppc/cpu.c
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
DEFINE_TYPES() will help to simplify the following routine patterns:
static void foo_register_types(void)
{
    type_register_static(&foo1_type_info);
    type_register_static(&foo2_type_info);
    ...
}
type_init(foo_register_types)

or

static void foo_register_types(void)
{
    int i;

    for (i = 0; i < ARRAY_SIZE(type_infos); i++) {
        type_register_static(&type_infos[i]);
    }
}
type_init(foo_register_types)
with a single line
DEFINE_TYPES(type_infos)
where the types have a static definition which can be consolidated in
a single array of TypeInfo structures.
It saves us ~6-10 LOC per use case and would help to replace the
imperative foo_register_types() with a declarative style of
type registration.
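One plausible expansion of such a macro is sketched below with standalone
stand-ins; the real QEMU definition may differ in details (for instance in
how it hooks into type_init()):

#include <stdio.h>

/* Minimal stand-ins so the sketch compiles on its own. */
typedef struct { const char *name; const char *parent; } TypeInfoSketch;

static void type_register_static_array(const TypeInfoSketch *infos, int n)
{
    for (int i = 0; i < n; i++) {
        printf("registering type %s\n", infos[i].name);
    }
}

#define ARRAY_SIZE_SKETCH(x) ((int)(sizeof(x) / sizeof((x)[0])))

/* Generate the registration function from the array name. */
#define DEFINE_TYPES_SKETCH(type_array)                              \
    static void register_ ## type_array(void)                        \
    {                                                                \
        type_register_static_array(type_array,                       \
                                   ARRAY_SIZE_SKETCH(type_array));   \
    }
/* QEMU's real macro would additionally emit:
 * type_init(register_ ## type_array) */

static const TypeInfoSketch core_types[] = {
    { "foo-core", "device" },
    { "bar-core", "device" },
};

DEFINE_TYPES_SKETCH(core_types)

int main(void)
{
    register_core_types();   /* QEMU's type_init() would call this for us */
    return 0;
}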
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
LMB removal is completed only when the spapr_lmb_release callback
is called after all DRCs of the dimm are detached. During this
time, it is possible that an unplug request for the same dimm
arrives, trying to detach DRCs that were detached by the guest
in the first unplug_request.
BQL doesn't help in this case - the lock will prevent any concurrent
removal from happening until the end of spapr_memory_unplug_request
only. What happens is that the second unplug_request ends up calling
spapr_drc_detach on a DRC that was detached already, causing an
assert error in spapr_drc_detach (e.g
https://bugs.launchpad.net/qemu/+bug/1718118).
spapr_lmb_release uses a structure called sPAPRDIMMState, stored in the
spapr->pending_dimm_unplugs QTAIL, to track how many LMB DRCs are left
to be detached by the guest. When there are no more DRCs left, this
structure is deleted and the pc-dimm unplug handler is called to
finish the process.
This patch reuses the sPAPRDIMMState to allow unplug_request to know
if there is an ongoing unplug process for a given dimm, aborting the
unplug request in this case, by doing the following changes:
- in spapr_lmb_release callback, move the dimm state removal to the
end, after pc-dimm unplug handler. With this change we can check for
the existence of the dimm state to see if the unplug process is
done.
- use spapr_pending_dimm_unplugs_find in spapr_memory_unplug_request
to check if the dimm state exists. If it does, there is an unplug
operation already in progress for this dimm, meaning that we should
abort the new request and warn the user about it (see the sketch below).
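A simplified sketch of the guard added to the unplug_request path (names
below are stand-ins for the spapr helpers, and the lookup itself is
stubbed out):

#include <stdio.h>

/* If a pending unplug state already exists for this DIMM, a removal is in
 * flight, so warn and bail out instead of detaching DRCs a second time. */
typedef struct { int id; } DimmSketch;

static void *pending_unplug_find(const DimmSketch *dimm)
{
    (void)dimm;
    return NULL;   /* placeholder lookup into spapr->pending_dimm_unplugs */
}

static void memory_unplug_request(const DimmSketch *dimm)
{
    if (pending_unplug_find(dimm)) {
        fprintf(stderr,
                "Memory unplug already in progress for device dimm%d\n",
                dimm->id);
        return;                /* abort the duplicate request */
    }
    /* ... record the pending state and detach the LMB DRCs ... */
    printf("starting unplug of dimm%d\n", dimm->id);
}

int main(void)
{
    DimmSketch d = { 0 };
    memory_unplug_request(&d);
    return 0;
}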
Fixes: https://bugs.launchpad.net/qemu/+bug/1718118
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
For POWER ISA v3.0, the XER bit CA32 needs to be set by the shift
right algebraic instructions whenever the CA bit is to be set. This
change affects the following instructions:
* Shift Right Algebraic Word (sraw[.])
* Shift Right Algebraic Word Immediate (srawi[.])
* Shift Right Algebraic Doubleword (srad[.])
* Shift Right Algebraic Doubleword Immediate (sradi[.])
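A simplified standalone model of the carry computation for srawi, showing
where CA32 mirrors CA on ISA v3.0 (this is not the actual TCG code, and
the shift semantics are only modelled):

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* CA is set when the source is negative and 1-bits are shifted out; on
 * ISA v3.0 CA32 must be set to the same value whenever CA is set by
 * these shifts (CA32 does not exist on older ISAs). */
static void srawi_model(int32_t rs, unsigned sh, bool isa300)
{
    int32_t result = rs >> sh;                      /* arithmetic shift */
    uint32_t lost = (sh == 0) ? 0 : ((uint32_t)rs & ((1u << sh) - 1));
    bool ca = (rs < 0) && lost != 0;
    bool ca32 = isa300 ? ca : false;

    printf("result=%d CA=%d CA32=%d\n", result, (int)ca, (int)ca32);
}

int main(void)
{
    srawi_model(-5, 1, true);   /* shifts out a 1 bit: CA=1, CA32=1 */
    srawi_model(5, 1, true);    /* positive source: CA=0, CA32=0 */
    return 0;
}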
Signed-off-by: Sandipan Das <sandipan@linux.vnet.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
At the moment the only POWER9 model which is listed in qemu is v1.0 (aka
"DD1"). This is a very early (read, buggy) version which will never be
released to the public - it was included in qemu only for the convenience
of those doing bringup on the early silicon. For bonus points, we actually
had its PVR incorrect in the table (0x004e0000 instead of 0x004e0100). We
also never actually implemented the differences in behaviour (read, bugs)
that marked DD1 in qemu.
Now that we know the PVR for the substantially better v2.0 (DD2) chip,
include it and make it the default POWER9 in qemu. For the time being we
leave the DD1 definition in place for the poor souls (read, me) who still
need to work with DD1 hardware.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The CAS buffer is provided by SLOF. A broken SLOF could pass a silly
size: either smaller than the diff header, in which case the current
code will try to allocate 16 Exabytes of memory and g_malloc0() will
abort, or bigger than the maximum memory provisioned for SLOF (ie,
40 Megabytes), which doesn't make sense. Both cases indicate that
SLOF has a bug.
Let's print out an explicit error message and exit since rebooting as
we do with other errors would only result in a reset loop.
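A sketch of the sanity check described above; the header structure below is
only a stand-in for the real CAS/FDT diff header, and the 40 MiB cap is
taken from the figure quoted in this message:

#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in for the FDT diff header carried in the CAS buffer. */
struct cas_header_sketch {
    uint32_t fdt_size;
    uint32_t fdt_offset;
};

#define CAS_MAX_SIZE (40 * 1024 * 1024)

static void check_cas_size(uint64_t size)
{
    if (size < sizeof(struct cas_header_sketch) || size > CAS_MAX_SIZE) {
        fprintf(stderr, "SLOF bug: broken CAS buffer size %" PRIu64 "\n", size);
        exit(1);    /* rebooting would only loop, so give up explicitly */
    }
}

int main(void)
{
    check_cas_size(4096);        /* fine */
    check_cas_size(2);           /* rejected */
    return 0;
}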
Signed-off-by: Greg Kurz <groug@kaod.org>
[dwg: Fix format specifier that broke 32-bit builds]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
We don't have any 460 or 460F CPUs in QEMU, so the init functions
are just dead code. Let's simply remove them (translate_init.c
is already big enough without them).
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The offset of the root node is guaranteed to be 0.
This doesn't fix anything, it's just trivial cleanup of the two
remaining places where this was done under hw/ppc.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit 4f7265f "ppc/ide/macio: Add missing registers" added two extra macio
registers but forgot to add them to the corresponding VMStateDescription.
The version number is bumped accordingly, although this will have little
effect given that the Mac machines are practically unmigratable.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Acked-by: John Snow <jsnow@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When using filter-mirror as in the example below, where the interface
'ndev0' does not exist on the host, QEMU crashes with a segmentation
fault.
$ qemu-system-x86_64 -S -machine pc -netdev user,id=ndev0 -object filter-mirror,id=test-object,netdev=ndev0
This happens because the function filter_mirror_setup() does not check
whether the device actually exists and still keeps on processing, calling
qemu_chr_find(). This patch fixes this issue.
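A standalone sketch of the missing validation (the lookup function below is
a stand-in for qemu_find_netdev(), and the id list is fabricated for the
example):

#include <stdio.h>
#include <string.h>

static const char *known_netdevs[] = { "net0", "net1" };

/* Stand-in lookup: returns non-zero if the netdev id exists. */
static int netdev_exists(const char *id)
{
    for (size_t i = 0; i < sizeof(known_netdevs) / sizeof(known_netdevs[0]); i++) {
        if (strcmp(known_netdevs[i], id) == 0) {
            return 1;
        }
    }
    return 0;
}

static int filter_setup(const char *netdev_id)
{
    if (!netdev_exists(netdev_id)) {
        fprintf(stderr, "invalid netdev id '%s'\n", netdev_id);
        return -1;   /* stop before dereferencing a NULL backend */
    }
    printf("setting up filter on %s\n", netdev_id);
    return 0;
}

int main(void)
{
    filter_setup("ndev0");   /* the failing case from the message above */
    filter_setup("net0");
    return 0;
}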
Signed-off-by: Eduardo Otubo <otubo@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The struct OrIRQState has an unused member field in_irqs.
This is a legacy of earlier versions of the patch; the
code that used it was dropped from the final version of
the code that went into master, but we forgot to delete
the no-longer-used struct field. Do so now.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This fixes a compiler warning:
/qemu/io/channel-websock.c:163:5: error:
function might be possible candidate for ‘gnu_printf’ format attribute
[-Werror=suggest-attribute=format]
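The usual fix for this warning is to annotate the function with a
printf-style format attribute (QEMU wraps this in its GCC_FMT_ATTR()
macro; the raw attribute is shown below as a generic, standalone example):

#include <stdarg.h>
#include <stdio.h>

/* Tell the compiler which argument carries the format string and where the
 * variadic values start, so it can check callers and stop suggesting the
 * attribute. */
static void log_msg(const char *fmt, ...) __attribute__((format(printf, 1, 2)));

static void log_msg(const char *fmt, ...)
{
    va_list ap;

    va_start(ap, fmt);
    vfprintf(stderr, fmt, ap);
    va_end(ap);
}

int main(void)
{
    log_msg("handshake failed: %s\n", "bad header");
    return 0;
}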
Signed-off-by: Stefan Weil <sw@weilnetz.de>
Acked-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Comments explaining why we include a header tend to go bad. This
one's almost comical: not only doesn't qemu-options.hx use
MAP_POPULATE anymore (since commit ef36fa1, v2.0.0, 2013), even the
include it applies to got moved away in commit 02d0e09 (v2.7.0).
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The header file was introduced by fbcc3e5 ("qemu-thread: optimize QemuLockCnt
with futexes on Linux", 2017-01-16) without header guards. Add them.
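For reference, the standard include-guard pattern looks like this (the
guard macro name here is hypothetical; the real one would follow the
header's path):

#ifndef QEMU_EXAMPLE_HEADER_H
#define QEMU_EXAMPLE_HEADER_H

/* declarations go here */

#endif /* QEMU_EXAMPLE_HEADER_H */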
Signed-off-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Remove trailing whitespace in qemu-doc.texi, as it causes
reproducibility issues depending on the echo implementation
used by the Makefile.
Reported-By: Vagrant Cascadian <vagrant@debian.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Some m68k, qtest and config improvements
# gpg: Signature made Mon 16 Oct 2017 13:38:03 BST
# gpg: using RSA key 0x2ED9D774FE702DB5
# gpg: Good signature from "Thomas Huth <th.huth@gmx.de>"
# gpg: aka "Thomas Huth <thuth@redhat.com>"
# gpg: aka "Thomas Huth <huth@tuxfamily.org>"
# gpg: aka "Thomas Huth <th.huth@posteo.de>"
# Primary key fingerprint: 27B8 8847 EEE0 2501 18F3 EAB9 2ED9 D774 FE70 2DB5
* remotes/huth/tags/pull-request-2017-10-16:
default-configs: Enable CONFIG_VMXNET3_PCI only on x86
tests/prom-env: Bump the timeout, and test pseries only in slow mode
tests: use g_new() family of functions
M68K: use g_new() family of functions
hw/m68k: Replace fprintf(stderr, "*\n" with error_report()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
pc, pci, virtio: fixes, features
A bunch of fixes all over the place.
A new vmcore device - the user interface around it is still somewhat
controversial, but I feel most of the code is fine, suggestions can be
addressed by adding patches on top.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Sun 15 Oct 2017 04:02:23 BST
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream: (26 commits)
tests/pxe: Test more NICs when running in SPEED=slow mode
pc: remove useless hot_add_cpu initialisation
isapc: Remove unnecessary migration compatibility code
virtio-pci: Replace modern_as with direct access to modern_bar
virtio: fix descriptor counting in virtqueue_pop
hw/gen_pcie_root_port: make IO RO 0 on IO disabled
pci: Validate interfaces on base_class_init
xen/pt: Mark TYPE_XEN_PT_DEVICE as hybrid
pci: Add INTERFACE_CONVENTIONAL_PCI_DEVICE to Conventional PCI devices
pci: Add INTERFACE_PCIE_DEVICE to all PCIe devices
pci: Add interface names to hybrid PCI devices
pci: conventional-pci-device and pci-express-device interfaces
PCI: PCIe access should always be little endian
virtio/pci/migration: Convert to VMState
hw/pci-bridge/pcie_pci_bridge: properly handle MSI unavailability case
pci: allow 32-bit PCI IO accesses to pass through the PCI bridge
virtio/vhost: reset dev->log after syncing
MAINTAINERS: add Dump maintainers
scripts/dump-guest-memory.py: add vmcoreinfo
kdump: set vmcoreinfo location
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Besides being more correct, arbitrarily long instructions allow the
generation of a translation block that spans three pages. This
confuses the generator and even allows ring 3 code to poison the
translation block cache and inject code into other processes that are
in guest ring 3.
This is an improved (and more invasive) fix for commit 30663fd ("tcg/i386:
Check the size of instruction being translated", 2017-03-24). In addition
to being more precise (and generating the right exception, which is #GP
rather than #UD), it distinguishes better between page faults and too long
instructions, as shown by this test case:
#include <sys/mman.h>
#include <string.h>
#include <stdio.h>
int main()
{
    /* one RWX page of 0x66 prefixes followed by a page holding nop + ret */
    char *x = mmap(NULL, 8192, PROT_READ|PROT_WRITE|PROT_EXEC,
                   MAP_PRIVATE|MAP_ANON, -1, 0);
    memset(x, 0x66, 4096);              /* 0x66 = operand-size prefix */
    x[4096] = 0x90;                     /* nop */
    x[4097] = 0xc3;                     /* ret */
    char *i = x + 4096 - 15;            /* 15 prefix bytes before the boundary */
    mprotect(x + 4096, 4096, PROT_READ|PROT_WRITE);
    ((void(*)(void)) i) ();
}
... which produces a #GP without the mprotect, and a #PF with it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
These take care of advancing s->pc, and will provide a unified point
at which to check for the 15-byte instruction length limit.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This should be done by all targets and, since commit 53f6672bcf
("gen-icount: use tcg_ctx.tcg_env instead of cpu_env", 2017-06-30),
is causing the NIOS2 target to hang.
This is because the test for "should I exit to the main loop"
was being done with the correct offset to the icount decrementer,
but using TCG temporary 0 (the frame pointer) rather than the
env pointer.
Cc: qemu-stable@nongnu.org
Cc: Marek Vasut <marex@denx.de>
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It is used by all PPC targets; we can give the directory its own
Makefile.objs file, and include it directly from target/ppc.
target/s390 can do the same when it starts using it.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Coverity pointed out that 'date' is not free()d in the error
path.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The noVNC server sends a header "Connection: keep-alive, Upgrade" which
fails our simple equality test. Split the header on ',', trim whitespace,
and then check for the 'upgrade' token.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
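A hedged sketch of the described matching using GLib string helpers; the function name is made up and the real parsing lives in the websocket handshake code.

#include <glib.h>
#include <stdbool.h>

static bool connection_header_has_upgrade(const char *value)
{
    gchar **tokens = g_strsplit(value, ",", 0);
    bool found = false;

    for (gchar **tok = tokens; *tok != NULL; tok++) {
        /* trim whitespace around each token and compare case-insensitively */
        if (g_ascii_strcasecmp(g_strstrip(*tok), "upgrade") == 0) {
            found = true;
            break;
        }
    }
    g_strfreev(tokens);
    return found;
}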
Currently most outbound I/O on the websock channel gets copied into the
rawoutput buffer, and then immediately copied again into the encoutput
buffer, with a header prepended. Now that qio_channel_websock_encode
accepts a struct iovec, we can trivially remove this bounce buffering
and write directly to encoutput.
In doing so, we also now correctly validate the encoutput size against
the QIO_CHANNEL_WEBSOCK_MAX_BUFFER limit.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Instead of requiring use of another Buffer, pass a struct iovec
into qio_channel_websock_encode, which gives callers more
flexibility in how they process data.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The qio_channel_websock_encode method is only used in one place;
everything else calls qio_channel_websock_encode_buffer directly.
It can also be pushed up a level into the qio_channel_websock_writev
method, since every other caller of qio_channel_websock_write_wire
has already filled encoutput.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
We must ensure we don't get flooded with ping replies if the outbound
channel is slow. Currently we do this by keeping the ping reply in a
separate temporary buffer and only writing it if the encoutput buffer
is completely empty. This is overly pessimistic, as it is reasonable
to add a ping reply to the encoutput buffer even if it has previous
data in it, as long as that previous data doesn't include a ping
reply.
To track this better, put the ping reply directly into the encoutput
buffer, and then record the size of encoutput at this time in
pong_remain. As we write encoutput to the underlying channel, we
can decrement the pong_remain counter. Once it hits zero, we can
accept further ping replies for transmission.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
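Roughly, the accounting works as sketched below; the state struct and the buffer_append_pong_frame() helper are hypothetical stand-ins, not the real io/channel-websock.c identifiers.

#include <stddef.h>
#include <stdint.h>
#include "qemu/buffer.h"

/* made-up helper that appends a websocket pong frame to a Buffer */
void buffer_append_pong_frame(Buffer *buf, const uint8_t *payload, size_t len);

typedef struct WebsockStateSketch {
    Buffer encoutput;       /* bytes queued for the wire */
    size_t pong_remain;     /* bytes to flush before another pong is allowed */
} WebsockStateSketch;

static void websock_queue_pong(WebsockStateSketch *s,
                               const uint8_t *payload, size_t len)
{
    if (s->pong_remain) {
        return;                            /* a pong is already pending */
    }
    buffer_append_pong_frame(&s->encoutput, payload, len);
    s->pong_remain = s->encoutput.offset;  /* snapshot of what must drain first */
}

static void websock_account_write(WebsockStateSketch *s, size_t written)
{
    /* once this reaches zero, new ping replies are accepted again */
    s->pong_remain -= (written < s->pong_remain) ? written : s->pong_remain;
}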
The websocket GSource is monitoring the size of the rawoutput
buffer to determine if the channel can accept more writes.
The rawoutput buffer, however, is merely a temporary staging
buffer before data is copied into the encoutput buffer. Thus
its size will always be zero when the GSource runs.
This flaw causes the encoutput buffer to grow without bound
if the other end of the underlying data channel doesn't
read data being sent. This can be seen with VNC if a client
is on a slow WAN link and the guest OS is sending many screen
updates. A malicious VNC client can act like it is on a slow
link by playing a video in the guest and then reading data
very slowly, causing QEMU host memory to expand arbitrarily.
This issue is assigned CVE-2017-15268, publicly reported in
https://bugs.launchpad.net/qemu/+bug/1718964
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
If an offset of ports is specified to the inet_listen_saddr() function,
and two or more processes try to bind to these ports at the same time,
occasionally more than one process may be able to bind to the same
port. The condition is detected by listen(), but too late to avoid a failure.
This function is called by socket_listen() and used
by all socket listening code in QEMU, so all cases where any form of dynamic
port selection is used are subject to this issue.
Add code to close and re-establish the socket when this
condition is observed, hiding the race condition from the user.
Also clean up some issues with error handling to allow more
accurate reporting of the cause of an error.
This has been developed and tested by means of the
test-listen unit test in the previous commit.
Enable the test for make check now that it passes.
Reviewed-by: Bhavesh Davda <bhavesh.davda@oracle.com>
Reviewed-by: Yuval Shaia <yuval.shaia@oracle.com>
Reviewed-by: Girish Moodalbail <girish.moodalbail@oracle.com>
Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
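The retry idea, shown as a generic POSIX sketch rather than the actual util/qemu-sockets.c code:

#include <errno.h>
#include <unistd.h>
#include <sys/socket.h>

static int bind_and_listen_retry(struct sockaddr *sa, socklen_t salen, int backlog)
{
    for (;;) {
        int err;
        int fd = socket(sa->sa_family, SOCK_STREAM, 0);

        if (fd < 0) {
            return -errno;
        }
        if (bind(fd, sa, salen) < 0) {
            err = -errno;          /* port genuinely in use: caller picks another */
            close(fd);
            return err;
        }
        if (listen(fd, backlog) == 0) {
            return fd;             /* success */
        }
        err = -errno;
        close(fd);
        if (err != -EADDRINUSE) {
            return err;
        }
        /* another process won the race between our bind() and listen():
         * close and re-establish the socket, hiding the race from the caller */
    }
}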
Another refactoring step to prepare for fixing the problem
exposed with the test-listen test in the previous commit.
Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
A refactoring step to prepare for the problem
exposed by the test-listen test in the previous commit.
Simplify and reorganize the IPv6-specific extra
measures and move them out of the for loop to increase
code readability. No semantic changes.
Signed-off-by: Knut Omang <knut.omang@oracle.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
nbd patches for 2017-10-14
- Marc-André Lureau - NBD: use g_new() family of functions
- Vladimir Sementsov-Ogievskiy - first half of 00/13 nbd minimal structured read
# gpg: Signature made Sun 15 Oct 2017 01:38:47 BST
# gpg: using RSA key 0xA7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>"
# gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>"
# gpg: aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A
* remotes/ericb/tags/pull-nbd-2017-10-14:
nbd: header constants indenting
nbd/server: simplify reply transmission
nbd/server: refactor nbd_co_send_simple_reply parameters
nbd/server: do not use NBDReply structure
nbd/server: structurize simple reply header sending
nbd: rename some simple-request related objects to be _simple_
block/nbd-client: refactor nbd_co_receive_reply
block/nbd-client: assert qiov len once in nbd_co_request
NBD: use g_new() family of functions
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We were defining TARGET_FS_IOC_GETFLAGS and TARGET_FS_IOC_SETFLAGS
using the host 'long' type in the size field, which meant that
they had the wrong values if the host and guest had different
sized longs. Switch to abi_long instead.
This fixes a bug where these ioctls don't work on 32-bit guests
on 64-bit hosts (and makes the LTP test 'setxattr03' pass
where it did not previously.)
Reported-by: pgndev <pgnet.dev@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
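The shape of the change, hedged since the exact spelling in linux-user/syscall_defs.h may differ:

/* before: the host 'long' leaked into the ioctl size field, so the encoded
 * value was wrong whenever host and guest long sizes differ
 *   #define TARGET_FS_IOC_GETFLAGS TARGET_IOR('f', 1, long)
 * after: use the guest-sized type instead
 */
#define TARGET_FS_IOC_GETFLAGS TARGET_IOR('f', 1, abi_long)
#define TARGET_FS_IOC_SETFLAGS TARGET_IOW('f', 2, abi_long)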
We had a check using TARGET_VIRT_ADDR_SPACE_BITS to make sure
that the allocation coming in from the command-line option was
not too large, but that didn't include target-specific knowledge
about other restrictions on user-space.
Remove several target-specific hacks in linux-user/main.c.
For MIPS and Nios, we can replace them with proper adjustments
to the respective target's TARGET_VIRT_ADDR_SPACE_BITS definition.
For ARM, we had no existing ifdef but I suspect that the current
default value of 0xf7000000 was chosen with this in mind. Define
a workable value in linux-user/arm/, and also document why the
special case is required.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20170708025030.15845-3-rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Most of the users of page_set_flags pass (page, page + len) as
the end points. One might consider this an error, since the other
users do supply an endpoint as the last byte of the region.
However, the first thing that page_set_flags does is round end UP
to the start of the next page. Which means computing page + len - 1
is in the end pointless. Therefore, accept this usage and do not
assert when given the exact size of the vm as the endpoint.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20170708025030.15845-2-rth@twiddle.net>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
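The detail being relied on, in simplified form (the real page_set_flags() does much more):

void page_set_flags(target_ulong start, target_ulong end, int flags)
{
    start = start & TARGET_PAGE_MASK;
    end = TARGET_PAGE_ALIGN(end);   /* round 'end' UP to the next page boundary */

    /* ... so passing page + len or page + len - 1 ends up equivalent, and an
     * endpoint equal to the exact size of the vm no longer trips an assert */
}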
The 32-bit ARM validate_guest_space() check tests whether the
specified -R value leaves enough space for us to put the
commpage in at 0xffff0f00. However it was incorrectly doing
a <= check for the check against (guest_base + guest_size),
which meant that it wasn't permitting the guest space to
butt right up against the commpage.
Fix the comparison, so that -R values all the way up to 0xffff0000
work correctly.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Riku Voipio <riku.voipio@linaro.org>
Since O_TMPFILE might differ between guest and host,
add it to the bitmask_transtbl. While at it, fix the definitions
of O_DIRECTORY etc. for arm32 according to kernel sources.
This fixes the open14 and openat03 LTP testcases.
Fixes: https://bugs.launchpad.net/qemu/+bug/1709170
gd_gl_area_scanout_texture() must destroy the framebuffer when there is
no texture id, not when there is no framebuffer id.
The effect was a black screen with the "-vga virtio -display gtk,gl=on"
options.
The bug was introduced by a4f113fd "gtk: use framebuffer helper functions."
Signed-off-by: Anthoine Bourgeois <anthoine.bourgeois@blade-group.com>
Message-id: 20171002124052.13829-1-anthoine.bourgeois@gmail.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Always use QKeyCode in the InputKeyEvent struct, by converting key
numbers to QKeyCode at the time the event is created. This allows
the code processing / consuming key events to assume QKeyCode is
used. The only place we accept a key number in the InputKeyEvent
struct is with QMP commands sent by the user.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170929101201.21039-6-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Replace the number_to_qcode, qcode_to_number and linux_to_qcode
tables with automatically generated tables.
Missing entries in linux_to_qcode now fixed:
KEY_LINEFEED -> Q_KEY_CODE_LF
KEY_KPEQUAL -> Q_KEY_CODE_KP_EQUALS
KEY_COMPOSE -> Q_KEY_CODE_COMPOSE
KEY_AGAIN -> Q_KEY_CODE_AGAIN
KEY_PROPS -> Q_KEY_CODE_PROPS
KEY_UNDO -> Q_KEY_CODE_UNDO
KEY_FRONT -> Q_KEY_CODE_FRONT
KEY_COPY -> Q_KEY_CODE_COPY
KEY_OPEN -> Q_KEY_CODE_OPEN
KEY_PASTE -> Q_KEY_CODE_PASTE
KEY_CUT -> Q_KEY_CODE_CUT
KEY_HELP -> Q_KEY_CODE_HELP
KEY_MEDIA -> Q_KEY_CODE_MEDIASELECT
In addition, some fixes:
- KEY_PLAYPAUSE now maps to Q_KEY_CODE_AUDIOPLAY, instead of
KEY_PLAYCD. KEY_PLAYPAUSE is defined across almost all scancodes
sets, while KEY_PLAYCD only appears in AT set1, so the former is
a more useful mapping.
Missing entries in qcode_to_number now fixed:
Q_KEY_CODE_AGAIN -> 0x85
Q_KEY_CODE_PROPS -> 0x86
Q_KEY_CODE_UNDO -> 0x87
Q_KEY_CODE_FRONT -> 0x8c
Q_KEY_CODE_COPY -> 0xf8
Q_KEY_CODE_OPEN -> 0x64
Q_KEY_CODE_PASTE -> 0x65
Q_KEY_CODE_CUT -> 0xbc
Q_KEY_CODE_LF -> 0x5b
Q_KEY_CODE_HELP -> 0xf5
Q_KEY_CODE_COMPOSE -> 0xdd
Q_KEY_CODE_KP_EQUALS -> 0x59
Q_KEY_CODE_MEDIASELECT -> 0xed
In addition, some fixes:
- Q_KEY_CODE_MENU was incorrectly mapped to the compose
scancode (0xdd) and is now mapped to 0x9e
- Q_KEY_CODE_FIND was mapped to 0xe065 (Search) instead
of to 0xe041 (Find)
- Q_KEY_CODE_HIRAGANA was mapped to 0x70 (Katakanahiragana)
instead of 0x77 (Hiragana)
- Q_KEY_CODE_PRINT was mapped to 0xb7, which is not a defined
scan code in AT set 1; it is now mapped to 0x54 (sysrq)
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170929101201.21039-5-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
The https://gitlab.com/keycodemap/keycodemapdb/ repo contains a
data file mapping between all the different scancode/keycode/keysym
sets that are known, and a tool to auto-generate lookup tables for
different combinations.
It is used by GTK-VNC, SPICE-GTK and libvirt for mapping keys.
Using it in QEMU will let us replace many hand written lookup
tables with auto-generated tables from a master data source,
reducing bugs. Adding new QKeyCodes will now only require the
master table to be updated, all ~20 other tables will be
automatically updated to follow.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170929101201.21039-4-berrange@redhat.com
[ kraxel: fix build ]
[ kraxel: switch repo to qemu.git mirror ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
When building the tarball to pass into the docker/vm test image,
the code relies on the git submodules being checked out in the
main checkout.
i.e. if the developer has not run 'git submodule update --init dtc'
many of the docker tests will fail due to the libfdt package not
being present in the test images. Patchew manually checks out the
dtc submodule in the main git checkout, but this is a bad idea.
When running tests we want to have a predictable set of submodules
included in the source that's tested. The build environment is
completely independent of the developers host OS, so the submodules
the developer has checked out should not be considered relevant for
the tests.
This changes the archive-source.sh script so that it clones the
current git checkout into a temporary directory, checks out a
fixed set of submodules, builds the tarball and finally removes
the temporary git clone.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170929101201.21039-3-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Currently if DTC is required by configure and not available in the host
OS install, we exit with an error message telling the user to check out a
git submodule or install the library.
This introduces automatic handling of the git submodule checkout process
and enables it for dtc. This only runs if building from GIT, so users of
release tarballs still need the system library install. The current state
of the git checkout is stashed in .git-submodule-status, and a helper
program is used to determine if this state matches the desired submodule
state. A dependency against 'Makefile' ensures that the submodule state
is refreshed at the start of the build process.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170929101201.21039-2-berrange@redhat.com
[ kraxel: use /bin/sh not bash for scripts/git-submodule.sh ]
[ kraxel: fix Makefile dependencies ]
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
[fixup] Makefile dep
The 9p back-end first queries the size of an extended attribute,
allocates space for it via g_malloc() and then retrieves its
value into the allocated buffer. A race between querying the attribute
size and retrieving its value could lead to memory bytes disclosure.
Use g_malloc0() to avoid it.
Reported-by: Tuomas Tynkkynen <tuomas.tynkkynen@iki.fi>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Merge tpm 2017/10/04 v3
# gpg: Signature made Fri 13 Oct 2017 12:37:07 BST
# gpg: using RSA key 0x75AD65802A0B4211
# gpg: Good signature from "Stefan Berger <stefanb@linux.vnet.ibm.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: B818 B9CA DF90 89C2 D5CE C66B 75AD 6580 2A0B 4211
* remotes/stefanberger/tags/pull-tpm-2017-10-04-3:
specs: Describe the TPM support in QEMU
tpm: Move tpm_cleanup() to right place
tpm: Added support for TPM emulator
tpm-passthrough: move reusable code to utils
tpm-backend: Move realloc_buffer() implementation to tpm-tis model
tpm-backend: Add new API to read backend TpmInfo
tpm-backend: Made few interface methods optional
tpm-backend: Initialize and free data members in it's own methods
tpm-backend: Move thread handling inside TPMBackend
tpm-backend: Remove unneeded member variable from backend class
tpm: Use EMSGSIZE instead of EBADMSG to compile on OpenBSD
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The device cannot be instantiated on many non-x86 machines and just prints
some error messages, e.g.:
$ qemu-system-ppc64 -device vmxnet3 -M g3beige
[vmxnet3][WR][vmxnet3_init_msix]: Failed to initialize MSI-X, error -95
[vmxnet3][WR][vmxnet3_pci_realize]: Failed to initialize MSI-X, configuration is inconsistent.
Since vmxnet3 is a para-virtualized device that is only useful on x86,
it should also only be enabled on the x86 targets.
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
If QEMU has been compiled with the flags --enable-tcg-interpreter and
--enable-debug, the guest runs incredibly slowly. The prom-env
test is approximately 10 times slower than normal in this case, and
it takes up to 500 seconds until the test with the pseries machine
finishes. While we should still look for ways to speed up the test
on the pseries machine here, let's bump the timeout to 600 seconds to
allow the test to pass with this unusual configuration already now.
Also move the pseries test into the "slow" category - since it is
really a very slow test.
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Replace a large number of the fprintf(stderr, "*\n" calls with
error_report(). The functions were renamed with these commands and then
compiler issues were manually fixed.
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N; {s|fprintf(stderr, "\(.*\)\\n"\(.*\));|error_report("\1"\2);|Ig}' \
{} +
Some lines were then manually tweaked to pass checkpatch.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Thomas Huth <huth@tuxfamily.org>
[thuth: Remove "qemu:" prefix from strings]
Signed-off-by: Thomas Huth <thuth@redhat.com>
# gpg: Signature made Thu 12 Oct 2017 21:52:28 BST
# gpg: using RSA key 0xDAE8E10975969CE5
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>"
# gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5
* remotes/elmarco/tags/vu-pull-request:
libvhost-user: Support VHOST_USER_SET_SLAVE_REQ_FD
libvhost-user: Update and fix feature and request lists
vhost-user-bridge: Only process received packets on started queues
libvhost-user: vu_queue_started
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The pxe-test is a very good test to exercise NICs, thus we should use
it to test all NICs that can be used by the BIOS for booting via network.
However, to avoid increasing the default testing time too much, the
additional NICs are only tested in the "make check SPEED=slow" mode.
The virtio-net NIC on ppc64 is now also only tested in slow mode, since
the test on ppc64 is really quite slow and we've got test coverage for
virtio-net in big endian mode now on s390x, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Since 4458fb3a79 (pc: Eliminate pc_default_machine_options()),
hot_add_cpu is set in pc_machine_class_init(), so we don't
need to set it in pc_q35_machine_options(), pc_i440fx_machine_options()
and xenfv_machine_options(), except to clear it in
pc_i440fx_1_4_machine_opt().
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
We don't touch isapc when we change guest ABI and add new entries
to PC_COMPAT_* or new PCMachineClass compat flags. This means
isapc never guaranteed guest ABI and cross-QEMU-version live
migration compatibility. There's no point in keeping code for
kvm-pv-eoi and APIC ID compatibility in pc_init_isa().
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The modern BAR is now accessed via yet another address space created just
for that purpose. It does not really need a FlatView and a dispatch tree,
as it has a single memory region, so this is just a waste of memory. Things
get even worse when there are dozens or hundreds of virtio-pci devices -
since these address spaces are global, changing any of them triggers
rebuilding all address spaces.
This replaces indirect accesses to the modern BAR with a simple lookup
and direct calls to memory_region_dispatch_read/write.
This is expected to save lots of memory at boot time after applying:
[Qemu-devel] [PULL 00/32] Misc changes for 2017-09-22
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
While changing the s/g list allocation, commit 3b3b0628
also changed the descriptor counting to count iovec entries
as split by cpu_physical_memory_map(). Previously only the
actual descriptor entries were counted and the split into
the iovec happened afterwards in virtqueue_map().
Count the entries again instead to avoid erroneous
"Looped descriptor" errors.
Reported-by: Hans Middelhoek <h.middelhoek@ospito.nl>
Link: https://forum.proxmox.com/threads/vm-crash-with-memory-hotplug.35904/
Fixes: 3b3b062821 ("virtio: slim down allocation of VirtQueueElements")
Signed-off-by: Wolfgang Bumiller <w.bumiller@proxmox.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
IO_LIMIT and IO_BASE registers should not be writable if
gen_pcie_root_port's io-reserve property is set to 0.
The COMMAND register should have the IO flag read only.
Signed-off-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Make sure we don't forget to add the Conventional PCI or PCI
Express interface names on PCI device classes in the future.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
xen-pt doesn't set the is_express field, but is supposed to be
able to handle PCI Express devices too. Mark it as hybrid.
Suggested-by: Jan Beulich <JBeulich@suse.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add INTERFACE_CONVENTIONAL_PCI_DEVICE to all direct subtypes of
TYPE_PCI_DEVICE, except:
1) The ones that already have INTERFACE_PCIE_DEVICE set:
* base-xhci
* e1000e
* nvme
* pvscsi
* vfio-pci
* virtio-pci
* vmxnet3
2) base-pci-bridge
Not all PCI bridges are Conventional PCI devices, so
INTERFACE_CONVENTIONAL_PCI_DEVICE is added only to the subtypes
that are actually Conventional PCI:
* dec-21154-p2p-bridge
* i82801b11-bridge
* pbm-bridge
* pci-bridge
The direct subtypes of base-pci-bridge not touched by this patch
are:
* xilinx-pcie-root: Already marked as PCIe-only.
* pcie-pci-bridge: Already marked as PCIe-only.
* pcie-port: all non-abstract subtypes of pcie-port are already
marked as PCIe-only devices.
3) megasas-base
Not all megasas devices are Conventional PCI devices, so the
interface names are added to the subclasses registered by
megasas_register_types(), according to information in the
megasas_devices[] array.
"megasas-gen2" already implements INTERFACE_PCIE_DEVICE, so add
INTERFACE_CONVENTIONAL_PCI_DEVICE only to "megasas".
Acked-by: Alberto Garcia <berto@igalia.com>
Acked-by: John Snow <jsnow@redhat.com>
Acked-by: Anthony PERARD <anthony.perard@citrix.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The following devices support both PCI Express and Conventional
PCI, by including special code to handle the QEMU_PCI_CAP_EXPRESS
flag and/or conditional pcie_endpoint_cap_init() calls:
* vfio-pci (is_express=1, but legacy PCI handled by
vfio_populate_device())
* vmxnet3 (is_express=0, but PCIe handled by vmxnet3_realize())
* pvscsi (is_express=0, but PCIe handled by pvscsi_realize())
* virtio-pci (is_express=0, but PCIe handled by
virtio_pci_dc_realize(), and additional legacy PCI code at
virtio_pci_realize())
* base-xhci (is_express=1, but pcie_endpoint_cap_init() call
is conditional on pci_bus_is_express(dev->bus))
* Note that xhci does not clear QEMU_PCI_CAP_EXPRESS like the
other hybrid devices
Cc: Dmitry Fleytman <dmitry@daynix.com>
Cc: Jason Wang <jasowang@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Cc: Alex Williamson <alex.williamson@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Those two interfaces will be used to indicate which device types
support Conventional PCI or PCI Express buses. Management
software will be able to use the qom-list-types QMP command to
query that information.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
PCIe busses are always little endian, so set the endianness of the
memory region to little endian rather than native such that operations
work as expected on big endian targets.
Signed-off-by: Matt Redfearn <matt.redfearn@imgtec.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Convert the 'modern_state' part of virtio-pci to modern migration
macros.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
QEMU with the pcie-pci-bridge device crashes if the guest board doesn't support MSI,
e.g. 'qemu-system-ppc64 -M prep -device pcie-pci-bridge'.
This is caused by wrong pcie-pci-bridge instantiation error handling. This patch fixes this issue
by falling back to legacy INTx if MSI is not available.
Also set the bridge's 'msi' property default value to 'auto' in order to trigger errors
only when the user explicitly sets msi=on.
Reported-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Whilst the underlying PCI bridge implementation supports 32-bit PCI IO
accesses, unfortunately they are truncated at the legacy 64K limit.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vhost_log_put() is called to decommission the dirty log between qemu and
a vhost device when stopping the device. Such a call can happen from
migration_completion().
Present code sets dev->log_size to zero too early in vhost_log_put(),
causing the sync check to always return false. As a consequence, the
last pass on the dirty bitmap never happens at the end of migration.
If a vhost device was busy (writing to guest memory) until the last
moments before vhost_virtqueue_stop(), this error will result in guest
memory corruption (at least) following migrations.
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Acked-by: Jason Wang <jasowang@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add a vmcoreinfo ELF note in the dump if the vmcoreinfo device has the
memory location details.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
The kdump header provides the offset and size of the vmcoreinfo content;
append it if available (skipping the ELF note header).
crash-7.1.9 was the first version that started looking in the
vmcoreinfo data for phys_base instead of in the kdump_sub_header.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
If the guest note is VMCOREINFO, try to get phys_base from it.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Read the guest ELF PT_NOTE from guest memory when fw_cfg
etc/vmcoreinfo entry provides the location, and write it as an
additional note in the dump.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
See docs/specs/vmcoreinfo.txt for details.
"etc/vmcoreinfo" fw_cfg entry is added when using "-device vmcoreinfo".
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reintroduce the write callback that was removed when write support was
removed in commit 023e314856.
Contrary to the previous callback implementation, the write_cb
callback is called whenever a write happens, so handlers must be
ready to handle partial writes as necessary.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
ioh3420_interrupts_init() passes its error message to local_err, then
propagates it to errp via error_propagate(), which is not necessary.
So eliminate local_err and pass errp directly instead.
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Marcel Apfelbaum <marcel@redhat.com>
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
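Generically, the simplification looks like this; the function names are placeholders rather than the real ioh3420 code:

#include "hw/qdev-core.h"     /* DeviceState */
#include "qapi/error.h"       /* Error, error_propagate() */

static void bar_interrupts_init(DeviceState *dev, Error **errp);  /* placeholder */

/* before: a needless local error plus error_propagate() */
static void foo_init_old(DeviceState *dev, Error **errp)
{
    Error *local_err = NULL;

    bar_interrupts_init(dev, &local_err);
    if (local_err) {
        error_propagate(errp, local_err);
        return;
    }
}

/* after: hand errp straight to the callee */
static void foo_init_new(DeviceState *dev, Error **errp)
{
    bar_interrupts_init(dev, errp);
}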
On commit f8cd1b02 ("pci: Convert to realize"), no error_set*()
call was added for the pcie_chassis_add_slot() error case.
pcie_chassis_add_slot() errors get ignored, making QEMU crash
later. e.g.:
$ qemu-system-x86_64 -device ioh3420 -device xio3130-downstream
qemu-system-x86_64: memory.c:2166: memory_region_del_subregion: Assertion `subregion->container == mr' failed.
Aborted (core dumped)
Fix it by reporting the error using error_setg().
Fixes: f8cd1b0201
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
NBDReply structure will be upgraded in future patches to handle both
simple and structured replies and will be used only in the client
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20171012095319.136610-6-vsementsov@virtuozzo.com>
[eblake: rebase to tweaks earlier in series]
Signed-off-by: Eric Blake <eblake@redhat.com>
BlockDriverState has a bdrv_co_drain() callback but no equivalent for
the end of the drain. The throttle driver (block/throttle.c) needs a way
to mark the end of the drain in order to toggle io_limits_disabled
correctly, thus bdrv_co_drain_end is needed.
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
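A hedged sketch of how a filter driver can pair the two hooks, using a hypothetical driver and state rather than the actual block/throttle.c code:

/* assumes block/block_int.h for BlockDriver, BlockDriverState, coroutine_fn */
typedef struct MyFilterState {
    unsigned io_limits_disabled;    /* illustrative counter */
} MyFilterState;

static void coroutine_fn myfilter_co_drain(BlockDriverState *bs)
{
    MyFilterState *s = bs->opaque;
    s->io_limits_disabled++;        /* limits off while the node is drained */
}

static void coroutine_fn myfilter_co_drain_end(BlockDriverState *bs)
{
    MyFilterState *s = bs->opaque;
    s->io_limits_disabled--;        /* limits back on once the drain ends */
}

static BlockDriver bdrv_myfilter = {
    .format_name       = "myfilter",
    .bdrv_co_drain     = myfilter_co_drain,
    .bdrv_co_drain_end = myfilter_co_drain_end,
    /* ... */
};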
This patch adds a description of the current TPM support in QEMU
to the specs.
Several public specs are referenced via their landing page on the
trustedcomputinggroup.org website.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
This change introduces a new TPM backend driver that can communicate with
swtpm (a software TPM emulator) using a Unix domain socket interface. QEMU talks to
the TPM emulator using QEMU's socket-based chardev backend device.
Swtpm uses two Unix sockets for communications, one for plain TPM commands and
responses, and one for out-of-band control messages. QEMU passes the data
socket to be used over the control channel.
The swtpm and associated tools can be found here:
https://github.com/stefanberger/swtpm
The swtpm's control channel protocol specification can be found here:
https://github.com/stefanberger/swtpm/wiki/Control-Channel-Specification
Usage:
# setup TPM state directory
mkdir /tmp/mytpm
chown -R tss:root /tmp/mytpm
/usr/bin/swtpm_setup --tpm-state /tmp/mytpm --createek
# Ask qemu to use TPM emulator with given tpm state directory
qemu-system-x86_64 \
[...] \
-chardev socket,id=chrtpm,path=/tmp/swtpm-sock \
-tpmdev emulator,id=tpm0,chardev=chrtpm \
-device tpm-tis,tpmdev=tpm0 \
[...]
Signed-off-by: Amarnath Valluri <amarnath.valluri@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Tested-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
TPM configuration options are backend implementation details and shall not be
part of the base TPMBackend object, nor shall they be accessed directly outside
of the class. Hence a new interface method, get_tpm_options(), is added to
TPMDriverOps; it shall be implemented by the derived classes to return the
configured TPM options.
A new TPM backend API, tpm_backend_query_tpm(), uses get_tpm_options() to
prepare the TpmInfo.
Signed-off-by: Amarnath Valluri <amarnath.valluri@intel.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
This allows backend implementations to leave optional interface methods
unimplemented. For mandatory methods, assertion checks are added.
Took the opportunity to remove unused methods:
- tpm_backend_get_desc()
- TPMDriverOps->handle_startup_error
Signed-off-by: Amarnath Valluri <amarnath.valluri@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger<stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Initialize and free TPMBackend data members in its own instance_init() and
instance_finalize() methods.
Took the opportunity to remove the unneeded destroy() method from the
TPMDriverOps interface: as TPMBackend is a QEMU Object, we can use
object_unref() in place of tpm_backend_destroy() to free the backend object,
hence destroy() is removed from the TPMDriverOps interface.
Signed-off-by: Amarnath Valluri <amarnath.valluri@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Move thread handling inside TPMBackend. This way backend implementations need
not maintain their own thread life cycle; instead they need to implement the
'handle_request()' class method, which is always called from a thread.
This change made tpm_backend_int.h useless, hence it is removed.
Signed-off-by: Amarnath Valluri <amarnath.valluri@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
TPMDriverOps inside TPMBackend is not required, as it is supposed to be a class
member. The only possible reason for keeping it in TPMBackend was to get the
backend type in tpm.c, where the dedicated backend API tpm_backend_get_type()
is present.
Signed-off-by: Amarnath Valluri <amarnath.valluri@intel.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
EBADMSG was only added to OpenBSD very recently. To make QEMU compilable
on older OpenBSD versions, use EMSGSIZE instead when a mismatch between the
number of received bytes and the message size indicated in the header is
found.
Return -EMSGSIZE and convert all other errnos in the same functions to
return the negative errno.
Signed-off-by: Stefan Berger <stefanb@linux.vnet.ibm.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
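As a generic sketch of that convention (not the actual TPM backend code):

#include <errno.h>
#include <sys/types.h>
#include <unistd.h>

static ssize_t read_exact(int fd, void *buf, size_t expected)
{
    ssize_t n = read(fd, buf, expected);

    if (n < 0) {
        return -errno;        /* all errors reported as negative errno */
    }
    if ((size_t)n != expected) {
        return -EMSGSIZE;     /* size mismatch; EBADMSG is missing on old OpenBSD */
    }
    return n;
}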
target-arm queue:
* v8M: SG, BLXNS, secure-return
* v8M: fixes for coverity issues in previous patches
* arm: fix armv7m_init() declaration to match definition
* watchdog/aspeed: fix variable type to store reload value
# gpg: Signature made Thu 12 Oct 2017 17:02:49 BST
# gpg: using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20171012:
nvic: Fix miscalculation of offsets into ITNS array
nvic: Add missing 'break'
target/arm: Implement SG instruction corner cases
target/arm: Support some Thumb insns being always unconditional
target-arm: Simplify insn_crosses_page()
target/arm: Pull Thumb insn word loads up to top level
target-arm: Don't check for "Thumb2 or M profile" for not-Thumb1
target/arm: Implement secure function return
target/arm: Implement BLXNS
target/arm: Implement SG instruction
target/arm: Add M profile secure MMU index values to get_a32_user_mem_index()
arm: fix armv7m_init() declaration to match definition
watchdog/aspeed: fix variable type to store reload value
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This calculation of the first exception vector in
the ITNS<n> register being accessed:
int startvec = 32 * (offset - 0x380) + NVIC_FIRST_IRQ;
is incorrect, because offset is in bytes, so we only want
to multiply by 8.
Spotted by Coverity (CID 1381484, CID 1381488), though it is
not correct that it actually overflows the buffer, because
we have a 'startvec + i < s->num_irq' guard.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1507650856-11718-1-git-send-email-peter.maydell@linaro.org
The common situation of the SG instruction is that it is
executed from S&NSC memory by a CPU in NS state. That case
is handled by v7m_handle_execute_nsc(). However the instruction
also has defined behaviour in a couple of other cases:
* SG instruction in NS memory (behaves as a NOP)
* SG in S memory but CPU already secure (clears IT bits and
does nothing else)
* SG instruction in v8M without Security Extension (NOP)
These can be implemented in translate.c.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1507556919-24992-10-git-send-email-peter.maydell@linaro.org
A few Thumb instructions are always unconditional even inside an
IT block (as opposed to being UNPREDICTABLE if used inside an
IT block): BKPT, the v8M SG instruction, and the A profile
HLT (debug halt) instruction.
This means we need to suppress the jump-over-instruction-on-condfail
code generation (though the IT state still advances as usual and
subsequent insns in the IT block may be conditional).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1507556919-24992-9-git-send-email-peter.maydell@linaro.org
Recent changes have left insn_crosses_page() more complicated
than it needed to be:
* it's only called from thumb_tr_translate_insn() so we know
for certain that we're looking at a Thumb insn
* the caller's check for dc->pc >= dc->next_page_start - 3
means that dc->pc can't possibly be 4 aligned, so there's
no need to check that (the check was partly there to ensure
that we didn't treat an ARM insn as Thumb, I think)
* we now have thumb_insn_is_16bit() which lets us do a precise
check of the length of the next insn, rather than opencoding
an inaccurate check
Simplify it down to just loading the first half of the insn
and calling thumb_insn_is_16bit() on it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1507556919-24992-8-git-send-email-peter.maydell@linaro.org
Refactor the Thumb decode to do the loads of the instruction words at
the top level rather than only loading the second half of a 32-bit
Thumb insn in the middle of the decode.
This is simple apart from the awkward case of Thumb1, where the
BL/BLX prefix and suffix instructions live in what in Thumb2 is the
32-bit insn space. To handle these we decode enough to identify
whether we're looking at a prefix/suffix that we handle as a 16 bit
insn, or a prefix that we're going to merge with the following suffix
to consider as a 32 bit insn. The translation of the 16 bit cases
then moves from disas_thumb2_insn() to disas_thumb_insn().
The refactoring has the benefit that we don't need to pass the
CPUARMState* down into the decoder code any more, but the major
reason for doing this is that some Thumb instructions must be always
unconditional regardless of the IT state bits, so we need to know the
whole insn before we emit the "skip this insn if the IT bits and cond
state tell us to" code. (The always unconditional insns are BKPT,
HLT and SG; the last of these is 32 bits.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1507556919-24992-7-git-send-email-peter.maydell@linaro.org
The code which implements the Thumb1 split BL/BLX instructions
is guarded by a check on "not M or THUMB2". All we really need
to check here is "not THUMB2" (and we assume that elsewhere too,
eg in the ARCH(6T2) test that UNDEFs the Thumb2 insns).
This doesn't change behaviour because all M profile cores
have Thumb2 and so ARM_FEATURE_M implies ARM_FEATURE_THUMB2.
(v6M implements a very restricted subset of Thumb2, but we
can cross that bridge when we get to it with appropriate
feature bits.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1507556919-24992-6-git-send-email-peter.maydell@linaro.org
Secure function return happens when a non-secure function has been
called using BLXNS and so has a particular magic LR value (either
0xfefffffe or 0xfeffffff). The function return via BX behaves
specially when the new PC value is this magic value, in the same
way that exception returns are handled.
Adjust our BX excret guards so that they recognize the function
return magic number as well, and perform the function-return
unstacking in do_v7m_exception_exit().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Acked-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1507556919-24992-5-git-send-email-peter.maydell@linaro.org
I've recently seen this with valgrind while running the HMP tester:
==22373== Conditional jump or move depends on uninitialised value(s)
==22373== at 0x4A41FD: arm_disas_set_info (cpu.c:504)
==22373== by 0x3867A7: monitor_disas (disas.c:390)
==22373== by 0x38E80E: memory_dump (monitor.c:1339)
==22373== by 0x38FA43: handle_hmp_command (monitor.c:3123)
==22373== by 0x38FB9E: qmp_human_monitor_command (monitor.c:613)
==22373== by 0x4E3124: qmp_marshal_human_monitor_command (qmp-marshal.c:1736)
==22373== by 0x769678: do_qmp_dispatch (qmp-dispatch.c:104)
==22373== by 0x769678: qmp_dispatch (qmp-dispatch.c:131)
==22373== by 0x38B734: handle_qmp_command (monitor.c:3853)
==22373== by 0x76ED07: json_message_process_token (json-streamer.c:105)
==22373== by 0x78D40A: json_lexer_feed_char (json-lexer.c:323)
==22373== by 0x78D4CD: json_lexer_feed (json-lexer.c:373)
==22373== by 0x38A08D: monitor_qmp_read (monitor.c:3895)
And indeed, in monitor_disas, the read_memory_inner_func variable was
not initialized, but arm_disas_set_info() expects this to be NULL
or a valid pointer. Let's properly set this to NULL in the
INIT_DISASSEMBLE_INFO to fix it in all functions that use the
disassemble_info struct.
Fixes: f7478a92dd ("Fix Thumb-1 BE32 execution")
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1506524313-20037-1-git-send-email-thuth@redhat.com>
Heterogeneous CPUs are not supported, and hotplugging a different
CPU model crashes QEMU:
qemu-system-x86_64 -cpu qemu64 -smp 1,maxcpus=2
(qemu) device_add host-x86_64-cpu,socket-id=1,core-id=0,thread-id=0,id=foo
(qemu) info cpus
error: failed to get MSR 0x38d
qemu-system-x86_64: target/i386/kvm.c:2121: kvm_get_msrs: Assertion `ret == cpu->kvm_msr_buf->nmsrs' failed.
Aborted (core dumped)
Gracefully fail the hotplug process in case of such a user mistake.
Reported-by: Greg Kurz <groug@kaod.org>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1507638879-200718-1-git-send-email-imammedo@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This patch lets address_space_get_iotlb_entry() use the newly
introduced page_mask parameter in flatview_do_translate(). Then we
will be sure the IOTLB is aligned to the page mask, and we also nicely
support huge pages again, which regressed when a764040 was introduced.
Fixes: a764040 ("exec: abstract address_space_do_translate()")
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Acked-by: Michael S. Tsirkin <mst@redhat.com>
Message-Id: <20171010094247.10173-3-maxime.coquelin@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The function is originally used for flatview_space_translate(), and what
we care about most there is the (xlat, plen) range. However, for IOTLB
requests we don't really care about "plen" but about the size of the page
that "xlat" is located on, and plen cannot really convey this information.
A simple example to show why "plen" is not good for IOTLB translations:
E.g., for huge pages, it is possible that guest mapped 1G huge page on
device side that used this GPA range:
0x100000000 - 0x13fffffff
Then let's say we want to translate one IOVA that finally mapped to GPA
0x13ffffe00 (which is located on this 1G huge page). Then here we'll
get:
(xlat, plen) = (0x13fffe00, 0x200)
So the IOTLB would be only covering a very small range since from
"plen" (which is 0x200 bytes) we cannot tell the size of the page.
Actually we can really know that this is a huge page - we just throw the
information away in flatview_do_translate().
This patch introduces an optional "page_mask" parameter to capture that
page mask info. Also, "plen" is made an optional parameter as well, with
some comments for the whole function.
No functional change yet.
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Maxime Coquelin <maxime.coquelin@redhat.com>
Message-Id: <20171010094247.10173-2-maxime.coquelin@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The tcp_chr_free_connection & tcp_chr_disconnect methods both
skip all of their cleanup work unless the 's->connected' flag
is set. This flag is set when the incoming client connection
is ready to use. Crucially this is *after* the TLS handshake
has been completed. So if the TLS handshake fails and we try
to cleanup the failed client, all the cleanup is skipped as
's->connected' is still false.
The only important thing that should be skipped in this case
is sending of the CHR_EVENT_CLOSED, because we never got as
far as sending the corresponding CHR_EVENT_OPENED. Every other
bit of cleanup can be robust against being called even when
s->connected is false.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20171005155057.7664-1-berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The Linux kernel will query the ATA IDENTITY DEVICE data, word 217
to determine the rotations per minute of the disk. If this has
the value 1, it is taken to be an SSD and so Linux sets the
'rotational' flag to 0 for the I/O queue and will stop using that
disk as a source of random entropy. Other operating systems may
also take into account rotation rate when setting up default
behaviour.
Mgmt apps should be able to set the rotation rate for virtualized
block devices, based on characteristics of the host storage in use,
so that the guest OS gets sensible behaviour out of the box. This
patch thus adds a 'rotation-rate' parameter for 'ide-hd' device
types.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20171004114008.14849-3-berrange@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The Linux kernel will query the SCSI "Block device characteristics"
VPD to determine the rotations per minute of the disk. If this has
the value 1, it is taken to be an SSD and so Linux sets the
'rotational' flag to 0 for the I/O queue and will stop using that
disk as a source of random entropy. Other operating systems may
also take into account rotation rate when setting up default
behaviour.
Mgmt apps should be able to set the rotation rate for virtualized
block devices, based on characteristics of the host storage in use,
so that the guest OS gets sensible behaviour out of the box. This
patch thus adds a 'rotation-rate' parameter for 'scsi-hd' and
'scsi-block' device types. For the latter, this parameter will be
ignored unless the host device has TYPE_DISK.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20171004114008.14849-2-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
stgit produces patch files that lack the ".patch" extension. Others
might be using ".diff" too. But since we are already limiting source files
to only a handful of extensions, we can reuse that in the mode selection
code.
While at it, do not match "../foo" as a branch name.
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All scripts that use the QEMUMachine and QEMUQtestMachine classes
(device-crash-test, tests/migration/*, iotests.py, basevm.py)
already configure logging.
The basicConfig() call inside QEMUMachine.__init__() is being
kept just to make sure a script would still work if it didn't
configure logging.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20171005172013.3098-4-ehabkost@redhat.com>
Reviewed-by: Lukáš Doktor <ldoktor@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Use logging module for the QMP debug messages. The only scripts
that set debug=True are iotests.py and guestperf/engine.py, and
they already call logging.basicConfig() to set up logging.
Scripts that don't configure logging are safe as long as they
don't need debugging output, because debug messages don't trigger
the "No handlers could be found for logger" message from the
Python logging module.
Scripts that already configure logging but don't use debug=True
(e.g. scripts/vm/basevm.py) will get QMP debugging enabled for
free.
Cc: "Alex Bennée" <alex.bennee@linaro.org>
Cc: Fam Zheng <famz@redhat.com>
Cc: "Philippe Mathieu-Daudé" <f4bug@amsat.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20171005172013.3098-3-ehabkost@redhat.com>
Reviewed-by: Lukáš Doktor <ldoktor@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Just setting level=DEBUG when debug is enabled is not enough: we
need to set up a log handler if we want debug messages generated
using logging.getLogger(...).debug() to be printed.
This was not a problem before because logging.debug() calls
logging.basicConfig() implicitly, but it's safer to not rely on
that.
Cc: "Alex Bennée" <alex.bennee@linaro.org>
Cc: Fam Zheng <famz@redhat.com>
Cc: "Philippe Mathieu-Daudé" <f4bug@amsat.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170927130339.21444-4-ehabkost@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Lukáš Doktor <ldoktor@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Queued TCG patches
# gpg: Signature made Tue 10 Oct 2017 20:23:12 BST
# gpg: using RSA key 0x64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth/tags/pull-tcg-20171010:
tcg/mips: delete commented out extern keyword.
tcg: define TCG_HIGHWATER
util: move qemu_real_host_page_size/mask to osdep.h
tcg: take .helpers out of TCGContext
tci: move tci_regs to tcg_qemu_tb_exec's stack
exec-all: extract tb->tc_* into a separate struct tc_tb
translate-all: define and use DEBUG_TB_CHECK_GATE
translate-all: define and use DEBUG_TB_INVALIDATE_GATE
exec-all: introduce TB_PAGE_ADDR_FMT
translate-all: define and use DEBUG_TB_FLUSH_GATE
exec-all: bring tb->invalid into tb->cflags
tcg: consolidate TB lookups in tb_lookup__cpu_state
tcg: remove addr argument from lookup_tb_ptr
tcg/mips: constify tcg_target_callee_save_regs
tcg/i386: constify tcg_target_callee_save_regs
cpu-exec: rename have_tb_lock to acquired_tb_lock in tb_find
translate-all: make have_tb_lock static
exec-all: fix typos in TranslationBlock's documentation
tcg: fix corruption of code_time profiling counter upon tb_flush
cputlb: bring back tlb_flush_count under !TLB_DEBUG
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
It is not needed in the VusDev device structure, and removing it also
simplifies the code a bit.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
This file implements a bridge from the vu_init API of libvhost-user to
GSource, so that libvhost-user can be used inside a GLib main loop.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
- PLOG is unused
- code is compiled out unless debug is enabled
- logging is too verbose
- you can pipe the output to ts to add timestamps if needed, or use structured
logging with more recent glib
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Use the one from the source with casting, like any other glib source.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
There is no need to include hw/virtio/virtio-scsi.h; without it, the
conflict with the SCSI_XFER enum goes away.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
It is confusing and could easily conflict with future versions.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
- use Vus prefix consistently
- use CamelCase, since that's glib & libvhost-user style
- avoid the _t suffix, which is conventionally reserved for system headers
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
There is no code to support more than one yet, and no need for that today.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Instead of a preliminary check, add an assert to the function that has
the pre-condition.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Always remove the unix socket path when leaving the program (instead of
when freeing scsi_dev). Note that unix_sock_new() also unlinks any
existing path before creating the socket.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
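A minimal stand-alone C sketch of the lifecycle described above (the
socket path is a placeholder and the code is not the vhost-user-scsi
implementation itself): any stale path is unlinked before binding, and
an atexit() handler removes the path when the program exits.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>
#include <unistd.h>

static const char *sock_path = "/tmp/vus-demo.sock";  /* placeholder */

static void remove_sock_path(void)
{
    unlink(sock_path);                 /* cleanup on program exit */
}

static int unix_sock_new(const char *path)
{
    struct sockaddr_un addr = { .sun_family = AF_UNIX };
    int fd = socket(AF_UNIX, SOCK_STREAM, 0);

    if (fd < 0) {
        perror("socket");
        exit(1);
    }
    strncpy(addr.sun_path, path, sizeof(addr.sun_path) - 1);
    unlink(path);                      /* drop any stale path first */
    if (bind(fd, (struct sockaddr *)&addr, sizeof(addr)) < 0) {
        perror("bind");
        exit(1);
    }
    return fd;
}

int main(void)
{
    int fd = unix_sock_new(sock_path);

    atexit(remove_sock_path);
    close(fd);
    return 0;
}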
Use g_new/g_free instead of plain malloc. This simplifies memory
handling a bit, since glib will abort if it cannot allocate.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
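A minimal illustration of the g_new()/g_free() convention (the struct
is made up, not the actual VusDev code; build with
"gcc demo.c $(pkg-config --cflags --libs glib-2.0)"):

#include <glib.h>

typedef struct {
    int id;
    char name[32];
} Dev;

int main(void)
{
    Dev *dev = g_new0(Dev, 1);   /* zero-initialized; aborts on OOM,
                                  * so no NULL check is needed */
    dev->id = 1;
    g_strlcpy(dev->name, "vus", sizeof(dev->name));
    g_print("dev %d: %s\n", dev->id, dev->name);
    g_free(dev);
    return 0;
}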
This simplifies memory management a little in the following patches.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
libvhost-user is meant to be free of any glib dependency. Make sure it
is by dropping qemu/osdep.h (which included glib.h).
This fixes a bad malloc()/g_free() pair.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
These only depend on the host and therefore belong in the common
osdep, not in a target-dependent object.
While at it, query the host during an init constructor, which guarantees
the page size will be well-defined throughout the execution of the program.
Suggested-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
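A minimal stand-alone sketch of the init-constructor approach (not the
actual osdep code; names are illustrative): the host page size is
queried once, before main() runs, so it is well-defined for the whole
lifetime of the program.

#include <stdio.h>
#include <unistd.h>

static size_t real_host_page_size;
static size_t real_host_page_mask;

/* Runs before main() thanks to the constructor attribute (supported
 * by GCC and clang, which QEMU already requires). */
static void __attribute__((constructor)) init_real_host_page_size(void)
{
    real_host_page_size = (size_t)sysconf(_SC_PAGESIZE);
    real_host_page_mask = ~(real_host_page_size - 1);
}

int main(void)
{
    printf("host page size: %zu, mask: 0x%zx\n",
           real_host_page_size, real_host_page_mask);
    return 0;
}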
Groundwork for supporting multiple TCG contexts.
The hash table becomes read-only after it is filled in,
so we can save space by keeping just a global pointer to it.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Groundwork for supporting multiple TCG contexts.
Compile-tested for all targets on an x86_64 host.
Suggested-by: Richard Henderson <rth@twiddle.net>
Acked-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This is in preparation for adding tc.size, so that we can keep track of
TBs using the binary search tree implementation from glib.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This prevents bit rot by ensuring the debug code is compiled when
building a user-mode target.
Unfortunately the helpers are user-mode-only so we cannot fully
get rid of the ifdef checks. Add a comment to explain this.
Suggested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
And fix the following warning when DEBUG_TB_INVALIDATE is enabled
in translate-all.c:
CC mipsn32-linux-user/accel/tcg/translate-all.o
/data/src/qemu/accel/tcg/translate-all.c: In function ‘tb_alloc_page’:
/data/src/qemu/accel/tcg/translate-all.c:1201:16: error: format ‘%lx’ expects argument of type ‘long unsigned int’, but argument 2 has type ‘tb_page_addr_t {aka unsigned int}’ [-Werror=format=]
printf("protecting code page: 0x" TARGET_FMT_lx "\n",
^
cc1: all warnings being treated as errors
/data/src/qemu/rules.mak:66: recipe for target 'accel/tcg/translate-all.o' failed
make[1]: *** [accel/tcg/translate-all.o] Error 1
Makefile:328: recipe for target 'subdir-mipsn32-linux-user' failed
make: *** [subdir-mipsn32-linux-user] Error 2
cota@flamenco:/data/src/qemu/build ((18f3fe1...) *$)$
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This gets rid of some ifdef checks while ensuring that the debug code
is compiled, which prevents bit rot.
Suggested-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
It is unlikely that we will ever want to call this helper passing
an argument other than the current PC. So just remove the argument,
and use the pc we already get from cpu_get_tb_cpu_state.
This change paves the way to having a common "tb_lookup" function.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reusing the have_tb_lock name, which is also defined in translate-all.c,
makes code reviewing unnecessarily harder.
Avoid potential confusion by renaming the local have_tb_lock variable
to something else.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Whenever there is an overflow in code_gen_buffer (e.g. we run out
of space in it and have to flush it), the code_time profiling counter
ends up with an invalid value (that is, code_time -= profile_getclock(),
without later on getting += profile_getclock() due to the goto).
Fix it by using the ti variable, so that we only update code_time
when there is no overflow. Note that in case there is an overflow
we fail to account for the elapsed code generation time, but this is quite rare
so we can probably live with it.
"info jit" before/after, roughly at the same time during debian-arm bootup:
- before:
Statistics:
TB flush count 1
TB invalidate count 4665
TLB flush count 998
JIT cycles -615191529184601 (-256329.804 s at 2.4 GHz)
translated TBs 302310 (aborted=0 0.0%)
avg ops/TB 48.4 max=438
deleted ops/TB 8.54
avg temps/TB 32.31 max=38
avg host code/TB 361.5
avg search data/TB 24.5
cycles/op -42014693.0
cycles/in byte -121444900.2
cycles/out byte -5629031.1
cycles/search byte -83114481.0
gen_interm time -0.0%
gen_code time 100.0%
optim./code time -0.0%
liveness/code time -0.0%
cpu_restore count 6236
avg cycles 110.4
- after:
Statistics:
TB flush count 1
TB invalidate count 4665
TLB flush count 1010
JIT cycles 1996899624 (0.832 s at 2.4 GHz)
translated TBs 297961 (aborted=0 0.0%)
avg ops/TB 48.5 max=438
deleted ops/TB 8.56
avg temps/TB 32.31 max=38
avg host code/TB 361.8
avg search data/TB 24.5
cycles/op 138.2
cycles/in byte 398.4
cycles/out byte 18.5
cycles/search byte 273.1
gen_interm time 14.0%
gen_code time 86.0%
optim./code time 19.4%
liveness/code time 10.3%
cpu_restore count 6372
avg cycles 111.0
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
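A minimal stand-alone sketch of the accounting pattern the fix uses
(profile_getclock() is re-implemented here as a stand-in and the rest
of the names are hypothetical): the start timestamp lives in a local
variable, and the counter is only updated on the success path, so an
early exit can never leave a dangling "-= start" applied to it.

#define _POSIX_C_SOURCE 199309L
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <time.h>

static int64_t code_time;   /* accumulated ns spent generating code */

static int64_t profile_getclock(void)
{
    struct timespec ts;

    clock_gettime(CLOCK_MONOTONIC, &ts);
    return (int64_t)ts.tv_sec * 1000000000 + ts.tv_nsec;
}

static int generate_code(int would_overflow)
{
    int64_t ti = profile_getclock();       /* local start time */

    if (would_overflow) {
        return -1;                         /* counter left untouched */
    }
    /* ... code generation ... */
    code_time += profile_getclock() - ti;  /* accounted only on success */
    return 0;
}

int main(void)
{
    generate_code(1);
    generate_code(0);
    printf("code_time = %" PRId64 " ns\n", code_time);
    return 0;
}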
Commit f0aff0f124 ("cputlb: add assert_cpu_is_self checks") buried
the increment of tlb_flush_count under TLB_DEBUG. This results in
"info jit" always (mis)reporting 0 TLB flushes when !TLB_DEBUG.
Besides, under MTTCG tlb_flush_count is updated by several threads,
so in order not to lose counts we'd either have to use atomic ops
or distribute the counter, which is more scalable.
This patch does the latter by embedding tlb_flush_count in CPUArchState.
The global count is then easily obtained by iterating over the CPU list.
Note that this change also requires updating the accessors to
tlb_flush_count to use atomic_read/set whenever there may be conflicting
accesses (as defined in C11) to it.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
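A minimal stand-alone sketch of the distributed-counter pattern
described above, using plain C11 atomics instead of QEMU's helpers
(names and the fixed CPU count are illustrative): each vCPU bumps its
own counter, and the global figure is computed by summing over the CPU
list when it is queried.

#include <stdatomic.h>
#include <stdio.h>

#define NR_CPUS 4

typedef struct CPUState {
    atomic_size_t tlb_flush_count;   /* written only by its own vCPU */
} CPUState;

static CPUState cpus[NR_CPUS];

static void tlb_flush(CPUState *cpu)
{
    /* relaxed is enough: this is a statistic, not a synchronization point */
    atomic_fetch_add_explicit(&cpu->tlb_flush_count, 1, memory_order_relaxed);
    /* ... the actual flush ... */
}

static size_t tlb_flush_count_total(void)
{
    size_t total = 0;

    for (int i = 0; i < NR_CPUS; i++) {
        total += atomic_load_explicit(&cpus[i].tlb_flush_count,
                                      memory_order_relaxed);
    }
    return total;
}

int main(void)
{
    tlb_flush(&cpus[0]);
    tlb_flush(&cpus[2]);
    printf("TLB flush count %zu\n", tlb_flush_count_total());
    return 0;
}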
Change qemu_config_parse() to return the number of config groups
on success and -EINVAL on error. This will allow callers of
qemu_config_parse() to check if something was really loaded from
the config file.
All existing callers of qemu_config_parse() and
qemu_read_config_file() only check if the return value was
negative, so the change shouldn't affect them.
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20171004025043.3788-2-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This patch adds a MachineClass element that can be set in the machine C
code to specify a list of supported CPU types. If supported CPU types
are specified, the CPU entered by the user (via -cpu at runtime) is
checked against the supported types and QEMU exits if it isn't supported.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-Id: <b8474e9d2e0a219d9bac901342f983b13d009301.1507059418.git.alistair.francis@xilinx.com>
[ehabkost: removed assert(), rewrote comment]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
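A minimal stand-alone sketch of the check this adds (the list contents
and function names are made up, not the QEMU code): the machine
declares a NULL-terminated list of supported CPU type names and the
user-supplied value is validated against it before the machine is
built.

#include <stdio.h>
#include <stdlib.h>
#include <string.h>

static const char *const valid_cpu_types[] = {
    "cortex-a9", "cortex-a15", NULL,    /* hypothetical machine list */
};

static void validate_cpu_type(const char *cpu_type)
{
    for (int i = 0; valid_cpu_types[i]; i++) {
        if (strcmp(cpu_type, valid_cpu_types[i]) == 0) {
            return;                     /* supported, carry on */
        }
    }
    fprintf(stderr, "Invalid CPU type: %s\n", cpu_type);
    exit(1);
}

int main(int argc, char **argv)
{
    validate_cpu_type(argc > 1 ? argv[1] : "cortex-a9");
    printf("CPU type accepted\n");
    return 0;
}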
Block layer patches
# gpg: Signature made Fri 06 Oct 2017 16:52:59 BST
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (54 commits)
block/mirror: check backing in bdrv_mirror_top_flush
qcow2: truncate the tail of the image file after shrinking the image
qcow2: fix return error code in qcow2_truncate()
iotests: Fix 195 if IMGFMT is part of TEST_DIR
block/mirror: check backing in bdrv_mirror_top_refresh_filename
block: support passthrough of BDRV_REQ_FUA in crypto driver
block: convert qcrypto_block_encrypt|decrypt to take bytes offset
block: convert crypto driver to bdrv_co_preadv|pwritev
block: fix data type casting for crypto payload offset
crypto: expose encryption sector size in APIs
block: use 1 MB bounce buffers for crypto instead of 16KB
iotests: Add test 197 for covering copy-on-read
block: Perform copy-on-read in loop
block: Add blkdebug hook for copy-on-read
iotests: Restore stty settings on completion
block: Uniform handling of 0-length bdrv_get_block_status()
qemu-io: Add -C for opening with copy-on-read
commit: Remove overlay_bs
qemu-iotests: Test commit block job where top has two parents
qemu-iotests: Allow QMP pretty printing in common.qemu
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
target-arm:
* v8M: more preparatory work
* nvic: reset properly rather than leaving the nvic in a weird state
* xlnx-zynqmp: Mark the "xlnx, zynqmp" device with user_creatable = false
* sd: fix out-of-bounds check for multi block reads
* arm: Fix SMC reporting to EL2 when QEMU provides PSCI
# gpg: Signature made Fri 06 Oct 2017 16:58:15 BST
# gpg: using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20171006:
nvic: Add missing code for writing SHCSR.HARDFAULTPENDED bit
target/arm: Factor out "get mmuidx for specified security state"
target/arm: Fix calculation of secure mm_idx values
target/arm: Implement security attribute lookups for memory accesses
nvic: Implement Security Attribution Unit registers
target/arm: Add v8M support to exception entry code
target/arm: Add support for restoring v8M additional state context
target/arm: Update excret sanity checks for v8M
target/arm: Add new-in-v8M SFSR and SFAR
target/arm: Don't warn about exception return with PC low bit set for v8M
target/arm: Warn about restoring to unaligned stack
target/arm: Check for xPSR mismatch usage faults earlier for v8M
target/arm: Restore SPSEL to correct CONTROL register on exception return
target/arm: Restore security state on exception return
target/arm: Prepare for CONTROL.SPSEL being nonzero in Handler mode
target/arm: Don't switch to target stack early in v7M exception return
nvic: Clear the vector arrays and prigroup on reset
hw/arm/xlnx-zynqmp: Mark the "xlnx, zynqmp" device with user_creatable = false
hw/sd: fix out-of-bounds check for multi block reads
arm: Fix SMC reporting to EL2 when QEMU provides PSCI
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For the SG instruction and secure function return we are going
to want to do memory accesses using the MMU index of the CPU
in secure state, even though the CPU is currently in non-secure
state. Write arm_v7m_mmu_idx_for_secstate() to do this job,
and use it in cpu_mmu_index().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1506092407-26985-17-git-send-email-peter.maydell@linaro.org
In cpu_mmu_index() we try to do this:
if (env->v7m.secure) {
mmu_idx += ARMMMUIdx_MSUser;
}
but it will give the wrong answer, because ARMMMUIdx_MSUser
includes the 0x40 ARM_MMU_IDX_M field, and so does the
mmu_idx we're adding to, and we'll end up with 0x8n rather
than 0x4n. This error is then nullified by the call to
arm_to_core_mmu_idx() which masks out the high part, but
we're about to factor out the code that calculates the
ARMMMUIdx values so it can be used without passing it through
arm_to_core_mmu_idx(), so fix this bug first.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1506092407-26985-16-git-send-email-peter.maydell@linaro.org
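A tiny worked example of the arithmetic above (ARM_MMU_IDX_M = 0x40
comes from the commit message; the other constants are made up, and
the "fixed" line merely shows one way to avoid double-counting the
flag rather than the actual QEMU change):

#include <stdio.h>

#define ARM_MMU_IDX_M    0x40                    /* M-profile flag */
#define ARMMMUIdx_MSUser (ARM_MMU_IDX_M | 0x4)   /* hypothetical value */

int main(void)
{
    int mmu_idx = ARM_MMU_IDX_M | 0x1;           /* some M-profile index: 0x41 */

    int buggy = mmu_idx + ARMMMUIdx_MSUser;                   /* flag added twice */
    int fixed = mmu_idx + (ARMMMUIdx_MSUser & ~ARM_MMU_IDX_M);

    printf("buggy = 0x%x, fixed = 0x%x\n", buggy, fixed);     /* 0x85 vs 0x45 */
    return 0;
}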
Implement the security attribute lookups for memory accesses
in the get_phys_addr() functions, causing these to generate
various kinds of SecureFault for bad accesses.
The major subtlety in this code relates to handling of the
case when the security attributes the SAU assigns to the
address don't match the current security state of the CPU.
In the ARM ARM pseudocode for validating instruction
accesses, the security attributes of the address determine
whether the Secure or NonSecure MPU state is used. At face
value, handling this would require us to encode the relevant
bits of state into mmu_idx for both S and NS at once, which
would result in our needing 16 mmu indexes. Fortunately we
don't actually need to do this because a mismatch between
address attributes and CPU state means either:
* some kind of fault (usually a SecureFault, but in theory
perhaps a UserFault for unaligned access to Device memory)
* execution of the SG instruction in NS state from a
Secure & NonSecure code region
The purpose of SG is simply to flip the CPU into Secure
state, so we can handle it by emulating execution of that
instruction directly in arm_v7m_cpu_do_interrupt(), which
means we can treat all the mismatch cases as "throw an
exception" and we don't need to encode the state of the
other MPU bank into our mmu_idx values.
This commit doesn't include the actual emulation of SG;
it also doesn't include implementation of the IDAU, which
is a per-board way to specify hard-coded memory attributes
for addresses, which override the CPU-internal SAU if they
specify a more secure setting than the SAU is programmed to.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1506092407-26985-15-git-send-email-peter.maydell@linaro.org
Implement the register interface for the SAU: SAU_CTRL,
SAU_TYPE, SAU_RNR, SAU_RBAR and SAU_RLAR. None of the
actual behaviour is implemented here; registers just
read back as written.
When the CPU definition for Cortex-M33 is eventually
added, its initfn will set cpu->sau_sregion, in the same
way that we currently set cpu->pmsav7_dregion for the
M3 and M4.
The number of SAU regions is typically a configurable
CPU parameter, but this patch doesn't provide a
QEMU CPU property for it. We can easily add one when
we have a board that requires it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1506092407-26985-14-git-send-email-peter.maydell@linaro.org
Add support for v8M and in particular the security extension
to the exception entry code. This requires changes to:
* calculation of the exception-return magic LR value
* push the callee-saves registers in certain cases
* clear registers when taking non-secure exceptions to avoid
leaking information from the interrupted secure code
* switch to the correct security state on entry
* use the vector table for the security state we're targeting
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1506092407-26985-13-git-send-email-peter.maydell@linaro.org
Attempting to do an exception return with an exception frame that
is not 8-aligned is UNPREDICTABLE in v8M; warn about this.
(It is not UNPREDICTABLE in v7M, and our implementation can
handle the merely-4-aligned case fine, so we don't need to
do anything except warn.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1506092407-26985-8-git-send-email-peter.maydell@linaro.org
ARM v8M specifies that the INVPC usage fault for mismatched
xPSR exception field and handler mode bit should be checked
before updating the PSR and SP, so that the fault is taken
with the existing stack frame rather than by pushing a new one.
Perform this check in the right place for v8M.
Since v7M specifies in its pseudocode that this usage fault
check should happen later, we have to retain the original
code for that check rather than being able to merge the two.
(The distinction is architecturally visible but only in
very obscure corner cases like attempting an invalid exception
return with an exception frame in read only memory.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1506092407-26985-7-git-send-email-peter.maydell@linaro.org
On exception return for v8M, the SPSEL bit in the EXC_RETURN magic
value should be restored to the SPSEL bit in the CONTROL register
bank specified by the EXC_RETURN.ES bit.
Add write_v7m_control_spsel_for_secstate() which behaves like
write_v7m_control_spsel() but allows the caller to specify which
CONTROL bank to use, reimplement write_v7m_control_spsel() in
terms of it, and use it in exception return.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1506092407-26985-6-git-send-email-peter.maydell@linaro.org
Now that we can handle the CONTROL.SPSEL bit not necessarily being
in sync with the current stack pointer, we can restore the correct
security state on exception return. This happens before we start
to read registers off the stack frame, but after we have taken
possible usage faults for bad exception return magic values and
updated CONTROL.SPSEL.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1506092407-26985-5-git-send-email-peter.maydell@linaro.org
In the v7M architecture, there is an invariant that if the CPU is
in Handler mode then the CONTROL.SPSEL bit cannot be nonzero.
This in turn means that the current stack pointer is always
indicated by CONTROL.SPSEL, even though Handler mode always uses
the Main stack pointer.
In v8M, this invariant is removed, and CONTROL.SPSEL may now
be nonzero in Handler mode (though Handler mode still always
uses the Main stack pointer). In preparation for this change,
change how we handle this bit: rename switch_v7m_sp() to
the now more accurate write_v7m_control_spsel(), and make it
check both the handler mode state and the SPSEL bit.
Note that this implicitly changes the point at which we switch
active SP on exception exit from before we pop the exception
frame to after it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1506092407-26985-4-git-send-email-peter.maydell@linaro.org
Currently our M profile exception return code switches to the
target stack pointer relatively early in the process, before
it tries to pop the exception frame off the stack. This is
awkward for v8M for two reasons:
* in v8M the process vs main stack pointer is not selected
purely by the value of CONTROL.SPSEL, so updating SPSEL
and relying on that to switch to the right stack pointer
won't work
* the stack we should be reading the stack frame from and
the stack we will eventually switch to might not be the
same if the guest is doing strange things
Change our exception return code to use a 'frame pointer'
to read the exception frame rather than assuming that we
can switch the live stack pointer this early.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1506092407-26985-3-git-send-email-peter.maydell@linaro.org
Reset for devices does not include an automatic clear of the
device state (unlike CPU state, where most of the state
structure is cleared to zero). Add some missing initialization
of NVIC state that meant that the device was left in the wrong
state if the guest did a warm reset.
(In particular, since we were resetting the computed state like
s->exception_prio but not all the state it was computed
from like s->vectors[x].active, the NVIC wound up in an
inconsistent state that could later trigger assertion failures.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1506092407-26985-2-git-send-email-peter.maydell@linaro.org
The device uses serial_hds in its realize function and thus can't be
used twice. Apart from that, the comma in its name makes it quite hard
to use for the user anyway, since a comma is normally used to separate
the device name from its properties when using the "-device" parameter
or the "device_add" HMP command.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1506441116-16627-1-git-send-email-thuth@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Block patches
# gpg: Signature made Fri Oct 6 16:30:57 2017 CEST
# gpg: using RSA key F407DB0061D5CF40
# gpg: Good signature from "Max Reitz <mreitz@redhat.com>"
# Primary key fingerprint: 91BE B60A 30DB 3E88 57D1 1829 F407 DB00 61D5 CF40
* mreitz/tags/pull-block-2017-10-06:
block/mirror: check backing in bdrv_mirror_top_flush
qcow2: truncate the tail of the image file after shrinking the image
qcow2: fix return error code in qcow2_truncate()
iotests: Fix 195 if IMGFMT is part of TEST_DIR
block/mirror: check backing in bdrv_mirror_top_refresh_filename
block: support passthrough of BDRV_REQ_FUA in crypto driver
block: convert qcrypto_block_encrypt|decrypt to take bytes offset
block: convert crypto driver to bdrv_co_preadv|pwritev
block: fix data type casting for crypto payload offset
crypto: expose encryption sector size in APIs
block: use 1 MB bounce buffers for crypto instead of 16KB
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
do_run_qemu() in iotest 195 first applies _filter_imgfmt when printing
qemu's command line and _filter_testdir only afterwards. Therefore, if
the image format is part of the test directory path, _filter_testdir
will no longer apply and the actual output will differ from the
reference output even in case of success.
For example, TEST_DIR might be "/tmp/test-qcow2", in which case
_filter_imgfmt first transforms this to "/tmp/test-IMGFMT" which is no
longer recognized as the TEST_DIR by _filter_testdir.
Fix this by not applying _filter_imgfmt in do_run_qemu() but in
run_qemu() instead, and only after _filter_testdir.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Message-id: 20170927211334.3988-1-mreitz@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Make the crypto driver implement the bdrv_co_preadv|pwritev
callbacks, and also use bdrv_co_preadv|pwritev for I/O
with the protocol driver beneath. This replaces sector based
I/O with byte based I/O, and allows us to stop assuming the
physical sector size matches the encryption sector size.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170927125340.12360-5-berrange@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
The crypto APIs report the offset of the data payload as a uint64_t
type, but the block driver is casting to size_t or ssize_t which will
potentially truncate.
Meanwhile, most of the block APIs use int64_t for offsets, so even if
using uint64_t in the crypto block driver we are still at risk of
truncation.
Change the block crypto driver to use uint64_t, but add asserts that
the value is less than INT64_MAX.
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170927125340.12360-4-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
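A minimal stand-alone sketch of that pattern (the helper name is
hypothetical, not the block/crypto code): keep the offset as uint64_t
end to end and assert it fits the int64_t range used by the block
layer, instead of silently casting to size_t/ssize_t.

#include <assert.h>
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>

static int64_t payload_offset_bytes(uint64_t crypto_payload_offset)
{
    assert(crypto_payload_offset <= INT64_MAX);   /* catch truncation */
    return (int64_t)crypto_payload_offset;
}

int main(void)
{
    printf("%" PRId64 "\n", payload_offset_bytes(2 * 1024 * 1024));
    return 0;
}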
While current encryption schemes all have a fixed sector size of
512 bytes, this is not guaranteed to be the case in future. Expose
the sector size in the APIs so the block layer can remove assumptions
about fixed 512 byte sectors.
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170927125340.12360-3-berrange@redhat.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Using 16KB bounce buffers creates a significant performance
penalty for I/O to encrypted volumes on storage with high
I/O latency (rotating rust & network drives), because it
triggers lots of fairly small I/O operations.
On tests with rotating rust, and cache=none|directsync,
write speed increased from 2MiB/s to 32MiB/s, on a par
with that achieved by the in-kernel luks driver. With
other cache modes the in-kernel driver is still notably
faster because it is able to report completion of the
I/O request before any encryption is done, while the
in-QEMU driver must encrypt the data before completion.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170927125340.12360-2-berrange@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
Add a test for qcow2 copy-on-read behavior, including exposure
for the just-fixed bugs.
The copy-on-read behavior is always to a qcow2 image, but the
test is careful to allow running with most image protocol/format
combos as the backing file being copied from (luks being the
exception, as it is harder to pass the right secret to all the
right places). In fact, for './check nbd', this appears to be
the first time we've had a qcow2 image wrapping NBD, requiring
an additional line in _filter_img_create to match the similar
line in _filter_img_info.
Invoking blkdebug to prove we don't write too much took some
effort to get working; and it requires that $TEST_WRAP (based
on $TEST_DIR) not be subject to word splitting. We may decide
later to have the entire iotests suite use relative rather than
absolute names, to avoid problems inherited by the absolute
name of $PWD or $TEST_DIR, at which point the sanity check in
this commit could be simplified.
This test requires at least 2G of consecutive memory to succeed;
as such, it is prone to spurious failures, particularly on
32-bit machines under load. This situation is detected and
triggers an early exit to skip the test, rather than a failure.
To manually provoke this setup on a beefier machine, I used:
$ (ulimit -S -v 1000000; ./check -qcow2 197)
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Improve our braindead copy-on-read implementation. Pre-patch,
we have multiple issues:
- we create a bounce buffer and perform a write for the entire
request, even if the active image already has 99% of the
clusters occupied, and really only needs to copy-on-read the
remaining 1% of the clusters
- our bounce buffer was as large as the read request, and can
needlessly exhaust our memory by using double the memory of
the request size (the original request plus our bounce buffer),
rather than a capped maximum overhead beyond the original
- if a driver has a max_transfer limit, we are bypassing the
normal code in bdrv_aligned_preadv() that fragments to that
limit, and instead attempt to read the entire buffer from the
driver in one go, which some drivers may assert on
- a client can request a large request of nearly 2G such that
rounding the request out to cluster boundaries results in a
byte count larger than 2G. While this cannot exceed 32 bits,
it DOES have some follow-on problems:
-- the call to bdrv_driver_pread() can assert for exceeding
BDRV_REQUEST_MAX_BYTES, if the driver is old and lacks
.bdrv_co_preadv
-- if the buffer is all zeroes, the subsequent call to
bdrv_co_do_pwrite_zeroes is a no-op due to a negative size,
which means we did not actually copy on read
Fix all of these issues by breaking up the action into a loop,
where each iteration is capped to sane limits. Also, querying
the allocation status allows us to optimize: when data is
already present in the active layer, we don't need to bounce.
Note that the code has a telling comment that copy-on-read
should probably be a filter driver rather than a bolt-on hack
in io.c; but that remains a task for another day.
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
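A minimal stand-alone sketch of the looped approach described above
(the chunk cap, helper names and allocation pattern are all made up;
the real code uses the block layer's allocation-status and
bounce-buffer machinery): the request is walked in capped chunks, and
chunks the top layer already has are skipped instead of being
rewritten.

#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_COR_CHUNK (1 * 1024 * 1024)   /* hypothetical per-iteration cap */

/* Stand-ins for the real block-layer calls. */
static bool chunk_is_allocated(int64_t offset)
{
    return ((offset / MAX_COR_CHUNK) % 2) != 0;   /* pretend alternate chunks exist */
}
static void read_from_backing(int64_t offset, int64_t bytes) { (void)offset; (void)bytes; }
static void write_to_top(int64_t offset, int64_t bytes) { (void)offset; (void)bytes; }

static void copy_on_read(int64_t offset, int64_t bytes)
{
    while (bytes > 0) {
        int64_t chunk = bytes < MAX_COR_CHUNK ? bytes : MAX_COR_CHUNK;

        if (!chunk_is_allocated(offset)) {
            /* only unallocated chunks need the bounce-and-write step */
            read_from_backing(offset, chunk);
            write_to_top(offset, chunk);
        }
        offset += chunk;
        bytes -= chunk;
    }
}

int main(void)
{
    copy_on_read(0, 5 * MAX_COR_CHUNK + 4096);
    printf("done\n");
    return 0;
}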
Executing qemu with a terminal as stdin will temporarily alter stty
settings on that terminal (for example, disabling echo), because of
how we run both the monitor and any multiplexing with guest input.
Normally, qemu restores the original settings on exit; but if an
iotest triggers qemu to abort in the middle, we can be left with
the altered terminal setup. This can make life very annoying when
debugging an iotest failure (not everyone remembers the trick of
blind-typing 'stty sane' without echo, and some people prefer
terminal settings that are slightly different than the defaults
picked by 'stty sane').
It is possible to avoid qemu corrupting the terminal by not passing
a terminal to qemu's stdin in the first place (as in, use
'./check ... </dev/null'), but that's extra typing to have to
remember. But running 'exec </dev/null' in the harness seems like
it might be too heavy a hammer. So I instead went with the
solution of saving and restoring the stty settings, only when the
harness detects that it is run interactively.
I tested this patch by forcing an allocation failure (I can't
guarantee that this particular limit will work on all setups, but
it shows the idea):
$ (ulimit -S -v 500000; ./check -qcow2 1)
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Handle a 0-length block status request up front, with a uniform
return value claiming the area is not allocated.
Most callers don't pass a length of 0 to bdrv_get_block_status()
and friends; but it definitely happens with a 0-length read when
copy-on-read is enabled. While we could audit all callers to
ensure that they never make a 0-length request, and then assert
that fact, it was just as easy to fix things to always report
success (as long as the callers are careful to not go into an
infinite loop). However, we had inconsistent behavior on whether
the status is reported as allocated or defers to the backing
layer, depending on what callbacks the driver implements, and
possibly wasting quite a few CPU cycles to get to that answer.
Consistently reporting unallocated up front doesn't really hurt
anything, and makes it easier both for callers (0-length requests
now have well-defined behavior) and for drivers (drivers don't
have to deal with 0-length requests).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We don't need to make any assumptions about the graph layout above the
top node of the commit operation any more. Remove the use of
bdrv_find_overlay() and related variables from the commit job code.
bdrv_drop_intermediate() doesn't use the 'active' parameter any more, so
we can just drop it.
The overlay node was previously added to the block job to get a
BLK_PERM_GRAPH_MOD. We really need to respect those permissions in
bdrv_drop_intermediate() now, but as long as we haven't figured out yet
how BLK_PERM_GRAPH_MOD is actually supposed to work, just leave a TODO
comment there.
With this change, it is now possible to perform another block job on an
overlay node without conflicts. qemu-iotests 030 is changed accordingly.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
QMP responses to certain commands can become quite long, which doesn't
only make reading them hard, but also means that the maximum line length
in patch emails can be exceeded. Allow tests to switch to QMP pretty
printing, which results in more, but shorter lines.
We also need to make sure to keep indentation in the response for this
to work as expected.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
This changes the commit block job to support operation in a graph where
there is more than a single active layer that references the top node.
This involves inserting the commit filter node not only on the path
between the given active node and the top node, but between the top node
and all of its parents.
On completion, bdrv_drop_intermediate() must consider all parents for
updating the backing file link. These parents may be backing files
themselves and as such read-only; reopen them temporarily if necessary.
Previously this was achieved by the bdrv_reopen() calls in the commit
block job that made overlay_bs read-write for the whole duration of the
block job, even though write access is only needed on completion.
Now that we consider all parents, overlay_bs is meaningless. It is left
in place in this commit, but we'll remove it soon.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
There is no good reason for bdrv_drop_intermediate() to know the active
layer above the subchain it is operating on - even more so, because
the assumption that there is a single active layer above it is not
generally true.
In order to prepare removal of the active parameter, use a BdrvChildRole
callback to update the backing file string in the overlay image instead
of directly calling bdrv_change_backing_file().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
"check" is full of qemu-iotests--specific details. Separating it
from "common" does not make much sense anymore.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The variable is almost unused, and one of the two uses is actually
uninitialized.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The variable is used in "common" but defined only after the file
is sourced.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Split "check" parts from tests part.
For the directory setup, the actual computation of directories goes
in "check", while the sanity checks go in the tests.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
These are never used by "check", with one exception that does not need
$QEMU_OPTIONS. Keep them in common.rc, which will be soon included only
by the tests.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Instead of ./check failing when a binary is missing, we try each test
case now and each one fails with tons of test case diffs. Also, all the
variables were initialized by "check" prior to "common" being sourced,
and then (uselessly) checked for emptiness again in "check".
Centralize the search for programs in "common" (which will soon be
one with "check"), including the "realpath" invocation which can be done
just once in "check" rather than in the tests.
For qnio_server, move the detection to "common", simplifying
set_prog_path to stop handling the unused second argument, and
embedding the "realpath" pass.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Some functions in common.rc are never used by the tests. Move
them out of that file and into common, which is already included
only by "check".
Code that actually *is* common to "check" and tests can be placed in
common.config.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This includes shell functions, shell variables and command line options
(randomize.awk does not exist).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Now that all callers are using byte-based interfaces, there's no
reason for our internal hbitmap to remain with sector-based
granularity. It also simplifies our internal scaling, since we
already know that hbitmap widens requests out to granularity
boundaries.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Both callers already had bytes available, but were scaling to
sectors. Move the scaling to internal code. In the case of
bdrv_aligned_pwritev(), we are now passing the exact offset
rather than a rounded sector-aligned value, but that's okay
as long as dirty bitmap widens start/bytes to granularity
boundaries.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Now that we have adjusted the majority of the calls this function
makes to be byte-based, it is easier to read the code if it makes
passes over the image using bytes rather than sectors.
iotests 165 was rather weak - on a default 64k-cluster image, where
bitmap granularity also defaults to 64k bytes, a single cluster of
the bitmap table thus covers (64*1024*8) bits which each cover 64k
bytes, or 32G of image space. But the test only uses a 1G image,
so it cannot trigger any more than one loop of the code in
store_bitmap_data(); and it was writing to the first cluster. In
order to test that we are properly aligning which portions of the
bitmap are being written to the file, we really want to test a case
where the first dirty bit returned by bdrv_dirty_iter_next() is not
aligned to the start of a cluster, which we can do by modifying the
test to write data that doesn't happen to fall in the first cluster
of the image.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Now that we have adjusted the majority of the calls this function
makes to be byte-based, it is easier to read the code if it makes
passes over the image using bytes rather than sectors.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This is new code, but it is easier to read if it makes passes over
the image using bytes rather than sectors (and will get easier in
the future when bdrv_get_block_status is converted to byte-based).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Now that we have adjusted the majority of the calls this function
makes to be byte-based, it is easier to read the code if it makes
passes over the image using bytes rather than sectors.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Some of the callers were already scaling bytes to sectors; others
can be easily converted to pass byte offsets, all in our shift
towards a consistent byte interface everywhere. Making the change
will also make it easier to write the hold-out callers to use byte
rather than sectors for their iterations; it also makes it easier
for a future dirty-bitmap patch to offload scaling over to the
internal hbitmap. Although all callers happen to pass
sector-aligned values, make the internal scaling robust to any
sub-sector requests.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Half the callers were already scaling bytes to sectors; the other
half can eventually be simplified to use byte iteration. Both
callers were already using the result as a bool, so make that
explicit. Making the change also makes it easier for a future
dirty-bitmap patch to offload scaling over to the internal hbitmap.
Remember, asking whether a byte is dirty is effectively asking
whether the entire granularity containing the byte is dirty, since
we only track dirtiness by granularity.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Thanks to recent cleanups, all callers were scaling a return value
of sectors into bytes; do the scaling internally instead.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Thanks to recent cleanups, most callers were scaling a return value
of sectors into bytes (the exception, in qcow2-bitmap, will be
converted to byte-based iteration later). Update the interface to
do the scaling internally instead.
In qcow2-bitmap, the code was specifically checking for an error
return of -1. To avoid a regression, we either have to make sure
we continue to return -1 (rather than a scaled -512) on error, or
we have to fix the caller to treat all negative values as error
rather than just one magic value. It's easy enough to make both
changes at the same time, even though either one in isolation
would work.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All callers to bdrv_dirty_iter_new() passed 0 for their initial
starting point, so drop that parameter.
Most callers to bdrv_set_dirty_iter() were scaling a byte offset to
a sector number; the exception qcow2-bitmap will be converted later
to use byte rather than sector iteration. Move the scaling to occur
internally to dirty bitmap code instead, so that callers now pass
in bytes.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are gradually converting to byte-based interfaces, as they are
easier to reason about than sector-based. Change the qcow2 bitmap
helper function sectors_covered_by_bitmap_cluster(), renaming it
to bytes_covered_by_bitmap_cluster() in the process.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Right now, the dirty-bitmap code exposes the fact that we use
a scale of sector granularity in the underlying hbitmap to anything
that wants to serialize a dirty bitmap. It's nicer to uniformly
expose bytes as our dirty-bitmap interface, matching the previous
change to bitmap size. The only caller to serialization is currently
qcow2-cluster.c, which becomes a bit more verbose because it is still
tracking sectors for other reasons, but a later patch will fix that
to more uniformly use byte offsets everywhere. Likewise, within
dirty-bitmap, we have to add more assertions that we are not
truncating incorrectly, which can go away once the internal hbitmap
is byte-based rather than sector-based.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We are still using an internal hbitmap that tracks a size in sectors,
with the granularity scaled down accordingly, because it lets us
use a shortcut for our iterators which are currently sector-based.
But there's no reason we can't track the dirty bitmap size in bytes,
since it is (mostly) an internal-only variable (remember, the size
is how many bytes are covered by the bitmap, not how many bytes the
bitmap occupies). A later cleanup will convert dirty bitmap
internals to be entirely byte-based, eliminating the intermediate
sector rounding added here; and technically, since bdrv_getlength()
already rounds up to sectors, our use of DIV_ROUND_UP is more for
theoretical completeness than for any actual rounding.
Use is_power_of_2() while at it, instead of open-coding that.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We're already reporting bytes for bdrv_dirty_bitmap_granularity();
mixing bytes and sectors in our return values is a recipe for
confusion. A later cleanup will convert dirty bitmap internals
to be entirely byte-based, but in the meantime, we should report
the bitmap size in bytes.
The only external caller in qcow2-bitmap.c is temporarily more verbose
(because it is still using sector-based math), but will later be
switched to track progress by bytes instead of sectors.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We've previously fixed several places where we failed to account
for possible errors from bdrv_nb_sectors(). Fix another one by
making bdrv_dirty_bitmap_truncate() take the new size from the
caller instead of querying itself; then adjust the sole caller
bdrv_truncate() to pass the size just determined by a successful
resize, or to reuse the size given to the original truncate
operation when refresh_total_sectors() was not able to confirm the
actual size (the two sizes can potentially differ according to
rounding constraints), thus avoiding sizing the bitmaps to -1.
This also fixes a bug where not all failure paths in
bdrv_truncate() would set errp.
Note that bdrv_truncate() is still a bit awkward. We may want
to revisit it later and clean up things to better guarantee that
a resize attempt either fails cleanly up front, or cannot fail
after guest-visible changes have been made (if temporary changes
are made, then they need to be cleanly rolled back). But that
is a task for another day; for now, the goal is the bare minimum
fix to ensure that just bdrv_dirty_bitmap_truncate() cannot fail.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We had several functions that no one is currently using, and which
use sector-based interfaces. I'm trying to convert towards byte-based
interfaces, so it's easier to just drop the unused functions:
bdrv_dirty_bitmap_get_meta
bdrv_dirty_bitmap_get_meta_locked
bdrv_dirty_bitmap_reset_meta
bdrv_dirty_bitmap_meta_granularity
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
When subdividing a bitmap serialization, the code in hbitmap.c
enforces that start/count parameters are aligned (except that
count can end early at end-of-bitmap). We exposed this required
alignment through bdrv_dirty_bitmap_serialization_align(), but
forgot to actually check that we comply with it.
Fortunately, qcow2 is never dividing bitmap serialization smaller
than one cluster (which is a minimum of 512 bytes); so we are
always compliant with the serialization alignment (which insists
that we partition at least 64 bits per chunk) because we are doing
at least 4k bits per chunk.
Still, it's safer to add an assertion (for the unlikely case that
we'd ever support a cluster smaller than 512 bytes, or if the
hbitmap implementation changes what it considers to be aligned),
rather than leaving bdrv_dirty_bitmap_serialization_align()
without a caller.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The only client of hbitmap_serialization_granularity() is dirty-bitmap's
bdrv_dirty_bitmap_serialization_align(). Keeping the two names consistent
is worthwhile, and the shorter name is more representative of what the
function returns (the required alignment to be used for start/count of
other serialization functions, where violating the alignment causes
assertion failures).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
All callers of bdrv_img_create() pass in a size, or -1 to read the
size from the backing file. We then set that size as the QemuOpt
default, which means we will reuse that default rather than the
final parameter to qemu_opt_get_size() several lines later. But
it is rather confusing to read subsequent checks of 'size == -1'
when it looks (without seeing the full context) like size defaults
to 0; it also doesn't help that a size of 0 is valid (for some
formats).
Rework the logic to make things more legible.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
s390x changes:
- support for IDA (indirect addressing in ccws) via ccw data stream
- support for extended TOD-Clock (z14 feature)
- various fixes and improvements all over the place
# gpg: Signature made Fri 06 Oct 2017 10:52:22 BST
# gpg: using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>"
# gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# gpg: aka "Cornelia Huck <cohuck@kernel.org>"
# gpg: aka "Cornelia Huck <cohuck@redhat.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF
* remotes/cohuck/tags/s390x-20171006: (33 commits)
hw/s390x: Mark the "sclpquiesce" device with user_creatable = false
s390x/tcg: initialize machine check queue
s390x/sclp: mark sclp-cpu-hotplug as non-usercreatable
s390x/sclp: Mark the sclp device with user_creatable = false
s390/kvm: make TOD setting failures fatal for migration
s390/kvm: Support for get/set of extended TOD-Clock for guest
s390x/css: fix css migration compat handling
s390x: sort some devices into categories
s390x/tcg: make STFL store into the lowcore
s390x: introduce and use S390_MAX_CPUS
target/s390x: get rid of next_core_id
s390x/cpumodel: fix max STFL(E) bit number
s390x: raise CPU hotplug irq after really hotplugged
MAINTAINERS: use KVM s390x maintainers for kvm-stubs.c and kvm_s390x.h
s390x/3270: handle writes of arbitrary length
s390x/3270: IDA support for 3270 via CcwDataStream
Revert "s390x/ccw: create s390 phb conditionally"
s390x/tcg: make idte/ipte use the new _real mmu
s390x/tcg: make testblock use the new _real mmu
s390x/tcg: make stora(g) use the new _real mmu
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The "sclpquiesce" device is just an internal device that should not be
created by the user directly. Though it currently does not seem to cause
any obvious trouble when the user instantiates an additional device, let's
mark it with user_creatable = false to avoid unexpected behavior,
e.g. because the quiesce notifier gets registered multiple times.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1507193105-15627-1-git-send-email-thuth@redhat.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Just as for external interrupts and I/O interrupts, we need to
initialize mchk_index during cpu reset.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
A TYPE_SCLP_CPU_HOTPLUG device for handling cpu hotplug events
is already created by the sclp event facility. Adding a second
TYPE_SCLP_CPU_HOTPLUG device via -device sclp-cpu-hotplug creates
an ambiguity in raise_irq_cpu_hotplug(), leading to a crash once
a cpu is hotplugged.
To fix this, disallow creating a sclp-cpu-hotplug device manually.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The "sclp" device is just an internal device that can not be instantiated
by the users. If they try to use it, they only get a simple error message:
$ qemu-system-s390x -nographic -device sclp
qemu-system-s390x: Option '-device s390-sclp-event-facility' cannot be
handled by this machine
Since sclp_init() tries to create a TYPE_SCLP_EVENT_FACILITY which is
a non-pluggable sysbus device, there is really no way that the "sclp"
device can be used by the user, so let's set the user_creatable = false
accordingly.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1507125199-22562-1-git-send-email-thuth@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Reviewed-by: Farhan Ali <alifm@linux.vnet.ibm.com>
Acked-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
If we fail to set a proper TOD clock on the target system, this can
already result in some problematic cases. We print several warn messages
on source and target in that case.
If kvm fails to set a nonzero epoch index, then we must ultimately fail
the migration as this will result in a giant time leap backwards. This
patch lets the migration fail if we can not set the guest time on the
target.
On failure the guest will resume normally on the original host machine.
Signed-off-by: Collin L. Walling <walling@linux.vnet.ibm.com>
Reviewed-by: Eric Farman <farman@linux.vnet.ibm.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
[split failure change from epoch index change, minor fixups]
Message-Id: <20171004105751.24655-3-borntraeger@de.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Commit e996583eb3 ("s390x/css: activate ChannelSubSys migration",
2017-07-11) was supposed to enable css migration for virtio-ccw
machines starting 2.10, but it ended up effectively enabling it
only for 2.10 as the registration of the appropriate VMStateDescription
happens in ccw_machine_2_10_instance_options which does not get
called for machines more recent than 2_10.
Let us move the corresponding chunk of code (which conditionally enables
the migration based on the value of the corresponding class property) to
ccw_init, which is called for each virtio-ccw machine instance.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reported-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20171004110109.16525-1-pasic@linux.vnet.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Using virtual memory access is wrong, and it will soon include low-address
protection checks, which are to be bypassed for STFL.
STFL is a privileged instruction and using LowCore requires
!CONFIG_USER_ONLY, so add the ifdef and move the declaration to the
right place.
This was originally part of a bigger STFL(E) refactoring.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170927170027.8539-4-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
core_id is not needed by linux-user, as the core_id a.k.a. CPU address
is only accessible from kernel space.
Therefore, drop next_core_id and make cpu_index get autoassigned again
for linux-user.
While at it, shield core_id and cpuid completely from linux-user. cpuid
can also only be queried from kernel space.
Suggested-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170928134609.16985-5-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let's move it into the machine, so we trigger the IRQ after setting
ms->possible_cpus (which SCLP uses to construct the list of
online CPUs).
This also fixes a problem reported by Thomas Huth, whereby qemu can be
crashed using the none machine
qemu-s390x-softmmu -M none -monitor stdio
-> device_add qemu-s390-cpu
Reviewed-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170928134609.16985-3-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The problem is that the current implementation places unrealistic and
arbitrary constraints on the length of writes to the device (that is, the
outbound requests), by asserting that ccw.count is such that even the
worst-case escaped payload will fit into a more or less arbitrarily sized
buffer. Actually, at the protocol level there is nothing to justify such
a limitation.
Another strange thing is the return value, which more or less reflects
the size (written) after escaping instead of before escaping. That is
a problem because this return value is used to calculate SCSW.count.
Let us teach 3270 how to deal with arbitrary long writes.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Reported-by: Jason J . Herne <jjherne@linux.vnet.ibm.com>
Tested-by: Jason J . Herne <jjherne@linux.vnet.ibm.com>
Message-Id: <20170920172314.102710-3-pasic@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Let us convert the 3270 code so it uses the recently introduced
CcwDataStream abstraction instead of blindly assuming direct data access.
This patch does not change behavior beyond introducing IDA support: for
direct data access CCWs everything stays as-is. (If there are bugs, they
are also preserved).
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Message-Id: <20170920172314.102710-2-pasic@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
This reverts commit d32bd032d8.
Turns out that old QEMUs always created a pci host bridge
and for many CPU models the migration from old QEMUs to new
QEMUs will fail with
qemu-system-s390x: Unknown savevm section or instance 'PCIBUS' 0
qemu-system-s390x: load of migration failed: Invalid argument
As a quick fix we will revert the commit and always create the
pci host bridge.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
[fixed revert to keep the comment fixup, added a comment in the code]
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Message-Id: <20170928131831.81393-1-borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
This makes it easy to access real addresses (prefix) and in addition
checks for valid memory addresses, which is missing when using e.g.
stl_phys().
We can later reuse it to implement low address protection checks (then
we might even decide to introduce yet another MMU for absolute
addresses, just for handling storage keys and low address protection).
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170926183318.12995-3-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The architecture mandates that the addresses accessed on the first
indirection level (that is, the data addresses without IDA, and the
(M)IDAW addresses with (M)IDA) be checked against a CCW-format-dependent
maximum address. If a violation is detected, the storage
access is not to be performed and a channel program check needs to be
generated. As of today, we fail to do this check.
Let us stick even closer to the architecture specification.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Message-Id: <20170921180841.24490-5-pasic@linux.vnet.ibm.com>
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
env->psa is a 64-bit value, but we copy only 4 bytes into the save area,
so 0 always ends up being stored.
Let's try to reduce such errors by using a proper structure. While at
it, use correct cpu->be conversion (and get_psw_mask()), as we will be
reusing this code for TCG soon.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170922140338.6068-1-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The STFLE bits for the MSA (extension) facilities simply indicate that
the respective instructions can be executed. The QUERY subfunction can then
be used to identify which features exactly are available.
Availability of subfunctions can also vary on real hardware. For now, we
simply implement a CPU model without any available subfunctions except
QUERY (which is always around).
As all MSA functions behave quite similarly, we can use one translation
handler for now. Prepare the code for implementation of actual subfunctions.
At least MSA is helpful for now, as older Linux kernels require this
facility when compiled for a z9 model. Allow enabling the facilities
for the qemu cpu model.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170920153016.3858-4-david@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
HMP pull 2017-10-05
# gpg: Signature made Thu 05 Oct 2017 11:50:13 BST
# gpg: using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert/tags/pull-hmp-20171005:
hmp-commands-info: Change "@findex FOO" to "@findex info FOO"
hmp-commands-info: Move Texinfo stanzas to conventional place
hmp-commands-info: Fix "info rocker-FOO" misspellings
hmp: Fix unknown command for subtable
hmp: Missing handle_errors
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Merge qio 2017/10/04 v1
# gpg: Signature made Wed 04 Oct 2017 13:23:04 BST
# gpg: using RSA key 0xBE86EBB415104FDF
# gpg: Good signature from "Daniel P. Berrange <dan@berrange.com>"
# gpg: aka "Daniel P. Berrange <berrange@redhat.com>"
# Primary key fingerprint: DAF3 A6FD B26B 6291 2D0E 8E3F BE86 EBB4 1510 4FDF
* remotes/berrange/tags/pull-qio-2017-10-04-1:
io: add trace events for websockets frame handling
io: Attempt to send websocket close messages to client
io: Reply to ping frames
io: Ignore websocket PING and PONG frames
io: Allow empty websocket payload
io: Add support for fragmented websocket binary frames
io: Small updates in preparation for websocket changes
ui: Always remove an old VNC channel watch before adding a new one
io: use case insensitive check for Connection & Upgrade websock headers
io: include full error message in websocket handshake trace
io: send proper HTTP response for websocket errors
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
It is useful to trace websockets frame encoding/decoding when debugging
problems.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Make a best effort attempt to close websocket connections according to
the RFC. Sends the close message, as room permits in the socket buffer,
and immediately closes the socket.
Signed-off-by: Brandon Carpenter <brandon.carpenter@cypherpath.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Add an immediate ping reply (pong) to the outgoing stream when a ping
is received. Unsolicited pongs are ignored.
Signed-off-by: Brandon Carpenter <brandon.carpenter@cypherpath.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Keep pings and gratuitous pongs generated by web browsers from killing
websocket connections.
Signed-off-by: Brandon Carpenter <brandon.carpenter@cypherpath.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Some browsers send pings/pongs with no payload, so allow empty payloads
instead of closing the connection.
Signed-off-by: Brandon Carpenter <brandon.carpenter@cypherpath.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Allows fragmented binary frames by saving the previous opcode. Handles
the case where an intermediary (e.g., a web proxy) fragments frames
originally sent unfragmented by the client.
Signed-off-by: Brandon Carpenter <brandon.carpenter@cypherpath.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Gets rid of unnecessary bit shifting and performs proper EOF checking to
avoid a large number of repeated calls to recvmsg() when a client
abruptly terminates a connection (bug fix).
Signed-off-by: Brandon Carpenter <brandon.carpenter@cypherpath.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When checking the value of the Connection and Upgrade HTTP headers
the websock RFC (6455) requires the comparison to be case insensitive.
The Connection value should be an exact match not a substring.
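A rough sketch of the intended comparison, using GLib's case-insensitive
string helpers; the variable names are illustrative, not the actual
handshake code:

    /* "Connection: Upgrade" and "Upgrade: websocket" must match as whole
     * tokens, ignoring case; a substring test is not good enough. */
    if (g_ascii_strcasecmp(connection_hdr, "upgrade") != 0 ||
        g_ascii_strcasecmp(upgrade_hdr, "websocket") != 0) {
        return -1;   /* reject the handshake */
    }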
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When the websocket handshake fails it is useful to log the real
error message via the trace points for debugging purposes.
Fixes bug: #1715186
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
When any error occurs while processing the websockets handshake,
QEMU just terminates the connection abruptly. This is in violation
of the HTTP specs and does not help the client understand what they
did wrong. This is particularly bad when the client gives the wrong
path, as a "404 Not Found" would be very helpful.
Refactor the handshake code so that it always sends a response to
the client unless there was an I/O error.
Fixes bug: #1715186
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
NVIDIA has defined a specification for creating GPUDirect "cliques",
where devices with the same clique ID support direct peer-to-peer DMA.
When running on bare-metal, tools like NVIDIA's p2pBandwidthLatencyTest
(part of cuda-samples) determine which GPUs can support peer-to-peer
based on chipset and topology. When running in a VM, these tools have
no visibility to the physical hardware support or topology. This
option allows the user to specify hints via a vendor defined
capability. For instance:
<qemu:commandline>
<qemu:arg value='-set'/>
<qemu:arg value='device.hostdev0.x-nv-gpudirect-clique=0'/>
<qemu:arg value='-set'/>
<qemu:arg value='device.hostdev1.x-nv-gpudirect-clique=1'/>
<qemu:arg value='-set'/>
<qemu:arg value='device.hostdev2.x-nv-gpudirect-clique=1'/>
</qemu:commandline>
This enables two cliques. The first is a singleton clique with ID 0,
for the first hostdev defined in the XML (note that since cliques
define peer-to-peer sets, a singleton clique offers no benefit). The
subsequent two hostdevs are both added to clique ID 1, indicating
peer-to-peer is possible between these devices.
QEMU only provides validation that the clique ID is valid and applied
to an NVIDIA graphics device; any validation that the resulting
cliques are functional and valid is the user's responsibility. The
NVIDIA specification allows a 4-bit clique ID, thus valid values are
0-15.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
If the hypervisor needs to add purely virtual capabilities, give us a
hook through quirks to do that. Note that we determine the maximum
size for a capability based on the physical device; if we insert a
virtual capability, that can change. Therefore, if the maximum size is
smaller after adding virtual capabilities, use that.
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
If vfio_add_std_cap() errors then going to out prepends irrelevant
errors for capabilities we haven't attempted to add as we unwind our
recursive stack. Just return error.
Fixes: 7ef165b9a8 ("vfio/pci: Pass an error object to vfio_add_capabilities")
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
After iothread is enabled internally inside QEMU with GMainContext, we
may encounter this warning when destroying the iothread:
(qemu-system-x86_64:19925): GLib-CRITICAL **: g_source_remove_poll:
assertion '!SOURCE_DESTROYED (source)' failed
The problem is that g_source_remove_poll() does not allow removing an
fd from a source if the source is detached from its owner
context. (peterx: which IMHO does not make much sense)
Fix it on the QEMU side by avoiding the g_source_remove_poll() call if we
know the object is being destroyed; we won't leak anything after all,
since the whole array will soon be cleaned up along with that fd.
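The guard boils down to roughly the following sketch; GLib's
g_source_is_destroyed() is one way to detect the destruction case:

    /* Removing a poll fd from a source that is already being destroyed
     * trips a GLib assertion; GLib cleans up its own state during
     * destruction anyway, so simply skip the removal in that case. */
    if (!g_source_is_destroyed(&ctx->source)) {
        g_source_remove_poll(&ctx->source, &node->pfd);
    }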
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-id: 20170928025958.1420-6-peterx@redhat.com
[peterx: write the commit message]
Signed-off-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
When gcontext is used with iothread, the context will be destroyed
during iothread_stop(). That's not good since sometimes we would like
to keep the resources until iothread is destroyed, but we may want to
stop the thread before that point.
Delay the destruction of gcontext to iothread finalize. Then we can do:
iothread_stop(thread);
some_cleanup_on_resources();
iothread_destroy(thread);
We may need this patch if we want to run chardev IOs in iothreads and
hopefully clean them up correctly. For more specific information,
please see 2b316774f6 ("qemu-char: do not operate on sources from
finalize callbacks").
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-id: 20170928025958.1420-5-peterx@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
So that internal iothread users can explicitly stop one iothread without
destroying it.
While at it, fix iothread_stop() to allow it to be called multiple
times. Before this patch we could call iothread_stop() more than once on
a single iothread, which is not correct since qemu_thread_join()
is not allowed to run twice. From the manual of pthread_join():
Joining with a thread that has previously been joined results in
undefined behavior.
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-id: 20170928025958.1420-4-peterx@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
IOThread is a general framework that bundles an IO loop environment with a
real thread behind it. It is also useful internally inside qemu.
Provide some helpers for creating iothreads to be used internally.
Put all the internally used iothreads into the internal object container.
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-id: 20170928025958.1420-3-peterx@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
# gpg: Signature made Fri 29 Sep 2017 04:17:37 BST
# gpg: using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6
* remotes/famz/tags/docker-testing-pull-request:
docker: Don't mount ccache db if NOUSER=1
docker: test-block: Don't continue if build fails
tests/docker/run: don't source /etc/profile
docker: Fix test-mingw
docker: add installation to build tests
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
On a modern server-class ppc host with the following CPU topology:
Architecture: ppc64le
Byte Order: Little Endian
CPU(s): 32
On-line CPU(s) list: 0,8,16,24
Off-line CPU(s) list: 1-7,9-15,17-23,25-31
Thread(s) per core: 1
If both KVM PR and KVM HV loaded and we pass:
-machine pseries,accel=kvm,kvm-type=PR -smp 8
We expect QEMU to warn that this exceeds the number of online CPUs:
Warning: Number of SMP cpus requested (8) exceeds the recommended
cpus supported by KVM (4)
Warning: Number of hotpluggable cpus requested (8) exceeds the
recommended cpus supported by KVM (4)
but nothing is printed...
This happens because on ppc the KVM_CAP_NR_VCPUS capability is VM
specific and really depends on the KVM type, but we currently use it
as a global capability. And KVM returns a fallback value based on
KVM HV being present. Maybe KVM on POWER shouldn't presume anything
as long as it doesn't have a VM, but in all cases, we should call
KVM_CREATE_VM first and use KVM_CAP_NR_VCPUS as a VM capability.
This patch hence changes kvm_recommended_vcpus() accordingly and
moves the sanity checking of smp_cpus after the VM creation.
It is okay for the other archs that also implement KVM_CAP_NR_VCPUS,
ie, mips, s390, x86 and arm, because they don't depend on the VM
being created or not.
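The gist of the change, sketched (assuming the helper keeps a fallback of 4
recommended vcpus when the capability reports nothing):

    static int kvm_recommended_vcpus(KVMState *s)
    {
        /* VM-scoped query: on ppc the answer depends on the KVM type
         * (HV vs PR), which is only known once the VM exists. */
        int ret = kvm_vm_check_extension(s, KVM_CAP_NR_VCPUS);

        return ret ? ret : 4;
    }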
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <150600966286.30533.10909862523552370889.stgit@bahia.lan>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
On a server-class ppc host, this capability depends on the KVM type,
ie, HV or PR. If both KVM variants are present in the kernel, we will always
get the HV-specific value, even if we explicitly requested PR on
the command line.
This can have an impact if we're using hugepages or a balloon device.
Since we've already created the VM at the time any user calls
kvm_has_sync_mmu(), switching to kvm_vm_check_extension() is
enough to fix any potential issue.
It is okay for the other archs that also implement KVM_CAP_SYNC_MMU,
ie, mips, s390, x86 and arm, because they don't depend on the VM being
created or not.
While here, let's cache the state of this extension in a bool variable,
since it has several users in the code, as suggested by Thomas Huth.
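Sketched, with an assumed bool field in KVMState holding the cached value:

    /* during kvm_init(), after KVM_CREATE_VM has succeeded */
    s->sync_mmu = !!kvm_vm_check_extension(s, KVM_CAP_SYNC_MMU);

    int kvm_has_sync_mmu(void)
    {
        return kvm_state->sync_mmu;
    }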
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <150600965332.30533.14702405809647835716.stgit@bahia.lan>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This is a library header, so angle brackets are more appropriate; also
move the line to before QEMU headers, as is recommended in HACKING.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-id: 20170920085952.3872-1-famz@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Valgrind detects an invalid read operation when hot-plugging of an
USB device fails:
$ valgrind x86_64-softmmu/qemu-system-x86_64 -device usb-ehci -nographic -S
==30598== Memcheck, a memory error detector
==30598== Copyright (C) 2002-2015, and GNU GPL'd, by Julian Seward et al.
==30598== Using Valgrind-3.12.0 and LibVEX; rerun with -h for copyright info
==30598== Command: x86_64-softmmu/qemu-system-x86_64 -device usb-ehci -nographic -S
==30598==
QEMU 2.10.50 monitor - type 'help' for more information
(qemu) device_add usb-tablet
(qemu) device_add usb-tablet
(qemu) device_add usb-tablet
(qemu) device_add usb-tablet
(qemu) device_add usb-tablet
(qemu) device_add usb-tablet
==30598== Invalid read of size 8
==30598== at 0x60EF50: object_unparent (object.c:445)
==30598== by 0x580F0D: usb_try_create_simple (bus.c:346)
==30598== by 0x581BEB: usb_claim_port (bus.c:451)
==30598== by 0x582310: usb_qdev_realize (bus.c:257)
==30598== by 0x4CB399: device_set_realized (qdev.c:914)
==30598== by 0x60E26D: property_set_bool (object.c:1886)
==30598== by 0x61235E: object_property_set_qobject (qom-qobject.c:27)
==30598== by 0x61000F: object_property_set_bool (object.c:1162)
==30598== by 0x4567C3: qdev_device_add (qdev-monitor.c:630)
==30598== by 0x456D52: qmp_device_add (qdev-monitor.c:807)
==30598== by 0x470A99: hmp_device_add (hmp.c:1933)
==30598== by 0x3679C3: handle_hmp_command (monitor.c:3123)
The object_unparent() here is not necessary anymore since commit
69382d8b3e ("qdev: Fix object reference leak in case device.realize()
fails"), so let's remove it now.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-id: 1506526106-30971-1-git-send-email-thuth@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Currently, iothread_stop_all() makes all iothread objects unsafe
to be destroyed, because qemu_thread_join() ends up being called
twice.
To fix this, make iothread_stop() idempotent by checking
thread->stopped.
Fixes the following crash:
qemu-system-x86_64 -object iothread,id=iothread0 -monitor stdio -display none
QEMU 2.10.50 monitor - type 'help' for more information
(qemu) quit
qemu: qemu_thread_join: No such process
Aborted (core dumped)
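A sketch of the idempotence check described above (field names taken from
the commit text; the surrounding details are illustrative):

    void iothread_stop(IOThread *iothread)
    {
        if (!iothread->ctx || iothread->stopped) {
            return;                     /* already stopped (or never started) */
        }
        iothread->stopping = true;
        aio_notify(iothread->ctx);      /* wake the event loop so it can exit */
        qemu_thread_join(&iothread->thread);
        iothread->stopped = true;
    }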
Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170926130028.12471-1-ehabkost@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qemu uses wheel-up/down button events for mouse wheel input; however,
linux applications typically want REL_WHEEL events.
This fixes the wheel with linux guests. Tested with X11/wayland, and the
windows virtio-input driver.
Based on a patch from Marc.
Added property to enable/disable wheel axis.
Cc: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170926113243.26081-1-kraxel@redhat.com
Rename the functions to say "setup" instead of "create" because they
support being called multiple times on the same egl framebuffer.
Properly delete unused textures, update function interfaces to support
this.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170927115031.12063-1-kraxel@redhat.com
Handle the translation from vga chars to curses chars in curses_update()
instead of console_write_ch(). Purge any curses support bits from
ui/console.h include file.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170927103811.19249-1-kraxel@redhat.com
The usual behaviour of /etc/profile is to set the default PATH for
users. This runs into problems when we have updated PATH in our
dockerfile e.g. to access a cross-compiler in a non-standard
location. It shouldn't be needed anyway as we inherit the env from the
image when it was set up.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
CC: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20170926133622.14991-1-alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Feature "dtc" is explicitly required by test-mingw, but is not detected
by the run script since we switched to archive-source.sh in b7f404201e.
Since it isn't available in the Fedora image which runs this test on
patchew, we still get dtc from the submodule.
archive-source.sh already takes care of bundling the submodule files, so
all we need to do is check whether the files are there. Makefile is
chosen because it is unlikely to get renamed in the future.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170925082913.22089-1-famz@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Migration pull 2017-09-27
# gpg: Signature made Wed 27 Sep 2017 14:56:23 BST
# gpg: using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert/tags/pull-migration-20170927a:
migration: Route more error paths
migration: Route errors up through vmstate_save
migration: wire vmstate_save_state errors up to vmstate_subsection_save
migration: Check field save returns
migration: check pre_save return in vmstate_save_state
migration: pre_save return int
migration: disable auto-converge during bulk block migration
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ppc patch queue 2017-09-27
Contains
* a number of Mac machine type fixes
* a number of embedded machine type fixes (preliminary to adding the
Sam460ex board)
* a important fix for handling of migration with KVM PR
* assorted other minor fixes and cleanups
# gpg: Signature made Wed 27 Sep 2017 08:40:48 BST
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.11-20170927: (26 commits)
macio: use object link between MACIO_IDE and MAC_DBDMA object
macio: pass channel into MACIOIDEState via qdev property
mac_dbdma: remove DBDMA_init() function
mac_dbdma: QOMify
mac_dbdma: remove unused IO fields from DBDMAState
spapr: fix the value of SDR1 in kvmppc_put_books_sregs()
ppc/pnv: check for OPAL firmware file presence
ppc: remove all unused CPU definitions
ppc: remove unused CPU definitions
spapr_pci: make index property mandatory
macio: convert pmac_ide_ops from old_mmio
ppc/pnv: Improve macro parenthesization
spapr: introduce helpers to migrate HPT chunks and the end marker
ppc/kvm: generalize the use of kvmppc_get_htab_fd()
ppc/kvm: change kvmppc_get_htab_fd() to return -errno on error
ppc: Fix OpenPIC model
ppc/ide/macio: Add missing registers
ppc/mac: More rework of the DBDMA emulation
ppc/mac: Advertise a high clock frequency for NewWorld Macs
ppc: QOMify g3beige machine
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Block layer patches
# gpg: Signature made Tue 26 Sep 2017 14:52:32 BST
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream: (24 commits)
block/qcow2-bitmap: fix use of uninitialized pointer
qemu-iotests: add shrinking image test
qcow2: add shrink image support
qcow2: add qcow2_cache_discard
qemu-img: add --shrink flag for resize
iotests: fix 181: enable postcopy-ram capability on target
qemu-iotests: Test change-backing-file command
block: Fix permissions after bdrv_reopen()
block: reopen: Queue children after their parents
block: Base permissions on rw state after reopen
block: Add reopen queue to bdrv_check_perm()
block: Add reopen_queue to bdrv_child_perm()
qemu-io: Drop write permissions before read-only reopen
block: Clean up some bad code in the vvfat driver
block/throttle-groups.c: allocate RestartData on the heap
throttle: Assert that bkt->max is valid in throttle_compute_wait()
iotests: Print full path of bad output if mismatch
iotests: use virtio aliases for 067
iotests: use -ccw on s390x for 051
iotests: use -ccw on s390x for 040, 139, and 182
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Modify the pre_save method on VMStateDescription to return an int
rather than void so that it potentially can fail.
Changed zillions of devices to make them return 0; the only
case I've made it return non-0 is hw/intc/s390_flic_kvm.c that already
had an error_report/return case.
Note: If you add an error exit in your pre_save you must emit
an error_report to say why.
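For illustration, a device's pre_save hook now looks roughly like this; the
device and helper names here are made up:

    static int mydev_pre_save(void *opaque)
    {
        MyDevState *s = opaque;                 /* hypothetical device state */

        if (!mydev_state_is_consistent(s)) {    /* hypothetical check */
            error_report("mydev: refusing to save inconsistent state");
            return -EINVAL;                     /* fails the migration */
        }
        return 0;
    }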
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170925112917.21340-2-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
auto-converge and block migration currently do not play well together.
During block migration the auto-converge logic detects that ram
migration makes no progress and thus throttles down the vm until
it nearly stalls completely. Avoid this by disabling the throttling
logic during the bulk phase of the block migration.
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Lieven <pl@kamp.de>
Message-Id: <1506421996-12513-1-git-send-email-pl@kamp.de>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Using a standard QOM object link we can pass a reference to the MAC_DBDMA
controller to the MACIO_IDE object which removes the last external parameter
to macio_ide_register_dma().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
One of the reasons macio_ide_register_dma() needs to exist is because the
channel id isn't passed into the MACIO_IDE object. Pass in the channel id
using a qdev property to remove this requirement.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Instead we can now instantiate the MAC_DBDMA object directly within the
macio device. We also add the DBDMA device as a child property so that
it is possible to retrieve it later.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
These fields were used to manually handle IO requests that weren't aligned
to a sector boundary before this feature was supported by the block API.
Once the block API changed to support byte-aligned IO requests, the macio
controller was switched over to use it in commit be1e343 but these fields
were accidentally left behind. Remove them, including the initialisation
in DBDMA_init().
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When running with KVM PR, if a new HPT is allocated we need to inform
KVM about the HPT address and size. This is currently done by hacking
the value of SDR1 and pushing it to KVM in several places.
Also, migration breaks the guest since it is very unlikely the HPT has
the same address in source and destination, but we push the incoming
value of SDR1 to KVM anyway.
This patch introduces a new virtual hypervisor hook so that the spapr
code can provide the correct value of SDR1 to be pushed to KVM each
time kvmppc_put_books_sregs() is called.
It allows us to get rid of all the hacking in the spapr/kvmppc code and
it fixes migration of nested KVM PR.
Suggested-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
and exit before uselessly trying to load it if the file does not
exist.
Issue discovered by Coverity Scan.
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Remove *all* unused CPU definitions as indicated by compile-time
`#if 0` constructs.
Signed-off-by: John Snow <jsnow@redhat.com>
[dwg: Removed some additional now-useless comments]
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
PHBs can be created with an index property, in which case the machine
code automatically sets all the MMIO windows at addresses derived from
the index. Alternatively, they can be manually created without index,
but the user has to provide addresses for all MMIO windows.
The non-index way happens to be more trouble than it's worth: it's
difficult to use, keeps requiring (potentially incompatible) changes
when some new parameter needs adding, and is awkward to check for
collisions. It currently even has a bug that prevents using two
non-index PHBs, because their child DRCs are all derived from the
same index == -1 value and thus collide.
This patch hence makes the index property mandatory. As a consequence,
the PHB's memory regions and BUID are now always configured according
to the index, and it is no longer possible to set them from the command
line.
This DOES BREAK backwards compat, but we don't think the non-index
PHB feature was used in practice (at least libvirt doesn't) and the
simplification is worth it.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Although none of the existing macro call-sites were broken,
it's always better to write macros that properly parenthesize
arguments that can be complex expressions, so that the intended
order of operations is not broken.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The use of KVM_PPC_GET_HTAB_FD is open-coded in kvmppc_read_hptes()
and kvmppc_write_hpte().
This patch modifies kvmppc_get_htab_fd() so that it can be used
everywhere we need to access the in-kernel htab:
- add an index argument
=> only kvmppc_read_hptes() passes an actual index, all other users
pass 0
- add an errp argument to propagate error messages to the caller.
=> spapr migration code prints the error
=> hpte helpers pass &error_abort to keep the current behavior
of hw_error()
While here, this also fixes a bug in kvmppc_write_hpte() so that it
opens the htab fd for writing instead of reading as it currently does.
This never broke anything because we currently never call this code,
as explained in the changelog of commit c138593380:
"This support updating htab managed by the hypervisor. Currently
we don't have any user for this feature. This actually bring the
store_hpte interface in-line with the load_hpte one. We may want
to use this when we want to emulate henter hcall in qemu for HV
kvm."
The above is still true today.
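The resulting helper shape, sketched from the description above:

    /* write: open the htab fd for writing hptes rather than reading them
     * index: first hpte of interest (all current callers but
     *        kvmppc_read_hptes() pass 0)
     * errp:  propagated to the caller; the migration code reports it,
     *        the hpte helpers pass &error_abort */
    int kvmppc_get_htab_fd(bool write, uint64_t index, Error **errp);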
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When kvmppc_get_htab_fd() fails, its return value is propagated up to
qemu_savevm_state_iterate() or to qemu_savevm_state_complete_precopy().
All savevm handlers expect to receive a negative errno on error.
Let's patch kvmppc_get_htab_fd() accordingly.
While here, let's change htab_load() in the spapr code to also
propagate the error, since it doesn't make sense to abort() if
we couldn't get the htab fd from KVM.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The timing register exists on all variants of MacIO IDE; we just
store and return its value.
The interrupts register only exists on KeyLargo, but it doesn't
hurt to have it. The lack of this register causes MacOS X to
hang under some circumstances.
Both are 32-bit only. The HW might support smaller access sizes
but no known OS uses them.
Because the core IDE subsystem doesn't provide us with a way
to query the main (level) interrupt state, nor do we have a way
to know that DBDMA issued an (edge) interrupt, we reflect both
through a private pair of qirq's in order to maintain the
register state.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This completely reworks the handling of the control register
according to my understanding of the HW and the spec.
It should (hopefully ... still testing) fix a number of issues,
most notably cases of MacOS hanging.
Also update dbdma_unassigned_rw() and dbdma_unassigned_flush() to
have the expected behaviour now that flush is handled slightly
differently.
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
These registers are present in 440 SoCs (and maybe in others too) and
U-Boot accesses them when printing register info. We don't emulate
these but add them to avoid crashing when they are read or written.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The following capabilities are VM specific:
- KVM_CAP_PPC_SMT_POSSIBLE
- KVM_CAP_PPC_HTAB_FD
- KVM_CAP_PPC_ALLOC_HTAB
If both KVM HV and KVM PR are present, checking them always returns
the HV value, even if we explicitly requested to use PR.
This has no visible effect for KVM_CAP_PPC_ALLOC_HTAB, because we also
try the KVM_PPC_ALLOCATE_HTAB ioctl, which is only supported by HV. As
a consequence, the spapr code doesn't even check KVM_CAP_PPC_HTAB_FD.
However, this will cause kvmppc_hint_smt_possible(), introduced by
commit fa98fbfcdf, to report several VSMT modes (eg, Available
VSMT modes: 8 4 2 1) whereas PR only supports mode 1.
This patch fixes all three anyway to use kvm_vm_check_extension(). It
is okay since the VM is already created at the time kvm_arch_init() or
kvmppc_reset_htab() is called.
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
BQL bug fix
# gpg: Signature made Mon 25 Sep 2017 23:14:48 BST
# gpg: using RSA key 0x64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth/tags/pull-tcg-20170925:
accel/tcg/cputlb: avoid recursive BQL (fixes#1706296)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This patch adds shrinking of the image file for qcow2. As a result, it allows
us to reduce the virtual image size and free up space on the disk without
copying the image. The image can be fragmented, and shrinking is done by
punching holes in the image file.
Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20170918124230.8152-4-pbutsykin@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
Whenever l2/refcount table clusters are discarded from the file, we can
automatically drop the now-unnecessary content of the cache tables. This
reduces the chance of evicting useful cache data and eliminates
inconsistencies between the cache and the data in the file.
Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20170918124230.8152-3-pbutsykin@virtuozzo.com
Signed-off-by: Max Reitz <mreitz@redhat.com>
The flag is an additional precaution against data loss. Perhaps in the future
shrinking without this flag will be blocked for all formats, but for now
we need to maintain compatibility with raw.
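Illustrative usage (image name and size made up):

$ qemu-img resize --shrink test.qcow2 4G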
Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 20170918124230.8152-2-pbutsykin@virtuozzo.com
[mreitz: Added a missing space to a warning]
Signed-off-by: Max Reitz <mreitz@redhat.com>
Migration capabilities should be enabled on both source and
destination qemu processes.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This involves a temporary read-write reopen if the backing file link in
the middle of a backing file chain should be changed and is therefore a
good test for the latest bdrv_reopen() vs. op blockers fixes.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If we switch between read-only and read-write, the permissions that
image format drivers need on bs->file change, too. Make sure to update
the permissions during bdrv_reopen().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
We will calculate the required new permissions in the prepare stage of a
reopen. Required permissions of children can be influenced by the
changes made to their parents, but parents are independent from their
children. This means that permissions need to be calculated top-down. In
order to achieve this, queue parents before their children rather than
queuing the children first.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
When new permissions are calculated during bdrv_reopen(), they need to
be based on the state of the graph as it will be after the reopen has
completed, not on the current state of the involved nodes.
This patch makes bdrv_is_writable() optionally accept a BlockReopenQueue
from which the new flags are taken. This is then used for determining
the new bs->file permissions of format drivers as soon as we add the
code to actually pass a non-NULL reopen queue to the .bdrv_child_perm
callbacks.
While moving bdrv_is_writable(), make it static. It isn't used outside
block.c.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
In the context of bdrv_reopen(), we'll have to look at the state of the
graph as it will be after the reopen. This interface addition is in
preparation for the change.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
In the context of bdrv_reopen(), we'll have to look at the state of the
graph as it will be after the reopen. This interface addition is in
preparation for the change.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
qemu-io provides a 'reopen' command that allows switching from writable
to read-only access. We need to make sure that we don't try to keep
write permissions to a BlockBackend that becomes read-only, otherwise
things are going to fail.
This requires a bdrv_drain() call because otherwise in-flight AIO
write requests could issue new internal requests while the permission
has already gone away, which would cause assertion failures. Draining
the queue doesn't break AIO requests in any new way, bdrv_reopen() would
drain it anyway only a few lines later.
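Roughly the ordering described above (illustrative, not the exact qemu-io
code; bs and blk stand for qemu-io's block handles):

    bdrv_drain(bs);       /* no in-flight writes may remain past this point */

    /* give up write/resize permissions before the read-only reopen */
    blk_set_perm(blk, BLK_PERM_CONSISTENT_READ, BLK_PERM_ALL, &error_abort);

    /* ... the read-only bdrv_reopen() follows ... */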
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Remove the unnecessary home-grown redefinition of the assert() macro here,
and remove the unusable debug code at the end of the checkpoint() function.
The code there uses assert() with side-effects (assignment to the "mapping"
variable), which should be avoided. Looking more closely, it apparently is
also only usable for one specific directory layout (with a file
named USB.H in it) and thus is of no use for the rest of the world.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
RestartData is the opaque data of the throttle_group_restart_queue_entry
coroutine. By being stack allocated, it isn't available anymore if
aio_co_enter schedules the coroutine with a bottom half and runs after
throttle_group_restart_queue returns.
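Sketched fix, following the description above (field names assumed):

    RestartData *rd = g_new0(RestartData, 1);   /* heap, not stack */

    rd->blk = blk;
    rd->is_write = is_write;
    co = qemu_coroutine_create(throttle_group_restart_queue_entry, rd);
    aio_co_enter(blk_get_aio_context(blk), co);
    /* throttle_group_restart_queue_entry() frees rd when it is done */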
Cc: qemu-stable@nongnu.org
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
If bkt->max == 0 and bkt->burst_length > 1 then we could have a
division by 0 in throttle_do_compute_wait(). That configuration is
however not permitted and is already detected by throttle_is_valid(),
but let's assert it in throttle_compute_wait() to make it explicit.
Found by Coverity (CID: 1381016).
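The intent, sketched (not the exact upstream diff):

    /* throttle_is_valid() rejects max == 0 together with burst_length > 1,
     * so make the invariant explicit before max is used as a divisor */
    assert(bkt->max > 0 || bkt->burst_length <= 1);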
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The default cpu model on s390x does not provide zPCI, which is
not yet wired up on tcg. Moreover, virtio-ccw is the standard
on s390x.
Using virtio-scsi will implicitly pick the right device, so just
switch to that for simplicity.
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The default cpu model on s390x does not provide zPCI, which is
not yet wired up on tcg. Moreover, virtio-ccw is the standard
on s390x, so use the -ccw instead of the -pci versions of virtio
devices on s390x.
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The default cpu model on s390x does not provide zPCI, which is
not yet wired up on tcg. Moreover, virtio-ccw is the standard
on s390x, so use the -ccw instead of the -pci versions of virtio
devices on s390x.
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: QingFeng Hao <haoqf@linux.vnet.ibm.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Block driver documentation is available in qemu-doc.html. It would be
convenient to have documentation for formats, protocols, and filter
drivers in a man page.
Extract the relevant part of qemu-doc.html into a new file called
docs/qemu-block-drivers.texi. This file can also be built as a
stand-alone document (man, html, etc).
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
People get surprised when, after "qemu-img create -f raw /dev/sdX", they
still see qcow2 with "qemu-img info", if previously the bdev had a qcow2
header. While this is natural because raw doesn't need to write any
magic bytes during creation, hdev_create is free to clear out the first
sector to make sure the stale qcow2 header doesn't cause such confusion.
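A sketch of what "clear out the first sector" amounts to during creation
(helper usage illustrative):

    /* wipe any stale format header (e.g. qcow2 magic) on the block device */
    ret = blk_pwrite_zeroes(blk, 0, BDRV_SECTOR_SIZE, 0);
    if (ret < 0) {
        error_setg_errno(errp, -ret, "Could not clear the first sector");
        goto out;
    }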
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It's not too surprising when a user specifies the backing file relative
to the current working directory instead of the top layer image. This
causes an error when they differ. Though the error message has enough
information to infer the fact about the misunderstanding, it is better
if we document this explicitly, so that users don't have to learn from
mistakes.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
A basic set of qemu options is initialised in ./common:
export QEMU_OPTIONS="-nodefaults -machine accel=qtest"
However, two test cases (172 and 186) overwrite QEMU_OPTIONS and neglect
to manually set '-machine accel=qtest'. Add the missing option for 172.
186 probably only copied the code from 172, it doesn't actually need to
overwrite QEMU_OPTIONS, so remove that in 186.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Tested-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Add a firmware path config option to configure. Multiple directories
are accepted, with the usual colon as separator. Default value is
${prefix}/share/qemu-firmware. The path is searched in addition to the
current search path (typically ${prefix}/share/qemu).
This prepares qemu for the planned split of the prebuilt firmware blobs
into a separate project.
Distributions can also use this to get rid of the firmware symlink farm
and add -- for example -- /usr/share/seabios to the firmware path
instead.
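Illustrative configure invocation (paths made up; the option spelling is an
assumption based on the description above):

$ ./configure --firmwarepath=/usr/share/qemu-firmware:/usr/share/seabios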
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170914114236.25343-3-kraxel@redhat.com
Add a helper function to add a directory to the qemu search path, so we
don't duplicate the checks. Add a check for duplicate entries, so we
stop trying to open files twice.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170914114236.25343-2-kraxel@redhat.com
QEMU currently aborts if you try to use the device at the command
line:
$ ppc64-softmmu/qemu-system-ppc64 -S -machine prep -device pc87312
Unexpected error in qemu_chr_fe_init() at chardev/char-fe.c:222:
qemu-system-ppc64: -device pc87312: Device 'parallel0' is in use
Aborted (core dumped)
It uses parallel_hds in its realize function, so it cannot be
instantiated by the user again.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This is required to be removed on SmartOS (Illumos).
As of now there are no alternative supported SunOS distributions.
Signed-off-by: Kamil Rytarowski <n54@gmx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
If QEMU has been compiled with the flags --enable-tcg-interpreter and
--enable-debug, the guest is running incredibly slow. The pxe boot test
can take up to 400 seconds when testing the pseries ppc64 machine. While
we should still look for ways to speed up the test on the pseries machine,
it's better to increase the timeout in this test to 600 seconds anyway, to
allow the test to pass successfully even with this unusual configuration.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
If 'bs' is a complex expression, we were only casting the front half
rather than the full expression. Luckily, none of the callers were
passing bad arguments, but it's better to be robust up front.
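For illustration, here is a minimal standalone example (hypothetical
macros, not the QEMU code) of how a cast applied to an unparenthesized
macro argument binds only to the first operand of a complex expression:
    #include <stdio.h>

    #define AS_LONG_BAD(x)  ((long) x)     /* cast binds only to the front half */
    #define AS_LONG_GOOD(x) ((long) (x))   /* cast applies to the whole argument */

    int main(void)
    {
        /* ((long) 1 + 2.5) is the double 3.5; ((long) (1 + 2.5)) is the long 3 */
        printf("%f\n", (double) AS_LONG_BAD(1 + 2.5));   /* 3.500000 */
        printf("%f\n", (double) AS_LONG_GOOD(1 + 2.5));  /* 3.000000 */
        return 0;
    }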
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The virtio-gpu-pci device is already in the display category, so the
virtio-gpu-device should be there, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
When using bit-wise operations that exploit the power-of-two
nature of the second argument of ROUND_UP(), we still need to
ensure that the mask is as wide as the first argument (done
by using a ternary to force proper arithmetic promotion).
Unpatched, ROUND_UP(2ULL*1024*1024*1024*1024, 512U) produces 0,
instead of the intended 2TiB, because negation of an unsigned
32-bit quantity followed by widening to 64-bits does not
sign-extend the mask.
Broken since its introduction in commit 292c8e50 (v1.5.0).
Callers that passed the same width type to both macro parameters,
or that had other code to ensure the first parameter's maximum
runtime value did not exceed the second parameter's width, are
unaffected, but I did not audit to see which (if any) existing
clients of the macro could trigger incorrect behavior (I found
the bug while adding a new use of the macro).
While preparing the patch, checkpatch complained about poor
spacing, so I also fixed that here and in the nearby DIV_ROUND_UP.
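As a rough sketch of the problem and of the ternary-promotion fix
described above (simplified macros, not the exact qemu/osdep.h text):
    #include <stdio.h>

    /* old behaviour: -(d) is computed in the (32-bit) type of d and only
     * then zero-extended, so the high 32 bits of the mask are lost */
    #define ROUND_UP_OLD(n, d) (((n) + (d) - 1) & -(d))

    /* fix: the ternary forces promotion to the common type of n and d,
     * so the mask is as wide as the first argument */
    #define ROUND_UP_NEW(n, d) (((n) + (d) - 1) & -(0 ? (n) : (d)))

    int main(void)
    {
        unsigned long long n = 2ULL * 1024 * 1024 * 1024 * 1024;   /* 2 TiB */
        printf("old: %llu\n", (unsigned long long) ROUND_UP_OLD(n, 512U)); /* 0 */
        printf("new: %llu\n", (unsigned long long) ROUND_UP_NEW(n, 512U)); /* 2 TiB */
        return 0;
    }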
CC: qemu-trivial@nongnu.org
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Instead of using the hardcoded (MemTxAttrs){0} for no memory attributes,
let's use the already defined MEMTXATTRS_UNSPECIFIED macro.
This is technically a change of behaviour as MEMTXATTRS_UNSPECIFIED sets
the unspecified field to 1, but it doesn't look like anything is
checking this field.
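A self-contained sketch of the difference (the struct layout here is an
assumption for illustration only; the real definitions live in
include/exec/memattrs.h):
    #include <stdio.h>

    typedef struct MemTxAttrs {
        unsigned int unspecified : 1;
        /* other attribute bits omitted */
    } MemTxAttrs;

    #define MEMTXATTRS_UNSPECIFIED ((MemTxAttrs) { .unspecified = 1 })

    int main(void)
    {
        MemTxAttrs zeroed = (MemTxAttrs){0};          /* old style */
        MemTxAttrs unspec = MEMTXATTRS_UNSPECIFIED;   /* new style */
        /* the only difference: the unspecified bit is now set */
        printf("%u %u\n", zeroed.unspecified, unspec.unspecified);   /* 0 1 */
        return 0;
    }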
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Acked-by: Max Filippov <jcmvbkbc@gmail.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The error path of baum_chr_open needs to set brlapi to NULL, so it won't
get released twice in char_braille_finalize, which would otherwise cause:
"/usr/bin/qemu-system-x86_64: double free or corruption (!prev)"
Signed-off-by: Liang Yan <lyan@suse.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Remove trailing whitespace in qemu-options documentation, as it causes
reproducibility issues depending on the echo implementation used by
the Makefile.
Reported-By: Vagrant Cascadian <vagrant@debian.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
It may be better to add a trace event to monitor the last moment of
a key event on its way from QEMU to the guest VM.
Signed-off-by: Liang Yan <lyan@suse.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This device is private and is created once per aux-bus.
So don't allow the user to create one from command-line.
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: KONRAD Frederic <frederic.konrad@adacore.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
In qemu-thread-posix.c we have two implementations of the
various qemu_sem_* functions, one of which uses native POSIX
sem_* and the other of which emulates them with pthread conditions.
This is necessary because not all our host OSes support
sem_timedwait().
Instead of a hard-coded list of OSes which don't implement
sem_timedwait(), which gets out of date, make configure
test for the presence of the function and set a new
CONFIG_HAVE_SEM_TIMEDWAIT appropriately.
In particular, newer NetBSDs have sem_timedwait(), so this
commit will switch them over to using it. OSX still does
not have an implementation.
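A feature test of this kind typically just compiles and links a small
program against the function; a hypothetical probe could look like this
(not necessarily the exact test added to configure):
    #include <semaphore.h>
    #include <time.h>

    int main(void)
    {
        sem_t s;
        struct timespec t = { 0, 0 };
        sem_init(&s, 0, 0);
        return sem_timedwait(&s, &t);
    }
If this compiles and links, configure can set CONFIG_HAVE_SEM_TIMEDWAIT.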
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Kamil Rytarowski <n54@gmx.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This change fixes conflict with the DragonFly BSD headers.
Signed-off-by: Kamil Rytarowski <n54@gmx.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
If we are woken up from the while() loop in nbd_read_reply_entry,
the handles must be equal. If we are woken up by
nbd_recv_coroutines_wake_all, s->quit must be true, so we do
not need to check handle equality.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170920124507.18841-3-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
If 'bs' is a complex expression, we were only casting the front half
rather than the full expression. Luckily, none of the callers were
passing bad arguments, but it's better to be robust up front.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170918214649.17550-1-eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
NULL sockets are used for NDP, BOOTP, and other critical operations.
If the topmost mbuf in a NULL session is blocked pending resolution,
it may cause problems if it blocks other packets with a NULL socket.
So do not add mbufs with a NULL socket field to the same session.
Signed-off-by: Kevin Cernekee <cernekee@chromium.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
if_output() originally sent one mbuf per call and used the slirp->next_m
variable to keep track of where it left off. But nowadays it tries to
send all of the mbufs from the fastq, and one mbuf from each session on
the batchq. The next_m variable is both redundant and harmful: there is
a case[0] involving delayed packets in which next_m ends up pointing
to &slirp->if_batchq when an active session still exists, and this
blocks all traffic for that session until qemu is restarted.
The test case was created to reproduce a problem that was seen on
long-running Chromium OS VM tests[1] which rapidly create and
destroy ssh connections through hostfwd.
[0] https://pastebin.com/NNy6LreF
[1] https://bugs.chromium.org/p/chromium/issues/detail?id=766323
Signed-off-by: Kevin Cernekee <cernekee@chromium.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
e.g.
./x86_64-softmmu/qemu-system-x86_64 -nographic -netdev 'user,id=vnet,hostfwd=:555.0.0.0:0-:22'
qemu-system-x86_64: -netdev user,id=vnet,hostfwd=:555.0.0.0:0-:22: Invalid host forwarding rule ':555.0.0.0:0-:22' (Bad host address)
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Now that we have a per-chardev cache for the context, we don't need this
parameter to be passed in every time chr_update_read_handler() is
called. As long as we call chr_update_read_handler() via
qemu_chr_be_update_read_handlers() we'll be fine.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1505975754-21555-5-git-send-email-peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It was only passed in by chr_update_read_handlers(). However, on
reconnect we'll lose that context information. So if a chardev was
running on another context (rather than the default context, the NULL
pointer), it would switch back to the default context when a
reconnection happens. But it should really stick to the old context.
Convert all the callers of io_add_watch_poll() to use the internally
cached gcontext. Then the context can survive even across
reconnections.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1505975754-21555-4-git-send-email-peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It caches the gcontext that is used to poll the chardev IO. Before this
patch, we only passed it in via chr_update_read_handlers(). However
that may not be enough if the char backend is disconnected and
reconnected afterwards. There is chardev code that still assumes the
context is NULL (which is the main context); that will be fixed up in
follow-up patches.
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1505975754-21555-3-git-send-email-peterx@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Proper support of persistent reservation for multipath devices requires
communication with the multipath daemon, so that the reservation is
registered and applied when a path comes up. The device mapper
utilities provide a library to do so; this patch makes qemu-pr-helper.c
detect multipath devices and, when one is found, delegate the operation
to libmpathpersist.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Introduce a privileged helper to run persistent reservation commands.
This lets virtual machines send persistent reservations without using
CAP_SYS_RAWIO or out-of-tree patches. The helper uses Unix permissions
and SCM_RIGHTS to restrict access to processes that can access its socket
and prove that they have an open file descriptor for a raw SCSI device.
The next patch will also correct the usage of persistent reservations
with multipath devices.
It would also be possible to add support for Linux's IOC_PR_* ioctls in
the future, to support NVMe devices. For now, however, only SCSI is
supported.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Cleber and I are volunteering to review and queue patches for the
Python scripts and modules in scripts/.
I'm setting "M: Odd fixes" because not all scripts are actively
maintained.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170915230744.22942-1-ehabkost@redhat.com>
[ehabkost: add tests/*.py too]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Not all scripts using qemu.py configure the Python logging
module, and end up generating a "No handlers could be found for
logger" message instead of actual log messages.
To avoid requiring every script using qemu.py to configure
logging manually, call basicConfig() when creating a QEMUMachine
object. This won't affect scripts that already set up logging,
but will ensure that scripts that don't configure logging keep
working.
Reported-by: Kevin Wolf <kwolf@redhat.com>
Fixes: 4738b0a85a
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170921162234.847-1-ehabkost@redhat.com>
Reviewed-by: Cleber Rosa <crosa@redhat.com>
Acked-by: Lukáš Doktor <ldoktor@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This modification is necessary for userfault fd features which have to
be requested from userspace.
UFFD_FEATURE_THREAD_ID is one such "on demand" feature, and it will
be introduced in the next patch.
QEMU has to use a separate userfault file descriptor, because the
userfault context has internal state: after the first UFFD_API ioctl
succeeds it changes its state to UFFD_STATE_RUNNING, while the kernel
expects UFFD_STATE_WAIT_API when handling the UFFD_API ioctl.
So only one UFFD_API ioctl is possible per ufd.
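A minimal Linux-only sketch of the flow being described, requesting
features on a fresh ufd with a single UFFD_API ioctl (error handling
trimmed; which feature bits come back depends on the running kernel):
    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <linux/userfaultfd.h>
    #include <stdio.h>
    #include <sys/ioctl.h>
    #include <sys/syscall.h>
    #include <unistd.h>

    int main(void)
    {
        int ufd = syscall(__NR_userfaultfd, O_CLOEXEC | O_NONBLOCK);
        if (ufd < 0) {
            perror("userfaultfd");
            return 1;
        }
        /* the first (and only possible) UFFD_API ioctl on this ufd: it both
         * negotiates features and moves the context to UFFD_STATE_RUNNING */
        struct uffdio_api api = { .api = UFFD_API, .features = 0 };
        if (ioctl(ufd, UFFD_API, &api) < 0) {
            perror("UFFD_API");
            return 1;
        }
        printf("kernel supports features: 0x%llx\n",
               (unsigned long long) api.features);
        close(ufd);
        return 0;
    }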
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
That tiny refactoring is necessary to be able to set
UFFD_FEATURE_THREAD_ID while requesting features, and then
to create the downtime context in case the kernel supports it.
Signed-off-by: Alexey Perevalov <a.perevalov@samsung.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Fill the postcopy-able pending only if RAM postcopy is enabled.
This is necessary because there will be other postcopy-able states, and
when RAM postcopy is disabled it should not spoil the common postcopy
related pending.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Currently postcopy-able states are recognized by a non-NULL
save_live_complete_postcopy handler. But when we have several different
postcopy-able states, that is not convenient: RAM postcopy may be
disabled while some other postcopy is enabled, and in this case the RAM
state should behave as if it is not postcopy-able.
This patch adds a separate has_postcopy handler to specify the behaviour
of a savevm state.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Provide helpers to convert bitmaps to little-endian format. They can be
used when we want to send a bitmap over the network to another host.
One thing to mention is that these helpers only solve the problem of
endianness; they do not solve the problem of different word sizes across
machines (bitmaps managing the same number of bits may have different
malloced sizes). So we need to take care of the size alignment issue
in the callers for now.
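A minimal sketch of the idea (not the actual helpers): serialising a host
bitmap of unsigned longs into a little-endian byte stream, deliberately
leaving the word-size caveat above to the caller:
    #include <stdint.h>
    #include <stdio.h>

    static void bitmap_to_le_bytes(uint8_t *out, const unsigned long *in,
                                   size_t nwords)
    {
        for (size_t i = 0; i < nwords; i++) {
            for (size_t b = 0; b < sizeof(unsigned long); b++) {
                out[i * sizeof(unsigned long) + b] = (uint8_t)(in[i] >> (8 * b));
            }
        }
    }

    int main(void)
    {
        unsigned long bm[1] = { 0 };
        bm[0] |= 1UL << 0;   /* bit 0 */
        bm[0] |= 1UL << 9;   /* bit 9 */
        uint8_t wire[sizeof(bm)];
        bitmap_to_le_bytes(wire, bm, 1);
        printf("%02x %02x\n", wire[0], wire[1]);   /* 01 02 on any host */
        return 0;
    }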
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Creation of the threads, nothing inside yet.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
--
Use pointers instead of long array names
Move to use semaphores instead of conditions, as per Paolo's suggestion
Put all the state inside one struct.
Use a counter for the number of threads created. Needed during cancellation.
Add error return to thread creation
Add id field
Rename functions to multifd_save/load_setup/cleanup
Change recv parameters to a pointer to struct
Change back to a struct
Use Error * for _cleanup
Indicates how many pages we are going to send in each batch to a multifd
thread.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Be consistent with defaults and documentation
Use new DEFINE_PROP_*
Rename x-multifd-group to x-multifd-page-count
Indicates the number of channels that we will create. By default we
create 2 channels.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
--
Catch inconsistent defaults (eric).
Improve comment stating that the number of threads is the same as the number
of sockets
Use new DEFINE_PROP_*
Rename x-multifd-threads to x-multifd-channels
This function allows us to decide when to close the listener socket.
For now, we only need one connection.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
As this is only defined since glib 2.32, add compatibility macros for older glibs.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
We pass the ioc instead of the fd. This will allow us to have more
than one channel open. We also make sure that we set the
from_src_file sooner, so we don't need to pass it as a parameter.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
--
Do not assign mis->from_src_file (peterxu)
# gpg: Signature made Fri 22 Sep 2017 08:28:38 BST
# gpg: using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6
* remotes/famz/tags/build-and-test-automation-pull-request: (36 commits)
docker: Drop 'set -e' from run script
docker: Use archive-source.py
tests: Add README for vm tests
MAINTAINERS: Add tests/vm entry
Makefile: Add rules to run vm tests
tests: Add OpenBSD image
tests: Add NetBSD image
tests: Add FreeBSD image
tests: Add ubuntu.i386 image
tests: Add vm test lib
tests: Add a test key pair
scripts: Add archive-source.sh
qemu.py: Add "wait()" method
gitignore: Ignore vm test images
MAINTAINERS: Fix subsystem name for "Build and test automation"
buildsys: Move rdma libs to per object
buildsys: Move brlapi libs to per object
buildsys: Move usb redir cflags/libs to per object
buildsys: Move libusb cflags/libs to per object
buildsys: Move libcacard cflags/libs to per object
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The only prototype doesn't need anything from the lib header, and not
including it here allows files that include this header, for example
vl.c, to compile without the libseccomp cflags.
The breakage exists since c3883e1f93 for environments where "pkg-config
--cflags libseccomp" is non-empty.
Reported-by: Jan Kiszka <jan.kiszka@siemens.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Acked-by: Eduardo Otubo <otubo@redhat.com>
Message-id: 20170920083647.14599-1-famz@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The migration interface for ais was introduced with kernel 4.13
but the capability itself had been active since 4.12. As migration
support is considered necessary, let's disable ais in the 2.10
stable version. A proper fix and re-enablement will be done
for qemu 2.11.
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Message-Id: <20170921140834.14233-2-borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
This is the common code to implement a "VM test" to
1) Download and initialize a pre-defined VM that has necessary
dependencies to build QEMU and SSH access.
2) Archive $SRC_PATH to a .tar file.
3) Boot the VM, and pass the source tar file to the guest.
4) SSH into the VM, untar the source tarball, build from the source.
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
The subsystem name for the "Build test automation" section is
"-------------------------", because an actual subsystem name
line is missing:
$ ./scripts/get_maintainer.pl -f tests/docker/docker.py
"Alex Bennée" <alex.bennee@linaro.org> (maintainer:-----------------...)
Fam Zheng <famz@redhat.com> (maintainer:-----------------...)
"Philippe Mathieu-Daudé" <f4bug@amsat.org> (reviewer:-----------------...)
qemu-devel@nongnu.org (open list:-----------------...)
Fix the issue by inserting a subsystem name line where
get_maintainer.pl expects it.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170921170209.9101-1-ehabkost@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
The 'run' script already creates src, build and install directories under
$TEST_DIR; use this in common.rc.
Also the tests always run from $QEMU_SRC/tests/docker, so use a relative
$CMD string.
Message-Id: <20170817035721.11064-1-famz@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
If you invoke the build with NOCACHE=1 we pass --no-cache in the argv to
docker.py, but that may still not force a rebuild if the dockerfile checksum
hasn't changed. By testing for the presence of --no-cache we can force
rebuilds without having to manually remove the docker image.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Message-Id: <20170725133425.436-5-alex.bennee@linaro.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
It is a common requirement for virtual machines to send persistent
reservations, but this currently requires either running QEMU with
CAP_SYS_RAWIO, or using out-of-tree patches that let an unprivileged
QEMU bypass Linux's filter on SG_IO commands.
As an alternative mechanism, the next patches will introduce a
privileged helper to run persistent reservation commands without
expanding QEMU's attack surface unnecessarily.
The helper is invoked through a "pr-manager" QOM object, to which
file-posix.c passes SG_IO requests for PERSISTENT RESERVE OUT and
PERSISTENT RESERVE IN commands. For example:
$ qemu-system-x86_64
-device virtio-scsi \
-object pr-manager-helper,id=helper0,path=/var/run/qemu-pr-helper.sock
-drive if=none,id=hd,driver=raw,file.filename=/dev/sdb,file.pr-manager=helper0
-device scsi-block,drive=hd
or:
$ qemu-system-x86_64
-device virtio-scsi \
-object pr-manager-helper,id=helper0,path=/var/run/qemu-pr-helper.sock
-blockdev node-name=hd,driver=raw,file.driver=host_device,file.filename=/dev/sdb,file.pr-manager=helper0
-device scsi-block,drive=hd
Multiple pr-manager implementations are conceivable and possible, though
only one is implemented right now. For example, a pr-manager could:
- talk directly to the multipath daemon from a privileged QEMU
(i.e. QEMU links to libmpathpersist); this makes reservation work
properly with multipath, but still requires CAP_SYS_RAWIO
- use the Linux IOC_PR_* ioctls (they require CAP_SYS_ADMIN though)
- more interestingly, implement reservations directly in QEMU
through file system locks or a shared database (e.g. sqlite)
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This shares a cached empty FlatView among address spaces. The empty
FV is used whenever a root MR renders into a FV without memory
sections, which happens when the MR or its children are not enabled or
are zero-sized. The empty_view is not NULL in order to keep the rest of
the memory API intact; it also has a dispatch tree for the same reason.
On a POWER8 guest with 255 CPUs, 255 virtio-net devices and 40 PCI bridges,
this halves the number of FlatViews in use (557 -> 260) and of dispatch
tables (~800000 -> ~370000). In an unrelated experiment with 112 non-virtio
devices on x86 ("-M pc"), only 4 FlatViews are alive, yet about ~2000
are created at startup.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-16-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
A container can be used instead of an alias to allow switching between
multiple subregions. In this case we cannot directly share the
subregions (since they only belong to a single parent), but if the
subregions are aliases we can in turn walk those.
This is not enough to remove all source of quadratic FlatView creation,
but it enables sharing of the PCI bus master FlatViews (and their
AddressSpaceDispatch structures) across all PCI devices. For 112
virtio-net-pci devices, boot time is reduced from 25 to 10 seconds and
memory consumption from 1.4 to 1 G.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This avoids the usual memory_region_transaction_commit() which rebuilds
all FVs.
On a POWER8 guest with 255 CPUs, 255 virtio-net devices and 40 PCI bridges,
this brings down the boot time from 25s to 20s and reduces the number of
temporary FVs allocated during machine construction (~800000 -> ~640000)
and of temporary dispatch trees (~370000 -> ~300000); the total memory
footprint goes down (18G -> 17G).
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-18-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since FlatViews are now shared and ASes are not, this gets rid of
address_space_init_shareable().
This should cause no behavioural change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-17-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This adds a new "-d" switch to "info mtree" to print dispatch tree
internals.
This changes the way "-f" is handled - it now prints flat views and
the associated address spaces.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-15-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This allows sharing flat views between address spaces (AS) when
the same root memory region is used to create a new address space.
This is done by walking through all ASes and caching one FlatView per
physical root MR (i.e. not aliased).
This removes the search for duplicates from address_space_init_shareable()
as FlatViews are now shared elsewhere, and keeping as::ref_count correct
seems an unnecessary and useless complication.
This should cause no change in memory use or boot time yet.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-13-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Address spaces get to keep a root MR (alias or not) but FlatView stores
the actual MR as this is going to be used later on to decide whether to
share a particular FlatView or not.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-10-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We store AddressSpaceDispatch* in FlatView anyway so there is no need
to carry it from mem_add() to register_subpage/register_multipage.
This should cause no behavioural change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-8-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The AS in the ASD is only used to pass the AS from mem_begin() to
register_subpage() to store it in the MemoryRegionSection; we can do this
directly now.
This should cause no behavioural change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-6-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As we are going to share FlatViews between AddressSpaces,
and AddressSpaceDispatch is a structure to perform quick lookups
in a FlatView, this moves the ASD into the FlatView.
After the previously open coded ASD rendering, we can also remove
as->next_dispatch, as the new FlatView pointer is stored
on the stack and set on an AS atomically.
flatview_destroy() is now executed under RCU instead of
address_space_dispatch_free().
This makes mem_begin/mem_commit work with the ASD and mem_add with the FV,
as later on mem_add will take the FV as an argument anyway.
This should cause no behavioural change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-5-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This moves FlatView allocation and initialization to a helper.
While we are here, replace g_new with g_new0 so we do not have to bother
if we add new fields in the future.
This should cause no behavioural change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-4-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We are going to share FlatViews between AddressSpaces, and per-AS
memory listeners won't suit the purpose anymore, so open code
the dispatch tree rendering.
Since there is a good chance that dispatch_listener was the only
listener, this avoids address_space_update_topology_pass() if there are
no registered listeners; this should improve startup time.
This should cause no behavioural change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-3-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This adds an AS** parameter to address_space_do_translate()
to make it easier for the next patch to share FlatViews.
This should cause no behavioural change.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-Id: <20170921085110.25598-2-aik@ozlabs.ru>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It's possible for address_space_get_flatview() as it currently stands
to cause a use-after-free for the returned FlatView, if the reference
count is incremented after the FlatView has been replaced by a writer:
   thread 1                thread 2                RCU thread
   -------------------------------------------------------------
   rcu_read_lock
   read as->current_map
                           set as->current_map
                           flatview_unref
                              '--> call_rcu
   flatview_ref
     [ref=1]
   rcu_read_unlock
                                                   flatview_destroy
   <badness>
Since FlatViews are not updated very often, we can just detect the
situation using a new atomic op atomic_fetch_inc_nonzero, similar to
Linux's atomic_inc_not_zero, which performs the refcount increment only if
it hasn't already hit zero. This is similar to Linux commit de09a9771a53
("CRED: Fix get_task_cred() and task_state() to not resurrect dead
credentials", 2010-07-29).
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
target-arm queue:
* more preparatory work for v8M support
* convert some omap devices away from old_mmio
* remove out of date ARM ARM section references in comments
* add the Smartfusion2 board
# gpg: Signature made Thu 21 Sep 2017 17:40:40 BST
# gpg: using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20170921: (31 commits)
msf2: Add Emcraft's Smartfusion2 SOM kit
msf2: Add Smartfusion2 SoC
msf2: Add Smartfusion2 SPI controller
msf2: Microsemi Smartfusion2 System Register block
msf2: Add Smartfusion2 System timer
hw/arm/omap2.c: Don't use old_mmio
hw/i2c/omap_i2c.c: Don't use old_mmio
hw/timer/omap_gptimer: Don't use old_mmio
hw/timer/omap_synctimer.c: Don't use old_mmio
hw/gpio/omap_gpio.c: Don't use old_mmio
hw/arm/palm.c: Don't use old_mmio for static_ops
target/arm: Remove out of date ARM ARM section references in A64 decoder
nvic: Support banked exceptions in acknowledge and complete
nvic: Make SHCSR banked for v8M
nvic: Make ICSR banked for v8M
target/arm: Handle banking in negative-execution-priority check in cpu_mmu_index()
nvic: Handle v8M changes in nvic_exec_prio()
nvic: Disable the non-secure HardFault if AIRCR.BFHFNMINS is clear
nvic: Implement v8M changes to fixed priority exceptions
nvic: In escalation to HardFault, support HF not being priority -1
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In the A64 decoder, we have a lot of references to section numbers
from version A.a of the v8A ARM ARM (DDI0487). This version of the
document is now long obsolete (we are currently on revision B.a),
and various intervening versions renumbered all the sections.
The most recent B.a version of the document doesn't assign
section numbers at all to the individual instruction classes
in the way that the various A.x versions did. The simplest thing
to do is just to delete all the out of date C.x.x references.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Message-id: 20170915150849.23557-1-peter.maydell@linaro.org
Update armv7m_nvic_acknowledge_irq() and armv7m_nvic_complete_irq()
to handle banked exceptions:
* acknowledge needs to use the correct vector, which may be
in sec_vectors[]
* acknowledge needs to return to its caller whether the
exception should be taken to secure or non-secure state
* complete needs its caller to tell it whether the exception
being completed is a secure one or not
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1505240046-11454-20-git-send-email-peter.maydell@linaro.org
Now that we have a banked FAULTMASK register and banked exceptions,
we can implement the correct check in cpu_mmu_index() for whether
the MPU_CTRL.HFNMIENA bit's effect should apply. This bit causes
handlers which have requested a negative execution priority to run
with the MPU disabled. In v8M the test has to check this for the
current security state and so takes account of banking.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1505240046-11454-17-git-send-email-peter.maydell@linaro.org
Update nvic_exec_prio() to support the v8M changes:
* BASEPRI, FAULTMASK and PRIMASK are all banked
* AIRCR.PRIS can affect NS priorities
* AIRCR.BFHFNMINS affects FAULTMASK behaviour
These changes mean that it's no longer possible to
definitely say that if FAULTMASK is set it overrides
PRIMASK, and if PRIMASK is set it overrides BASEPRI
(since if PRIMASK_NS is set and AIRCR.PRIS is set then
whether that 0x80 priority should take effect or the
priority in BASEPRI_S depends on the value of BASEPRI_S,
for instance). So we switch to the same approach used
by the pseudocode of working through BASEPRI, PRIMASK
and FAULTMASK and overriding the previous values if
needed.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1505240046-11454-16-git-send-email-peter.maydell@linaro.org
If AIRCR.BFHFNMINS is clear, then although NonSecure HardFault
can still be pended via SHCSR.HARDFAULTPENDED it mustn't actually
preempt execution. The simple way to achieve this is to clear the
enable bit for it, since the enable bit isn't guest visible.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1505240046-11454-15-git-send-email-peter.maydell@linaro.org
In v7M, the fixed-priority exceptions are:
Reset: -3
NMI: -2
HardFault: -1
In v8M, this changes because Secure HardFault may need
to be prioritised above NMI:
Reset: -4
Secure HardFault if AIRCR.BFHFNMINS == 1: -3
NMI: -2
Secure HardFault if AIRCR.BFHFNMINS == 0: -1
NonSecure HardFault: -1
Make these changes, including support for changing the
priority of Secure HardFault as AIRCR.BFHFNMINS changes.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1505240046-11454-14-git-send-email-peter.maydell@linaro.org
When escalating to HardFault, we must go into Lockup if we
can't take the synchronous HardFault because the current
execution priority is already at or below the priority of
HardFault. In v7M HF is always priority -1 so a simple < 0
comparison sufficed; in v8M the priority of HardFault can
vary depending on whether it is a Secure or NonSecure
HardFault, so we must check against the priority of the
HardFault exception vector we're about to use.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1505240046-11454-13-git-send-email-peter.maydell@linaro.org
In armv7m_nvic_set_pending() we have to compare the
priority of an exception against the execution priority
to decide whether it needs to be escalated to HardFault.
In the specification this is a comparison against the
exception's group priority; for v7M we implemented it
as a comparison against the raw exception priority
because the two comparisons will always give the
same answer. For v8M the existence of AIRCR.PRIS and
the possibility of different PRIGROUP values for secure
and nonsecure exceptions means we need to explicitly
calculate the vector's group priority for this check.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1505240046-11454-12-git-send-email-peter.maydell@linaro.org
Make the armv7m_nvic_set_pending() and armv7m_nvic_clear_pending()
functions take a bool indicating whether to pend the secure
or non-secure version of a banked interrupt, and update the
callsites accordingly.
In most callsites we can simply pass the correct security
state in; in a couple of cases we use TODO comments to indicate
that we will return the code in a subsequent commit.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1505240046-11454-10-git-send-email-peter.maydell@linaro.org
For v8M, the NVIC has a new set of registers per interrupt,
NVIC_ITNS<n>. These determine whether the interrupt targets Secure
or Non-secure state. Implement the register read/write code for
these, and make them cause NVIC_IABR, NVIC_ICER, NVIC_ISER,
NVIC_ICPR, NVIC_IPR and NVIC_ISPR to RAZ/WI for non-secure
accesses to fields corresponding to interrupts which are
configured to target secure state.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1505240046-11454-8-git-send-email-peter.maydell@linaro.org
The Application Interrupt and Reset Control Register has some changes
for v8M:
* new bits SYSRESETREQS, BFHFNMINS and PRIS: these all have
real state if the security extension is implemented and otherwise
are constant
* the PRIGROUP field is banked between security states
* non-secure code can be blocked from using the SYSRESET bit
to reset the system if SYSRESETREQS is set
Implement the new state and the changes to register read and write.
For the moment we ignore the effects of the secure PRIGROUP.
We will implement the effects of PRIS and BFHFNMINS later.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1505240046-11454-6-git-send-email-peter.maydell@linaro.org
Instead of looking up the pending priority
in nvic_pending_prio(), cache it in a new state struct
field. The calculation of the pending priority given
the interrupt number is more complicated in v8M with
the security extension, so the caching will be worthwhile.
This changes nvic_pending_prio() from returning a full
(group + subpriority) priority value to returning a group
priority. This doesn't require changes to its callsites
because we use it only in comparisons of the form
execution_prio > nvic_pending_prio()
and execution priority is always a group priority, so
a test (exec prio > full prio) is true if and only if
(exec prio > group_prio).
(Architecturally the expected comparison is with the
group priority for this sort of "would we preempt" test;
we were only doing a test with a full priority as an
optimisation to avoid the mask, which is possible
precisely because the two comparisons always give the
same answer.)
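The equivalence can be checked numerically; a small standalone sketch
(illustrative numbers only, not NVIC code) exhaustively verifies that
comparing a group-aligned execution priority against the full priority
and against the group priority gives the same result:
    #include <assert.h>
    #include <stdio.h>

    int main(void)
    {
        const int sub_bits = 3;                    /* e.g. 3 subpriority bits */
        const int sub_mask = (1 << sub_bits) - 1;

        for (int exec = 0; exec < 256; exec += 1 << sub_bits) { /* group-aligned */
            for (int full = 0; full < 256; full++) {
                int group = full & ~sub_mask;
                assert((exec > full) == (exec > group));
            }
        }
        printf("equivalence holds\n");
        return 0;
    }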
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1505240046-11454-5-git-send-email-peter.maydell@linaro.org
With banked exceptions, just the exception number in
s->vectpending is no longer sufficient to uniquely identify
the pending exception. Add a vectpending_is_s_banked bool
which is true if the exception is using the sec_vectors[]
array.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1505240046-11454-4-git-send-email-peter.maydell@linaro.org
For the v8M security extension, some exceptions must be banked
between security states. Add the new vecinfo array which holds
the state for the banked exceptions and migrate it if the
CPU the NVIC is attached to implements the security extension.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
In v8M the MSR and MRS instructions have extra register value
encodings to allow secure code to access the non-secure banked
version of various special registers.
(We don't implement the MSPLIM_NS or PSPLIM_NS aliases, because
we don't currently implement the stack limit registers at all.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1505240046-11454-2-git-send-email-peter.maydell@linaro.org
MIPS patches 2017-09-21
Changes:
QOMify MIPS cpu
Improve macro parenthesization
# gpg: Signature made Thu 21 Sep 2017 13:50:37 BST
# gpg: using RSA key 0x2238EB86D5F797C2
# gpg: Good signature from "Yongbok Kim <yongbok.kim@imgtec.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 8600 4CF5 3415 A5D9 4CFA 2B5C 2238 EB86 D5F7 97C2
* remotes/yongbok/tags/mips-20170921:
mips: Improve macro parenthesization
mips: replace cpu_mips_init() with cpu_generic_init()
mips: MIPSCPU model subclasses
mips: call cpu_mips_realize_env() from mips_cpu_realizefn()
mips: split cpu_mips_realize_env() out of cpu_mips_init()
mips: introduce internal.h and cleanup cpu.h
mips: move hw/mips/cputimer.c to target/mips/
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Although none of the existing macro call-sites were broken,
it's always better to write macros that properly parenthesize
arguments that can be complex expressions, so that the intended
order of operations is not broken.
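A generic standalone illustration of why this matters (hypothetical
macros, not the MIPS ones):
    #include <stdio.h>

    #define TWICE_BAD(x)  (x * 2)        /* argument not parenthesized */
    #define TWICE_GOOD(x) ((x) * 2)

    int main(void)
    {
        printf("%d\n", TWICE_BAD(1 + 2));    /* 1 + 2 * 2 = 5 */
        printf("%d\n", TWICE_GOOD(1 + 2));   /* (1 + 2) * 2 = 6 */
        return 0;
    }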
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Now cpu_mips_init() reimplements a subset of the cpu_generic_init()
tasks, so just drop it and use cpu_generic_init() directly.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
[PMD: use internal.h instead of cpu.h]
Tested-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Register separate QOM types for each mips cpu model,
so it would be possible to reuse generic CPU creation
routines.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
[PMD: use internal.h, use void* to hold cpu_def in MIPSCPUClass,
mark MIPSCPU abstract, address Eduardo Habkost review]
Tested-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
This changes the order between cpu_mips_realize_env() and
cpu_exec_initfn(), but cpu_exec_initfn() doesn't have anything that
depends on cpu_mips_realize_env() being called first.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
This timer is a required part of the MIPS32/MIPS64 System Control coprocessor
(CP0). Moving it with the other architecture related files will allow an opaque
use of CPUMIPSState* in the next commit (introduce "internal.h").
Also remove it from 'user' targets and drop an unnecessary include.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: James Hogan <james.hogan@imgtec.com>
Acked-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
This avoids a name clash with the access macro on windows 64:
make
CHK version_gen.h
CC aarch64-softmmu/memory.o
/home/konrad/qemu/memory.c: In function 'access_with_adjusted_size':
/home/konrad/qemu/memory.c:591:73: error: macro "access" passed 7 arguments, \
but takes just 2
(size - access_size - i) * 8, access_mask, attrs);
^
Signed-off-by: KONRAD Frederic <frederic.konrad@adacore.com>
Message-Id: <1505988260-8483-1-git-send-email-frederic.konrad@adacore.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
pflash toggles mr->romd_mode. So this assert does not always hold.
1) a device was added with !mr->romd_mode, therefore effectively not
creating a kvm slot as we want to trap every access (add = false).
2) mr->romd_mode was toggled on before removing it. There is now
actually no slot to remove and the assert is wrong.
So let's just drop the assert.
Reported-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170920145025.19403-1-david@redhat.com>
Tested-by: Gerd Hoffmann <kraxel@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We should guarantee that RAM will not be modified while the VM is in a
stopped state, otherwise it can lead to negative consequences during
post-copy migration. In the RUN_STATE_FINISH_MIGRATE step, it's expected
that RAM on the source side will not be modified, as that could lead to an
inconsistent VM state on the destination side. Also, RAM access during
postcopy-ram migration with the release-ram capability enabled can lead
to bad consequences.
Let's add an enable_backend() callback to avoid undesirable virtqueue
changes in the guest memory.
Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Message-Id: <20170919120733.22020-1-pbutsykin@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
When an MSI interrupt is bound to a guest using
xc_domain_update_msi_irq (XEN_DOMCTL_bind_pt_irq) the interrupt is
left masked by default.
This causes problems with guests that first configure interrupts and
clear the per-entry MSIX table mask bit and afterwards enable MSIX
globally. In such a scenario the Xen internal msixtbl handlers would not
detect the unmasking of MSIX entries because vectors are not yet
registered since MSIX is not enabled, and vectors would be left
masked.
Introduce a new flag in the gflags field to signal Xen whether an MSI
interrupt should be unmasked after being bound.
This also requires tracking the mask register for MSI interrupts, so
QEMU can also notify Xen whether the MSI interrupt should be bound
masked or unmasked.
Signed-off-by: Roger Pau Monné <roger.pau@citrix.com>
Reviewed-by: Jan Beulich <jbeulich@suse.com>
Reported-by: Andreas Kinzler <hfp@posteo.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
g_malloc0_n is only available since glib 2.24. To allow building with older
glib versions, use the generic g_new0, which is already used in many other
places in the code.
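For illustration, a small standalone example of the replacement (the
variable and count here are hypothetical, not the xen-disk code):
    #include <glib.h>

    int main(void)
    {
        /* before: ring = g_malloc0_n(nr_pages, sizeof(*ring));  (glib >= 2.24) */
        guint64 *ring = g_new0(guint64, 8);   /* works on older glib too */
        g_assert(ring[0] == 0);
        g_free(ring);
        return 0;
    }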
Fixes commit 3284fad728 ("xen-disk: add support for multi-page shared rings")
Signed-off-by: Olaf Hering <olaf@aepfle.de>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
These patches fix regressions in 2.10
# gpg: Signature made Wed 20 Sep 2017 07:51:07 BST
# gpg: using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg: aka "Greg Kurz <groug@free.fr>"
# gpg: aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg: aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg: aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894 DBA2 02FC 3AEB 0101 DBC2
* remotes/gkurz/tags/for-upstream:
9pfs: check the size of transport buffer before marshaling
9pfs: fix name_to_path assertion in v9fs_complete_rename()
9pfs: fix readdir() for 9p2000.u
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Machine/CPU/NUMA queue, 2017-09-19
# gpg: Signature made Tue 19 Sep 2017 21:17:01 BST
# gpg: using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/machine-next-pull-request:
MAINTAINERS: Update git URLs for my trees
hw/acpi-build: Fix SRAT memory building in case of node 0 without RAM
NUMA: Replace MAX_NODES with nb_numa_nodes in for loop
numa: cpu: calculate/set default node-ids after all -numa CLI options are parsed
arm: drop intermediate cpu_model -> cpu type parsing and use cpu type directly
pc: use generic cpu_model parsing
vl.c: convert cpu_model to cpu type and set of global properties before machine_init()
cpu: make cpu_generic_init() abort QEMU on error
qom: cpus: split cpu_generic_init() on feature parsing and cpu creation parts
hostmem-file: Add "discard-data" option
osdep: Define QEMU_MADV_REMOVE
vl: Clean up user-creatable objects when exiting
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
v9fs_do_readdir_with_stat() should check the maximum buffer size
before attempting to marshal the gathered data. Otherwise, the buffer is
assumed to be misconfigured and the transport would be broken.
The patch brings v9fs_do_readdir_with_stat() into conformity with
v9fs_do_readdir()'s behavior.
Signed-off-by: Jan Dakinevich <jan.dakinevich@gmail.com>
[groug, regression caused by commit 8d37de41ca # 2.10]
Signed-off-by: Greg Kurz <groug@kaod.org>
The third parameter of v9fs_co_name_to_path() must not contain
the '/' character.
The issue is most likely related to the 9p2000.u protocol only.
Signed-off-by: Jan Dakinevich <jan.dakinevich@gmail.com>
[groug, regression caused by commit f57f587857 # 2.10]
Signed-off-by: Greg Kurz <groug@kaod.org>
If the client is using 9p2000.u, the following occurs:
$ cd ${virtfs_shared_dir}
$ mkdir -p a/b/c
$ ls a/b
ls: cannot access 'a/b/a': No such file or directory
ls: cannot access 'a/b/b': No such file or directory
a b c
instead of the expected:
$ ls a/b
c
This is a regression introduced by commit f57f5878578a;
local_name_to_path() now resolves ".." and "." in paths,
and v9fs_do_readdir_with_stat()->stat_to_v9stat() then
copies the basename of the resulting path to the response.
With the example above, this means that "." and ".." are
turned into "b" and "a" respectively...
stat_to_v9stat() currently assumes it is passed a full
canonicalized path and uses it to do two different things:
1) to pass it to v9fs_co_readlink() in case the file is a symbolic
link
2) to set the name field of the V9fsStat structure to the basename
part of the given path
It only has two users: v9fs_stat() and v9fs_do_readdir_with_stat().
v9fs_stat() really needs 1) and 2) to be performed since it starts
with the full canonicalized path stored in the fid. It is different
for v9fs_do_readdir_with_stat() though because the name we want to
put into the V9fsStat structure is the d_name field of the dirent
actually (ie, we want to keep the "." and ".." special names). So,
we only need 1) in this case.
This patch hence adds a basename argument to stat_to_v9stat(), to
be used to set the name field of the V9fsStat structure, and moves
the basename logic to v9fs_stat().
Signed-off-by: Jan Dakinevich <jan.dakinevich@gmail.com>
(groug, renamed old name argument to path and updated changelog)
Signed-off-by: Greg Kurz <groug@kaod.org>
List the branches where I queue patches for Machine Core, NUMA,
Memory Backends, and X86. Update the NUMA section to list the
"machine-next" branch instead of "numa".
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170901153928.17058-1-ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Currently, using the first node without memory on the machine makes
QEMU unhappy. With this example command line:
... \
-m 1024M,slots=4,maxmem=32G \
-numa node,nodeid=0 \
-numa node,mem=1024M,nodeid=1 \
-numa node,nodeid=2 \
-numa node,nodeid=3 \
Guest reports "No NUMA configuration found" and the NUMA topology is
wrong.
This is because when QEMU builds the ACPI SRAT, it regards node 0 as the
default node to deal with the memory hole (640K-1M). This means
node 0 must have some memory (>1M), but actually it can have no
memory.
Fix this problem by cutting out the 640K hole in the same way the PCI
4G hole is handled.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Message-Id: <1504231805-30957-2-git-send-email-douly.fnst@cn.fujitsu.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
In QEMU, the number of NUMA nodes is determined by parse_numa_opts().
QEMU then uses it for iteration, for example:
for (i = 0; i < nb_numa_nodes; i++)
However, memory_region_allocate_system_memory() uses MAX_NODES
rather than nb_numa_nodes.
So, replace MAX_NODES with nb_numa_nodes to keep the code consistent and
reduce the number of loop iterations.
Signed-off-by: Dou Liyang <douly.fnst@cn.fujitsu.com>
Message-Id: <1503387936-3483-1-git-send-email-douly.fnst@cn.fujitsu.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Calculating default node-ids for CPUs in possible_cpu_arch_ids()
is rather fragile since the defaults calculation uses nb_numa_nodes, but
the callback might potentially be called early, before all -numa CLI
options are parsed, which would lead to cpus being assigned only up to
the nb_numa_nodes known at the time possible_cpu_arch_ids() is called.
Issue was introduced by
(7c88e65 numa: mirror cpu to node mapping in MachineState::possible_cpus)
and for example CLI:
-smp 4 -numa node,cpus=0 -numa node
would set props.node-id in the possible_cpus array to the first node for
every CPU that is not explicitly mapped.
The issue is not visible to the guest nor to the mgmt interface because
1) implicitly mapped cpus are forced to the first node in
case of partial mapping
2) in case of default mapping possible_cpu_arch_ids() is
called after all -numa options are parsed (resulting
in correct mapping).
However it's fragile to rely on late execution of
possible_cpu_arch_ids(); therefore add a machine specific
callback that returns the node-id for a CPU, and use it to calculate and
set the defaults at machine_numa_finish_init() time, when all -numa
options have been parsed.
Reported-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1496314408-163972-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Assorted s390x patches:
- introduce virtio-gpu-ccw, with virtio-gpu endian fixes
- lots of cleanup in the s390x code
- make device_add work for s390x cpus
- enable seccomp on s390x
- an ivshmem endian fix
- set the reserved DHCP client architecture id for netboot
- fixes in the css and pci support
# gpg: Signature made Tue 19 Sep 2017 17:39:45 BST
# gpg: using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>"
# gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# gpg: aka "Cornelia Huck <cohuck@kernel.org>"
# gpg: aka "Cornelia Huck <cohuck@redhat.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF
* remotes/cohuck/tags/s390x-20170919-v2: (38 commits)
MAINTAINERS/s390x: add terminal3270.c
virtio-ccw: Create a virtio gpu device for the ccw bus
virtio-gpu: Handle endian conversion
s390x/ccw: create s390 phb for compat reasons as well
configure: Allow --enable-seccomp on s390x, too
virtio-ccw: remove stale comments on endianness
s390x: allow CPU hotplug in random core-id order
s390x: generate sclp cpu information from possible_cpus
s390x: get rid of cpu_s390x_create()
s390x: get rid of cpu_states and use possible_cpus instead
s390x: implement query-hotpluggable-cpus
s390x: CPU hot unplug via device_del cannot work for now
s390x: allow cpu hotplug via device_add
s390x: print CPU definitions in sorted order
target/s390x: rename next_cpu_id to next_core_id
target/s390x: use "core-id" for cpu number/address/id handling
target/s390x: set cpu->id for linux user when realizing
s390x: allow only 1 CPU with TCG
target/s390x: use program_interrupt() in per_check_exception()
target/s390x: use trigger_pgm_exception() in s390_cpu_handle_mmu_fault()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
d32bd032d8 ("s390x/ccw: create s390 phb conditionally") made
registering the s390 pci host bridge conditional on the presence
of the zpci facility bit. Sadly, that breaks migration from
machines that did not use the cpu model (2.7 and previous).
Create the s390 phb for pre-cpu model machines as well: We can
tweak s390_has_feat() to always indicate the zpci facility bit
when no cpu model is available (on 2.7 and previous compat machines).
Fixes: d32bd032d8 ("s390x/ccw: create s390 phb conditionally")
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We have two stale comments suggesting one should think about virtio
config space endianness a bit longer. We have just done that, and came to
the conclusion we are fine as is: it's the responsibility of the virtio
device and not of the transport (and that is how it works now). Putting
the responsibility into the transport isn't even possible, because the
transport would have to know about the config space layout of each
device.
Let us remove the stale comments.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Suggested-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <20170914105535.47941-1-pasic@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
SCLP correctly indicates the core-id aka. CPU address for each available
CPU.
As the core-id corresponds to cpu_index, also a newly created kvm vcpu
gets assigned this core-id as vcpu id. So SIGP in the kernel works
correctly (it uses the vcpu id to lookup the correct CPU).
So there should be nothing hindering us from hotplugging CPUs in random
core-id order.
This now makes sure that the output from "query-hotpluggable-cpus"
is completely true. Until now, a specific order was implicit. Performance-wise,
hotplugging CPUs in non-sequential order might not be the best thing
to do, as VCPU lookup inside KVM might be a little slower. But that
doesn't prevent us from supporting it.
next_core_id is now used by linux user only.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170913132417.24384-23-david@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
This is the first step to allow hot plugging of CPUs in non-sequential
order. Whether a cpu is available ("plugged") can be decided directly by
looking at the cpu state pointer.
This makes sure that really only cpus attached to the machine are
reported.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170913132417.24384-22-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Now that there is only one user of cpu_s390x_create() left, make cpu
creation look like on x86.
- Perform the model/properties split and checks in s390_init_cpus()
- Parse features only once without having to remember if already parsed
- Pass only the typename to s390x_new_cpu()
- Use the typename of an existing CPU for hotplug via cpu-add
Acked-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170913132417.24384-21-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
device_del on a CPU will currently do nothing. Let's emit an error
stating that this currently does not work (there is no architecture
support on s390x). The error message is copied from ppc.
(qemu) device_del cpu1
device_del cpu1
CPU hot unplug not supported on this machine
Reviewed-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170913132417.24384-18-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
E.g. the following now works:
device_add host-s390-cpu,id=cpu1,core-id=1
The system will perform the same checks as when using cpu_add:
- whether the core_id is already in use
- whether the next sequential core_id isn't used
- whether core-id >= max_cpu is specified
In addition, mixed CPU models are checked. E.g. if starting with
-cpu host and trying to hotplug "qemu-s390-cpu":
"Mixed CPU models are not supported on s390x."
Reviewed-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170913132417.24384-17-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Some time ago we discussed that using "id" as property name is not the
right thing to do, as it is a reserved property for other devices and
will not work with device_add.
Switch to the term "core-id" instead, and use it as an equivalent to
"CPU address" mentioned in the PoP. There is no such thing as cpu number,
so rename env.cpu_num to env.core_id. We use "core-id" as this is the
common term to use for device_add later on (x86 and ppc).
We can get rid of cpu->id now. Keep cpu_index and env->core_id in sync.
cpu_index was already implicitly used by e.g. cpu_exists(), so keeping
both in sync seems to be the right thing to do.
cpu_index will now no longer automatically get set via
cpu_exec_realizefn(). Up to now, we were lucky that both implicitly stayed
in sync.
Our new cpu property "core-id" can be a static property. Range checks can
be avoided by using the correct type and the "setting after realized"
check is done implicitly.
device_add will later need the reserved "id" property. Hotplugging a CPU
on s390x will then be: "device_add host-s390-cpu,id=cpu2,core-id=2".
Reviewed-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170913132417.24384-14-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Specifying more than 1 CPU (e.g. -smp 5) leads to SIGP errors (the
guest tries to bring these CPUs up but fails), because we don't support
multiple CPUs on s390x under TCG.
Let's bail out if more than 1 is specified, so we don't raise people's
hopes.
Tested-by: Matthew Rosato <mjrosato@linux.vnet.ibm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170913132417.24384-12-david@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Back then in the time of df1fe5bb49 ("s390: Virtual channel subsystem
support.", 2013-01-24) -EIO used to map to a channel-program check (via
the default label of the switch statement). Then 2dc95b4cac
("s390x/3270: 3270 data stream handling", 2016-04-01) came along
and that changed dramatically.
Let us roll back this undesired side effect, and go back to
channel-program check.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Fixes: 2dc95b4cac "s390x/3270: 3270 data stream handling"
Message-Id: <20170908152446.14606-3-pasic@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The architecture says that channel-data check is indicating that
an uncorrected storage (memory) error has been detected in regard
to the data residing in main storage (memory) that is currently
used for an I/O operation. The described detection is done using
the CBC technology.
The ccw interpretation code is however effectively generating a
channel-data check when the (device specific) ccw_cb returns -EFAULT. In
case of virtio-ccw devices this happens when mapping memory fails, or when
a NULL pointer is encountered. So this behavior does not conform to the
architecture.
Furthermore, from an architectural perspective, the best fit for these
situations (null pointer, failure to map a piece of guest memory) is the
condition described as "the channel subsystem refers to a location that is
not available", which when encountered shall result in a channel-program
check.
To fix this, all we have to do is to get rid of the switch case matching
-EFAULT: the default is generating a channel-program check.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Message-Id: <20170908152446.14606-2-pasic@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The "slow" ivshmem-tests currently fail when they are running on a
big endian host:
$ uname -m
ppc64
$ V=1 QTEST_QEMU_BINARY=x86_64-softmmu/qemu-system-x86_64 tests/ivshmem-test -m slow
/x86_64/ivshmem/single: OK
/x86_64/ivshmem/hotplug: OK
/x86_64/ivshmem/memdev: OK
/x86_64/ivshmem/pair: OK
/x86_64/ivshmem/server-msi: qemu-system-x86_64:
-device ivshmem-doorbell,chardev=chr0,vectors=2: server sent invalid ID message
Broken pipe
The problem is that the server side code in ivshmem_server_send_one_msg()
correctly translates all message IDs into little-endian 64-bit values,
but the client side code in the ivshmem_recv_msg() function does not swap
the byte order back. Fix it by passing the value through le64_to_cpu().
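A minimal sketch of the kind of conversion involved, assuming QEMU's
le64_to_cpu() helper from "qemu/bswap.h"; the function name and error
handling here are illustrative, not the actual tests/ivshmem-test.c code:

    #include "qemu/osdep.h"
    #include "qemu/bswap.h"

    /* The server always sends message IDs as little-endian 64-bit values,
     * so convert back to host byte order after reading them. */
    static int64_t ivshmem_read_msg_id(int fd)
    {
        int64_t msg;

        if (read(fd, &msg, sizeof(msg)) != sizeof(msg)) {
            return -1;
        }
        return le64_to_cpu(msg);  /* no-op on LE hosts, byte swap on BE hosts */
    }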
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1504100343-26607-1-git-send-email-thuth@redhat.com>
Tested-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The guest uses the mpcifc instruction to register the aibvo of a zpci
device, which is the starting offset of indicators in the indicator
area and thus remains constant. Each msix vector is an offset from the
aibvo. When we map a msix route to an adapter route, we should not
modify the starting offset, but instead add the vector to the starting
offset to get the absolute offset in the specific route.
Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Message-Id: <1504606380-49341-3-git-send-email-zyimin@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
A PCIDevice pointer has been a parameter of kvm_arch_fixup_msi_route(),
so we don't need to store the zpci index in the msix message data to find
the specific zpci device. Instead, we can use the pci device id to find
its corresponding zpci device.
Signed-off-by: Yi Min Zhao <zyimin@linux.vnet.ibm.com>
Message-Id: <1504606380-49341-2-git-send-email-zyimin@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
We can use the drive_del test on s390x, too, to check that adding and
deleting also works fine with the virtio-ccw bus. But we have to make
sure that we use the devices with the "-ccw" suffix instead of the
"-pci" suffix for the virtio-ccw transport on s390x. Introduce a helper
function called qvirtio_get_dev_type() that returns the correct string
for the current architecture.
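A hedged sketch of what such a helper could look like; the exact
implementation in tests/libqos may differ (for instance in how further
architectures are handled):

    #include <glib.h>
    #include "libqtest.h"

    /* Return the virtio device suffix to use for the architecture under test. */
    const char *qvirtio_get_dev_type(void)
    {
        const char *arch = qtest_get_arch();

        if (g_str_equal(arch, "s390x")) {
            return "ccw";   /* virtio-ccw transport */
        }
        return "pci";       /* default: virtio-pci transport */
    }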
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1504190408-11143-1-git-send-email-thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The function ioinst_handle_xsch is presenting cc 2 when it's supposed to
present cc 1 and the other way around, because css_do_xsch has the error
codes mixed up. Because cc 1 has precedence over cc 2 we also have to
swap the two checks.
Let us fix this.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Reported-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Message-Id: <20170831121828.85885-1-pasic@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The pixman submodule does not exist anymore, and its removal broke
docker-based tests. Fix it.
Cc: Fam Zheng <famz@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Using $(and ...) is dangerous here: It only works as long as the first
argument is set to 'y' or completely unset. It does not work if the
first argument is set to 'n' for example. Let's use the "land" make
function instead, which has been written explicitly for this purpose.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1505759538-15365-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The script doesn't know about all possible types and learns them as
it parses the code. If it reaches a line with a type cast but the
type isn't known yet, it is misinterpreted as an identifier.
For example the following line:
foo = (hwaddr) -1;
results in the following false-positive to be reported:
ERROR: spaces required around that '-' (ctx:VxV)
Let's add this standard QEMU type to the list of pre-known types.
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <150538015789.8149.10902725348939486674.stgit@bahia.lan>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently before submitting a series, devs should run checkpatch.pl
across each patch to be submitted. This can be automated using a
command such as:
git rebase -i master -x 'git show | ./scripts/checkpatch.pl -'
This is rather long-winded to type, so this patch introduces a way
to tell checkpatch.pl to validate a series of GIT revisions.
There are now three modes it can operate in: 1) check a patch, 2) check a source
file, or 3) check a git branch.
If no flags are given, the mode is determined by checking the args passed to
the command. If the args contain a literal ".." it is treated as a GIT revision
list. If the args end in ".patch" or equal "-" it is treated as a patch file.
Otherwise it is treated as a source file.
This automatic guessing can be overridden using --[no-]patch, --[no-]file or
--[no-]branch.
For example to check a GIT revision list:
$ ./scripts/checkpatch.pl master..
total: 0 errors, 0 warnings, 297 lines checked
b886d352a2bf58f0996471fb3991a138373a2957 has no obvious style problems and is ready for submission.
total: 0 errors, 0 warnings, 182 lines checked
2a731f9a9ce145e0e0df6d42dd2a3ce4dfc543fa has no obvious style problems and is ready for submission.
total: 0 errors, 0 warnings, 102 lines checked
11844169bcc0c8ed4449eb3744a69877ed329dd7 has no obvious style problems and is ready for submission.
If a genuine patch filename contains the characters '..' it is
possible to force interpretation of the arg as a patch
$ ./scripts/checkpatch.pl --patch master..
will force it to load a patch file called "master..", or equivalently
$ ./scripts/checkpatch.pl --no-branch master..
will simply turn off guessing of GIT revision lists.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20170913091000.9005-1-berrange@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
All definitions related to Hyper-V emulation are now taken from the QEMU
own header, so the one imported from the kernel is no longer needed.
Unfortunately it's included by kvm_para.h.
So, until this is fixed in the kernel, teach the header harvesting
script to substitute kernel's hyperv.h with a dummy.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20170713201522.13765-3-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The definitions for Hyper-V emulation are currently taken from a header
imported from the Linux kernel.
However, as these describe a third-party protocol rather than a kernel
API, it probably wasn't a good idea to publish it in the kernel uapi.
This patch introduces a header that provides all the necessary
definitions, superseding the one coming from the kernel.
The new header supports (temporary) coexistence with the kernel one.
The constants explicitly named in the Hyper-V specification (e.g. msr
numbers) are defined in a non-conflicting way. Other constants and
types have got new names.
While at it, the protocol data structures are defined in a more
conventional way, without bitfields, enums, and excessive unions.
The code using this stuff is adjusted, too; it can now be built both
with and without the kernel header in the tree.
Signed-off-by: Roman Kagan <rkagan@virtuozzo.com>
Message-Id: <20170713201522.13765-2-rkagan@virtuozzo.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Starting with Windows Server 2012 and Windows 8, if
CPUID.40000005.EAX contains a value of -1, Windows assumes that no
specific limit to the number of VPs applies. In this case, Windows Server 2012
guest VMs may use more than 64 VPs, up to the maximum supported
number of processors applicable to the specific Windows
version being used.
https://docs.microsoft.com/en-us/virtualization/hyper-v-on-windows/reference/tlfs
For compatibility, let's introduce a new property for X86CPU,
named "x-hv-max-vps" as suggested by Eduardo, and set it
to 0x40 for machine types before 2.10.
(The "x-" prefix indicates that the property is not supposed to
be a stable user interface.)
Signed-off-by: Gonglei <arei.gonglei@huawei.com>
Message-Id: <1505143227-14324-1-git-send-email-arei.gonglei@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Convert any remaining uses of fprintf(stderr, "warning:"...
to use warn_report() instead. This helps standardise on a single
method of printing warnings to the user.
All of the warnings were changed using this command:
find ./* -type f -exec sed -i 's|fprintf(.*".*warning[,:] |warn_report("|Ig' {} +
The #include lines and changes to the test Makefile were manually
updated to allow the code to compile.
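For illustration, a hedged before/after of the kind of call site this
command rewrites; the message text is made up and not taken from a
specific file:

    #include "qemu/osdep.h"
    #include "qemu/error-report.h"

    static void report_example(const char *name)
    {
        /* before: fprintf(stderr, "warning: %s is deprecated\n", name); */
        /* after: warn_report() adds the "warning:" prefix and the newline */
        warn_report("%s is deprecated", name);
    }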
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Message-Id: <2c94ac3bb116cc6b8ebbcd66a254920a69665515.1503077821.git.alistair.francis@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This test provides its own mocks, so do not use the "standard"
stubs in libqemustub.a or the event loop implementation in
libqemuutil.a.
This is required on OS X, which otherwise brings in qemu-timer.o,
async.o and main-loop.o from libqemuutil.a.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Convert all the multi-line uses of fprintf(stderr, "warning:"..."\n"...
to use warn_report() instead. This helps standardise on a single
method of printing warnings to the user.
All of the warnings were changed using these commands:
find ./* -type f -exec sed -i \
'N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
find ./* -type f -exec sed -i \
'N;N;N;N;N;N;N; {s|fprintf(.*".*warning[,:] \(.*\)\\n"\(.*\));|warn_report("\1"\2);|Ig}' \
{} +
Indentation was fixed up manually afterwards.
Some of the lines were manually edited to reduce the line length to below
80 characters. Some of the lines with newlines in the middle of the
string were also manually edited to avoid checkpatch errors.
The #include lines were manually updated to allow the code to compile.
Several of the warning messages can be improved after this patch, to
keep this patch mechanical this has been moved into a later patch.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: "Michael S. Tsirkin" <mst@redhat.com>
Cc: Igor Mammedov <imammedo@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Stefano Stabellini <sstabellini@kernel.org>
Cc: Anthony Perard <anthony.perard@citrix.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Cornelia Huck <cohuck@redhat.com>
Cc: Christian Borntraeger <borntraeger@de.ibm.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Jason Wang <jasowang@redhat.com>
Cc: David Gibson <david@gibson.dropbear.id.au>
Cc: Gerd Hoffmann <kraxel@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <5def63849ca8f551630c6f2b45bcb1c482f765a6.1505158760.git.alistair.francis@xilinx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Flatview will make sure that we can only end up in this function with
memory sections that correspond to exactly one slot. So we don't
have to iterate multiple times. There won't be overlapping slots but
only matching slots.
Properly align the section and look up the corresponding slot. This
heavily simplifies this function.
We can now get rid of kvm_lookup_overlapping_slot().
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170911174933.20789-7-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The way flatview handles memory sections, we will never have overlapping
memory sections in kvm.
address_space_update_topology_pass() will make sure that we will only
get called for
a) an existing memory section for which we only update parameters
(log_start, log_stop).
b) an existing memory section we want to delete (region_del)
c) a brand new memory section we want to add (region_add)
We cannot have overlapping memory sections in kvm as we will first remove
the overlapping sections and then add the ones without conflicts.
Therefore we can remove the complexity for handling prefix and suffix
slots.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170911174933.20789-5-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We already require DESTROY_MEMORY_REGION_WORKS, JOIN_MEMORY_REGIONS_WORKS
was added just half a year later.
In addition, with flatview overlapping memory regions are first
removed before adding the changed one. So we can't really detect joining
memory regions this way.
Let's just get rid of this special handling.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170911174933.20789-2-david@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
While loading a kernel via a multiboot-v1 image, (flags & 0x00010000)
indicates that the multiboot header contains valid addresses to load
the kernel image. These addresses are used to compute the kernel
size and the kernel text offset in the OS image. Validate these
address values to avoid an OOB access issue.
This is CVE-2017-14167.
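A hedged sketch of the kind of sanity check described above; the field
names follow the multiboot header layout, but the code is illustrative and
not the actual hw/i386/multiboot.c fix:

    #include <stdbool.h>
    #include <stdint.h>

    /* Reject header address combinations that would make the computed
     * kernel size or kernel text offset underflow. */
    static bool multiboot_addrs_sane(uint32_t header_addr, uint32_t load_addr,
                                     uint32_t load_end_addr,
                                     uint32_t bss_end_addr)
    {
        if (load_addr > header_addr) {
            return false;
        }
        if (load_end_addr && load_end_addr < load_addr) {
            return false;
        }
        if (bss_end_addr && bss_end_addr < load_end_addr) {
            return false;
        }
        return true;
    }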
Reported-by: Thomas Garnier <thgarnie@google.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Message-Id: <20170907063256.7418-1-ppandit@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
SunOS declares struct queue in <netinet/in.h>.
This fixes build on SmartOS (Joyent).
Patch cherry-picked from pkgsrc by jperkin (Joyent).
Signed-off-by: Kamil Rytarowski <n54@gmx.com>
Message-Id: <20170903163304.17919-1-n54@gmx.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
As of kernel commit eb82feea59d6 ("KVM: hyperv: support HV_X64_MSR_TSC_FREQUENCY
and HV_X64_MSR_APIC_FREQUENCY"), KVM supports two new MSRs which are required
for nested Hyper-V to read timestamps with RDTSC + TSC page.
This commit makes QEMU advertise the MSRs with CPUID.40000003H:EAX[11] and
CPUID.40000003H:EDX[8] as specified in the Hyper-V TLFS and experimentally
verified on a Hyper-V host. The feature is enabled with the existing hv-time CPU
flag, and only if the TSC frequency is stable across migrations and known.
Signed-off-by: Ladi Prosek <lprosek@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170807085703.32267-5-lprosek@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There are 2 use cases to deal with:
1: fixed CPU models per board/soc
2: boards with user configurable cpu_model and fallback to
default cpu_model if user hasn't specified one explicitly
For the 1st
drop intermediate cpu_model parsing and use const cpu type
directly, which replaces:
typename = object_class_get_name(
cpu_class_by_name(TYPE_ARM_CPU, cpu_model))
object_new(typename)
with
object_new(FOO_CPU_TYPE_NAME)
or
cpu_generic_init(BASE_CPU_TYPE, "my cpu model")
with
cpu_create(FOO_CPU_TYPE_NAME)
As a result, the 1st use case doesn't have to invoke unnecessary
translation, and the no longer needed code is removed.
For the 2nd
1: set default cpu type with MachineClass::default_cpu_type and
2: use generic cpu_model parsing that is done before machine_init()
is run and:
2.1: drop custom cpu_model parsing where pattern is:
typename = object_class_get_name(
cpu_class_by_name(TYPE_ARM_CPU, cpu_model))
[parse_features(typename, cpu_model, &err) ]
2.2: or replace cpu_generic_init() which does what
2.1 does + create_cpu(typename) with just
create_cpu(machine->cpu_type)
As a result, the cpu_name -> cpu_type translation is done using the
generic machine code, including parsing of optional features if
supported/present (which removes a bunch of duplicated cpu_model
parsing code), and the default cpu type is defined in a uniform way
within machine_class_init callbacks instead of in ad hoc places
in the boards' machine_init code.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1505318697-77161-6-git-send-email-imammedo@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Define the default CPU type in a generic way in pc_machine_class_init()
and let the common machine code handle cpu_model parsing.
The patch also introduces a TARGET_DEFAULT_CPU_TYPE define for 2 purposes:
* make foo_machine_class_init() look uniform on every target
* use the define in [bsd|linux]-user targets to pick the default
cpu type
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1505318697-77161-5-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
All machines that support user specified cpu_model either call
cpu_generic_init() or cpu_class_by_name()/CPUClass::parse_features
to parse the feature string and to get the CPU type to create.
This leads to code duplication and hard-coding of the default CPU model
within machine_foo_init() code, and it makes it impossible to
get the CPU type before machine_init() is run.
So instead of setting default CPUs models and doing parsing in
target specific machine_foo_init() in various ways, provide
a generic data driven cpu_model parsing before machine_init()
is called.
In follow-up per-target patches, this will allow us to:
* define default CPU type in consistent/generic manner
per machine type and drop custom code that falls back
to the default if cpu_model is NULL
* drop custom features parsing in targets and do it
in centralized way.
* for cases of
cpu_generic_init(TYPE_BASE/DEFAULT_CPU, "some_cpu")
replace it with
cpu_create(machine->cpu_type) || cpu_create(TYPE_FOO)
depending on whether the CPU type is user settable or not,
not doing useless parsing and clearly documenting where the
CPU model is user settable and where it is fixed.
Patch allows machine subclasses to define default CPU type
per machine class at class_init() time and if that is set
generic code will parse cpu_model into a MachineState::cpu_type
which will be used to create CPUs for that machine instance
and allows gradual per board conversion.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1505318697-77161-4-git-send-email-imammedo@redhat.com>
Acked-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Almost every user of cpu_generic_init() checks for
a returned NULL and then reports the failure in a custom way
and aborts the process.
Some users assume the call can't fail and don't check
for failure, though they should have checked for it.
In either case a cpu_generic_init() failure is fatal,
so instead of checking for failure and reporting
it in various ways, make cpu_generic_init() report
errors in a consistent way and terminate QEMU on failure.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1505318697-77161-3-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Complete the transition by renaming this header, which was
shared by block/iscsi.c and the SCSI emulation code.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The new option can be used to indicate that the file contents can
be destroyed and don't need to be flushed to disk when QEMU exits
or when the memory backend object is removed.
Internally, it will trigger a madvise(MADV_REMOVE) call when the
memory backend is removed.
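A minimal sketch of the cleanup step described above, assuming a Linux
host; the real backend code additionally has to deal with errors and
non-Linux hosts:

    #include <stddef.h>
    #include <sys/mman.h>

    /* The contents are explicitly allowed to be destroyed, so just tell the
     * kernel the file-backed range can be dropped (best effort). */
    static void discard_backing_pages(void *ptr, size_t size)
    {
        (void)madvise(ptr, size, MADV_REMOVE);
    }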
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170824192315.5897-4-ehabkost@redhat.com>
[ehabkost: fixup: improved documentation]
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Tested-by: Zack Cornelius <zack.cornelius@kove.net>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Move more knowledge of SG_IO out of hw/scsi/scsi-generic.c, for
reusability.
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Move more knowledge of sense data format out of hw/scsi/scsi-bus.c
for reusability.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
util/scsi.c includes some SCSI code that is shared by block/iscsi.c and
hw/scsi, but the introduction of the persistent reservation helper
will add many more instances of this. There is also include/block/scsi.h,
which actually is not part of the core block layer.
The persistent reservation manager will also need a home. A scsi/
directory provides one for both the aforementioned shared code and
the PR manager code.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
After introducing the scsi/ subdirectory, there will be a scsi_build_sense
function that is the same as scsi_req_build_sense but without needing
a SCSIRequest. The existing scsi_build_sense function gets in the way,
remove it.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Since Linux switched to blk-mq as the default in Linux commit
5c279bd9e406 ("scsi: default to scsi-mq"), virtio-scsi LUNs consume
about 10x as much guest kernel memory.
This commit allows you to choose the virtqueue size for each
virtio-scsi-pci controller like this:
-device virtio-scsi-pci,id=scsi,virtqueue_size=16
The default is still 128 as before. Using smaller virtqueue_size
allows many more disks to be added to small memory virtual machines.
For a 1 vCPU, 500 MB, no swap VM I observed:
With scsi-mq enabled (upstream kernel): 175 disks
-"- ditto -"- virtqueue_size=64: 318 disks
-"- ditto -"- virtqueue_size=16: 775 disks
With scsi-mq disabled (kernel before 5c279bd9e406): 1755 disks
Note that to have any effect, this requires a kernel patch:
https://lkml.org/lkml/2017/8/10/689
Signed-off-by: Richard W.M. Jones <rjones@redhat.com>
Message-Id: <20170810165255.20865-1-rjones@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The SSE4.1 phminposuw instruction finds the minimum 16-bit element in
the source vector, putting the value of that element in the low 16
bits of the destination vector, the index of that element in the next
three bits and zeroing the rest of the destination. The helper for
this operation fills the destination from high to low, meaning that
when the source and destination are the same register, the minimum
source element can be overwritten before it is copied to the
destination. This patch fixes it to fill the destination from low to
high instead, so the minimum source element is always copied first.
This fixes one gcc test failure in my GCC 6-based testing (and so
concludes the present sequence of patches, as I don't have any further
gcc test failures left in that testing that I attribute to QEMU bugs).
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Message-Id: <alpine.DEB.2.20.1708111422580.11919@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
One of the cases of the SSE4.2 pcmpestri / pcmpestrm / pcmpistri /
pcmpistrm instructions does a substring search. The implementation of
this case in the pcmpxstrx helper is incorrect. The operation in this
case is a search for a string (argument d to the helper) in another
string (argument s to the helper); if a copy of d at a particular
position would run off the end of s, the resulting output bit should
be 0 whether or not the strings match in the region where they
overlap, but the QEMU implementation was wrongly comparing only up to
the point where s ends and counting it as a match if an initial
segment of d matched a terminal segment of s. Here, "run off the end
of s" means that some byte of d would overlap some byte outside of s;
thus, if d has zero length, it is considered to match everywhere,
including after the end of s. This patch fixes the implementation to
correspond with the proper instruction semantics. This fixes four gcc
test failures in my GCC 6-based testing.
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Message-Id: <alpine.DEB.2.20.1708102139310.8101@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The SSE4.1 packusdw instruction combines source and destination
vectors of signed 32-bit integers into a single vector of unsigned
16-bit integers, with unsigned saturation. When the source and
destination are the same register, this means each 32-bit element of
that register is used twice as an input, to produce two of the 16-bit
output elements, and so if the operation is carried out
element-by-element in-place, no matter what the order in which it is
applied to the elements, the first element's operation will overwrite
some future input. The helper for packssdw avoids this issue by
computing the result in a local temporary and copying it to the
destination at the end; this patch fixes the packusdw helper to do
likewise. This fixes three gcc test failures in my GCC 6-based
testing.
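A hedged, scalar sketch of the pattern described above, reduced to half
the vector width with illustrative names; the real ops_sse.h helper
operates on the full register:

    #include <stdint.h>
    #include <string.h>

    static uint16_t sat_u16(int32_t x)
    {
        return x < 0 ? 0 : (x > 0xffff ? 0xffff : (uint16_t)x);
    }

    /* Pack two pairs of signed 32-bit lanes into four unsigned 16-bit lanes.
     * Computing into a local temporary keeps the result correct even when
     * the destination aliases one of the sources. */
    static void packusdw_sketch(uint16_t d[4], const int32_t s[2],
                                const int32_t t[2])
    {
        uint16_t r[4];

        r[0] = sat_u16(s[0]);
        r[1] = sat_u16(s[1]);
        r[2] = sat_u16(t[0]);
        r[3] = sat_u16(t[1]);
        memcpy(d, r, sizeof(r));
    }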
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Message-Id: <alpine.DEB.2.20.1708100023050.9262@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
It turns out that my recent fix to set rip_offset when emulating some
SSE4.1 instructions needs generalizing to cover a wider class of
instructions. Specifically, every instruction in the sse_op_table7
table, coming from various instruction set extensions, has an 8-bit
immediate operand that comes after any memory operand, and so needs
rip_offset set for correctness if there is a memory operand that is
rip-relative, and my patch only set it for a subset of those
instructions. This patch moves the rip_offset setting to cover the
wider class of instructions, so fixing 9 further gcc testsuite
failures in my GCC 6-based testing. (I do not know whether there
might be still further classes of instructions missing this setting.)
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Message-Id: <alpine.DEB.2.20.1708082350340.23380@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The SSE4.1 pmovsx* and pmovzx* instructions take packed 1-byte, 2-byte
or 4-byte inputs and sign-extend or zero-extend them to a wider vector
output. The associated helpers for these instructions do the
extension on each element in turn, starting with the lowest. If the
input and output are the same register, this means that all the input
elements after the first have been overwritten before they are read.
This patch makes the helpers extend starting with the highest element,
not the lowest, to avoid such overwriting. This fixes many GCC test
failures (161 in the gcc testsuite in my GCC 6-based testing) when
testing with a default CPU setting enabling those instructions.
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Message-Id: <alpine.DEB.2.20.1708082018390.23380@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The current FIS printing routines dump the FIS to the screen. Adjust this
so that it dumps to a buffer instead, then use this ability to provide
FIS dump mechanisms via trace-events instead of compile-time defines.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170901001502.29915-9-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Create a new enum so that we can name the IRQ bits, which will make debugging
them a little nicer if we can print them out. Not handled in this patch, but
this will make it possible to get a nice debug printf detailing exactly which
status bits are set, as multiple bits can be set at any given time.
As a consequence of this patch, it is no longer possible to set multiple IRQ
codes at once, but nothing was utilizing this ability anyway.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170901001502.29915-8-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Replace the DEBUG_IDE preprocessor definition with something more
appropriately flexible, using the trace-events subsystem.
This will be less prone to bitrot and will more effectively allow
us to target just the functions we care about.
Signed-off-by: John Snow <jsnow@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170901001502.29915-2-jsnow@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
QEMU currently aborts with an assertion message when the user is trying
to remove a dscm1xxxx again:
$ aarch64-softmmu/qemu-system-aarch64 -S -M integratorcp -nographic
QEMU 2.9.93 monitor - type 'help' for more information
(qemu) device_add dscm1xxxx,id=xyz
(qemu) device_del xyz
**
ERROR:qemu/qdev-monitor.c:872:qdev_unplug: assertion failed: (hotplug_ctrl)
Aborted (core dumped)
Looks like this device has to be wired up in code and is not meant
to be hot-pluggable, so let's mark it with user_creatable = false.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1503543783-17192-1-git-send-email-thuth@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
Fixes a read-after-free error reported at
https://lists.gnu.org/archive/html/qemu-devel/2017-08/msg04243.html
Message-Id: <59a56959-ca12-ea75-33fa-ff07eba1b090@redhat.com>
The ich9-ahci device creates ide buses and attaches them as QOM children
at realize time; however, it forgets to properly clean them up
at unrealize time and frees the memory containing these children,
with the following call chain:
qdev_device_add()
object_property_set_bool('realized', true)
device_set_realized()
...
pci_qdev_realize() -> pci_ich9_ahci_realize() -> ahci_realize()
...
s->dev = g_new0(AHCIDevice, ports);
...
AHCIDevice *ad = &s->dev[i];
ide_bus_new(&ad->port, sizeof(ad->port), qdev, i, 1);
^^^ creates bus in memory allocated by above gnew()
and adds it as a child property to the ahci device
...
hotplug_handler_plug(); -> goto post_realize_fail;
pci_qdev_unrealize() -> pci_ich9_uninit() -> ahci_uninit()
...
g_free(s->dev);
^^^ free memory that holds children busses
return with error from device_set_realized()
As a result, later when qdev_device_add() tries to unparent ich9-ahci
after the failed device_set_realized(),
object_unparent() -> object_property_del_child()
iterates over the existing QOM children, including the buses added by
ide_bus_new(), and tries to unparent them, which causes access to
the freed memory where they were located.
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-id: 1503938085-169486-1-git-send-email-imammedo@redhat.com
Signed-off-by: John Snow <jsnow@redhat.com>
It's not even clear what the interface REG and VAL32 were supposed to mean.
All uses had REG = 0 and VAL32 was the bitset assigned to the destination.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This fixes building for ppc64 on ppc32 (changed in 5964fca8a1):
tcg/ppc/tcg-target.inc.c: In function 'tb_target_set_jmp_target':
include/qemu/compiler.h:86:30: error: static assertion failed: \
"not expecting: sizeof(*(uint64_t *)jmp_addr) > ATOMIC_REG_SIZE"
QEMU_BUILD_BUG_ON(sizeof(*ptr) > ATOMIC_REG_SIZE); \
^
tcg/ppc/tcg-target.inc.c:1377:9: note: in expansion of macro 'atomic_set'
atomic_set((uint64_t *)jmp_addr, pair);
^
Suggested-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20170911204936.5020-1-f4bug@amsat.org>
[rth: Added commentary requested by pmm.]
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Python queue, 2017-09-15
# gpg: Signature made Sat 16 Sep 2017 00:14:01 BST
# gpg: using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/python-next-pull-request:
qemu.py: include debug information on launch error
qemu.py: improve message on negative exit code
qemu.py: use os.path.null instead of /dev/null
qemu.py: avoid writing to stdout/stderr
qemu.py: fix is_running() return before first launch()
qtest.py: Few pylint/style fixes
qmp.py: Avoid overriding a builtin object
qmp.py: Avoid "has_key" usage
qmp.py: Use object-based class for QEMUMonitorProtocol
qmp.py: Couple of pylint/style fixes
qemu.py: Use custom exceptions rather than Exception
qemu.py: Simplify QMP key-conversion
qemu.py: Use iteritems rather than keys()
qemu|qtest: Avoid dangerous arguments
qemu.py: Pylint/style fixes
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When launching a VM, if an exception happens and the VM is not
initiated, it might be useful to see the qemu command line and
the qemu command output.
This patch creates that message. Notice that self._iolog needs to be
cleaned up in the beginning of the launch() to make sure we will not
expose the qemu log from a previous launch if the current one fails.
Signed-off-by: Amador Pahim <apahim@redhat.com>
Message-Id: <20170901112829.2571-6-apahim@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The current message shows 'self._args', which contains only part of the
options used in the Qemu command line.
This patch makes the qemu full args list an instance variable and then
uses it in the negative exit code message.
Message was moved outside the 'if is_running' block to make sure it will
be logged if the VM finishes before the call to shutdown().
Signed-off-by: Amador Pahim <apahim@redhat.com>
Message-Id: <20170901112829.2571-5-apahim@redhat.com>
[ehabkost: removed superfluous parenthesis]
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
This module should not write directly to stdout/stderr. Instead, it
should either raise exceptions or just log the messages and let the
callers handle them and decide what to do. For example, scripts could
choose to send the log messages to stderr and/or write them to a file if
verbose or debugging mode is enabled.
This patch replaces the writes to stderr by an exception in the
send_fd_scm() when _socket_scm_helper is not set or not present. In the
same method, the subprocess Popen will now redirect the stdout/stderr to
logging.debug instead of writing to system stderr. As a consequence, since
the Popen.communicate() is now used (in order to get the stdout), the
further call to wait() became redundant and was replaced by
Popen.returncode.
The shutdown() message on negative exit code will now be logged
to logging.warn instead of written to system stderr.
Signed-off-by: Amador Pahim <apahim@redhat.com>
Message-Id: <20170901112829.2571-3-apahim@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
is_running() returns None when called before the first time we
call launch():
>>> import qemu
>>> vm = qemu.QEMUMachine('qemu-system-x86_64')
>>> vm.is_running()
>>>
It should return False instead. This patch fixes that.
For consistency, this patch removes the parentheses from the
second clause as they're not really needed.
Signed-off-by: Amador Pahim <apahim@redhat.com>
Message-Id: <20170901112829.2571-2-apahim@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The "id" is a builtin method to get object's identity and should not be
overridden. This might bring some issues in case someone was directly
calling "cmd(..., id=id)" but I haven't found such usage on brief search
for "cmd\(.*id=".
Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170818142613.32394-10-ldoktor@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
No actual code changes, just initializing attributes earlier to avoid
AttributeError on early introspection, a few pylint/style fixes and
docstring clarifications.
Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20170818142613.32394-7-ldoktor@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The naked Exception should not be widely used. It makes sense to be a
bit more specific and use better-suited custom exceptions. As a benefit
we can store the full reply in the exception in case someone needs it
when catching the exception.
Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170818142613.32394-6-ldoktor@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The list object is mutable in python and potentially might modify other
object's arguments when used as default argument. Reproducer:
>>> vm1 = QEMUMachine("qemu")
>>> vm2 = QEMUMachine("qemu")
>>> vm1._wrapper.append("foo")
>>> print vm2._wrapper
['foo']
In this case the `args` is actually copied, so it would be safe to keep
it, but it's not good practice. The same issue applies in the
inherited qtest module.
Signed-off-by: Lukáš Doktor <ldoktor@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Message-Id: <20170818142613.32394-3-ldoktor@redhat.com>
Reviewed-by: Cleber Rosa <crosa@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
pull-seccomp-20170915
# gpg: Signature made Fri 15 Sep 2017 09:21:15 BST
# gpg: using RSA key 0xDF32E7C0F0FFF9A2
# gpg: Good signature from "Eduardo Otubo (Senior Software Engineer) <otubo@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: D67E 1B50 9374 86B4 0723 DBAB DF32 E7C0 F0FF F9A2
* remotes/otubo/tags/pull-seccomp-20170915:
buildsys: Move seccomp cflags/libs to per object
seccomp: add resourcecontrol argument to command line
seccomp: add spawn argument to command line
seccomp: add elevateprivileges argument to command line
seccomp: add obsolete argument to command line
seccomp: changing from whitelist to blacklist
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ppc patch queue 2017-09-15
Here's the current batch of accumulated ppc patches. These are all
pretty simple bugfixes or cleanups, no big new features here.
# gpg: Signature made Fri 15 Sep 2017 04:50:00 BST
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.11-20170915:
ppc/kvm: use kvm_vm_check_extension() in kvmppc_is_pr()
spapr_events: use QTAILQ_FOREACH_SAFE() in spapr_clear_pending_events()
spapr_cpu_core: cleaning up qdev_get_machine() calls
spapr_pci: don't create 64-bit MMIO window if we don't need to
spapr_pci: convert sprintf() to g_strdup_printf()
spapr_cpu_core: fail gracefully with non-pseries machine types
xics: fix several error leaks
vfio, spapr: Fix levels calculation
spapr_pci: handle FDT creation errors with _FDT()
spapr_pci: use the common _FDT() helper
spapr: fix CAS-generated reset
ppc/xive: fix OV5_XIVE_EXPLOIT bits
spapr: only update SDR1 once per-cpu during CAS
spapr_pci: use g_strdup_printf()
spapr_pci: drop useless check in spapr_populate_pci_child_dt()
spapr_pci: drop useless check in spapr_phb_vfio_get_loc_code()
hw/ppc/spapr.c: cleaning up qdev_get_machine() calls
net: Add SunGEM device emulation as found on Apple UniNorth
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Right now, function trace_event_set_vcpu_state_dynamic() asynchronously enables
events in the case a vCPU is executing TCG code. If the vCPU is being created,
this causes some events like "guest_cpu_enter" not to be traced.
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Message-id: 150525662577.19850.13767570977540117247.stgit@frigg.lan
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Like many other libraries, libseccomp cflags and libs should only apply
to the building of necessary objects. Do so in the usual way with the
help of per object variables.
Signed-off-by: Fam Zheng <famz@redhat.com>
This patch adds [,resourcecontrol=deny] to the `-sandbox on' option. It
blacklists all process affinity and scheduler priority system calls, so
the QEMU process cannot change its own CPU affinity or scheduling
priority.
Signed-off-by: Eduardo Otubo <otubo@redhat.com>
This patch adds the [,spawn=deny] argument to the `-sandbox on' option. It
blacklists the fork and execve system calls, preventing QEMU from spawning
new threads or processes.
Signed-off-by: Eduardo Otubo <otubo@redhat.com>
This patch introduces the new argument
[,elevateprivileges=allow|deny|children] to the `-sandbox on' option. It
allows or denies the QEMU process elevating its privileges by blacklisting all
set*uid|gid system calls. The 'children' option will let forks and
execves run unprivileged.
Signed-off-by: Eduardo Otubo <otubo@redhat.com>
This patch introduces the argument [,obsolete=allow] to the `-sandbox on'
option. It allows QEMU to run safely on old systems that still rely on
old system calls.
Signed-off-by: Eduardo Otubo <otubo@redhat.com>
This patch changes the default behavior of the seccomp filter from
whitelist to blacklist. By default now all system calls are allowed and
a small blacklist of definitely forbidden ones was created.
Signed-off-by: Eduardo Otubo <otubo@redhat.com>
hmp() passes its string argument through the sprintf() family;
with a proper attribute, gcc -Wformat warns us when we do something
dangerous like passing a non-constant format string. Fortunately,
all our strings were safe, but checking that a string cannot
contain an unintended % is easy to add and therefore worth doing.
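For reference, a hedged sketch of the kind of declaration involved; the
real prototype lives in the qtest test headers and may differ:

    /* With the format attribute, gcc -Wformat checks the format string
     * against the variadic arguments at every call site. */
    char *hmp(const char *fmt, ...) __attribute__((format(printf, 1, 2)));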
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Prior to commit 063c23d9, we were tracking a list of parallel
qtest objects, in order to safely clean up a SIGABRT handler
only after the last connection quits. But when we switched to
more of glib's infrastructure, the list became dead code that
is never assigned to.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Assertions should be separate from the side effects, since in
theory, g_assert() can be disabled (in practice, we can't really
ever do that).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Back when the test was introduced, in commit 62c39b307, the
test was set up to run qemu-ga directly on the host performing
the test, and defaults to limiting itself to safe commands. At
the time, it was envisioned that setting QGA_TEST_SIDE_EFFECTING
in the environment could cover a few more commands, while noting
the potential danger of those side effects running in the host.
But this has NEVER been tested: if you enable the environment
variable, the test WILL fail. One obvious reason: if you are not
running as root, you'll probably get a permission failure when
trying to freeze the file systems, or when changing system time.
Less obvious: if you run the test as root (wow, you're brave), you
could end up hanging if the test tries to log things to a
temporarily frozen filesystem. But the cutest reason of all: if
you get past the above hurdles, the test uses invalid JSON in
test_qga_fstrim() (missing '' around the dictionary key 'minimum'),
and will thus fail an assertion in qmp_fd().
Rather than leave this untested time-bomb in place, rip it out.
Hopefully, as originally envisioned, we can find an opportunity
to test an actual sandboxed guest where the guest-agent has
full permissions and will not unduly affect the host running
the test - if so, 'git revert' can be used if desired, for
salvaging any useful parts of this attempt.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Broken with commit b4ba67d9a7 ("libqos: Change PCI accessors to take
opaque BAR handle") a while ago, but nobody noticed since the tests are
not run by default: The msix_pba_bar is not correctly initialized
anymore if bir_pba has the same value as bir_table. With this fix,
"make check SPEED=slow" should work fine again.
Fixes: b4ba67d9a7
Tested-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The user can currently still cause an abort() if running certain tests
(like the prom-env-test) without setting the QTEST_QEMU_BINARY first.
A similar problem has already been fixed with commit 7c933ad61b,
but it forgot to also take care of the qtest_get_arch() function,
so let's introduce a proper wrapper around getenv("QTEST_QEMU_BINARY")
that can be used in both locations now.
Buglink: https://bugs.launchpad.net/qemu/+bug/1713434
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
The problem with puv3 has been fixed with 0ac241bcf9
('unicore32: abort when entering "x 0" on the monitor') and the problem
with tricore_testboard has been fixed with b190f477e2
('qemu-system-tricore: segfault when entering "x 0" on the monitor').
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
A lot of tests provide code for adding and removing a device via the
device_add and device_del QMP commands. Maintaining this code in so many
places is cumbersome and error-prone (some of the code parts check the
responses for device deletion in an incorrect way; for example, we've got
to deal with both the error code and the DEVICE_DEL event here). So let's provide
some proper generic functions for adding and removing a device instead.
The code for correctly unplugging a device has been taken from a patch
from Peter Xu.
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Peter Xu <peterx@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
If the host has both KVM PR and KVM HV loaded and we pass:
-machine pseries,accel=kvm,kvm-type=PR
the kvmppc_is_pr() returns false instead of true. Since the helper
is mostly used as fallback, it doesn't have any real impact with
recent kernels. A notable exception is the workaround to allow
migration between compatible hosts with different PVRs (eg, POWER8
and POWER8E), since KVM still doesn't provide a way to check if a
specific PVR is supported (see commit c363a37a45 for details).
According to the official KVM API documentation [1], KVM_PPC_GET_PVINFO
is a "vm ioctl", but we check it as a global ioctl. The following function
in KVM is hence called with kvm == NULL and assumes we're in HV mode.
    int kvm_vm_ioctl_check_extension(struct kvm *kvm, long ext)
    {
            int r;
            /* Assume we're using HV mode when the HV module is loaded */
            int hv_enabled = kvmppc_hv_ops ? 1 : 0;

            if (kvm) {
                    /*
                     * Hooray - we know which VM type we're running on. Depend on
                     * that rather than the guess above.
                     */
                    hv_enabled = is_kvmppc_hv_enabled(kvm);
            }
Let's use kvm_vm_check_extension() to fix the issue.
[1] https://www.kernel.org/doc/Documentation/virtual/kvm/api.txt
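A hedged sketch of the resulting helper; the exact code in target/ppc/kvm.c
may differ:

    #include "qemu/osdep.h"
    #include "sysemu/kvm.h"
    #include <linux/kvm.h>

    /* KVM_CAP_PPC_GET_PVINFO is only advertised by KVM PR, and it is a vm
     * ioctl, so query it against the VM rather than globally. */
    static bool kvmppc_is_pr(KVMState *ks)
    {
        return kvm_vm_check_extension(ks, KVM_CAP_PPC_GET_PVINFO) != 0;
    }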
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
QTAILQ_FOREACH_SAFE() must be used when removing the current element
inside the loop block.
This fixes a use-after-free error introduced by commit 5625817423
and reported by Coverity (CID 1381017).
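A generic illustration of the safe-removal pattern with QEMU's queue
macros; the element type and list name are made up, this is not the actual
spapr_events.c code:

    #include "qemu/osdep.h"
    #include "qemu/queue.h"

    typedef struct Event {
        QTAILQ_ENTRY(Event) node;
    } Event;

    static QTAILQ_HEAD(, Event) pending = QTAILQ_HEAD_INITIALIZER(pending);

    /* The _SAFE variant caches the next pointer before the loop body runs,
     * so removing and freeing the current element is allowed. */
    static void clear_pending_events(void)
    {
        Event *ev, *next_ev;

        QTAILQ_FOREACH_SAFE(ev, &pending, node, next_ev) {
            QTAILQ_REMOVE(&pending, ev, node);
            g_free(ev);
        }
    }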
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This patch removes the qdev_get_machine() calls that are made
in spapr_cpu_core.c in situations where we can get an existing
pointer for the MachineState by either passing it as an argument
to the function or by using other already available pointers.
Credits to Daniel Henrique Barboza for the idea and the changelog
text.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When running a pseries-2.2 or older machine type, we get the following
lines in info mtree:
address-space: memory
...
ffffffffffffffff-ffffffffffffffff (prio 0, i/o): alias
pci@800000020000000.mmio64-alias @pci@800000020000000.mmio
ffffffffffffffff-ffffffffffffffff
address-space: cpu-memory
...
ffffffffffffffff-ffffffffffffffff (prio 0, i/o): alias
pci@800000020000000.mmio64-alias @pci@800000020000000.mmio
ffffffffffffffff-ffffffffffffffff
The same thing occurs when running a pseries-2.7 with
-global spapr-pci-host-bridge.mem_win_size=2147483648
This happens because we always create a 64-bit MMIO window, even if
we didn't explicitly request it (i.e., mem64_win_size == 0) and the
32-bit window is below 2 GiB. It doesn't seem to have an impact on the
guest though because spapr_populate_pci_dt() doesn't advertise the
bogus windows when mem64_win_size == 0.
Since these memory regions don't induce any state, we can safely
choose to not create them when their address is equal to -1,
without breaking migration from existing setups.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Since commit 7cca3e466e ("ppc: spapr: Move VCPU ID calculation into
sPAPR"), QEMU aborts when started with a *-spapr-cpu-core device and
a non-pseries machine.
Let's rely on the already existing call to object_dynamic_cast() instead
of using the SPAPR_MACHINE() macro.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
If object_property_get_link() fails then it allocates an error, which
must be freed before returning. The error_get_pretty() function is
merely an accessor to the error message and doesn't free anything.
The error.h header indicates how to do it right:
* Pass an existing error to the caller with the message modified:
* error_propagate(errp, err);
* error_prepend(errp, "Could not frobnicate '%s': ", name);
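A hedged sketch of how that applies here (names are illustrative, not
the exact hunk):
    Error *local_err = NULL;
    Object *obj = object_property_get_link(OBJECT(dev), "phb", &local_err);

    if (!obj) {
        error_propagate(errp, local_err);
        error_prepend(errp, "Could not get the PHB link: ");
        return;
    }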
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The existing code tries to round up the number of pages, but @pages is always
calculated as the rounded-up value minus one, which makes ctz64() always
return 0 and create.levels always be set to 1.
This removes the wrong "-1" and allows having more than 1 level. This becomes
handy for >128GB guests with standard 64K pages, as this requires blocks
with zone order 9 and the popular limit of CONFIG_FORCE_MAX_ZONEORDER=9
means that only blocks up to order 8 are allowed.
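A worked example of the off-by-one (illustrative values only):
    pages = 0x200000;      /* 2^21 pages, e.g. 128GB of 64K pages */
    ctz64(pages - 1);      /* = 0, so the old code always got 1 level */
    ctz64(pages);          /* = 21, allowing more than one level */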
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
libfdt failures when creating the FDT should cause QEMU to terminate.
Let's use the _FDT() macro which does just that instead of propagating
the error to the caller. spapr_populate_pci_child_dt() no longer needs
to return a value in this case.
Note that, along the way, this gets rid of the following nonsensical lines:
    g_assert(!ret);
    if (ret) {
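For reference, the _FDT() pattern used instead looks like this (the
property calls are illustrative, not the exact hunk):
    _FDT(fdt_setprop_cell(fdt, offset, "#address-cells", 3));
    _FDT(fdt_setprop_cell(fdt, offset, "#size-cells", 2));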
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
All other users in hw/ppc already consider an error when building
the FDT to be fatal, even on hotplug paths. There's no valid reason
for spapr_pci to behave differently. So let's use the common _FDT()
helper, which terminates QEMU when libfdt fails.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The OV5_MMU_RADIX_300 bit requires special handling in the CAS negotiation
process. It is cleared from the option vector of the guest before
evaluating the changes and re-added later. But, when testing for a
possible CAS reset:
    spapr->cas_reboot = spapr_ovec_diff(ov5_updates,
                                        ov5_cas_old, spapr->ov5_cas);
the OV5_MMU_RADIX_300 bit is seen each time as removed from the
previous OV5 set, hence generating a reset loop.
Fix this problem by also clearing the same bit in the ov5_cas_old set.
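A minimal sketch of the fix (assuming the spapr_ovec_clear() helper):
    /* clear the bit on the "old" side too, so the diff stops flagging it */
    spapr_ovec_clear(ov5_cas_old, OV5_MMU_RADIX_300);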
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
On POWER9, the Client Architecture Support (CAS) negotiation process
determines whether the guest operates in XIVE Legacy compatibility or
in XIVE exploitation mode. Now that we have initial guest support for
the XIVE interrupt controller, let's fix the bit definitions, which have
evolved in the latest specs.
The platform advertises XIVE Exploitation Mode support using the
property "ibm,arch-vec-5-platform-support-vec-5", byte 23, bits 0-1:
- 0b00 XIVE legacy mode Only
- 0b01 XIVE exploitation mode Only
- 0b10 XIVE legacy or exploitation mode
The OS asks for XIVE Exploitation Mode support using the property
"ibm,architecture-vec-5", byte 23 bits 0-1:
- 0b00 XIVE legacy mode Only
- 0b01 XIVE exploitation mode Only
Signed-off-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit b55d295e3e added the possibility to support HPT resizing with KVM.
In the case of PR, we need to pass the userspace address of the HPT to KVM
using the SDR1 slot.
This is handled by kvmppc_update_sdr1() which uses CPU_FOREACH() to update
all CPUs. It is hence unnecessary to call kvmppc_update_sdr1() for each CPU.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Building strings with g_strdup_printf() instead of snprintf() is
common QEMU practice.
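For example (illustrative names only):
    char *buf = g_strdup_printf("%s@%x", name, index);
    /* the buffer is sized automatically and must be freed */
    g_free(buf);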
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
spapr_phb_get_loc_code() either returns a non-null pointer, or aborts
if g_strdup_printf() failed to allocate memory.
Signed-off-by: Greg Kurz <groug@kaod.org>
[dwg: Grammatical fix to commit message]
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
g_strdup_printf() either returns a non-null pointer, or aborts if it
failed to allocate memory.
Signed-off-by: Greg Kurz <groug@kaod.org>
[dwg: Grammatical fix to commit message]
Acked-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This patch removes the qdev_get_machine() calls that are made in
spapr.c in situations where we can get an existing pointer for
the MachineState by either passing it as an argument to the function
or by using other already available pointers.
The following changes were made (a short illustration follows the list):
- spapr_node0_size: static function that is called two times:
at spapr_setup_hpt_and_vrma and ppc_spapr_init. In both cases we can
pass an existing MachineState pointer to it.
- spapr_build_fdt: MachineState pointer can be retrieved from
the existing sPAPRMachineState pointer.
- spapr_boot_set: the opaque in the first arg is a sPAPRMachineState
pointer as we can see inside ppc_spapr_init:
qemu_register_boot_set(spapr_boot_set, spapr);
We can get a MachineState pointer from it.
- spapr_machine_device_plug and spapr_machine_device_unplug_request: the
MachineState, sPAPRMachineState, MachineClass and sPAPRMachineClass pointers
can all be retrieved from the HotplugHandler pointer.
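A sketch of the idea, not the exact hunks:
    /* from an existing sPAPRMachineState pointer */
    MachineState *machine = MACHINE(spapr);
    /* from the HotplugHandler argument of the plug/unplug handlers */
    sPAPRMachineState *spapr = SPAPR_MACHINE(hotplug_dev);
    MachineClass *mc = MACHINE_GET_CLASS(hotplug_dev);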
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This adds a simplistic emulation of the Sun GEM ethernet controller
found in Apple ASICs.
Currently we only support the Apple UniNorth 1.x variant, but the
other Apple or Sun variants should mostly be a matter of adding
PCI ID options.
We have a very primitive emulation of a single Broadcom 5201 PHY
which is supported by the MacOS driver.
This model brings out-of-the-box networking to MacOS 9, and all
versions of OS X I tried with the mac99 platform.
Further improvements from Mark:
- Remove sungem.h file, moving constants into sungem.c as required
- Switch to using tracepoints for debugging
- Split register blocks into separate memory regions
- Use arrays in SunGEMState to hold register values
- Add state-saving support
Signed-off-by: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Previously, single stepping through an ERET instruction via GDB
would result in the debugger entering the "next" PC after the ERET instruction.
When debugging in kernel mode, this will also cause unintended behavior,
because the debugger will try to access memory from the EL0 point of view.
Signed-off-by: Jaroslaw Pelczar <j.pelczar@samsung.com>
Message-id: 001c01d32895$483027f0$d89077d0$@samsung.com
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a machine level virtualization property. This defaults to false and can be
set to true using this machine command line argument:
-machine xlnx-zcu102,virtualization=on
This follows what the ARM virt machine does.
This property only applies to the ZCU102 machine. The EP108 machine does
not have this property.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a machine level secure property. This defaults to false and can be
set to true using this machine command line argument:
-machine xlnx-zcu102,secure=on
This follows what the ARM virt machine does.
This property only applies to the ZCU102 machine. The EP108 machine does
not have this property.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In preparation for future work, let's manually create the Xilinx machines.
This will allow us to set properties for the machines in the future.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The EP108 is an early access development board. Now that silicon is in
production people have access to the ZCU102. Let's rename the internal
QEMU files and variables to use the ZCU102.
There is no functional change here as the EP108 is still a valid board
option.
Signed-off-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
In the v7M and v8M ARM ARM, the magic exception return values are
referred to as EXC_RETURN values, and in QEMU we use V7M_EXCRET_*
constants to define bits within them. Rename the 'type' variable
which holds the exception return value in do_v7m_exception_exit()
to excret, making it clearer that it does hold an EXC_RETURN value.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1505137930-13255-8-git-send-email-peter.maydell@linaro.org
The exception-return magic values get some new bits in v8M, which
makes some bit definitions for them worthwhile.
We don't use the bit definitions for the switch on the low bits
which checks the return type for v7M, because this is defined
in the v7M ARM ARM as a set of valid values rather than via
per-bit checks.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1505137930-13255-7-git-send-email-peter.maydell@linaro.org
In several places we were unconditionally applying the
nvic_gprio_mask() to a priority value. This is incorrect
if the priority is one of the fixed negative priority
values (for NMI and HardFault), so don't do it.
This bug would have caused both NMI and HardFault to be
considered as the same priority and so NMI wouldn't
correctly preempt HardFault.
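A hedged sketch of the guard (s being the NVIC state, as in the existing
helper; the exact placement differs per call site):
    /* only group-mask configurable (non-negative) priorities */
    if (prio >= 0) {
        prio &= nvic_gprio_mask(s);
    }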
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1505137930-13255-5-git-send-email-peter.maydell@linaro.org
HMP pull 2017-09-14
# gpg: Signature made Thu 14 Sep 2017 15:57:30 BST
# gpg: using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert/tags/pull-hmp-20170914:
hmp: introduce 'info memory_size_summary' command
qmp: introduce query-memory-size-summary command
hmp: extend "info numa" with hotplugged memory information
tests/hmp: test "none" machine with memory
dump: do not dump non-existent guest memory
hmp: fix "dump-quest-memory" segfault (arm)
hmp: fix "dump-quest-memory" segfault (ppc)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
It does not really make sense to dump memory that is not there.
Moreover, that fixes a segmentation fault when calling dump-guest-memory
with no filter for a machine with no memory defined.
New behaviour is:
(qemu) dump-guest-memory /dev/null
dump: no guest memory to dump
(qemu) dump-guest-memory /dev/null 0 4096
dump: no guest memory to dump
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Tested-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Peter Xu <peterx@redhat.com>
Message-Id: <20170913142036.2469-4-lvivier@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Running QEMU with
qemu-system-aarch64 -M none -nographic -m 256
and executing
dump-guest-memory /dev/null 0 8192
results in a segfault.
Fix by checking if we have a CPU, and exit with an
error if there is none:
(qemu) dump-guest-memory /dev/null
this feature or command is not currently supported
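A minimal sketch of the guard (placement and exact form assumed):
    if (!first_cpu) {
        error_setg(errp, QERR_UNSUPPORTED);
        return;
    }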
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Message-Id: <20170913142036.2469-3-lvivier@redhat.com>
Reviewed-by: Eric Auger <eric.auger@redhat.com>
Tested-by: Eric Auger <eric.auger@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Running QEMU with
qemu-system-ppc64 -M none -nographic -m 256
and executing
dump-guest-memory /dev/null 0 8192
results in a segfault.
Fix by checking if we have a CPU, and exit with an
error if there is none:
(qemu) dump-guest-memory /dev/null
this feature or command is not currently supported
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <20170913142036.2469-2-lvivier@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Nowadays we use libusb for usb-host, so we don't have different code
for Linux vs. BSD any more. So there is little reason to have the
HOST_USB variable; we can just write things directly into the Makefile
and avoid a pointless indirection.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 20170908111217.21985-2-kraxel@redhat.com
The existing XHCI code reads the Event Ring Segment Table Base Address
Register (ERSTBA) every time it is changed. However, zero is its
default state, so one would expect that a value of zero there means it is not in use.
This adds a check for ERSTBA in addition to the existing check for
the Event Ring Segment Table Size Register (ERSTSZ).
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Message-id: 20170911065606.40600-1-aik@ozlabs.ru
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Some termcaps (found using SLES11SP1) use [? sequences. According to man
console_codes (http://linux.die.net/man/4/console_codes) the question mark
is a nop and should simply be ignored.
This patch does exactly that, rendering screen output readable when
outputting guest serial consoles to the graphical console emulator.
Signed-off-by: Alexander Graf <agraf@suse.de>
Message-id: 20170829113818.42482-1-agraf@suse.de
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
virtio-gpu can trigger the assert added by commit "6905b93447 console:
add same surface replace pre-condition" in multihead setups (where
surface can be NULL for secondary displays). Allow the surface to be NULL.
Fixes: 6905b93447
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20170906142109.2685-1-kraxel@redhat.com
Remove the pixman switches from configure; they should not be needed any more,
as configure can figure out by itself whether pixman is needed or not.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170905140116.28181-3-kraxel@redhat.com
Drop pixman submodule and support for the "internal" pixman build.
pixman should be reasonably well established meanwhile so we don't
need the fallback submodule any more. While being at it also drop
some #ifdefs for pixman versions older than what we require in
configure anyway.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170905140116.28181-2-kraxel@redhat.com
Don't reset window layout information (passed via virtio_gpu_ui_info) on
device reset, so the user interface window layout will be kept intact
over reboots. The head size and position were commented out already, so
this patch just drops the dead code. Additionally the enabled head mask
must be kept so multihead setups work properly too.
Fixes: https://bugzilla.redhat.com/show_bug.cgi?id=1460595
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 20170906142058.2460-1-kraxel@redhat.com
GCC 4.7.2 on SunOS reports that the values assigned to array members are not
real constants:
target/m68k/fpu_helper.c:32:5: error: initializer element is not constant
target/m68k/fpu_helper.c:32:5: error: (near initialization for 'fpu_rom[0]')
rules.mak:66: recipe for target 'target/m68k/fpu_helper.o' failed
Convert the array to make_floatx80_init() to fix it.
Replace floatx80_pi-like constants with make_floatx80_init() as they are
defined as make_floatx80().
This fixes build on SmartOS (Joyent).
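Illustrative shape of the change (a single entry shown, not the full table):
    static const floatx80 fpu_rom[128] = {
        [0x00] = make_floatx80_init(0x4000, 0xc90fdaa22168c235ULL), /* pi */
    };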
Signed-off-by: Kamil Rytarowski <n54@gmx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20170904212306.3020-1-n54@gmx.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
ppc patch queue 2017-09-08
This is the first batch of ppc related patches for qemu-2.11, and it's
accumulated quite a few things. Includes:
* A cleanup to handling of ppc cpu models from Igor
* First parts of fixes to handling of guest vs. host SMT modes from
Sam Bobroff
* Preliminary patches towards supporting the Sam460 board from
Balaton Zoltan
* Several fixes for hotplug logic
* Assorted other fixes and cleanups
# gpg: Signature made Fri 08 Sep 2017 06:28:42 BST
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.11-20170908: (40 commits)
ppc: spapr: Move VCPU ID calculation into sPAPR
ppc: remove non implemented cpu models
ppc: drop caching ObjectClass from PowerPCCPUAlias
ppc: simplify cpu model lookup by PVR
ppc: replace inter-function cyclic dependency/recurssion with 2 simple lookups
ppc: make cpu alias point only to real cpu models
ppc: make cpu_model translation to type consistent
ppc: use macros to make cpu type name from string literal
target/ppc: Remove old STATUS file
PPC: KVM: Support machine option to set VSMT mode
spapr: fallback to raw mode if best compat mode cannot be set during CAS
hw/nvram/spapr_nvram: Device can not be created by the users
hw/ppc/spapr_cpu_core: Add a proper check for spapr machine
ppc4xx: Export ECB and PLB emulation
ppc4xx_i2c: Move to hw/i2c
ppc4xx_i2c: QOMify
ppc4xx: Split off 4xx I2C emulation from ppc405_uc to its own file
ppc4xx: Make MAL emulation more generic
ppc4xx: Move MAL from ppc405_uc to ppc4xx_devs
spapr_iommu: Realloc guest visible TCE table when hot(un)plugging vfio-pci
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The callback is called on select.
Furthermore, the next patch introduces a new callback, so rename the
function type to a generic name.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Add a new slot_reserved_mask bitmask to PCIBus indicating whether or not each
PCI slot on the bus is reserved. Ensure that it is initialised to zero to
maintain the existing behaviour that all slots are available by default, and
add the additional check with appropriate error reporting to
do_pci_register_device().
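A hedged sketch of the check in do_pci_register_device() (error wording
is illustrative):
    if (bus->slot_reserved_mask & (1UL << PCI_SLOT(devfn))) {
        error_setg(errp, "PCI: slot %d is reserved", PCI_SLOT(devfn));
        return NULL;
    }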
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This compat property's sole function is to prevent the device from being
instantiated. Instead of requiring an extra compat property, check if
fw_cfg has DMA enabled.
fw_cfg is a built-in device that is initialized very early by the
machine init code. We have at least one other device that also
assumes fw_cfg_find() can be safely used on realize: pvpanic.
This has the additional benefit of handling other cases properly, like:
$ qemu-system-x86_64 -device vmgenid -machine none
qemu-system-x86_64: -device vmgenid: vmgenid requires DMA write support in fw_cfg, which this machine type does not provide
$ qemu-system-x86_64 -device vmgenid -machine pc-i440fx-2.9 -global fw_cfg.dma_enabled=off
qemu-system-x86_64: -device vmgenid: vmgenid requires DMA write support in fw_cfg, which this machine type does not provide
$ qemu-system-x86_64 -device vmgenid -machine pc-i440fx-2.6 -global fw_cfg.dma_enabled=on
[boots normally]
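A minimal sketch of the realize-time check (error text taken from the
output above; exact helper wiring assumed):
    FWCfgState *fw_cfg = fw_cfg_find();

    if (!fw_cfg || !fw_cfg_dma_enabled(fw_cfg)) {
        error_setg(errp, "vmgenid requires DMA write support in fw_cfg, "
                   "which this machine type does not provide");
        return;
    }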
Suggested-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Ben Warren <ben@skyportsystems.com>
Reviewed-by: Laszlo Ersek <lersek@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Moved vmgenid from uncategorized to misc category in QEMU help menu
Signed-off-by: Yoni Bettan <ybettan@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
In vtd_switch_address_space() we do the memory region switch; however,
it's possible that its caller has not taken the BQL at all. Make
sure we have it.
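A hedged sketch of the guard (using the standard QEMU BQL helpers):
    bool take_bql = !qemu_mutex_iothread_locked();

    if (take_bql) {
        qemu_mutex_lock_iothread();
    }
    /* ... do the memory region switch ... */
    if (take_bql) {
        qemu_mutex_unlock_iothread();
    }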
CC: Paolo Bonzini <pbonzini@redhat.com>
CC: Jason Wang <jasowang@redhat.com>
CC: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
To enable hotplugging of a newly created pcie-pci-bridge,
we need to tell firmware (e.g. SeaBIOS) to reserve
additional buses or IO/MEM/PREF space for pcie-root-port.
Additional bus reservation allows us to hotplug pcie-pci-bridge into this root port.
The number of buses and the IO/MEM/PREF space to reserve are provided to the device via
corresponding properties, and to the firmware via a new PCI capability.
The properties' default values are -1 to keep default behavior unchanged.
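For example, a root port could be configured roughly like this (property
names shown here are illustrative and may differ from the final ones):
    -device pcie-root-port,id=rp0,bus-reserve=4,io-reserve=4K,mem-reserve=64M,pref64-reserve=64M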
Signed-off-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
On PCI init PCI bridges may need some extra info about bus number,
IO, memory and prefetchable memory to reserve. QEMU can provide this
with a special vendor-specific PCI capability.
Signed-off-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Introduce a new PCIExpress-to-PCI Bridge device,
which is a hot-pluggable PCI Express device and
supports device hot-plug with SHPC.
This device is intended to replace the DMI-to-PCI Bridge.
Signed-off-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Tested-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This reverts commit 153eba4726.
This patch prevents PCI passthrough hotplug on Xen. Even if the Xen tool
stack prepares its own ACPI tables, we still rely on QEMU for hotplug
ACPI notifications.
The original issue is fixed by the two previous patches:
hw/acpi: Limit hotplug to root bus on legacy mode
hw/acpi: Move acpi_set_pci_info to pcihp
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
HW part of ACPI PCI hotplug in QEMU depends on ACPI_PCIHP_PROP_BSEL
being set on a PCI bus that supports ACPI hotplug. It should work
regardless of the source of ACPI tables (QEMU generator/legacy SeaBIOS/Xen).
So move ACPI_PCIHP_PROP_BSEL initialization into HW ACPI implementation
part from QEMU's ACPI table generator.
To do PCI passthrough with Xen, the property ACPI_PCIHP_PROP_BSEL needs
to be set, but this was done only when ACPI tables are built which is
not needed for a Xen guest. The need for the property starts with commit
"pc: pcihp: avoid adding ACPI_PCIHP_PROP_BSEL twice"
(f0c9d64a68).
Add find_i440fx to the stubs so that the mips-softmmu target can be built.
Reported-by: Sander Eikelenboom <linux@eikelenboom.it>
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
vhost registers a MemoryListener where it adds and removes references
to MemoryRegions as the MemoryRegionSections pass through. The
region_add callback is invoked for each existing section when the
MemoryListener is registered, but unregistering the MemoryListener
performs no reciprocal region_del callback. It's therefore the
owner of the MemoryListener's responsibility to cleanup any persistent
changes, such as these memory references, after unregistering.
The consequence of this bug is that if we have both a vhost device
and a vfio device, the vhost device will reference any mmap'd MMIO of
the vfio device via this MemoryListener. If the vhost device is then
removed, those references remain outstanding. If we then attempt to
remove the vfio device, it never gets finalized and the only way to
release the kernel file descriptors is to terminate the QEMU process.
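A hedged sketch of the cleanup on the vhost side (field names assumed):
    int i;

    memory_listener_unregister(&hdev->memory_listener);
    for (i = 0; i < hdev->n_mem_sections; i++) {
        memory_region_unref(hdev->mem_sections[i].mr);
    }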
Fixes: dfde4e6e1a ("memory: add ref/unref calls")
Cc: Michael S. Tsirkin <mst@redhat.com>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: qemu-stable@nongnu.org # v1.6.0+
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Fri 08 Sep 2017 03:00:34 BST
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
colo-compare: Update the COLO document to add the IOThread configuration
colo-compare: Use IOThread to Check old packet regularly and Process pactkets of the primary
qemu-iothread: IOThread supports the GMainContext event loop
net/colo-compare.c: Fix comments and scheme
net/colo-compare.c: Adjust net queue pop order for performance
net/colo-compare.c: Optimize unpredictable tcp options comparison
e1000: Rename the SEC symbol to SEQEC
net/socket: Improve -net socket error reporting
net/net: Convert parse_host_port() to Error
net/socket: Convert several helper functions to Error
net/socket: Don't treat odd socket type as SOCK_STREAM
MAINTAINERS: Update mail address for COLO Proxy
net: rtl8139: do not use old_mmio accesses
net/rocker: Fix the unusual macro name
net/rocker: Convert to realize()
net/rocker: Plug memory leak in pci_rocker_init()
net/rocker: Remove the dead error handling
net/filter-rewriter.c: Fix rewirter checksum bug when use virtio-net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
TCG constant pools
# gpg: Signature made Thu 07 Sep 2017 23:35:45 BST
# gpg: using RSA key 0x64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth/tags/pull-tcg-20170907: (23 commits)
tcg/ppc: Use constant pool for movi
tcg/ppc: Look for shifted constants
tcg/ppc: Change TCG_REG_RA to TCG_REG_TB
tcg/arm: Use constant pool for call
tcg/arm: Use constant pool for movi
tcg/arm: Extract INSN_NOP
tcg/arm: Code rearrangement
tcg/arm: Tighten tlb indexing offset test
tcg/arm: Improve tlb load for armv7
tcg/sparc: Use constant pool for movi
tcg/sparc: Introduce TCG_REG_TB
tcg/aarch64: Use constant pool for movi
tcg/s390: Use constant pool for cmpi
tcg/s390: Use constant pool for xori
tcg/s390: Use constant pool for ori
tcg/s390: Use constant pool for andi
tcg/s390: Use constant pool for movi
tcg/s390: Fix sign of patch_reloc addend
tcg/s390: Introduce TCG_REG_TB
tcg/i386: Store out-of-range call targets in constant pool
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Conversion to TranslatorOps
# gpg: Signature made Thu 07 Sep 2017 19:42:48 BST
# gpg: using RSA key 0x64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth/tags/pull-pa-20170907:
target/hppa: Convert to TranslatorOps
target/hppa: Convert to DisasContextBase
target/hppa: Convert to DisasJumpType
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Queued target/alpha patches
# gpg: Signature made Thu 07 Sep 2017 19:17:22 BST
# gpg: using RSA key 0x64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth/tags/pull-axp-20170907:
target/alpha: Switch to do_transaction_failed() hook
target/alpha: Convert to TranslatorOps
target/alpha: Convert to DisasContextBase
target/alpha: Convert to DisasJumpType
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Remove the task which checks old packets in the comparing thread,
then use an IOThread context timer to handle it.
Process packets arriving over the socket in the IOThread.
We use iothread_get_g_main_context to create a new g_main_loop in
the IOThread, then the packets from the primary and the secondary
are processed in the IOThread.
Finally, remove the colo-compare thread and use the IOThread instead.
Reviewed-by: Zhang Chen<zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Wang Yong <wang.yong155@zte.com.cn>
Signed-off-by: Wang Guang <wang.guang55@zte.com.cn>
Signed-off-by: Jason Wang <jasowang@redhat.com>
IOThread uses the AioContext event loop and does not run a GMainContext.
Therefore, a chardev cannot work in an IOThread, such as when the chardev is
used for colo-compare packet reception.
This patch makes the IOThread run the GMainContext event loop, so that
chardev and IOThread can work together.
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Wang Yong <wang.yong155@zte.com.cn>
Signed-off-by: Wang Guang <wang.guang55@zte.com.cn>
Signed-off-by: Jason Wang <jasowang@redhat.com>
packet_enqueue() uses g_queue_push_tail() to
enqueue net packets, so it is more efficient to use
g_queue_pop_head() to get packets for comparison.
That will improve the success rate of comparison.
In my test, the performance of an FTP put of a 1000M file
increased by 10%.
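In short, the FIFO pairing looks like this (field names are shown for
illustration only):
    /* enqueue at the tail ... */
    g_queue_push_tail(&conn->primary_list, pkt);
    /* ... and dequeue from the head, so packets are compared in order */
    pkt = g_queue_pop_head(&conn->primary_list);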
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
When the network is busy, some TCP options (like SACK) will unpredictably
occur on the primary side or the secondary side. That makes the packet sizes
differ, even though the two packets' payloads are identical. COLO only
cares about the packet payload, so we skip the option field.
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
SunOS defines SEC in <sys/time.h> as 1 (commonly used time symbols).
This fixes build on SmartOS (Joyent).
Patch cherry-picked from pkgsrc by jperkin (Joyent).
Signed-off-by: Kamil Rytarowski <n54@gmx.com>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
When -net socket fails, it first reports a specific error, then
a generic one, like this:
$ ./x86_64-softmmu/qemu-system-x86_64 -net socket,mcast=230.0.0.1:1234,listen
qemu-system-x86_64: -net socket,mcast=230.0.0.1:1234,listen: exactly one of listen=, connect=, mcast= or udp= is required
qemu-system-x86_64: -net socket,mcast=230.0.0.1:1234,listen: Device 'socket' could not be initialized
Convert net_socket_*_init() to Error to get rid of the superfluous second
error message. After the patch, the effect like this:
$ ./x86_64-softmmu/qemu-system-x86_64 -net socket,mcast=230.0.0.1:1234,listen
qemu-system-x86_64: -net socket,mcast=230.0.0.1:1234,listen: exactly one of listen=, connect=, mcast= or udp= is required
This also fixes a few silent failures to report an error.
Cc: jasowang@redhat.com
Cc: armbru@redhat.com
Cc: berrange@redhat.com
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
In net_socket_fd_init(), the 'default' case is odd: it warns,
then continues as if the socket type was SOCK_STREAM. The
comment explains "this could be e.g. a pty", but that makes
no sense. If @fd really was a pty, getsockopt() would fail
with ENOTSOCK. If @fd was a socket, but neither SOCK_DGRAM nor
SOCK_STREAM, it should not be treated as if it was SOCK_STREAM.
Turn this case into an error. If there is a genuine reason to
support something like SOCK_RAW, it should be explicitly
handled.
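A hedged sketch of the new default case (variable names and wording assumed):
    default:
        error_setg(errp, "socket type=%d for fd=%d is not supported",
                   so_type, fd);
        closesocket(fd);
        return NULL;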
Cc: jasowang@redhat.com
Cc: armbru@redhat.com
Cc: berrange@redhat.com
Cc: eblake@redhat.com
Suggested-by: Markus Armbruster <armbru@redhat.com>
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
My Fujitsu mail account will be disabled soon, so update the mail info
to my private mail address.
Signed-off-by: Zhang Chen <zhangchen.fnst@cn.fujitsu.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Both io and memory use the same mmio functions in the rtl8139 device.
This patch removes the separate MemoryRegionOps and old_mmio accessors
for memory, and replaces them with an alias to the io memory region.
Signed-off-by: Matt Parker <mtparkr@gmail.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
The rocker device still implements the old PCIDeviceClass .init()
instead of the new .realize(). All devices need to be converted to
.realize().
.init() reports errors with fprintf() and returns 0 on success, a negative
number on failure. Meanwhile, when -device rocker fails, it first reports
a specific error, then a generic one, like this:
$ x86_64-softmmu/qemu-system-x86_64 -device rocker,name=qemu-rocker
rocker: name too long; please shorten to at most 9 chars
qemu-system-x86_64: -device rocker,name=qemu-rocker: Device initialization failed
Now, convert it to .realize() that passes errors to its callers via its
errp argument. Also avoid the superfluous second error message. After
the patch, the effect is like this:
$ x86_64-softmmu/qemu-system-x86_64 -device rocker,name=qemu-rocker
qemu-system-x86_64: -device rocker,name=qemu-rocker: name too long; please shorten to at most 9 chars
Cc: jasowang@redhat.com
Cc: jiri@resnulli.us
Cc: armbru@redhat.com
Cc: f4bug@amsat.org
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Memory allocation functions like world_alloc, desc_ring_alloc etc.
are all wrappers around g_malloc, g_new etc. But g_malloc and
similar functions don't return null (apart from the fact
that g_malloc() of 0 bytes returns null), so error checks for these
allocation failures are superfluous. Now, remove them entirely.
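An illustration of the dead pattern being removed (names are generic):
    World *world = g_new0(World, 1);   /* aborts on OOM, never NULL */
    if (!world) {                      /* branch can never be taken */
        return NULL;
    }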
Cc: jasowang@redhat.com
Cc: jiri@resnulli.us
Cc: armbru@redhat.com
Cc: f4bug@amsat.org
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
Move the calculation of a CPU's VCPU ID out of the generic PPC code
(ppc_cpu_realizefn()) and into sPAPR specific code
(spapr_cpu_core_realize()) where it belongs.
Unfortunately, due to the way things are ordered, we still need to
default the VCPU ID in ppc_cpu_realizefn(), but at least doing that
doesn't require any interaction with sPAPR.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Remove cpu models that aren't implemented and are not
compiled/tested since they are under a TODO ifdef
which isn't defined in the sources.
If someone really needs a removed model, he/she should add it
as a regular one with a corresponding implementation.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Caching there gives practically no benefit, and only on the slow path
of querying the supported CPU list.
But it introduces an unconventional path for where the used CPU type
name comes from (kvm_ppc_register_host_cpu_type).
Taking into account that kvm_ppc_register_host_cpu_type()
fixes up the models the aliases point to, it's sufficient to
make ppc_cpu_class_by_name() translate a cpu alias to the
correct cpu type name.
So drop the PowerPCCPUAlias::oc field + ppc_cpu_class_by_alias()
and let ppc_cpu_class_by_name() do the conversion to the cpu type name,
which simplifies the code a little bit, saving ~20 LOC and the trouble
of wondering why ppc_cpu_class_by_alias() is necessary.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Previous patches cleaned up cpu model/alias naming, which
allows simplifying the cpu model/alias to cpu type lookup a bit
by removing the recursion and the dependency of ppc_cpu_class_by_name() /
ppc_cpu_class_by_alias() on each other.
Besides simplifying the code, it reduces it by ~15 LOC.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
An alias pointing to another alias forces the lookup code to
do recursive translation until a real cpu model is reached.
Drop this nonsense and make each alias point to a cpu model
that has a corresponding CPU type. It will allow dropping the
recursion in the cpu model translation code and actually
make the ppc_cpu_aliases[] content use the PowerPCCPUAlias
fields properly
(i.e. the alias goes into .alias and the model goes into .model).
While at it, add TODO defines around aliases that point to
cpu models excluded by the same TODO defines.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
PPC handles -cpu FOO rather inconsistently,
i.e. it does case-insensitive matching of FOO to
a CPU type (see: ppc_cpu_compare_class_name) but
handles alias names as case-sensitive, as a result:
# qemu-system-ppc64 -M mac99 -cpu g3
qemu-system-ppc64: unable to find CPU model ' kN�U'
# qemu-system-ppc64 -cpu 970MP_V1.1
qemu-system-ppc64: Unable to find sPAPR CPU Core definition
while
# qemu-system-ppc64 -M mac99 -cpu G3
# qemu-system-ppc64 -cpu 970MP_v1.1
start up just fine.
Considering we can't take case-insensitive matching away,
make it case-insensitive for all alias/type/core_type
lookups.
As a side effect this allows removing duplicate core types
which are the same except for using different-cased letters in the name.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Replace
    "-" TYPE_POWERPC_CPU
when composing a cpu type name from a cpu model string literal,
and the same pattern in format strings, with the
POWERPC_CPU_TYPE_SUFFIX and POWERPC_CPU_TYPE_NAME(model)
macros, like we do in x86.
Later POWERPC_CPU_TYPE_NAME() will be used to define the default
cpu type per machine type, and as a bonus it will be a consistent
and easily grep-able pattern across all other targets that I'm
planning to treat the same way.
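For reference, the macros have roughly this shape (mirroring the x86
pattern mentioned above):
    #define POWERPC_CPU_TYPE_SUFFIX "-" TYPE_POWERPC_CPU
    #define POWERPC_CPU_TYPE_NAME(model) model POWERPC_CPU_TYPE_SUFFIX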
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The target/ppc/STATUS file has seen its last real update 10 years
ago - so the information in there is not up to date anymore. Since
nobody seems to care about this file, let's simply remove it.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
KVM now allows writing to KVM_CAP_PPC_SMT which has previously been
read only. Doing so causes KVM to act, for that VM, as if the host's
SMT mode was the given value. This is particularly important on Power
9 systems because their default value is 1, but they are able to
support values up to 8.
This patch introduces a way to control this capability via a new
machine property called VSMT ("Virtual SMT"). If the value is not set
on the command line a default is chosen that is, when possible,
compatible with legacy systems.
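For example, the property would be set like this (illustrative value):
    -machine pseries,vsmt=4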
Note that the initialization of KVM_CAP_PPC_SMT has changed slightly
because it has changed (in KVM) from a global capability to a
VM-specific one. This won't cause a problem on older KVMs because VM
capabilities fall back to global ones.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
KVM PR doesn't allow setting a compat mode. This causes ppc_set_compat_all()
to fail and we return H_HARDWARE to the guest right away.
This is excessive: even if we favor compat mode since commit 152ef803ce,
we should at least fallback to raw mode if the guest supports it.
This patch modifies cas_check_pvr() so that it also reports that the real
PVR was found in the table supplied by the guest. Note that this only
makes sense if raw mode isn't explicitly disabled (i.e., the user didn't
set the machine "max-cpu-compat" property). If this is the case, we can
simply ignore ppc_set_compat_all() failures, and let the guest run in raw
mode.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Trying to add a spapr-nvram device currently aborts QEMU like this:
$ ppc64-softmmu/qemu-system-ppc64 -device spapr-nvram
qemu-system-ppc64: hw/ppc/spapr_rtas.c:407: spapr_rtas_register:
Assertion `!rtas_table[token].name' failed.
Aborted (core dumped)
This NVRAM device registers RTAS calls during its realize function
and thus can only be used once - and that's internally from spapr.c.
So let's mark the device with user_creatable = false to prevent
users from crashing their QEMU this way.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
QEMU currently crashes when the user tries to add a spapr-cpu-core
on a non-pseries machine:
$ qemu-system-ppc64 -S -machine ppce500,accel=tcg \
-device POWER5+_v2.1-spapr-cpu-core
hw/ppc/spapr_cpu_core.c:178:spapr_cpu_core_realize_child:
Object 0x55cee1f55160 is not an instance of type spapr-machine
Aborted (core dumped)
So let's add a proper check for the correct machine type with
a more friendly error message here.
Reported-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Make these device models available outside ppc405_uc.c for reuse in
460EX emulation. They are left in their current place for now because
they are used mostly unchanged and I'm not sure these correctly model
the components in 440 SoCs (but they seem to be good enough). These
functions could be moved in a subsequent clean up series when this is
confirmed.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This device appears in other SoCs as well, not just in 405 ones, and
subsequent patches will modify it, so move it out of ppc405_uc.c in
preparation.
Signed-off-by: BALATON Zoltan <balaton@eik.bme.hu>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This replaces g_malloc() with spapr_tce_alloc_table() as this is
the standard way of allocating tables and this allows moving the table
back to KVM when unplugging a VFIO PCI device and VFIO TCE acceleration
support is not present in the KVM.
Although spapr_tce_alloc_table() is expected to fail with EBUSY
if called when previous fd is not closed yet, in practice we will not
see it because cap_spapr_vfio is false at the moment.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Some OSes don't populate the TSIZE field when using a fixed-size TLB, which results
in a 1KB TLB. When the TLB is a fixed-size TLB, the TSIZE field should be
ignored.
Fix this wrong behavior with MAV 2.0.
Signed-off-by: KONRAD Frederic <frederic.konrad@adacore.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This fixes booke206_tlbnps for MAV 2.0 by checking the MMUCFG register and
returning directly the right tlbnps instead of computing it from a
non-existent field.
Signed-off-by: KONRAD Frederic <frederic.konrad@adacore.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The concept of a VCPU ID that differs from the CPU's index
(cpu->cpu_index) exists only within SPAPR machines so, move the
functions ppc_get_vcpu_id() and ppc_get_cpu_by_vcpu_id() into spapr.c
and rename them appropriately.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This field actually records the VCPU ID used by KVM and, although the
value is also used in the device tree, it is primarily the VCPU ID, so
rename it as such.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
[dwg: Updated comment missed in cpu.h]
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The e500 platform code uses the function ppc_get_vcpu_dt_id() to get
an id to put in its device tree. Which seems like it makes sense, but
ppc_get_vcpu_dt_id() is actually badly named - it only differs from
cpu_index in cases where you're running on KVM HV and the host's
number of threads differs from the guests. Since KVM HV only supports
PAPR, not e500, it doesn't make sense to use it here.
Simply use the cpu_index instead (which is 'i' in this context
because qemu_get_cpu(i) returns the cpu with cpu_index == i).
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
[dwg: Rewrote commit message]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When hot-unplugging a PHB, all its PCI DRC connectors get unrealized. This
patch adds an unrealize method to the physical DRC class, in order to undo
registrations performed in realize_physical().
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This memory region should be owned by the PHB. This ensures the PHB
cannot be finalized as long as the region is guest visible, or
used by a CPU or a device.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Passing a stack allocated buffer of arbitrary length to snprintf()
without checking the return value can cause the resultant strings
to be silently truncated.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Passing a stack allocated buffer of arbitrary length to snprintf()
without checking the return value can cause the resultant strings
to be silently truncated.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Passing a null priority to memory_region_add_subregion_overlap() is
strictly equivalent to calling memory_region_add_subregion().
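In other words, these two calls are equivalent:
    memory_region_add_subregion_overlap(parent, offset, mr, 0);
    memory_region_add_subregion(parent, offset, mr);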
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This patch is a follow up on the discussions made in patch
"hw/ppc: disable hotplug before CAS is completed" that can be
found at [1].
At this moment, we do not support CPU/memory hotplug in early
boot stages, before CAS. When a hotplug occurs, the event is logged
in an internal RTAS event log queue and an IRQ pulse is fired. In
regular conditions, the guest handles the interrupt by executing
check_exception, fetching the generated hotplug event and enabling
the device for use.
In early boot, this IRQ isn't caught (SLOF does not handle hotplug
events), leaving the event in the rtas event log queue. If the guest
executes check_exception due to another hotplug event, the re-assertion
of the IRQ ends up de-queuing the first hotplug event as well. In short,
a device hotplugged before CAS is considered coldplugged by SLOF.
This leads to device misbehavior and, in some cases, guest kernel
Ooops when trying to unplug the device.
A proper fix would be to treat every device hotplugged before CAS
as a coldplugged device. This is not trivial to do with the current
code base though - the FDT is written in the guest memory at
ppc_spapr_reset and can't be retrieved without adding extra state
(fdt_size for example) that will need to be managed and migrated. Adding
the hotplugged DT in the middle of CAS negotiation via the updated DT
tree works with CPU devs, but panics the guest kernel at boot. Additional
analysis would be necessary for LMBs and PCI devices. There are
questions to be made in QEMU/SLOF/kernel level about how we can make
this change in a sustainable way.
With Linux guests, a fix would be the kernel executing check_exception
at boot time, de-queueing the events that happened in early boot and
processing them. However, even if/when the newer kernels start
fetching these events at boot time, we need to take care of older
kernels that won't be doing that.
This patch works around the situation by issuing a CAS reset if a hotplugged
device is detected during CAS:
- the DRC conditions that warrant a CAS reset are the same as those that
trigger a DRC migration - the DRC must have a device attached and
the DRC state is not equal to its ready_state. With that in mind, this
patch makes use of 'spapr_drc_needed' to determine if a CAS reset
is needed.
- In the middle of CAS negotiations, the function
'spapr_hotplugged_dev_before_cas' goes through all the DRCs to see
if there are any DRC that requires a reset, using spapr_drc_needed. If
that happens, returns '1' in 'spapr_h_cas_compose_response' which will set
spapr->cas_reboot to true, causing the machine to reboot.
No changes are made for coldplug devices.
[1] http://lists.nongnu.org/archive/html/qemu-devel/2017-08/msg02855.html
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The sPAPR machine isn't clearing up the pending events QTAILQ on
machine reboot. This allows for unprocessed hotplug/epow events
to persist in the queue after reset and, when reasserting the IRQs in
check_exception later on, these will end up being processed by the OS.
This patch implements a new function called 'spapr_clear_pending_events'
that clears up the pending_events QTAILQ. This helper is then called
inside ppc_spapr_reset to clear up the events queue, preventing
old/deprecated events from persisting after a reset.
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
This patch makes a small fix in 'spapr_drc_needed' to change how we detect
if a DRC has a device attached. Previously it used dr_entity_sense for this,
which works for physical DRCs.
However, for logical DRCs, it didn't cover the case where a logical DRC has
a drc->dev but the state is LOGICAL_UNUSABLE (e.g. a hotplugged CPU before
CAS). In this case, the dr_entity_sense of this DRC returns UNUSABLE and the
code was considering that there was no dev attached, making spapr_drc_needed
return 'false' when in fact we would like to migrate the DRC.
Changing it to check for drc->dev instead works for all DRC types.
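A minimal sketch of the resulting check in spapr_drc_needed() (the
surrounding code and exact field names are assumed):
    /* a DRC with a device attached that is not in its ready state
     * needs to be migrated */
    return drc->dev && (drc->state != drck->ready_state);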
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
At this point the conversion is a wash. Loading of TB+ofs is
smaller, but the actual return address from exit_tb is larger.
There are a few more insns required to transition between TBs.
But the expectation is that accesses to the constant pool will
on the whole be smaller.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Move constants before all of the functions.
Move tcg_out_<format> functions before all
of the others. No functional change.
Signed-off-by: Richard Henderson <rth@twiddle.net>
We are not going to use ldrd for loading the comparator
for 32-bit guests, so don't limit cmp_off to 8 bits then.
This eliminates one insn in the tlb load for some guests.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use UBFX to avoid limitation on CPU_TLB_BITS. Since we're dropping
the initial shift, we need to replace the page masking. We can use
MOVW+BIC to do this without shifting. The result is the same size
as the armv6 path with one less conditional instruction.
Signed-off-by: Richard Henderson <rth@twiddle.net>
We were passing in -2 instead of +2, but then ignoring
the actual contents of addend in the calculation.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Already it saves 2 bytes per call, but also the constant pool
entry may well be shared across multiple calls.
Signed-off-by: Richard Henderson <rth@twiddle.net>
A new shared header tcg-pool.inc.c adds new_pool_label,
for registering a tcg_target_ulong to be emitted after
the generated code, plus relocation data to install a
pointer to the data.
A new pointer is added to the TCGContext, so that we
dump the constant pool as data, not code.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Dispense with TCGBackendData, as it has never been used for more than
holding a single pointer. Use a define in the cpu/tcg-target.h to
signal requirement for TCGLabelQemuLdst, so that we can drop the no-op
tcg-be-null.h stubs. Rename tcg-be-ldst.h to tcg-ldst.inc.c.
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Replace the USE_DIRECT_JUMP ifdef with a TCG_TARGET_HAS_direct_jump
boolean test. Replace the tb_set_jmp_target1 ifdef with an unconditional
function tb_target_set_jmp_target.
While we're touching all backends, add a parameter for tb->tc_ptr;
we're going to need it shortly for some backends.
Move tb_set_jmp_target and tb_add_jump from exec-all.h to cpu-exec.c.
This opens the possibility for TCG_TARGET_HAS_direct_jump to be
a runtime decision -- based on host cpu capabilities, the size of
code_gen_buffer, or a future debugging switch.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Switch the alpha target from the old unassigned_access hook
to the new do_transaction_failed hook. This allows us to
resolve a ??? in the old hook implementation.
The only part of the alpha target that does physical
memory accesses is reading the page table -- add a
TODO comment there to the effect that we should handle
bus faults on page table walks. (Since the palcode
doesn't actually do anything useful on a bus fault anyway
it's a bit moot for now.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <1502196172-13818-1-git-send-email-peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Nobody has mentioned AIX host support on the mailing list for years,
and we have no test systems for it so it is most likely broken.
We've advertised in configure for two releases now that we plan
to drop support for this host OS, and have had no complaints.
Drop the AIX host support code.
We can also drop the now-unused AIX version of sys_cache_info().
Note that the _CALL_AIX define used in the PPC tcg backend is
also used for Linux PPC64, and so that code should not be removed.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1504545540-8002-1-git-send-email-peter.maydell@linaro.org
nbd patches for 2017-09-06
- Daniel P. Berrange: [0/2] Fix / skip recent iotests with LUKS driver
- Eric Blake: [0/3] nbd: Use common read/write-all qio functions
# gpg: Signature made Wed 06 Sep 2017 16:17:55 BST
# gpg: using RSA key 0xA7A16B4A2527436A
# gpg: Good signature from "Eric Blake <eblake@redhat.com>"
# gpg: aka "Eric Blake (Free Software Programmer) <ebb9@byu.net>"
# gpg: aka "[jpeg image of size 6874]"
# Primary key fingerprint: 71C2 CC22 B1C4 6029 27D2 F3AA A7A1 6B4A 2527 436A
* remotes/ericb/tags/pull-nbd-2017-09-06:
nbd: Use new qio_channel_*_all() functions
io: Add new qio_channel_read{, v}_all_eof functions
io: Yield rather than wait when already in coroutine
iotests: blacklist 194 with the luks driver
iotests: rewrite 192 to use _launch_qemu to fix LUKS support
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
target-arm:
* cleanups converting to DEFINE_PROP_LINK
* allwinner-a10: mark as not user-creatable
* initial patches working towards ARMv8M support
* implement generating aborts on memory transaction failures
* make BXJ behave correctly (ie not UNDEF) on ARMv6-and-later
# gpg: Signature made Thu 07 Sep 2017 14:26:07 BST
# gpg: using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20170907: (31 commits)
target/arm: Add Jazelle feature
target/arm: Implement new do_transaction_failed hook
hw/arm: Set ignore_memory_transaction_failures for most ARM boards
boards.h: Define new flag ignore_memory_transaction_failures
target/arm: Implement BXNS, and banked stack pointers
target/arm: Move regime_is_secure() to target/arm/internals.h
target/arm: Make CFSR register banked for v8M
target/arm: Make MMFAR banked for v8M
target/arm: Make CCR register banked for v8M
target/arm: Make MPU_CTRL register banked for v8M
target/arm: Make MPU_RNR register banked for v8M
target/arm: Make MPU_RBAR, MPU_RLAR banked for v8M
target/arm: Make MPU_MAIR0, MPU_MAIR1 registers banked for v8M
target/arm: Make VTOR register banked for v8M
nvic: Add NS alias SCS region
target/arm: Make CONTROL register banked for v8M
target/arm: Make FAULTMASK register banked for v8M
target/arm: Make PRIMASK register banked for v8M
target/arm: Make BASEPRI register banked for v8M
target/arm: Add MMU indexes for secure v8M
...
# Conflicts:
# target/arm/translate.c
migration pull 2017-09-06
# gpg: Signature made Wed 06 Sep 2017 19:39:23 BST
# gpg: using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert/tags/pull-migration-20170906a:
migration: dump str in migrate_set_state trace
snapshot/tests: Try loadvm twice
migration: Reset rather than destroy main_thread_load_event
runstate/migrate: Two more transitions
host-utils: Simplify pow2ceil()
host-utils: Proactively fix pow2floor(), switch to unsigned
xbzrle: Drop unused cache_resize()
migration: Report when bdrv_inactivate_all fails
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
tcg generic translate loop v15
# gpg: Signature made Wed 06 Sep 2017 17:02:31 BST
# gpg: using RSA key 0x64DF38E8AF7E215F
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A 05C0 64DF 38E8 AF7E 215F
* remotes/rth/tags/pull-tgt-20170906: (32 commits)
target/arm: Perform per-insn cross-page check only for Thumb
target/arm: Split out thumb_tr_translate_insn
target/arm: Move ss check to init_disas_context
target/arm: [a64] Move page and ss checks to init_disas_context
target/arm: [tcg] Port to generic translation framework
target/arm: [tcg,a64] Port to disas_log
target/arm: [tcg] Port to disas_log
target/arm: [tcg,a64] Port to tb_stop
target/arm: [tcg] Port to tb_stop
target/arm: [tcg,a64] Port to translate_insn
target/arm: [tcg] Port to translate_insn
target/arm: [tcg,a64] Port to breakpoint_check
target/arm: [tcg,a64] Port to insn_start
target/arm: [tcg] Port to insn_start
target/arm: [tcg] Port to tb_start
target/arm: [tcg,a64] Port to init_disas_context
target/arm: [tcg] Port to init_disas_context
target/arm: [tcg] Port to DisasContextBase
target/i386: [tcg] Port to generic translation framework
target/i386: [tcg] Port to disas_log
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This adds a feature bit indicating support of the (trivial) Jazelle
implementation if ARM_FEATURE_V6 is set or if the processor is arm926
or arm1026. This fixes the issue that any BXJ instruction will
result in an illegal_op. BXJ instructions will now check if the
architecture supports ARM_FEATURE_JAZELLE.
Signed-off-by: Portia Stephens <portia.stephens@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 20170905211232.11092-1-portia.stephens@xilinx.com
[PMM: edited commit message and comment text a bit]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Set the MachineClass flag ignore_memory_transaction_failures
for almost all ARM boards. This means they retain the legacy
behaviour that accesses to unimplemented addresses will RAZ/WI
rather than aborting, when a subsequent commit adds support
for external aborts.
The exceptions are:
* virt -- we know that guests won't try to prod devices
that we don't describe in the device tree or ACPI tables
* mps2 -- this board was written to use unimplemented-device
for all the ranges with devices we don't yet handle
New boards should not set the flag, but instead be written
like the mps2.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1504626814-23124-3-git-send-email-peter.maydell@linaro.org
For the Xilinx boards:
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Define a new MachineClass field ignore_memory_transaction_failures.
If this flag is true, the CPU will ignore memory transaction
failures which should cause the CPU to take an exception due to an
access to an unassigned physical address; the transaction will
instead return zero (for a read) or be ignored (for a write). This
should be set only by legacy board models which rely on the old
RAZ/WI behaviour for handling devices that QEMU does not yet model.
New board models should instead use "unimplemented-device" for all
memory ranges where the guest will attempt to probe for a device that
QEMU doesn't implement and a stub device is required.
We need this for ARM boards, where we're about to implement support for
generating external aborts on memory transaction failures. Too many
of our legacy board models rely on the RAZ/WI behaviour and we
would break currently working guests when their "probe for device"
code provoked an external abort rather than a RAZ.
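As a rough illustration, a legacy board model would opt in from its machine
class init; a minimal sketch, assuming a hypothetical board whose init
function (legacy_board_init) already exists:
    #include "hw/boards.h"

    /* Sketch only: a legacy ARM board keeping the old RAZ/WI semantics. */
    static void legacy_board_class_init(ObjectClass *oc, void *data)
    {
        MachineClass *mc = MACHINE_CLASS(oc);

        mc->desc = "Example legacy ARM board (hypothetical)";
        mc->init = legacy_board_init;
        /* Retain RAZ/WI behaviour instead of taking external aborts. */
        mc->ignore_memory_transaction_failures = true;
    }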
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Message-id: 1504626814-23124-2-git-send-email-peter.maydell@linaro.org
Implement the BXNS v8M instruction, which is like BX but will do a
jump-and-switch-to-NonSecure if the branch target address has bit 0
clear.
This is the first piece of code which implements "switch to the
other security state", so the commit also includes the code to
switch the stack pointers around, which is the only complicated
part of switching security state.
BLXNS is more complicated than just "BXNS but set the link register",
so we leave it for a separate commit.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1503414539-28762-21-git-send-email-peter.maydell@linaro.org
Make the CCR register banked if v8M security extensions are enabled.
This is slightly more complicated than the other "add banking"
patches because there is one bit in the register which is not
banked. We keep the live data in the NS copy of the register,
and adjust it on register reads and writes. (Since we don't
currently implement the behaviour that the bit controls, there
is nowhere else that needs to care.)
This patch includes the enforcement of the bits which are newly
RES1 in ARMv8M.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1503414539-28762-17-git-send-email-peter.maydell@linaro.org
Make the MPU registers MPU_MAIR0 and MPU_MAIR1 banked if v8M security
extensions are enabled.
We can freely add more items to vmstate_m_security without
breaking migration compatibility, because no CPU currently
has the ARM_FEATURE_M_SECURITY bit enabled and so this
subsection is not yet used by anything.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1503414539-28762-14-git-send-email-peter.maydell@linaro.org
For v8M the range 0xe002e000..0xe002efff is an alias region which
for secure accesses behaves like a NonSecure access to the main
SCS region. (For nonsecure accesses including when the security
extension is not implemented, it is RAZ/WI.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1503414539-28762-11-git-send-email-peter.maydell@linaro.org
Make the FAULTMASK register banked if v8M security extensions are enabled.
Note that we do not yet implement the functionality of the new
AIRCR.PRIS bit (which allows the effect of the NS copy of FAULTMASK to
be restricted).
This patch includes the code to determine for v8M which copy
of FAULTMASK should be updated on exception exit; further
changes will be required to the exception exit code in general
to support v8M, so this is just a small piece of that.
The v8M ARM ARM introduces a notation where individual paragraphs
are labelled with R (for rule) or I (for information) followed
by a random group of subscript letters. In comments where we want
to refer to a particular part of the manual we use this convention,
which should be more stable across document revisions than using
section or page numbers.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1503414539-28762-9-git-send-email-peter.maydell@linaro.org
As the first step in implementing ARM v8M's security extension:
* add a new feature bit ARM_FEATURE_M_SECURITY
* add the CPU state field that indicates whether the CPU is
currently in the secure state
* add a migration subsection for this new state
(we will add the Secure copies of banked register state
to this subsection in later patches)
* add a #define for the one new-in-v8M exception type
* make the CPU debug log print S/NS status
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1503414539-28762-4-git-send-email-peter.maydell@linaro.org
As part of ARMv8M, we need to add support for the PMSAv8 MPU
architecture.
PMSAv8 differs from PMSAv7 both in register/data layout (for instance
using base and limit registers rather than base and size) and also in
behaviour (for example it does not have subregions); rather than
trying to wedge it into the existing PMSAv7 code and data structures,
we define separate ones.
This commit adds the data structures which hold the state for a
PMSAv8 MPU and the register interface to it. The implementation of
the MPU behaviour will be added in a subsequent commit.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1503414539-28762-2-git-send-email-peter.maydell@linaro.org
QEMU currently exits unexpectedly when the user accidentally
tries to do something like this:
$ aarch64-softmmu/qemu-system-aarch64 -S -M integratorcp -nographic
QEMU 2.9.93 monitor - type 'help' for more information
(qemu) device_add allwinner-a10
Unsupported NIC model: smc91c111
Exiting just due to a "device_add" should not happen. Looking closer
at the realize and instance_init functions of this device also
reveals that it is using serial_hds and nd_table directly there, so
this device is clearly not creatable by the user and should be marked
accordingly.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-id: 1503416789-32080-1-git-send-email-thuth@redhat.com
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Rather than open-coding our own read/write-all functions, we
can make use of the recently-added qio code. It slightly
changes the error message in one of the iotests.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170905191114.5959-4-eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Some callers want to distinguish between clean EOF (no bytes read)
vs. a short read (at least one byte read, but EOF encountered
before reaching the desired length), as it allows clients the
ability to do a graceful shutdown when a server shuts down at
defined safe points in the protocol, rather than treating all
shutdown scenarios as an error due to EOF. However, we don't want
to require all callers to have to check for early EOF. So add
another wrapper function that can be used by the callers that care
about the distinction.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170905191114.5959-3-eblake@redhat.com>
Acked-by: Daniel P. Berrange <berrange@redhat.com>
The new qio_channel_{read,write}{,v}_all functions are documented
as yielding until data is available. When used on a blocking
channel, this yield is done via qio_channel_wait() which spawns
a nested event loop under the hood (so it is that secondary loop
which yields as needed); but if we are already in a coroutine (at
which point QIO_CHANNEL_ERR_BLOCK is only possible if we are a
non-blocking channel), we want to yield the current coroutine
instead of spawning a nested event loop.
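A minimal sketch of that dispatch, assuming the QIOChannel helpers named in
this series (the retry loop and partial-read accounting of the real helpers
are omitted):
    ssize_t ret = qio_channel_readv(ioc, iov, niov, errp);
    if (ret == QIO_CHANNEL_ERR_BLOCK) {
        if (qemu_in_coroutine()) {
            /* Non-blocking channel inside a coroutine: yield it ... */
            qio_channel_yield(ioc, G_IO_IN);
        } else {
            /* ... otherwise wait via a nested event loop. */
            qio_channel_wait(ioc, G_IO_IN);
        }
        /* then retry the read */
    }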
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170905191114.5959-2-eblake@redhat.com>
Acked-by: Daniel P. Berrange <berrange@redhat.com>
[commit message updated]
Signed-off-by: Eric Blake <eblake@redhat.com>
ARM is a fixed-length ISA and we can compute the page crossing
condition exactly once during init_disas_context.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
We need not check for ARM vs Thumb state in order to dispatch
disassembly of every instruction.
Tested-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Since AArch64 uses a fixed-width ISA, we can pre-compute the number of
insns remaining on the page. Also, we can check for single-step once.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Lluís Vilanova <vilanova@ac.upc.edu>
Message-Id: <150002073981.22386.9870422422367410100.stgit@frigg.lan>
[rth: Moved max_insns adjustment from tb_start to init_disas_context.
Removed pc_next return from translate_insn.
Removed tcg_check_temp_count from generic loop.
Moved gen_io_end to exactly match gen_io_start.
Use qemu_log instead of error_report for temporary leaks.
Moved TB size/icount assignments before disas_log.]
Signed-off-by: Richard Henderson <rth@twiddle.net>
There's nothing magic about the exception that we generate in order
to execute the magic kernel page. We can and should allow gdb to
set a breakpoint at this location.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Fold DISAS_EXC and DISAS_TB_JUMP into DISAS_NORETURN.
In both cases all following code is dead. In the first
case because we have exited the TB via exception; in the
second case because we have exited the TB via goto_tb
and its associated machinery.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This target is not sophisticated in its use of cleanups at the
end of the translation loop. For the most part, any condition
that exits the TB is dealt with by emitting the exiting opcode
right then and there. Therefore the only is_jmp indicator that
is needed is DISAS_NORETURN.
For two stack segment modifying cases, we have not yet exited
the TB (therefore DISAS_NORETURN feels wrong), but intend to exit.
The caller of gen_movl_seg_T0 currently checks for any non-zero
value, therefore DISAS_TOO_MANY seems acceptable for that usage.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This will allow some amount of cleanup to happen before
switching the backends over to enum DisasJumpType.
Reviewed-by: Emilio G. Cota <cota@braap.org>
Reviewed-by: Lluís Vilanova <vilanova@ac.upc.edu>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This allows LOAD HALFWORD IMMEDIATE ON CONDITION,
eliminating one insn in some common cases.
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This allows using a 3-operand insn form for some arithmetic,
logicals and shifts.
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Richard Henderson <rth@twiddle.net>
Use a switch instead of searching a table.
Acked-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
migration_incoming_state_destroy doesn't really destroy, it cleans up.
After a loadvm it's called, but the loadvm command can be run twice,
and so destroying an init-once mutex breaks on the second loadvm.
Reported-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170825141940.20740-2-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Stafford Horne <shorne@gmail.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
There's a race if someone does a 'stop' near the end of migrate;
the migration process goes through two runstates:
'finish migrate'
'postmigrate'
If the user issues a 'stop' between the two we end up with invalid
state transitions.
Add the transitions as valid.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170804175011.21944-1-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The function's stated contract is simple enough: "round down to the
nearest power of 2". Suggests the domain is the representable numbers
>= 1, because that's the smallest power of two.
The implementation doesn't check for domain errors, but returns
garbage instead:
* For negative arguments, pow2floor() returns -2^63, which is not even
a power of two, let alone the nearest one.
What sort of works is passing *unsigned* arguments >= 2^63. The
implicit conversion to signed is implementation defined, but
commonly yields the (negative) two's complement. pow2floor() then
returns -2^63. Callers that convert that back to unsigned get the
correct value 2^63.
* For a zero argument, pow2floor() shifts right by 64. Undefined
behavior. Common actual behavior is to shift by 0, yielding -2^63.
Fix by switching from int64_t to uint64_t and amending the contract to
map zero to zero.
Callers are fine with that:
* memory_access_size()
This function makes no sense unless the argument is positive and the
return value fits into int.
* raw_refresh_limits()
Passes an int between 1 and BDRV_REQUEST_MAX_BYTES.
* iscsi_refresh_limits()
Passes an integer between 0 and INT_MAX, converts the result to
uint32_t. Passing zero would be undefined behavior, but commonly
yield zero. The patch gives us the zero without the undefined
behavior.
* cache_init()
Passes a positive int64_t argument.
* xbzrle_cache_resize()
Passes a positive int64_t argument (>= TARGET_PAGE_SIZE, actually).
* spapr_node0_size()
Passes a positive uint64_t argument, and converts the result to
hwaddr, i.e. uint64_t.
* spapr_populate_memory()
Passes a positive hwaddr argument, and converts the result to
hwaddr.
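A minimal sketch of the amended contract (not the actual host-utils
implementation, which may well use bit-scan helpers instead):
    #include <stdint.h>

    /* Round down to the nearest power of 2; 0 maps to 0. */
    static inline uint64_t pow2floor_sketch(uint64_t value)
    {
        /* Clear low set bits until only the highest one remains. */
        while (value & (value - 1)) {
            value &= value - 1;
        }
        return value;
    }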
Cc: Juan Quintela <quintela@redhat.com>
Cc: Dr. David Alan Gilbert <dgilbert@redhat.com>
Cc: Eric Blake <eblake@redhat.com>
Cc: Peter Maydell <peter.maydell@linaro.org>
Cc: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1501148776-16890-3-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
If the bdrv_inactivate_all fails near the end of the migration,
the migration will fail and often the only diagnostics in the log
are an I/O error which you can't distinguish from an error on
the socket connection.
Add an error so we know when it's actually a block problem.
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-Id: <20170822170212.27347-1-dgilbert@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
After calling qcow2_inactivate(), all qcow2 caches must be flushed, but this
may not happen, because the last call qcow2_store_persistent_dirty_bitmaps()
can lead to marking the L2/refcount cache as dirty.
Let's move qcow2_store_persistent_dirty_bitmaps() before the cache flushing
to fix it.
Cc: qemu-stable@nongnu.org
Signed-off-by: Pavel Butsykin <pbutsykin@virtuozzo.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
block/throttle.c uses existing I/O throttle infrastructure inside a
block filter driver. I/O operations are intercepted in the filter's
read/write coroutines, and referred to block/throttle-groups.c
The driver can be used with the syntax
-drive driver=throttle,file.filename=foo.qcow2,throttle-group=bar
which registers the throttle filter node with the ThrottleGroup 'bar'. The
given group must be created beforehand with object-add or -object.
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Currently, we cannot use mttcg for running strong memory model guests
on weak memory model hosts due to missing ordering semantics.
We implicitly generate fence instructions for stronger guests if an
ordering mismatch is detected. We generate fences only for the orders
for which fence instructions are necessary, for example a fence is not
necessary between a store and a subsequent load on x86, since its
absence in the guest binary tells us that the ordering need not be
ensured. Also note that if we find multiple subsequent fence
instructions in the generated IR, we combine them in the TCG
optimization pass.
This patch allows us to boot an x86 guest on ARM64 hosts using mttcg.
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-Id: <20170829063313.10237-4-bobby.prani@gmail.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Just make sure that nr_tables is size_t not int.
Once there, do the assert in the right place and be sure that we don't
have a division by zero.
Suggested-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Juan Quintela <quintela@redhat.com>
Tested-by: Cleber Rosa <crosa@redhat.com>
--
Drop the s/g_new0/g_malloc0/ change.
Avoid division by zero with assert (danp)
We were using -1 instead of the real size because the functions check
which is bigger, the size in bytes or the size of the iov. Recent gcc
versions barf at this.
Signed-off-by: Juan Quintela <quintela@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Tested-by: Cleber Rosa <crosa@redhat.com>
--
Remove comments about this feature.
Fix missing -1.
We threatened to remove ia64 as host in v2.9.0. Its time has now come.
There are still some usages of defined(__ia64__) throughout the source
code that would be triggered if one were to enable TCI on an ia64 host.
Leave those alone for now.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
tests/vhost-user-test keeps failing on build-system since Aug 15:
ERROR:tests/vhost-user-test.c:835:test_flags_mismatch: child process (/i386/vhost-user/flags-mismatch/subprocess [4836]) failed unexpectedly
...
ERROR:tests/vhost-user-test.c:807:test_connect_fail: child process (/x86_64/vhost-user/connect-fail/subprocess [58910]) failed unexpectedly
Suggested-by: Peter Maydell <peter.maydell@linaro.org>
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170905180602.28698-1-f4bug@amsat.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This reverts commit 206a0fc75d.
The linux-headers directory is for kernel headers which we keep in
sync with the upstream kernel via scripts/update-linux-headers.sh, so
we shouldn't be applying our code cleanups to it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ThrottleGroup is converted to an object. This will allow the future
throttle block filter driver easy creation and configuration of throttle
groups in QMP and the CLI.
A new QAPI struct, ThrottleLimits, is introduced to provide a shared
struct for all throttle configuration needs in QMP.
ThrottleGroups can be created via CLI as
-object throttle-group,id=foo,x-iops-total=100,x-..
where x-* are individual limit properties. Since we can't add non-scalar
properties in -object this interface must be used instead. However,
setting these properties must be disabled after initialization because
certain combinations of limits are forbidden and thus configuration
changes should be done in one transaction. The individual properties
will go away when support for non-scalar values in CLI is implemented
and thus are marked as experimental.
ThrottleGroup also has a `limits` property that uses the ThrottleLimits
struct. It can be used to create ThrottleGroups or set the
configuration in existing groups as follows:
{ "execute": "object-add",
"arguments": {
"qom-type": "throttle-group",
"id": "foo",
"props" : {
"limits": {
"iops-total": 100
}
}
}
}
{ "execute" : "qom-set",
"arguments" : {
"path" : "foo",
"property" : "limits",
"value" : {
"iops-total" : 99
}
}
}
This also means a group's configuration can be fetched with qom-get.
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Alberto Garcia <berto@igalia.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Some trivial fixes/cleanup and a fix to cause QEMU to error out gracefully
instead of aborting.
# gpg: Signature made Tue 05 Sep 2017 16:57:19 BST
# gpg: using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg: aka "Greg Kurz <groug@free.fr>"
# gpg: aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg: aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg: aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894 DBA2 02FC 3AEB 0101 DBC2
* remotes/gkurz/tags/for-upstream:
virtfs: error out gracefully when mandatory suboptions are missing
9pfs: local: clarify fchmodat_nofollow() implementation
fsdev: fix memory leak in main()
9pfs: avoid sign conversion error simplifying the code
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
We internally convert -virtfs to -fsdev/-device. If the user doesn't
provide the path or security_model suboptions, and the fsdev backend
requires them, we hit an assertion when populating the internal -fsdev
option:
util/qemu-option.c:547: opt_set: Assertion `opt->str' failed.
Aborted (core dumped)
Let's test the suboption presence on the command line before trying
to set it in the internal -fsdev option, and let the backend code
error out gracefully (ie, like it already does when the user passes
-fsdev on the command line).
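A minimal sketch of that presence check, using the generic QemuOpts helpers
(variable names are illustrative, not the exact vl.c code):
    const char *path = qemu_opt_get(opts, "path");
    if (path) {
        qemu_opt_set(fsdev, "path", path, &error_abort);
    }
    /* If absent, leave it unset and let the fsdev backend error out. */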
Reported-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Since fchmodat(2) on Linux doesn't support AT_SYMLINK_NOFOLLOW, we have to
implement it using workarounds. There are two different ways, depending on
whether the system supports O_PATH or not.
In the case O_PATH is supported, we rely on the behaviour of openat(2)
when passing O_NOFOLLOW | O_PATH and the file is a symbolic link. Even
if openat_file() already adds O_NOFOLLOW to the flags, this patch makes
it explicit that we need both creation flags to obtain the expected
behavior.
This is only cleanup, no functional change.
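For reference, a sketch of the classic O_PATH workaround this refers to
(the general technique, not necessarily the exact 9pfs code):
    #define _GNU_SOURCE         /* for O_PATH */
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    static int fchmodat_nofollow_sketch(int dirfd, const char *name,
                                        mode_t mode)
    {
        char procname[64];
        int fd, ret;

        fd = openat(dirfd, name, O_PATH | O_NOFOLLOW | O_CLOEXEC);
        if (fd < 0) {
            return -1;
        }
        /* An O_PATH descriptor cannot be fchmod()ed directly, but
         * chmod()ing its /proc/self/fd entry affects the referenced file. */
        snprintf(procname, sizeof(procname), "/proc/self/fd/%d", fd);
        ret = chmod(procname, mode);
        close(fd);
        return ret;
    }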
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Move the CoMutex and CoQueue inits inside throttle_group_register_tgm()
which is called whenever a ThrottleGroupMember is initialized. There's
no need for them to be separate.
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
timer_cb() needs to know about the current Aio context of the throttle
request that is woken up. In order to make ThrottleGroupMember backend
agnostic, this information is stored in an aio_context field instead of
accessing it from BlockBackend.
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This commit eliminates the 1:1 relationship between BlockBackend and
throttle group state. Users will be able to create multiple throttle
nodes, each with its own throttle group state, in the future. The
throttle group state cannot be per-BlockBackend anymore, it must be
per-throttle node. This is done by gathering ThrottleGroup membership
details from BlockBackendPublic into ThrottleGroupMember and refactoring
existing code to use the structure.
Reviewed-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The TLS I/O channel test had mistakenly used && instead
of || when checking for handshake completion. As a
result it could terminate the handshake process before
it had actually completed. This was harmless before but
changes in GNUTLS 3.6.0 exposed this bug and caused the
test suite to fail.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
These functions wait until they are able to read / write the full
requested data buffer(s).
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
The non-blocking connect mechanism is obsolete, and it doesn't
work well for inet connections, because it calls getaddrinfo
first and getaddrinfo blocks on DNS lookups. Since commits
e65c67e4 & d984464e, the non-blocking connect of migration goes
through QIOChannel in a different manner (using a thread), and
nobody uses the old non-blocking connect anymore.
Any newly written code which needs a non-blocking connect should
use the QIOChannel code, so we can drop NonBlockingConnectHandler
as a concept entirely.
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Cao jin <caoj.fnst@cn.fujitsu.com>
Signed-off-by: Mao Zhongyi <maozy.fnst@cn.fujitsu.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
@rpath and @sock_name are not freed and are thus leaked.
[groug, not really leaked since the program exits just after that. But it
is always good practice to free allocated memory]
Signed-off-by: Zhipeng Lu <lu.zhipeng@zte.com.cn>
Signed-off-by: Greg Kurz <groug@kaod.org>
(note this is how other functions also handle the errors).
hw/9pfs/9p.c:948:18: warning: Loss of sign in implicit conversion
offset = err;
^~~
Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Greg Kurz <groug@kaod.org>
Switch from atexit.register() to a more elegant idiom of declaring
resources in a with statement:
with FilePath('monitor.sock') as monitor_path,
VM() as vm:
...
The files and VMs will be automatically cleaned up whether the test
passes or fails.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170824072202.26818-4-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The scratch/ (TEST_DIR) directory is not automatically cleaned up after
test execution. It is the responsibility of tests to remove any files
they create.
A nice way of doing this is to declare files at the beginning of the
test and automatically remove them with a context manager:
with iotests.FilePath('test.img') as img_path:
qemu_img(...)
qemu_io(...)
# img_path is guaranteed to be deleted here
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170824072202.26818-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
There are a number of ways to ensure that the QEMU process is shut down
when the test ends, including atexit.register(), try: finally:, or
unittest.teardown() methods. All of these require extra code and the
programmer must remember to add vm.shutdown().
A nice solution is context managers:
with VM(binary) as vm:
...
# vm is guaranteed to be shut down here
Cc: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-id: 20170824072202.26818-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
As future sun4u PCI topologies place the ebus containing the in-built devices
behind a PCI bridge, add a busA property to the PBM PCI bridge that is then
used to allow IO accesses by default.
This allows early fw_cfg/NVRAM/serial access to occur even before OpenBIOS
has had a chance to configure the PCI bridges.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Rather than referring to the PCI busses as bus2 and bus3, refer to them as
busA and busB as per the documentation. Also replace the long bus names with
the shorter pciA and pciB aliases (to make it easier to attach additional
devices to either from the command line).
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
In order to wire up the ebus PCI address spaces differently, we need
access to the underlying PCIDevice.
Signed-off-by: Mark Cave-Ayland <mark.cave-ayland@ilande.co.uk>
Omitting the check for whether bdrv_getlength() and bdrv_truncate()
failed meant that it was theoretically possible to return an
incorrect offset to the caller. More likely, conditions for either
of these functions to fail would also cause one of our other calls
(such as bdrv_pread() or bdrv_pwrite_sync()) to also fail, but
auditing that we are safe is difficult compared to just patching
things to always forward on the error rather than ignoring it.
Use osdep.h macros instead of open-coded rounding while in the
area.
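The pattern being applied, roughly (a sketch, not the exact driver code):
    int64_t len = bdrv_getlength(bs->file->bs);
    if (len < 0) {
        return len;                             /* forward the negative errno */
    }
    /* osdep.h macro instead of open-coded rounding */
    offset = ROUND_UP(len, BDRV_SECTOR_SIZE);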
Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The old signature has an ambiguous meaning for a return of 0:
either no allocation was requested or necessary, or an error
occurred (but any errno associated with the error is lost to
the caller, which then has to assume EIO).
Better is to follow the example of qcow2, by changing the
signature to have a separate return value that cleanly
distinguishes between failure and success, along with a
parameter that cleanly holds a 64-bit value. Then update all
callers.
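Schematically, the change looks like this (function and parameter names are
illustrative only):
    /* Before: 0 is ambiguous (no allocation needed vs. failure, errno lost). */
    int64_t alloc_clusters_old(BlockDriverState *bs, uint64_t size);

    /* After: negative errno on failure, 0 on success, offset via parameter. */
    int alloc_clusters_new(BlockDriverState *bs, uint64_t size,
                           uint64_t *host_offset);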
While auditing that all return paths return a negative errno
(rather than -1), I also simplified places where we can pass
NULL rather than a local Error that just gets thrown away.
Suggested-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
bdrv_co_get_block_status_from_file() and
bdrv_co_get_block_status_from_backing() set *file to bs->file and
bs->backing respectively, so that bdrv_co_get_block_status() can recurse
to them. Future block drivers won't have to duplicate code to implement
this.
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Now that bdrv_truncate is passed to bs->file by default, remove the
callback from block/blkdebug.c and set is_filter to true. is_filter also gives
access to other callbacks that are forwarded automatically to bs->file for
filters.
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This function is not used anywhere, so remove it.
Markus Armbruster adds:
The i82078 floppy device model used to call bdrv_media_changed() to
implement its media change bit when backed by a host floppy. This
went away in 21fcf36 "fdc: simplify media change handling".
Probably broke host floppy media change. Host floppy pass-through
was dropped in commit f709623. bdrv_media_changed() has never been
used for anything else. Remove it.
(Source is Message-ID: <87y3ruaypm.fsf@dusky.pond.sub.org>)
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The following functions fail if bs->drv is a filter and does not
implement them:
bdrv_probe_blocksizes
bdrv_probe_geometry
bdrv_truncate
bdrv_has_zero_init
bdrv_get_info
Instead, the call should be passed to bs->file if it exists, to allow
filter drivers to support those methods without implementing them. This
commit makes `drv->is_filter = true` imply that these callbacks will be
forwarded to bs->file by default, so disabling support for these
functions must be done explicitly.
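Conceptually the forwarding looks like this (a simplified sketch; the real
block.c signatures differ, and forward_to_child() stands in for the
recursion through bs->file):
    if (!drv->bdrv_truncate) {
        if (drv->is_filter && bs->file) {
            /* Filters forward to their child by default. */
            return forward_to_child(bs->file->bs, offset, errp);
        }
        return -ENOTSUP;    /* non-filter without the callback */
    }
    return drv->bdrv_truncate(bs, offset, errp);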
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
target-arm:
* collection of M profile cleanups and minor bugfixes
* loader: handle ELF files with overlapping zero-init data
* virt: allow PMU instantiation with userspace irqchip
* wdt_aspeed: Add support for the reset width register
* cpu: Define new cpu_transaction_failed() hook
* Mark some SoC devices as not user-creatable
* arm: Fix aa64 ldp register writeback
* arm_gicv3_kvm: Fix compile warning
# gpg: Signature made Mon 04 Sep 2017 17:20:40 BST
# gpg: using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20170904-2: (33 commits)
arm_gicv3_kvm: Fix compile warning
target/arm: Fix aa64 ldp register writeback
hw/arm/digic: Mark device with user_creatable = false
hw/arm/aspeed_soc: Mark devices as user_creatable = false
target/arm: Allow deliver_fault() caller to specify EA bit
target/arm: Factor out fault delivery code
cputlb: Support generating CPU exceptions on memory transaction failures
cpu: Define new cpu_transaction_failed() hook
memory.h: Move MemTxResult type to memattrs.h
aspeed_soc: Propagate silicon-rev to watchdog
watchdog: wdt_aspeed: Add support for the reset width register
target/arm/kvm: pmu: improve error handling
hw/arm/virt: allow pmu instantiation with userspace irqchip
target/arm/kvm: pmu: split init and set-irq stages
hw/arm/virt: add pmu interrupt state
hw/arm: use defined type name instead of hard-coded string
loader: Ignore zero-sized ELF segments
loader: Handle ELF files with overlapping zero-initialized data
nvic: Implement "user accesses BusFault" SCS region behaviour
armv7m_nvic.h: Move from include/hw/arm to include/hw/intc
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Fix the following warning:
/home/pranith/qemu/hw/intc/arm_gicv3_kvm.c:296:17: warning: logical not is only applied to the left hand side of this bitwise operator [-Wlogical-not-parentheses]
if (!c->gicr_ctlr & GICR_CTLR_ENABLE_LPIS) {
^ ~
/home/pranith/qemu/hw/intc/arm_gicv3_kvm.c:296:17: note: add parentheses after the '!' to evaluate the bitwise operator first
if (!c->gicr_ctlr & GICR_CTLR_ENABLE_LPIS) {
^
/home/pranith/qemu/hw/intc/arm_gicv3_kvm.c:296:17: note: add parentheses around left hand side expression to silence this warning
if (!c->gicr_ctlr & GICR_CTLR_ENABLE_LPIS) {
^
This logic error meant we were not setting the PTZ
bit when we should -- luckily as the comment suggests
this wouldn't have had any effects beyond making GIC
initialization take a little longer.
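The intended condition, presumably, is the parenthesized form the compiler
suggests:
    if (!(c->gicr_ctlr & GICR_CTLR_ENABLE_LPIS)) {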
Signed-off-by: Pranith Kumar <bobby.prani@gmail.com>
Message-id: 20170829173226.7625-1-bobby.prani@gmail.com
Cc: qemu-stable@nongnu.org
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
QEMU currently shows some unexpected behavior when the user tries to
do a "device_add digic" on an unrelated ARM machine like integratorcp
in "-nographic" mode (the device_add command does not immediately
return to the monitor prompt), and trying to "device_del" the device
later results in a "qemu/qdev-monitor.c:872:qdev_unplug: assertion
failed: (hotplug_ctrl)" error condition.
Looking at the realize function of the device, it uses serial_hds
directly and this means that the device can not be added a second
time, so let's simply mark it with "user_creatable = false" now.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
QEMU currently aborts if the user is accidentally trying to
do something like this:
$ aarch64-softmmu/qemu-system-aarch64 -S -M integratorcp -nographic
QEMU 2.9.93 monitor - type 'help' for more information
(qemu) device_add ast2400
Unexpected error in error_set_from_qdev_prop_error()
at hw/core/qdev-properties.c:1032:
Aborted (core dumped)
The ast2400 SoC devices are clearly not creatable by the user since
they are using the serial_hds and nd_table arrays directly in their
realize function, so mark them with user_creatable = false.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For external aborts, we will want to be able to specify the EA
(external abort type) bit in the syndrome field. Allow callers of
deliver_fault() to do that by adding a field to ARMMMUFaultInfo which
we use when constructing the syndrome values.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
We currently have some similar code in tlb_fill() and in
arm_cpu_do_unaligned_access() for delivering a data abort or prefetch
abort. We're also going to want to do the same thing to handle
external aborts. Factor out the common code into a new function
deliver_fault().
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Acked-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Call the new cpu_transaction_failed() hook at the places where
CPU generated code interacts with the memory system:
io_readx()
io_writex()
get_page_addr_code()
Any access from C code (eg via cpu_physical_memory_rw(),
address_space_rw(), ld/st_*_phys()) will *not* trigger CPU exceptions
via cpu_transaction_failed(). Handling for transactions failures for
this kind of call should be done by using a function which returns a
MemTxResult and treating the failure case appropriately in the
calling code.
In an ideal world we would not generate CPU exceptions for
instruction fetch failures in get_page_addr_code() but instead wait
until the code translation process tried a load and it failed;
however that change would require too great a restructuring and
redesign to attempt at this point.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Currently we have a rather half-baked setup for allowing CPUs to
generate exceptions on accesses to invalid memory: the CPU has a
cpu_unassigned_access() hook which the memory system calls in
unassigned_mem_write() and unassigned_mem_read() if the current_cpu
pointer is non-NULL. This was originally designed before we
implemented the MemTxResult type that allows memory operations to
report a success or failure code, which is why the hook is called
right at the bottom of the memory system. The major problem with
this is that it means that the hook can be called even when the
access was not actually done by the CPU: for instance if the CPU
writes to a DMA engine register which causes the DMA engine to begin
a transaction which has been set up by the guest to operate on
invalid memory then this will cause the CPU to take an exception
incorrectly. Another minor problem is that currently if a device
returns a transaction error then this won't turn into a CPU exception
at all.
The right way to do this is to allow the CPU to respond
to memory system transaction failures at the point where the
CPU specific code calls into the memory system.
Define a new QOM CPU method and utility function
cpu_transaction_failed() which is called in these cases.
The functionality here overlaps with the existing
cpu_unassigned_access() because individual target CPUs will
need some work to convert them to the new system. When this
transition is complete we can remove the old cpu_unassigned_access()
code.
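Roughly, a target's hook would take a shape along these lines (the parameter
list is simplified and illustrative, and raise_bus_fault() is a hypothetical
helper):
    static void mycpu_transaction_failed(CPUState *cs, hwaddr physaddr,
                                         vaddr addr, unsigned size,
                                         MMUAccessType access_type,
                                         MemTxResult response,
                                         uintptr_t retaddr)
    {
        /* Turn the failed bus transaction into a guest exception. */
        cpu_restore_state(cs, retaddr);
        raise_bus_fault(cs, addr, access_type);
    }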
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Move the MemTxResult type to memattrs.h. We're going to want to
use it in cpu/qom.h, which doesn't want to include all of
memory.h. In practice MemTxResult and MemTxAttrs are pretty
closely linked since both are used for the new-style
read_with_attrs and write_with_attrs callbacks, so memattrs.h
is a reasonable home for this rather than creating a whole
new header file for it.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
This is required to configure differences in behaviour between the
AST2400 and AST2500 watchdog IPs.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The reset width register controls how the pulse on the SoC's WDTRST{1,2}
pins behaves. A pulse is emitted if the external reset bit is set in
WDT_CTRL. On the AST2500 WDT_RESET_WIDTH can consume magic bit patterns
to configure push-pull/open-drain and active-high/active-low
behaviours and thus needs some special handling in the write path.
As some of the capabilities depend on the SoC version a silicon-rev
property is introduced, which is used to guard version-specific
behaviour.
Signed-off-by: Andrew Jeffery <andrew@aj.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
If a KVM PMU init or set-irq attr call fails we just silently stop
the PMU DT node generation. The only way they could fail, though,
is if the attr's respective KVM has-attr call fails. But that should
never happen if KVM advertises the PMU capability, because both
attrs have been available since the capability was introduced. Let's
just abort if this should-never-happen stuff does happen, because,
if it does, then something is obviously horribly wrong.
Signed-off-by: Andrew Jones <drjones@redhat.com>
Reviewed-by: Christoffer Dall <cdall@linaro.org>
Message-id: 1500471597-2517-5-git-send-email-drjones@redhat.com
[PMM: change kvm32.c kvm_arm_pmu_init() to the new API too]
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For embedded systems, notably ARM, one common use of ELF
file segments is that the 'physical addresses' represent load addresses
and the 'virtual addresses' execution addresses, such that
the load addresses are packed into ROM or flash, and the
relocation and zero-initialization of data is done at runtime.
This means that the 'memsz' in the segment header represents
the runtime size of the segment, but the size that needs to
be loaded is only the 'filesz'. In particular, paddr+memsz
may overlap with the next segment to be loaded, as in this
example:
0x70000001 off 0x00007f68 vaddr 0x00008150 paddr 0x00008150 align 2**2
filesz 0x00000008 memsz 0x00000008 flags r--
LOAD off 0x000000f4 vaddr 0x00000000 paddr 0x00000000 align 2**2
filesz 0x00000124 memsz 0x00000124 flags r--
LOAD off 0x00000218 vaddr 0x00000400 paddr 0x00000400 align 2**3
filesz 0x00007d58 memsz 0x00007d58 flags r-x
LOAD off 0x00007f70 vaddr 0x20000140 paddr 0x00008158 align 2**3
filesz 0x00000a80 memsz 0x000022f8 flags rw-
LOAD off 0x000089f0 vaddr 0x20002438 paddr 0x00008bd8 align 2**0
filesz 0x00000000 memsz 0x00004000 flags rw-
LOAD off 0x000089f0 vaddr 0x20000000 paddr 0x20000000 align 2**0
filesz 0x00000000 memsz 0x00000140 flags rw-
where the segment at paddr 0x8158 has a memsz of 0x2258 and
would overlap with the segment at paddr 0x8bd8 if QEMU's loader
tried to honour it. (At runtime the segments will not overlap
since their vaddrs are more widely spaced than their paddrs.)
Currently if you try to load an ELF file like this with QEMU then
it will fail with an error "rom: requested regions overlap",
because we create a ROM image for each segment using the memsz
as the size.
Support ELF files using this scheme, by truncating the
zero-initialized part of the segment if it would overlap another
segment. This will retain the existing loader behaviour for
all ELF files we currently accept, and also accept ELF files
which only need 'filesz' bytes to be loaded.
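The truncation rule, in essence (variable names are illustrative; the real
loader checks against every later segment, not just the next one):
    if (mem_size > file_size && paddr + mem_size > next_paddr) {
        /* Drop only the zero-initialized tail that would overlap the
         * following segment; never drop bytes that come from the file. */
        mem_size = MAX(file_size, next_paddr - paddr);
    }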
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1502116754-18867-2-git-send-email-peter.maydell@linaro.org
The ARMv7M architecture specifies that most of the addresses in the
PPB region (which includes the NVIC, systick and system registers)
are not accessible to unprivileged accesses, which should
BusFault with a few exceptions:
* the STIR is configurably user-accessible
* the ITM (which we don't implement at all) is always
user-accessible
Implement this by switching the register access functions
to the _with_attrs scheme that lets us distinguish user
mode accesses.
This allows us to pull the handling of the CCR.USERSETMPEND
flag up to the level where we can make it generate a BusFault
as it should for non-permitted accesses.
Note that until the core ARM CPU code implements turning
MEMTX_ERROR into a BusFault the registers will continue to
act as RAZ/WI to user accesses.
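The shape of the _with_attrs accessors, sketched (user_access_allowed() and
sysreg_read() are hypothetical stand-ins for the real register handling):
    static MemTxResult nvic_sysreg_read_sketch(void *opaque, hwaddr addr,
                                               uint64_t *data, unsigned size,
                                               MemTxAttrs attrs)
    {
        if (attrs.user && !user_access_allowed(addr)) {
            return MEMTX_ERROR;   /* to be turned into a BusFault */
        }
        *data = sysreg_read(opaque, addr, size);
        return MEMTX_OK;
    }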
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1501692241-23310-16-git-send-email-peter.maydell@linaro.org
For M profile the XPSR is a similar but not identical format to the
A profile CPSR/SPSR. (For instance the Thumb bit is in a different
place.) For guest accesses we make the M profile code go through
xpsr_read() and xpsr_write() which handle the different layout.
However for migration we use cpsr_read() and cpsr_write() to
marshal state into and out of the migration data stream. This
is pretty confusing and works more by luck than anything else.
Make M profile migration use xpsr_read() and xpsr_write() instead.
The most complicated part of this is handling the possibility
that the migration source is an older QEMU which hands us a
CPSR format value; helpfully we can always tell the two apart.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1501692241-23310-11-git-send-email-peter.maydell@linaro.org
We currently store the M profile CPU register state PRIMASK and
FAULTMASK in the daif field of the CPU state in its I and F
bits. This is a legacy from the original implementation, which
tried to share the cpu_exec_interrupt code between A profile
and M profile. We've since separated out the two cases because
they are significantly different, so now there is no common
code between M and A profile which looks at env->daif: all the
uses are either in A-only or M-only code paths. Sharing the state
fields now is just confusing, and will make things awkward
when we implement v8M, where the PRIMASK and FAULTMASK
registers are banked between security states.
Switch M profile over to using v7m.faultmask and v7m.primask
fields for these registers.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1501692241-23310-10-git-send-email-peter.maydell@linaro.org
Tighten up the T32 decoder in the places where new v8M instructions
will be:
* TT/TTT/TTA/TTAT are in what was nominally LDREX/STREX r15, ...
which is UNPREDICTABLE:
make the UNPREDICTABLE behaviour be to UNDEF
* BXNS/BLXNS are distinguished from BX/BLX via the low 3 bits,
which in previous architectural versions are SBZ:
enforce the SBZ via UNDEF rather than ignoring it, and move
the "ARCH(5)" UNDEF case up so we don't leak a TCG temporary
* SG is in the encoding which would be LDRD/STRD with rn = r15;
this is UNPREDICTABLE and we currently UNDEF:
move this check further up the code so that we don't leak
TCG temporaries in the UNDEF case and have a better place
to put the SG decode.
This means that if a v8M binary is accidentally run on v7M
or if a test case hits something that we haven't implemented
yet the behaviour will be obvious (UNDEF) rather than obscure
(plough on treating it as a different instruction).
In the process, add some comments about the instruction patterns
at these points in the decode. Our Thumb and ARM decoders are
very difficult to understand currently, but gradually adding
comments like this should help to clarify what exactly has
been decoded when.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1501692241-23310-5-git-send-email-peter.maydell@linaro.org
Currently get_phys_addr() has PMSAv7 handling before the
"is translation disabled?" check, and then PMSAv5 after it.
Tidy this up by making the PMSAv5 code handle the "MPU disabled"
case itself, so that we have all the PMSA code in one place.
This will make adding the PMSAv8 code slightly cleaner, and
also means that pre-v7 PMSA cores benefit from the MPU lookup
logging that the PMSAv7 codepath had.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1501692241-23310-4-git-send-email-peter.maydell@linaro.org
M profile cores can never trap on WFI or WFE instructions. Check for
M profile in check_wfx_trap() to ensure this.
The existing code will do the right thing for v7M cores because
the hcr_el2 and scr_el3 registers will be all-zeroes and so we
won't attempt to trap, but when we start setting ARM_FEATURE_V8
for v8M cores the v8A handling of SCTLR.nTWE and .nTWI will not
give the right results.
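The check amounts to an early exit along these lines (the return convention
of check_wfx_trap() is simplified here):
    if (arm_feature(env, ARM_FEATURE_M)) {
        return 0;   /* M profile never traps WFI/WFE */
    }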
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Edgar E. Iglesias <edgar.iglesias@xilinx.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 1501692241-23310-3-git-send-email-peter.maydell@linaro.org
QAPI patches for 2017-09-01
# gpg: Signature made Mon 04 Sep 2017 12:30:31 BST
# gpg: using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-qapi-2017-09-01-v3: (47 commits)
qapi: drop the sentinel in enum array
qapi: Change data type of the FOO_lookup generated for enum FOO
qapi: Convert indirect uses of FOO_lookup[...] to qapi_enum_lookup()
qapi: Mechanically convert FOO_lookup[...] to FOO_str(...)
qapi: Generate FOO_str() macro for QAPI enum FOO
qapi: Avoid unnecessary use of enum lookup table's sentinel
qapi: Use qapi_enum_parse() in input_type_enum()
crypto: Use qapi_enum_parse() in qcrypto_block_luks_name_lookup()
quorum: Use qapi_enum_parse() in quorum_open()
block: Use qemu_enum_parse() in blkdebug_debug_breakpoint()
hmp: Use qapi_enum_parse() in hmp_migrate_set_parameter()
hmp: Use qapi_enum_parse() in hmp_migrate_set_capability()
tpm: Clean up model registration & lookup
tpm: Clean up driver registration & lookup
qapi: Drop superfluous qapi_enum_parse() parameter max
qapi: Update qapi-code-gen.txt examples to match current code
qapi-schema: Improve section headings
qapi-schema: Move queries from common.json to qapi-schema.json
qapi-schema: Make block-core.json self-contained
qapi-schema: Fold event.json back into qapi-schema.json
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently, a FOO_lookup is an array of strings terminated by a NULL
sentinel.
A future patch will generate enums with "holes". NULL-termination
will cease to work then.
To prepare for that, store the length in the FOO_lookup by wrapping it
in a struct and adding a member for the length.
The sentinel will be dropped next.
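The wrapped lookup then has roughly this shape (type and field names
here are illustrative, not the actual generated ones):

    typedef struct FooLookup {
        const char *const *array;   /* the string table */
        int size;                   /* number of entries, replaces the NULL sentinel */
    } FooLookup;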
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170822132255.23945-13-marcandre.lureau@redhat.com>
[Basically redone]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1503564371-26090-16-git-send-email-armbru@redhat.com>
[Rebased]
Currently, the FOO_lookup[] arrays generated for QAPI enum types are
terminated by a NULL sentinel.
A future patch will generate enums with "holes". NULL-termination
will cease to work then.
To prepare for that, replace "have we reached the sentinel?"
predicates by "have we reached the FOO__MAX value?" predicates.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1503564371-26090-12-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
The error message on invalid blkdebug events changes from
qemu-system-x86_64: LOCATION: Invalid event name "VALUE"
to
qemu-system-x86_64: LOCATION: invalid parameter value: VALUE
Slight degradation, but the message is sub-par even before the patch.
When complaining about a parameter value, both parameter name and
value should be mentioned, as the value may well not be unique. Left
for another day.
Also left is the error message's unhelpful location: it points to the
config=FILENAME rather than into that file.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170822132255.23945-11-marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Rebased, commit message rewritten]
Cc: Kevin Wolf <kwolf@redhat.com>
Cc: Max Reitz <mreitz@redhat.com>
Cc: qemu-block@nongnu.org
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1503564371-26090-8-git-send-email-armbru@redhat.com>
We have a strict separation between enum TpmModel and tpm_models[]:
* TpmModel may have any number of members. It just happens to have one.
* tpm_register_model() uses the first empty slot in tpm_models[].
If you register more than tpm_models[] has space for,
tpm_register_model() fails. Its caller silently ignores the
failure.
Registering the same TpmModel more than once has no effect other than
wasting tpm_models[] slots: tpm_model_is_registered() is happy with
the first one it finds.
Since we only ever register one model, and tpm_models[] has space for
just that one, this contraption even works.
Turn tpm_models[] into a straight map from enum TpmModel to bool. Much
simpler.
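A sketch of the simplified registry (the array bound and the exact
signatures are assumptions, only the idea is from the commit message):

    static bool tpm_models[TPM_MODEL__MAX];

    void tpm_register_model(enum TpmModel model)
    {
        tpm_models[model] = true;
    }

    bool tpm_model_is_registered(enum TpmModel model)
    {
        return tpm_models[model];
    }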
Cc: Stefan Berger <stefanb@us.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1503564371-26090-5-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
[Commit message typo fixed]
We have a strict separation between enum TpmType and be_drivers[]:
* TpmType may have any number of members. It just happens to have one.
* tpm_register_driver() uses the first empty slot in be_drivers[].
If you register more than be_drivers[] has space for,
tpm_register_driver() fails. Its caller silently ignores the
failure.
If you register more than one with a given TpmType,
tpm_display_backend_drivers() will show all of them, but
tpm_driver_find_by_type() and tpm_get_backend_driver() will find
only the one that registered first.
Since we only ever register one driver, and be_drivers[] has space for
just that one, this contraption even works.
Turn be_drivers[] into a straight map from enum TpmType to driver.
Much simpler, and has a decent chance to actually work should we ever
acquire additional drivers.
While there, use qapi_enum_parse() in tpm_get_backend_driver().
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170822132255.23945-8-marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
[Rebased, superfluous initializer dropped, commit message rewritten]
Cc: Stefan Berger <stefanb@us.ibm.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1503564371-26090-4-git-send-email-armbru@redhat.com>
The generated QEMU QMP reference is now structured as follows:
1.1 Introduction
1.2 Stability Considerations
1.3 Common data types
1.4 Socket data types
1.5 VM run state
1.6 Cryptography
1.7 Block devices
1.7.1 Block core (VM unrelated)
1.7.2 QAPI block definitions (vm unrelated)
1.8 Character devices
1.9 Net devices
1.10 Rocker switch device
1.11 TPM (trusted platform module) devices
1.12 Remote desktop
1.12.1 Spice
1.12.2 VNC
1.13 Input
1.14 Migration
1.15 Transactions
1.16 Tracing
1.17 QMP introspection
1.18 Miscellanea
Section "1.18 Miscellanea" is still too big: it documents 134 symbols.
Section "1.7.1 Block core (VM unrelated)" is also rather big: 128
symbols. All the others are of reasonable size.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1503602048-12268-17-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Except for block-core.json, the sub-schemas are self-contained: if
they use a symbol defined in another sub-schema, they include that
sub-schema. To check, feed the sub-schema to qapi2texi (or any other
QAPI generator) along with the pragma from qapi-schema.json.
Fix up things to make block-core.json self-contained, too.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1503602048-12268-15-git-send-email-armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Bug: section "Rocker switch device" starts with the rocker stuff, but
then has unrelated stuff, like ReplayMode, xen-load-devices-state, ...
Cause: rocker.json is included in the middle of section "QMP commands".
Fix: include it in a sane place, namely next to the other sub-schemas.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1503602048-12268-4-git-send-email-armbru@redhat.com>
Bug: introspection documentation is in section "Tracing commands".
Cause: sub-schema qapi/introspect.json lacks a section header, and
therefore goes into whatever section precedes its include.
Fix: add a section header.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1503602048-12268-3-git-send-email-armbru@redhat.com>
Documentation generated with qapi2texi.py is in source order, with
included sub-schemas inserted at the first include directive
(subsequent include directives have no effect). To get a sane and
stable order, it's best to include each sub-schema just once, or
include it first in qapi-schema.json. Document that.
While there, drop a few redundant comments.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <1503602048-12268-2-git-send-email-armbru@redhat.com>
We check that all members of the QLit list are also in the QList. We
neglect to check the other direction. Fix that.
While there, use QLIST_FOREACH_ENTRY() to simplify the code and break
the loop on the first mismatch.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170825105913.4060-13-marcandre.lureau@redhat.com>
[Commit message improved]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
We check that all members of the QLit dictionary are also in the
QDict. We neglect to check the other direction.
Comparing the number of members suffices, because QDict can't
contain duplicate members, and putting duplicates in a QLit is a
programming error.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170825105913.4060-12-marcandre.lureau@redhat.com>
[Commit message improved]
Signed-off-by: Markus Armbruster <armbru@redhat.com>
compare_litqobj_to_qobj() lacks a qlit_ prefix. Moreover, "compare"
suggests -1, 0, +1 for less than, equal and greater than. The
function actually returns non-zero for equal, zero for unequal.
Rename to qlit_equal_qobject().
Its return type will be cleaned up in the next patch.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <20170825105913.4060-6-marcandre.lureau@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
A command is a query if it has no side effect and yields a result.
Such commands are typically named query-FOO, but there are exceptions.
The basic idea is to find candidates with query-qmp-schema, filter out
the ones that aren't queries with an explicit blacklist, and test the
remaining ones against a QEMU with no special arguments.
The current blacklist is just add-fd.
The test can't do queries with arguments, because it knows nothing
about the arguments. No coverage for query-cpu-model-baseline,
query-cpu-model-comparison, query-cpu-model-expansion, query-rocker,
query-rocker-ports, query-rocker-of-dpa-flows, and
query-rocker-of-dpa-groups.
Most tested commands are expected to succeed. The test does not check
the return value then.
query-balloon and query-vm-generation-id are expected to fail because
they need a virtio-balloon / vmgenid device to succeed, and this test
is too dumb to set one up. Could be addressed later.
query-acpi-ospm-status and query-hotpluggable-cpus are expected to
fail because they require features provided only by special machine
types, and this test is too dumb to set that up. Could also be
addressed later.
Several commands may either be functional or stubs that always fail,
depending on build configuration. Ideally, the stubs shouldn't be in
query-qmp-schema, but that requires QAPI schema compile-time
configuration, which we don't have, yet. Until we do, we need to
figure out whether a command is a stub. When a suitable
CONFIG_FOO preprocessor symbol is available, use it. Else,
simply blacklist the command for now.
We get basic test coverage for the following commands, except as
noted:
qom-list-types
query-acpi-ospm-status (expected to fail)
query-balloon (expected to fail)
query-block
query-block-jobs
query-blockstats
query-chardev
query-chardev-backends
query-command-line-options
query-commands
query-cpu-definitions (blacklisted for now)
query-cpus
query-dump
query-dump-guest-memory-capability
query-events
query-fdsets
query-gic-capabilities (blacklisted for now)
query-hotpluggable-cpus (expected to fail)
query-iothreads
query-kvm
query-machines
query-memdev
query-memory-devices
query-mice
query-migrate
query-migrate-cache-size
query-migrate-capabilities
query-migrate-parameters
query-name
query-named-block-nodes
query-pci (blacklisted for now)
query-qmp-schema
query-rx-filter
query-spice
query-status
query-target
query-tpm
query-tpm-models
query-tpm-types
query-uuid
query-version
query-vm-generation-id (expected to fail)
query-vnc
query-vnc-servers
query-xen-replication-status
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Message-Id: <1502461148-10154-1-git-send-email-armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
[Typos in code under #ifndef and in the commit message fixed]
The test-io-channel-tls test was mistakenly using two of the
same directories as test-crypto-tlssession. This causes a
sporadic failure when using make -j$BIGNUM.
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
GNUTLS 3.6.0 marked SHA1 as untrusted for certificates.
Unfortunately the gnutls_x509_crt_sign() method we are
using to create certificates in the test suite is fixed
to always use SHA1. We must switch to a different method
and explicitly ask for SHA256.
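For illustration, the kind of change involved looks roughly like this
(assuming the gnutls_x509_crt_sign2() API, which takes an explicit
digest; the actual test-suite change may differ in detail):

    /* old: digest is hard-wired to SHA1 */
    err = gnutls_x509_crt_sign(crt, issuer_crt, issuer_key);

    /* new: ask for SHA256 explicitly */
    err = gnutls_x509_crt_sign2(crt, issuer_crt, issuer_key,
                                GNUTLS_DIG_SHA256, 0);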
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
$ make check-speed
tests/benchmark-crypto-hash.c: In function 'test_hash_speed':
tests/benchmark-crypto-hash.c:44:5: error: format '%ld' expects argument of type 'long int', but argument 2 has type 'size_t' [-Werror=format=]
g_print("Testing chunk_size %ld bytes ", chunk_size);
^
tests/benchmark-crypto-hash.c: In function 'main':
tests/benchmark-crypto-hash.c:62:9: error: format '%lu' expects argument of type 'long unsigned int', but argument 4 has type 'size_t' [-Werror=format=]
snprintf(name, sizeof(name), "/crypto/hash/speed-%lu", i);
^
cc1: all warnings being treated as errors
rules.mak:66: recipe for target 'tests/benchmark-crypto-hash.o' failed
make: *** [tests/benchmark-crypto-hash.o] Error 1
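A minimal fix along these lines (assuming the variables keep their
size_t types) is to use the %zu length modifier instead of %ld/%lu:

    g_print("Testing chunk_size %zu bytes ", chunk_size);
    snprintf(name, sizeof(name), "/crypto/hash/speed-%zu", i);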
Reviewed-by: Longpeng(Mike) <longpeng@huawei.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
cpu_tilegx_init() always falls back to TYPE_TILEGX_CPU object
regardless of cpu_model. Put fallback logic into
tilegx_cpu_class_by_name() which would translate any cpu_model
into TYPE_TILEGX_CPU class and replace cpu_tilegx_init()
with cpu_generic_init().
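A minimal sketch of what such a class_by_name hook can look like
(illustrative only; the actual patch may differ):

    static ObjectClass *tilegx_cpu_class_by_name(const char *cpu_model)
    {
        /* every cpu_model resolves to the single TileGX CPU class */
        return object_class_by_name(TYPE_TILEGX_CPU);
    }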
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <1503592308-93913-15-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
cpu_nios2_init() always falls back to TYPE_NIOS2_CPU object
regardless of cpu_model. Put fallback logic into
nios2_cpu_class_by_name() which would translate any cpu_model
into TYPE_NIOS2_CPU class and replace cpu_nios2_init()
with cpu_generic_init().
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1503592308-93913-14-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
cpu_mb_init() always falls back to TYPE_MICROBLAZE_CPU object
regardless of cpu_model. Put fallback logic into
mb_cpu_class_by_name() which would translate any cpu_model
into TYPE_MICROBLAZE_CPU class and replace cpu_mb_init()
with cpu_generic_init().
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1503592308-93913-13-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Drop the custom cpu_hppa_init() in favor of cpu_generic_init().
To make cpu_generic_init() work, all we need is to provide a
cc->class_by_name callback that resolves any cpu_model to the
sole TYPE_HPPA_CPU, to match current behaviour.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1503592308-93913-11-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
cpu_alpha_init() used to provide a default fallback if an invalid
(i.e. non-existent) cpu_model was provided.
The dp264 machine provides its own default, so the sole users of the
fallback are the [bsd|linux]-user targets, which specify the 'any' cpu
model that falls back to "ev67" in cpu_alpha_init(). Push the fallback
handling into alpha_cpu_class_by_name() and replace cpu_alpha_init()
with cpu_generic_init().
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Richard Henderson <rth@twiddle.net>
Message-Id: <1503592308-93913-10-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
cpu_s390x_init() is used only by *-user targets, indirectly
via the cpu_init() macro, and has a hack to assign ids to created
cpus (I'm not sure if 'id' really matters to *-user emulation).
So, to be on the safe side, instead of having a custom wrapper to do
the numbering, replace it with cpu_generic_init() and use
S390CPUClass::next_cpu_id, which can serve the same purpose as the
static variable, and move the cpu->id initialization to
s390_cpu_initfn for the CONFIG_USER_ONLY use-case.
PS:
The ifdef is ugly but it allows us to hide an s390x detail that isn't
set by *-user targets and to reuse the generic cpu creation utility
for both machine and user emulation.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <1504185578-80843-1-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
With features converted to properties, we can use the same
approach as x86 for feature parsing and drop the legacy
approach that manipulated the CPU instance directly.
The new sparc_cpu_parse_features() will allow only +-feat
and explicitly disables the feat=on|off syntax for now.
With that in place and sparc_cpu_parse_features() providing the
generic CPUClass::parse_features callback, cpu_sparc_init()
will do the same job as cpu_generic_init(), so replace the content
of cpu_sparc_init() with it.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Acked-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1503672460-109436-1-git-send-email-imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
SPARCCPU::env was initialized from previously set properties
(with help of sparc_cpu_parse_features) in cpu_sparc_register().
However, there is no reason to keep it there, as this task is
typically done at realize time. So move the post-properties
initialization into sparc_cpu_realizefn, which brings
cpu_sparc_init() closer to cpu_generic_init().
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1503592308-93913-6-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
SPARC is the last target that uses the legacy way of parsing
and initializing cpu features. Drop the legacy approach and
convert features to properties so that SPARC can at minimum
benefit from the generic cpu_generic_init(), common with the
x86 +-feat parser.
PS:
The main purpose is to remove the legacy way of cpu creation as
a blocker for unifying cpu creation code across targets.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <1503592308-93913-5-git-send-email-imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Since commit 9262685b ("cpu: Factor out cpu_generic_init()"),
the features parsed by it were truncated to only the 1st feature
after the CPU name, due to the fact that
featurestr = strtok(NULL, ",");
cc->parse_features(cpu, featurestr, &err);
would extract exactly one feature, and the parse_features() callback
would parse it and only it, leaving the rest of the features ignored.
Reuse the approach from the custom x86 implementation, i.e. replace
strtok() token parsing with g_strsplit(), which splits the feature
string in 2 parts, name and feature list, and passes the latter to the
parse_features() callback.
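A sketch of the replacement, keeping the call shape quoted above
(details are illustrative):

    gchar **model_pieces = g_strsplit(cpu_model, ",", 2);

    /* model_pieces[0] is the CPU name, model_pieces[1] the whole feature
     * list; a real implementation must handle model_pieces[1] being NULL
     * when no features were given. */
    cc->parse_features(cpu, model_pieces[1], &err);
    g_strfreev(model_pieces);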
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Message-Id: <1503592308-93913-2-git-send-email-imammedo@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Add a new base CPU model called 'EPYC' to model processors from the AMD EPYC
family (which includes EPYC 76xx, 75xx, 74xx, 73xx and 72xx).
The following feature bits have been added/removed compared to Opteron_G5:
Added: monitor, movbe, rdrand, mmxext, ffxsr, rdtscp, cr8legacy, osvw,
fsgsbase, bmi1, avx2, smep, bmi2, rdseed, adx, smap, clfshopt, sha,
xsaveopt, xsavec, xgetbv1, arat
Removed: xop, fma4, tbm
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: Richard Henderson <rth@twiddle.net>
Cc: Eduardo Habkost <ehabkost@redhat.com>
Cc: Tom Lendacky <Thomas.Lendacky@amd.com>
Signed-off-by: Brijesh Singh <brijesh.singh@amd.com>
Message-Id: <20170815170051.127257-1-brijesh.singh@amd.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The helper can be used for CPU object lookup using the CPU's
arch-specific ID (the one returned by CPUClass::get_arch_id()).
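A sketch of how such a lookup can be implemented (the exact
implementation may differ):

    CPUState *cpu_by_arch_id(int64_t id)
    {
        CPUState *cpu;

        CPU_FOREACH(cpu) {
            CPUClass *cc = CPU_GET_CLASS(cpu);

            if (cc->get_arch_id(cpu) == id) {
                return cpu;
            }
        }
        return NULL;
    }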
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
[Yi Wang: Added documentation comments]
Signed-off-by: Yi Wang <wang.yi59@zte.com.cn>
Signed-off-by: Yun Liu <liu.yunh@zte.com.cn>
[ehabkost: extracted cpu_by_arch_id() to a separate patch]
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The errp argument is ignored by all implementations of the
method, and user_creatable_del() would break if any
implementation set an error (because it calls error_setg(errp) if
the function returns false). Remove the unused parameter.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170829220337.23427-1-ehabkost@redhat.com>
Reviewed-by: Gonglei <arei.gonglei@huawei.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
The chunk size sanity check in qxl_render_cursor works for
SPICE_CURSOR_TYPE_ALPHA cursors only. So support for
SPICE_CURSOR_TYPE_MONO cursors must have been broken for ages without anyone
noticing. Most likely it simply isn't used any more by guest drivers.
Drop the dead code.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170828123933.30323-2-kraxel@redhat.com
Instead pass around the address (aka offset into vga memory).
Add vga_read_* helper functions which apply vbe_size_mask to
the address, to make sure the address stays within the valid
range, similar to the cirrus blitter fixes (commits ffaf857778
and 026aeffcb4).
Impact: DoS for privileged guest users. qemu crashes with
a segfault, when hitting the guard page after vga memory
allocation, while reading vga memory for display updates.
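A sketch of the kind of helper described above (field names are taken
from the commit message, the rest is assumed):

    static uint8_t vga_read_byte(VGACommonState *vga, uint32_t addr)
    {
        /* the mask keeps the address inside the allocated vga memory */
        return vga->vram_ptr[addr & vga->vbe_size_mask];
    }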
Fixes: CVE-2017-13672
Cc: P J P <ppandit@redhat.com>
Reported-by: David Buchanan <d@vidbuchanan.co.uk>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170828122906.18993-1-kraxel@redhat.com
vga display update mis-calculated the region for the dirty bitmap
snapshot in case split screen mode is used. This can trigger an
assert in cpu_physical_memory_snapshot_get_dirty().
Impact: DoS for privileged guest users.
Fixes: CVE-2017-13673
Fixes: fec5e8c92b
Cc: P J P <ppandit@redhat.com>
Reported-by: David Buchanan <d@vidbuchanan.co.uk>
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170828123307.15392-1-kraxel@redhat.com
The conflict check added by commit c0644771 ("qapi: Reject
alternates that can't work with keyval_parse()") doesn't work
with the following declaration:
{ 'alternate': 'Alt',
'data': { 'one': 'bool',
'two': 'str' } }
It crashes with:
Traceback (most recent call last):
File "./scripts/qapi-types.py", line 295, in <module>
schema = QAPISchema(input_file)
File "/home/ehabkost/rh/proj/virt/qemu/scripts/qapi.py", line 1468, in __init__
self.exprs = check_exprs(parser.exprs)
File "/home/ehabkost/rh/proj/virt/qemu/scripts/qapi.py", line 958, in check_exprs
check_alternate(expr, info)
File "/home/ehabkost/rh/proj/virt/qemu/scripts/qapi.py", line 830, in check_alternate
% (name, key, types_seen[qtype]))
KeyError: 'QTYPE_QSTRING'
This happens because the previously-seen conflicting member
('one') can't be found at types_seen[qtype], but at
types_seen['QTYPE_BOOL'].
Fix the bug by moving the error check to the same loop that adds
new items to types_seen, raising an exception if types_seen[qt]
is already set.
Add two additional test cases that can detect the bug.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170717180926.14924-1-ehabkost@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
# gpg: Signature made Thu 31 Aug 2017 11:29:33 BST
# gpg: using RSA key 0xDAE8E10975969CE5
# gpg: Good signature from "Marc-André Lureau <marcandre.lureau@redhat.com>"
# gpg: aka "Marc-André Lureau <marcandre.lureau@gmail.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 87A9 BD93 3F87 C606 D276 F62D DAE8 E109 7596 9CE5
* remotes/elmarco/tags/tidy-pull-request: (29 commits)
eepro100: replace g_malloc()+memcpy() with g_memdup()
test-iov: replace g_malloc()+memcpy() with g_memdup()
i386: replace g_malloc()+memcpy() with g_memdup()
i386: introduce ELF_NOTE_SIZE macro
decnumber: use DIV_ROUND_UP
kvm: use DIV_ROUND_UP
i386/dump: use DIV_ROUND_UP
ppc: use DIV_ROUND_UP
msix: use DIV_ROUND_UP
usb-hub: use DIV_ROUND_UP
q35: use DIV_ROUND_UP
piix: use DIV_ROUND_UP
virtio-serial: use DIV_ROUND_UP
console: use DIV_ROUND_UP
monitor: use DIV_ROUND_UP
virtio-gpu: use DIV_ROUND_UP
vga: use DIV_ROUND_UP
ui: use DIV_ROUND_UP
vnc: use DIV_ROUND_UP
vvfat: use DIV_ROUND_UP
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
# gpg: Signature made Thu 31 Aug 2017 09:21:49 BST
# gpg: using RSA key 0x9CA4ABB381AB73C8
# gpg: Good signature from "Stefan Hajnoczi <stefanha@redhat.com>"
# gpg: aka "Stefan Hajnoczi <stefanha@gmail.com>"
# Primary key fingerprint: 8695 A8BF D3F9 7CDA AC35 775A 9CA4 ABB3 81AB 73C8
* remotes/stefanha/tags/block-pull-request:
qcow2: allocate cluster_cache/cluster_data on demand
qemu-doc: Add UUID support in initiator name
tests: migration/guestperf Python 2.6 argparse compatibility
docker.py: Python 2.6 argparse compatibility
scripts: add argparse module for Python 2.6 compatibility
misc: Remove unused Error variables
oslib-posix: Print errors before aborting on qemu_alloc_stack()
throttle: Test the valid range of config values
throttle: Make burst_length 64bit and add range checks
throttle: Make LeakyBucket.avg and LeakyBucket.max integer types
throttle: Remove throttle_fix_bucket() / throttle_unfix_bucket()
throttle: Make throttle_is_valid() a bit less verbose
throttle: Update the throttle_fix_bucket() documentation
throttle: Fix wrong variable name in the header documentation
nvme: Fix get/set number of queues feature, again
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
First batch of s390x patches:
- 2.11 compat machine
- support the new --s390-pgste linker option, making it possible to
avoid enabling the global vm.allocate_pgste systl if all pieces
are in place
- correctly identify some devices as not hotpluggable
- clean up some tests and enable them for s390x
- wire up the diag288 watchdog in tcg
- clean up dependencies on CONFIG_PCI, making it possible to disable
it by hand
- lots of cleanup in target/s390x/
- fix alignment of the ccw1 structure in the s390-ccw bios
- and some more bugfixes
# gpg: Signature made Wed 30 Aug 2017 17:40:34 BST
# gpg: using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>"
# gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# gpg: aka "Cornelia Huck <cohuck@kernel.org>"
# gpg: aka "Cornelia Huck <cohuck@redhat.com>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF
* remotes/cohuck/tags/s390x-20170830: (44 commits)
s390x/pci: fixup trap_msix()
pc-bios/s390-ccw.img: update image
s390-ccw: Fix alignment for CCW1
s390x/s390-stattrib: Mark the storage attribute as not user_creatable
target/s390x: cleanup cpu.h
s390x/kvm: move KVM declarations and stubs to separate files
s390x: avoid calling kvm_ functions outside of target/s390x/
target/s390x: move a couple of functions to cpu.c
target/s390x: introduce internal.h
target/s390x: move get_per_in_range() to misc_helper.c
target/s390x: move s390_do_cpu_reset() to diag.c
target/s390x: move psw_key_valid() to mem_helper.c
target/s390x: move cpu_mmu_idx_to_asc() to excp_helper.c
target/s390x: move cc_name() to helper.c
target/s390x: move gtod_*() declarations to s390-virtio.h
s390x: drop inclusion of sysemu/kvm.h from some files
s390x/cpumodel: factor out determination of default model name
target/s390x: no need to pass kvm_state to savevm_gtod handlers
target/s390x: simplify gs_allowed()
target/s390x: simplify ri_allowed()
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
I found these patterns by grepping the source tree. I don't have a
coccinelle script for it!
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
I found these patterns by grepping the source tree. I don't have a
coccinelle script for it!
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
The if_fastq and if_batchq contain not only packets, but queues of packets
for the same socket. When sofree frees a socket, it thus has to clear ifq_so
from all the packets from the queues, not only the first.
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: qemu-stable@nongnu.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add nbd_co_request, to remove code duplications in
nbd_client_co_{pwrite,pread,...} functions. Also this is
needed for further refactoring.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170804151440.320927-8-vsementsov@virtuozzo.com>
[eblake: make nbd_co_request a wrapper, rather than merging two
existing functions]
Signed-off-by: Eric Blake <eblake@redhat.com>
Rename nbd_recv_coroutines_enter_all to nbd_recv_coroutines_wake_all,
as it most probably just adds all recv coroutines into co_queue_wakeup,
rather than directly entering them.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170804151440.320927-9-vsementsov@virtuozzo.com>
[eblake: tweak commit message]
Signed-off-by: Eric Blake <eblake@redhat.com>
Fix nbd_send_request to return int, as it returns a return value
of nbd_write (which is int), and the only user of nbd_send_request's
return value (nbd_co_send_request) considers it as int too.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170804151440.320927-5-vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Refactor nbd_receive_reply to return 1 on success, 0 on EOF when no
data was read, and <0 in other cases, because the returned size of the
read data is not actually used.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170804151440.320927-4-vsementsov@virtuozzo.com>
[eblake: tweak function comments]
Signed-off-by: Eric Blake <eblake@redhat.com>
Refactor nbd_read_eof to return 1 on success, 0 on EOF when no
data was read, and <0 in other cases, because the returned size of
the read data is not actually used.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Message-Id: <20170804151440.320927-3-vsementsov@virtuozzo.com>
[eblake: tweak function comments, rebase to test 083 enhancements]
Signed-off-by: Eric Blake <eblake@redhat.com>
083 only tests TCP. Some failures might be specific to UNIX domain
sockets.
A few adjustments are necessary:
1. Generating a port number and waiting for server startup is
TCP-specific. Use the new nbd-fault-injector.py startup protocol to
fetch the address. This is a little more elegant because we don't
need netstat anymore.
2. The NBD filter does not work for the UNIX domain sockets URIs we
generate and must be extended.
3. Run all tests twice: once for TCP and once for UNIX domain sockets.
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20170829122745.14309-4-stefanha@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Currently 083 waits for the nbd-fault-injector.py server to start up by
looping until netstat shows the TCP listen socket.
The startup protocol can be simplified by passing a 0 port number to
nbd-fault-injector.py. The kernel will allocate a port in bind(2) and
the final port number can be printed by nbd-fault-injector.py.
This should make it slightly nicer and less TCP-specific to wait for
server startup. This patch changes nbd-fault-injector.py, the next one
will rewrite server startup in 083.
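The underlying trick is the standard sockets one, sketched here in C
for clarity (the fault injector itself is a Python script):

    /* fd is an already-created TCP socket; binding to port 0 lets the
     * kernel pick a free port, getsockname() reports which one it chose */
    struct sockaddr_in addr = { .sin_family = AF_INET };
    socklen_t len = sizeof(addr);

    bind(fd, (struct sockaddr *)&addr, sizeof(addr));
    getsockname(fd, (struct sockaddr *)&addr, &len);
    printf("Listening on port %d\n", ntohs(addr.sin_port));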
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20170829122745.14309-3-stefanha@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
The following segfault is encountered if the NBD server closes the UNIX
domain socket immediately after negotiation:
Program terminated with signal SIGSEGV, Segmentation fault.
#0 aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441
441 QSLIST_INSERT_HEAD_ATOMIC(&ctx->scheduled_coroutines,
(gdb) bt
#0 0x000000d3c01a50f8 in aio_co_schedule (ctx=0x0, co=0xd3c0ff2ef0) at util/async.c:441
#1 0x000000d3c012fa90 in nbd_coroutine_end (bs=bs@entry=0xd3c0fec650, request=<optimized out>) at block/nbd-client.c:207
#2 0x000000d3c012fb58 in nbd_client_co_preadv (bs=0xd3c0fec650, offset=0, bytes=<optimized out>, qiov=0x7ffc10a91b20, flags=0) at block/nbd-client.c:237
#3 0x000000d3c0128e63 in bdrv_driver_preadv (bs=bs@entry=0xd3c0fec650, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=0) at block/io.c:836
#4 0x000000d3c012c3e0 in bdrv_aligned_preadv (child=child@entry=0xd3c0ff51d0, req=req@entry=0x7f31885d6e90, offset=offset@entry=0, bytes=bytes@entry=512, align=align@entry=1, qiov=qiov@entry=0x7ffc10a91b20, f
+lags=0) at block/io.c:1086
#5 0x000000d3c012c6b8 in bdrv_co_preadv (child=0xd3c0ff51d0, offset=offset@entry=0, bytes=bytes@entry=512, qiov=qiov@entry=0x7ffc10a91b20, flags=flags@entry=0) at block/io.c:1182
#6 0x000000d3c011cc17 in blk_co_preadv (blk=0xd3c0ff4f80, offset=0, bytes=512, qiov=0x7ffc10a91b20, flags=0) at block/block-backend.c:1032
#7 0x000000d3c011ccec in blk_read_entry (opaque=0x7ffc10a91b40) at block/block-backend.c:1079
#8 0x000000d3c01bbb96 in coroutine_trampoline (i0=<optimized out>, i1=<optimized out>) at util/coroutine-ucontext.c:79
#9 0x00007f3196cb8600 in __start_context () at /lib64/libc.so.6
The problem is that nbd_client_init() uses
nbd_client_attach_aio_context() -> aio_co_schedule(new_context,
client->read_reply_co). Execution of read_reply_co is deferred to a BH
which doesn't run until later.
In the mean time blk_co_preadv() can be called and nbd_coroutine_end()
calls aio_wake() on read_reply_co. At this point in time
read_reply_co's ctx isn't set because it has never been entered yet.
This patch simplifies the nbd_co_send_request() ->
nbd_co_receive_reply() -> nbd_coroutine_end() lifecycle to just
nbd_co_send_request() -> nbd_co_receive_reply(). The request is "ended"
if an error occurs at any point. Callers no longer have to invoke
nbd_coroutine_end().
This cleanup also eliminates the segfault because we don't call
aio_co_schedule() to wake up s->read_reply_co if sending the request
failed. It is only necessary to wake up s->read_reply_co if a reply was
received.
Note this only happens with UNIX domain sockets on Linux. It doesn't
seem possible to reproduce this with TCP sockets.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20170829122745.14309-2-stefanha@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
This is the follow-up patch that was discussed[*] as part of feedback to
qemu-iotest 194.
Changes in this patch:
- Supply 'job-id' parameter to `drive-mirror` invocation.
- Once migration completes, issue QMP `block-job-cancel` command on
the source QEMU to gracefully complete `drive-mirror` operation.
- Once the BLOCK_JOB_COMPLETED event is emitted, stop the NBD server
on the destination QEMU.
- Check for both the events: MIGRATION and BLOCK_JOB_COMPLETED.
With the above, the test will also be (almost) in sync with the
procedure outlined in the document 'live-block-operations.rst'[+]
(section: "QMP invocation for live storage migration with
``drive-mirror`` + NBD").
[*] https://lists.nongnu.org/archive/html/qemu-devel/2017-08/msg04820.html
-- qemu-iotests: add 194 non-shared storage migration test
[+] https://git.qemu.org/gitweb.cgi?p=qemu.git;a=blob;f=docs/interop/live-block-operations.rst
Signed-off-by: Kashyap Chamarthy <kchamart@redhat.com>
Message-Id: <20170829165058.8229-1-kchamart@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Most qcow2 files are uncompressed so it is wasteful to allocate (32 + 1)
* cluster_size + 512 bytes upfront. Allocate s->cluster_cache and
s->cluster_data when the first read operation is performed on a
compressed cluster.
The buffers are freed in .bdrv_close(). .bdrv_open() no longer has any
code paths that can allocate these buffers, so remove the free functions
in the error code path.
This patch can result in significant memory savings when many qcow2
disks are attached or backing file chains are long:
Before 12.81% (1,023,193,088B)
After 5.36% (393,893,888B)
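A heavily simplified sketch of the lazy allocation, based only on the
sizes quoted above (the real code and allocation helpers may differ):

    /* in the compressed-cluster read path */
    if (!s->cluster_cache) {
        s->cluster_cache = g_malloc(s->cluster_size);
        s->cluster_data  = g_malloc(32 * s->cluster_size + 512);
    }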
Reported-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Tested-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170821135530.32344-1-stefanha@redhat.com
Cc: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Let's do it just like the other architectures. Introduce kvm-stub.c
for stubs and kvm_s390x.h for the declarations.
Change license to GPL2+ and keep copyright notice.
As we are dropping the sysemu/kvm.h include from cpu.h, fix up includes.
Suggested-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170818114353.13455-18-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
cpu.h should only contain what really has to be accessed outside of
target/s390x/. Add internal.h which can only be used inside target/s390x/.
Move everything that isn't fast enough to run away and restructure it
right away. We'll move all kvm_* stuff later.
Minor style fixes to avoid checkpatch warning to:
- struct Lowcore: "{" goes into same line as typedef
- struct LowCore: add spaces around "-" in array length calculations
- time2tod() and tod2time(): move "{" to separate line
- get_per_atmid(): add space between ")" and "?". Move cases by one char.
- get_per_atmid(): drop extra parenthesis around (1 << 6)
Change license of new file to GPL2+ and keep copyright notice.
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170818114353.13455-15-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
No need for kvm_enabled() as this function is only called from KVM and
there is no reason why it shouldn't be allowed for tcg. It is simply not
available under tcg.
Also, there is no need to check for the machine type anymore. Just like
ri_enabled(), we can directly use the stored flag, which results in
"true" for the "none" machine.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170818114353.13455-5-david@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
QEMU currently aborts if the user tries to create a skey device:
$ s390x-softmmu/qemu-system-s390x -nographic -device s390-skeys-qemu
qemu-system-s390x: hw/s390x/s390-skeys.c:30: s390_get_skeys_device:
Assertion `ss' failed.
Aborted (core dumped)
The storage key devices are only meant to be instantiated one time,
internally. They can not be used by the user, so mark them with
user_creatable = false.
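Marking a device as internal-only is a one-line class_init change,
roughly like this (class and function names here are placeholders):

    static void s390_skeys_class_init(ObjectClass *oc, void *data)
    {
        DeviceClass *dc = DEVICE_CLASS(oc);

        /* internal device: must not be created via -device or device_add */
        dc->user_creatable = false;
    }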
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1503569328-22197-1-git-send-email-thuth@redhat.com>
Reviewed-by: Claudio Imbrenda <imbrenda@linux.vnet.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
VIRTIO_PCI should properly depend on CONFIG_PCI.
With this change, we can switch off pci for s390x by removing
'CONFIG_PCI=y' from the default config.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
If we do not provide zpci, pci reconfiguration via sclp is not available
either. I/O adapter configuration, however, should always be present.
Rename the values that refer to I/O adapter configuration (instead of only
pci) to make things clearer.
Move length checking of the sccb for I/O adapter configuration into the
common sclp code (out of the pci code). This also fixes an issue that
the pci code would refer to a field in the sccb before checking whether
it was actually long enough.
Check for the adapter type in the sccb and return unrecognized adapter
type if the guest tries to issue I/O adapter configure/deconfigure for
a type other than pci or for pci if the zpci facility is not provided.
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Don't create the s390 pci host bridge if we do not provide the zpci
facility.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The nt2 event class is pci-only - don't look for events if pci is
not in the active cpu model.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The msi routing code in kvm calls some pci functions: provide
some stubs to enable builds without pci.
Also, to make this more obvious, guard them via a pci_available boolean
(which also can be reused in other places).
Fixes: e1d4fb2de ("kvm-irqchip: x86: add msi route notify fn")
Fixes: 767a554a0 ("kvm-all: Pass requester ID to MSI routing functions")
Reviewed-by: Pierre Morel <pmorel@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Nothing in fsdev/ or hw/9pfs/ depends on pci; it should rather depend
on CONFIG_VIRTFS and CONFIG_VIRTIO/CONFIG_XEN only.
Acked-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
KVM guests on s390 need a different page table layout than normal
processes (2kb page table + 2kb page status extensions vs 2kb page table
only). As of today this has to be enabled via the vm.allocate_pgste
sysctl.
Newer kernels (>= 4.12) on s390 check for an S390_PGSTE program header
and enable the pgste page table extensions in that case. This makes the
vm.allocate_pgste sysctl unnecessary. We enable this program header for
the s390 system emulation (qemu-system-s390x) if we build on s390
- for s390 system emulation
- the linker supports --s390-pgste (binutils >= 2.29)
- KVM is enabled
This will allow distributions to disable the global vm.allocate_pgste
sysctl, which will improve the page table allocation for non KVM
processes as only 2kb chunks are necessary.
Cc: Christian Ehrhardt <christian.ehrhardt@canonical.com>
Cc: Alexander Graf <agraf@suse.de>
Cc: Dan Horak <dhorak@redhat.com>
Cc: David Hildenbrand <david@redhat.com>
Signed-off-by: Christian Borntraeger <borntraeger@de.ibm.com>
Acked-by: Janosch Frank <frankja@linux.vnet.ibm.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1503483383-199649-1-git-send-email-borntraeger@de.ibm.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
QEMU currently aborts when the user tries to hot-unplug a diag288
device:
$ qemu-system-s390x -nographic -nodefaults -S -monitor stdio
QEMU 2.9.92 monitor - type 'help' for more information
(qemu) device_add diag288,id=x
(qemu) device_del x
**
ERROR:qemu/qdev-monitor.c:872:qdev_unplug: assertion failed: (hotplug_ctrl)
Aborted (core dumped)
The device is not designed as hot-pluggable (it should only be used
via the "-watchdog" parameter), so let's simply remove the possibility
to hotplug it to prevent that users can run into this ugly situation.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1502892528-22618-1-git-send-email-thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
While the PoP is silent on the issue, z/VM documentation states
that unknown diagnose codes trigger a specification exception.
We already do that when running with kvm, so change tcg to do so
as well.
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Re-using the boot_sector code buffer from x86 for other architectures
is not very nice, especially if we add more architectures later. It's
also ugly that the test uses a huge pre-initialized array at all - the
size of the executable is very large due to this array. So let's use a
separate buffer for each architecture instead, allocated from the heap,
so that we really just use the memory that we need.
Suggested-by: Michael Tsirkin <mst@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Message-Id: <1502431076-22849-2-git-send-email-thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The s390-ipl device can not be created by the user, since it is meant only
to be instantiated once internally to load the ROMs and kernel. If the user
tries to do a "device_add s390-ipl" via the monitor later, QEMU aborts with
a "ROM images must be loaded at startup" error message.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1502861458-30270-1-git-send-email-thuth@redhat.com>
Reviewed-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The minimum Python version supported by QEMU is 2.6. The argparse
standard library module was only added in Python 2.7. Many scripts
would like to use argparse because it supports command-line
sub-commands.
This patch adds argparse. See the top of argparse.py for details.
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: John Snow <jsnow@redhat.com>
Message-id: 20170825155732.15665-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
There's a few cases which we're passing an Error pointer to a function
only to discard it immediately afterwards without checking it. In
these cases we can simply remove the variable and pass NULL instead.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170829120836.16091-1-berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
If QEMU is running on a system that's out of memory and mmap()
fails, QEMU aborts with no error message at all, making it hard
to debug the reason for the failure.
Add perror() calls that will print error information before
aborting.
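A minimal sketch of the pattern (the actual call sites are in
qemu_alloc_stack()):

    ptr = mmap(NULL, total_size, PROT_READ | PROT_WRITE,
               MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (ptr == MAP_FAILED) {
        perror("failed to allocate coroutine stack");
        abort();
    }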
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 20170829212053.6003-1-ehabkost@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
LeakyBucket.burst_length is defined as an unsigned integer but the
code never checks for overflows and it only makes sure that the value
is not 0.
In practice this means that the user can set something like
throttling.iops-total-max-length=4294967300 despite being larger than
UINT_MAX and the final value after casting to unsigned int will be 4.
This patch changes the data type to uint64_t. This does not increase
the storage size of LeakyBucket, and allows us to assign the value
directly from qemu_opt_get_number() or BlockIOThrottle and then do the
checks directly in throttle_is_valid().
The value of burst_length does not have a specific upper limit,
but since the bucket size is defined by max * burst_length we have
to prevent overflows. Instead of going for UINT64_MAX or something
similar this patch reuses THROTTLE_VALUE_MAX, which allows I/O bursts
of 1 GiB/s for 10 days in a row.
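A sketch of the kind of check this enables in throttle_is_valid()
(illustrative, not the exact code):

    if (!bkt->burst_length) {
        error_setg(errp, "the burst length cannot be 0");
        return false;
    }
    if (bkt->max && bkt->burst_length > THROTTLE_VALUE_MAX / bkt->max) {
        error_setg(errp, "burst length too high for this burst rate");
        return false;
    }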
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 1b2e3049803f71cafb2e1fa1be4fb47147a0d398.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Both the throttling limits set with the throttling.iops-* and
throttling.bps-* options and their QMP equivalents defined in the
BlockIOThrottle struct are integer values.
Those limits are also reported in the BlockDeviceInfo struct and they
are integers there as well.
Therefore there's no reason to store them internally as double and do
the conversion every time we're setting or querying them, so this patch
uses uint64_t for those types. Let's also use an unsigned type because
we don't allow negative values anyway.
LeakyBucket.level and LeakyBucket.burst_level do however remain double
because their value changes depending on the fraction of time elapsed
since the previous I/O operation.
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: f29b840422767b5be2c41c2dfdbbbf6c5f8fedf8.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The throttling code can change internally the value of bkt->max if it
hasn't been set by the user. The problem with this is that if we want
to retrieve the original value we have to undo this change first. This
is ugly and unnecessary: this patch removes the throttle_fix_bucket()
and throttle_unfix_bucket() functions completely and moves the logic
to throttle_compute_wait().
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Message-id: 5b0b9e1ac6eb208d709eddc7b09e7669a523bff3.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The way the throttling algorithm works is that requests start being
throttled once the bucket level exceeds the burst limit. When we get
there the bucket leaks at the level set by the user (bkt->avg), and
that leak rate is what prevents guest I/O from exceeding the desired
limit.
If we don't allow bursts (i.e. bkt->max == 0) then we can start
throttling requests immediately. The problem with keeping the
threshold at 0 is that it only allows one request at a time, and as
soon as there's a bit of I/O from the guest every other request will
be throttled and performance will suffer considerably. That can even
make the guest unable to reach the throttle limit if that limit is
high enough, and that happens regardless of the block scheduler used
by the guest.
Increasing that threshold gives flexibility to the guest, allowing it
to perform short bursts of I/O before being throttled. Increasing the
threshold too much does not make a difference in the long run (because
it's the leak rate that defines the actual throughput) but it does
allow the guest to perform longer initial bursts and exceed the
throttle limit for a short while.
A burst value of bkt->avg / 10 allows the guest to perform 100ms'
worth of I/O at the target rate without being throttled.
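As a worked example (numbers chosen purely for illustration): with
iops-total=1000 and no explicit max, the threshold becomes
1000 / 10 = 100 requests. The guest can issue those 100 requests
immediately, and at the 1000 IOPS leak rate that backlog drains in
100 / 1000 = 0.1 s, i.e. the 100ms of I/O at the target rate mentioned
above.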
Signed-off-by: Alberto Garcia <berto@igalia.com>
Message-id: 31aae6645f0d1fbf3860fb2b528b757236f0c0a7.1503580370.git.berto@igalia.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The number of queues returned by the admin command should:
1) Only mention the number of non-admin queues.
2) It is zero-based, meaning that '0 == one non-admin queue',
'1 == two non-admin queues', and so forth.
Because our `num_queues` means the number of queues _plus_ the admin
queue, then the right calculation for the number returned from the admin
command is `num_queues - 2`, combining the two requirements mentioned.
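For example (illustrative numbers): with num_queues = 64 there is
1 admin queue and 63 I/O queues, so the feature must report
64 - 2 = 62, which in zero-based terms means 63 usable non-admin
queues.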
The issue was discovered by reducing num_queues from 64 to 8 and running
a Linux VM with an SMP parameter larger than that (e.g. 22). It tries to
utilize all queues, and therefore fails with an invalid queue number
when trying to queue I/Os on the last queue.
Signed-off-by: Dan Aloni <dan@kernelim.com>
CC: Alex Friedman <alex@e8storage.com>
CC: Keith Busch <keith.busch@intel.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Keith Busch <keith.busch@intel.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The following scenario leads to an assertion failure in
qio_channel_yield():
1. Request coroutine calls qio_channel_yield() successfully when sending
would block on the socket. It is now yielded.
2. nbd_read_reply_entry() calls nbd_recv_coroutines_enter_all() because
nbd_receive_reply() failed.
3. Request coroutine is entered and returns from qio_channel_yield().
Note that the socket fd handler has not fired yet so
ioc->write_coroutine is still set.
4. Request coroutine attempts to send the request body with nbd_rwv()
but the socket would still block. qio_channel_yield() is called
again and assert(!ioc->write_coroutine) is hit.
The problem is that nbd_read_reply_entry() does not distinguish between
request coroutines that are waiting to receive a reply and those that
are not.
This patch adds a per-request bool receiving flag so
nbd_read_reply_entry() can avoid spurious aio_wake() calls.
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20170822125113.5025-1-stefanha@redhat.com>
Reviewed-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Tested-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
In the ->inactivate() callbacks, permissions are updated, which
typically involves a recursive check of the whole graph. Setting
BDRV_O_INACTIVE right before doing that creates a state in which
bdrv_is_writable() returns false, which causes the permission update
to fail.
Reorder them so the flag is updated after calling the function. Note
that this doesn't break the assert in bdrv_child_cb_inactivate() because
for any specific BDS, we still update its flags first before calling
->inactivate() on it one level deeper in the recursion.
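Schematically, the reordering looks like this (a sketch, not the
literal QEMU code):

    /* before: flag first, so bdrv_is_writable() is already false here */
    bs->open_flags |= BDRV_O_INACTIVE;
    ret = bs->drv->bdrv_inactivate(bs);

    /* after: run the callback first, set the flag only on success */
    ret = bs->drv->bdrv_inactivate(bs);
    if (ret == 0) {
        bs->open_flags |= BDRV_O_INACTIVE;
    }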
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170823134242.12080-5-famz@redhat.com>
Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
These two conditions correspond to the mirror job's source and target,
which need to be allowed as they are part of the non-shared storage
migration workflow: failing to inactivate either will result in a
failure during migration completion.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170823134242.12080-3-famz@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
[eblake: improve comment grammar]
Signed-off-by: Eric Blake <eblake@redhat.com>
The 'm->numa_auto_assign_ram = numa_legacy_auto_assign_ram;' line
was supposed to be in pc_i440fx_2_9_machine_options() (see commit
3bfe5716 "numa: equally distribute memory on nodes"), but the
merge commit adb354dd ("Merge remote-tracking branch
'mst/tags/for_upstream' into staging") moved it to the
pc_i440fx_2_10_machine_options().
Move the line back to pc_i440fx_2_9_machine_options().
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-id: 20170818190943.23858-1-ehabkost@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Travis builds fail at HEAD (rc3) on master with
block/nbd-client.c: In function ‘nbd_read_reply_entry’:
block/nbd-client.c:110:8: error: ‘ret’ may be used uninitialized in this function [-Werror=uninitialized]
Fix it by initializing 'ret' to 0.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ppc patch queue 2017-08-23
This is identical to the pull request from yesterday (20170822),
except that a bug in one patch is fixed so that it doesn't break TCG
on a ppc host.
Last minute ppc related fixes for qemu-2.10. I'm not sure if these
are critical enough to prompt another rc, but I'm submitting them for
consideration.
First, is Cornelia's fix for 480bc11e6 which meant "make check" would
always fail on a ppc host. Tracking that down delayed submission of
the rest of these patches, sorry.
The rest are all fairly important bugfixes for qemu crashes or guest
behaviour regression on ppc. Patches 2-4 specifically are fixes for
regressions from qemu-2.9, caused by the compatibility mode and
hotplug handling cleanups for the pseries machine type.
# gpg: Signature made Wed 23 Aug 2017 01:31:47 BST
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.10-20170823:
hw/ppc/spapr_iommu: Fix crash when removing the "spapr-tce-table" device
hw/ppc/spapr_rtc: Mark the RTC device with user_creatable = false
hw/ppc/spapr: Fix segfault when instantiating a 'pc-dimm' without 'memdev'
spapr: Allow configure-connector to be called multiple times
ppc: fix ppc_set_compat() with KVM PR
target/ppc: 'PVR != host PVR' in KVM_SET_SREGS workaround
boot-serial-test: prefer tcg accelerator
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
QEMU currently aborts unexpectedly when the user tries to add and
remove a "spapr-tce-table" device:
$ qemu-system-ppc64 -nographic -S -nodefaults -monitor stdio
QEMU 2.9.92 monitor - type 'help' for more information
(qemu) device_add spapr-tce-table,id=x
(qemu) device_del x
**
ERROR:qemu/qdev-monitor.c:872:qdev_unplug: assertion failed: (hotplug_ctrl)
Aborted (core dumped)
The device should not be accessible to users at all; it's just
used internally, so mark it with user_creatable = false.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
QEMU currently aborts unexpectedly when a user tries to do something
like this:
$ qemu-system-ppc64 -nographic -S -nodefaults -monitor stdio
QEMU 2.9.92 monitor - type 'help' for more information
(qemu) device_add spapr-rtc,id=spapr-rtc
(qemu) device_del spapr-rtc
**
ERROR:qemu/qdev-monitor.c:872:qdev_unplug: assertion failed: (hotplug_ctrl)
Aborted (core dumped)
The RTC device is not meant to be hot-pluggable - it's an internal
device only, and it should not even be possible to create it a
second time with the "-device" parameter, so let's mark it
with "user_creatable = false".
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
QEMU currently crashes when trying to use a 'pc-dimm' on the pseries
machine without specifying its 'memdev' property. This happens because
pc_dimm_get_memory_region() does not check whether the 'memdev' property
has properly been set by the user. Looking closer at this function, it's
also obvious that it is using &error_abort to call another function - and
this is bad in a function that is used in the hot-plugging calling chain
since this can also cause QEMU to exit unexpectedly.
So let's fix these issues in a proper way now: add an "Error **errp"
parameter to pc_dimm_get_memory_region() which we use in case the 'memdev'
property has not been set by the user, and which we can use instead of
the &error_abort, and change the callers of get_memory_region() to make
use of this "errp" parameter for proper error checking.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
In case of in-kernel memory hot unplug, when the guest is not able
to remove all the LMBs that are requested for removal, it will add back
any LMBs that have been successfully removed. The DR Connectors of
these LMBs wouldn't have been unconfigured and hence the addition of
these LMBs will result in a configure-connector call being issued on
LMB DR connectors that are already in the configured state. Such
configure-connector calls will fail resulting in a DIMM which is
partially unplugged.
This, however, worked until recently, before we overhauled the DRC
implementation in QEMU. Commit 9d4c0f4f0a: "spapr: Consolidate
DRC state variables" is the first commit where this problem shows up
as per git bisect.
Ideally the guest shouldn't be issuing a configure-connector call on an
already configured DR connector. However for now, work around this in
QEMU by allowing configure-connector to be called multiple times for
all types of DR connectors.
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
[dwg: Corrected buglet that would have initialized fdt pointers ready
for reading on a device not present at reset]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When running in KVM PR mode, kvmppc_set_compat() always fails because the
current PR implementation doesn't handle KVM_REG_PPC_ARCH_COMPAT. Now that
the machine code unconditionally calls ppc_set_compat_all() at reset time
to restore the compat mode default value (commit 66d5c492dd), it is
impossible to start a guest with PR:
qemu-system-ppc64: Unable to set CPU compatibility mode in KVM:
Invalid argument
A tentative patch [1] was recently sent by Suraj to address the issue, but
it would prevent the compat mode from being turned off on reset. And we
really don't want to explicitly check for KVM PR. During the patch's review,
David suggested that we should only call the KVM ioctl() if the compat
PVR changes. This allows at least to run with KVM PR, provided no compat
mode is requested from the command line (which should be the case when
running PR nested). This is what this patch does.
While here, we also fix the side effect where KVM would fail but we would
change the CPU state in QEMU anyway.
[1] http://patchwork.ozlabs.org/patch/782039/
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Suraj Jitindar Singh <sjitindarsingh@gmail.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit d5fc133eed ("ppc: Rework CPU compatibility testing
across migration") changed the way cpu_post_load behaves with
the PVR setting, causing an unexpected bug in KVM-HV migrations
between hosts that are compatible (POWER8 and POWER8E, for example).
Even with pvr_match() returning true, the guest freezes right after
cpu_post_load. The reason is that the guest kernel can't handle a
PVR value in KVM_SET_SREGS that differs from that of the running host.
In [1], the possibility was discussed of a new KVM capability
that would indicate that the guest kernel can handle a different
PVR in KVM_SET_SREGS. Even if such a feature is implemented, there is
still the problem of older kernels that will not have this capability
and will fail to migrate.
This patch implements a workaround for that scenario. If running
with KVM, check whether the guest kernel has the capability
(named here 'cap_ppc_pvr_compat'). If it doesn't, call
kvmppc_is_pr() to see whether the guest is running in KVM-HV. If all
this is the case, set env->spr[SPR_PVR] to the same value as the current
host PVR. This ensures that we allow migrations with 'close enough'
PVRs to still work in KVM-HV but also makes the code ready for
this new KVM capability when it is done.
A new function called 'kvmppc_pvr_workaround_required' was created
to encapsulate the conditions said above and to avoid calling too
many kvm.c internals inside cpu_post_load.
[1] https://lists.gnu.org/archive/html/qemu-ppc/2017-06/msg00503.html
Signed-off-by: Daniel Henrique Barboza <danielhb@linux.vnet.ibm.com>
[dwg: Fix for the case of using TCG on a PPC host]
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Prefer to use the tcg accelerator if it is available: this is our only
real smoke test for tcg, and it is fast enough to use for that.
Fixes: 480bc11e6 ("boot-serial-test: fallback to kvm accelerator")
Reported-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The mmio-interface device is not something we want to allow
users to create on the command line:
* it is intended as an implementation detail of the memory
subsystem, which gets created and deleted by that
subsystem on demand; it makes no sense to create it
by hand on the command line
* it uses a pointer property 'host_ptr' which can't be
set on the command line
Mark the device as not user_creatable to avoid confusion.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1502807418-9994-1-git-send-email-peter.maydell@linaro.org
Reviewed-by: Thomas Huth <thuth@redhat.com>
We are not providing the required single-copy atomic semantics for
the 64-bit operation that is the 32-bit paired load.
At the same time, leave the entire 64-bit value in cpu_exclusive_val
and stop writing to cpu_exclusive_high. This means that we do not
have to re-assemble the 64-bit quantity when it comes time to store.
At the same time, drop a redundant temporary and perform all loads
directly into the cpu_exclusive_* globals.
Tested-by: Alistair Francis <alistair.francis@xilinx.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-id: 20170815145714.17635-3-richard.henderson@linaro.org
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When we switched NBD to use coroutines for qemu 2.9 (in particular,
commit a12a712a), we introduced a regression: if a server sends us
garbage (such as a corrupted magic number), we quit the read loop
but do not stop sending further queued commands, resulting in the
client hanging when it never reads the response to those additional
commands. In qemu 2.8, we properly detected that the server is no
longer reliable, and cancelled all existing pending commands with
EIO, then tore down the socket so that all further command attempts
get EPIPE.
Restore the proper behavior of quitting (almost) all communication
with a broken server: Once we know we are out of sync or otherwise
can't trust the server, we must assume that any further incoming
data is unreliable and therefore end all pending commands with EIO,
and quit trying to send any further commands. As an exception, we
still (try to) send NBD_CMD_DISC to let the server know we are going
away (in part, because it is easier to do that than to further
refactor nbd_teardown_connection, and in part because it is the
only command where we do not have to wait for a reply).
Based on a patch by Vladimir Sementsov-Ogievskiy.
A malicious server can be created with the following hack,
followed by setting NBD_SERVER_DEBUG to a non-zero value in the
environment when running qemu-nbd:
| --- a/nbd/server.c
| +++ b/nbd/server.c
| @@ -919,6 +919,17 @@ static int nbd_send_reply(QIOChannel *ioc, NBDReply *reply, Error **errp)
| stl_be_p(buf + 4, reply->error);
| stq_be_p(buf + 8, reply->handle);
|
| + static int debug;
| + static int count;
| + if (!count++) {
| + const char *str = getenv("NBD_SERVER_DEBUG");
| + if (str) {
| + debug = atoi(str);
| + }
| + }
| + if (debug && !(count % debug)) {
| + buf[0] = 0;
| + }
| return nbd_write(ioc, buf, sizeof(buf), errp);
| }
Reported-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170814213426.24681-1-eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
As in the case of nbd_export_new(), bdrv_invalidate_cache() can be
called when migration is still in progress. In this case we are not
ready to tighten the shared permissions fenced by blk->disable_perm.
Defer to a VM state change handler.
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170815130740.31229-4-famz@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
The 093 throttling test submits twice as many requests as the throttle
limit in order to ensure that we reach the limit. The remaining
requests are left in-flight at the end of each test iteration.
Commit 452589b6b4 ("vl.c/exit: pause cpus
before closing block devices") exposed a hang in 093. This happens
because requests are still in flight when QEMU terminates but
QEMU_CLOCK_VIRTUAL time is frozen. bdrv_drain_all() hangs forever since
throttled requests cannot complete.
Step the clock at the end of each test iteration so in-flight requests
actually finish. This solves the hang and is cleaner than leaving
requests in flight.
Note that this could also be "fixed" by disabling throttling when drives
are closed in QEMU. That approach has two issues:
1. We must drain requests before disabling throttling, so the hang
cannot be easily avoided!
2. Any time QEMU disables throttling internally there is a chance that
malicious users can abuse the code path to bypass throttling limits.
Therefore it makes more sense to fix the test case than to modify QEMU.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-Id: <20170815130502.8736-1-stefanha@redhat.com>
Signed-off-by: Eric Blake <eblake@redhat.com>
The simpletrace.py script can pretty-print flight recorder ring buffers.
These are not full simpletrace binary trace files but just the end of a
trace file. There is no header and the event ID mapping information is
often unavailable since the ring buffer may have filled up and discarded
event ID mapping records.
The simpletrace.stp script that generates ring buffer traces uses the
same trace-events-all input file as simpletrace.py. Therefore both
scripts have the same global ordering of trace events. A dynamic event
ID mapping isn't necessary: just use the trace-events-all file as the
reference for how event IDs are numbered.
It is now possible to analyze simpletrace.stp ring buffers again using:
$ ./simpletrace.py trace-events-all path/to/ring-buffer
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170815084430.7128-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
This is a partial revert of commit
7f1b588f20 ("trace: emit name <-> ID
mapping in simpletrace header"), which broke the SystemTap flight
recorder because event mapping records may not be present in the ring
buffer when the trace is analyzed. This means simpletrace.py
--no-header does not know the event ID mapping needed to pretty-print
the trace.
Instead of numbering events dynamically, use a static event ID mapping
as dictated by the event order in the trace-events-all file.
The simpletrace.py script also uses trace-events-all so the next patch
will fix the simpletrace.py --no-header option to take advantage of this
knowledge.
Cc: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170815084430.7128-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
# gpg: Signature made Tue 15 Aug 2017 11:50:36 BST
# gpg: using RSA key 0xCA35624C6A9171C6
# gpg: Good signature from "Fam Zheng <famz@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 5003 7CB7 9706 0F76 F021 AD56 CA35 624C 6A91 71C6
* remotes/famz/tags/build-and-test-pull-request:
docker: add centos7 image
docker: install more packages on CentOS to extend code coverage
docker: add Xen libs to centos6 image
docker: use one package per line in CentOS config
Makefile: Let "make check-help" work without running ./configure
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Currently if you do "make check-help" in a fresh checkout, only an error
is printed which is not nice:
$ make check-help V=1
cc -nostdlib -o check-help.mo
cc: fatal error: no input files
compilation terminated.
rules.mak:115: recipe for target 'check-help.mo' failed
make: *** [check-help.mo] Error 1
Move the config-host.mak condition into the body of
tests/Makefile.include and always include the rule for check-help.
Reported-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
Message-Id: <20170810085025.14076-1-famz@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Fam Zheng <famz@redhat.com>
This reverts a change that replaced the "rm -f" command with the
undefined variable RM (expected to be set by make), which caused the
"make clean" command to fail for an s390 target:
make[1]: Entering directory '/usr/src/qemu/build/pc-bios/s390-ccw'
rm -f *.timestamp
*.o *.d *.img *.elf *~ *.a
/bin/sh: *.o: command not found
Makefile:39: recipe for target 'clean' failed
make[1]: *** [clean] Error 127
make[1]: Leaving directory '/usr/src/qemu/build/pc-bios/s390-ccw'
Makefile:489: recipe for target 'clean' failed
make: *** [clean] Error 1
Fixes: 3e4415a751 ("pc-bios/s390-ccw: Add core files for the network
bootloading program")
Signed-off-by: Eric Farman <farman@linux.vnet.ibm.com>
Message-Id: <20170814204450.24118-2-farman@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
# gpg: Signature made Mon 14 Aug 2017 13:32:10 BST
# gpg: using RSA key 0xEF04965B398D6211
# gpg: Good signature from "Jason Wang (Jason Wang on RedHat) <jasowang@redhat.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 215D 46F4 8246 689E C77F 3562 EF04 965B 398D 6211
* remotes/jasowang/tags/net-pull-request:
qemu-doc: Mention host_net_add/-remove in the deprecation chapter
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The two HMP commands host_net_add and -remove have recently been
marked as deprecated, too, so we should now mention them in the
chapter on deprecated features.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Jason Wang <jasowang@redhat.com>
QEMU currently abort()s if the user tries to specify the mmio_interface
device without parameters:
x86_64-softmmu/qemu-system-x86_64 -nographic -device mmio_interface
qemu-system-x86_64: /home/thuth/devel/qemu/util/error.c:57: error_setv:
Assertion `*errp == ((void *)0)' failed.
Aborted (core dumped)
This happens because the realize function is trying to set the errp
twice in this case. After setting an error, the realize function
should immediately return instead.
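A hedged sketch of that pattern, with hypothetical device names around
QEMU's error_setg() interface:

    typedef struct Error Error;
    void error_setg(Error **errp, const char *fmt, ...);  /* QEMU error API */

    static void example_realize(void *host_ptr, Error **errp)
    {
        if (!host_ptr) {
            error_setg(errp, "'host_ptr' property must be set");
            return;   /* falling through would set *errp a second time */
        }
        /* ... continue with the rest of realize ... */
    }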
Signed-off-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The check script contains a commented out root user requirement,
probably because of its xfstests heritage. This requirement doesn't
apply to qemu-iotests, so it better be gone.
Signed-off-by: Cleber Rosa <crosa@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
The variables FULL_MKFS_OPTIONS and FULL_MOUNT_OPTIONS are commented
out, never used, and even refer to functions that do not exist. The last
time these were touched was around 8 years ago, so I guess it's safe
to assume outputting such information on test execution is no longer on
the radar.
Signed-off-by: Cleber Rosa <crosa@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Although this function is used, its implementation does nothing
besides echoing a variable name. There's no need to wrap this
functionality in a function, and based on the one usage it has, it's
not even required to adhere to a convention or code style.
Signed-off-by: Cleber Rosa <crosa@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
While Andrew S. Tanenbaum has a point by saying "Never underestimate the
bandwidth of a station wagon full of tapes hurtling down the highway",
we don't support that way of transportation in QEMU yet, so replace the
typo with the correct word "vlan".
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Laurent Vivier <lvivier@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
While Andrew S. Tanenbaum has a point by saying "Never underestimate the
bandwidth of a station wagon full of tapes hurtling down the highway",
we don't support that way of transportation in QEMU yet, so replace the
typo with the correct word "vlan".
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1502365466-19432-1-git-send-email-thuth@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Currently, at least x86_64 and s390x support building with --disable-tcg.
Instead of forcing tcg (which causes the test to fail on such builds),
allow using kvm as well.
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
185 can sometimes produce wrong output like this:
185 2s ... - output mismatch (see 185.out.bad)
--- /work/src/qemu/master/tests/qemu-iotests/185.out 2017-07-14 \
15:14:29.520343805 +0300
+++ 185.out.bad 2017-08-07 16:51:02.231922900 +0300
@@ -37,7 +37,7 @@
{"return": {}}
{"return": {}}
{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, \
"event": "SHUTDOWN", "data": {"guest": false}}
-{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, \
"event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
"len": 4194304, "offset": 4194304, "speed": 65536, "type": \
"mirror"}}
+{"timestamp": {"seconds": TIMESTAMP, "microseconds": TIMESTAMP}, \
"event": "BLOCK_JOB_CANCELLED", "data": {"device": "disk", \
"len": 0, "offset": 0, "speed": 65536, "type": "mirror"}}
=== Start backup job and exit qemu ===
Failures: 185
Failed 1 of 1 tests
This is because, under heavy load, the quit can happen before the first
iteration of the mirror request has occurred. To make sure we've had
time to iterate, let's just add a sleep for 0.5 seconds before quitting.
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It is reported that on Windows Subsystem for Linux, ofd operations fail
with -EINVAL. In other words, a QEMU binary built with system headers that
export F_OFD_SETLK doesn't necessarily run in an environment that
actually supports it:
$ qemu-system-aarch64 ... -drive file=test.vhdx,if=none,id=hd0 \
-device virtio-blk-pci,drive=hd0
qemu-system-aarch64: -drive file=test.vhdx,if=none,id=hd0: Failed to unlock byte 100
qemu-system-aarch64: -drive file=test.vhdx,if=none,id=hd0: Failed to unlock byte 100
qemu-system-aarch64: -drive file=test.vhdx,if=none,id=hd0: Failed to lock byte 100
As a matter of fact this is not WSL specific. It can happen when running
a QEMU compiled against a newer glibc on an older kernel, such as in
a containerized environment.
Let's do a runtime check to cope with that.
Reported-by: Andrew Baumann <Andrew.Baumann@microsoft.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Build time check of OFD lock is not sufficient and can cause image open
errors when the runtime environment doesn't support it.
Add a helper function to probe it at runtime, additionally. Also provide
a qemu_has_ofd_lock() for callers to check the status.
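A minimal, self-contained sketch of what such a runtime probe can look
like (qemu_has_ofd_lock() is the helper named above; the probing details
below are illustrative rather than the code in the tree):

    #define _GNU_SOURCE
    #include <fcntl.h>
    #include <stdbool.h>
    #include <unistd.h>

    static bool probe_ofd_lock(void)
    {
    #ifdef F_OFD_GETLK
        struct flock fl = {
            .l_whence = SEEK_SET,
            .l_start  = 0,
            .l_len    = 0,
            .l_type   = F_WRLCK,
        };
        int fd = open("/dev/null", O_RDWR);
        bool ok = false;

        if (fd >= 0) {
            /* On a kernel without OFD support this fails with EINVAL
             * even though the build-time headers define F_OFD_GETLK. */
            ok = (fcntl(fd, F_OFD_GETLK, &fl) == 0);
            close(fd);
        }
        return ok;
    #else
        return false;
    #endif
    }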
Signed-off-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
It's been #if 0'd since its introduction in 2006, commit 585f8587.
We can revive dead code if we need it, but in the meantime, it has
bit-rotted (for example, not checking for failure in bdrv_getlength()).
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Commit b43671f8 accidentally broke run_test.sh within tests/multiboot,
due to a subtle change in whitespace.
These two commands produce the same output (at least, for a sane $IFS
of space-tab-newline):
echo -e "...$@..."
echo -e "...$*..."
But that's only because echo inserts spaces between multiple arguments
(the $@ case), while the $* form gives a single argument to echo with
the spaces already present.
But when converting to printf %b, there are no automatic spaces between
multiple arguments, so we HAVE to use $*.
It doesn't help that run_test.sh isn't part of 'make check'.
Signed-off-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Just a single fix for an annoying regression introduced in 2.9 when fixing
CVE-2016-9602.
# gpg: Signature made Thu 10 Aug 2017 13:40:28 BST
# gpg: using DSA key 0x02FC3AEB0101DBC2
# gpg: Good signature from "Greg Kurz <groug@kaod.org>"
# gpg: aka "Greg Kurz <groug@free.fr>"
# gpg: aka "Greg Kurz <gkurz@linux.vnet.ibm.com>"
# gpg: aka "Gregory Kurz (Groug) <groug@free.fr>"
# gpg: aka "[jpeg image of size 3330]"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 2BD4 3B44 535E C0A7 9894 DBA2 02FC 3AEB 0101 DBC2
* remotes/gkurz/tags/for-upstream:
9pfs: local: fix fchmodat_nofollow() limitations
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This function has to ensure it doesn't follow a symlink that could be used
to escape the virtfs directory. This could be easily achieved if fchmodat()
on linux honored the AT_SYMLINK_NOFOLLOW flag as described in POSIX, but
it doesn't. There was an attempt to implement a new fchmodat2() syscall
with the correct semantics:
https://patchwork.kernel.org/patch/9596301/
but it didn't gain much momentum. Also it was suggested to look at an O_PATH
based solution in the first place.
The current implementation covers most use-cases, but it notably fails if:
- the target path has access rights equal to 0000 (openat() returns EPERM),
=> once you've done chmod(0000) on a file, you can never chmod() again
- the target path is a UNIX domain socket (openat() returns ENXIO)
=> bind() of UNIX domain sockets fails if the file is on 9pfs
The solution is to use O_PATH: openat() now succeeds in both cases, and we
can ensure the path isn't a symlink with fstat(). The associated entry in
"/proc/self/fd" can hence be safely passed to the regular chmod() syscall.
The previous behavior is kept for older systems that don't have O_PATH.
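A simplified, self-contained sketch of the O_PATH approach described
above (error handling trimmed; this is not the exact hw/9pfs code):

    #define _GNU_SOURCE
    #include <errno.h>
    #include <fcntl.h>
    #include <stdio.h>
    #include <sys/stat.h>
    #include <unistd.h>

    static int fchmodat_nofollow_sketch(int dirfd, const char *name,
                                        mode_t mode)
    {
        char procpath[64];
        struct stat st;
        int fd, ret;

        /* O_PATH succeeds even for mode-0000 files and UNIX sockets. */
        fd = openat(dirfd, name, O_PATH | O_NOFOLLOW | O_CLOEXEC);
        if (fd < 0) {
            return -1;
        }
        /* With O_NOFOLLOW, O_PATH opens the symlink itself; refuse it. */
        if (fstat(fd, &st) < 0 || S_ISLNK(st.st_mode)) {
            close(fd);
            errno = ELOOP;
            return -1;
        }
        /* chmod() through /proc/self/fd resolves to the real file. */
        snprintf(procpath, sizeof(procpath), "/proc/self/fd/%d", fd);
        ret = chmod(procpath, mode);
        close(fd);
        return ret;
    }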
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Zhi Yong Wu <zhiyong.wu@ucloud.cn>
Acked-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
ppc patch queue 2017-08-09
This series contains a number of bugfixes for ppc and related
machines, for the qemu-2.10 release. Some are true regressions,
others are serious enough, and non-invasive enough to fix, that it's
worth putting them in 2.10 this late.
# gpg: Signature made Wed 09 Aug 2017 07:31:33 BST
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.10-20170809:
spapr: Fix bug in h_signal_sys_reset()
spapr_drc: abort if object_property_add_child() fails
target/ppc: Add stub implementation of the PSSCR
target/ppc: Implement TIDR
ppc: fix double-free in cpu_post_load()
booke206: fix MAS update on tlb miss
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
pc, vhost: fixes for rc3
Fix up bugs and warnings in tests. Revert an experimental commit that I
put in by mistake: harmless but useless.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Wed 09 Aug 2017 02:23:17 BST
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
libqtest: always set up signal handler for SIGABRT
libvhost-user: quit when no more data received
net: fix -netdev socket,fd= for UDP sockets
Revert "cpu: add APIs to allocate/free CPU environment"
acpi-test: update expected DSDT files
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The unicast case in h_signal_sys_reset() seems to be broken:
rather than selecting the target CPU, it looks like it will pick
either the first CPU or fail to find one at all.
Fix it by using the search function rather than open coding the
search.
This was found by inspection; the code appears to be unused because
the Linux kernel only uses the broadcast target.
Signed-off-by: Sam Bobroff <sam.bobroff@au1.ibm.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
object_property_add_child() can only fail in two cases:
- the child already has a parent, which shouldn't happen since the DRC was
allocated a few lines above
- the parent already has a child with the same name, which would mean the
caller tries to create a DRC that already exists
In both cases, this is a QEMU bug and we should abort.
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
The PSSCR register added in POWER9 controls certain power saving mode
behaviours. Mostly, it's not relevant to TCG; however, because qemu
doesn't know about it yet, it doesn't synchronize the state with KVM,
and thus it doesn't get migrated.
To fix that, this adds a minimal stub implementation of the register.
This isn't complete, even to the extent that an implementation is
possible in TCG, just enough to get migration working. We need to
come back later and at least properly filter the various fields in the
register based on privilege level.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
This adds a trivial implementation of the TIDR register added in
POWER9. This isn't particularly important to qemu directly - it's
used by accelerator modules that we don't emulate.
However, since qemu isn't aware of it, its state is not synchronized
with KVM and therefore not migrated, which can be a problem.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Cédric Le Goater <clg@kaod.org>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
When running nested with KVM PR, ppc_set_compat() fails and QEMU crashes
because of "double free or corruption (!prev)". The crash happens because
error_report_err() has already called error_free().
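For illustration, the shape of the problem in terms of QEMU's Error API
(the ppc_set_compat_sketch() prototype below is hypothetical):

    #include <stdint.h>

    typedef struct Error Error;
    void error_report_err(Error *err);   /* QEMU: reports AND frees err */
    void ppc_set_compat_sketch(uint32_t compat_pvr, Error **errp);

    static void set_compat_and_report(uint32_t compat_pvr)
    {
        Error *local_err = NULL;

        ppc_set_compat_sketch(compat_pvr, &local_err);
        if (local_err) {
            error_report_err(local_err);  /* local_err is freed here    */
            /* error_free(local_err);       a second free is the crash  */
        }
    }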
Signed-off-by: Greg Kurz <groug@kaod.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
When a TLB instruction miss happens, rw is set to 0 at the bottom
of cpu_ppc_handle_mmu_fault, which causes the MAS update function to miss
the SAS and TS bits in MAS6 and MAS1 in booke206_update_mas_tlb_miss.
Just calling booke206_update_mas_tlb_miss with rw = 2 solves the issue.
Signed-off-by: KONRAD Frederic <frederic.konrad@adacore.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Currently abort handlers only work for the first test function
in a testcase, because the list of abort handlers is not properly
cleared when qtest_quit() is called.
qtest_quit() only deletes the kill_qemu_hook but doesn't completely
clear the abrt_hooks list. The effect is that abrt_hooks.is_setup is
never set to false and in a following test the abrt_hooks list is not
initialized and setup_sigabrt_handler() is not called.
One way to solve this is to clear the list in qtest_quit(), but
that means only asserts between qtest_start and qtest_quit will
be caught by the abort handler.
We can make abort handlers work in all cases if we always setup the
signal handler for SIGABRT in qtest_init.
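A rough sketch of that approach (simplified; the hook bookkeeping is
omitted and names are illustrative rather than the exact libqtest code):

    #include <signal.h>
    #include <string.h>

    static void sigabrt_handler(int signo)
    {
        (void)signo;
        /* kill the spawned QEMU instance and run registered abrt hooks */
    }

    static void setup_sigabrt_handler(void)
    {
        struct sigaction sigact;

        memset(&sigact, 0, sizeof(sigact));
        sigact.sa_handler = sigabrt_handler;
        sigact.sa_flags = SA_RESETHAND;
        sigemptyset(&sigact.sa_mask);
        sigaction(SIGABRT, &sigact, NULL);
    }

    /* Call this from qtest_init() for every test, instead of relying on
     * abrt_hooks state that qtest_quit() never fully resets. */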
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
End processing of messages when VHOST_USER_NONE
is received.
Without this we run into a vubr_panic() call and get
"PANIC: Unhandled request: 0"
Signed-off-by: Jens Freimann <jfreiman@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
This patch fixes -netdev socket,fd= for UDP sockets.
Currently -netdev socket,fd=<...> results in:
qemu: error: specified mcastaddr "127.0.0.1" (0x7f000001) does not
contain a multicast address
qemu-system-x86_64: -netdev
socket,id=n1,fd=3: Device 'socket' could not be initialized
To fix this we need to allow specifying multicast and fd arguments
for the same netdev. With this the user can specify "-netdev
fd=3,mcast=<IP:port>"
Cc: Jason Wang <jasowang@redhat.com>
Fixes: 3d830459b1
Signed-off-by: Jens Freimann <jfreimann@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
These days, many programs are including a bug-reporting address,
or better yet, a link to the project web site, at the tail of
their --help output. However, we were not very consistent at
doing so: only qemu-nbd and qemu-ga mentioned anything, with the
latter pointing to an individual person instead of the project.
Add a new #define that sets up a uniform string, mentioning both
bug reporting instructions and overall project details, and which
a downstream vendor could tweak if they want bugs to go to a
downstream database. Then use it in all of our binaries which
have --help output.
The canned text intentionally references http:// instead of https://
because our https website currently causes certificate errors in
some browsers. That can be tweaked later once we have resolved the
web site issues.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20170803163353.19558-5-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
'amend' and 'create' were not listed alphabetically; hoist them
earlier. Separate the @end table block to make it easier to
copy-and-paste the addition of future sub-commands.
Signed-off-by: Eric Blake <eblake@redhat.com>
Message-Id: <20170803163353.19558-2-eblake@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Block layer patches for 2.10.0-rc2
# gpg: Signature made Tue 08 Aug 2017 14:56:15 BST
# gpg: using RSA key 0x7F09B272C88F2FD6
# gpg: Good signature from "Kevin Wolf <kwolf@redhat.com>"
# Primary key fingerprint: DC3D EB15 9A9A F95D 3D74 56FE 7F09 B272 C88F 2FD6
* remotes/kevin/tags/for-upstream:
block/nfs: fix mutex assertion in nfs_file_close()
qemu-iotests: Test reopen between read-only and read-write
qemu-io: Allow reopen read-write
block: Set BDRV_O_ALLOW_RDWR during rw reopen
block: Allow reopen rw without BDRV_O_ALLOW_RDWR
block: Fix order in bdrv_replace_child()
parallels: drop check that bdrv_truncate() is working
parallels: respect error code of bdrv_getlength() in allocate_clusters()
block: respect error code from bdrv_getlength in handle_aiocb_write_zeroes
vmdk: Fix error handling/reporting of vmdk_check
block/null: Remove 'filename' option
block: drop bdrv_set_key from BlockDriver
block/vhdx: check error return of bdrv_truncate()
block/vhdx: check error return of bdrv_flush()
block/vhdx: check for offset overflow to bdrv_truncate()
block/vhdx: check error return of bdrv_getlength()
quorum: Set sectors-count to 0 when reporting a flush error
qemu-iotests/109: Fix lock race condition
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Commit c096358e74 introduced assertion
checks for when qemu_mutex() functions are called without the
corresponding qemu_mutex_init() having initialized the mutex.
This uncovered a latent bug in qemu's nfs driver - in
nfs_client_close(), the NFSClient structure is overwritten with zeros,
prior to the mutex being destroyed.
Go ahead and destroy the mutex in nfs_client_close(), and change where
we call qemu_mutex_init() so that it is correctly balanced.
There are also a couple of memory leaks obscured by the memset, so this
fixes those as well.
Finally, we should be able to get rid of the memset(), as it isn't
necessary.
Cc: qemu-stable@nongnu.org
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Peter Lieven <pl@kamp.de>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This serves as a regression test for the bugs that were just fixed for
bdrv_reopen() between read-only and read-write mode.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
This allows qemu-iotests to test the switch between read-only and
read-write mode for block devices.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reopening an image should be consistent with opening it, so we should
set BDRV_O_ALLOW_RDWR for any image that is reopened read-write like in
bdrv_open_inherit().
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
BDRV_O_ALLOW_RDWR is a flag that tells whether qemu can internally
reopen a node read-write temporarily because the user requested
read-write for the top-level image, but qemu decided that read-only is
enough for this node (a backing file).
bdrv_reopen() is different, it is also used for cases where the user
changed their mind and wants to update the options. There is no reason
to forbid making a node read-write in that case.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Commit 8ee03995 refactored the code incorrectly and broke the release of
permissions on the old BDS. Instead of changing the permissions to the
new required values after removing the old BDS from the list of
children, it only re-obtains the permissions it already had.
Change the order of operations so that the old BDS is removed again
before calculating the new required permissions.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
This would actually be strange and error-prone. If truncate() fails
nowadays, there is something fatally wrong. Let's check for that during
the actual work.
The only fallback case is when the file is not zero-initialized. In this
case we should switch to preallocation via fallocate().
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Markus Armbruster <armbru@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The original idea behind the code in question was the following: we have
failed to write zeroes with fallocate(FALLOC_FL_ZERO_RANGE) as the simplest
approach and via fallocate(FALLOC_FL_PUNCH_HOLE)/fallocate(0). The only
remaining chance is that the request extends beyond the end of the file.
Thus we should calculate the file length and respect the error code from
that operation.
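A hedged sketch of the resulting pattern (bdrv_getlength() is the block
layer's length query; the surrounding function and its return
conventions here are illustrative):

    #include <errno.h>
    #include <stdint.h>

    typedef struct BlockDriverState BlockDriverState;
    int64_t bdrv_getlength(BlockDriverState *bs);   /* QEMU block layer */

    static int write_zeroes_beyond_eof_sketch(BlockDriverState *bs,
                                              int64_t offset)
    {
        int64_t len = bdrv_getlength(bs);

        if (len < 0) {
            return len;    /* propagate -errno, don't treat it as a size */
        }
        if (offset >= len) {
            return 0;      /* request is entirely beyond EOF (sketch)    */
        }
        return -ENOTSUP;   /* let the caller fall back to a plain write  */
    }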
Signed-off-by: Denis V. Lunev <den@openvz.org>
CC: Markus Armbruster <armbru@redhat.com>
CC: Kevin Wolf <kwolf@redhat.com>
CC: Max Reitz <mreitz@redhat.com>
CC: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Errors from the callees must be captured and propagated to our caller;
ensure this for both find_extent() and bdrv_getlength().
Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
This option was only added to allow 'null-co://' and 'null-aio://' as
filenames; its value never served any actual purpose and was ignored.
Nevertheless it was accepted as '-drive driver=null,filename=foo'.
The correct way to enable the protocol prefixes (and that without adding
a useless -drive option) is implementing .bdrv_parse_filename. This is
what this patch does.
Technically, this is an incompatible change, but the null block driver
is only used for benchmarking, testing and debugging, and an option
without effect isn't likely to be used by anyone anyway, so no bad
effects are to be expected.
Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
VHDX uses uint64_t types for most offsets, following the VHDX spec.
However, bdrv_truncate() takes an int64_t value for the truncating
offset. Check for overflow before calling bdrv_truncate().
While we are here, replace the bit shifting with QEMU_ALIGN_UP as well.
N.B.: For a compliant image this is not an issue, as the maximum VHDX
image size is defined per the spec to be 64TB.
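A tiny self-contained sketch of the guard being added (names are
illustrative):

    #include <errno.h>
    #include <stdint.h>

    /* VHDX tracks offsets as uint64_t, but the truncate interface takes
     * a signed 64-bit offset, so reject anything that cannot fit. */
    static int check_truncate_offset(uint64_t new_size)
    {
        if (new_size > INT64_MAX) {
            return -EINVAL;   /* would overflow the int64_t parameter */
        }
        return 0;             /* safe to pass (int64_t)new_size along */
    }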
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Calls to bdrv_getlength() were not checking for error. In vhdx.c, this
can lead to truncating an image file, so it is a definite bug. In
vhdx-log.c, the path for improper behavior is less clear, but it is best
to check in any case.
Some minor code movement of the log_guid initialization, as well.
Reported-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The QUORUM_REPORT_BAD event has fields to report the sector in which
the error was detected and the number of affected sectors starting
from that one. This is important for read and write errors, but not
for flush errors.
For flush errors the current code reports the total size of the disk
image. That is however not useful information in this case. Moreover,
the bdrv_getlength() call can fail, and there's no good way of
handling that failure.
Since we're reporting useless information and we cannot even guarantee
to do it in a consistent way, this patch changes the code to report 0
instead in all cases.
Reported-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Alberto Garcia <berto@igalia.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
A race condition is currently present between the cleanup attempt of
the QEMU process and the execution of qemu-img. The actual (bad)
output is:
-Warning: Image size mismatch!
-Images are identical.
+qemu-img: Could not open '<build_dir>/tests/qemu-iotests/scratch/t.raw': Failed to get "consistent read" lock
+Is another process using the image?
A KILL signal is sent to the QEMU process, but qemu-img may begin to
run before the QEMU process is really gone. qemu-img will then
attempt to open the TEST_IMG file before it can secure a lock on it.
This attempts a more graceful shutdown, and waits for the QEMU process
to exit.
Signed-off-by: Cleber Rosa <crosa@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
virtio: fix for rc2
It turns out there's a way to set up SHPC on Q35: just put
a PCI to PCI bridge behind a DMI to PCI one. Our _OSC is
thus incorrect.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Mon 07 Aug 2017 22:39:20 BST
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
cpu: add APIs to allocate/free CPU environment
hw/i386: allow SHPC for Q35 machine
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When emulating various SSE4.1 instructions such as pinsrd, the address
of a memory operand is computed without allowing for the 8-bit
immediate operand located after the memory operand, meaning that the
memory operand uses the wrong address in the case where it is
rip-relative. This patch adds the required rip_offset setting for
those instructions, so fixing some GCC test failures (13 in the gcc
testsuite in my GCC 6-based testing) when testing with a default CPU
setting enabling those instructions.
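A small standalone illustration of the addressing arithmetic involved
(all values below are made up): the rip-relative displacement is
relative to the end of the whole instruction, so dropping the trailing
immediate byte shifts the computed operand address.

    #include <inttypes.h>
    #include <stdio.h>

    int main(void)
    {
        uint64_t insn_start = 0x401000;  /* made-up code address          */
        int len_before_imm  = 9;         /* prefixes+opcode+modrm+disp32  */
        int imm_bytes       = 1;         /* trailing 8-bit immediate      */
        int32_t disp        = 0x2000;    /* rip-relative displacement     */

        uint64_t wrong = insn_start + len_before_imm + disp;
        uint64_t right = insn_start + len_before_imm + imm_bytes + disp;

        printf("without rip_offset: 0x%" PRIx64 "\n", wrong);
        printf("with rip_offset:    0x%" PRIx64 "\n", right);
        return 0;
    }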
Signed-off-by: Joseph Myers <joseph@codesourcery.com>
Message-Id: <alpine.DEB.2.20.1708080041391.28702@digraph.polyomino.org.uk>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The LUN0 emulation is just that, an emulation for a non-existing
LUN0. So we should be returning LUN_NOT_SUPPORTED for any request
coming from any other LUN.
And we should be aborting unhandled commands with INVALID OPCODE,
not LUN NOT SUPPORTED.
Signed-off-by: Hannes Reinecke <hare@suse.com>
Message-Id: <1501835795-92331-4-git-send-email-hare@suse.de>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Building QEMU on fedora26 with the latest gcc package fails:
CC ppc64-softmmu/target/ppc/kvm.o
In file included from include/sysemu/hw_accel.h:16:0,
from target/ppc/kvm.c:31:
target/ppc/kvm.c: In function ‘kvmppc_booke_watchdog_enable’:
include/sysemu/kvm.h:449:35: error: ‘args_tmp[i]’ may be used uninitialized
in this function [-Werror=maybe-uninitialized]
cap.args[i] = args_tmp[i]; \
^
target/ppc/kvm.c: In function ‘kvmppc_set_papr’:
include/sysemu/kvm.h:449:35: error: ‘args_tmp[i]’ may be used uninitialized
in this function [-Werror=maybe-uninitialized]
cc1: all warnings being treated as errors
$ rpm -q gcc
gcc-7.1.1-3.fc26.ppc64le
The compiler should obviously optimize this code away when no extra
argument is passed to kvm_vm_enable_cap() and kvm_vcpu_enable_cap(),
but it doesn't. This bug should be fixed one day in gcc, but we can
also change our code pattern so that we don't hit the issue anymore.
We work around this by using memcpy() instead of open-coding the copy.
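A hedged sketch of the shape of that workaround (the real macro lives in
include/sysemu/kvm.h; the names and bounds handling here are simplified):

    #include <stdint.h>
    #include <string.h>

    struct enable_cap_sketch {
        uint64_t args[4];
    };

    /* Copy the variadic capability arguments with one bounded memcpy()
     * instead of an element-by-element loop, which is what triggered
     * the spurious -Wmaybe-uninitialized warning from gcc 7. */
    #define ENABLE_CAP_COPY_ARGS(cap, ...)                               \
        do {                                                             \
            uint64_t args_tmp[] = { __VA_ARGS__ };                       \
            size_t n = sizeof(args_tmp) / sizeof(args_tmp[0]);           \
            if (n > sizeof((cap).args) / sizeof((cap).args[0])) {        \
                n = sizeof((cap).args) / sizeof((cap).args[0]);          \
            }                                                            \
            memcpy((cap).args, args_tmp, n * sizeof(args_tmp[0]));       \
        } while (0)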
Signed-off-by: Greg Kurz <groug@kaod.org>
Message-Id: <150210580404.1343.7325713896658799315.stgit@bahia.lan>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
This reverts commit a59629fcc6.
This is not needed anymore because the IOThread mutex is not
"magic" anymore (it need not kick the CPU thread) and also because
fork callbacks are only enabled at the very beginning of
QEMU's execution.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Because of -daemonize, system mode QEMU sometimes needs to fork() and
keep RCU enabled in the child. However, there is a possible deadlock
with synchronize_rcu:
- the CPU thread is inside a RCU critical section and wants to take
the BQL in order to do MMIO
- the monitor thread, which owns the BQL, calls rcu_init_lock
which tries to take the rcu_sync_lock
- the call_rcu thread has taken rcu_sync_lock in synchronize_rcu, but
synchronize_rcu needs the CPU thread to end the critical section
before returning.
This cannot happen for user-mode emulation, because it does not have
a BQL.
To fix it, assume that system mode QEMU only forks in preparation for
exec (except when daemonizing) and disable pthread_atfork as soon as
the double fork has happened.
Reported-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Tested-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Unmask previously masked SHPC feature in _OSC method.
Signed-off-by: Aleksandr Bezzubikov <zuban32s@gmail.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
There are trace probes in bdrv_co_readv|writev; however, the
block drivers are being gradually moved over to using the
bdrv_co_preadv|pwritev functions instead. As a result some
block drivers miss the current probes. Move the probes
into bdrv_co_preadv|pwritev instead, so that they are triggered
by more (all?) I/O code paths.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170804105036.11879-1-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
MIPS patches 2017-08-03
Changes:
KVM T&E segment support for TCG
malta: leave space for the bootmap after the initrd
Apply CP0.PageMask before writing into TLB entry
Fix fallout from indirect branch optimisation
# gpg: Signature made Thu 03 Aug 2017 15:32:59 BST
# gpg: using RSA key 0x2238EB86D5F797C2
# gpg: Good signature from "Yongbok Kim <yongbok.kim@imgtec.com>"
# gpg: WARNING: This key is not certified with sufficiently trusted signatures!
# gpg: It is not certain that the signature belongs to the owner.
# Primary key fingerprint: 8600 4CF5 3415 A5D9 4CFA 2B5C 2238 EB86 D5F7 97C2
* remotes/yongbok/tags/mips-20170803:
target/mips: Fix RDHWR CC with icount
target/mips: Drop redundant gen_io_start/stop()
target/mips: Use BS_EXCP where interrupts are expected
target-mips: apply CP0.PageMask before writing into TLB entry
mips: Add KVM T&E segment support for TCG
mips: Improve segment defs for KVM T&E guests
mips/malta: leave space for the bootmap after the initrd
target-mips: Don't stop on [d]mtc0 DESAVE/KScratch
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
virtio: fix for rc2
Looks like the constant stream of additions of vhost-user devices is a
problem for some people who are concerned about external connections
from qemu. A per-device flag seems like overkill, but a single
configure flag seems like a sane way to support that, and it looks like
we need to do it before the release.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Thu 03 Aug 2017 13:57:57 BST
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
build-sys: add --disable-vhost-user
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
For a 64-bit ILP32 host, aligning to sizeof(long) is not enough.
Guess the minimum for any host is 8, as that covers uint64_t.
Qemu doesn't use a host long double or host vectors, except in
extremely limited circumstances.
Fixes a bus error for a sparc v8plus host.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Patch 85aa80813d changed the IF emitting the TST instruction,
but failed to change the ?: converting CMP to CMPEQ, so the
result of the TST is ignored.
Signed-off-by: Richard Henderson <rth@twiddle.net>
Learn to compile out vhost-user (net, scsi & upcoming users). Keep it
enabled by default on non-win32, which is assumed to be POSIX. Fail if
trying to enable it on win32.
When trying to make a vhost-user netdev, it gives the following error:
-netdev vhost-user,id=foo,chardev=chr-test: Parameter 'type' expects a netdev backend type
A similar error is given with the HMP/QMP monitors.
While at it, rename CONFIG_VHOST_NET_TEST to CONFIG_VHOST_USER_NET_TEST
since it's a vhost-user specific variable.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Cornelia Huck <cohuck@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
While parsing the DHCP options string in 'dhcp_decode', if an option's
length 'len' appears towards the end of the 'bp_vend' array, the ensuing
read could lead to an OOB memory access. Add a check to avoid it.
This is CVE-2017-11434.
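An illustrative, self-contained version of such a bounds check (option
tag values per RFC 1533; this is not the exact slirp code):

    #include <stddef.h>
    #include <stdint.h>

    static void dhcp_decode_sketch(const uint8_t *bp_vend, size_t vend_len)
    {
        const uint8_t *p = bp_vend;
        const uint8_t *p_end = bp_vend + vend_len;

        while (p < p_end && *p != 0xff) {   /* 0xff = END option */
            if (*p == 0x00) {               /* 0x00 = PAD, single byte */
                p++;
                continue;
            }
            if (p + 2 > p_end) {
                break;                      /* no room for the length byte */
            }
            uint8_t len = p[1];
            if (len > (size_t)(p_end - (p + 2))) {
                break;                      /* data would run past bp_vend */
            }
            /* ... handle option p[0] with payload p + 2, length len ... */
            p += 2 + len;
        }
    }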
Reported-by: Reno Robert <renorobert@gmail.com>
Signed-off-by: Prasad J Pandit <pjp@fedoraproject.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
With "-netdev user,id=net0,dns=1.2.3.4"
error was:
qemu-system-i386: -netdev user,id=net0,dns=1.2.3.4: Device 'user' could not be initialized
Error is now:
qemu-system-i386: -netdev user,id=net0,dns=1.2.3.4: DNS doesn't belong to network
Signed-off-by: Hervé Poussineau <hpoussin@reactos.org>
Signed-off-by: Samuel Thibault <samuel.thibault@ens-lyon.org>
With the pseries machine type, a negative core-id is not managed properly:
-1 gives an inaccurate error message ("core -1 already populated"),
and -2 crashes QEMU (core dump).
As a negative value seems to be invalid for any architecture,
instead of checking this in spapr_core_pre_plug() I think it's better
to check it in the generic part, core_prop_set_core_id().
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170802103259.25940-1-lvivier@redhat.com>
Reviewed-by: Greg Kurz <groug@kaod.org>
Reviewed-by: David Gibson <david@gibson.dropbear.id.au>
Reviewed-by: Eduardo Habkost <ehabkost@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
RDHWR CC reads the CPU timer like MFC0 CP0_Count, so with icount enabled
it must set can_do_io while it calls the helper to avoid the "Bad icount
read" error. It should also break out of the translation loop to ensure
that timer interrupts are immediately handled.
Fixes: 2e70f6efa8 ("Add instruction counter.")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
DMTC0 CP0_Cause does a redundant gen_io_start() and gen_io_end() pair,
even though this is done for all DMTC0 operations outside of the switch
statement. Remove these redundant calls.
Fixes: 5dc5d9f055 ("mips: more fixes to the MIPS interrupt glue logic")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Commit e350d8ca3a ("target/mips: optimize indirect branches") made
indirect branches able to directly find the next TB and jump straight to
it without breaking out of translated code and going around the main
execution loop. This breaks the assumption in target/mips/translate.c
that BS_STOP is sufficient to cause pending interrupts to be handled,
since interrupts are only checked in the main loop.
Fix a few of these assumptions by using gen_save_pc to update the saved
PC and using BS_EXCP instead of BS_STOP:
- [D]MFC0 CP0_Count may trigger a timer interrupt which should be
immediately handled.
- [D]MTC0 CP0_Cause may trigger an interrupt (but in fact translation
was only even being stopped in the DMTC0 case).
- [D]MTC0 CP0_<any> when icount is used, since it is assumed this could
potentially cause interrupts.
- EI may trigger an interrupt which was pending. I specifically hit
this case when running KVM nested in mipsel-softmmu. A timer
interrupt while the 2nd guest was executing is caught by KVM which
switches back to the normal Linux exception base and re-enables
interrupts with EI. Since the above commit QEMU doesn't leave
translated code until the nested KVM has already restored the KVM
exception base and returned to the 2nd guest, at which point it is
too late to check for pending interrupts and it gets stuck in an
infinite loop of unhandled interrupts.
Something similar was needed for ARM in commit b29fd33db5
("target/arm: use DISAS_EXIT for eret handling").
Fixes: e350d8ca3a ("target/mips: optimize indirect branches")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Richard Henderson <rth@twiddle.net>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
MIPS KVM trap & emulate guest kernels have a different segment layout
compared with traditional MIPS kernels, to allow both the user and
kernel code to run from the user address segment without repeatedly
trapping to KVM.
QEMU currently supports this layout only for KVM, but it's sometimes
useful to be able to run these kernels in QEMU on a PC, so enable it for
TCG too.
This also paves the way for MIPS KVM VZ support (which uses the normal
virtual memory layout) by abstracting whether user mode kernel segments
are in use.
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Reviewed-by: Richard Henderson <rth@twiddle.net>
[Yongbok Kim:
minor change]
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Improve the segment definitions used by get_physical_address() to yield
target_ulong types, e.g. 0xffffffff80000000 instead of 0x80000000. This
is in preparation for enabling emulation of MIPS KVM T&E segments in TCG
MIPS targets, which unlike KVM could potentially have 64-bit
target_ulong. In such a case the offset guest KSEG0 address ends up at
e.g. 0x000000008xxxxxxx instead of 0xffffffff8xxxxxxx.
This also allows the casts to int32_t that force sign extension to be
removed, which removes any confusion due to relational comparison of
unsigned (target_ulong) and signed (int32_t) types.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Paolo Bonzini <pbonzini@redhat.com>
Cc: kvm@vger.kernel.org
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Since commit 9768e2abf7 the initrd is loaded at the end of the low
memory to avoid clashing with the kernel relocation when kaslr is used.
However this in turn conflicts with the bootmap memory that the kernel
tries to place after initrd, but in low memory. The bootmap spans the
whole usable physical address space. The machine can have at most 2GiB
of memory, 256MiB of low memory mapped at 0x00000000, and 1792MiB of
high memory mapped at 0x90000000. The biggest bootmap therefore
corresponds to the addresses 0x00000000 -> 0xffffffff, which at 1 bit
per 4kiB page corresponds to 128kiB in memory.
Therefore reserve 128kiB after the initrd.
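As a worked example of that sizing (a sketch; the macro names are illustrative):
    #define BOOTMAP_SPAN  0x100000000ULL   /* 0x00000000 .. 0xffffffff  */
    #define PAGE_SIZE_4K  4096ULL          /* 1 bit of bootmap per page */

    size_t bootmap_bytes = BOOTMAP_SPAN / PAGE_SIZE_4K / 8;
    /* = 1048576 bits / 8 = 131072 bytes = 128 KiB reserved after the initrd */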
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Writing to the MIPS DESAVE register (and now the KScratch registers)
will stop translation, supposedly due to risk of execution mode
switches. However these registers are basically RW scratch registers
with no side effects so there is no risk of them triggering execution
mode changes.
Drop the bstate = BS_STOP for these registers for both mtc0 and dmtc0.
Fixes: 7a387fffce ("Add MIPS32R2 instructions, and generally straighten out the instruction decoding. This is also the first percent towards MIPS64 support.")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
testchardev2 is not a valid chardev id here. Use testchardev1
instead which has been created with chardev-add right before
the 'chardev-send-break' line.
And while we're at it, add the test-hmp.c file to the MAINTAINERS
file, too.
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 1501149097-19071-1-git-send-email-thuth@redhat.com
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Migration pull 2017-08-02
Just minor fixes for 2.10
# gpg: Signature made Wed 02 Aug 2017 14:55:21 BST
# gpg: using RSA key 0x0516331EBC5BFDE7
# gpg: Good signature from "Dr. David Alan Gilbert (RH2) <dgilbert@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 45F5 C71B 4A0C B7FB 977A 9FA9 0516 331E BC5B FDE7
* remotes/dgilbert/tags/pull-migration-20170802a:
io: fix qio_channel_socket_accept err handling
migration: fix comment disorder in RAMState
migration: fix small leaks
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When accept fails, we should set up errp with the reason. More
importantly, the caller may assume errp is non-NULL when an error
happens, and not setting errp may crash QEMU.
At the same time, move the trace_qio_channel_socket_accept_fail() call
after the if check on EINTR, for two reasons:
1. when EINTR happens, it's not really a fault (we should just try
again), so we should not log it as an "accept failure".
2. trace_*() functions may overwrite errno, losing the old value. We
need to either check errno before the trace_*() calls or preserve it
(see the sketch below).
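Illustrative sketch of the resulting pattern (not the actual qio_channel_socket_accept() code; names simplified):
 retry:
    fd = qemu_accept(listen_fd, (struct sockaddr *)&addr, &addrlen);
    if (fd < 0) {
        int err = errno;                       /* preserve errno first */
        if (err == EINTR) {
            goto retry;                        /* not a real failure */
        }
        trace_qio_channel_socket_accept_fail(ioc);
        error_setg_errno(errp, err, "Unable to accept connection");
        return NULL;
    }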
Signed-off-by: Peter Xu <peterx@redhat.com>
Message-Id: <1501666880-10159-3-git-send-email-peterx@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Spotted thanks to valgrind and tests/device-introspect-test:
==11711== 1 bytes in 1 blocks are definitely lost in loss record 6 of 14,537
==11711== at 0x4C2EB6B: malloc (vg_replace_malloc.c:299)
==11711== by 0x1E0CDBD8: g_malloc (gmem.c:94)
==11711== by 0x1E0E696E: g_strdup (gstrfuncs.c:363)
==11711== by 0x695693: migration_instance_init (migration.c:2226)
==11711== by 0x717C4B: object_init_with_type (object.c:344)
==11711== by 0x717E80: object_initialize_with_type (object.c:375)
==11711== by 0x7182EB: object_new_with_type (object.c:483)
==11711== by 0x718328: object_new (object.c:493)
==11711== by 0x4B8A29: qmp_device_list_properties (qmp.c:542)
==11711== by 0x4A9561: qmp_marshal_device_list_properties (qmp-marshal.c:1425)
==11711== by 0x819D4A: do_qmp_dispatch (qmp-dispatch.c:104)
==11711== by 0x819E82: qmp_dispatch (qmp-dispatch.c:131)
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-Id: <20170801160419.14180-1-marcandre.lureau@redhat.com>
Reviewed-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Juan Quintela <quintela@redhat.com>
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
pc, acpi, virtio: fixes, test speedup for rc1
Some fixes all over the place. Notably vhost-user gained a new message
to set endian-ness. Borderline for 2.10 but seems to be the only way to
fix legacy guests. Also pc tests are run on kvm now. Not a fix at all
but doesn't touch qemu itself, so I merged it since I had to run these a
lot and I just got tired of waiting for these to finish.
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
# gpg: Signature made Tue 01 Aug 2017 22:36:47 BST
# gpg: using RSA key 0x281F0DB8D28D5469
# gpg: Good signature from "Michael S. Tsirkin <mst@kernel.org>"
# gpg: aka "Michael S. Tsirkin <mst@redhat.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 0270 606B 6F3C DF3D 0B17 0970 C350 3912 AFBE 8E67
# Subkey fingerprint: 5D09 FD08 71C8 F85B 94CA 8A0D 281F 0DB8 D28D 5469
* remotes/mst/tags/for_upstream:
pc: acpi: force FADT rev1 for 440fx based machine types
pc: make 'pc.rom' readonly when machine has PCI enabled
vhost-user: fix watcher need be removed when vhost-user hotplug
tests/bios-tables-test: Compiler warning fix
accel: cleanup error output
intel_iommu: use access_flags for iotlb
intel_iommu: fix iova for pt
vhost-user: fix legacy cross-endian configurations
vhost: fix a memory leak
tests: switch pxe and vm gen id tests to use kvm
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
w2k used to boot on QEMU until the FADT revision was bumped to rev3
(commit 77af8a2b "hw/i386: Use Rev3 FADT (ACPI 2.0) instead of Rev1 to improve guest OS support.").
Keep the PC machine at rev1 to remain compatible, and keep Q35 at rev3,
where w2k isn't supported anyway, so that OSX can run as well.
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Tested-by: John Arbuckle <programmingkidx@gmail.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Looking at BIOS ROM mapping in QEMU, it seems that only isapc
(i.e. a machine without PCI) requires the ROM to be mapped RW; in all
other cases the BIOS is mapped RO. Do the same for the option ROM
'pc.rom' when the machine has PCI enabled.
As a useful side effect, the pc.rom MemoryRegion stops being put in the
vhost memory map (it is filtered out by vhost_section()), which reduces
the number of entries by 1.
Coincidentally it fixes migration failure reported in
"[PATCH V2] vhost: fix a migration failed because of vhost region merge"
where following destination CLI with /sys/module/vhost/parameters/max_mem_regions = 8
export DIMMSCOUNT=6
QEMU -enable-kvm \
-netdev type=tap,id=guest0,vhost=on,script=no,vhostforce \
-device virtio-net-pci,netdev=guest0 \
-m 256,slots=256,maxmem=2G \
`i=0; while [ $i -lt $DIMMSCOUNT ]; do echo \
"-object memory-backend-ram,id=m$i,size=128M \
-device pc-dimm,id=d$i,memdev=m$i"; i=$(($i + 1)); \
done`
will fail to startup with error:
"-device pc-dimm,id=d5,memdev=m5: a used vhost backend has no free memory slots left"
while it's possible to add the 6th DIMM during hotplug
on source.
The issue is caused by the fact that the number of entries in the vhost
map is bigger by 1 entry when -device is processed than after the guest
boots up, and that offending entry belongs to 'pc.rom'. vhost does not
intend to do IO in the ROM range, so making it RO hides the region from
vhost and makes the number of entries in the vhost memory map at
-device/machine_done time match the number of entries after the guest
boots.
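A sketch of the idea (simplified; the surrounding code in pc_memory_init() is omitted and the variable name is assumed):
    memory_region_init_ram(option_rom_mr, NULL, "pc.rom", PC_ROM_SIZE,
                           &error_fatal);
    if (pcmc->pci_enabled) {
        /* an RO option ROM is filtered out of the vhost memory map
         * by vhost_section(), saving one slot */
        memory_region_set_readonly(option_rom_mr, true);
    }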
Signed-off-by: Igor Mammedov <imammedo@redhat.com>
Reported-by: Peng Hao <peng.hao2@zte.com.cn>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
"nc" is freed after hotplug vhost-user, but the watcher is not removed.
The QEMU crash when the watcher access the "nc" when socket disconnects.
Program received signal SIGSEGV, Segmentation fault.
#0 object_get_class (obj=obj@entry=0x2) at qom/object.c:750
#1 0x00007f9bb4180da1 in qemu_chr_fe_disconnect (be=<optimized out>) at chardev/char-fe.c:372
#2 0x00007f9bb40d1100 in net_vhost_user_watch (chan=<optimized out>, cond=<optimized out>, opaque=<optimized out>) at net/vhost-user.c:188
#3 0x00007f9baf97f99a in g_main_context_dispatch () from /usr/lib64/libglib-2.0.so.0
#4 0x00007f9bb41d7ebc in glib_pollfds_poll () at util/main-loop.c:213
#5 os_host_main_loop_wait (timeout=<optimized out>) at util/main-loop.c:261
#6 main_loop_wait (nonblocking=nonblocking@entry=0) at util/main-loop.c:515
#7 0x00007f9bb3e266a7 in main_loop () at vl.c:1917
#8 main (argc=<optimized out>, argv=<optimized out>, envp=<optimized out>) at vl.c:4786
Signed-off-by: Yunjian Wang <wangyunjian@huawei.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
gcc 7.1.1 in fedora 26 moans about the:
tables = g_new0(uint32_t, tables_nr)
because it can't convince itself that tables_nr is positive.
This is fallout from g_assert_cmpint no longer necessarily being
no-return; replace it with a plain g_assert.
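A sketch of the change (illustrative):
    g_assert(tables_nr > 0);                 /* plain g_assert() lets gcc see
                                              * that tables_nr is positive */
    tables = g_new0(uint32_t, tables_nr);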
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Marcel Apfelbaum <marcel@redhat.com>
Only emit "XXX accelerator not found" if there are no further
accelerators listed. E.g.
accel=kvm:tcg
doesn't print a "KVM accelerator not found" warning when it falls back
to tcg, but
accel=kvm
prints a warning, since no fallback is given.
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
It was cached separately for read and write. Let's merge them.
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
IOMMUTLBEntry.iova is returned incorrectly on one PT path (though we
mostly cannot trigger this path, and even if we do, we are mostly
discarding this value, so it didn't break anything). Fix it by
converting the VTD_PAGE_MASK into the correct definition
VTD_PAGE_MASK_4K, then remove VTD_PAGE_MASK.
Fixes: b93130 ("intel_iommu: cleanup vtd_{do_}iommu_translate()")
Signed-off-by: Peter Xu <peterx@redhat.com>
Reviewed-by: Michael S. Tsirkin <mst@redhat.com>
Signed-off-by: Michael S. Tsirkin <mst@redhat.com>
Currently, vhost-user does not implement any means for notifying the
backend about guest endianness. This commit introduces a new message
called VHOST_USER_SET_VRING_ENDIAN which is analogous to the ioctl()
called VHOST_SET_VRING_ENDIAN used for kernel vhost backends. Such
message is necessary for backends supporting legacy (pre-1.0) virtio
devices running in big-endian guests.
Signed-off-by: Felipe Franciosi <felipe@nutanix.com>
Signed-off-by: Mike Cui <cui@nutanix.com>
When skipping implicit nodes in bdrv_block_device_info(), we know that
bs0 is always non-NULL; initially, because it's taken from a BdrvChild
and a BdrvChild never has a NULL bs, and after the first iteration
because implicit nodes always have a backing file.
Remove the NULL check and add an assertion that the implicit node does
indeed have a backing file.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
qemu-iotests 059 left a whole lot of image files behind in the scratch
directory because VMDK creates additional files for extents, and
cleaning them up requires the original image to be intact (the cleanup
parses qemu-img info output to find all extent files), but the test
overwrote the image many times, as it does for all other image formats.
In addition, _use_sample_img overwrites the TEST_IMG variable, causing
new images created afterwards to reuse the name of the sample file
rather than the usual t.IMGFMT.
This patch adds an intermediate _cleanup_test_img after each subtest
that created an image file with additional extent files, and also after
each use of a sample image. _cleanup_test_img is also changed so that it
resets TEST_IMG after a sample image is cleaned up.
Note that this test was failing before this commit and continues to do
so after it. This failure was introduced in commit 9877860 ('block/vmdk:
Report failures in vmdk_read_cid()') and needs to be dealt with
separately.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
qemu-iotests 063 left t.raw.raw1 behind in the scratch directory because
it used the wrong suffix. Make sure to clean it up after completing the
test.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
qemu-iotests 162 left qemu-nbd.pid behind in the scratch directory, and
potentially a file called '42' in the current directory. Make sure to
clean them up after completing the tests.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
qemu-iotests 153 left t.qcow2.c behind in the scratch directory. Make
sure to clean it up after completing the tests.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
qemu-iotests 141 attempted to use brace expansion to remove all images
with a single command. However, for this to work, the braces shouldn't
be quoted.
With this fix, the test correctly cleans up its scratch images.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
qemu-iotests 074 and 179 left a blkdebug.conf behind in the scratch
directory. Make sure to clean up after completing the tests.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
qemu-iotests 041 left quorum_snapshot.img and target.img behind in the
scratch directory. Make sure to clean up after completing the tests.
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
bdrv_open_driver() is called in two places, bdrv_new_open_driver() and
bdrv_open_common(). In the latter, failure cleanup is in its caller,
bdrv_open_inherit(), which unrefs the bs->file of the failed driver open
if it exists.
Let's move the bs->file cleanup to bdrv_open_driver() to take care of
all callers and do not set bs->drv to NULL unless the driver's open
function failed. When bs is destroyed by removing its last reference, it
calls bdrv_close() which checks bs->drv to perform the needed cleanups
and also call the driver's close function. Since it cleans up options
and opaque, we must take care not to leave dangling pointers.
The error paths in bdrv_open_driver() are now two:
If open fails, drv->bdrv_close() should not be called. Unref the child
if it exists, free what we allocated and set bs->drv to NULL. Return the
error and let callers free their stuff.
If open succeeds but we fail after, return the error and let callers
unref and delete their bs, while cleaning up their allocations.
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In some error paths it is possible to QDECREF a freed dangling
explicit_options, resulting in a heap overflow crash. For example
bdrv_open_inherit()'s fail unrefs it, then calls bdrv_unref which calls
bdrv_close which also unrefs it.
Signed-off-by: Manos Pitsidianakis <el13635@mail.ntua.gr>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
The new test 190 ensures we don't regress back to an infinite loop when
measuring the size of a 2T+ qcow2 image. I did not append to test 178,
because that test is also designed to run with format 'raw'; also, this
gives us some coverage of the measure command under the quick group.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We had a bug for multiple releases where dirty-bitmap count was
documented in bytes but reported in sectors; enhance the testsuite
to add coverage of DirtyBitmapInfo to ensure we do not regress again.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Without redirecting qemu's stderr to stdout, _filter_qemu will not apply
to warnings. This results in $QEMU_PROG not being replaced by QEMU_PROG
which is not great if your qemu executable is not called
qemu-system-x86_64 (e.g. qemu-system-i386).
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
On one hand, the _make_test_img invocation for creating the target image
was missing a -u because its backing file is not supposed to exist at
that point.
On the other hand, nobody noticed probably because the backing file is
created later on and _cleanup failed to remove it: The quotation marks
were misplaced so bash tried to delete a file literally called
"$TEST_IMG{,.target}..." instead of performing brace expansion. Thus, the
files stayed around after the first run and qemu-img create did not
complain about a missing backing file on any run but the first.
Signed-off-by: Max Reitz <mreitz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
In some cases, the guest can observe the wrong ordering of UIP and
interrupts. This can happen if the VCPU exit is timed like this:
   iothread                                VCPU
            ... wait for interrupt ...
   t-100ns                                 read register A
   t        wake up, take BQL
   t+100ns                                 update_in_progress
                                             return false
                                           return UIP=0
            trigger interrupt
The interrupt is late; the VCPU expected the falling edge of UIP to
happen after the interrupt. update_in_progress is already trying to
cover this case by latching UIP if the timer is going to fire soon,
and the fix is documented in the commit message for commit 56038ef623
("RTC: Update the RTC clock only when reading it", 2012-09-10). It
cannot be tested with qtest, because its timing of interrupts vs. reads
is exact.
However, the implementation was incorrect because cmos_ioport_read
cleared the UIP bit of register A instead of leaving that to rtc_update_timer. Fixing
the implementation of cmos_ioport_read to match the commit message,
however, breaks the "uip-stuck" test case from the previous patch.
To fix it, skip update timer optimizations if UIP has been latched.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
qemu_savevm_state_cleanup takes about 300ms in my RAM migration tests
with an 8U24G VM (20G actually occupied); the main cost comes from the
KVM_SET_USER_MEMORY_REGION ioctl when mem.memory_size = 0 in
kvm_set_user_memory_region.
kvm_zap_obsolete_pages, which traverses the active_mmu_pages list to
zap the unsync sptes.
It can be optimized by delaying memory_global_dirty_log_stop to the next
vm_start.
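A hedged sketch of the approach (the handler and flag names here are made up for illustration; the actual change lives in memory.c):
static bool dirty_log_stop_pending;     /* assumed flag, for illustration */

static void memory_vm_change_state(void *opaque, int running, RunState state)
{
    if (running && dirty_log_stop_pending) {
        dirty_log_stop_pending = false;
        /* do the expensive KVM_SET_USER_MEMORY_REGION teardown here,
         * after the next vm_start, instead of during savevm cleanup */
    }
}

/* registered once, e.g. when the deferred stop is first requested: */
qemu_add_vm_change_state_handler(memory_vm_change_state, NULL);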
Changes v2->v3:
- NULL VMChangeStateHandler if it is deleted and protect the scenario
of nested invocations of memory_global_dirty_log_start/stop [Paolo]
Changes v1->v2:
- create a VMChangeStateHandler in memory.c to reduce the coupling [Paolo]
Signed-off-by: Jay Zhou <jianjay.zhou@huawei.com>
Message-Id: <1501237733-2736-1-git-send-email-jianjay.zhou@huawei.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Clang static analyzer reports a memory leak. Actually, the allocated
memory escapes here:
record->attribute_list[record->attributes].pair = data;
but clang is correct that the memory might leak if len is zero. We
know it isn't; assert that it is the case.
The craziness doesn't end there. The memory is freed by
bt_l2cap_sdp_close_ch:
g_free(sdp->service_list[i].attribute_list->pair);
which actually should have been written like this:
g_free(sdp->service_list[i].attribute_list[0].pair);
The attribute_list is sorted with qsort; but indeed the first
entry of attribute_list should point to "data" even after the qsort,
because the first record has id SDP_ATTR_RECORD_HANDLE, whose
numeric value is zero.
But hang on. The qsort function is
static int sdp_attributeid_compare(const struct sdp_service_attribute_s *a,
                                   const struct sdp_service_attribute_s *b)
{
    return (int) b->attribute_id - a->attribute_id;
}
but no one ever writes attribute_id. So it only works if qsort is
stable, and who knows what else is broken, but we can fix it by
setting attribute_id in the while loop.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Commit 04bf2526ce (exec: use
qemu_ram_ptr_length to access guest ram) started using
qemu_ram_ptr_length instead of qemu_map_ram_ptr, but when used with Xen
the behavior of the two functions is different. They both call
xen_map_cache, but one with "lock", meaning the mapping of guest memory
is never released implicitly, and the second one without, meaning the
mapping can be released later, when needed.
In the context of address_space_{read,write}_continue, the pointer to
those mappings should not be locked, because it is used immediately and
never used again.
The lock parameter makes it explicit in which context
qemu_ram_ptr_length is called.
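For illustration, a call site of the kind described might look like this after the change (a sketch, not the exact hunk):
    /* transient access from address_space_write_continue(): do not lock
     * the Xen mapping, it is used once and then forgotten */
    ptr = qemu_ram_ptr_length(mr->ram_block, addr1, &l, false);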
Signed-off-by: Anthony PERARD <anthony.perard@citrix.com>
Message-Id: <20170726165326.10327-1-anthony.perard@citrix.com>
Reviewed-by: Stefano Stabellini <sstabellini@kernel.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
QEMU calls kvm_get_vcpu_events, and the kernel always returns
sipi_vector as 0; it is never valid when reported to user space. But
when QEMU calls kvm_put_vcpu_events, that sets sipi_vector in the
kernel to 0. This will accidentally modify sipi_vector when the
in-kernel sipi_vector is not 0.
Signed-off-by: Peng Hao <peng.hao2@zte.com.cn>
Reviewed-by: Liu Yi <liu.yi24@zte.com.cn>
Message-Id: <1500047256-8911-1-git-send-email-peng.hao2@zte.com.cn>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
The deprecation of features in QEMU is totally ad hoc currently,
with no way for the user to get a list of what is deprecated
in each release. This adds an appendix to the doc that records
when each deprecation was made and provides text explaining
what to use instead, if anything.
Since there has been no formal policy around removal of deprecated
features in the past, any deprecations prior to 2.10.0 are to be
treated as if they had been made at the 2.10.0 release. Thus the
earliest that existing deprecations will be deleted is the start
of the 2.12.0 cycle.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-Id: <20170725113638.7019-1-berrange@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Only emit "XXX accelerator not found" if there are no further
accelerators listed. E.g.
accel=kvm:tcg
doesn't print a "KVM accelerator not found" warning when it falls back
to tcg, but
accel=kvm
prints a warning, since no fallback is given.
Suggested-by: Daniel P. Berrange <berrange@redhat.com>
Suggested-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Message-Id: <20170717144527.24534-1-lvivier@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
There's a rare segfault on exit if the guest is accessing IO during
exit.
It's always hitting the atomic_inc(&bs->in_flight) with a NULL bs. This
was added recently in 99723548, but I don't see it as the cause.
Flip vl.c around so we pause the CPUs before closing the block devices;
that way nothing should be trying to access them once they're gone.
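The reordering boils down to something like this (simplified sketch of the end of main() in vl.c):
    main_loop();
    /* pause CPUs first, so no vCPU can touch a block device ... */
    pause_all_vcpus();
    /* ... that the next call is about to tear down */
    bdrv_close_all();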
This was originally Red Hat bz https://bugzilla.redhat.com/show_bug.cgi?id=1451015
Signed-off-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reported-by: Cong Li <coli@redhat.com>
--
This is a very rare race, I'll leave it running in a loop to see if
we hit anything else and to check this really fixes it.
I do worry if there are other cases that can trigger this - e.g.
hot-unplug or ejecting a CD.
Message-Id: <20170713190116.21608-1-dgilbert@redhat.com>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
We are malloc'ing a QString and spending CPU cycles on converting a
QObject to string, just for the sake of sticking the string in the trace
message. Wasted when we aren't tracing. Avoid that.
[Commit message and description suggested by Markus Armbruster to
provide more detail about the rationale for this patch.
Use trace_event_get_state_backends() instead of trace_event_get_state()
to honor DTrace/UST backend dstates.
--Stefan]
Signed-off-by: Denis V. Lunev <den@openvz.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170725143923.11241-1-den@openvz.org
CC: Stefan Hajnoczi <stefanha@redhat.com>
CC: Lluís Vilanova <vilanova@ac.upc.edu>
CC: Dr. David Alan Gilbert <dgilbert@redhat.com>
CC: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
The only exception is groups of numbers separated by the symbols
'.', ' ', ':', '/', like 'ab.09.7d'.
This patch is made by the following:
> find . -name trace-events | xargs python script.py
where script.py is the following python script:
=========================
#!/usr/bin/env python
import sys
import re
import fileinput
rhex = '%[-+ *.0-9]*(?:[hljztL]|ll|hh)?(?:x|X|"\s*PRI[xX][^"]*"?)'
rgroup = re.compile('((?:' + rhex + '[.:/ ])+' + rhex + ')')
rbad = re.compile('(?<!0x)' + rhex)
files = sys.argv[1:]
for fname in files:
    for line in fileinput.input(fname, inplace=True):
        arr = re.split(rgroup, line)
        for i in range(0, len(arr), 2):
            arr[i] = re.sub(rbad, '0x\g<0>', arr[i])
        sys.stdout.write(''.join(arr))
=========================
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Acked-by: Cornelia Huck <cohuck@redhat.com>
Message-id: 20170731160135.12101-5-vsementsov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
In trace format '#' flag of printf is forbidden. Fix it to '0x%'.
This patch is created by the following:
check that we have a problem
> find . -name trace-events | xargs grep '%#' | wc -l
56
check that there are no cases with additional printf flags before '#'
> find . -name trace-events | xargs grep "%[-+ 0'I]+#" | wc -l
0
check that there are no wrong usage of '#' and '0x' together
> find . -name trace-events | xargs grep '0x%#' | wc -l
0
fix the problem
> find . -name trace-events | xargs sed -i 's/%#/0x%/g'
[Eric Blake noted that xargs grep '%[-+ 0'I]+#' should be xargs grep
"%[-+ 0'I]+#" instead so the shell quoting is correct.
--Stefan]
Signed-off-by: Vladimir Sementsov-Ogievskiy <vsementsov@virtuozzo.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170731160135.12101-3-vsementsov@virtuozzo.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Code that checks dstate is unaware of SystemTap and LTTng UST dstate, so
the following trace event will not fire when solely enabled by SystemTap
or LTTng UST:
if (trace_event_get_state(TRACE_MY_EVENT)) {
    str = g_strdup_printf("Expensive string to generate ...",
                          ...);
    trace_my_event(str);
    g_free(str);
}
Add trace_event_get_state_backends() to fetch backend dstate. Those
backends that use QEMU dstate fetch it as part of
generate_h_backend_dstate().
Update existing trace_event_get_state() callers to use
trace_event_get_state_backends() instead.
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170731140718.22010-3-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
QEMU keeps track of trace event enabled/disabled state and provides
monitor commands to inspect and modify the "dstate". SystemTap and
LTTng UST maintain independent enabled/disabled states for each trace
event, the other backends rely on QEMU dstate.
Introduce a new per-event macro that combines backend-specific dstate
like this:
#define TRACE_MY_EVENT_BACKEND_DSTATE() ( \
    QEMU_MY_EVENT_ENABLED() || /* SystemTap */ \
    tracepoint_enabled(qemu, my_event) /* LTTng UST */ || \
    false)
This will be used to extend trace_event_get_state() in the next patch.
[Daniel Berrange pointed out that QEMU_MY_EVENT_ENABLED() must be true
by default, not false. This way events will fire even if the DTrace
implementation does not implement the SystemTap semaphores feature.
Ubuntu Precise uses lttng-ust-dev 2.0.2 which does not have
tracepoint_enabled(), so we need a compatibility wrapper to keep Travis
builds passing.
--Stefan]
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 20170731140718.22010-2-stefanha@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
fixup! trace: add TRACE_<event>_BACKEND_DSTATE()
The simpletrace compatibility code for systemtap creates a
function and some global variables for mapping to event ID
numbers. We generate multiple -simpletrace.stp files though,
one per target and systemtap considers functions & variables
to be globally scoped, not per file. So if trying to use the
simpletrace compat probes, systemtap will complain:
# stap -e 'probe qemu.system.arm.simpletrace.visit_type_str { print( "hello")}'
semantic error: conflicting global variables: identifier 'event_name_to_id_map' at /usr/share/systemtap/tapset/qemu-aarch64-simpletrace.stp:3:8
source: global event_name_to_id_map
^
identifier 'event_name_to_id_map' at /usr/share/systemtap/tapset/qemu-system-arm-simpletrace.stp:3:8
source: global event_name_to_id_map
^
WARNING: cross-file global variable reference to identifier 'event_name_to_id_map' at /usr/share/systemtap/tapset/qemu-system-arm-simpletrace.stp:3:8 from: identifier 'event_name_to_id_map' at /usr/share/systemtap/tapset/qemu-aarch64-simpletrace.stp:8:21
source: if (!([name] in event_name_to_id_map)) {
^
WARNING: cross-file global variable reference to identifier 'event_next_id' at /usr/share/systemtap/tapset/qemu-system-arm-simpletrace.stp:4:8 from: identifier 'event_next_id' at :9:38
source: event_name_to_id_map[name] = event_next_id
^
We already have a string used to prefix probe names, so just
replace '.' with '_' to get a function / variable name prefix.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170728133657.5525-1-berrange@redhat.com
Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
target-arm queue:
* fix broken properties on MPS2 SCC device
* fix MPU trace handling of write vs exec
* fix MPU M profile bugs:
- not handling system space or PPB region correctly
- not resetting state
- not migrating MPU_RNR
# gpg: Signature made Mon 31 Jul 2017 13:21:40 BST
# gpg: using RSA key 0x3C2525ED14360CDE
# gpg: Good signature from "Peter Maydell <peter.maydell@linaro.org>"
# gpg: aka "Peter Maydell <pmaydell@gmail.com>"
# gpg: aka "Peter Maydell <pmaydell@chiark.greenend.org.uk>"
# Primary key fingerprint: E1A5 C593 CD41 9DE2 8E83 15CF 3C25 25ED 1436 0CDE
* remotes/pmaydell/tags/pull-target-arm-20170731:
hw/mps2_scc: fix incorrect properties
target/arm: Migrate MPU_RNR register state for M profile cores
target/arm: Move PMSAv7 reset into arm_cpu_reset() so M profile MPUs get reset
target/arm: Rename cp15.c6_rgnr to pmsav7.rnr
target/arm: Don't allow guest to make System space executable for M profile
target/arm: Don't do MPU lookups for addresses in M profile PPB region
target/arm: Correct MPU trace handling of write vs execute
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This reverts commit bc658e4a2e.
Some versions of gcc warn about this:
linux-user/syscall.c: In function ‘do_ioctl_rt’:
linux-user/syscall.c:5577:37: error: ‘host_rt_dev_ptr’ may be used uninitialized in this function [-Werror=uninitialized]
and in particular the Travis builds fail; they use
gcc (Ubuntu/Linaro 4.6.3-1ubuntu5) 4.6.3.
Revert the change to fix the travis builds.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The PMSAv7 region number register is migrated for R profile
cores using the cpreg scheme, but M profile doesn't use
cpregs, and so we weren't migrating the MPU_RNR register state
at all. Fix that by adding a migration subsection for the
M profile case.
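A hedged sketch of what such a subsection looks like (the field names follow the pmsav7 rename below and the .needed helper is assumed; see target/arm/machine.c for the real definition):
static const VMStateDescription vmstate_pmsav7_rnr = {
    .name = "cpu/pmsav7-rnr",
    .version_id = 1,
    .minimum_version_id = 1,
    .needed = pmsav7_rnr_needed,          /* only for M-profile MPU cores */
    .fields = (VMStateField[]) {
        VMSTATE_UINT32(env.pmsav7.rnr, ARMCPU),
        VMSTATE_END_OF_LIST()
    }
};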
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1501153150-19984-6-git-send-email-peter.maydell@linaro.org
When the PMSAv7 implementation was originally added it was for R profile
CPUs only, and reset was handled using the cpreg .resetfn hooks.
Unfortunately for M profile cores this doesn't work, because they do
not register any cpregs. Move the reset handling into arm_cpu_reset(),
where it will work for both R profile and M profile cores.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1501153150-19984-5-git-send-email-peter.maydell@linaro.org
Almost all of the PMSAv7 state is in the pmsav7 substruct of
the ARM CPU state structure. The exception is the region
number register, which is in cp15.c6_rgnr. This exception
is a bit odd for M profile, which otherwise generally does
not store state in the cp15 substruct.
Rename cp15.c6_rgnr to pmsav7.rnr accordingly.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1501153150-19984-4-git-send-email-peter.maydell@linaro.org
The M profile PMSAv7 specification says that if the address being looked
up is in the PPB region (0xe0000000 - 0xe00fffff) then we do not use
the MPU regions but always use the default memory map. Implement this
(we were previously behaving like an R profile PMSAv7, which does not
special case this).
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1501153150-19984-2-git-send-email-peter.maydell@linaro.org
Correct an off-by-one bug in the PMSAv7 MPU tracing where it would print
a write access as "reading", an insn fetch as "writing", and a read
access as "execute".
Since we have an MMUAccessType enum now, we can make the code clearer
in the process by using that rather than the raw 0/1/2 values.
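For reference, the enum being referred to looks roughly like this (values as defined in QEMU's CPU headers):
typedef enum MMUAccessType {
    MMU_DATA_LOAD  = 0,   /* was raw 0: a read        */
    MMU_DATA_STORE = 1,   /* was raw 1: a write       */
    MMU_INST_FETCH = 2    /* was raw 2: an insn fetch */
} MMUAccessType;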
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Message-id: 1500906792-18010-1-git-send-email-peter.maydell@linaro.org
When this file was rewritten/renamed in fdee2025dd,
a reference path was not updated.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
With the move of some docs/ to docs/devel/ on ac06724a71,
a reference path was not updated.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
With the move of some docs/ to docs/devel/ on ac06724a71,
no references were updated.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
With the move of some docs/ to docs/devel/ on ac06724a71,
a couple of references were not updated.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
With the move of some docs to docs/interop on ac06724a71,
a couple of references were not updated.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
With the move of some docs to docs/interop on d59157ea05,
a reference path was not updated.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Kevin Wolf <kwolf@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
With the move of some docs to docs/interop on d59157e, a couple of
references were not updated.
Signed-off-by: Cleber Rosa <crosa@redhat.com>
[PMD: fixed a typo and another reference of docs/interop/qmp-spec.txt]
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
thunk.c:91:32: warning: Call to 'malloc' has an allocation size of 0 bytes
se->field_offsets[i] = malloc(nb_fields * sizeof(int));
^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~
Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
linux-user/syscall.c:1627:35: warning: 1st function call argument is an uninitialized value
target_saddr->sa_family = tswap16(addr->sa_family);
^~~~~~~~~~~~~~~~~~~~~~~~
linux-user/syscall.c:1629:25: warning: The left operand of '==' is a garbage value
if (addr->sa_family == AF_NETLINK && len >= sizeof(struct sockaddr_nl)) {
~~~~~~~~~~~~~~~ ^
Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
not hit since 2009! :)
linux-user/elfload.c:1102:20: warning: Out of bound memory access (access exceeds upper limit of memory block)
(*regs[i]) = tswap32(env->gregs[i]);
~~~~~~~~~~~^~~~~~~~~~~~~~~~~~~~~~~~
Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
So we have sizeof(struct in6_address) != sizeof(uintptr_t)
and Clang > Coverity on this, see 4555ca6816 :)
net/eth.c:426:30: warning: The code calls sizeof() on a pointer type. This can produce an unexpected result
return bytes_read == sizeof(dst_addr);
^ ~~~~~~~~~~
net/eth.c:475:34: warning: The code calls sizeof() on a pointer type. This can produce an unexpected result
return bytes_read == sizeof(src_addr);
^ ~~~~~~~~~~
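The fix pattern is simply to compare against the pointee, e.g. (sketch):
    return bytes_read == sizeof(*dst_addr);   /* size of struct in6_address,
                                               * not of the pointer */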
Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Dmitry Fleytman <dmitry@daynix.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Extract the (correct) cleanup code into a new function vnc_free_addresses(),
then use it to remove the memory leaks.
Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Screwed up in commit 3a55fc0f, v2.6.0.
If qemu_chr_fe_read_all() returns -EINTR the do {} statement continues and the
n accumulator used to complete reads up to sizeof(msg) is decremented by 4 (the
value of EINTR on Linux).
To avoid that, use simpler if() statements and continue if EINTR occurred.
hw/misc/ivshmem.c:650:14: warning: Loss of sign in implicit conversion
} while (n < sizeof(msg));
^
Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
since a negative value means it errored.
hw/core/loader.c:149:9: warning: Loss of sign in implicit conversion
if (size > max_sz) {
^~~~
hw/core/loader.c:171:9: warning: Loss of sign in implicit conversion
if (size > memory_region_size(mr)) {
^~~~
Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@xilinx.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This allows a one-liner from a fresh repository clone, i.e.:
./configure && make -j check-qtest-aarch64
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Starting QEMU with "qemu-system-tricore -nographic -M tricore_testboard -S"
and entering "x 0" at the monitor prompt leads to a segmentation fault.
This happens because tricore_cpu_get_phys_page_debug() is not implemented
yet; this is a temporary workaround to avoid the crash.
Signed-off-by: Eduardo Otubo <otubo@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
If slirp is disabled, it will fail with:
qemu-system-x86_64: -netdev user,id=qtest-bn0: Parameter 'type' expects a netdev backend type
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Currently get_maintainers.pl claims that the configure script is
maintained by Kamil:
$ scripts/get_maintainer.pl -f configure
Kamil Rytarowski <kamil@netbsd.org> (maintainer:NETBSD)
qemu-devel@nongnu.org (open list:All patches CC here)
This happens because the regex pattern for the NETBSD entry triggers
on everything that contains the keyword "NetBSD". Ease the situation
a little bit by restricting this to "Subject:" lines only, like
we already do in the "trivial patches" section.
Reported-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Starting qemu-system-unicore32 without the -kernel parameter results in
an assert() returning false and aborting QEMU. This patch replaces it with
a proper error message followed by exit(1).
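A sketch of the shape of the change (the variable name is assumed, for illustration):
    if (!kernel_filename) {
        error_report("kernel parameter cannot be empty");
        exit(1);
    }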
Signed-off-by: Eduardo Otubo <otubo@redhat.com>
Tested-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
user_creatable_add_opts() returns a reference (the other reference is
for the root parent/child link).
Leak introduced in commit a1af255f06.
Signed-off-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
This reverts commit b87680427e.
I thought this was a harmless preliminary for XIVE enablement patches
we expect later on. However, due to some subtle interactions between
qemu and SLOF (guest firmware) this breaks some things. Revert it for
now, we'll work out how to fix it when the rest of the XIVE patches
are ready.
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
If object_property_add_alias() returns an error in realize(), we should
propagate it to the caller and certainly not unref the DRC.
Same thing goes for unrealize(). Since object_property_del() is the last
call, we can even get rid of the intermediate Error *.
And finally, unrealize() should undo all registrations performed by
realize().
Signed-off-by: Greg Kurz <groug@kaod.org>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
MIPS patches 2017-07-28
Changes:
* Improve the MIPS board kernel load error reporting
* Revert unnecessary warning messages
# gpg: Signature made Fri 28 Jul 2017 13:47:52 BST
# gpg: using RSA key 0x2238EB86D5F797C2
# gpg: Good signature from "Yongbok Kim <yongbok.kim@imgtec.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8600 4CF5 3415 A5D9 4CFA 2B5C 2238 EB86 D5F7 97C2
* remotes/yongbok/tags/mips-20170728:
Revert "elf-loader: warn about invalid endianness"
hw/mips: load_elf_strerror to report kernel loading failure
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
This reverts c8e1158cf6 "elf-loader: warn about invalid endianness"
as it produces a useless message every time an LE kernel image is
passed via -kernel on a ppc64-pseries machine. The pseries machine
already checks for ELF_LOAD_WRONG_ENDIAN and tries with big_endian=0.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Emulated MIPS boards bail out with a simple "could not load kernel" when
a kernel could not be loaded, without specifying the underlying reason.
Fix that by calling load_elf_strerror.
At the same time use error_report to report the error instead of
fprintf.
Signed-off-by: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
The SPICE input code is currently detecting 0xe1 0x1d 0x45 as
the PAUSE key make sequence and 0xe1 0x9d 0xc5 as the break
sequence. This is incorrect, because all 6 scancodes together
are the make sequence, and there is no break sequence.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170727174640.30359-1-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
According to the PoP bit positions 0-3 and 8-32 of the format-1 CCW must
contain zeros. Bits 0-3 are already covered by cmd_code validity
checking, and bit 32 is covered by the CCW address checking.
Bits 8-31 correspond to CCW1.flags and CCW1.count. Currently we only
check for the absence of certain flags. Let's fix this.
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Message-Id: <20170725224442.13383-3-pasic@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
[CH: tweaked comment]
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
According to the PoP channel command words (CCW) must be doubleword
aligned and 31 bit addressable for format 1 and 24 bit addressable for
format 0 CCWs.
If the channel subsystem encounters a ccw address which does not satisfy
this alignment requirement a program-check condition is recognised.
The situation with 31 bit addressable is a bit more complicated: both the
ORB and a format 1 CCW TIC hold the address of (the rest of) the channel
program, that is the address of the next CCW in a word, and the PoP
mandates that bit 0 of that word shall be zero -- or a program-check
condition is to be recognized -- and does not belong to the field holding
the ccw address.
Since in code the corresponding fields span across the whole word (unlike
in PoP where these are defined as 31 bit wide) we can check this by
applying a mask. The 24 bit addressable case doesn't affect TIC because the
address is composed of a halfword and a byte portion (no additional zero
bit requirements) and just slightly complicates the ORB case where also
bits 1-7 need to be zero.
The same requirements (especially n-bit addressability) apply to the
ccw addresses generated while chaining.
Let's make our CSS implementation follow the AR more closely.
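A hedged sketch of the kind of check described for the format-1 case (the mask simply encodes "doubleword aligned, with bit 0 of the word zero"; the real code also handles the format-0/24-bit case and the extra ORB bits):
    /* doubleword aligned (low 3 bits zero) and 31-bit addressable
     * (bit 0 of the word, i.e. bit 31 of the value, zero) */
    if (ccw_addr & 0x80000007) {
        return -EINVAL;    /* recognize a program-check condition */
    }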
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Message-Id: <20170727154842.23427-1-pasic@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The processing of the scancodes for PAUSE/BREAK has been broken since
the conversion to qcodes in:
commit 8c10e0baf0
Author: Hervé Poussineau <hpoussin@reactos.org>
Date: Thu Sep 15 22:06:26 2016 +0200
ps2: use QEMU qcodes instead of scancodes
When using a VNC client, with the raw scancode extension, the client
will send a scancode of 0xc6 for both PAUSE and BREAK. There is mistakenly
no entry in the qcode_to_number table for this scancode, so
ps2_keyboard_event() just generates a log message and discards the
scancode.
When using a SPICE client, it will also send 0xc6 for BREAK, but
will send 0xe1 0x1d 0x45 0xe1 0x9d 0xc5 for PAUSE. There is no
entry in the qcode_to_number table for the scancode 0xe1 because
it is a special XT keyboard prefix not mapping to any QKeyCode.
Again ps2_keyboard_event() just generates a log message and discards
the scancode. The following 0x1d, 0x45, 0x9d, 0xc5 scancodes get
handled correctly. Rather than trying to handle 3 byte sequences
of scancodes in the PS/2 driver, special case the SPICE input
code so that it captures the 3 byte pause sequence and turns it
into a Pause QKeyCode.
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170727113243.23991-1-berrange@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Page-up and Page-down were renamed. Add the names to the keysym list
so we can parse both old and new names. The keypad versions are already
present in the vnc map.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 20170726152918.11995-2-kraxel@redhat.com
x86 bug fix for -rc1
Fix for a bug in "-cpu max" that breaks libvirt usage of
query-cpu-model-expansion.
# gpg: Signature made Wed 26 Jul 2017 19:35:28 BST
# gpg: using RSA key 0x2807936F984DC5A6
# gpg: Good signature from "Eduardo Habkost <ehabkost@redhat.com>"
# Primary key fingerprint: 5A32 2FD5 ABC4 D3DB ACCF D1AA 2807 936F 984D C5A6
* remotes/ehabkost/tags/x86-pull-request:
target/i386: Don't use x86_cpu_load_def() on "max" CPU model
target/i386: Define CPUID_MODEL_ID_SZ macro
target/i386: Use host_vendor_fms() in max_x86_cpu_initfn()
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When commit 0bacd8b304 ('i386: Don't set CPUClass::cpu_def on
"max" model') removed the CPUClass::cpu_def field, we kept using
the x86_cpu_load_def() helper directly in max_x86_cpu_initfn(),
emulating the previous behavior when CPUClass::cpu_def was set.
However, x86_cpu_load_def() is intended to help initialization of
CPU models from the builtin_x86_defs table, and does lots of
other steps that are not necessary for "max".
One of the things x86_cpu_load_def() does is to set the properties
listed at tcg_default_props/kvm_default_props. We must not do
that on the "max" CPU model, otherwise under KVM we will
incorrectly report all KVM features as always available, and the
"svm" feature as always unavailable. The latter caused the bug
reported at:
https://bugzilla.redhat.com/show_bug.cgi?id=1467599
("Unable to start domain: the CPU is incompatible with host CPU:
Host CPU does not provide required features: svm")
Replace x86_cpu_load_def() with simple object_property_set*()
calls. In addition to fixing the above bug, this makes the KVM
branch in max_x86_cpu_initfn() very similar to the existing TCG
branch.
For reference, the full list of steps performed by
x86_cpu_load_def() is:
* Setting min-level and min-xlevel. Already done by
max_x86_cpu_initfn().
* Setting family/model/stepping/model-id. Done by the code added
to max_x86_cpu_initfn() in this patch.
* Copying def->features. Wrong because "-cpu max" features need to
be calculated at realize time. This was not a problem in the
current code because host_cpudef.features was all zeroes.
* x86_cpu_apply_props() calls. This causes the bug above, and
shouldn't be done.
* Setting CPUID_EXT_HYPERVISOR. Not needed because it is already
reported by x86_cpu_get_supported_feature_word(), and because
"-cpu max" features need to be calculated at realize time.
* Setting CPU vendor to host CPU vendor if on KVM mode.
Redundant, because max_x86_cpu_initfn() already sets it to the
host CPU vendor.
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
Message-Id: <20170712162058.10538-5-ehabkost@redhat.com>
Reviewed-by: Igor Mammedov <imammedo@redhat.com>
Signed-off-by: Eduardo Habkost <ehabkost@redhat.com>
hw/vfio/pci.c:308:29: warning: Use of memory after it is freed
qemu_set_fd_handler(*pfd, NULL, NULL, vdev);
^~~~
Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
free the data _after_ using it.
hw/vfio/platform.c:126:29: warning: Use of memory after it is freed
qemu_set_fd_handler(*pfd, NULL, NULL, NULL);
^~~~
Reported-by: Clang Static Analyzer
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Williamson <alex.williamson@redhat.com>
Fix leak of the 'encryptopts' string, which was mistakenly
declared const.
Fix leak of QemuOpts entry which should not have been deleted
from the opts array.
Reported by: coverity
Signed-off-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 20170714103105.5781-1-berrange@redhat.com
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Max Reitz <mreitz@redhat.com>
The sm501 device uses vmstate_register_ram_global() to register its
memory region for migration. This means it gets a name that is
assumed to be global to the whole system, which in turn means that if
you create two of the device we assert because of the duplication:
qemu-system-ppc -device sm501 -device sm501
RAMBlock "sm501.local" already registered, abort!
Aborted (core dumped)
Changing this to just use memory_region_init_ram()'s automatic
registration of the memory region with a device-local name fixes
this. The downside is that it breaks migration compatibility, but
luckily we only added migration support to this device in the 2.10
release cycle so we haven't released a QEMU version with the broken
implementation.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Message-id: 1500309462-12792-1-git-send-email-peter.maydell@linaro.org
Various changes for the s390x code:
- updates for cpu model handling
- fix compilation with --disable-tcg
- fixes in vfio-ccw and I/O instruction handling
# gpg: Signature made Tue 25 Jul 2017 10:15:37 BST
# gpg: using RSA key 0xDECF6B93C6F02FAF
# gpg: Good signature from "Cornelia Huck <conny@cornelia-huck.de>"
# gpg: aka "Cornelia Huck <huckc@linux.vnet.ibm.com>"
# gpg: aka "Cornelia Huck <cornelia.huck@de.ibm.com>"
# gpg: aka "Cornelia Huck <cohuck@kernel.org>"
# Primary key fingerprint: C3D0 D66D C362 4FF6 A8C0 18CE DECF 6B93 C6F0 2FAF
* remotes/cohuck/tags/s390x-20170725:
s390x/css: fix ilen in IO instruction handlers
target/s390x: Add remaining switches to compile with --disable-tcg
target/s390x: Move exception-related functions to a new excp_helper.c file
target/s390x: Rework program_interrupt() and related functions
target/s390x: Move diag helpers to a separate file
target/s390x: Move s390_cpu_dump_state() to helper.c
target/s390x: improve baselining if certain base features are missing
s390x/kvm: better comment regarding zPCI feature availability
target/s390x: introduce (test|set)_be_bit
target/s390x: indicate query subfunction in s390_fill_feat_block
target/s390x: drop BE_BIT()
s390/cpumodel: remove KSS from the default model of z14
vfio/ccw: fix initialization of the Object DeviceState pointer in the common base-device
vfio/ccw: allocate irq info with the right size
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
ppc patch queue 2017-07-25
Last pull request for the 2.10 hard freeze, and correspondingly small.
There are a handful of bugfixes here plus an update for the "pseries"
guest firmware (SLOF).
This is later than ideal for a guest firmware update. However, this
does include a number of fixes in that guest firmware, so I think it's
worth the risk of squeezing this in just before the hard freeze.
# gpg: Signature made Tue 25 Jul 2017 06:43:14 BST
# gpg: using RSA key 0x6C38CACA20D9B392
# gpg: Good signature from "David Gibson <david@gibson.dropbear.id.au>"
# gpg: aka "David Gibson (Red Hat) <dgibson@redhat.com>"
# gpg: aka "David Gibson (ozlabs.org) <dgibson@ozlabs.org>"
# gpg: aka "David Gibson (kernel.org) <dwg@kernel.org>"
# Primary key fingerprint: 75F4 6586 AE61 A66C C44E 87DC 6C38 CACA 20D9 B392
* remotes/dgibson/tags/ppc-for-2.10-20170725:
pseries: Update SLOF firmware image
spapr: Fix QEMU abort during memory unplug
spapr/htab: fix savevm
spapr_pci: Fix obsolete comment about MSIX encoding in addr/data
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
When initiating a program check interruption by calling program_interrupt,
the instruction length (ilen) of the current instruction is supplied as
the third parameter.
On s390x all the IO instructions are of instruction format S and their
ilen is 4. The calls to program_interrupt (introduced by commits
7b18aad543 ("s390: Add channel I/O instructions.", 2013-01-24) and
61bf0dcb2e ("s390x/ioinst: Add missing alignment checks for IO
instructions", 2013-06-21)) however use ilen == 2.
This is probably due to a confusion between ilen, which specifies the
instruction length in bytes, and ILC, which does the same but in halfwords.
If kvm_enabled(), this does not actually matter, because the ilen
parameter of program_interrupt is effectively unused.
Let's provide the correct ilen to program_interrupt.
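As a sketch, the affected call sites end up passing 4 (format S, 4 bytes)
rather than 2; the interruption code shown here is just an example:

    /* IO instructions are format S, i.e. 4 bytes long */
    program_interrupt(&cpu->env, PGM_OPERAND, 4);   /* was: ..., 2 */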
Signed-off-by: Halil Pasic <pasic@linux.vnet.ibm.com>
Fixes: 7b18aad543 ("s390: Add channel I/O instructions.")
Fixes: 61bf0dcb2e ("s390x/ioinst: Add missing alignment checks for IO instructions")
Reviewed-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170724143452.55534-1-pasic@linux.vnet.ibm.com>
Reviewed-by: Dong Jia Shi <bjsdjshi@linux.vnet.ibm.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
These functions cannot be compiled with --disable-tcg. But since we
need the other functions from helper.c in the non-tcg build, we can also
not simply remove helper.c from the non-tcg builds. Thus the problematic
functions have to be moved into a separate new file that we can
later omit in the non-tcg builds.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1500886370-14572-5-git-send-email-thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
misc_helper.c won't be compiled with --disable-tcg anymore, but we
still need the program_interrupt() function in that case. Move it
to interrupt.c instead, and refactor it to re-use the code from
trigger_pgm_exception() (for TCG) and enter_pgmcheck() (for KVM,
which now got renamed to kvm_s390_program_interrupt() for
clarity).
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1500886370-14572-4-git-send-email-thuth@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
translate.c cannot be compiled with --disable-tcg, but we need
s390_cpu_dump_state() in KVM-only builds, too. So let's move
that function to helper.c instead, which will also be compiled
when --disable-tcg has been specified.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Thomas Huth <thuth@redhat.com>
Message-Id: <1500886370-14572-2-git-send-email-thuth@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
There are certain features that we put into base models, but that are
not relevant for the actual search. The most prominent example is the
MSA subfunctions, which might be disabled on certain real hardware out
there.
While the kvm host model detection will usually detect the correct model
on such machines (as it will in the common case not pass features to check
for into s390_find_cpu_def()), baselining will fall back to a quite old
model just because some MSA subfunctions are missing.
Let's improve that by ignoring lack of these features while performing
the search for a base model.
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170720123721.12366-6-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
Using ordinary bitmap operations to set/test bits does not work properly
on architectures other than s390x. Let's drop (test|set)_bit_inv and introduce
(test|set)_be_bit instead. These functions work on uint8_t arrays, not on
arrays of unsigned long, and are for now only used in the context of
CPU features.
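A minimal sketch of what such big-endian helpers on a uint8_t array look
like (close to, but not necessarily identical with, the actual code):

    #include <stdbool.h>
    #include <stdint.h>

    static inline void set_be_bit(unsigned int bit_nr, uint8_t *array)
    {
        array[bit_nr / 8] |= 0x80 >> (bit_nr % 8);
    }

    static inline bool test_be_bit(unsigned int bit_nr, const uint8_t *array)
    {
        return array[bit_nr / 8] & (0x80 >> (bit_nr % 8));
    }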
Signed-off-by: David Hildenbrand <david@redhat.com>
Message-Id: <20170720123721.12366-4-david@redhat.com>
Signed-off-by: Cornelia Huck <cohuck@redhat.com>
The main changes are:
- fixes in the PCI bridge code;
- LUNs > 255 are now allowed in virtio-scsi.
The full list is:
> pci-scan: Fix pci-bridge-set-mem-base and pci-bridge-set-mem-limit
> pci: Avoid 32-bit prefetchable memory area if possible
> Remove unused functions ishexdigit and $cat-comma
> pci: Translate PCI addresses to host addresses at the end of map-in
> Define 'open' and 'close' words of the /aliases nodes right from the start
> virtio-scsi: Allow LUNs bigger than 255
> paflof: Silence gcc's -Warray-bounds warning for stack pointers
> board_qemu: move code out of fdt-fix-node-phandle
> board_qemu: drop unused values early in fdt-fix-node-phandle
> pci: Improve the pci-var-out debug function
> libhvcall: drop unused KVMPPC_H_REPORT_MC_ERR and KVMPPC_H_NMI_MCE defines
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit 0cffce56 (hw/ppc/spapr.c: adding pending_dimm_unplugs to
sPAPRMachineState) introduced a new way to track pending LMBs of DIMM
device that is marked for removal. Since this commit we can hit the
assert in spapr_pending_dimm_unplugs_add() in the following situation:
- DIMM device removal fails as the guest doesn't allow the removal.
- Subsequent attempt to remove the same DIMM would hit the assert
as the corresponding sPAPRDIMMState is still part of the
pending_dimm_unplugs list.
Fix this by removing the assert and conditionally adding the
sPAPRDIMMState to pending_dimm_unplugs list only when it is not
already present.
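A rough sketch of the resulting logic (list insertion details are
illustrative; the relevant part is the lookup replacing the assert):

    sPAPRDIMMState *ds = spapr_pending_dimm_unplugs_find(spapr, dimm);

    if (!ds) {
        ds = g_new0(sPAPRDIMMState, 1);
        ds->nr_lmbs = nr_lmbs;
        ds->dimm = dimm;
        QTAILQ_INSERT_HEAD(&spapr->pending_dimm_unplugs, ds, next);
    }
    /* instead of: g_assert(!spapr_pending_dimm_unplugs_find(spapr, dimm)); */
    return ds;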
Fixes: 0cffce56ae
Signed-off-by: Bharata B Rao <bharata@linux.vnet.ibm.com>
[dwg: Tweaked to avoid returning NULL when spapr_pending_dimm_unplugs_add()
does find an existing entry]
Reviewed-by: Daniel Barboza <danielhb@linux.vnet.ibm.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit 3a38429 ("spapr: Add a "no HPT" encoding to HTAB migration stream")
allows an empty HPT to be migrated, but doesn't correctly mark the
end of the migration stream.
The end condition (value returned by htab_save_iterate())
should be 1, whereas in 3a38429 it returns 0.
The problem can be reproduced with QEMU monitor command "savevm":
the command never stops and the disk image grows without limit.
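A sketch of the fix (assuming the iterator checks htab_shift, as the
surrounding code does, to detect the "no HPT" case):

    /* iteration header for an empty HPT */
    if (!spapr->htab_shift) {
        qemu_put_be32(f, -1);   /* "no HPT" marker */
        return 1;               /* was 0: savevm never saw the end of stream */
    }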
Fixes: 3a38429748
Signed-off-by: Laurent Vivier <lvivier@redhat.com>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Commit f1c2dc7c86 "spapr-pci: rework MSI/MSIX" (07/2013) changed the MSIX
encoding but forgot to update the comment, so this changes it.
Signed-off-by: Alexey Kardashevskiy <aik@ozlabs.ru>
Signed-off-by: David Gibson <david@gibson.dropbear.id.au>
Instead of migrating the flash by creating the memory region
with memory_region_init_ram_nomigrate() and then calling
vmstate_register_ram_global(), just use memory_region_init_ram(),
which now handles migration registration automatically.
This is a migration compatibility break for the integratorcp
board, because the RAM region's migration name changes to
include the device path. This is OK because we don't guarantee
migration compatibility for this board.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1500310341-28931-1-git-send-email-peter.maydell@linaro.org
The fsl-imx* boards accidentally forgot to register the ROM memory
regions for migration. This used to require a manual step of calling
vmstate_register_ram(), but following commits
1cfe48c1ce21..b08199c6fbea194 we can use memory_region_init_rom() to
have it do the migration for us.
This is a migration break, but the migration code currently does not
handle the case of having two RAM regions which were not registered
for migration, and so prior to this commit a migration load would
always fail with:
"qemu-system-arm: Length mismatch: 0x4000 in != 0x18000: Invalid argument"
NB: migration appears at this point to be broken for this board
anyway -- it succeeds but the destination hangs; probably some
device in the system does not yet support migration.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1500309775-18361-1-git-send-email-peter.maydell@linaro.org
Test cases 030, 041 and 055 used to sleep for a second after calling
block-job-pause to make sure that the block job had time to actually
get into paused state. We can instead poll its status and use that one
second only as a timeout.
The tests also slept a second for checking that the block jobs don't
make progress while being paused. Half a second is more than enough for
this.
These changes reduce the total time for the three tests by 25 seconds on
my laptop (from 155 seconds to 130).
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Commits 0db832f and 6cdbceb introduced the automatic insertion of filter
nodes above the top layer of mirror and commit block jobs. The
assumption made there was that since libvirt doesn't do node-level
management of the block layer yet, it shouldn't be affected by added
nodes.
This is true as far as commands issued by libvirt are concerned. It only
uses BlockBackend names to address nodes, so any operations it performs
still operate on the root of the tree as intended.
However, the assumption breaks down when you consider query commands,
which return data for the wrong node now. These commands also return
information on some child nodes (bs->file and/or bs->backing), which
libvirt does make use of, and which refer to the wrong nodes, too.
One of the consequences is that oVirt gets wrong information about the
image size and stops the VM in response as long as a mirror or commit
job is running:
https://bugzilla.redhat.com/show_bug.cgi?id=1470634
This patch fixes the problem by hiding the implicit nodes created
automatically by the mirror and commit block jobs in the output of
query-block and BlockBackend-based query-blockstats as long as the user
doesn't indicate that they are aware of those nodes by providing a node
name for them in the QMP command to start the block job.
The node-based commands query-named-block-nodes and query-blockstats
with query-nodes=true still show all nodes, including implicit ones.
This ensures that users that are capable of node-level management can
still access the full information; users that only know BlockBackends
won't use these commands.
Cc: qemu-stable@nongnu.org
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Peter Krempa <pkrempa@redhat.com>
Reviewed-by: Max Reitz <mreitz@redhat.com>
Tested-by: Eric Blake <eblake@redhat.com>
We used MAX() instead of the intended MIN() when computing how many
sectors to view in the current loop iteration of qcow2_measure(),
and passed in a value of INT_MAX sectors instead of our more usual
limit of BDRV_REQUEST_MAX_SECTORS (the latter avoids 32-bit overflow
on conversion to bytes). For small files, the bug is harmless:
bdrv_get_block_status_above() clamps its *pnum answer to the BDS
size, regardless of any insanely larger input request. However, for
any file at least 2T in size, we can very easily end up going into an
infinite loop (the maximum of 0x100000000 sectors and INT_MAX is a
64-bit quantity, which becomes 0 when assigned to int; once nb_sectors
is 0, we never make progress).
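The shape of the fix, as a sketch (remaining_sectors is a stand-in for the
loop's actual bookkeeping):

    int nb_sectors = MIN(remaining_sectors, BDRV_REQUEST_MAX_SECTORS);
    /* previously effectively MAX(remaining_sectors, INT_MAX): the 64-bit
     * result truncates to 0 when stored in an int, so the loop stalls */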
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
We've been documenting the value in bytes since its introduction
in commit b9a9b3a4 (v1.3), where it was actually reported in bytes.
Commit e4654d2 (v2.0) then removed things from block/qapi.c, in
preparation for a rewrite to a list of dirty sectors in the next
commit 21b5683 in block.c, but the new code mistakenly started
reporting in sectors.
Fixes: https://bugzilla.redhat.com/1441460
CC: qemu-stable@nongnu.org
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
A run of './check -qcow2 -g quick' on my machine produced only
two tests that took longer than 5 seconds; 178 took 18, and
189 took 7. Remove them from the quick group.
Signed-off-by: Eric Blake <eblake@redhat.com>
Reviewed-by: John Snow <jsnow@redhat.com>
Reviewed-by: Jeff Cody <jcody@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Signed-off-by: Kevin Wolf <kwolf@redhat.com>
QAPI patches for 2017-07-18
# gpg: Signature made Mon 24 Jul 2017 12:40:56 BST
# gpg: using RSA key 0x3870B400EB918653
# gpg: Good signature from "Markus Armbruster <armbru@redhat.com>"
# gpg: aka "Markus Armbruster <armbru@pond.sub.org>"
# Primary key fingerprint: 354B C8B3 D7EB 2A6B 6867 4E5F 3870 B400 EB91 8653
* remotes/armbru/tags/pull-qapi-2017-07-18-v2:
migration: Use JSON null instead of "" to reset parameter to default
migration: Unshare MigrationParameters struct for now
migration: Add TODO comments on duplication of QAPI_CLONE()
migration: Clean up around tls_creds, tls_hostname
hmp: Clean up and simplify hmp_migrate_set_parameter()
block: Use JSON null instead of "" to disable backing file
tests/test-qobject-input-visitor: Drop redundant test
qapi: Introduce a first class 'null' type
qapi: Use QNull for a more regular visit_type_null()
qapi: Separate type QNull from QObject
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Clang 3.9 passes the CONFIG_AVX2_OPT configure test. However, the
supplied <cpuid.h> does not contain the bit_AVX2 define that we use
when detecting whether the routine can be enabled.
Introduce a qemu-specific header that uses the compiler's definition
of __cpuid et al, but supplies any missing bit_* definitions needed.
This avoids introducing any extra ifdefs to util/bufferiszero.c, and
allows quite a few to be removed from tcg/i386/tcg-target.inc.c.
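A minimal sketch of what such a wrapper header can look like (guard names
are illustrative; bit_AVX2 is CPUID.(EAX=7,ECX=0):EBX bit 5):

    /* include/qemu/cpuid.h (sketch) */
    #ifndef QEMU_CPUID_H
    #define QEMU_CPUID_H

    #include <cpuid.h>          /* compiler-provided __cpuid/__cpuid_count */

    #ifndef bit_AVX2
    #define bit_AVX2 (1 << 5)   /* missing from some <cpuid.h>, e.g. clang 3.9 */
    #endif

    #endif /* QEMU_CPUID_H */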
Signed-off-by: Richard Henderson <rth@twiddle.net>
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 20170719044018.18063-1-rth@twiddle.net
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
migrate-set-parameters sets migration parameters according to its
arguments like this:
* Present means "set the parameter to this value"
* Absent means "leave the parameter unchanged"
* Except for parameters tls_creds and tls_hostname, "" means "reset
the parameter to its default value"
The first two are perfectly normal: presence of the parameter makes
the command do something.
The third one overloads the parameter with a second meaning. The
overloading is *implicit*, i.e. it's not visible in the types. Works
here, because "" is neither a valid TLS credentials ID, nor a valid
host name.
Pressing argument values that the schema accepts, but that are semantically
invalid, into service to mean "reset to default" is not general, as
suitable invalid values need not exist. I also find it ugly.
To clean this up, we could add a separate flag argument to ask for
"reset to default", or add a distinct value to @tls_creds and
@tls_hostname. This commit implements the latter: add JSON null to
the values of @tls_creds and @tls_hostname, deprecate "".
Because we're so close to the 2.10 freeze, implement it in the
stupidest way possible: have qmp_migrate_set_parameters() rewrite null
to "" before anything else can see the null. The proper way to do it
would be rewriting "" to null, but that requires fixing up code to
work with null. Add TODO comments for that.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Commit de63ab6 "migrate: Share common MigrationParameters struct"
reused MigrationParameters for the arguments of
migrate-set-parameters, with the following rationale:
It is rather verbose, and slightly error-prone, to repeat
the same set of parameters for input (migrate-set-parameters)
as for output (query-migrate-parameters), where the only
difference is whether the members are optional. We can just
document that the optional members will always be present
on output, and then share a common struct between both
commands. The next patch can then reduce the amount of
code needed on input.
I need to unshare them to correct a design flaw in a stupid, but
minimally invasive way, in the next commit. We can restore the
sharing when we redo that patch in a less stupid way. Add a suitable
TODO comment.
Note that I revert only the sharing part of commit de63ab6, not the
part that made the members of query-migrate-parameters' result
optional. The schema (and thus introspection) remains inaccurate for
query-migrate-parameters. If we decide not to restore the sharing, we
should revert that part, too.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
qmp_query_migrate_parameters() and qmp_migrate_set_parameters()
effectively duplicate QAPI_CLONE() inline. Add suitable TODO
comments.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Optional MigrationParameters members tls_creds and tls_hostname can't
actually be absent outside qmp_migrate_set_parameters() since commit
4af245d (v2.9.0).
Note that commit 4af245d reverted the part of commit de63ab6 (v2.8.0)
that made tls_creds and tls_hostname absent instead of "" in the value
of query-migrate-parameters, even though commit de63ab6 called that a
mistake. What a mess.
Drop the redundant tests for presence, and update documentation.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
The bulk of hmp_migrate_set_parameter()'s code sets one member of
MigrationParameters according to the command's arguments. It uses a
string visitor for integer and boolean members, but not for string and
size members. It calls visit_type_bool() right away, but delays
visit_type_int() some. The delaying requires a flag variable and a
bit of trickery: we set all integer members instead of just the one we
want, and rely on the has_FOOs to mask the unwanted ones.
Clean this up as follows. Don't delay calling visit_type_int(). Use
the string visitor for strings, too. This involves extra allocations
and cleanup, but doing them is simpler and cleaner than avoiding them.
Sadly, using the string visitor for sizes isn't possible, because it
defaults to Bytes rather than Mebibytes. Add a comment explaining
that.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Dr. David Alan Gilbert <dgilbert@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
BlockdevRef is an alternate of BlockdevOptions (inline definition) and
str (reference to an existing block device by name). BlockdevRef
value "" is special: "no block device should be referenced." It's
actually interpreted that way in just one place: optional member
@backing of COW formats. Semantics:
* Present means "use this block device" as backing storage
* Absent means "default to the one stored in the image"
* Except "" means "don't use backing storage at all"
The first two are perfectly normal: when the parameter is absent, it
defaults to an implied value, but the value's meaning is the same.
The third one overloads the parameter with a second meaning. The
overloading is *implicit*, i.e. it's not visible in the types. Works
here, because "" is not a valid block device ID.
Pressing argument values that the schema accepts, but that are semantically
invalid, into service to mean "do something else entirely" is not
general, as suitable invalid values need not exist. I also find it
ugly.
To clean this up, we could add a separate flag argument to suppress
@backing, or add a distinct value to @backing. This commit implements
the latter: add JSON null to the values of @backing, deprecate "".
Because we're so close to the 2.10 freeze, implement it in the
stupidest way possible: have qmp_blockdev_add() rewrite null to ""
before anything else can see the null. Works, because BlockdevRef
occurs only within arguments of blockdev-add. The proper way to do it
would be rewriting "" to null, preferably in a cleaner way, but that
requires fixing up code to work with null. Add a TODO comment for
that.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Acked-by: Kevin Wolf <kwolf@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
test_visitor_in_alternate() tests UserDefAlternate with inadmissible
input. test_visitor_in_fail_alternate() does basically the same.
Drop the former, keep the latter.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
I expect the 'null' type to be useful mostly for members of alternate
types.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Make visit_type_null() take an @obj argument like its buddies. This
helps keep the next commit simple.
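As a sketch, the signature then looks something like:

    /* before */
    void visit_type_null(Visitor *v, const char *name, Error **errp);
    /* after: an @obj argument like the other visit_type_*() functions */
    void visit_type_null(Visitor *v, const char *name, QNull **obj,
                         Error **errp);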
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Under certain circumstances normal xen-mapcache functioning may be broken
by the guest's actions. This may lead either to QEMU performing exit() due to
a caught bad pointer (and with the QEMU process gone the guest domain simply
appears hung afterwards) or to actual use of the incorrect pointer inside
the QEMU address space -- a write to unmapped memory is possible. The bug is
hard to reproduce on an i440 machine as multiple DMA sources are required
(though it's possible in theory, using multiple emulated devices), but can
be reproduced somewhat easily on a Q35 machine using an emulated AHCI
controller -- each NCQ command slot may be used as an independent
DMA source, e.g. using the READ FPDMA QUEUED command, so a single storage
device on the AHCI controller port is enough to produce multiple DMAs
(up to 32). A detailed description of the issue follows.
Xen-mapcache provides the ability to map parts of guest memory into
QEMU's own address space to work with.
There are two types of cache lookups:
- translating a guest physical address into a pointer in QEMU's address
space, mapping a part of guest domain memory if necessary (while trying
to reduce the number of such (re)mappings to a minimum)
- translating a QEMU pointer back to its physical address in guest RAM
These lookups are managed via two linked lists of structures.
MapCacheEntry is used for forward cache lookups, while MapCacheRev is used
for reverse lookups.
Every guest physical address is broken down into 2 parts:
address_index = phys_addr >> MCACHE_BUCKET_SHIFT;
address_offset = phys_addr & (MCACHE_BUCKET_SIZE - 1);
MCACHE_BUCKET_SHIFT depends on the system (32/64-bit) and is equal to 20 on
a 64-bit system (which is assumed for the rest of this description). Basically,
this means that we deal with 1 MB chunks and offsets within those 1 MB
chunks. All mappings are created with 1MB granularity, i.e. 1MB/2MB/3MB
etc. Most DMA transfers are typically less than 1MB; however, if a
transfer crosses any 1MB border(s), then the next larger mapping size
will be used, so e.g. a 512-byte DMA transfer with the start address
700FFF80h will actually require a 2MB range.
The current implementation assumes that MapCacheEntries are unique for a given
address_index and size pair and that a single MapCacheEntry may be reused
by multiple requests -- in this case the 'lock' field will be larger than
1. On the other hand, each requested guest physical address (requested with
the 'lock' flag) is described by its own MapCacheRev. So there may be multiple
MapCacheRev entries corresponding to a single MapCacheEntry. The xen-mapcache
code uses MapCacheRev entries to retrieve the address_index & size pair, which
in turn is used to find the related MapCacheEntry. The 'lock' field within
a MapCacheEntry structure is actually a reference counter which shows
the number of corresponding MapCacheRev entries.
The bug lies in the guest's ability to indirectly manipulate the
xen-mapcache MapCacheEntry list via a special sequence of DMA
operations, typically for storage devices. In order to trigger the bug,
the guest needs to issue DMA operations with a specific order and timing.
Although xen-mapcache is protected by a mutex, this doesn't help
in this case, as the bug is not due to a race condition.
Suppose we have 3 DMA transfers, namely A, B and C, where
- transfer A crosses a 1MB border and thus uses a 2MB mapping
- transfers B and C are normal transfers within a 1MB range
- and all 3 transfers belong to the same address_index
In this case, if all these transfers are executed one-by-one
(without overlaps), no special treatment is necessary -- each transfer's
mapping lock will be set and then cleared on unmap before starting
the next transfer.
The situation changes when DMA transfers overlap in time, e.g. like this:
|===== transfer A (2MB) =====|
|===== transfer B (1MB) =====|
|===== transfer C (1MB) =====|
time --->
In this situation the following sequence of actions happens:
1. transfer A creates a mapping to a 2MB area (lock=1)
2. transfer B (1MB) tries to find an available mapping but cannot find one
because transfer A is still in progress, and it has a 2MB size + non-zero
lock. So transfer B creates another mapping -- same address_index,
but 1MB size.
3. transfer A completes, making the 1st mapping entry available by setting its
lock to 0
4. transfer C starts, tries to find an available mapping entry and sees
that the 1st entry has lock=0, so it uses this entry but remaps the mapping
to a 1MB size
5. transfer B completes and by this time
- there are two locked entries in the MapCacheEntry list with the SAME
values for both address_index and size
- the entry for transfer B actually resides farther down the list while
transfer C's entry is first
6. xen_ram_addr_from_mapcache() for transfer B gets the correct address_index
and size pair from the corresponding MapCacheRev entry, but then it starts
looking for a MapCacheEntry with these values and finds the first entry
-- which belongs to transfer C.
At this point there may be the following possible (bad) consequences:
1. xen_ram_addr_from_mapcache() will use a wrong entry->vaddr_base value
in this statement:
raddr = (reventry->paddr_index << MCACHE_BUCKET_SHIFT) +
((unsigned long) ptr - (unsigned long) entry->vaddr_base);
resulting in an incorrect raddr value being returned from the function. The
(ptr - entry->vaddr_base) expression may produce both positive and negative
numbers and its actual value may differ greatly as many
map/unmap operations take place. If the value falls beyond the guest RAM
limits, then a "Bad RAM offset" error will be triggered and logged,
followed by exit() in QEMU.
2. If the raddr value does not exceed the guest RAM boundaries, the same sequence
of actions will be performed for xen_invalidate_map_cache_entry() on DMA
unmap, resulting in the wrong MapCacheEntry being unmapped while the DMA
operation which uses it is still active. The above example must
be extended by one more DMA transfer in order to allow the unmapping, as the
first mapping in the list is effectively resident.
The patch modifies the way MapCacheEntries are added to the
list, avoiding duplicates.
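To make the ambiguity concrete, here is a small self-contained model of the
lookup described above; this is a toy illustration, not the actual
xen-mapcache code:

    #include <stdint.h>
    #include <stdio.h>

    /* Toy model: entries are looked up by (paddr_index, size), so two live
     * entries with the same key are ambiguous. */
    typedef struct Entry {
        uint64_t paddr_index;
        uint64_t size;
        void *vaddr_base;
        struct Entry *next;
    } Entry;

    static Entry *find(Entry *head, uint64_t idx, uint64_t size)
    {
        for (Entry *e = head; e; e = e->next) {
            if (e->paddr_index == idx && e->size == size) {
                return e;   /* first match wins: wrong if a duplicate exists */
            }
        }
        return NULL;
    }

    int main(void)
    {
        Entry b = { 0x700, 1 << 20, (void *)0x3000, NULL };  /* transfer B */
        Entry c = { 0x700, 1 << 20, (void *)0x2000, &b };    /* transfer C, listed first */

        /* A reverse lookup meant for transfer B returns transfer C's entry. */
        printf("lookup -> vaddr_base %p (expected %p)\n",
               find(&c, 0x700, 1 << 20)->vaddr_base, b.vaddr_base);
        return 0;
    }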
Signed-off-by: Alexey Gerasimenko <x1917x@gmail.com>
Signed-off-by: Stefano Stabellini <sstabellini@kernel.org>
Solaris 9 was released in 2002, its successor Solaris 10 was
released in 2005, and Solaris 9 was end-of-lifed in 2014.
Nobody has stepped forward to express interest in supporting
Solaris of any flavour, so removing support for the ancient
versions seems uncontroversial.
In particular, this allows us to remove a use of 'uname'
in configure that won't work if you're cross-compiling.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Daniel P. Berrange <berrange@redhat.com>
Message-id: 1499955697-28045-1-git-send-email-peter.maydell@linaro.org
For a very long time we have used 'uname -s' as our fallback if
we don't identify the target OS using a compiler #define. This
obviously doesn't work for cross-compilation, and we've had
a comment suggesting we fix this in configure for a long time.
Since we now have an exhaustive list of which OSes we can run
on (thanks to commit 898be3e041 making an unrecognized OS
be a fatal error), we know which ones we're missing.
Add check_define tests for the remaining OSes we support. The
defines checked are based on ones we already use in the codebase for
identifying the host OS (with the exception of GNU/kFreeBSD).
We can now set bogus_os immediately rather than doing it later.
We leave the comment about uname being bad untouched, since
there is still a use of it for the fallback for unrecognized
host CPU type.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1499958932-23839-1-git-send-email-peter.maydell@linaro.org
On OpenBSD the compiler warns:
bsd-user/main.c:622:21: warning: variable 'sig' set but not used [-Wunused-but-set-variable]
This is because a lot of the signal delivery code is #if-0'd
out as unused. Reshuffle #ifdefs a bit to silence the warning.
(We make the minimum change here rather than removing the dead code
entirely, because there is a bsd-user patchset which should make this all
work correctly and there's no point giving them an awkward rebase task.)
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1500395194-21455-5-git-send-email-peter.maydell@linaro.org
On OpenBSD the compiler complains:
bsd-user/bsdload.c:54:17: warning: variable 'id_change' set but not used [-Wunused-but-set-variable]
This is dead code that was originally copied from linux-user.
We fixed this in linux-user in commit 331c23b5ca in 2011;
delete the useless code from bsd-user too.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Thomas Huth <thuth@redhat.com>
Message-id: 1500395194-21455-4-git-send-email-peter.maydell@linaro.org
MIPS patches 2017-07-21
Changes:
* Add Enhanced Virtual Addressing (EVA) support
# gpg: Signature made Fri 21 Jul 2017 03:25:15 BST
# gpg: using RSA key 0x2238EB86D5F797C2
# gpg: Good signature from "Yongbok Kim <yongbok.kim@imgtec.com>"
# gpg: WARNING: This key is not certified with a trusted signature!
# gpg: There is no indication that the signature belongs to the owner.
# Primary key fingerprint: 8600 4CF5 3415 A5D9 4CFA 2B5C 2238 EB86 D5F7 97C2
* remotes/yongbok/tags/mips-20170721:
target/mips: Enable CP0_EBase.WG on MIPS64 CPUs
target/mips: Add EVA support to P5600
target/mips: Implement segmentation control
target/mips: Add segmentation control registers
target/mips: Add an MMU mode for ERL
target/mips: Abstract mmu_idx from hflags
target/mips: Check memory permissions with mem_idx
target/mips: Decode microMIPS EVA load & store instructions
target/mips: Decode MIPS32 EVA load & store instructions
target/mips: Prepare loads/stores for EVA
target/mips: Add CP0_Ebase.WG (write gate) support
target/mips: Weaken TLB flush on UX,SX,KX,ASID changes
target/mips: Fix TLBWI shadow flush for EHINV,XI,RI
target/mips: Fix MIPS64 MFC0 UserLocal on BE host
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Final CI updates for soft-freeze
Tweaks from Paolo for J=x Travis compiles
Bunch of updated cross-compile targets from Philippe
Additional debug tools in travis image from Me
# gpg: Signature made Tue 18 Jul 2017 11:00:26 BST
# gpg: using RSA key 0xFBD0DB095A9E2A44
# gpg: Good signature from "Alex Bennée (Master Work Key) <alex.bennee@linaro.org>"
# Primary key fingerprint: 6685 AE99 E751 67BC AFC8 DF35 FBD0 DB09 5A9E 2A44
* remotes/stsquad/tags/pull-ci-updates-for-softfreeze-180717-2: (32 commits)
docker: install clang since Shippable setup_ve() verify it is available
docker: warn users to use newer debian8/debian9 base image
docker: add debian Ports base image
shippable: add win32/64 targets
docker: add MXE (M cross environment) base image for MinGW-w64
shippable: add mips64el targets
docker: add debian/mips64el image
shippable: use debian/mips[eb] targets
docker: add debian/mips[eb] images
shippable: add powerpc target
docker: add debian/powerpc based on Jessie
docker: add 'apt-fake' script which generate fake debian packages
docker: add qemu:debian-jessie based on outdated jessie release
shippable: add x86_64 targets
shippable: add ppc64el targets
shippable: add armel targets
docker: enable nettle to extend code coverage on arm64
docker: enable gcrypt to extend code coverage on amd64
docker: enable netmap to extend code coverage on amd64
docker: enable virgl to extend code coverage on amd64
...
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Fix various warnings about set-but-not-used variables on OpenBSD:
bsd-user/elfload.c:1158:15: warning: variable 'mapped_addr' set but not used [-Wunused-but-set-variable]
bsd-user/elfload.c:1165:9: warning: variable 'status' set but not used [-Wunused-but-set-variable]
bsd-user/elfload.c:1168:15: warning: variable 'elf_stack' set but not used [-Wunused-but-set-variable]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-id: 1500395194-21455-3-git-send-email-peter.maydell@linaro.org
On NetBSD, where tolower() and toupper() are implemented using an
array lookup, the compiler warns if you pass a plain 'char'
to these functions:
gdbstub.c:914:13: warning: array subscript has type 'char'
This reflects the fact that toupper() and tolower() give
undefined behaviour if they are passed a value that isn't
a valid 'unsigned char' or EOF.
We have qemu_tolower() and qemu_toupper() to avoid this problem;
use them.
(The use in scsi-generic.c does not trigger the warning because
it passes a uint8_t; we switch it anyway, for consistency.)
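The wrappers boil down to casting to unsigned char before the ctype call,
roughly:

    #include <ctype.h>

    /* sketch of the qemu_* wrappers: the cast keeps plain (possibly signed)
     * char arguments inside the range the ctype functions are defined for */
    #define qemu_toupper(ch)  toupper((unsigned char)(ch))
    #define qemu_tolower(ch)  tolower((unsigned char)(ch))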
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Richard Henderson <rth@twiddle.net>
Acked-by: Christian Borntraeger <borntraeger@de.ibm.com> for the s390 part.
Acked-by: David Gibson <david@gibson.dropbear.id.au>
Message-id: 1500568290-7966-1-git-send-email-peter.maydell@linaro.org
On NetBSD the compiler warns:
util/oslib-posix.c: In function 'sigaction_invoke':
util/oslib-posix.c:589:5: warning: missing braces around initializer [-Wmissing-braces]
siginfo_t si = { 0 };
^
util/oslib-posix.c:589:5: warning: (near initialization for 'si.si_pad') [-Wmissing-braces]
because on this platform siginfo_t is defined as
typedef union siginfo {
char si_pad[128]; /* Total size; for future expansion */
struct _ksiginfo _info;
} siginfo_t;
Avoid this warning by initializing the struct with {} instead;
this is a GCC extension but we use it all over the codebase already.
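In other words (sketch):

    siginfo_t si = { 0 };   /* -Wmissing-braces: the first union member is an array */
    siginfo_t si2 = {};     /* GCC/Clang extension used throughout QEMU: no warning */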
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Eric Blake <eblake@redhat.com>
Message-id: 1500568341-8389-1-git-send-email-peter.maydell@linaro.org
Add the Enhanced Virtual Addressing (EVA) feature to the P5600 core
configuration, along with the related Segmentation Control (SC) feature
and writable CP0_EBase.WG bit.
This allows it to run Malta EVA kernels.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Implement the optional segmentation control feature in the virtual to
physical address translation code.
The fixed legacy segment and xkphys handling is replaced with a dynamic
layout based on the segmentation control registers (which should be set
up even when the feature is not exposed to the guest).
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
[yongbok.kim@imgtec.com:
cosmetic changes]
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
The optional segmentation control registers CP0_SegCtl0, CP0_SegCtl1 &
CP0_SegCtl2 control the behaviour and required privilege of the legacy
virtual memory segments.
Add them to the CP0 interface so they can be read and written when
CP0_Config3.SC=1, and initialise them to describe the standard legacy
layout so they can be used in future patches regardless of whether they
are exposed to the guest.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
The segmentation control feature allows a legacy memory segment to
become unmapped uncached at error level (according to CP0_Status.ERL),
and in fact the user segment is already treated in this way by QEMU.
Add a new MMU mode for this state so that QEMU's mappings don't persist
between ERL=0 and ERL=1.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
[yongbok.kim@imgtec.com:
cosmetic changes]
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
The MIPS mmu_idx is sometimes calculated from hflags without an env
pointer available as cpu_mmu_index() requires.
Create a common hflags_mmu_index() for the purpose of this calculation
which can operate on any hflags, not just with an env pointer, and
update cpu_mmu_index() itself and gen_intermediate_code() to use it.
Also update debug_post_eret() and helper_mtc0_status() to log the MMU
mode with the status change (SM, UM, or nothing for kernel mode) based
on cpu_mmu_index() rather than directly testing hflags.
This will also allow the logic to be more easily updated when a new MMU
mode is added.
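A sketch of such a helper, following the description above (and matching
the ERL MMU mode added earlier in this series); the real code may differ
in detail:

    static inline int hflags_mmu_index(uint32_t hflags)
    {
        if (hflags & MIPS_HFLAG_ERL) {
            return 3;                       /* ERL: unmapped, uncached */
        }
        return hflags & MIPS_HFLAG_KSU;     /* kernel/supervisor/user */
    }

    int cpu_mmu_index(CPUMIPSState *env, bool ifetch)
    {
        return hflags_mmu_index(env->hflags);
    }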
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
When performing virtual to physical address translation, check the
required privilege level based on the mem_idx rather than the mode in
the hflags. This will allow EVA loads & stores to operate safely only on
user memory from kernel mode.
For the cases where the mmu_idx doesn't need to be overridden
(mips_cpu_get_phys_page_debug() and cpu_mips_translate_address()), we
calculate the required mmu_idx using cpu_mmu_index(). Note that this
only tests the MIPS_HFLAG_KSU bits rather than MIPS_HFLAG_MODE, so we
don't test the debug mode hflag MIPS_HFLAG_DM any longer. This should be
fine as get_physical_address() only compares against MIPS_HFLAG_UM and
MIPS_HFLAG_SM, neither of which should get set by compute_hflags() when
MIPS_HFLAG_DM is set.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Implement decoding of microMIPS EVA load and store instruction groups in
the POOL31C pool. These use the same gen_ld(), gen_st(), gen_st_cond()
helpers as the MIPS32 decoding, passing the equivalent MIPS32 opcodes as
opc.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Implement decoding of MIPS32 EVA loads and stores. These access the user
address space from kernel mode when implemented, so for each instruction
we need to check that EVA is available from Config5.EVA & check for
sufficient COP0 privilege (with the new check_eva()), and then override
the mem_idx used for the operation.
Unfortunately some Loongson 2E instructions use overlapping encodings,
so we must be careful not to prevent those from being decoded when EVA
is absent.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
EVA load and store instructions access the user mode address map, so
they need to use mem_idx of MIPS_HFLAG_UM. Update the various utility
functions to allow mem_idx to be more easily overridden from the
decoding logic.
Specifically we add a mem_idx argument to the op_ld/st_* helpers used
for atomics, and a mem_idx local variable to gen_ld(), gen_st(), and
gen_st_cond().
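The decoding-side effect can be sketched like this (is_eva_variant is an
illustrative placeholder; the real helpers take more arguments):

    int mem_idx = ctx->mem_idx;          /* default: current mode */

    if (is_eva_variant) {                /* e.g. LBE/LWE/SBE/SWE/... */
        check_eva(ctx);                  /* Config5.EVA + privilege check */
        mem_idx = MIPS_HFLAG_UM;         /* force the user address map */
    }
    tcg_gen_qemu_ld_tl(t0, t1, mem_idx, MO_TESL);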
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Add support for the CP0_EBase.WG bit, which allows upper bits to be
written (bits 31:30 on MIPS32, or bits 63:30 on MIPS64), along with the
CP0_Config5.CV bit to control whether the exception vector for Cache
Error exceptions is forced into KSeg1.
This is necessary on MIPS32 to support Segmentation Control and Enhanced
Virtual Addressing (EVA) extensions (where KSeg1 addresses may not
represent an unmapped uncached segment).
It is also useful on MIPS64 to allow the exception base to reside in
XKPhys, and possibly out of range of KSEG0 and KSEG1.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
[yongbok.kim@imgtec.com:
minor changes]
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
There is no need to invalidate any shadow TLB entries when the ASID
changes or when access to one of the 64-bit segments has been disabled,
since doing so doesn't reveal to software whether any TLB entries have
been evicted into the shadow half of the TLB.
Therefore weaken the tlb flushes in these cases to only flush the QEMU
TLB.
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Writing specific TLB entries with TLBWI flushes shadow TLB entries
unless an existing entry is having its access permissions upgraded. This
is necessary as software would from then on expect the previous mapping
in that entry to no longer be in effect (even if QEMU has quietly
evicted it to the shadow TLB on a TLBWR).
However it won't do this if only EHINV, XI, or RI bits have been set,
even if that results in a reduction of permissions, so add the necessary
checks to invoke the flush when these bits are set.
Fixes: 2fb58b7374 ("target-mips: add RI and XI fields to TLB entry")
Fixes: 9456c2fbcd ("target-mips: add TLBINV support")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Tested-by: Yongbok Kim <yongbok.kim@imgtec.com>
[yongbok.kim@imgtec.com:
cosmetic changes]
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
Using MFC0 to read CP0_UserLocal uses tcg_gen_ld32s_tl, however
CP0_UserLocal is a target_ulong. On a big endian host with a MIPS64
target this reads and sign extends the more significant half of the
64-bit register.
Fix this by using ld_tl to load the whole target_ulong and ext32s_tl to
sign extend it, as done for various other target_ulong COP0 registers.
Fixes: d279279e2b ("target-mips: implement UserLocal Register")
Signed-off-by: James Hogan <james.hogan@imgtec.com>
Cc: Yongbok Kim <yongbok.kim@imgtec.com>
Cc: Aurelien Jarno <aurelien@aurel32.net>
Cc: Petar Jovanovic <petar.jovanovic@imgtec.com>
Reviewed-by: Yongbok Kim <yongbok.kim@imgtec.com>
Signed-off-by: Yongbok Kim <yongbok.kim@imgtec.com>
In various places in our test makefiles and scripts we use the
shell $RANDOM to create a random number. This is a bash
specific extension, and doesn't work on other shells.
With dash the shell doesn't complain, it just effectively
always evaluates $RANDOM to 0:
echo $((RANDOM + 32768)) => 32768
However, on NetBSD the shell will complain:
"-sh: arith: syntax error: "RANDOM + 32768"
which means that "make check" fails.
Switch to using "${RANDOM:-0}" instead of $RANDOM,
which will portably either give us a random number or zero.
This means that on non-bash shells we don't get such
good test coverage via the MALLOC_PERTURB_ setting, but
we were already in that situation for non-bash shells.
Our only other uses of $RANDOM (in tests/qemu-iotests/check
and tests/qemu-iotests/162) are in shell scripts which use
a #!/bin/bash line so they are always run under bash.
Suggested-by: Eric Blake <eblake@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Kamil Rytarowski <n54@gmx.com>
Reviewed-by: Fam Zheng <famz@redhat.com>
Reviewed-by: Eric Blake <eblake@redhat.com>
Reviewed-by: Stefan Hajnoczi <stefanha@redhat.com>
Message-id: 1500029117-6387-1-git-send-email-peter.maydell@linaro.org
Don't try to build the ivshmem-server and ivshmem-client tools unless
CONFIG_IVSHMEM is set.
This fixes in passing a build bug on NetBSD, which fails to build the
ivshmem tools because they use shm_open() and on NetBSD that requires
linking against -lrt.
Signed-off-by: Kamil Rytarowski <n54@gmx.com>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-id: 1500021225-4118-4-git-send-email-peter.maydell@linaro.org
[PMM: moved some code into earlier patches; minor bugfixes;
added commit message]
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
The current CONFIG_IVSHMEM is confusing, because it looks like it's a
flag for "do we have ivshmem support?", but actually it's a flag for
"is the ivshmem PCI device being compiled?" (and implicitly "do we
have ivshmem support?" is tested with CONFIG_EVENTFD).
Rename it to CONFIG_IVSHMEM_DEVICE to clear this confusion up;
shortly we will add a new CONFIG_IVSHMEM which really does indicate
whether the host can support ivshmem.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Reviewed-by: Markus Armbruster <armbru@redhat.com>
Message-id: 1500021225-4118-2-git-send-email-peter.maydell@linaro.org
Queued tcg and tcg code gen related cleanups
# gpg: Signature made Thu 20 Jul 2017 00:32:00 BST
# gpg: using RSA key 0xAD1270CC4DD0279B
# gpg: Good signature from "Richard Henderson <rth7680@gmail.com>"
# gpg: aka "Richard Henderson <rth@redhat.com>"
# gpg: aka "Richard Henderson <rth@twiddle.net>"
# Primary key fingerprint: 9CB1 8DDA F8E8 49AD 2AFC 16A4 AD12 70CC 4DD0 279B
* remotes/rth/tags/pull-tcg-20170719:
tcg: Pass generic CPUState to gen_intermediate_code()
tcg/tci: enable bswap16_i64
target/alpha: optimize gen_cvtlq() using deposit op
target/sparc: optimize gen_op_mulscc() using deposit op
target/sparc: optimize various functions using extract op
target/ppc: optimize various functions using extract op
target/m68k: optimize bcd_flags() using extract op
target/arm: optimize aarch32 rev16
target/arm: Optimize aarch64 rev16
coccinelle: add a script to optimize tcg op using tcg_gen_extract()
coccinelle: ignore ASTs pre-parsed cached C files
tcg: Expand glue macros before stringifying helper names
util/cacheinfo: Add missing include for ppc linux
tcg/mips: reserve a register for the guest_base.
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Add a .editorconfig file for qemu. Specifies the indent and tab style
for various files (C code and Makefiles for starters). Most popular
editors support this either natively or via plugin.
Check http://editorconfig.org/ for details.
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Message-id: 20170717101547.22295-1-kraxel@redhat.com
QMP command
{ "execute": "change",
"arguments": { "device": "vnc", "target": "password", "arg": PWD } }
behaves just like
{ "execute": "change-vnc-password",
"arguments": { "password", "arg": PWD } }
Their documentation differs, however. According to
change-vnc-password's documentation, "an empty password [...] will set
the password to the empty string", while change's documentation claims
"no future logins will be allowed". The former is actually correct.
Replace the incorrect claim by a reference to change-vnc-password.
Signed-off-by: Markus Armbruster <armbru@redhat.com>
Reviewed-by: Marc-André Lureau <marcandre.lureau@redhat.com>
Message-id: 1500448182-21376-1-git-send-email-armbru@redhat.com
Signed-off-by: Gerd Hoffmann <kraxel@redhat.com>
Done with the Coccinelle semantic patch
scripts/coccinelle/tcg_gen_extract.cocci.
Reviewed-by: Richard Henderson <rth@twiddle.net>
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
It is much shorter to reverse all 4 half-words in parallel
than extract, reverse, and deposit each in turn.
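For reference, the parallel form is the classic mask-and-shift byte swap
applied to every 16-bit lane at once; a self-contained C equivalent:

    #include <stdint.h>

    /* swap the two bytes of each of the four half-words in one go */
    static uint64_t rev16(uint64_t x)
    {
        const uint64_t mask = 0x00ff00ff00ff00ffULL;
        return ((x & mask) << 8) | ((x >> 8) & mask);
    }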
Suggested-by: Aurelien Jarno <aurelien@aurel32.net>
Signed-off-by: Richard Henderson <rth@twiddle.net>
This include was forgotten when splitting cacheinfo.c out of
tcg/ppc/tcg-target.inc.c (see commit b255b2c8).
For a Centos7 host, the include path
<signal.h>
<bits/sigcontext.h>
<asm/sigcontext.h>
<asm/elf.h>
<asm/auxvec.h>
implicitly pulls in the desired AT_* defines.
Not so for Debian Jessie.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20170711015524.22936-1-f4bug@amsat.org>
Signed-off-by: Richard Henderson <rth@twiddle.net>
The previous commit changed the MIPS image from little endian to big endian.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
By itself this doesn't add much to our coverage. However later patches
will extend this image to include more bleeding edge libraries which
are not yet widely available in distros.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
[AJB: extend commit msg]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
We'll also want to support some older Debian combinations for
architectures that didn't make the Debian 9 cut.
Signed-off-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
[AJB: extend commit msg]
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
When a test fails or hangs you don't want the hassle of getting the debug
tools installed. Let's install them on our image by default so we can
debug when we need to.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Although the upstream Travis images don't need this library, our
"travis-lite" scripts are written in Python. This allows us to do:
make docker-travis@travis J=10
and approximate a travis run on their default image.
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Fam Zheng <famz@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Because global environment variables can be overridden when .travis.yml
is processed by the docker-travis target, the effect of this patch is
that docker-travis now obeys the "J=n" option.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
This is useful so that we can do builds at higher than -j3 when running
travis.py locally.
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
Signed-off-by: Alex Bennée <alex.bennee@linaro.org>